My new startup idea: make developing LLM apps way easier
Throwing my hat in the ring. Looking for cofounders and investors!
I’ve spent the past month doing a deep dive into LLMs. I went from being deeply afraid of generative AI to becoming a true believer.
And let me tell you, developing LLM-based apps sucks.
There are lots of great startups working on making different parts of the process suck less. For example, the fine folks at Brev.dev, who just launched their v2, are solving for finding GPU compute and provisioning.
In my opinion, however, the greatest pain point is that we haven’t created effective idioms and patterns for AI-based app development, which I wrote about before. Nor am I the only one:
We need a dead simple way for developers to make AI apps, one that recognizes that we’re in a new, post-classical programming world.
We need Next.js for LLM apps.
Why OpenAI and Replicate aren’t enough
Currently, if you want an easy-ish experience, you use OpenAI or Replicate. They provide easy-to-use APIs that don’t require you to know anything about AI. You can use in-context learning or fine-tuning to get inferences that work for your app.
In the process, you pay an arm and a leg for access to a black box. There’s no guarantee that the black box will keep working. You’re not allowed to send any personally-identifiable information.
Most importantly, you can’t build proprietary equity. Models are becoming the new data: the secret sauce that makes your company valuable to investors and differentiates your product from the competition. When you rely on someone else’s AI PaaS, you’re locking yourself into their proprietary system without any easy way to transition if poo hits the fan.
Why Hugging Face and Llama aren’t enough
Many developers, recognizing the above, turn to Llama and Hugging Face to fine-tune models and create products. The problem with this approach is that it’s far too low-level.
Seriously. Look at this API for fine-tuning from Hugging Face’s TRL:
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTTrainer

# Load a training split and configure a LoRA adapter
dataset = load_dataset("imdb", split="train")
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

# Supervised fine-tuning on the "text" column of the dataset
trainer = SFTTrainer(
    "EleutherAI/gpt-neo-125m",
    train_dataset=dataset,
    dataset_text_field="text",
    peft_config=peft_config,
)
trainer.train()
Even to begin understanding the above code, you need to learn what supervised fine-tuning, LoRA, and PEFT are, never mind how to “un-PEFT” your model from checkpoints once you’re ready to run inferences. And even then, we’re nowhere close to making this code production-ready.
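To see why “un-PEFT-ing” is its own rabbit hole: merging an adapter means folding the low-rank update B @ A (scaled by lora_alpha / r) back into the frozen base weight, so inference no longer needs the adapter code path. Here’s a minimal numpy sketch of that math, using the same r and lora_alpha as the snippet above; this illustrates the idea, it is not the actual peft API:

```python
import numpy as np

# LoRA trains a low-rank update B @ A on top of a frozen base weight W.
# "Merging" folds that update into W so the adapter machinery goes away.
rng = np.random.default_rng(0)
d, k, r, lora_alpha = 64, 64, 16, 32  # toy layer dims; r, lora_alpha as above
scaling = lora_alpha / r

W = rng.normal(size=(d, k))            # frozen base weight
A = rng.normal(size=(r, k)) * 0.01     # trained low-rank factor
B = rng.normal(size=(d, r)) * 0.01     # trained low-rank factor

W_merged = W + scaling * (B @ A)       # the merge step

# The merged layer is equivalent to base-plus-adapter on any input:
x = rng.normal(size=(5, k))
adapter_out = x @ W.T + scaling * (x @ A.T) @ B.T
assert np.allclose(x @ W_merged.T, adapter_out)
```

None of this is conceptually hard, but it’s exactly the kind of ML plumbing an app developer shouldn’t have to internalize just to ship a feature.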
Adding LLMs to your app shouldn’t require learning AI research. It should just work, exposed in a way that lets a reasonably experienced developer tweak their model expressively with no more effort than learning React or Svelte.
The vision
I want to take this crazy stack:
and simplify it into this stack:
Key ideas:
An open source framework like Next.js for building LLM apps. This means building React for LLM apps first: a framework that encapsulates opinionated best practices for handling data flows and deriving user-visible state from them. It introduces a vocabulary of concepts—like “hooks,” “update lifecycle”—that devs can use to expressively reason about LLMs, only exposing underlying ML theory when it delivers business value. Then we build a meta-framework, similar to Next.js, to allow for creating a complete app.
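To make the “React for LLM apps” idea concrete, here is a purely hypothetical sketch of the kind of developer-facing API such a framework might expose. Every name here is invented for illustration, and the model is a stub rather than a real LLM; the point is the shape of the abstraction, where the framework hides LoRA/PEFT details behind a one-call entry point:

```python
# Hypothetical, illustrative API sketch. None of these names exist yet;
# the "model" is a stub so the example stays self-contained and runnable.

class TunedModel:
    """Stands in for a fine-tuned model the framework would manage."""

    def __init__(self, base: str, examples: list[tuple[str, str]]):
        self.base = base
        self._memory = dict(examples)  # stand-in for actual training

    def complete(self, prompt: str) -> str:
        # A real framework would run inference; the stub just echoes
        # training data so the sketch can be executed as-is.
        return self._memory.get(prompt, f"[completion from {self.base}]")


def fine_tune(base: str, examples: list[tuple[str, str]]) -> TunedModel:
    """Hypothetical one-call fine-tuning entry point: no LoRA, PEFT,
    or checkpoint handling visible to the app developer."""
    return TunedModel(base, examples)


model = fine_tune(
    base="any-open-model",
    examples=[("What's our refund policy?", "30 days, no questions asked.")],
)
print(model.complete("What's our refund policy?"))
```

Compare the surface area here with the TRL snippet earlier: same intent, but the framework, not the developer, owns the ML vocabulary.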
Streamline logging and orchestration, making them understandable to devs. The gold standard for logging is currently Weights & Biases, which requires becoming an ML engineer to interpret correctly. Let’s create logging that gives devs useful insights without AI expertise. Let’s also create simple orchestration—more Heroku Elements than LangChain—that enables combining LLM providers through a plug-and-play interface.
The user owns the model, always. They can always export their model to another provider, and we never use their data to train our own models. Their model is their business, their model is sacrosanct.
What do I need to do this?
More like: what do we need to do this? I want to do this, but it’s a crazy ambitious project that’s going to take many hands.
I need a team! A cofounder who’s passionate about using AI to improve the future and who cares about helping developers in this new, post-classical programming world would be a great start. I can build hella developer experience but I need folks who are great at marketing and/or really know what they’re doing with LLMs.
I need funding and mentorship! I will apply to Y Combinator for a third time. Their deadline for the winter batch is in a few days. Too bad I have scarcely more than an idea, but maybe if I build in the open for the next few days it will attract the right energy from the universe? Perhaps an angel?
I think in the meanwhile that I will try to make a small product with better logging for training models to validate at least part of the concept and charge for the service.
Maybe someone is already working on this. Maybe we can join forces.
If you can help, or if you can use my help, contact me: me@ersinakinci.com.
What do I call it?
Working title is PricklyPear.ai (bought the domain, so don’t even try 🤠). Because Atacama, a desert where llamas live, is apparently already taken.