OpenAI makes its move to fend off a new wave of startups

Rather than putting out the best model embeddings, it's trying to beat everyone on cost. That should sound familiar.

Jan 25, 2024


a friendly robot tailor standing in front of a sewing kit threading a needle through a large green sweater, the robot has measuring tape around its shoulders, sunday comics aesthetic, --ar 4:3 — midjourney

Update: About a day after the announcement for OpenAI’s new embedding models, Together AI announced a large price drop for its own embedding models. You can find the details (and more) in the issue which went out the next day!

For nearly a year after the launch of ChatGPT, the development of embeddings models—the tools that convert data into something usable for AI—was pretty quiet.

OpenAI enjoyed a very long honeymoon with its embeddings model, ada-002, which faced no significant challenger and was the easiest to use because it was accessible through an API. There were even more performant embeddings models available to developers at the time, but none were served through an endpoint as easy to use as OpenAI’s dead-simple API.

That’s changed very rapidly in just the last three months. A few weeks ago, Together AI said it would make a large number of embedding models that outperformed ada-002 available through an endpoint, one of them at nearly one quarter of the price of ada-002. Anyscale, another startup, did the same in November, in that case at half the price of ada-002. Around that time we also saw the launch of serious competitors in the form of a new model from Cohere and a new startup, Voyage AI.

Now, OpenAI is making its move to try and reclaim the space, and doing it in what feels like a very classic OpenAI fashion: winning on costs and betting on its platform.

OpenAI released details on two new embeddings models today, a small and a large version called text-embedding-3-small and text-embedding-3-large, alongside a price drop for GPT-3.5 Turbo. Each model is superior to ada-002, and together they bring OpenAI into more direct competition with the challengers that have emerged with easy-to-use API endpoints that are cheaper than its classic embeddings model.

Put in the context of the massive wave of developments around November last year, it makes sense that OpenAI pushed the price of ada-002 down a whopping 75% in June last year, effectively presaging the wave of startups and launches that began around November. Before the price drop, one well-connected person I talked to joked that they loved ada-002, but wished it weren’t so expensive.

The growing importance of embedding models (which, to be clear, were already important) dovetails with the proliferation of retrieval augmented generation (or RAG). RAG has essentially become a de facto counter to model hallucinations: it retrieves information relevant to a prompt to improve the accuracy and performance of the response, and that information has to be embedded in some readily accessible format in the first place.
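To make the retrieval step concrete, here is a minimal Python sketch of how a RAG setup might pick context for a prompt. Everything in it is an assumption for illustration: the three-number vectors are made-up stand-ins for real embeddings (which have hundreds or thousands of dimensions and come from a model), and the `retrieve` helper is hypothetical rather than any vendor's API.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; closer to 1.0 means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, corpus, top_k=1):
    """Return the top_k texts whose embeddings are most similar to the query."""
    ranked = sorted(
        corpus,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:top_k]]

# Hypothetical toy data: (text, embedding) pairs. A real system would get
# these vectors from an embeddings model and keep them in a vector store.
corpus = [
    ("Refund policy: 30 days", [0.9, 0.1, 0.0]),
    ("Shipping takes 3-5 days", [0.1, 0.9, 0.1]),
    ("Office hours are 9 to 5", [0.0, 0.2, 0.9]),
]

# Pretend this is the embedding of the question below.
question = "Can I get my money back?"
query_vec = [0.8, 0.2, 0.1]

context = retrieve(query_vec, corpus, top_k=1)

# The retrieved text is placed in the prompt so the model answers from it.
prompt = (
    "Answer using this context:\n"
    + "\n".join(context)
    + "\n\nQuestion: " + question
)
print(prompt)
```

The point of the sketch is the dependency it exposes: the quality and the cost of the vectors are what the retrieval step lives on, which is why embeddings pricing matters so much for anyone building on RAG.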

OpenAI is now fending off startups from all sides of its business, with embeddings emerging as a potential large opening given the industry’s increasing reliance on RAG. And OpenAI’s new models, based on standard benchmarks, don’t quite win on performance. But its smaller model isn’t trying to just win on price: it’s trying to crush the competition on it.

Finding the sweet spot between performance and cost

OpenAI’s new embeddings models come in multiple flavors, including a “small” and a “large” one with differences in the way they generate embeddings. Each also offers an additional layer of customization in the form of dimensions, to better optimize an embedding setup. But we’ll set that discussion aside for a minute and focus on the standard benchmark for embeddings, called MTEB, since that’s the one OpenAI is focusing on in its announcement.
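For a rough sense of what a dimensions knob can mean in practice, one common approach is to keep only the first N values of an embedding and rescale the result back to unit length. The Python sketch below illustrates that idea under stated assumptions: the `shorten_embedding` helper and the four-number vector are hypothetical, and this is not a description of how OpenAI implements the option. How well it works depends on how a given model was trained.

```python
import math

def shorten_embedding(vec, dims):
    """Keep the first `dims` values, then rescale to unit length (L2 norm of 1)."""
    truncated = vec[:dims]
    norm = math.sqrt(sum(x * x for x in truncated))
    if norm == 0:
        return truncated  # nothing to rescale for an all-zero vector
    return [x / norm for x in truncated]

# Hypothetical 4-dimensional embedding; real ones are far longer.
full = [0.5, 0.5, 0.5, 0.5]
short = shorten_embedding(full, 2)
print(len(full), "->", len(short))
```

Shorter vectors take less storage and are faster to compare, usually at some cost in retrieval quality, so the tradeoff is the same one the rest of this piece is about: performance against cost.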

(Again, this is predicated on how much value you place on standard benchmarks, particularly with even some large companies trying to lightly game them to look good.)

OpenAI’s new models come against the backdrop of launches of both proprietary and open source models offered through endpoints by Cohere, Anyscale, Mistral, Voyage, and Together AI. Some of these models come in a number of different flavors (with Voyage offering a code-optimized one, which also came out this week) at different price points. For the sake of brevity, though, we can try to take a bird’s-eye view of all of them.

Let’s take a look at how all of them compare, based on launch blog posts and the very handy Hugging Face MTEB evaluation leaderboard:
