A new rival startup is emerging for OpenAI
AI endpoints are morphing into a race to the bottom, and OpenAI may have trouble winning on cost.
Well, that was short-lived!
OpenAI announced a series of updates to its embeddings models yesterday, one of which was one-fifth the price of its previous embeddings model, ada-002. That newest one, text-embedding-3-small, was briefly the cheapest embedding model available through an API. That distinction, though, lasted for just about 24 hours.
Now Together AI, a startup that’s focused on open source models and is quickly emerging as a big potential rival to OpenAI, tweeted (posted?) an update to the pricing for its embedding model endpoints. We’ll get into the specific details in a second, but the update makes two popular open source models available at a much cheaper rate than even OpenAI’s new models. And with the pricing update, Together AI is again challenging OpenAI at its own game: winning on price.
Embeddings models are becoming increasingly important as retrieval augmented generation, or RAG, becomes a de facto part of AI workflows. RAG allows models to retrieve relevant, up-to-date, and factual information to insert into prompts to both improve their effectiveness and ensure they are spitting out correct information. Earlier this month Together AI began serving open source embeddings models through an API, at prices that were already lower than ada-002’s.
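For readers who want to see the retrieval step concretely, here is a minimal sketch of the idea, assuming toy hand-written vectors in place of real embeddings. The documents, the vectors, and the function names (`cosine`, `retrieve`, `build_prompt`) are all hypothetical; a real workflow would get its vectors from an embeddings model and usually store them in a vector database.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Hypothetical documents with made-up 3-dimensional "embeddings".
# Real embeddings have hundreds or thousands of dimensions.
DOCS = {
    "Embedding prices dropped this week.": [0.9, 0.1, 0.0],
    "The office cafeteria opens at 8am.": [0.0, 0.2, 0.9],
}

def retrieve(query_vec, docs, k=1):
    """Return the k document texts whose vectors are closest to the query."""
    ranked = sorted(docs, key=lambda text: cosine(query_vec, docs[text]), reverse=True)
    return ranked[:k]

def build_prompt(question, query_vec, docs):
    """Insert the retrieved text into the prompt ahead of the question."""
    context = "\n".join(retrieve(query_vec, docs))
    return f"Context:\n{context}\n\nQuestion: {question}"
```

Calling `build_prompt("What happened to embedding prices?", [1.0, 0.0, 0.0], DOCS)` puts the pricing sentence, not the cafeteria one, into the prompt. That is the whole trick: the embeddings model decides which text the LLM gets to see.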
And open source models had already been more performant than OpenAI’s embeddings model based on standard evaluation benchmarks. The challenge for a long time, though, was that you had to serve them yourself, and spinning that up was complicated and potentially costly, even if the actual inference was substantially cheaper than ada-002.
OpenAI’s benefit is that it offers a single platform for companies, and enterprise deals could offer potential benefits across the board for all its products. There are obvious benefits to removing the complexity of managing a large number of APIs and procurement bills. After all, larger enterprises probably don’t want just one of OpenAI’s products; they will likely want several of what it offers: LLMs, text-to-image, text-to-speech, and speech-to-text.
But for companies looking to optimize on every angle (cost, performance, customization, control, and so on) the universe of APIs and tools available for those services OpenAI offers is expanding. It just might require some flexibility for managing the complex daisy chain of tools to power a product—something many enterprises have already shown they’re willing to do as indicated by the emergence of the modern data stack in the early 2020s.
OpenAI began to face challengers in embeddings late last year. Cohere and Voyage AI released embeddings models that were more performant than ada-002 and equivalent in price. Anyscale also began serving a popular open source embeddings model at half the price of ada-002. And in January, Together AI began serving popular models at a fraction of the price of ada-002.
And perhaps more importantly, Together AI tells me, the startup plans to soon make available one of the most performant open source embeddings models: e5-mistral-7b-instruct.
If anything, it’s another indicator of a race to the bottom for AI models offered through an API. While OpenAI essentially introduced the concept with the launch of its GPT-series models and its embeddings model, ada-002, Together AI has quickly assembled architecture for serving open source models in the same way. And switching to Together AI’s endpoints amounts to just changing a few lines of code.
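To give a sense of how small that switch can be, here is a rough sketch that builds the same embeddings request for either provider without sending it. It assumes both services accept an OpenAI-style `/embeddings` request. The base URLs, the Together AI model name, and the `build_embedding_request` helper are illustrative assumptions, not details from either company’s announcement, so check each provider’s documentation before relying on them.

```python
import json
import urllib.request

# Assumed values for illustration only; verify against each provider's docs.
PROVIDERS = {
    "openai": {
        "base_url": "https://api.openai.com/v1",
        "model": "text-embedding-3-small",
    },
    "together": {
        "base_url": "https://api.together.xyz/v1",  # assumed OpenAI-compatible endpoint
        "model": "BAAI/bge-base-en-v1.5",           # assumed example model name
    },
}

def build_embedding_request(provider, api_key, text):
    """Build (but do not send) an embeddings request for the chosen provider."""
    cfg = PROVIDERS[provider]
    body = json.dumps({"model": cfg["model"], "input": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{cfg['base_url']}/embeddings",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

In this sketch, moving from one provider to the other changes only the base URL, the model name, and the API key. The request shape and the surrounding application code stay the same, which is why price can become the deciding factor so quickly.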
It already seemed like Together AI was coming together as a potential rival to OpenAI, built on the success of the open source community, with its suite of endpoints. Together AI most recently raised $102.5 million in a series A round led by Kleiner Perkins, with one tranche valuing the startup at $565 million. (The company’s original seed funding was led by Lux Capital.)
With the release of Mixtral, Together AI was able to serve a model in the ballpark of GPT-3.5 Turbo at what was then a fraction of the price of OpenAI’s model. The same day OpenAI released its new embedding models, it also announced a price drop to GPT-3.5.
But with Together AI quickly responding to OpenAI’s attempt to command demand in the embeddings model market by offering a substantially lower price, it’s clear we’re starting to see a new rivalry come together.
A rivalry built on winning on price
OpenAI’s embeddings model enjoyed a long honeymoon without any real competitive API option. Even before the proliferation of RAG architecture, embeddings remained an important part of the AI development process. Information across a variety of unstructured forms has to be made available for use in AI workflows in the first place, and ada-002 was largely the default.
OpenAI abruptly slashed the cost of its embeddings model in June last year by a whopping 75%, to the surprise (and delight) of a lot of developers I spoke with at the time. It wasn’t entirely clear why OpenAI would make such a drastic move other than to further cement its platform as the first option for AI development. The reasons, though, materialized a little more than four months later.
OpenAI essentially decided to go after winning on price again with its newest embeddings models launched this week, bringing the cost down to around 20% of what ada-002 previously cost. But with the updates to Together AI’s APIs announced today, let’s take a look at where everything lands on pricing: