The race for open source language models is heating up
Meta is reportedly preparing a commercially-permissive LLM. It's been looming for some time now.
Author’s Note: There will be no issue next Tuesday, in observance of the national holiday on Monday. Tuesday’s issue will instead come out on Thursday.
While OpenAI has made it incredibly easy to deploy large language models in production, development of open source models has moved at a blistering pace, and it looks like it might speed up even further.
The Information reported this week that Meta is working on ways to make its language model LLaMA, currently available for research use only, commercially open source. Meta has been evaluating a fully open-sourced version of LLaMA since as early as May, including determining which model sizes to make available under an open source license. (I had heard at the time that one model under consideration, for example, had 3 billion parameters, smaller than LLaMA's smallest 7-billion-parameter model.)
What’s changed in the six weeks since, though, is that a wave of open source models has emerged, some expected (such as MosaicML’s MPT-7B) and some seemingly out of nowhere. The most significant launch in recent weeks was one of the latter: Falcon 40B (and its sibling 7B version), a model out of the UAE’s Technology Innovation Institute.
Initially released under a license that would require royalties, the model abruptly switched to a more commercially permissive license, creating an immediately available, high-performance open source model comparable to LLaMA that companies could potentially deploy in production. (Whether it actually “beats” LLaMA is still a subject of debate.)
The appeal of commercially open source models is pretty straightforward: a company that develops one essentially gets enormous buy-in from the community and an opportunity to shape how the industry develops. Meta’s PyTorch, which has become the standard deep learning framework that powers most machine learning tools today, is a great example.
Sources I spoke with speculated that the release of Falcon could accelerate Meta’s timetable for releasing a commercially-permissive version of LLaMA. My understanding is that the timetable for that release is still unclear, though progress has been made in its development in the last month.
It's still unclear what Meta's final open source model might look like, or whether it will release a family of models like its initial LLaMA cluster. Models for commercial use are going to end up looking very different from, and face considerably more scrutiny than, ones available only for research purposes, particularly as many clamor for increased regulation of AI development.
All these recent developments have sparked what seems to be an even faster race to push out more powerful, commercially available open source models.