How Notion is tackling one of AI's biggest looming problems
If RAG is the solution to hallucinations, how do you determine who gets to see what? Plus: another technique to improve performance emerges in the wild.
11/17 Update: Friday’s issue will be moved to next week due to some scheduling issues.
Retrieval augmented generation (or RAG) has increasingly become a go-to solution for combating hallucinations when using AI, at least as a first step before fine-tuning or pre-training your own model.
But as a new product coming out of Notion launches today shows, large-scale applications of RAG are running head-first into a pretty classic problem in cloud computing: identity and access management (or IAM). Or, more broadly, who gets to see what when they fire off a question to an AI model augmented by RAG.
Notion today launched a universal Q&A assistant for querying information within a given Notion domain. This is essentially a colossal RAG operation, layered on a few other techniques, that surfaces up-to-date, relevant information from directly within a Notion workspace through an assistant. But if a customer uses Notion as a kind of internal operating system (which could hold information on sales, salaries, or other sensitive topics), Notion has to ensure there are aggressive guardrails on some of that usage.
“Most people talk about RAG as a demo,” Notion CEO Ivan Zhao told me. “This is RAG for tens of millions of people.”
Even if the assistant is just for internal use, larger organizations have to silo certain pieces of information. It could be for security reasons, to ensure that information doesn’t leak out into a broader group, or a company might need to satisfy stricter IAM requirements by default. In that sense, Notion’s Q&A assistant can’t just do a blind retrieval for information. It also does an identity check on whether or not a user can actually retrieve a piece of information for a prompt, adding a deceptively challenging layer of complexity to the whole process.
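To make the idea of an identity check during retrieval concrete, here is a minimal sketch in Python. It is not Notion's implementation, which the company has not detailed here. The `Chunk` type, the `allowed_groups` field, the group names, and the toy bag-of-words `embed` function are all assumptions made for illustration; a real system would use a learned embedding model and a proper access-control service.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    source: str
    allowed_groups: frozenset  # hypothetical ACL attached at indexing time


def embed(text):
    # Toy stand-in for a real embedding model: a bag-of-words count vector.
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, user_groups, k=2):
    # Filter on permissions BEFORE ranking, so restricted text
    # never becomes a candidate for the prompt.
    visible = [c for c in chunks if c.allowed_groups & user_groups]
    q = embed(query)
    ranked = sorted(visible, key=lambda c: cosine(q, embed(c.text)), reverse=True)
    return ranked[:k]


# Hypothetical workspace content and group names.
CHUNKS = [
    Chunk("Q3 sales pipeline review notes", "sales/q3", frozenset({"sales", "exec"})),
    Chunk("Engineering salary bands for 2023", "hr/salaries", frozenset({"hr"})),
    Chunk("How to request vacation time", "wiki/vacation", frozenset({"everyone"})),
]

engineer_hits = retrieve("salary bands", CHUNKS, {"everyone", "eng"})
hr_hits = retrieve("salary bands", CHUNKS, {"everyone", "hr"})
```

The same question returns different context depending on who asks: the engineer's query never sees the salary page, while the HR user's query ranks it first. Filtering before ranking, rather than after, is one possible design choice; it keeps restricted text out of the candidate set entirely.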
RAG holds enormous appeal as a way to mitigate hallucinations when using AI models like OpenAI’s GPT series. But larger organizations that are accustomed to siloing some information will have to navigate the growing problem around IAM for that information. And it places yet another blocker on implementing AI tooling internally at large organizations that can’t simply drop in an API and ship their data over to OpenAI.
On top of that, Notion wants to make its Q&A assistant (and the act of calling it) deeply integrated into its user experience to the point that it’s accessed with a keyboard shortcut. But the information it returns doesn’t just need to be accurate. It has to be relevant and up to date to justify a company paying for it.
And any analyst or data scientist who’s tried to sift through dozens of deprecated tables for the correct information—or even the right column name—knows how difficult that last part can be.
But to get all that done in the first place, Notion had to tackle a two-pronged problem: embedding a constantly updating universe of data for accurate information retrieval, and finding a way to put a permissioning experience on top of that in the embedding process.
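The two prongs can be sketched together as a small index that re-embeds a page only when its text changes and stores permissions next to each vector. This is an illustrative sketch under stated assumptions, not a description of Notion's pipeline: the `PageIndex` class, its `upsert` method, the content-hash shortcut, and the page and group names are all hypothetical, and `embed` is again a toy stand-in for a real embedding model.

```python
import hashlib
from collections import Counter


def embed(text):
    # Toy stand-in for a real embedding model.
    return Counter(text.lower().split())


class PageIndex:
    """Hypothetical index: page_id -> (content_hash, vector, allowed_groups)."""

    def __init__(self):
        self.entries = {}
        self.embed_calls = 0  # counts how often the (expensive) embedding step runs

    def upsert(self, page_id, text, allowed_groups):
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        old = self.entries.get(page_id)
        if old and old[0] == digest:
            # Text unchanged: keep the vector, refresh only the permissions.
            self.entries[page_id] = (digest, old[1], frozenset(allowed_groups))
            return False
        # New or edited page: re-embed and store permissions alongside the vector.
        self.embed_calls += 1
        self.entries[page_id] = (digest, embed(text), frozenset(allowed_groups))
        return True

    def delete(self, page_id):
        self.entries.pop(page_id, None)


index = PageIndex()
index.upsert("roadmap", "Launch Q&A in November", {"everyone"})
index.upsert("roadmap", "Launch Q&A in November", {"exec"})  # permission change only
index.upsert("roadmap", "Launch Q&A in December", {"exec"})  # content change
```

In this sketch a permission change takes effect immediately without paying for a new embedding, while an edit to the page triggers exactly one re-embed.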
RAG, at the scale of tens of millions
While some companies had been deploying RAG for a while (Snowflake’s Neeva was doing it at around the beginning of the year), the technique became more popular over the summer. It offered a way to split the difference between the accuracy loss of an off-the-shelf model and the costly process of augmenting and customizing one through fine-tuning.
RAG’s killer use case was, and still is, data governance. In addition to pulling in information from alternate data sources, it can offer a direct citation of each answer’s origin to show that it’s grounded in fact. But at the scale of a company like Notion, with less uniform data, the problem gets more complicated.
Neeva’s challenge—along with other search tools like You.com and Perplexity—was to index the web in such a way that you could retrieve up-to-date, accurate information. But pages on the web don’t really update that often, if at all. An FAQ for a product or developer tool, for example, might only get modest updates to align with product changes (if it even gets that). And an article (like this one) or a recipe probably won’t change at all.