AI Agent

Why search is going to change forever

How AI is reshaping search: from keyword matching to semantic and agentic search, enabling complex research across massive datasets in minutes instead of months.
Nadav Orren
8 min read
04 Jun 2025

Imagine you’re trying to find the perfect recipe for a dinner party. One of your guests is vegetarian, so you want something that’s vegetarian and can feed eight people. In the old days, you’d flip through cookbooks for hours, checking indexes and skimming pages, until you found something you wanted to make. Today, you might Google “vegetarian dinner party recipes” and scour the results until you find something you’re satisfied with, or until you’re out of patience.

But here’s the thing: traditional search methods are surprisingly limited. They work by matching keywords rather than understanding what you actually want. If a perfect recipe exists but describes itself as a “plant-based recipe for your next gathering” instead of a “vegetarian dinner party recipe,” you might never find it.

Traditional search is even worse when it comes to complex questions that require piecing together information from multiple sources. Let’s say you’re trying to buy a house and you want to find neighborhoods where you can afford a 3-bedroom home, with good schools for your kids, a reasonable commute to your downtown office, and low crime rates. A traditional search might find you real estate listings for 3-bedroom houses, separate articles about school district rankings, different websites about commute times, and crime statistics from various sources — but you’d have to manually cross-reference all of this information to compile your best options. You’d end up with dozens of browser tabs open and hours of menial work.

For decades, this was simply how search worked. Librarians, researchers, and curious individuals spent countless hours manually sifting through information, following footnotes, cross-referencing sources and synthesizing findings.

Then came Large Language Models (LLMs) — AI systems like ChatGPT that can understand and generate text. These models seemed to promise a revolution in how we find and process information. Instead of just matching keywords, they could actually understand meaning, context, and nuance. They could read a recipe and understand that “plant-based” and “vegetarian” are essentially the same for our purposes.

On the surface, it sounds like these models can solve any search problem with ease, but there’s a big catch. Even the most powerful AI models today can only hold so much information in their “working memory” at once. Currently, even the most expensive models struggle to reliably process more than a few dozen pages of text without getting confused or producing lower-quality responses. And even if we’re willing to tolerate lower quality, every model has a hard limit on the amount of text it will even attempt to read at once. This means they simply can’t be used for tasks that require understanding more than a few hundred pages, at any quality. Large corporations can have millions of pages in their knowledge bases, and we all know how much data exists about almost any important subject on the internet. These models can’t come close to handling the vast amounts of information needed for serious research tasks.

Can we find a way to use AI models to search vast data in a smart, efficient way?

Semantic Search

This is where Retrieval Augmented Generation (RAG) comes in. Instead of loading all of the information we have into the model’s working memory, what if we could retrieve only the parts that are likely to be actually useful for the task at hand?

Let’s go back to our cookbook example, but make it more challenging. Imagine you have a digitized collection of 1 million cookbooks, and you want to create a list of the top 5 weirdest desserts that contain meat for a funny school project.

No AI model today can handle a million cookbooks’ worth of text. But consider this: out of a million cookbooks, there are probably only a few hundred recipes that are both desserts AND contain meat. If we could somehow identify just those recipes and ignore all the rest, we could easily fit them into an AI model’s memory and get our answer in seconds.

But how can we go over millions of recipes and find just the right ones? We could send each recipe to an LLM and ask whether it is relevant to the task, but with millions of recipes this would be ridiculously expensive and take years to accomplish!
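To see why this brute-force approach falls apart, here is a back-of-envelope estimate. All of the numbers are assumptions chosen purely for illustration: 100 recipes per cookbook, about 500 tokens per recipe, $1 per million input tokens, and roughly 2 seconds of latency per LLM call.

```python
# Back-of-envelope arithmetic with assumed numbers only.
recipes = 1_000_000 * 100            # assumed: ~100 recipes per cookbook
tokens_per_recipe = 500              # assumed average recipe length in tokens
price_per_million_tokens = 1.00      # dollars (illustrative price)
seconds_per_call = 2                 # assumed latency per LLM request

cost_dollars = recipes * tokens_per_recipe / 1_000_000 * price_per_million_tokens
years_sequential = recipes * seconds_per_call / (86_400 * 365)

print(f"~${cost_dollars:,.0f} and ~{years_sequential:.1f} years of sequential calls")
# → ~$50,000 and ~6.3 years of sequential calls
```

Even with generous parallelism, checking every document one-by-one with an LLM simply doesn’t scale.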

We could also filter recipes by keywords, but devising a comprehensive list of all terms representing “meat” or “dessert” would be very difficult, and would almost certainly still miss some recipes and include some wrong ones.

The answer lies with embedding models. Think of these as translation systems that can take any short piece of text — a recipe, a paragraph, a product description — and convert it into a kind of universal coordinate system. Instead of storing recipes as words, we store them as points in a multi-dimensional space where similar recipes end up close to each other.

Here’s a simple (and simplified) way to picture it: imagine every recipe is a house in a huge city. These embedding models learn how to place each house such that neighboring houses are similar to one another. This would mean desserts end up clustered in one neighborhood and meat dishes are clustered in another neighborhood. On the border of these two neighborhoods we should find what we seek — these elusive meat desserts!

When you want to find “desserts that have meat in them,” you simply ask the system for the points on the map most similar to that phrase — the houses closest to where the query itself would live. The system doesn’t need to reread every recipe in light of each specific question. Instead, when a document is uploaded, we read it once and place it as a point on our multi-dimensional map based on its meaning and subject. Once the map exists, finding the neighbors of a query takes only seconds.
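The nearest-neighbor lookup at the heart of this idea can be sketched in a few lines. Note that the 2-D vectors below are invented toy “embeddings” for illustration only; a real embedding model would produce vectors with hundreds or thousands of dimensions, and the recipe names and coordinates are assumptions, not real data.

```python
import math

# Toy 2-D "embeddings". In reality these coordinates come from an
# embedding model; here they are hand-picked so that desserts cluster
# on one axis and meat dishes on the other.
DOCS = {
    "Classic Chocolate Cake": [0.9, 0.1],   # dessert, no meat
    "Grilled Steak":          [0.1, 0.9],   # meat, not a dessert
    "Oreo Bacon Cake":        [0.7, 0.6],   # dessert with meat
}

def cosine_similarity(a, b):
    # Standard cosine similarity: 1.0 means identical direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(query_vec, top_k=1):
    # Rank every stored document by similarity to the query vector.
    ranked = sorted(DOCS.items(),
                    key=lambda item: cosine_similarity(query_vec, item[1]),
                    reverse=True)
    return [name for name, _ in ranked[:top_k]]

# "Desserts that have meat in them" would be embedded into the same
# space; we pretend it lands between the two neighborhoods.
print(search([0.7, 0.7]))
# → ['Oreo Bacon Cake']
```

Real systems replace the dictionary with a vector database and an approximate nearest-neighbor index so the lookup stays fast across millions of documents, but the principle is exactly this.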

Note that this method is far more robust than traditional keyword search. Each point on the map is based on the characteristics of the recipe noted by the model. This means the model knows crab cakes are not the same as sponge cakes even though they share keywords that might fool traditional search methods.

Unfortunately, this approach still has limitations. Let’s say you find a recipe for “Oreo Bacon Cake” that looks perfect, but the instructions say: “For the chocolate base, follow steps 1–3 of the Classic Chocolate Cake recipe found on page 117.”

A traditional RAG search would retrieve the Oreo Bacon Cake recipe for the AI to work with, but it would miss the referenced chocolate cake recipe entirely. The AI would be like a cook trying to follow a recipe with missing steps — it could try to guess what “steps 1–3” might involve, but it wouldn’t have the actual information it needs.

If this happened to you as a human, you’d simply flip back a few pages to find the chocolate cake recipe and read those steps too. You might even notice that the chocolate cake recipe references a “Basic Buttercream” recipe on another page, so you’d look that up as well. You’d naturally follow the trail of information until you had everything you needed. Wouldn’t it be great if our AI models could do the same?

Agentic Search

This brings us to Agentic RAG — an approach that gives AI models the ability to conduct their own research, just like a human investigator would. Instead of running one search and stopping, the AI can run multiple searches, follow leads, and piece together information from various sources. In this approach, we equip an LLM with a suite of tools it can use for its research. Common tools include:

  • Web search
  • Keyword search
  • Semantic search
  • Calculator

Then, we give the model a task, such as compiling a list of the top 5 weirdest recipes in the database and figuring out how much their ingredients cost at Walmart. The model can now leverage its intelligence to craft clever queries and run them through its search tools. Critically, if the results of any search fall short of answering the query, or surface new avenues of investigation that might help complete the task, the model can keep running more queries until it is satisfied it has the information it needs.
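The loop described above can be sketched as follows. Everything here is a stand-in: `fake_llm` deterministically simulates a model deciding which tool to call next, and the two “tools” return canned data. A real agent would prompt an actual LLM API and hit real search backends, but the control flow — decide, call a tool, observe, repeat until done — is the same.

```python
def semantic_search(query):
    # Stand-in for a vector-database lookup (invented results).
    return ["Oreo Bacon Cake", "Pork Floss Doughnut"]

def web_search(query):
    # Stand-in for a live price lookup (invented prices).
    return {"Oreo Bacon Cake": 12.50, "Pork Floss Doughnut": 4.75}

TOOLS = {"semantic_search": semantic_search, "web_search": web_search}

def fake_llm(task, history):
    # Simulates the model's next decision. A real agent would prompt
    # an LLM with the task plus all tool results gathered so far.
    if not history:
        return ("semantic_search", "weird desserts that contain meat")
    if len(history) == 1:
        return ("web_search", "ingredient prices at Walmart")
    return ("finish", None)

def run_agent(task):
    history = []
    while True:
        action, arg = fake_llm(task, history)
        if action == "finish":
            # The model decides it has enough information to answer.
            return history
        result = TOOLS[action](arg)
        history.append((action, result))

steps = run_agent("Find the weirdest meat desserts and price their ingredients")
for action, result in steps:
    print(action, "->", result)
```

The key difference from plain RAG is that the model, not a fixed pipeline, decides how many searches to run and when to stop.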

Naturally, this paradigm has its drawbacks. Running multiple searches takes more time and costs more money. And, of course, no model is infallible: it might run imprecise and unproductive queries, give up on the task too quickly, or focus on the wrong details in the data it has obtained. And, as we already described, models are constrained by the amount of working memory they have; saturating the context window with previous research will lower the quality of the results.

Still, this approach is quite robust and has the power to handle very complicated queries over massive swaths of data, even when they require multiple investigatory steps and consulting several different sources. Until now, this kind of work could only be accomplished by experienced researchers and might have taken them months or even years. Now, it is feasible to perform these searches in mere minutes.

Looking Forward

As AI models become faster, smarter, and cheaper, agentic search will become even more powerful. We’re moving toward a future where asking an AI to “research the economic impact of climate change on coffee production in Guatemala” might be as simple and quick as asking “What is the capital of Guatemala?”

The real question isn’t whether this technology will improve — it’s how we’ll adapt to living in a world where comprehensive research and analysis become as accessible as sending a text message. Learning to use these tools effectively will likely make or break teams that need to work with a lot of documents and data.

If you’re interested in using state-of-the-art AI tools to handle really complex tasks, check out pelles.ai/careers and apply!

My email is nadav.orren@gmail.com for any questions or comments.
