Building a Search Engine with LLMs

June 8, 2025 · Ryan X. Charles

I recently had to build a search engine for Hallucipedia. Instead of using full-text search or a vector-based search, which are standard techniques, I decided to build a novel search engine using large language models (LLMs).

Full-text-based search engines are great for finding exact matches, but they struggle with natural language queries that don’t match the exact text in the database. Vector-based search engines are better at understanding semantic meaning, but they can be difficult to set up.

Given that language models already have built-in vector representations of text, could it be possible to use a language model to perform searches instead of relying on full-text or vector-based search?

The answer is yes, and the fundamental idea is as follows:

First, for all text you wish to make searchable, ask the language model to produce a list of search terms that are relevant to the text. Accumulate these search terms into a database.
Second, when the user searches for something, ask the language model to produce a list of search terms relevant to the query. Then, look up these search terms in the database and return the results.

I have deployed this alternative search engine strategy at Hallucipedia, and it works surprisingly well. The search results are more accurate than full-text search because it sometimes includes strings not exactly in the text, and it is also significantly easier to implement than a vector-based search engine.

Building a Search Engine with LLMs

Earlier Blog Posts

Introducing Chatvim: Chat with Markdown Files in Neovim

Introducing Artintellica: Open-Source AI Resources for Learning Machine Learning

Why I Switched Back to Node.js from Bun

Why I Switched Back to Zsh from Nushell

Why I Moved Back to California from Texas