I recently had to build a search engine for Hallucipedia. Instead of using the standard techniques, full-text search or vector-based search, I decided to build a novel search engine on top of large language models (LLMs).
Full-text-based search engines are great for finding exact matches, but they struggle with natural language queries that don’t match the exact text in the database. Vector-based search engines are better at understanding semantic meaning, but they can be difficult to set up.
Given that language models already have built-in vector representations of text, could a language model itself perform the search, instead of relying on full-text or vector-based search?
The answer is yes, and the fundamental idea is simple: instead of matching tokens or comparing embedding vectors yourself, hand the user's query to the language model and let it identify the relevant pages directly.
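To make this concrete, here is a minimal sketch of one way such a search could look. It assumes an OpenAI-style chat completions API; the model name, the prompt wording, and the `search_pages` helper are illustrative, not the exact setup used at Hallucipedia. The model is shown the query and the list of page titles and asked to return the most relevant ones.

```python
# Minimal sketch: ask the model to pick relevant page titles for a query.
# Assumes the OpenAI Python SDK; model name, prompt, and helper name are
# illustrative, not the production configuration.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def search_pages(query: str, titles: list[str], top_k: int = 5) -> list[str]:
    """Return up to `top_k` page titles the model judges relevant to `query`."""
    prompt = (
        "You are the search backend for a wiki. Given a user query and a list "
        "of page titles, return a JSON array of the titles most relevant to "
        f"the query, best match first, with at most {top_k} items.\n\n"
        f"Query: {query}\n"
        f"Titles: {json.dumps(titles)}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    try:
        ranked = json.loads(response.choices[0].message.content)
    except json.JSONDecodeError:
        return []  # the model returned something other than JSON; fail soft
    # Keep only titles that actually exist, in case the model invents one.
    return [t for t in ranked if t in set(titles)][:top_k]


# Example usage:
# print(search_pages("ancient greek steam engines", all_page_titles))
```

Filtering the model's answer against the real title list is one way to guard against it returning pages that do not exist.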
I have deployed this alternative search strategy at Hallucipedia, and it works surprisingly well. The results are more accurate than full-text search, because the model can surface pages even when the query's wording does not appear verbatim in the text, and it is also significantly easier to implement than a vector-based search engine.