We saw interesting developments around vector databases, but then the hype died down once it became clear you could just store the vectors in ordinary databases without much real difference. I wonder what will happen when the models can freely access them, though.
I really don't understand how people figure out which vectors to actually store in these databases, regardless of the underlying storage model.
Isn't that itself the province of an LLM? Say I have a bunch of text. How do I store it so I can search it "by similarity"? I remember semantic search with Sphinx being hard, and Facebook had Faiss. And now we're supposed to just save vectors on commodity hardware BEFORE using an LLM?
1. Take a bunch of text and run it through an LLM in embedding mode; the LLM turns the text into a vector. If the text is longer than the model's context window, chunk it first.
2. Store the vector in a vector DB.
3. Use the LLM to generate a vector of your question.
4. Query the vector DB for the most similar vectors (as many as will fit in the context window).
5. Get the text behind those vectors and concatenate it with the question from step 3.
6. Step 5 is your prompt. The LLM can now answer your question with a collection of similar/relevant text already sitting in its context window alongside the question. (A rough sketch of the whole flow follows below.)
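A minimal sketch of those steps, assuming sentence-transformers for the embeddings and FAISS as the "vector DB" (any embedding model and vector store would do); the model name, chunk size, and corpus here are placeholders, not a recommendation:

```python
# Minimal RAG sketch: embed text chunks, index them, retrieve the nearest
# chunks for a question, and build a prompt for an LLM.
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # any embedding model works

def chunk(text, size=500):
    # Step 1: naive fixed-size chunking so each piece fits the model's input limit.
    return [text[i:i + size] for i in range(0, len(text), size)]

docs = ["...your corpus of text...", "...more of your text..."]
chunks = [c for d in docs for c in chunk(d)]

# Steps 1-2: embed every chunk and store the vectors in a FAISS index.
vecs = model.encode(chunks, normalize_embeddings=True)
index = faiss.IndexFlatIP(vecs.shape[1])  # inner product == cosine on normalized vectors
index.add(np.asarray(vecs, dtype="float32"))

# Steps 3-4: embed the question and pull the k most similar chunks.
question = "What does the corpus say about X?"
q_vec = model.encode([question], normalize_embeddings=True)
k = min(3, index.ntotal)
_, ids = index.search(np.asarray(q_vec, dtype="float32"), k)

# Steps 5-6: concatenate the retrieved text with the question; that's the prompt.
context = "\n\n".join(chunks[i] for i in ids[0])
prompt = f"Answer using the context below.\n\nContext:\n{context}\n\nQuestion: {question}"
print(prompt)  # send this to your LLM of choice
```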
You don't even need an LLM. You can use Word2Vec, or even yank the embedding matrix from the bottom layer of an LLM. For images there's CLIP and BLIP, and there are similar models (like CLAP) for audio.
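For the no-LLM route, a rough sketch using gensim's Word2Vec and averaging word vectors into a crude document vector (the toy corpus and parameters are just placeholders):

```python
# Sentence vectors without an LLM: average Word2Vec word vectors per sentence.
import numpy as np
from gensim.models import Word2Vec

# Toy corpus; in practice you'd train on (or load vectors from) something much larger.
corpus = [
    "vector databases store embeddings".split(),
    "embeddings capture text similarity".split(),
    "llms answer questions from context".split(),
]
w2v = Word2Vec(sentences=corpus, vector_size=50, min_count=1, epochs=100)

def embed(tokens):
    # Average the word vectors we have; crude, but enough for a similarity search.
    vecs = [w2v.wv[t] for t in tokens if t in w2v.wv]
    return np.mean(vecs, axis=0)

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

query = embed("text similarity search".split())
ranked = sorted(range(len(corpus)), key=lambda i: -cosine(query, embed(corpus[i])))
print(ranked)  # sentence indices, most similar to the query first
```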