You can now create document embeddings using Ollama, and once the embeddings are created, you can store them in a vector database. You can read this article where I go over how to do that. Note that the code below assumes the Ollama server is running locally and that the mistral model has already been pulled.
from langchain_community.embeddings import OllamaEmbeddings

ollama_emb = OllamaEmbeddings(
    model="mistral",
)

# Embed three reference documents
r1 = ollama_emb.embed_documents(
    [
        "Alpha is the first letter of Greek alphabet",
        "Beta is the second letter of Greek alphabet",
        "This is a random sentence",
    ]
)

# Embed the question we want to match against the documents
r2 = ollama_emb.embed_query(
    "What is the second letter of Greek alphabet"
)
Let’s inspect the array shapes:

import numpy as np

print(np.array(r1).shape)
>>> (3, 4096)
print(np.array(r2).shape)
>>> (4096,)

So r1 holds one 4096-dimensional embedding per document, while r2 is a single 4096-dimensional query embedding.
We can now compute the cosine similarity between each document vector and the query vector:
from sklearn.metrics.pairwise import cosine_similarity

# Similarity of each document embedding to the query embedding
cosine_similarity(np.array(r1), np.array(r2).reshape(1, -1))
>>> array([[0.62087283],
           [0.65085897],
           [0.36985642]])
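As a quick sanity check, the same numbers can be reproduced with plain NumPy, since cosine similarity is just the dot product of two vectors divided by the product of their norms (the variable names below are my own, purely illustrative):

docs = np.array(r1)    # shape (3, 4096)
query = np.array(r2)   # shape (4096,)

# cosine(a, b) = (a . b) / (||a|| * ||b||)
sims = docs @ query / (np.linalg.norm(docs, axis=1) * np.linalg.norm(query))
print(sims)
>>> [0.62087283 0.65085897 0.36985642]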
Either way, the scores clearly show that the second of our three reference documents is closest to the question. In the same way, you can create embeddings from your own text documents, store them in a vector database, and later query them using Ollama and LangChain.
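As a minimal sketch of that workflow, here is one way to do it with Chroma as the vector store. Chroma is just an example choice on my part (it requires pip install chromadb); any vector store that LangChain supports would work the same way:

from langchain_community.vectorstores import Chroma

texts = [
    "Alpha is the first letter of Greek alphabet",
    "Beta is the second letter of Greek alphabet",
    "This is a random sentence",
]

# Embed the texts with Ollama and store the vectors in a local Chroma index
vectorstore = Chroma.from_texts(texts, embedding=ollama_emb)

# Retrieve the document most similar to the question
results = vectorstore.similarity_search(
    "What is the second letter of Greek alphabet", k=1
)
print(results[0].page_content)
>>> Beta is the second letter of Greek alphabet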