You can now create document embeddings using Ollama. Once the embeddings are created, you can also store them in a vector database; you can read this article, where I go over how to do that.
import numpy as np
from langchain_community.embeddings import OllamaEmbeddings

ollama_emb = OllamaEmbeddings(
    model="mistral",
)

# Embed the three reference documents
r1 = ollama_emb.embed_documents(
    [
        "Alpha is the first letter of Greek alphabet",
        "Beta is the second letter of Greek alphabet",
        "This is a random sentence",
    ]
)

# Embed the query
r2 = ollama_emb.embed_query(
    "What is the second letter of Greek alphabet"
)
Let’s inspect the array shapes:
print(np.array(r1).shape)
>>> (3, 4096)
print(np.array(r2).shape)
>>> (4096,)
Now we can find the cosine similarity between the query vector and each document vector:
from sklearn.metrics.pairwise import cosine_similarity

cosine_similarity(np.array(r1), np.array(r2).reshape(1, -1))
>>> array([[0.62087283],
           [0.65085897],
           [0.36985642]])
Here we can clearly see that the second of our three reference documents is the closest to the question. In the same way, you can create embeddings from your own text documents, store them in a vector store, and query them later using Ollama and LangChain; a minimal sketch of that workflow follows.
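The sketch below uses the in-memory Chroma vector store from langchain_community (it assumes the chromadb package is installed); any other LangChain vector store would work the same way. The embeddings are computed by the same OllamaEmbeddings instance as above:

from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma

ollama_emb = OllamaEmbeddings(model="mistral")

# Build an in-memory Chroma store from the same three reference sentences
texts = [
    "Alpha is the first letter of Greek alphabet",
    "Beta is the second letter of Greek alphabet",
    "This is a random sentence",
]
vectorstore = Chroma.from_texts(texts, embedding=ollama_emb)

# Retrieve the single document most similar to the query
results = vectorstore.similarity_search(
    "What is the second letter of Greek alphabet", k=1
)
print(results[0].page_content)
# Expected to return the "Beta" sentence, matching the cosine-similarity result above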