We have already covered two of the building blocks of a RAG pipeline in LangChain: document loaders and text splitters. In this article, we'll explore how LangChain embeddings transform raw text into meaningful vectors that capture its semantic essence.
Embeddings in LangChain
In LangChain, embeddings are numerical representations of text that capture the inherent semantic meaning. This enables machines to perform semantic search, where comparisons are driven by meaning and concepts rather than mere keyword matches.
To create such embeddings, embedding models (like OpenAIEmbeddings, GoogleGenerativeAIEmbeddings, OllamaEmbeddings) are used. These models transform raw text, such as a sentence, paragraph, or tweet, into a fixed-length vector of numbers that captures its semantic meaning.
What is semantic meaning
Now, the first question is what exactly is this "semantic meaning"? Consider the following four sentences.
- I am running to the market.
- I am heading to the market in a hurry.
- I am on my way to the market.
- I am rushing off to the market.
If you notice, all four sentences convey the same meaning: a sense of motion and urgency. So, in terms of embeddings, each version would produce vectors that sit close together in semantic space, since they all express the same core intent: you're moving toward the market.
This closeness is exactly what makes semantic search powerful: queries with slightly different wording but similar meaning will retrieve the same or related results.
How does an embedding model work
Let’s break down how an embedding model transforms raw text into vectors that capture its meaning, taking the simple raw text "I am running to the market" as an example.
- Text input
- Tokenization
- Mapping tokens to IDs
- Embedding lookup
You start with the raw text: "I am running to the market".
The text is split into smaller units (tokens). Depending on the embedding model, these could be whole words (I, am, running, ...) or subwords ("run", "ning").
For example, the produced tokens may look like this: ["I", "am", "run", "ning", "to", "the", "market"]
Each token is mapped to a unique integer ID using the model’s pre-defined vocabulary.
For example, "I" → 101, "am" → 202, "run" → 305, "ning" → 402, and so on.
This ID acts as an index into the model's embedding matrix.
Each token ID is mapped to a dense vector from the model’s embedding matrix. These vectors are usually high-dimensional (e.g., 768 or 1536 dimensions).
So each token gets its own full-dimensional vector. For our example tokens, we'll have vectors \(v_{\text{I}}\), \(v_{\text{am}}\), \(v_{\text{run}}\), and so on.
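The tokenization, ID mapping, and embedding lookup steps can be sketched with a toy vocabulary and a tiny embedding matrix. The token IDs and 4-dimensional vectors below are invented purely for illustration; real models use learned subword vocabularies of tens of thousands of entries and vectors with hundreds of dimensions.

```python
# Toy illustration of tokens -> token IDs -> embedding lookup.
# Vocabulary, IDs, and 4-dimensional vectors are made up for illustration.

vocab = {"I": 101, "am": 202, "run": 305, "ning": 402,
         "to": 118, "the": 119, "market": 523}

# Tiny embedding "matrix": one fixed-length dense vector per token ID
embedding_matrix = {
    101: [0.1, 0.3, -0.2, 0.5],
    202: [0.0, 0.1, 0.4, -0.1],
    305: [0.7, -0.2, 0.1, 0.3],
    402: [0.2, 0.2, 0.0, 0.1],
    118: [-0.1, 0.0, 0.2, 0.2],
    119: [0.0, -0.1, 0.1, 0.0],
    523: [0.4, 0.6, -0.3, 0.2],
}

tokens = ["I", "am", "run", "ning", "to", "the", "market"]

# Map each token to its integer ID via the vocabulary
token_ids = [vocab[t] for t in tokens]
print(token_ids)  # [101, 202, 305, 402, 118, 119, 523]

# Look up the dense vector for each ID in the embedding matrix
token_vectors = [embedding_matrix[i] for i in token_ids]
print(token_vectors[0])  # vector for "I": [0.1, 0.3, -0.2, 0.5]
```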
These vectors already exist in the model; the magic lies in how they were trained. During training:
- Words that appear in similar contexts get vectors that are close together.
- Relationships between words are encoded as vector arithmetic.
Here is a simple program to show the embedding using GoogleGenerativeAIEmbeddings.

from langchain_google_genai import GoogleGenerativeAIEmbeddings
from dotenv import load_dotenv

load_dotenv()

embeddings = GoogleGenerativeAIEmbeddings(model="gemini-embedding-2-preview")
query = "I am running to the market"
vector = embeddings.embed_query(query)

# vector dimensions
print(len(vector))
# first 10 values
print(vector[:10])
Output
3072 [0.013388703, -0.0026265276, -0.0013064864, 0.013196219, -0.0071006925, 0.0008229259, -0.009015757, 0.00064084254, 0.005457073, -0.0643481]
Note that models don't return separate vectors for each word. The model processes the entire sentence and produces one unified vector that represents the meaning of the whole sentence. That is the Pooling / Final Representation step in the embedding model that combines token-level embeddings into a single sentence-level embedding.
Embedding models give you a ready-to-use representation of the entire query, because the embeddings API is designed for semantic search and comparison at the sentence or document level.
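The pooling step can be sketched as a simple mean over token-level vectors. Mean pooling is just one common strategy (some models instead use a special token's vector or weighted pooling), and the token vectors below are made-up numbers for illustration.

```python
# Mean pooling: combine token-level vectors into one sentence-level vector.
# The token vectors below are invented 4-dimensional examples.

token_vectors = [
    [0.1, 0.3, -0.2, 0.5],   # "I"
    [0.0, 0.1, 0.4, -0.1],   # "am"
    [0.7, -0.2, 0.1, 0.3],   # "run"
]

dim = len(token_vectors[0])
# Average each dimension across all token vectors
sentence_vector = [
    sum(vec[d] for vec in token_vectors) / len(token_vectors)
    for d in range(dim)
]
print(sentence_vector)  # one 4-dimensional vector for the whole input
```

The result has the same dimensionality as the token vectors, which is why `embed_query` returns a single fixed-length vector regardless of how long the input text is.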
How does semantic meaning emerge
Words that appear in similar contexts get vectors that are close together. Let's take the often used example of "king", "queen", "man", "woman".
\[v_{\text{king}} - v_{\text{man}} + v_{\text{woman}} \approx v_{\text{queen}}\]
The model has already learnt the concept of royalty, which is encoded into the vector for "king". "Man" is, well, just a common man!
When we compute \(v_{\text{king}} - v_{\text{man}}\), the difference is the concept of royalty.
When the vector for "woman" is added to it, the result lands near the vector for "queen", since the concept of royalty is already encoded into the vector for "queen".
Similar analogies hold for geography (Paris – France + Italy \(\approx\) Rome) or verb tense (walk – walking + running \(\approx\) run), showing that embeddings capture a wide range of semantic relationships.
Here’s a simple geometric visualization of how embeddings capture meaning with king, queen, man, woman. Imagine a 2D plane where one axis represents gender and the other represents royalty.
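That 2D picture can be made concrete with toy coordinates, one axis for gender and one for royalty. The numbers here are invented purely to illustrate the arithmetic; real embeddings spread these concepts across hundreds of dimensions.

```python
# Toy 2D embeddings: axis 0 = gender (0 = male, 1 = female),
# axis 1 = royalty (0 = commoner, 1 = royal). Coordinates are invented.
vectors = {
    "king":  [0.0, 1.0],
    "queen": [1.0, 1.0],
    "man":   [0.0, 0.0],
    "woman": [1.0, 0.0],
}

# king - man + woman, dimension by dimension
result = [k - m + w for k, m, w in
          zip(vectors["king"], vectors["man"], vectors["woman"])]
print(result)  # [1.0, 1.0] -- exactly the point for "queen"
```

Subtracting "man" from "king" cancels the gender component and leaves royalty; adding "woman" puts the gender component back, landing on "queen".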
Semantic proximity
If two sentences share nearly identical structure and meaning, for example:
- I am running to the market
- I am walking to the market
As you can see, both sentences share:
- Subject: “I”
- Verb: movement toward a destination
- Object: “the market”
Since both verbs describe locomotion, their embeddings sit near each other in the model’s learned space. That is the concept of semantic proximity.
The distance between these two vectors would be very small; equivalently, their cosine similarity would be high (provided they are embedded using the same model).
Metrics for comparing embeddings
Several metrics are commonly used to compare embeddings:
- Cosine similarity: measures the angle between two vectors; higher means more similar.
- Euclidean distance: measures the straight-line distance between two points; lower means more similar.
- Dot product: measures how much one vector projects onto another.
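The three metrics can be computed in a few lines of pure Python on two small made-up vectors. The vectors are chosen so that they point in the same direction but differ in length, which highlights the difference between the metrics.

```python
import math

v1 = [1.0, 2.0, 3.0]
v2 = [2.0, 4.0, 6.0]   # same direction as v1, twice the length

# Dot product: sum of element-wise products
dot = sum(a * b for a, b in zip(v1, v2))

# Cosine similarity: dot product normalized by both vector lengths
norm1 = math.sqrt(sum(a * a for a in v1))
norm2 = math.sqrt(sum(b * b for b in v2))
cosine = dot / (norm1 * norm2)

# Euclidean distance: straight-line distance between the two points
euclidean = math.sqrt(sum((a - b) ** 2 for a, b in zip(v1, v2)))

print(round(cosine, 4))     # 1.0 -- identical direction
print(round(euclidean, 4))  # 3.7417 -- yet the points are far apart
print(dot)                  # 28.0
```

Note how cosine similarity is 1.0 even though the Euclidean distance is large: cosine compares only direction, ignoring magnitude, which is one reason it is the most common choice for comparing text embeddings.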
We can check this programmatically in LangChain using numpy to calculate cosine similarity.
from langchain_ollama import OllamaEmbeddings
import numpy as np
embeddings = OllamaEmbeddings(model="nomic-embed-text")
v1 = np.array(embeddings.embed_query("I am running to the market"))
v2 = np.array(embeddings.embed_query("I am walking to the market"))
# cosine similarity using numpy
cos_sim = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
print("Similarity is", cos_sim)
Output
Similarity is 0.82145
LangChain Embedding Interface
LangChain provides a standard interface for text embedding models (like OpenAIEmbeddings, GoogleGenerativeAIEmbeddings, OllamaEmbeddings) through the Embeddings interface.
Two main methods are:
- embed_documents(texts: List[str]): Embeds a list of documents. Returns a List[List[float]]
- embed_query(text: str): Embeds a single query. Returns a List[float]
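To see the shape of that interface without calling a real model, here is a toy class with the same two methods. The hash-based "embedding" is a stand-in of my own with no semantic meaning; a real custom implementation would subclass LangChain's `Embeddings` base class and call an actual model.

```python
# A toy class mirroring the shape of LangChain's Embeddings interface.
# The hash-based vectors are meaningless stand-ins for a real model.
from typing import List

class ToyEmbeddings:
    def embed_query(self, text: str) -> List[float]:
        # Produce one fixed-length (4-dim) pseudo-vector per input string
        return [((hash(text) >> (8 * i)) % 100) / 100 for i in range(4)]

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # One vector per document: List[List[float]]
        return [self.embed_query(t) for t in texts]

emb = ToyEmbeddings()
print(len(emb.embed_query("hello")))          # 4 (a single List[float])
print(len(emb.embed_documents(["a", "b"])))   # 2 (one vector per document)
```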
What is the next step?
You can now store embeddings, which are high-dimensional numerical representations of data, in a vector database (like Pinecone, FAISS, Weaviate, ChromaDB) for semantic search or similarity matching.
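A vector database's core job, store vectors and return the ones nearest to a query vector, can be sketched in pure Python. The stored texts and vectors below are made up; real stores like FAISS or ChromaDB add index structures so this lookup stays fast over millions of vectors.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

# Toy "vector store": texts paired with made-up embedding vectors
store = [
    ("I am running to the market", [0.9, 0.1, 0.3]),
    ("The cat sat on the mat",     [0.1, 0.8, 0.2]),
    ("I am walking to the market", [0.85, 0.15, 0.35]),
]

# Pretend embedding of the query "heading to the market"
query_vector = [0.88, 0.12, 0.32]

# Rank all stored texts by similarity to the query vector
ranked = sorted(store, key=lambda item: cosine(query_vector, item[1]),
                reverse=True)
print(ranked[0][0])  # the most semantically similar stored text
```

Both market sentences rank above the unrelated cat sentence, which is exactly the behavior a semantic search over a real vector database gives you.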
That's all for this topic Embeddings in LangChain With Examples. If you have any doubt or any suggestions to make please drop a comment. Thanks!