Building applications with LLMs through composability
Embeddings are vector representations of text that capture semantic meaning. They are essential for semantic search, retrieval-augmented generation (RAG), clustering, classification, and other NLP tasks that require measuring text similarity.
LangChain provides a unified interface for initializing and using embeddings models from multiple providers through the init_embeddings() factory function.
Initialize embeddings models using the init_embeddings() factory function with string identifiers:
def init_embeddings(
    model: str,
    *,
    provider: str | None = None,
    **kwargs: Any,
) -> Embeddings:
    ...
model (str): Model identifier in the format "provider:model-name". Examples: "openai:text-embedding-3-small", "cohere:embed-english-v3.0". Required.
provider (str | None): Override provider detection. Useful when the provider cannot be automatically detected from the model string. Optional.
**kwargs: Provider-specific parameters (API keys, dimensions, batch size, etc.). All extra keyword arguments are passed to the provider's embeddings class.
Returns: Embeddings instance
These parameters are passed as **kwargs to init_embeddings():
api_key (str): API key for the provider (provider-specific, e.g., openai_api_key, cohere_api_key)
dimensions (int): Output embedding dimensions (for models that support configurable dimensions)
batch_size (int): Batch size for embedding multiple documents
timeout (float): Request timeout in seconds
max_retries (int): Maximum number of automatic retry attempts

from langchain.embeddings import init_embeddings
# Initialize OpenAI embeddings
embeddings = init_embeddings("openai:text-embedding-3-small")
# Embed a single query
query_vector = embeddings.embed_query("What is the capital of France?")
print(len(query_vector)) # 1536 dimensions
# Embed multiple documents
documents = [
    "Paris is the capital of France.",
    "London is the capital of England.",
    "Berlin is the capital of Germany.",
]
doc_vectors = embeddings.embed_documents(documents)
print(len(doc_vectors))  # 3
print(len(doc_vectors[0]))  # 1536 dimensions each

from langchain.embeddings import init_embeddings
# Initialize with custom parameters
embeddings = init_embeddings(
    "openai:text-embedding-3-large",
    dimensions=1024,  # Reduce from default 3072
    batch_size=100,
)

# Explicit API key
embeddings = init_embeddings(
    "openai:text-embedding-3-small",
    openai_api_key="sk-...",
)
# Or use environment variable OPENAI_API_KEY
embeddings = init_embeddings("openai:text-embedding-3-small")

LangChain supports multiple embeddings providers through the init_embeddings() function:
Major Providers:
OpenAI (openai): text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002
"openai:text-embedding-3-small", "openai:text-embedding-3-large"
Azure OpenAI (azure_openai): OpenAI embeddings hosted on Azure
"azure_openai:text-embedding-3-small"
Google Vertex AI (google_vertexai): text-embedding-004 and other Vertex AI embeddings
"google_vertexai:text-embedding-004"
Google Generative AI (google_genai): embedding-001 and other Google AI embeddings
"google_genai:embedding-001"
AWS Bedrock (bedrock): Titan embeddings and other Bedrock embeddings
"bedrock:amazon.titan-embed-text-v1"
Cohere (cohere): embed-english-v3.0, embed-multilingual-v3.0
"cohere:embed-english-v3.0", "cohere:embed-multilingual-v3.0"
Mistral AI (mistralai): mistral-embed
"mistralai:mistral-embed"
HuggingFace (huggingface): sentence-transformers and other HuggingFace models
"huggingface:sentence-transformers/all-MiniLM-L6-v2"
Ollama (ollama): Local embeddings models
"ollama:nomic-embed-text", "ollama:mxbai-embed-large"

See Provider Reference for the complete list of supported embeddings providers.
Each provider has its own model naming convention. The general format is "provider:model-name", but the exact model name varies:
# OpenAI
embeddings = init_embeddings("openai:text-embedding-3-small")
embeddings = init_embeddings("openai:text-embedding-3-large")
embeddings = init_embeddings("openai:text-embedding-ada-002") # Legacy
# Cohere
embeddings = init_embeddings("cohere:embed-english-v3.0")
embeddings = init_embeddings("cohere:embed-multilingual-v3.0")
# Google Vertex AI
embeddings = init_embeddings("google_vertexai:text-embedding-004")
# AWS Bedrock
embeddings = init_embeddings("bedrock:amazon.titan-embed-text-v1")
# Local models
embeddings = init_embeddings("ollama:nomic-embed-text")
embeddings = init_embeddings("huggingface:sentence-transformers/all-MiniLM-L6-v2")

The Embeddings class is the base interface for all embeddings models. All models returned by init_embeddings() implement this interface.
class Embeddings:
    """
    Base class for embeddings models.

    All embeddings models support embedding single queries and
    multiple documents, with both synchronous and asynchronous methods.
    """

    def embed_query(self, text: str) -> list[float]: ...
    def embed_documents(self, texts: list[str]) -> list[list[float]]: ...
    async def aembed_query(self, text: str) -> list[float]: ...
    async def aembed_documents(self, texts: list[str]) -> list[list[float]]: ...

Methods:
embed_query(text) - Embed a single query text synchronously. Returns a vector (list of floats).
embed_documents(texts) - Embed multiple documents synchronously. Returns a list of vectors.
aembed_query(text) - Embed a single query text asynchronously. Returns a vector.
aembed_documents(texts) - Embed multiple documents asynchronously. Returns a list of vectors.

Note: The distinction between embed_query and embed_documents is important for some providers. Query embeddings may be optimized differently than document embeddings for retrieval tasks.
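Because the interface is just these four methods, a deterministic stand-in is easy to write for unit tests. The class below is a toy sketch, not a real embedding model and not part of LangChain:

```python
import asyncio
import hashlib

class FakeEmbeddings:
    """Toy stand-in for the Embeddings interface: maps each text to a small
    deterministic vector derived from its hash. Useful for testing pipelines
    without API calls; the vectors carry no semantic meaning."""

    def __init__(self, dimensions: int = 8):
        self.dimensions = dimensions

    def embed_query(self, text: str) -> list[float]:
        # Deterministic: the same text always yields the same vector.
        digest = hashlib.sha256(text.encode()).digest()
        return [b / 255 for b in digest[: self.dimensions]]

    def embed_documents(self, texts: list[str]) -> list[list[float]]:
        return [self.embed_query(t) for t in texts]

    async def aembed_query(self, text: str) -> list[float]:
        return self.embed_query(text)

    async def aembed_documents(self, texts: list[str]) -> list[list[float]]:
        return self.embed_documents(texts)

emb = FakeEmbeddings()
print(len(emb.embed_query("hello")))  # 8
print(asyncio.run(emb.aembed_documents(["a", "b"])) == emb.embed_documents(["a", "b"]))  # True
```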
Embed a single query for search or retrieval:
from langchain.embeddings import init_embeddings
embeddings = init_embeddings("openai:text-embedding-3-small")
# Synchronous
query_vector = embeddings.embed_query("What is machine learning?")
print(f"Vector dimension: {len(query_vector)}")
print(f"First 5 values: {query_vector[:5]}")
# Asynchronous
query_vector = await embeddings.aembed_query("What is machine learning?")

Embed multiple documents for indexing or storage:
from langchain.embeddings import init_embeddings
embeddings = init_embeddings("openai:text-embedding-3-small")
documents = [
    "Machine learning is a subset of artificial intelligence.",
    "Deep learning uses neural networks with multiple layers.",
    "Natural language processing enables computers to understand text.",
]
# Synchronous
doc_vectors = embeddings.embed_documents(documents)
print(f"Embedded {len(doc_vectors)} documents")
print(f"Each vector has {len(doc_vectors[0])} dimensions")
# Asynchronous
doc_vectors = await embeddings.aembed_documents(documents)

Use embeddings to find similar documents:
import numpy as np
from langchain.embeddings import init_embeddings
embeddings = init_embeddings("openai:text-embedding-3-small")
# Document corpus
documents = [
    "The Eiffel Tower is in Paris.",
    "The Colosseum is in Rome.",
    "The Statue of Liberty is in New York.",
    "The Great Wall is in China.",
]
# Embed all documents
doc_vectors = embeddings.embed_documents(documents)
# Embed query
query = "Famous landmark in France"
query_vector = embeddings.embed_query(query)
# Compute cosine similarity
def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Find most similar document
similarities = [
    cosine_similarity(query_vector, doc_vec)
    for doc_vec in doc_vectors
]
best_match_idx = np.argmax(similarities)
print(f"Most similar: {documents[best_match_idx]}")
print(f"Similarity: {similarities[best_match_idx]:.4f}")

Some embedding models support configurable output dimensions:
from langchain.embeddings import init_embeddings
# OpenAI text-embedding-3-* models support dimensions parameter
small_embeddings = init_embeddings(
    "openai:text-embedding-3-small",
    dimensions=512,  # Reduce from default 1536
)
large_embeddings = init_embeddings(
    "openai:text-embedding-3-large",
    dimensions=1024,  # Reduce from default 3072
)
# Smaller dimensions trade off some quality for reduced storage and faster search
vector = small_embeddings.embed_query("Test")
print(len(vector))  # 512

Embed large numbers of documents efficiently:
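For OpenAI's text-embedding-3 models, a reduced dimensions value works by truncating the full vector and re-normalizing it to unit length. The effect can be sketched locally with a random unit vector; this is an illustration of the idea, not the provider's implementation:

```python
import numpy as np

def truncate_embedding(vec: np.ndarray, dims: int) -> np.ndarray:
    # Keep the first `dims` components, then re-normalize to unit length
    # so cosine similarity remains well-behaved.
    shortened = vec[:dims]
    return shortened / np.linalg.norm(shortened)

# Stand-in for a full 3072-dim embedding (providers return unit-normalized vectors).
full = np.random.default_rng(0).normal(size=3072)
full /= np.linalg.norm(full)

short = truncate_embedding(full, 1024)
print(short.shape)  # (1024,)
print(round(float(np.linalg.norm(short)), 6))  # 1.0
```

Smaller vectors lose some retrieval quality but cut storage and similarity-search cost roughly in proportion to the dimension count.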
from langchain.embeddings import init_embeddings
embeddings = init_embeddings(
    "openai:text-embedding-3-small",
    batch_size=100,  # Process 100 at a time
)
# Large document collection
large_corpus = [f"Document {i}" for i in range(1000)]
# Embed in batches
vectors = embeddings.embed_documents(large_corpus)

Different providers require different authentication:
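A batch_size like this typically means the client splits the corpus into consecutive chunks and issues one request per chunk. The chunking itself is simple to sketch; chunked is a hypothetical helper for illustration, not LangChain API:

```python
def chunked(items: list[str], batch_size: int) -> list[list[str]]:
    # Split a corpus into consecutive batches of at most `batch_size` items.
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

corpus = [f"Document {i}" for i in range(1000)]
batches = chunked(corpus, 100)
print(len(batches))     # 10
print(len(batches[0]))  # 100
```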
# OpenAI (uses OPENAI_API_KEY environment variable or parameter)
embeddings = init_embeddings("openai:text-embedding-3-small", openai_api_key="sk-...")
# Cohere (uses COHERE_API_KEY environment variable or parameter)
embeddings = init_embeddings("cohere:embed-english-v3.0", cohere_api_key="...")
# AWS Bedrock (uses AWS credentials from environment/IAM)
embeddings = init_embeddings("bedrock:amazon.titan-embed-text-v1")
# Azure OpenAI
embeddings = init_embeddings(
    "azure_openai:text-embedding-3-small",
    azure_deployment="my-embedding-deployment",
    azure_endpoint="https://my-resource.openai.azure.com/",
    api_key="...",
)
# Ollama (local, no authentication)
embeddings = init_embeddings("ollama:nomic-embed-text")
# HuggingFace (local or API)
embeddings = init_embeddings("huggingface:sentence-transformers/all-MiniLM-L6-v2")

Use multilingual models for cross-language retrieval:
from langchain.embeddings import init_embeddings
# Cohere multilingual embeddings
embeddings = init_embeddings("cohere:embed-multilingual-v3.0")
# Embed documents in different languages
documents = [
    "Hello, how are you?",            # English
    "Bonjour, comment allez-vous?",   # French
    "Hola, ¿cómo estás?",             # Spanish
]
doc_vectors = embeddings.embed_documents(documents)
# Query in any language
query = "greeting someone"
query_vector = embeddings.embed_query(query)

Use local embeddings models for privacy or offline usage:
from langchain.embeddings import init_embeddings
# Ollama (requires Ollama running locally)
embeddings = init_embeddings("ollama:nomic-embed-text")
# HuggingFace (downloads model locally on first use)
embeddings = init_embeddings("huggingface:sentence-transformers/all-MiniLM-L6-v2")
# No API calls or network requests needed
vector = embeddings.embed_query("This runs completely locally")

Different models have different characteristics:
from langchain.embeddings import init_embeddings
# Small, fast, cost-effective
small = init_embeddings("openai:text-embedding-3-small") # 1536 dims
# Large, high quality, more expensive
large = init_embeddings("openai:text-embedding-3-large") # 3072 dims
# Free, local, privacy-friendly
local = init_embeddings("ollama:nomic-embed-text")
# Multilingual
multilingual = init_embeddings("cohere:embed-multilingual-v3.0")
# Test query
query = "What is artificial intelligence?"
# Compare dimensions
print(f"Small: {len(small.embed_query(query))} dims")
print(f"Large: {len(large.embed_query(query))} dims")
print(f"Local: {len(local.embed_query(query))} dims")

from typing import Any
from langchain_core.embeddings import Embeddings

Install with Tessl CLI
npx tessl i tessl/pypi-langchain@1.2.1