tessl/pypi-langchain

Building applications with LLMs through composability


docs/core/embeddings.md

Embeddings

Embeddings are vector representations of text that capture semantic meaning. They are essential for semantic search, retrieval-augmented generation (RAG), clustering, classification, and other NLP tasks that require measuring text similarity.

LangChain provides a unified interface for initializing and using embeddings models from multiple providers through the init_embeddings() factory function.

Embeddings Initialization

Initialize embeddings models using the init_embeddings() factory function with string identifiers:

def init_embeddings(
    model: str,
    *,
    provider: str | None = None,
    **kwargs: Any
) -> Embeddings

Parameters:

  • model (str): Model identifier in format "provider:model-name". Examples: "openai:text-embedding-3-small", "cohere:embed-english-v3.0". Required.
  • provider (str | None): Override provider detection. Useful when the provider cannot be automatically detected from the model string. Optional.
  • **kwargs: Provider-specific parameters (API keys, dimensions, batch size, etc.). All extra keyword arguments are passed to the provider's embeddings class.

Returns: Embeddings instance

Common Embeddings Parameters

These parameters are passed as **kwargs to init_embeddings():

  • api_key (str): API key for the provider (provider-specific, e.g., openai_api_key, cohere_api_key)
  • dimensions (int): Output embedding dimensions (for models that support configurable dimensions)
  • batch_size (int): Batch size for embedding multiple documents
  • timeout (float): Request timeout in seconds
  • max_retries (int): Maximum number of automatic retry attempts

Basic Embeddings Usage

from langchain.embeddings import init_embeddings

# Initialize OpenAI embeddings
embeddings = init_embeddings("openai:text-embedding-3-small")

# Embed a single query
query_vector = embeddings.embed_query("What is the capital of France?")
print(len(query_vector))  # 1536 dimensions

# Embed multiple documents
documents = [
    "Paris is the capital of France.",
    "London is the capital of England.",
    "Berlin is the capital of Germany."
]
doc_vectors = embeddings.embed_documents(documents)
print(len(doc_vectors))  # 3
print(len(doc_vectors[0]))  # 1536 dimensions each

Embeddings with Configuration

from langchain.embeddings import init_embeddings

# Initialize with custom parameters
embeddings = init_embeddings(
    "openai:text-embedding-3-large",
    dimensions=1024,  # Reduce from default 3072
    batch_size=100
)

Embeddings Authentication

# Explicit API key
embeddings = init_embeddings(
    "openai:text-embedding-3-small",
    openai_api_key="sk-..."
)

# Or use environment variable OPENAI_API_KEY
embeddings = init_embeddings("openai:text-embedding-3-small")

Embeddings Providers

LangChain supports multiple embeddings providers through the init_embeddings() function:

Major Providers:

  • OpenAI (openai): text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002

    • Examples: "openai:text-embedding-3-small", "openai:text-embedding-3-large"
  • Azure OpenAI (azure_openai): OpenAI embeddings hosted on Azure

    • Examples: "azure_openai:text-embedding-3-small"
  • Google Vertex AI (google_vertexai): text-embedding-004 and other Vertex AI embeddings

    • Examples: "google_vertexai:text-embedding-004"
  • Google Generative AI (google_genai): embedding-001 and other Google AI embeddings

    • Examples: "google_genai:embedding-001"
  • AWS Bedrock (bedrock): Titan embeddings and other Bedrock embeddings

    • Examples: "bedrock:amazon.titan-embed-text-v1"
  • Cohere (cohere): embed-english-v3.0, embed-multilingual-v3.0

    • Examples: "cohere:embed-english-v3.0", "cohere:embed-multilingual-v3.0"
  • Mistral AI (mistralai): mistral-embed

    • Examples: "mistralai:mistral-embed"
  • HuggingFace (huggingface): sentence-transformers and other HuggingFace models

    • Examples: "huggingface:sentence-transformers/all-MiniLM-L6-v2"
  • Ollama (ollama): Local embeddings models

    • Examples: "ollama:nomic-embed-text", "ollama:mxbai-embed-large"

See Provider Reference for the complete list of supported embeddings providers.

Embeddings Provider Examples

Each provider has its own model naming convention. The general format is "provider:model-name", but the exact model name varies:

# OpenAI
embeddings = init_embeddings("openai:text-embedding-3-small")
embeddings = init_embeddings("openai:text-embedding-3-large")
embeddings = init_embeddings("openai:text-embedding-ada-002")  # Legacy

# Cohere
embeddings = init_embeddings("cohere:embed-english-v3.0")
embeddings = init_embeddings("cohere:embed-multilingual-v3.0")

# Google Vertex AI
embeddings = init_embeddings("google_vertexai:text-embedding-004")

# AWS Bedrock
embeddings = init_embeddings("bedrock:amazon.titan-embed-text-v1")

# Local models
embeddings = init_embeddings("ollama:nomic-embed-text")
embeddings = init_embeddings("huggingface:sentence-transformers/all-MiniLM-L6-v2")

Embeddings Interface

The Embeddings class is the base interface for all embeddings models. All models returned by init_embeddings() implement this interface.

class Embeddings:
    """
    Base class for embeddings models.

    All embeddings models support embedding single queries and
    multiple documents, with both synchronous and asynchronous methods.
    """

    def embed_query(self, text: str) -> list[float]: ...

    def embed_documents(self, texts: list[str]) -> list[list[float]]: ...

    async def aembed_query(self, text: str) -> list[float]: ...

    async def aembed_documents(self, texts: list[str]) -> list[list[float]]: ...

Methods:

  • embed_query(text) - Embed a single query text synchronously. Returns a vector (list of floats).
  • embed_documents(texts) - Embed multiple documents synchronously. Returns a list of vectors.
  • aembed_query(text) - Embed a single query text asynchronously. Returns a vector.
  • aembed_documents(texts) - Embed multiple documents asynchronously. Returns a list of vectors.

Note: The distinction between embed_query and embed_documents is important for some providers. Query embeddings may be optimized differently than document embeddings for retrieval tasks.
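
To make the interface shape concrete, here is a toy implementation that produces deterministic vectors without calling any provider. It duck-types the four methods rather than subclassing the real `langchain_core` base class, so treat it as an illustrative sketch, not a drop-in replacement:

```python
import hashlib

class ToyEmbeddings:
    """Deterministic stand-in that mimics the Embeddings method shape."""

    def __init__(self, dimensions: int = 8):
        self.dimensions = dimensions

    def _vector(self, text: str) -> list[float]:
        # Hash the text and scale the digest bytes into [0, 1] floats.
        digest = hashlib.sha256(text.encode()).digest()
        return [digest[i % len(digest)] / 255.0 for i in range(self.dimensions)]

    def embed_query(self, text: str) -> list[float]:
        return self._vector(text)

    def embed_documents(self, texts: list[str]) -> list[list[float]]:
        return [self._vector(t) for t in texts]

    async def aembed_query(self, text: str) -> list[float]:
        return self.embed_query(text)

    async def aembed_documents(self, texts: list[str]) -> list[list[float]]:
        return self.embed_documents(texts)

emb = ToyEmbeddings(dimensions=8)
print(len(emb.embed_query("hello")))         # 8
print(len(emb.embed_documents(["a", "b"])))  # 2
```

Because the vectors are hash-derived, the same text always maps to the same vector, which is handy for testing retrieval pipelines without an API key.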

Embedding a Query

Embed a single query for search or retrieval:

from langchain.embeddings import init_embeddings

embeddings = init_embeddings("openai:text-embedding-3-small")

# Synchronous
query_vector = embeddings.embed_query("What is machine learning?")
print(f"Vector dimension: {len(query_vector)}")
print(f"First 5 values: {query_vector[:5]}")

# Asynchronous (must run inside an async function / event loop)
query_vector = await embeddings.aembed_query("What is machine learning?")

Embedding Documents

Embed multiple documents for indexing or storage:

from langchain.embeddings import init_embeddings

embeddings = init_embeddings("openai:text-embedding-3-small")

documents = [
    "Machine learning is a subset of artificial intelligence.",
    "Deep learning uses neural networks with multiple layers.",
    "Natural language processing enables computers to understand text."
]

# Synchronous
doc_vectors = embeddings.embed_documents(documents)
print(f"Embedded {len(doc_vectors)} documents")
print(f"Each vector has {len(doc_vectors[0])} dimensions")

# Asynchronous (must run inside an async function / event loop)
doc_vectors = await embeddings.aembed_documents(documents)

Semantic Search Example

Use embeddings to find similar documents:

import numpy as np
from langchain.embeddings import init_embeddings

embeddings = init_embeddings("openai:text-embedding-3-small")

# Document corpus
documents = [
    "The Eiffel Tower is in Paris.",
    "The Colosseum is in Rome.",
    "The Statue of Liberty is in New York.",
    "The Great Wall is in China."
]

# Embed all documents
doc_vectors = embeddings.embed_documents(documents)

# Embed query
query = "Famous landmark in France"
query_vector = embeddings.embed_query(query)

# Compute cosine similarity
def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Find most similar document
similarities = [
    cosine_similarity(query_vector, doc_vec)
    for doc_vec in doc_vectors
]

best_match_idx = np.argmax(similarities)
print(f"Most similar: {documents[best_match_idx]}")
print(f"Similarity: {similarities[best_match_idx]:.4f}")
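
The same idea extends to returning the top-k matches rather than just the best one. A sketch using `np.argsort` over a similarity array (the vectors here are synthetic, so it runs without a provider):

```python
import numpy as np

def top_k(query_vec, doc_vecs, k=2):
    """Return indices of the k most similar vectors by cosine similarity."""
    q = np.asarray(query_vec, dtype=float)
    d = np.asarray(doc_vecs, dtype=float)
    sims = d @ q / (np.linalg.norm(d, axis=1) * np.linalg.norm(q))
    # argsort is ascending; reverse, then take the first k indices
    return np.argsort(sims)[::-1][:k]

# Synthetic vectors: doc 0 points the same way as the query
query = [1.0, 0.0]
docs = [[2.0, 0.1], [0.0, 1.0], [-1.0, 0.0]]
print(top_k(query, docs, k=2))  # [0 1]
```

For large corpora you would typically hand this work to a vector store with an approximate-nearest-neighbor index instead of brute-force scoring.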

Configurable Dimensions

Some embedding models support configurable output dimensions:

from langchain.embeddings import init_embeddings

# OpenAI text-embedding-3-* models support dimensions parameter
small_embeddings = init_embeddings(
    "openai:text-embedding-3-small",
    dimensions=512  # Reduce from default 1536
)

large_embeddings = init_embeddings(
    "openai:text-embedding-3-large",
    dimensions=1024  # Reduce from default 3072
)

# Smaller dimensions trade off some quality for reduced storage and faster search
vector = small_embeddings.embed_query("Test")
print(len(vector))  # 512
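
The storage side of this trade-off is easy to estimate: vectors are commonly stored as 4-byte float32 values, so halving the dimensions halves the raw index size. A back-of-the-envelope helper (the float32 assumption is ours; some stores quantize further):

```python
def index_size_mb(num_vectors: int, dimensions: int, bytes_per_value: int = 4) -> float:
    """Approximate raw storage for a vector index, assuming float32 values."""
    return num_vectors * dimensions * bytes_per_value / (1024 ** 2)

# One million vectors at two dimension settings
print(f"{index_size_mb(1_000_000, 1536):.0f} MB")  # 5859 MB
print(f"{index_size_mb(1_000_000, 512):.0f} MB")   # 1953 MB
```
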

Batch Processing

Embed large numbers of documents efficiently:

from langchain.embeddings import init_embeddings

embeddings = init_embeddings(
    "openai:text-embedding-3-small",
    batch_size=100  # Process 100 at a time
)

# Large document collection
large_corpus = [f"Document {i}" for i in range(1000)]

# Embed in batches
vectors = embeddings.embed_documents(large_corpus)
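
Provider classes generally handle batching internally, but if you need explicit control (for rate limiting, progress reporting, or retrying a single failed batch), the chunking logic is simple. In this sketch the inner call is a stand-in for a real `embeddings.embed_documents(batch)` call:

```python
def chunked(items: list, size: int) -> list[list]:
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def embed_in_batches(texts: list[str], batch_size: int = 100) -> list[list[float]]:
    vectors = []
    for batch in chunked(texts, batch_size):
        # Stand-in for: vectors.extend(embeddings.embed_documents(batch))
        vectors.extend([[float(len(t))] for t in batch])
    return vectors

corpus = [f"Document {i}" for i in range(250)]
print(len(chunked(corpus, 100)))      # 3 batches
print(len(embed_in_batches(corpus)))  # 250 vectors
```
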

Authentication for Embeddings

Different providers require different authentication:

# OpenAI (uses OPENAI_API_KEY environment variable or parameter)
embeddings = init_embeddings("openai:text-embedding-3-small", openai_api_key="sk-...")

# Cohere (uses COHERE_API_KEY environment variable or parameter)
embeddings = init_embeddings("cohere:embed-english-v3.0", cohere_api_key="...")

# AWS Bedrock (uses AWS credentials from environment/IAM)
embeddings = init_embeddings("bedrock:amazon.titan-embed-text-v1")

# Azure OpenAI
embeddings = init_embeddings(
    "azure_openai:text-embedding-3-small",
    azure_deployment="my-embedding-deployment",
    azure_endpoint="https://my-resource.openai.azure.com/",
    api_key="..."
)

# Ollama (local, no authentication)
embeddings = init_embeddings("ollama:nomic-embed-text")

# HuggingFace (local or API)
embeddings = init_embeddings("huggingface:sentence-transformers/all-MiniLM-L6-v2")
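
When relying on environment variables, a common pattern is to fail fast with a clear message if the expected variable is missing, rather than letting the provider raise a less obvious error later. A small helper sketch (the variable name is one of the conventions listed above):

```python
import os

def require_env(name: str) -> str:
    """Return the environment variable's value or raise a clear error."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Set {name} before initializing embeddings")
    return value

os.environ["OPENAI_API_KEY"] = "sk-placeholder"  # for demonstration only
print(require_env("OPENAI_API_KEY"))  # sk-placeholder
```
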

Multilingual Embeddings

Use multilingual models for cross-language retrieval:

from langchain.embeddings import init_embeddings

# Cohere multilingual embeddings
embeddings = init_embeddings("cohere:embed-multilingual-v3.0")

# Embed documents in different languages
documents = [
    "Hello, how are you?",  # English
    "Bonjour, comment allez-vous?",  # French
    "Hola, ¿cómo estás?",  # Spanish
]

doc_vectors = embeddings.embed_documents(documents)

# Query in any language
query = "greeting someone"
query_vector = embeddings.embed_query(query)

Local Embeddings

Use local embeddings models for privacy or offline usage:

from langchain.embeddings import init_embeddings

# Ollama (requires Ollama running locally)
embeddings = init_embeddings("ollama:nomic-embed-text")

# HuggingFace (downloads model locally on first use)
embeddings = init_embeddings("huggingface:sentence-transformers/all-MiniLM-L6-v2")

# No external API calls; data never leaves your machine
vector = embeddings.embed_query("This runs completely locally")

Comparing Embeddings Models

Different models have different characteristics:

from langchain.embeddings import init_embeddings

# Small, fast, cost-effective
small = init_embeddings("openai:text-embedding-3-small")  # 1536 dims

# Large, high quality, more expensive
large = init_embeddings("openai:text-embedding-3-large")  # 3072 dims

# Free, local, privacy-friendly
local = init_embeddings("ollama:nomic-embed-text")

# Multilingual
multilingual = init_embeddings("cohere:embed-multilingual-v3.0")

# Test query
query = "What is artificial intelligence?"

# Compare dimensions
print(f"Small: {len(small.embed_query(query))} dims")
print(f"Large: {len(large.embed_query(query))} dims")
print(f"Local: {len(local.embed_query(query))} dims")

Types

from typing import Any
from langchain_core.embeddings import Embeddings

Install with Tessl CLI

npx tessl i tessl/pypi-langchain@1.2.1
