tessl/pypi-kiln-ai

Kiln AI is a comprehensive platform for building, evaluating, and deploying AI systems with dataset management, model fine-tuning, RAG, and evaluation capabilities.


RAG and Embeddings

Retrieval-augmented generation with text chunking, embeddings, and vector store integration. Supports document extraction, chunking strategies, embedding models, and vector database storage for semantic search.

Capabilities

RAG Configuration

Configuration for Retrieval-Augmented Generation pipelines.

from kiln_ai.datamodel import RagConfig

class RagConfig:
    """
    Configuration for RAG (Retrieval-Augmented Generation).

    Properties:
    - vector_store_config (VectorStoreConfig): Vector database configuration
    - embedding_config (EmbeddingConfig): Embedding model configuration
    - chunker_config (ChunkerConfig): Text chunking configuration
    - top_k (int): Number of results to retrieve
    """

Embedding Configuration

Configuration for embedding models and generation.

from kiln_ai.datamodel import EmbeddingConfig, Embedding, ChunkEmbeddings

class EmbeddingConfig:
    """
    Configuration for embeddings.

    Properties:
    - model_id (str): Embedding model identifier
    - provider (str): Embedding provider name
    - dimensions (int): Embedding vector dimensions
    """

class Embedding:
    """
    Single embedding vector.

    Properties:
    - vector (list[float]): Embedding vector values
    - metadata (dict): Additional embedding metadata
    """

class ChunkEmbeddings:
    """
    Embeddings for document chunks.

    Properties:
    - embeddings (list[Embedding]): List of embedding vectors
    - chunk_ids (list[str]): Corresponding chunk identifiers
    """

Embedding Adapters

Adapters for generating embeddings with various providers.

from kiln_ai.adapters.embedding import (
    BaseEmbeddingAdapter,
    LitellmEmbeddingAdapter,
    EmbeddingOptions
)

class BaseEmbeddingAdapter:
    """
    Abstract embedding adapter interface.

    Methods:
    - embed(): Generate single embedding
    - embed_batch(): Generate batch embeddings
    """

    async def embed(self, text: str) -> 'Embedding':
        """
        Generate embedding for text.

        Parameters:
        - text (str): Input text

        Returns:
        Embedding: Embedding vector with metadata
        """

    async def embed_batch(self, texts: list[str]) -> 'EmbeddingResult':
        """
        Generate embeddings for multiple texts.

        Parameters:
        - texts (list[str]): Input texts

        Returns:
        EmbeddingResult: Batch embeddings with usage info
        """

class LitellmEmbeddingAdapter(BaseEmbeddingAdapter):
    """
    LiteLLM embedding adapter supporting multiple providers.

    Supports:
    - OpenAI (text-embedding-3-small, text-embedding-3-large)
    - Cohere (embed-english-v3.0, embed-multilingual-v3.0)
    - Voyage AI (voyage-large-2, voyage-code-2)
    - Many more through LiteLLM
    """

    def __init__(self, model_name: str, provider: str, options: 'EmbeddingOptions | None' = None):
        """
        Initialize LiteLLM embedding adapter.

        Parameters:
        - model_name (str): Embedding model identifier
        - provider (str): Provider name
        - options (EmbeddingOptions | None): Embedding options
        """

class EmbeddingOptions:
    """
    Embedding configuration options.

    Properties:
    - dimensions (int | None): Vector dimensions (if configurable)
    - encoding_format (str | None): Encoding format (e.g., "float", "base64")
    """

class EmbeddingResult:
    """
    Batch embedding result.

    Properties:
    - embeddings (list[Embedding]): Generated embeddings
    - usage (dict): Token usage information
    """

Embedding Registry

Get embedding adapters by model and provider.

from kiln_ai.adapters.embedding.embedding_registry import embedding_adapter_from_type

def embedding_adapter_from_type(model_name: str, provider: str):
    """
    Get embedding adapter instance.

    Parameters:
    - model_name (str): Embedding model identifier
    - provider (str): Provider name

    Returns:
    BaseEmbeddingAdapter: Embedding adapter instance
    """

Text Chunking

Configuration and adapters for text chunking strategies.

from kiln_ai.datamodel import ChunkerConfig, ChunkerType, Chunk, ChunkedDocument

class ChunkerConfig:
    """
    Configuration for text chunking.

    Properties:
    - chunker_type (ChunkerType): Type of chunker to use
    - chunk_size (int): Size of each chunk in characters
    - chunk_overlap (int): Overlap between chunks in characters
    """

class ChunkerType:
    """
    Available chunker types.

    Values:
    - fixed_window: Fixed-size window chunking with overlap
    """
    fixed_window = "fixed_window"

class Chunk:
    """
    Single text chunk with metadata.

    Properties:
    - text (str): Chunk content
    - start_index (int): Start position in source document
    - end_index (int): End position in source document
    - metadata (dict): Additional chunk metadata (source, page, etc.)
    """

class ChunkedDocument:
    """
    Document split into chunks.

    Properties:
    - chunks (list[Chunk]): List of text chunks
    - source_document (str): Original document content
    """

Chunking Adapters

Adapters for different chunking strategies.

from kiln_ai.adapters.chunkers import (
    BaseChunker,
    FixedWindowChunker,
    TextChunk,
    ChunkingResult
)

class BaseChunker:
    """
    Abstract chunker interface.

    Methods:
    - chunk(): Chunk single document
    - chunk_documents(): Chunk multiple documents
    """

    def chunk(self, text: str) -> 'ChunkingResult':
        """
        Chunk single document.

        Parameters:
        - text (str): Document text

        Returns:
        ChunkingResult: Chunking result with metadata
        """

    def chunk_documents(self, documents: list[str]) -> list['ChunkingResult']:
        """
        Chunk multiple documents.

        Parameters:
        - documents (list[str]): Document texts

        Returns:
        list[ChunkingResult]: Chunking results
        """

class FixedWindowChunker(BaseChunker):
    """
    Fixed-size window chunking with overlap.

    Splits text into chunks of fixed size with configurable overlap
    to maintain context across chunk boundaries.
    """

    def __init__(self, chunk_size: int, chunk_overlap: int):
        """
        Initialize fixed window chunker.

        Parameters:
        - chunk_size (int): Size of each chunk in characters
        - chunk_overlap (int): Overlap between chunks in characters
        """

class TextChunk:
    """
    Text chunk with positional metadata.

    Properties:
    - text (str): Chunk text
    - start (int): Start position in document
    - end (int): End position in document
    - metadata (dict): Additional metadata
    """

class ChunkingResult:
    """
    Result of chunking operation.

    Properties:
    - chunks (list[TextChunk]): Text chunks
    - metadata (dict): Chunking metadata (strategy, params)
    """

Chunker Registry

Get chunker adapters by type.

from kiln_ai.adapters.chunkers.chunker_registry import chunker_adapter_from_type

def chunker_adapter_from_type(chunker_type: str, config: dict):
    """
    Get chunker adapter from type.

    Parameters:
    - chunker_type (str): Type of chunker
    - config (dict): Chunker configuration

    Returns:
    BaseChunker: Chunker adapter instance
    """

Document Extraction

Extract and process documents for RAG pipelines.

from kiln_ai.datamodel import (
    Document,
    Extraction,
    ExtractorConfig,
    FileInfo,
    Kind,
    OutputFormat,
    ExtractorType,
    ExtractionSource
)

class Document:
    """
    Document with extracted content.

    Properties:
    - id (str): Unique identifier
    - content (str): Extracted content
    - metadata (dict): Document metadata
    - kind (Kind): Type of document (text, pdf, image, html)
    """

    @staticmethod
    def load_from_file(path: str) -> 'Document':
        """Load document from .kiln file."""

    def save_to_file(self) -> None:
        """Save document to .kiln file."""

class Extraction:
    """
    Result of document extraction.

    Properties:
    - document (Document): Extracted document
    - extractor_config (ExtractorConfig): Configuration used
    """

class ExtractorConfig:
    """
    Configuration for document extraction.

    Properties:
    - extractor_type (ExtractorType): Type of extractor
    - options (dict): Extractor-specific options
    """

class FileInfo:
    """
    Metadata about source file.

    Properties:
    - filename (str): Name of file
    - path (str): File system path
    - size (int): File size in bytes
    - mime_type (str): MIME type
    """

class Kind:
    """
    Type of document.

    Values:
    - text: Plain text document
    - pdf: PDF document
    - image: Image file
    - html: HTML document
    """
    text = "text"
    pdf = "pdf"
    image = "image"
    html = "html"

class OutputFormat:
    """
    Format for extracted output.

    Values:
    - markdown: Markdown format
    - plain_text: Plain text format
    - structured: Structured data format
    """
    markdown = "markdown"
    plain_text = "plain_text"
    structured = "structured"

class ExtractorType:
    """
    Type of extractor to use.

    Values:
    - litellm: LiteLLM-based extraction
    - custom: Custom extractor
    """
    litellm = "litellm"
    custom = "custom"

class ExtractionSource:
    """
    Source type for extraction.

    Values:
    - file: Extract from file
    - url: Extract from URL
    - text: Extract from text
    """
    file = "file"
    url = "url"
    text = "text"

Extraction Adapters

Adapters for extracting content from various sources.

from kiln_ai.adapters.extractors import (
    BaseExtractor,
    LitellmExtractor,
    ExtractionInput,
    ExtractionOutput,
    encode_file_litellm_format
)

class BaseExtractor:
    """
    Abstract extractor interface.

    Methods:
    - extract(): Extract single document
    - extract_batch(): Extract multiple documents
    """

    async def extract(self, input_data: 'ExtractionInput') -> 'ExtractionOutput':
        """
        Extract content from source.

        Parameters:
        - input_data (ExtractionInput): Input specification

        Returns:
        ExtractionOutput: Extracted content
        """

    async def extract_batch(self, inputs: list['ExtractionInput']) -> list['ExtractionOutput']:
        """
        Extract from multiple sources.

        Parameters:
        - inputs (list[ExtractionInput]): Input specifications

        Returns:
        list[ExtractionOutput]: Extracted contents
        """

class LitellmExtractor(BaseExtractor):
    """
    LiteLLM-based document extraction.

    Uses vision-capable models to extract text from:
    - PDFs
    - Images
    - Scanned documents
    """

    def __init__(self, model_name: str, provider: str):
        """
        Initialize LiteLLM extractor.

        Parameters:
        - model_name (str): Vision-capable model
        - provider (str): Provider name
        """

class ExtractionInput:
    """
    Input for extraction.

    Properties:
    - content (str): Source content (text, file path, URL)
    - content_type (str): Type of content
    - options (dict): Extraction options
    """

class ExtractionOutput:
    """
    Extracted content.

    Properties:
    - text (str): Extracted text
    - metadata (dict): Extraction metadata
    - format (OutputFormat): Output format
    """

def encode_file_litellm_format(file_path: str, mime_type: str) -> str:
    """
    Encode file for LiteLLM API.

    Parameters:
    - file_path (str): Path to file
    - mime_type (str): MIME type of file

    Returns:
    str: Base64 encoded file data
    """

Extraction Runner

Execute extraction jobs.

from kiln_ai.adapters.extractors import ExtractorRunner, ExtractorJob

class ExtractorRunner:
    """
    Execute document extractions.

    Methods:
    - run(): Execute single extraction
    - run_batch(): Execute batch extractions
    """

    def __init__(self, config: 'ExtractorConfig'):
        """
        Initialize extraction runner.

        Parameters:
        - config (ExtractorConfig): Extractor configuration
        """

    async def run(self, input_data: 'ExtractionInput') -> 'ExtractionOutput':
        """Execute single extraction."""

    async def run_batch(self, inputs: list['ExtractionInput']) -> list['ExtractionOutput']:
        """Execute batch extractions."""

class ExtractorJob:
    """
    Extraction job configuration.

    Properties:
    - extractor_config (ExtractorConfig): Extractor settings
    - inputs (list[ExtractionInput]): Inputs to process
    """

Extractor Registry

Get extractor adapters by type.

from kiln_ai.adapters.extractors.extractor_registry import extractor_adapter_from_type

def extractor_adapter_from_type(extractor_type: str, config: dict):
    """
    Get extractor adapter from type.

    Parameters:
    - extractor_type (str): Type of extractor
    - config (dict): Extractor configuration

    Returns:
    BaseExtractor: Extractor adapter instance
    """

Vector Store Configuration

Configuration for vector database integration.

from kiln_ai.datamodel import VectorStoreConfig, VectorStoreType, LanceDBConfigBaseProperties

class VectorStoreConfig:
    """
    Configuration for vector database.

    Properties:
    - vector_store_type (VectorStoreType): Type of vector store
    - connection_params (dict): Connection parameters
    """

class VectorStoreType:
    """
    Type of vector store.

    Values:
    - lancedb: LanceDB vector database
    """
    lancedb = "lancedb"

class LanceDBConfigBaseProperties:
    """
    LanceDB-specific configuration.

    Properties:
    - uri (str): Database URI (file path or connection string)
    - table_name (str): Table name for storage
    """

Usage Examples

Basic Embedding Generation

from kiln_ai.adapters.embedding import LitellmEmbeddingAdapter

# Create embedding adapter
adapter = LitellmEmbeddingAdapter(
    model_name="text-embedding-3-small",
    provider="openai"
)

# Generate single embedding
text = "This is a sample document for embedding."
embedding = await adapter.embed(text)
print(f"Embedding dimensions: {len(embedding.vector)}")
print(f"First few values: {embedding.vector[:5]}")

# Generate batch embeddings
texts = [
    "First document",
    "Second document",
    "Third document"
]
result = await adapter.embed_batch(texts)
print(f"Generated {len(result.embeddings)} embeddings")
print(f"Tokens used: {result.usage}")

Text Chunking

from kiln_ai.adapters.chunkers import FixedWindowChunker

# Create chunker with 500 character chunks and 50 character overlap
chunker = FixedWindowChunker(chunk_size=500, chunk_overlap=50)

# Load document
with open("long_document.txt", "r") as f:
    document_text = f.read()

# Chunk document
result = chunker.chunk(document_text)
print(f"Created {len(result.chunks)} chunks")

for i, chunk in enumerate(result.chunks):
    print(f"\nChunk {i+1}:")
    print(f"  Position: {chunk.start}-{chunk.end}")
    print(f"  Length: {len(chunk.text)} characters")
    print(f"  Preview: {chunk.text[:100]}...")

Document Extraction

from kiln_ai.datamodel import ExtractorConfig, ExtractorType, ExtractionSource
from kiln_ai.adapters.extractors import ExtractorRunner, ExtractionInput

# Configure extractor
config = ExtractorConfig(
    extractor_type=ExtractorType.litellm,
    options={
        "model": "gpt-4o",
        "provider": "openai"
    }
)

# Create extraction runner
runner = ExtractorRunner(config)

# Extract from PDF
pdf_input = ExtractionInput(
    content="/path/to/document.pdf",
    content_type="application/pdf",
    options={"output_format": "markdown"}
)

extraction = await runner.run(pdf_input)
print(f"Extracted text length: {len(extraction.text)}")
print(f"Output format: {extraction.format}")

Complete RAG Pipeline

from kiln_ai.datamodel import (
    RagConfig,
    EmbeddingConfig,
    ChunkerConfig,
    ChunkerType,
    VectorStoreConfig,
    VectorStoreType
)
from kiln_ai.adapters.chunkers import FixedWindowChunker
from kiln_ai.adapters.embedding import LitellmEmbeddingAdapter

# Configure RAG pipeline
rag_config = RagConfig(
    embedding_config=EmbeddingConfig(
        model_id="text-embedding-3-small",
        provider="openai",
        dimensions=1536
    ),
    chunker_config=ChunkerConfig(
        chunker_type=ChunkerType.fixed_window,
        chunk_size=500,
        chunk_overlap=50
    ),
    vector_store_config=VectorStoreConfig(
        vector_store_type=VectorStoreType.lancedb,
        connection_params={
            "uri": "/path/to/lancedb",
            "table_name": "documents"
        }
    ),
    top_k=5
)

# 1. Chunk documents
chunker = FixedWindowChunker(
    chunk_size=rag_config.chunker_config.chunk_size,
    chunk_overlap=rag_config.chunker_config.chunk_overlap
)

document_text = "Long document content..."
chunking_result = chunker.chunk(document_text)
chunks = [c.text for c in chunking_result.chunks]
print(f"Created {len(chunks)} chunks")

# 2. Generate embeddings
embedding_adapter = LitellmEmbeddingAdapter(
    model_name=rag_config.embedding_config.model_id,
    provider=rag_config.embedding_config.provider
)

embeddings_result = await embedding_adapter.embed_batch(chunks)
embeddings = embeddings_result.embeddings
print(f"Generated {len(embeddings)} embeddings")

# 3. Store in vector database (pseudocode - actual implementation varies)
# vector_store.add_documents(chunks, embeddings)

# 4. Query for relevant chunks
query = "What is the main topic?"
query_embedding = await embedding_adapter.embed(query)
# results = vector_store.search(query_embedding.vector, k=rag_config.top_k)

Semantic Search

from kiln_ai.adapters.embedding import LitellmEmbeddingAdapter
import numpy as np

# Setup
adapter = LitellmEmbeddingAdapter(
    model_name="text-embedding-3-small",
    provider="openai"
)

# Documents to search
documents = [
    "Python is a high-level programming language.",
    "Machine learning is a subset of artificial intelligence.",
    "Neural networks are inspired by biological neurons.",
    "Data science involves statistics and programming."
]

# Generate embeddings
doc_embeddings = await adapter.embed_batch(documents)

# Query
query = "What is AI?"
query_embedding = await adapter.embed(query)

# Calculate cosine similarity
def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Find most similar documents
similarities = []
for i, doc_emb in enumerate(doc_embeddings.embeddings):
    sim = cosine_similarity(query_embedding.vector, doc_emb.vector)
    similarities.append((i, sim))

# Sort by similarity
similarities.sort(key=lambda x: -x[1])

print("Most relevant documents:")
for idx, sim in similarities[:3]:
    print(f"  [{sim:.3f}] {documents[idx]}")

Multi-document Processing

from kiln_ai.adapters.chunkers import FixedWindowChunker
from kiln_ai.adapters.embedding import LitellmEmbeddingAdapter
from kiln_ai.datamodel import ChunkedDocument, ChunkEmbeddings
import glob

# Initialize components
chunker = FixedWindowChunker(chunk_size=500, chunk_overlap=50)
embedder = LitellmEmbeddingAdapter(
    model_name="text-embedding-3-small",
    provider="openai"
)

# Process multiple documents
document_paths = glob.glob("/path/to/docs/*.txt")
all_chunks = []
chunk_metadata = []

for doc_path in document_paths:
    with open(doc_path, "r") as f:
        content = f.read()

    # Chunk document
    result = chunker.chunk(content)

    # Track chunks
    for chunk in result.chunks:
        all_chunks.append(chunk.text)
        chunk_metadata.append({
            "source": doc_path,
            "start": chunk.start,
            "end": chunk.end
        })

print(f"Total chunks: {len(all_chunks)}")

# Generate embeddings for all chunks
embeddings_result = await embedder.embed_batch(all_chunks)

# Store with metadata
for i, (chunk, embedding, metadata) in enumerate(zip(
    all_chunks,
    embeddings_result.embeddings,
    chunk_metadata
)):
    print(f"Chunk {i}: {metadata['source']}")
    # Store in vector database with metadata

Image and PDF Extraction

from kiln_ai.adapters.extractors import LitellmExtractor, ExtractionInput
from kiln_ai.datamodel import OutputFormat

# Create extractor with vision-capable model
extractor = LitellmExtractor(
    model_name="gpt-4o",
    provider="openai"
)

# Extract from image
image_input = ExtractionInput(
    content="/path/to/diagram.png",
    content_type="image/png",
    options={"output_format": OutputFormat.markdown}
)

image_extraction = await extractor.extract(image_input)
print("Extracted from image:")
print(image_extraction.text)

# Extract from PDF
pdf_input = ExtractionInput(
    content="/path/to/report.pdf",
    content_type="application/pdf",
    options={"output_format": OutputFormat.markdown}
)

pdf_extraction = await extractor.extract(pdf_input)
print("\nExtracted from PDF:")
print(pdf_extraction.text[:500])  # First 500 chars

Comparing Embedding Models

from kiln_ai.adapters.embedding import LitellmEmbeddingAdapter
from kiln_ai.adapters.ml_embedding_model_list import built_in_embedding_models_from_provider

# Get available models
openai_models = built_in_embedding_models_from_provider("openai")

test_text = "This is a test document for comparing embedding models."

print("Comparing embedding models:\n")
for model_info in openai_models:
    # Create adapter
    adapter = LitellmEmbeddingAdapter(
        model_name=model_info.name,
        provider="openai"
    )

    # Generate embedding
    embedding = await adapter.embed(test_text)

    print(f"{model_info.name}:")
    print(f"  Dimensions: {len(embedding.vector)}")
    print(f"  Max input tokens: {model_info.max_input_tokens}")

Configurable Embedding Dimensions

from kiln_ai.adapters.embedding import LitellmEmbeddingAdapter, EmbeddingOptions

# OpenAI text-embedding-3 models support configurable dimensions
options = EmbeddingOptions(
    dimensions=512,  # Reduce from default 1536
    encoding_format="float"
)

adapter = LitellmEmbeddingAdapter(
    model_name="text-embedding-3-small",
    provider="openai",
    options=options
)

embedding = await adapter.embed("Sample text")
print(f"Embedding dimensions: {len(embedding.vector)}")  # Should be 512

Install with Tessl CLI

npx tessl i tessl/pypi-kiln-ai
