tessl/pypi-azure-search-documents

Microsoft Azure AI Search Client Library for Python providing comprehensive search, indexing, and AI-powered document processing capabilities.

Workspace: tessl
Visibility: Public
Describes: pkg:pypi/azure-search-documents@11.5.x

To install, run

npx @tessl/cli install tessl/pypi-azure-search-documents@11.5.0

Azure Search Documents

Microsoft Azure AI Search Client Library for Python providing comprehensive search, indexing, and AI-powered document processing capabilities. This library enables developers to build rich search experiences and generative AI applications with vector, keyword, and hybrid query forms, filtered queries for metadata and geospatial search, faceted navigation, and advanced search index management.

Package Information

  • Package Name: azure-search-documents
  • Language: Python
  • Installation: pip install azure-search-documents
  • Version: 11.5.3

Core Imports

from azure.search.documents import SearchClient, ApiVersion
from azure.search.documents.indexes import SearchIndexClient, SearchIndexerClient

For async operations:

from azure.search.documents.aio import SearchClient as AsyncSearchClient
from azure.search.documents.indexes.aio import SearchIndexClient as AsyncSearchIndexClient

Common models and types:

from azure.search.documents.models import QueryType, SearchMode
from azure.search.documents.indexes.models import SearchIndex, SearchField

Basic Usage

from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

# Initialize the search client
endpoint = "https://your-service.search.windows.net"
index_name = "your-index"
credential = AzureKeyCredential("your-admin-key")

client = SearchClient(endpoint, index_name, credential)

# Search for documents
results = client.search(search_text="python programming", top=10)
for result in results:
    print(f"Document ID: {result['id']}, Score: {result['@search.score']}")

# Upload documents
documents = [
    {"id": "1", "title": "Python Guide", "content": "Learn Python programming"},
    {"id": "2", "title": "Azure Search", "content": "Search service in Azure"}
]
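upload_results = client.upload_documents(documents)
# Each IndexingResult reports whether its document was accepted
succeeded = sum(1 for r in upload_results if r.succeeded)
print(f"Uploaded {succeeded} of {len(documents)} documents")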

# Get document by key
document = client.get_document(key="1")
print(f"Retrieved: {document['title']}")

Architecture

Azure Search Documents follows a multi-client architecture:

  • SearchClient: Primary interface for search operations, document management, and query execution on a specific index
  • SearchIndexClient: Manages search indexes, schema definitions, synonym maps, and text analysis operations
  • SearchIndexerClient: Handles data ingestion through indexers, data sources, and AI enrichment skillsets
  • Models: Rich type system for search requests, responses, index definitions, and configuration objects

Each client has an asynchronous counterpart with the same name in the corresponding aio submodule (azure.search.documents.aio and azure.search.documents.indexes.aio). The library integrates with Azure Core for authentication, retry policies, and logging.

Capabilities

Document Search and Querying

Core search functionality including text search, vector search, hybrid queries, autocomplete, suggestions, faceted navigation, and result filtering. Supports multiple query types from simple text to complex semantic search with AI-powered ranking.

class SearchClient:
    def __init__(self, endpoint: str, index_name: str, credential: Union[AzureKeyCredential, TokenCredential], **kwargs) -> None: ...
    def search(self, search_text: Optional[str] = None, **kwargs) -> SearchItemPaged: ...
    def suggest(self, search_text: str, suggester_name: str, **kwargs) -> List[Dict]: ...
    def autocomplete(self, search_text: str, suggester_name: str, **kwargs) -> List[Dict]: ...
    def get_document(self, key: str, selected_fields: Optional[List[str]] = None, **kwargs) -> Dict: ...
    def get_document_count(self, **kwargs) -> int: ...
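
As a minimal sketch of result filtering, the helper below combines full-text search with an OData filter and a field projection. It assumes `client` is a `SearchClient` (any object with a compatible `search` method works), and the `category` and `title` fields are hypothetical: they must exist in your index, and `category` must be marked filterable.

```python
from typing import Any, Dict, Iterable, List


def search_titles(client: Any, text: str, category: str, limit: int = 5) -> List[str]:
    """Return the titles of the top matches within one category.

    `category` and `title` are hypothetical field names used for illustration.
    """
    # Double any single quotes so the value is a valid OData string literal.
    safe_category = category.replace("'", "''")
    results: Iterable[Dict[str, Any]] = client.search(
        search_text=text,
        filter=f"category eq '{safe_category}'",  # OData filter expression
        select=["id", "title"],                   # return only these fields
        top=limit,                                # cap the number of results
    )
    return [doc["title"] for doc in results]
```

With a real client this would be called as `search_titles(client, "python programming", "books")`. Escaping the quote character keeps user-supplied values from breaking the filter expression.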

Document Indexing and Management

Document lifecycle management including upload, update, merge, and deletion operations. Supports both individual document operations and high-throughput batch processing with automatic retries and error handling.

class SearchClient:
    def upload_documents(self, documents: List[Dict], **kwargs) -> List[IndexingResult]: ...
    def merge_documents(self, documents: List[Dict], **kwargs) -> List[IndexingResult]: ...
    def merge_or_upload_documents(self, documents: List[Dict], **kwargs) -> List[IndexingResult]: ...
    def delete_documents(self, documents: List[Dict], **kwargs) -> List[IndexingResult]: ...
    def index_documents(self, batch: IndexDocumentsBatch, **kwargs) -> List[IndexingResult]: ...

class SearchIndexingBufferedSender:
    def __init__(self, endpoint: str, index_name: str, credential: Union[AzureKeyCredential, TokenCredential], **kwargs) -> None: ...
    def upload_documents(self, documents: List[Dict], **kwargs) -> None: ...
    def merge_documents(self, documents: List[Dict], **kwargs) -> None: ...
    def delete_documents(self, documents: List[Dict], **kwargs) -> None: ...
    def flush(self, timeout: Optional[int] = None, **kwargs) -> bool: ...
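
When uploading with `SearchClient` directly, large document sets are usually split into batches first. The sketch below assumes the service limit of 1000 actions per indexing request; `chunked` and `upload_in_chunks` are illustrative helper names, not part of the library.

```python
from typing import Any, Dict, Iterator, List, Sequence

# Assumed service limit on the number of actions in one indexing request.
MAX_ACTIONS_PER_BATCH = 1000


def chunked(
    documents: Sequence[Dict[str, Any]], size: int = MAX_ACTIONS_PER_BATCH
) -> Iterator[List[Dict[str, Any]]]:
    """Yield consecutive slices of `documents` with at most `size` items each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(documents), size):
        yield list(documents[start:start + size])


def upload_in_chunks(
    client: Any, documents: Sequence[Dict[str, Any]], size: int = MAX_ACTIONS_PER_BATCH
) -> List[Any]:
    """Upload documents batch by batch and collect every per-document result."""
    results: List[Any] = []
    for batch in chunked(documents, size):
        results.extend(client.upload_documents(batch))
    return results
```

`SearchIndexingBufferedSender` handles batching and flushing on its own, so a helper like this is only relevant when per-document results are needed from the plain client.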

Search Index Management

Comprehensive index schema management including creation, updates, deletion, and configuration. Handles field definitions, analyzers, scoring profiles, vector search configurations, and synonym maps for search customization.

class SearchIndexClient:
    def __init__(self, endpoint: str, credential: Union[AzureKeyCredential, TokenCredential], **kwargs) -> None: ...
    def create_index(self, index: SearchIndex, **kwargs) -> SearchIndex: ...
    def get_index(self, name: str, **kwargs) -> SearchIndex: ...
    def list_indexes(self, *, select: Optional[List[str]] = None, **kwargs) -> ItemPaged[SearchIndex]: ...
    def delete_index(self, index: Union[str, SearchIndex], **kwargs) -> None: ...
    def create_synonym_map(self, synonym_map: SynonymMap, **kwargs) -> SynonymMap: ...
    def analyze_text(self, index_name: str, analyze_request: AnalyzeTextOptions, **kwargs) -> AnalyzeResult: ...
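
A common pattern is to create an index only when it does not exist yet. The sketch below assumes `index_client` is a `SearchIndexClient` and `index` is a `SearchIndex`; it relies on `list_index_names`, a client method not shown in the listing above, and `ensure_index` itself is a hypothetical helper name.

```python
from typing import Any


def ensure_index(index_client: Any, index: Any) -> Any:
    """Return the existing index named `index.name`, creating it if absent."""
    existing_names = set(index_client.list_index_names())
    if index.name in existing_names:
        # Leave the deployed schema untouched and return it as-is.
        return index_client.get_index(index.name)
    return index_client.create_index(index)
```

This version never modifies an existing schema. If the definition should also be updated in place, the client's `create_or_update_index` method is the more direct choice.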

Data Ingestion and AI Enrichment

Automated data ingestion through indexers that connect to various data sources (Azure Blob Storage, SQL, Cosmos DB). Includes AI-powered content enrichment through skillsets with cognitive services integration, custom skills, and knowledge mining capabilities.

class SearchIndexerClient:
    def __init__(self, endpoint: str, credential: Union[AzureKeyCredential, TokenCredential], **kwargs) -> None: ...
    def create_indexer(self, indexer: SearchIndexer, **kwargs) -> SearchIndexer: ...
    def run_indexer(self, name: str, **kwargs) -> None: ...
    def get_indexer_status(self, name: str, **kwargs) -> SearchIndexerStatus: ...
    def create_data_source_connection(self, data_source: SearchIndexerDataSourceConnection, **kwargs) -> SearchIndexerDataSourceConnection: ...
    def create_skillset(self, skillset: SearchIndexerSkillset, **kwargs) -> SearchIndexerSkillset: ...
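
Indexer runs are asynchronous on the service side, so callers typically start a run and then poll its status. The following is a rough sketch that assumes `client` is a `SearchIndexerClient` and that the returned status object exposes `last_result.status` with the value `"inProgress"` while a run is active. `run_indexer_and_wait` is a hypothetical helper. It is also a simplification: immediately after `run_indexer`, the service may still report the previous run's result.

```python
import time
from typing import Any, Callable


def run_indexer_and_wait(
    client: Any,
    name: str,
    poll_seconds: float = 5.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Start an indexer run and poll until the last result leaves "inProgress"."""
    client.run_indexer(name)
    elapsed = 0.0
    while elapsed <= timeout_seconds:
        last_result = client.get_indexer_status(name).last_result
        # last_result can be None until the service records an execution.
        if last_result is not None and last_result.status != "inProgress":
            return last_result.status
        sleep(poll_seconds)
        elapsed += poll_seconds
    raise TimeoutError(f"Indexer {name!r} did not finish within {timeout_seconds} seconds")
```

The `sleep` parameter exists only so the wait can be replaced in tests; production code would leave it at the default.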

Data Models and Types

Rich type system including search request/response models, index schema definitions, skill configurations, and enumeration types. The models carry type hints, so static type checkers and editors with autocompletion (such as IntelliSense) can validate calls against the Azure Search API surface.

# Core search models
class SearchIndex: ...
class SearchField: ...
class IndexingResult: ...
class IndexAction: ...  # model describing a single indexing action, not an enum
class VectorQuery: ...

# Enumerations
class QueryType(str, Enum): ...
class SearchMode(str, Enum): ...

# Index configuration models  
class SearchIndexer: ...
class SearchIndexerSkillset: ...
class SynonymMap: ...

Async Client Operations

Asynchronous versions of all client classes providing the same functionality with async/await support for high-performance applications. Includes async context managers and iterators for efficient resource management.

# Async clients mirror sync functionality
class SearchClient:  # from azure.search.documents.aio
    async def search(self, search_text: Optional[str] = None, **kwargs) -> AsyncSearchItemPaged: ...
    async def upload_documents(self, documents: List[Dict], **kwargs) -> List[IndexingResult]: ...

class SearchIndexClient:  # from azure.search.documents.indexes.aio  
    async def create_index(self, index: SearchIndex, **kwargs) -> SearchIndex: ...

class SearchIndexerClient:  # from azure.search.documents.indexes.aio
    async def run_indexer(self, name: str, **kwargs) -> None: ...
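
To illustrate the async iteration pattern, here is a small sketch that awaits `search` and then consumes the paged results with `async for`. It assumes `client` is the `SearchClient` from `azure.search.documents.aio` (any object with a compatible coroutine works); `collect_results` is a hypothetical helper name.

```python
import asyncio
from typing import Any, Dict, List


async def collect_results(client: Any, text: str, limit: int = 10) -> List[Dict[str, Any]]:
    """Run one query and gather the matching documents into a list."""
    # The async client's search() is a coroutine that returns an async iterator.
    results = await client.search(search_text=text, top=limit)
    return [doc async for doc in results]
```

In an application, the client would normally be opened with `async with` so that its underlying transport is closed when the block exits, and the coroutine would be driven by `asyncio.run(...)` or an existing event loop.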

Error Handling

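Most failures surface as azure.core exceptions derived from HttpResponseError; the package adds RequestEntityTooLargeError for indexing batches that exceed the service's request size limit: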

from azure.search.documents import RequestEntityTooLargeError
from azure.core.exceptions import ResourceNotFoundError, ClientAuthenticationError

try:
    client.upload_documents(large_batch)
except RequestEntityTooLargeError:
    # Handle batch size too large
    pass
except ResourceNotFoundError:
    # Handle index not found
    pass
except ClientAuthenticationError:
    # Handle authentication errors
    pass

Common Types

# Import types
from enum import Enum
from typing import Any, Dict, Iterator, List, Optional, Union
from azure.core.credentials import AzureKeyCredential, TokenCredential

# API Version selection
class ApiVersion(str, Enum):
    """Supported Azure Search API versions."""
    V2020_06_30 = "2020-06-30"
    V2023_11_01 = "2023-11-01"  
    V2024_07_01 = "2024-07-01"  # Default version

# Authentication credentials
Union[AzureKeyCredential, TokenCredential]

# Document representation
Dict[str, Any]  # Documents are represented as dictionaries

# Search results
class SearchItemPaged:
    """Iterator for paginated search results"""
    def __iter__(self) -> Iterator[Dict[str, Any]]: ...
    def by_page(self) -> Iterator[List[Dict[str, Any]]]: ...

# Indexing results
class IndexingResult:
    key: str
    succeeded: bool  # serialized as "status" in the REST payload
    error_message: Optional[str]
    status_code: int
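
Because indexing calls return one result per document, partial failures have to be checked explicitly. The sketch below collects the failed keys from a list of `IndexingResult` objects, assuming the success flag is exposed as the `succeeded` attribute; `failed_keys` is a hypothetical helper name.

```python
from typing import Any, Dict, Iterable


def failed_keys(results: Iterable[Any]) -> Dict[str, str]:
    """Map each failed document key to its error message or status code."""
    return {
        r.key: (r.error_message or f"status {r.status_code}")
        for r in results
        if not r.succeeded
    }
```

A caller could pass the return value of `upload_documents` straight into this helper and retry or log only the keys it reports.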