# Azure Search Documents

The Microsoft Azure AI Search client library for Python provides search, indexing, and AI-powered document processing capabilities. It lets developers build search experiences and generative AI applications using vector, keyword, and hybrid queries, filtered queries for metadata and geospatial search, faceted navigation, and search index management.

## Package Information

- **Package Name**: azure-search-documents
- **Language**: Python
- **Installation**: `pip install azure-search-documents`
- **Version**: 11.5.3

## Core Imports

```python
from azure.search.documents import SearchClient, ApiVersion
from azure.search.documents.indexes import SearchIndexClient, SearchIndexerClient
```

For async operations:

```python
from azure.search.documents.aio import SearchClient as AsyncSearchClient
from azure.search.documents.indexes.aio import SearchIndexClient as AsyncSearchIndexClient
```

Common models and types:

```python
from azure.search.documents.models import QueryType, SearchMode
from azure.search.documents.indexes.models import SearchIndex, SearchField
```

## Basic Usage

```python
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

# Initialize the search client
endpoint = "https://your-service.search.windows.net"
index_name = "your-index"
credential = AzureKeyCredential("your-admin-key")

client = SearchClient(endpoint, index_name, credential)

# Search for documents
results = client.search(search_text="python programming", top=10)
for result in results:
    print(f"Document ID: {result['id']}, Score: {result['@search.score']}")

# Upload documents
documents = [
    {"id": "1", "title": "Python Guide", "content": "Learn Python programming"},
    {"id": "2", "title": "Azure Search", "content": "Search service in Azure"}
]
result = client.upload_documents(documents)
print(f"Uploaded {len(result)} documents")

# Get document by key
document = client.get_document(key="1")
print(f"Retrieved: {document['title']}")
```

## Architecture

Azure Search Documents follows a multi-client architecture:

- **SearchClient**: Primary interface for search operations, document management, and query execution on a specific index
- **SearchIndexClient**: Manages search indexes, schema definitions, synonym maps, and text analysis operations
- **SearchIndexerClient**: Handles data ingestion through indexers, data sources, and AI enrichment skillsets
- **Models**: Rich type system for search requests, responses, index definitions, and configuration objects

Each client supports both synchronous and asynchronous operations, with async variants in the `aio` submodules. The library integrates with Azure Core for authentication, retry policies, and logging.

## Capabilities

### Document Search and Querying

Core search functionality including text search, vector search, hybrid queries, autocomplete, suggestions, faceted navigation, and result filtering. Supports multiple query types from simple text to complex semantic search with AI-powered ranking.

```python { .api }
class SearchClient:
    def __init__(self, endpoint: str, index_name: str, credential: Union[AzureKeyCredential, TokenCredential], **kwargs) -> None: ...
    def search(self, search_text: Optional[str] = None, **kwargs) -> SearchItemPaged: ...
    def suggest(self, search_text: str, suggester_name: str, **kwargs) -> List[Dict]: ...
    def autocomplete(self, search_text: str, suggester_name: str, **kwargs) -> List[Dict]: ...
    def get_document(self, key: str, selected_fields: Optional[List[str]] = None, **kwargs) -> Dict: ...
    def get_document_count(self, **kwargs) -> int: ...
```

[Document Search and Querying](./search-client.md)
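
Result filtering is expressed as an OData filter string passed to `search`. The following is a minimal sketch of building such a string safely. The helper name `odata_eq` and the `category` field are hypothetical, and the sketch relies only on the OData rule that a single quote inside a string literal is escaped by doubling it.

```python
def odata_eq(field: str, value: str) -> str:
    """Build an OData equality clause, escaping embedded single quotes."""
    escaped = value.replace("'", "''")  # OData escapes ' inside a literal as ''
    return f"{field} eq '{escaped}'"


# "category" is a hypothetical filterable field in the index.
filter_expr = odata_eq("category", "O'Reilly")
print(filter_expr)

# With a real client (not executed here), the expression would be passed as:
# results = client.search(search_text="python", filter=filter_expr, top=10)
```

Escaping in one place keeps user-supplied values from breaking the filter expression.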

### Document Indexing and Management

Document lifecycle management including upload, update, merge, and deletion operations. Supports both individual document operations and high-throughput batch processing with automatic retries and error handling.

```python { .api }
class SearchClient:
    def upload_documents(self, documents: List[Dict], **kwargs) -> List[IndexingResult]: ...
    def merge_documents(self, documents: List[Dict], **kwargs) -> List[IndexingResult]: ...
    def merge_or_upload_documents(self, documents: List[Dict], **kwargs) -> List[IndexingResult]: ...
    def delete_documents(self, documents: List[Dict], **kwargs) -> List[IndexingResult]: ...
    def index_documents(self, batch: IndexDocumentsBatch, **kwargs) -> List[IndexingResult]: ...

class SearchIndexingBufferedSender:
    def __init__(self, endpoint: str, index_name: str, credential: Union[AzureKeyCredential, TokenCredential], **kwargs) -> None: ...
    def upload_documents(self, documents: List[Dict], **kwargs) -> None: ...
    def merge_documents(self, documents: List[Dict], **kwargs) -> None: ...
    def delete_documents(self, documents: List[Dict], **kwargs) -> None: ...
    def flush(self, timeout: Optional[int] = None, **kwargs) -> bool: ...
```

[Document Indexing and Management](./search-client.md)
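
The batch methods return one result per document, so individual documents may fail even when the call itself returns. Below is a small sketch of collecting those failures. It assumes each result exposes `key`, `succeeded`, and `error_message` attributes; the helper `failed_keys` and the sample objects are hypothetical stand-ins so the sketch runs without a service.

```python
from types import SimpleNamespace
from typing import Any, Iterable, List, Tuple


def failed_keys(results: Iterable[Any]) -> List[Tuple[str, str]]:
    """Return (key, error_message) for every result that did not succeed."""
    return [(r.key, r.error_message or "") for r in results if not r.succeeded]


# Stand-in objects shaped like indexing results (assumed attribute names).
sample_results = [
    SimpleNamespace(key="1", succeeded=True, error_message=None, status_code=201),
    SimpleNamespace(key="2", succeeded=False, error_message="Invalid field", status_code=400),
]

for key, message in failed_keys(sample_results):
    print(f"Document {key} failed: {message}")
```

With a real client, the list returned by `client.upload_documents(documents)` would be passed to `failed_keys` in place of `sample_results`.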

### Search Index Management

Comprehensive index schema management including creation, updates, deletion, and configuration. Handles field definitions, analyzers, scoring profiles, vector search configurations, and synonym maps for search customization.

```python { .api }
class SearchIndexClient:
    def __init__(self, endpoint: str, credential: Union[AzureKeyCredential, TokenCredential], **kwargs) -> None: ...
    def create_index(self, index: SearchIndex, **kwargs) -> SearchIndex: ...
    def get_index(self, name: str, **kwargs) -> SearchIndex: ...
    def list_indexes(self, *, select: Optional[List[str]] = None, **kwargs) -> ItemPaged[SearchIndex]: ...
    def delete_index(self, index: Union[str, SearchIndex], **kwargs) -> None: ...
    def create_synonym_map(self, synonym_map: SynonymMap, **kwargs) -> SynonymMap: ...
    def analyze_text(self, index_name: str, analyze_request: AnalyzeTextOptions, **kwargs) -> AnalyzeResult: ...
```

[Search Index Management](./index-management.md)

### Data Ingestion and AI Enrichment

Automated data ingestion through indexers that connect to various data sources (Azure Blob Storage, SQL, Cosmos DB). Includes AI-powered content enrichment through skillsets with cognitive services integration, custom skills, and knowledge mining capabilities.

```python { .api }
class SearchIndexerClient:
    def __init__(self, endpoint: str, credential: Union[AzureKeyCredential, TokenCredential], **kwargs) -> None: ...
    def create_indexer(self, indexer: SearchIndexer, **kwargs) -> SearchIndexer: ...
    def run_indexer(self, name: str, **kwargs) -> None: ...
    def get_indexer_status(self, name: str, **kwargs) -> SearchIndexerStatus: ...
    def create_data_source_connection(self, data_source: SearchIndexerDataSourceConnection, **kwargs) -> SearchIndexerDataSourceConnection: ...
    def create_skillset(self, skillset: SearchIndexerSkillset, **kwargs) -> SearchIndexerSkillset: ...
```

[Data Ingestion and AI Enrichment](./indexer-management.md)

### Data Models and Types

Rich type system including search request/response models, index schema definitions, skill configurations, and enumeration types. The models carry type hints, which support static checking and editor completion for Azure Search operations.

```python { .api }
# Core search models
class SearchIndex: ...
class SearchField: ...
class IndexingResult: ...
class VectorQuery: ...

# Enumerations
class QueryType(str, Enum): ...
class SearchMode(str, Enum): ...
class IndexAction(str, Enum): ...

# Index configuration models
class SearchIndexer: ...
class SearchIndexerSkillset: ...
class SynonymMap: ...
```

[Data Models and Types](./models.md)

### Async Client Operations

Asynchronous versions of all client classes providing the same functionality with async/await support for high-performance applications. Includes async context managers and iterators for efficient resource management.

```python { .api }
# Async clients mirror sync functionality
class SearchClient:  # from azure.search.documents.aio
    async def search(self, search_text: Optional[str] = None, **kwargs) -> AsyncSearchItemPaged: ...
    async def upload_documents(self, documents: List[Dict], **kwargs) -> List[IndexingResult]: ...

class SearchIndexClient:  # from azure.search.documents.indexes.aio
    async def create_index(self, index: SearchIndex, **kwargs) -> SearchIndex: ...

class SearchIndexerClient:  # from azure.search.documents.indexes.aio
    async def run_indexer(self, name: str, **kwargs) -> None: ...
```

[Async Client Operations](./async-clients.md)
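
Because the async `search` is itself a coroutine, the call is awaited first and the returned results are then consumed with `async for`. The sketch below illustrates that two-step pattern. `FakeAsyncClient` and `collect_ids` are hypothetical stand-ins so the example runs without a service, and the `id` key field is assumed from the basic usage example.

```python
import asyncio
from typing import Any, List


async def collect_ids(client: Any, text: str) -> List[str]:
    """Await the search call, then iterate the results asynchronously."""
    results = await client.search(search_text=text)
    return [doc["id"] async for doc in results]


class FakeAsyncClient:
    """Hypothetical stand-in for the async SearchClient; yields canned documents."""

    async def search(self, search_text: str, **kwargs: Any):
        async def documents():
            for doc in ({"id": "1"}, {"id": "2"}):
                yield doc

        return documents()


ids = asyncio.run(collect_ids(FakeAsyncClient(), "python"))
print(ids)
```

With the real async client, the same coroutine would typically run inside an `async with client:` block so the underlying connection is closed afterwards.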

## Error Handling

The library raises specific exceptions for different error conditions:

```python
from azure.search.documents import RequestEntityTooLargeError
from azure.core.exceptions import ResourceNotFoundError, ClientAuthenticationError

try:
    client.upload_documents(large_batch)
except RequestEntityTooLargeError:
    # Handle batch size too large
    pass
except ResourceNotFoundError:
    # Handle index not found
    pass
except ClientAuthenticationError:
    # Handle authentication errors
    pass
```
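
One way to handle an oversized batch is to split it and retry each half. The following is a sketch of that idea rather than the library's own behavior. `upload_with_split`, `FakeClient`, and `FakeTooLarge` are hypothetical names; with the real library, `RequestEntityTooLargeError` would be passed as `too_large_error`.

```python
from typing import Any, Dict, List, Type


def upload_with_split(
    client: Any,
    documents: List[Dict[str, Any]],
    too_large_error: Type[Exception],
) -> List[Any]:
    """Upload documents, halving the batch whenever its size is rejected."""
    if not documents:
        return []
    try:
        return list(client.upload_documents(documents))
    except too_large_error:
        if len(documents) == 1:
            raise  # a single document is itself too large; nothing left to split
        mid = len(documents) // 2
        return (
            upload_with_split(client, documents[:mid], too_large_error)
            + upload_with_split(client, documents[mid:], too_large_error)
        )


class FakeTooLarge(Exception):
    """Hypothetical stand-in for RequestEntityTooLargeError."""


class FakeClient:
    """Hypothetical client that rejects batches of more than two documents."""

    def __init__(self) -> None:
        self.batch_sizes: List[int] = []

    def upload_documents(self, documents: List[Dict[str, Any]]) -> List[str]:
        if len(documents) > 2:
            raise FakeTooLarge()
        self.batch_sizes.append(len(documents))
        return [doc["id"] for doc in documents]


fake = FakeClient()
docs = [{"id": str(i)} for i in range(5)]
uploaded = upload_with_split(fake, docs, FakeTooLarge)
print(uploaded, fake.batch_sizes)
```

The recursion preserves document order and stops splitting once a single document is still rejected, in which case the original exception propagates to the caller.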

## Common Types

```python { .api }
# Import types
from enum import Enum
from typing import Any, Dict, Iterator, List, Optional, Union
from azure.core.credentials import AzureKeyCredential, TokenCredential

# API Version selection
class ApiVersion(str, Enum):
    """Supported Azure Search API versions."""
    V2020_06_30 = "2020-06-30"
    V2023_11_01 = "2023-11-01"
    V2024_07_01 = "2024-07-01"  # Default version

# Authentication credentials
Union[AzureKeyCredential, TokenCredential]

# Document representation
Dict[str, Any]  # Documents are represented as dictionaries

# Search results
class SearchItemPaged:
    """Iterator for paginated search results"""
    def __iter__(self) -> Iterator[Dict[str, Any]]: ...
    def by_page(self) -> Iterator[List[Dict[str, Any]]]: ...

# Indexing results
class IndexingResult:
    key: str
    succeeded: bool
    error_message: Optional[str]
    status_code: int
```