# Document Search and Querying

The SearchClient searches documents, manages the document lifecycle, and runs queries against Azure AI Search indexes. It supports text search, vector search, hybrid queries, suggestions, autocomplete, filtering, and ranking.

## Capabilities

### Client Initialization

Create a SearchClient instance to connect to a specific search index.

```python { .api }
class SearchClient:
    def __init__(
        self,
        endpoint: str,
        index_name: str,
        credential: Union[AzureKeyCredential, TokenCredential],
        **kwargs
    ) -> None:
        """
        Initialize SearchClient for a specific index.

        Parameters:
        - endpoint (str): The URL endpoint of an Azure search service
        - index_name (str): The name of the index to connect to
        - credential: A credential to authorize search client requests
        - api_version (str, optional): The Search API version to use
        - audience (str, optional): AAD audience for authentication
        """

    def close(self) -> None:
        """Close the session."""

    def __enter__(self) -> "SearchClient": ...
    def __exit__(self, *args) -> None: ...
```

### Document Search

Execute search queries with various modes and options.

```python { .api }
def search(
    self,
    search_text: Optional[str] = None,
    *,
    include_total_count: Optional[bool] = None,
    facets: Optional[List[str]] = None,
    filter: Optional[str] = None,
    highlight_fields: Optional[str] = None,
    highlight_post_tag: Optional[str] = None,
    highlight_pre_tag: Optional[str] = None,
    minimum_coverage: Optional[float] = None,
    order_by: Optional[List[str]] = None,
    query_type: Optional[Union[str, QueryType]] = None,
    scoring_parameters: Optional[List[str]] = None,
    scoring_profile: Optional[str] = None,
    search_fields: Optional[List[str]] = None,
    search_mode: Optional[Union[str, SearchMode]] = None,
    select: Optional[List[str]] = None,
    skip: Optional[int] = None,
    top: Optional[int] = None,
    vector_queries: Optional[List[VectorQuery]] = None,
    semantic_configuration_name: Optional[str] = None,
    query_answer: Optional[Union[str, QueryAnswerType]] = None,
    query_caption: Optional[Union[str, QueryCaptionType]] = None,
    **kwargs
) -> SearchItemPaged:
    """
    Execute a search query against the index.

    Parameters:
    - search_text (str, optional): Text to search for
    - include_total_count (bool): Include total count of matches
    - facets (List[str]): Facet expressions for navigation
    - filter (str): OData filter expression
    - highlight_fields (str): Fields to highlight in results
    - highlight_pre_tag (str): Tag before highlighted text
    - highlight_post_tag (str): Tag after highlighted text
    - minimum_coverage (float): Minimum coverage percentage
    - order_by (List[str]): Sort expressions
    - query_type (QueryType): Type of query (simple, full, semantic)
    - scoring_parameters (List[str]): Scoring parameter values
    - scoring_profile (str): Scoring profile name
    - search_fields (List[str]): Fields to search in
    - search_mode (SearchMode): Search mode (any, all)
    - select (List[str]): Fields to include in results
    - skip (int): Number of results to skip
    - top (int): Number of results to return
    - vector_queries (List[VectorQuery]): Vector queries for similarity search
    - semantic_configuration_name (str): Semantic search configuration
    - query_answer (QueryAnswerType): Answer extraction type
    - query_caption (QueryCaptionType): Caption extraction type

    Returns:
    SearchItemPaged: Iterator over search results
    """
```
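
Because `filter` is a plain OData string, any value interpolated into it needs correct quoting. The following is a minimal sketch of one way to assemble such a string. The helper names `odata_literal` and `and_filter` and the field names are hypothetical and not part of the SDK; the sketch relies on the OData rule that a single quote inside a string literal is escaped by doubling it.

```python
from typing import Union


def odata_literal(value: Union[str, bool, int, float]) -> str:
    """Render a Python value as an OData literal (hypothetical helper)."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return repr(value)
    # String literals use single quotes; embedded single quotes are doubled.
    return "'" + value.replace("'", "''") + "'"


def and_filter(*clauses: str) -> str:
    """Join non-empty clauses with 'and', parenthesizing each one."""
    kept = [clause for clause in clauses if clause]
    return " and ".join(f"({clause})" for clause in kept)


# Field names and values below are made up for illustration.
city = "Coeur d'Alene"
expr = and_filter(
    f"rating ge {odata_literal(4)}",
    f"city eq {odata_literal(city)}",
)
print(expr)
```

The resulting string could then be passed as the `filter` keyword argument of `search`.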
98
99
### Suggestions and Autocomplete
100
101
Get search suggestions and autocomplete results based on partial input.
102
103
```python { .api }
104
def suggest(
105
self,
106
search_text: str,
107
suggester_name: str,
108
*,
109
filter: Optional[str] = None,
110
use_fuzzy_matching: Optional[bool] = None,
111
highlight_post_tag: Optional[str] = None,
112
highlight_pre_tag: Optional[str] = None,
113
minimum_coverage: Optional[float] = None,
114
order_by: Optional[List[str]] = None,
115
search_fields: Optional[List[str]] = None,
116
select: Optional[List[str]] = None,
117
top: Optional[int] = None,
118
**kwargs
119
) -> List[Dict]:
120
"""
121
Get search suggestions based on partial search text.
122
123
Parameters:
124
- search_text (str): Partial search text
125
- suggester_name (str): Name of the suggester to use
126
- filter (str): OData filter expression
127
- use_fuzzy_matching (bool): Enable fuzzy matching
128
- highlight_pre_tag (str): Tag before highlighted text
129
- highlight_post_tag (str): Tag after highlighted text
130
- minimum_coverage (float): Minimum coverage percentage
131
- order_by (List[str]): Sort expressions
132
- search_fields (List[str]): Fields to search in
133
- select (List[str]): Fields to include in results
134
- top (int): Number of suggestions to return
135
136
Returns:
137
List[Dict]: List of suggestion results
138
"""
139
140
def autocomplete(
141
self,
142
search_text: str,
143
suggester_name: str,
144
*,
145
autocomplete_mode: Optional[Union[str, AutocompleteMode]] = None,
146
filter: Optional[str] = None,
147
use_fuzzy_matching: Optional[bool] = None,
148
highlight_post_tag: Optional[str] = None,
149
highlight_pre_tag: Optional[str] = None,
150
minimum_coverage: Optional[float] = None,
151
search_fields: Optional[List[str]] = None,
152
top: Optional[int] = None,
153
**kwargs
154
) -> List[Dict]:
155
"""
156
Get autocomplete suggestions based on partial search text.
157
158
Parameters:
159
- search_text (str): Partial search text
160
- suggester_name (str): Name of the suggester to use
161
- autocomplete_mode (AutocompleteMode): Autocomplete behavior
162
- filter (str): OData filter expression
163
- use_fuzzy_matching (bool): Enable fuzzy matching
164
- highlight_pre_tag (str): Tag before highlighted text
165
- highlight_post_tag (str): Tag after highlighted text
166
- minimum_coverage (float): Minimum coverage percentage
167
- search_fields (List[str]): Fields to search in
168
- top (int): Number of completions to return
169
170
Returns:
171
List[Dict]: List of autocomplete results
172
"""
173
```

### Document Retrieval

Get individual documents and document counts.

```python { .api }
def get_document(
    self,
    key: str,
    selected_fields: Optional[List[str]] = None,
    **kwargs
) -> Dict:
    """
    Retrieve a document by its key value.

    Parameters:
    - key (str): The key value of the document to retrieve
    - selected_fields (List[str], optional): Fields to include in result

    Returns:
    Dict: The retrieved document
    """

def get_document_count(self, **kwargs) -> int:
    """
    Get the count of documents in the index.

    Returns:
    int: Number of documents in the index
    """
```

### Document Upload and Indexing

Add, update, merge, and delete documents in the search index.

```python { .api }
def upload_documents(self, documents: List[Dict], **kwargs) -> List[IndexingResult]:
    """
    Upload documents to the index. Creates new documents or replaces existing ones.

    Parameters:
    - documents (List[Dict]): Documents to upload

    Returns:
    List[IndexingResult]: Results of the indexing operations
    """

def merge_documents(self, documents: List[Dict], **kwargs) -> List[IndexingResult]:
    """
    Merge documents into the index. Updates existing documents with provided fields.

    Parameters:
    - documents (List[Dict]): Documents to merge

    Returns:
    List[IndexingResult]: Results of the indexing operations
    """

def merge_or_upload_documents(self, documents: List[Dict], **kwargs) -> List[IndexingResult]:
    """
    Merge documents if they exist, upload if they don't.

    Parameters:
    - documents (List[Dict]): Documents to merge or upload

    Returns:
    List[IndexingResult]: Results of the indexing operations
    """

def delete_documents(self, documents: List[Dict], **kwargs) -> List[IndexingResult]:
    """
    Delete documents from the index.

    Parameters:
    - documents (List[Dict]): Documents to delete (must include key field)

    Returns:
    List[IndexingResult]: Results of the deletion operations
    """

def index_documents(self, batch: IndexDocumentsBatch, **kwargs) -> List[IndexingResult]:
    """
    Execute a batch of document operations.

    Parameters:
    - batch (IndexDocumentsBatch): Batch of document operations

    Returns:
    List[IndexingResult]: Results of the batch operations
    """
```
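
Each of these methods sends its documents in a single request, so a large list usually has to be split first. The sketch below shows one way to do that in plain Python. The `chunked` helper is hypothetical, and the limit of 1000 actions per request is an assumption based on the service's documented batch limits; check the limits that apply to your service before relying on it.

```python
from typing import Dict, Iterator, List

# Assumed per-request action limit; verify against your service's limits.
MAX_ACTIONS_PER_REQUEST = 1000


def chunked(
    documents: List[Dict], size: int = MAX_ACTIONS_PER_REQUEST
) -> Iterator[List[Dict]]:
    """Yield consecutive slices of at most `size` documents."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(documents), size):
        yield documents[start:start + size]


# Toy data: 2500 documents with only a key field.
docs = [{"id": str(i)} for i in range(2500)]
batch_sizes = [len(chunk) for chunk in chunked(docs)]
print(batch_sizes)  # [1000, 1000, 500]
```

Each yielded chunk could then be passed to a call such as `client.upload_documents(chunk)`.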

### Batch Document Operations

Create and manage batches of document operations for efficient processing.

```python { .api }
class IndexDocumentsBatch:
    """Batch container for document operations."""

    def __init__(self) -> None:
        """Initialize an empty batch."""

    def add_upload_actions(self, documents: List[Dict]) -> None:
        """Add upload actions to the batch."""

    def add_delete_actions(self, documents: List[Dict]) -> None:
        """Add delete actions to the batch."""

    def add_merge_actions(self, documents: List[Dict]) -> None:
        """Add merge actions to the batch."""

    def add_merge_or_upload_actions(self, documents: List[Dict]) -> None:
        """Add merge-or-upload actions to the batch."""

    def __len__(self) -> int:
        """Get the number of actions in the batch."""
```

### High-Throughput Document Indexing

Buffered sender for automatic batching and retry handling in high-volume scenarios.

```python { .api }
class SearchIndexingBufferedSender:
    """High-throughput document indexing with automatic batching."""

    def __init__(
        self,
        endpoint: str,
        index_name: str,
        credential: Union[AzureKeyCredential, TokenCredential],
        *,
        auto_flush_interval: int = 60,
        initial_batch_action_count: int = 512,
        max_retries_per_action: int = 3,
        max_retries: int = 3,
        **kwargs
    ) -> None:
        """
        Initialize buffered sender for high-throughput indexing.

        Parameters:
        - endpoint (str): Search service endpoint
        - index_name (str): Target index name
        - credential: Authentication credential
        - auto_flush_interval (int): Auto-flush interval in seconds
        - initial_batch_action_count (int): Initial batch size
        - max_retries_per_action (int): Max retries per document
        - max_retries (int): Max retries per batch
        """

    def upload_documents(self, documents: List[Dict], **kwargs) -> None:
        """Queue documents for upload."""

    def delete_documents(self, documents: List[Dict], **kwargs) -> None:
        """Queue documents for deletion."""

    def merge_documents(self, documents: List[Dict], **kwargs) -> None:
        """Queue documents for merge."""

    def merge_or_upload_documents(self, documents: List[Dict], **kwargs) -> None:
        """Queue documents for merge or upload."""

    def flush(self, timeout: Optional[int] = None, **kwargs) -> bool:
        """
        Flush all pending operations.

        Parameters:
        - timeout (int, optional): Timeout in seconds

        Returns:
        bool: True if all operations completed successfully
        """

    def close(self, **kwargs) -> None:
        """Close the sender and flush remaining operations."""

    def __enter__(self) -> "SearchIndexingBufferedSender": ...
    def __exit__(self, *args) -> None: ...
```

### Request Customization

Send custom HTTP requests to the search service.

```python { .api }
def send_request(
    self,
    request: HttpRequest,
    *,
    stream: bool = False,
    **kwargs
) -> HttpResponse:
    """
    Send a custom HTTP request to the search service.

    Parameters:
    - request (HttpRequest): The HTTP request to send
    - stream (bool): Whether to stream the response

    Returns:
    HttpResponse: The HTTP response
    """
```

## Usage Examples

### Basic Search

```python
from azure.search.documents import SearchClient
from azure.core.credentials import AzureKeyCredential

client = SearchClient(
    endpoint="https://service.search.windows.net",
    index_name="hotels",
    credential=AzureKeyCredential("admin-key")
)

# Simple text search
results = client.search("luxury hotel", top=5)
for result in results:
    print(f"{result['name']}: {result['@search.score']}")

# Filtered search with facets
results = client.search(
    search_text="beach resort",
    filter="rating ge 4",
    facets=["category", "city"],
    order_by=["rating desc", "name"]
)
```
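
The `skip` and `top` parameters of `search` can be combined to page through results. The following is a small sketch of how a page number might be translated into those two values; `page_window` is a hypothetical helper, not an SDK function.

```python
from typing import Dict


def page_window(page: int, page_size: int) -> Dict[str, int]:
    """Translate a 1-based page number into skip/top keyword arguments."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return {"skip": (page - 1) * page_size, "top": page_size}


window = page_window(page=3, page_size=20)
print(window)  # {'skip': 40, 'top': 20}
```

With the client from the example above, the window could be applied as `client.search("luxury hotel", include_total_count=True, **window)`.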

### Vector Search

```python
from azure.search.documents.models import VectorizedQuery

# Vector search with pre-computed embedding
embedding = [0.1, 0.2, 0.3, ...]  # Your computed vector
vector_query = VectorizedQuery(vector=embedding, k_nearest_neighbors=5, fields="content_vector")

results = client.search(
    search_text=None,
    vector_queries=[vector_query],
    select=["id", "title", "content"]
)
```
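
To make `k_nearest_neighbors` concrete, here is a conceptual sketch of a brute-force nearest-neighbor lookup using cosine similarity on a toy in-memory index. It only illustrates the idea of ranking by vector similarity. It is not how the service is implemented, which may use approximate algorithms and whichever similarity metric the index is configured with. The names `cosine_similarity`, `nearest`, and `toy_index` are made up for this example.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(
    query: Sequence[float], vectors: Dict[str, Sequence[float]], k: int
) -> List[Tuple[str, float]]:
    """Return the k (id, score) pairs most similar to the query, best first."""
    scored = [
        (doc_id, cosine_similarity(query, vec)) for doc_id, vec in vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy two-dimensional "embeddings" keyed by document id.
toy_index = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
top_two = nearest([1.0, 0.1], toy_index, k=2)
print([doc_id for doc_id, _ in top_two])  # ['a', 'c']
```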

### Batch Document Upload

```python
from azure.search.documents import IndexDocumentsBatch

# Create batch
batch = IndexDocumentsBatch()
batch.add_upload_actions([
    {"id": "1", "title": "Document 1", "content": "Content 1"},
    {"id": "2", "title": "Document 2", "content": "Content 2"}
])
batch.add_delete_actions([{"id": "old-doc"}])

# Execute batch
results = client.index_documents(batch)
for result in results:
    print(f"Document {result.key}: {'succeeded' if result.succeeded else 'failed'}")
```
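
A batch can partially succeed, so it is often useful to sort results by outcome rather than only print them. The sketch below triages results by their `status_code`. `ResultStub` is a stand-in for the SDK's result objects so the example runs without the service, `triage` is a hypothetical helper, and treating 409, 422, and 503 as retryable is an assumption that should be checked against the service documentation.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple

# Assumption: these status codes indicate transient conditions worth retrying.
RETRYABLE_STATUS_CODES = {409, 422, 503}


@dataclass
class ResultStub:
    """Stand-in for an indexing result; only the fields used here."""
    key: str
    status_code: int
    error_message: Optional[str] = None


def triage(results: Iterable[ResultStub]) -> Tuple[List[str], List[str], List[str]]:
    """Split result keys into (ok, retry, failed) by status code."""
    ok: List[str] = []
    retry: List[str] = []
    failed: List[str] = []
    for result in results:
        if 200 <= result.status_code < 300:
            ok.append(result.key)
        elif result.status_code in RETRYABLE_STATUS_CODES:
            retry.append(result.key)
        else:
            failed.append(result.key)
    return ok, retry, failed


sample = [
    ResultStub("1", 201),
    ResultStub("2", 503, "service busy"),
    ResultStub("old-doc", 404, "document not found"),
]
ok, retry, failed = triage(sample)
print(ok, retry, failed)
```

Keys in the `retry` list could be resubmitted in a later batch, while those in `failed` likely need the document itself to be corrected.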

### High-Volume Indexing

```python
from azure.search.documents import SearchIndexingBufferedSender

# endpoint, index_name, and credential are the same values passed to SearchClient
with SearchIndexingBufferedSender(endpoint, index_name, credential) as sender:
    # Documents are automatically batched and sent
    sender.upload_documents(large_document_list)
    sender.merge_documents(updates)
    # Automatic flush on context exit
```

## Common Types

```python { .api }
# Search results iterator
class SearchItemPaged:
    def __iter__(self) -> Iterator[Dict[str, Any]]: ...
    def by_page(self) -> Iterator[List[Dict[str, Any]]]: ...
    def get_count(self) -> Optional[int]: ...
    def get_coverage(self) -> Optional[float]: ...
    def get_facets(self) -> Optional[Dict[str, List[Dict[str, Any]]]]: ...

# Indexing operation result
class IndexingResult:
    key: str
    succeeded: bool
    error_message: Optional[str]
    status_code: int

# Exception for oversized requests
class RequestEntityTooLargeError(Exception):
    """Raised when the request payload is too large."""
```
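
One common way to react to an oversized-request error is to split the batch in half and retry each half. The sketch below shows that pattern with stand-ins so it runs without the service: `PayloadTooLarge` plays the role of the SDK exception, and `fake_send` is a hypothetical sender that rejects batches of more than three documents. In real code the `except` clause would catch the SDK's exception and `send` would wrap an indexing call.

```python
from typing import Callable, Dict, List


class PayloadTooLarge(Exception):
    """Stand-in for the SDK's oversized-request error."""


def send_with_split(docs: List[Dict], send: Callable[[List[Dict]], None]) -> int:
    """Send docs, halving the batch whenever the sender rejects it as too large.

    Returns the number of requests that succeeded.
    """
    if not docs:
        return 0
    try:
        send(docs)
        return 1
    except PayloadTooLarge:
        if len(docs) == 1:
            raise  # a single oversized document cannot be split further
        mid = len(docs) // 2
        return send_with_split(docs[:mid], send) + send_with_split(docs[mid:], send)


sent: List[int] = []  # records the size of each accepted batch


def fake_send(batch: List[Dict]) -> None:
    # Hypothetical sender: rejects batches of more than 3 documents.
    if len(batch) > 3:
        raise PayloadTooLarge()
    sent.append(len(batch))


requests_made = send_with_split([{"id": str(i)} for i in range(10)], fake_send)
print(requests_made, sent)  # 4 [2, 3, 2, 3]
```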