Tessl Tile for pypi/azure-ai-documentintelligence@1.0.0

# Document Analysis Operations

Core document-processing functionality for analyzing single documents, processing batches, and classifying documents. These operations support both prebuilt models (layout, invoice, receipt, etc.) and custom models, with optional features such as high-resolution OCR, language detection, and structured data extraction.

## Capabilities

### Single Document Analysis

Analyzes an individual document with the specified model to extract text, tables, key-value pairs, and structured data. Returns an enhanced long-running operation (LRO) poller that exposes operation metadata.

```python { .api }
def begin_analyze_document(
    model_id: str,
    body: Union[AnalyzeDocumentRequest, JSON, IO[bytes]],
    *,
    pages: Optional[str] = None,
    locale: Optional[str] = None,
    string_index_type: Optional[Union[str, StringIndexType]] = None,
    features: Optional[List[Union[str, DocumentAnalysisFeature]]] = None,
    query_fields: Optional[List[str]] = None,
    output_content_format: Optional[Union[str, DocumentContentFormat]] = None,
    output: Optional[List[Union[str, AnalyzeOutputOption]]] = None,
    **kwargs: Any
) -> AnalyzeDocumentLROPoller[AnalyzeResult]:
    """
    Analyzes document with the specified model.

    Parameters:
    - model_id (str): Model ID for analysis (e.g., "prebuilt-layout", "prebuilt-invoice")
    - body: Document data as AnalyzeDocumentRequest, JSON dict, or file bytes
    - pages (str, optional): Page range specification (e.g., "1-3,5")
    - locale (str, optional): Locale hint for better recognition
    - string_index_type (StringIndexType, optional): Character indexing scheme
    - features (List[DocumentAnalysisFeature], optional): Additional features to enable
    - query_fields (List[str], optional): Custom field extraction queries
    - output_content_format (DocumentContentFormat, optional): Content format (text/markdown)
    - output (List[AnalyzeOutputOption], optional): Additional outputs (pdf/figures)

    Returns:
    AnalyzeDocumentLROPoller[AnalyzeResult]: Enhanced poller with operation metadata
    """
```

Usage example:

```python
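from azure.ai.documentintelligence import DocumentIntelligenceClient
from azure.core.credentials import AzureKeyCredential

# Placeholder endpoint and key; replace with your own resource values
client = DocumentIntelligenceClient(
    endpoint="https://<your-resource>.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<your-api-key>"),
)

# Analyze with file upload
with open("document.pdf", "rb") as f:
    poller = client.begin_analyze_document(
        model_id="prebuilt-layout",
        body=f,
        features=["languages", "barcodes"],
        output_content_format="markdown",
    )
    result = poller.result()

# Access operation metadata
operation_id = poller.details["operation_id"]

# Analyze with custom fields
with open("invoice.pdf", "rb") as f:
    poller = client.begin_analyze_document(
        "prebuilt-invoice",
        f,
        # Query fields are an add-on capability; the service is expected to
        # require the matching "queryFields" feature alongside query_fields
        features=["queryFields"],
        query_fields=["Tax ID", "Purchase Order"],
    )
    result = poller.result()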
```
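
The `pages` parameter takes a range string such as `"1-3,5"`. As a minimal sketch, the helper below expands such a string into explicit page numbers so it can be validated locally before a request is sent. The function name `expand_page_ranges` is hypothetical and not part of the SDK, and the grammar it accepts (comma-separated single pages or inclusive `start-end` ranges, 1-based) is an assumption based on the example in the docstring above.

```python
from typing import List, Set


def expand_page_ranges(spec: str) -> List[int]:
    """Expand a page-range string such as "1-3,5" into sorted page numbers.

    Hypothetical helper; the accepted grammar is an assumption, not the
    service's documented syntax.
    """
    pages: Set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            raise ValueError(f"empty entry in page range {spec!r}")
        if "-" in part:
            # Inclusive range such as "1-3"
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            # Single page such as "5"
            start = end = int(part)
        if start < 1 or end < start:
            raise ValueError(f"invalid page range {part!r}")
        pages.update(range(start, end + 1))
    return sorted(pages)


print(expand_page_ranges("1-3,5"))  # [1, 2, 3, 5]
```

A malformed string raises `ValueError` locally, which is cheaper than waiting for the service to reject the request.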

### Batch Document Analysis

Processes multiple documents in a single long-running operation. Documents are read from Azure Blob Storage, selected either by container (with an optional prefix) or by an explicit file list, and results are written to a result container.

```python { .api }
def begin_analyze_batch_documents(
    model_id: str,
    body: Union[AnalyzeBatchDocumentsRequest, JSON, IO[bytes]],
    **kwargs: Any
) -> LROPoller[AnalyzeBatchResult]:
    """
    Analyzes multiple documents in batch.

    Parameters:
    - model_id (str): Model ID for batch analysis
    - body: Batch request with Azure Blob source configuration

    Returns:
    LROPoller[AnalyzeBatchResult]: Batch operation poller
    """
```
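
Because `body` also accepts a JSON dict, a batch request can be assembled as plain data. The sketch below builds such a dict for a blob-container source. The helper name `build_batch_request` is hypothetical, and the camelCase keys (`azureBlobSource`, `containerUrl`, `prefix`, `resultContainerUrl`, `resultPrefix`, `overwriteExisting`) are assumed to be the wire names corresponding to the snake_case fields listed under Request Types; verify them against the service reference before relying on them.

```python
from typing import Any, Dict, Optional


def build_batch_request(
    source_container_url: str,
    result_container_url: str,
    source_prefix: Optional[str] = None,
    result_prefix: Optional[str] = None,
    overwrite_existing: bool = False,
) -> Dict[str, Any]:
    """Build a JSON body for a batch analysis request (hypothetical helper).

    Key names are assumed wire names, not confirmed by this document.
    """
    source: Dict[str, Any] = {"containerUrl": source_container_url}
    if source_prefix:
        # Restrict the batch to blobs whose names start with this prefix
        source["prefix"] = source_prefix

    body: Dict[str, Any] = {
        "azureBlobSource": source,
        "resultContainerUrl": result_container_url,
        "overwriteExisting": overwrite_existing,
    }
    if result_prefix:
        body["resultPrefix"] = result_prefix
    return body


# Placeholder URLs; real values would be container URLs the service can access
request_body = build_batch_request(
    "https://example.blob.core.windows.net/invoices",
    "https://example.blob.core.windows.net/results",
    source_prefix="2024/",
    result_prefix="batch-01/",
)
```

The resulting dict could then be passed as `body` to `begin_analyze_batch_documents`, for example `client.begin_analyze_batch_documents("prebuilt-invoice", request_body)`.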

### Batch Results Management

Lists the batch analysis operations for a model, resumes a batch operation from a continuation token, and deletes stored batch results.

```python { .api }
def list_analyze_batch_results(
    model_id: str,
    *,
    skip: Optional[int] = None,
    top: Optional[int] = None,
    **kwargs: Any
) -> Iterable[AnalyzeBatchOperation]:
    """
    Lists batch analysis operations for the specified model.

    Parameters:
    - model_id (str): Model ID to filter operations
    - skip (int, optional): Number of operations to skip
    - top (int, optional): Maximum operations to return

    Returns:
    Iterable[AnalyzeBatchOperation]: Paginated batch operations
    """

def get_analyze_batch_result(
    continuation_token: str,
    **kwargs: Any
) -> LROPoller[AnalyzeBatchResult]:
    """
    Continues batch analysis operation from continuation token.

    Parameters:
    - continuation_token (str): Continuation token for resuming batch operation

    Returns:
    LROPoller[AnalyzeBatchResult]: Batch operation poller
    """

def delete_analyze_batch_result(
    model_id: str,
    result_id: str,
    **kwargs: Any
) -> None:
    """
    Deletes batch analysis result.

    Parameters:
    - model_id (str): Model ID used for analysis
    - result_id (str): Batch operation result ID to delete
    """
```
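
Once a batch poller completes, its `AnalyzeBatchResult` carries `succeeded_count`, `failed_count`, and `skipped_count` (see Response Types). The sketch below turns those counts into a one-line summary. The function name `summarize_batch` is hypothetical, and a `SimpleNamespace` stands in for a real result object so the snippet runs without a service call.

```python
from types import SimpleNamespace


def summarize_batch(result) -> str:
    """Summarize the counts of an AnalyzeBatchResult-like object (hypothetical helper)."""
    total = result.succeeded_count + result.failed_count + result.skipped_count
    if total == 0:
        return "no documents processed"
    rate = 100.0 * result.succeeded_count / total
    return (
        f"{result.succeeded_count}/{total} succeeded ({rate:.1f}%), "
        f"{result.failed_count} failed, {result.skipped_count} skipped"
    )


# Stand-in for the object returned by poller.result()
fake_result = SimpleNamespace(succeeded_count=8, failed_count=1, skipped_count=1)
print(summarize_batch(fake_result))  # 8/10 succeeded (80.0%), 1 failed, 1 skipped
```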

### Document Classification

Classifies a document with a trained classifier to determine its document type, which can then be used to route the document to the appropriate processing workflow.

```python { .api }
def begin_classify_document(
    classifier_id: str,
    body: Union[ClassifyDocumentRequest, JSON, IO[bytes]],
    *,
    string_index_type: Optional[Union[str, StringIndexType]] = None,
    split_mode: Optional[Union[str, SplitMode]] = None,
    pages: Optional[str] = None,
    **kwargs: Any
) -> LROPoller[AnalyzeResult]:
    """
    Classifies document using specified classifier.

    Parameters:
    - classifier_id (str): Document classifier ID
    - body: Document data as ClassifyDocumentRequest, JSON dict, or file bytes
    - string_index_type (StringIndexType, optional): Character indexing scheme
    - split_mode (SplitMode, optional): Document splitting behavior
    - pages (str, optional): Page range specification

    Returns:
    LROPoller[AnalyzeResult]: Classification result poller
    """
```
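
To illustrate the routing idea, the sketch below groups the entries of `AnalyzeResult.documents` by document type and sends low-confidence entries to a review bucket. It assumes each entry exposes `doc_type` and `confidence` attributes, which this page does not define. The function name `route_by_doc_type`, the `"needs_review"` bucket, and the 0.8 threshold are all hypothetical choices for illustration.

```python
from collections import defaultdict
from types import SimpleNamespace
from typing import Dict, List


def route_by_doc_type(documents, min_confidence: float = 0.8) -> Dict[str, List[int]]:
    """Group document indices by doc_type; low-confidence ones go to review.

    Hypothetical helper; assumes `doc_type` and `confidence` attributes.
    """
    routes: Dict[str, List[int]] = defaultdict(list)
    for index, doc in enumerate(documents or []):
        confidence = doc.confidence if doc.confidence is not None else 0.0
        key = doc.doc_type if confidence >= min_confidence else "needs_review"
        routes[key].append(index)
    return dict(routes)


# Stand-ins for the entries of result.documents
classified = [
    SimpleNamespace(doc_type="invoice", confidence=0.97),
    SimpleNamespace(doc_type="receipt", confidence=0.55),
    SimpleNamespace(doc_type="invoice", confidence=0.91),
]
print(route_by_doc_type(classified))  # {'invoice': [0, 2], 'needs_review': [1]}
```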

### Analysis Result Retrieval

Retrieves additional analysis outputs, such as searchable PDFs and extracted figure images, and deletes stored analysis results.

```python { .api }
def get_analyze_result_pdf(
    model_id: str,
    result_id: str,
    **kwargs: Any
) -> Iterator[bytes]:
    """
    Gets analysis result as searchable PDF.

    Parameters:
    - model_id (str): Model ID used for analysis
    - result_id (str): Analysis result ID

    Returns:
    Iterator[bytes]: PDF content stream
    """

def get_analyze_result_figure(
    model_id: str,
    result_id: str,
    figure_id: str,
    **kwargs: Any
) -> Iterator[bytes]:
    """
    Gets extracted figure as image.

    Parameters:
    - model_id (str): Model ID used for analysis
    - result_id (str): Analysis result ID
    - figure_id (str): Figure identifier

    Returns:
    Iterator[bytes]: Image content stream
    """

def delete_analyze_result(
    model_id: str,
    result_id: str,
    **kwargs: Any
) -> None:
    """
    Deletes analysis result.

    Parameters:
    - model_id (str): Model ID used for analysis
    - result_id (str): Analysis result ID to delete
    """
```
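
Both retrieval methods return an `Iterator[bytes]`, so the content has to be consumed chunk by chunk. A minimal sketch of writing such a stream to disk follows. The helper name `save_stream` is hypothetical, and a plain iterator of byte chunks stands in for the SDK response so the snippet runs offline.

```python
import os
import tempfile
from typing import Iterable


def save_stream(chunks: Iterable[bytes], destination: str) -> int:
    """Write an iterable of byte chunks to a file and return the bytes written."""
    written = 0
    with open(destination, "wb") as handle:
        for chunk in chunks:
            handle.write(chunk)
            written += len(chunk)
    return written


# Stand-in for the iterator returned by get_analyze_result_pdf(...)
demo_path = os.path.join(tempfile.mkdtemp(), "result.pdf")
bytes_written = save_stream(iter([b"%PDF-", b"1.7"]), demo_path)
```

With a real client this might look like `save_stream(client.get_analyze_result_pdf(model_id, operation_id), "result.pdf")`, presumably after an analysis that requested the `pdf` output and with `operation_id` taken from `poller.details`.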

## Request Types

```python { .api }
class AnalyzeDocumentRequest:
    """Request for single document analysis."""
    url_source: Optional[str]
    base64_source: Optional[str]
    pages: Optional[str]
    locale: Optional[str]
    string_index_type: Optional[StringIndexType]
    features: Optional[List[DocumentAnalysisFeature]]
    query_fields: Optional[List[str]]
    output_content_format: Optional[DocumentContentFormat]
    output: Optional[List[AnalyzeOutputOption]]

class AnalyzeBatchDocumentsRequest:
    """Request for batch document analysis."""
    azure_blob_source: Optional[AzureBlobContentSource]
    azure_blob_file_list_source: Optional[AzureBlobFileListContentSource]
    result_container_url: str
    result_prefix: Optional[str]
    overwrite_existing: Optional[bool]
    pages: Optional[str]
    locale: Optional[str]
    string_index_type: Optional[StringIndexType]
    features: Optional[List[DocumentAnalysisFeature]]
    query_fields: Optional[List[str]]
    output_content_format: Optional[DocumentContentFormat]
    output: Optional[List[AnalyzeOutputOption]]

class ClassifyDocumentRequest:
    """Request for document classification."""
    url_source: Optional[str]
    base64_source: Optional[str]
    pages: Optional[str]
    string_index_type: Optional[StringIndexType]
    split_mode: Optional[SplitMode]
```
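
The single-document and classification requests each offer two ways to supply the document: a URL or base64-encoded content. The sketch below builds the equivalent JSON dict and enforces that exactly one source is given. The helper name `build_document_source` is hypothetical, and the camelCase keys `urlSource` and `base64Source` are assumed wire names for the fields above rather than something this page confirms.

```python
import base64
from typing import Dict, Optional


def build_document_source(
    url: Optional[str] = None,
    content: Optional[bytes] = None,
) -> Dict[str, str]:
    """Build a JSON body carrying exactly one document source (hypothetical helper)."""
    if (url is None) == (content is None):
        raise ValueError("provide exactly one of url or content")
    if url is not None:
        return {"urlSource": url}
    # Raw bytes are base64-encoded so they can travel inside a JSON string
    return {"base64Source": base64.b64encode(content).decode("ascii")}


print(build_document_source(content=b"hello"))  # {'base64Source': 'aGVsbG8='}
```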

## Response Types

```python { .api }
class AnalyzeResult:
    """Main analysis result containing extracted content and metadata."""
    api_version: Optional[str]
    model_id: str
    string_index_type: Optional[StringIndexType]
    content: Optional[str]
    pages: Optional[List[DocumentPage]]
    paragraphs: Optional[List[DocumentParagraph]]
    tables: Optional[List[DocumentTable]]
    figures: Optional[List[DocumentFigure]]
    sections: Optional[List[DocumentSection]]
    key_value_pairs: Optional[List[DocumentKeyValuePair]]
    styles: Optional[List[DocumentStyle]]
    languages: Optional[List[DocumentLanguage]]
    documents: Optional[List[AnalyzedDocument]]
    warnings: Optional[List[DocumentIntelligenceWarning]]

class AnalyzeBatchResult:
    """Results from batch document analysis."""
    succeeded_count: int
    failed_count: int
    skipped_count: int
    details: List[AnalyzeBatchOperationDetail]

class AnalyzeBatchOperation:
    """Batch operation metadata and status."""
    operation_id: str
    status: DocumentIntelligenceOperationStatus
    created_date_time: datetime
    last_updated_date_time: datetime
    percent_completed: Optional[int]
    result: Optional[AnalyzeBatchResult]
    error: Optional[DocumentIntelligenceError]
```
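
As an example of consuming `AnalyzeResult.tables`, the sketch below rebuilds a table as a row-major grid of strings. It assumes each `DocumentTable` exposes `row_count`, `column_count`, and `cells`, and that each cell has `row_index`, `column_index`, and `content`. Those attribute names are not defined on this page and should be treated as assumptions. The helper name `table_to_grid` is hypothetical, and merged cells (row or column spans) are ignored for simplicity.

```python
from types import SimpleNamespace
from typing import List


def table_to_grid(table) -> List[List[str]]:
    """Rebuild a table as a row-major grid of cell text (hypothetical helper)."""
    grid = [["" for _ in range(table.column_count)] for _ in range(table.row_count)]
    for cell in table.cells:
        # Cells arrive as a flat list; their indices place them in the grid
        grid[cell.row_index][cell.column_index] = cell.content
    return grid


# Stand-in for one entry of result.tables
demo_table = SimpleNamespace(
    row_count=2,
    column_count=2,
    cells=[
        SimpleNamespace(row_index=0, column_index=0, content="Item"),
        SimpleNamespace(row_index=0, column_index=1, content="Price"),
        SimpleNamespace(row_index=1, column_index=0, content="Widget"),
        SimpleNamespace(row_index=1, column_index=1, content="9.99"),
    ],
)
print(table_to_grid(demo_table))  # [['Item', 'Price'], ['Widget', '9.99']]
```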

## Enhanced LRO Poller

```python { .api }
class AnalyzeDocumentLROPoller(LROPoller[AnalyzeResult]):
    """Enhanced poller for document analysis operations."""

    @property
    def details(self) -> Dict[str, Any]:
        """
        Returns operation metadata including operation_id.

        Returns:
        Dict containing operation_id extracted from Operation-Location header
        """

    @classmethod
    def from_continuation_token(
        cls,
        polling_method: PollingMethod,
        continuation_token: str,
        **kwargs: Any
    ) -> "AnalyzeDocumentLROPoller[AnalyzeResult]":
        """Resume operation from continuation token."""
```
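
The `details` docstring says `operation_id` is extracted from the `Operation-Location` header. One plausible way to do that, sketched below, is to take the last path segment of the header URL. This is an illustration and not the SDK's actual implementation. The function name `operation_id_from_location` is hypothetical, and the URL shape in the demo is an assumption.

```python
from urllib.parse import urlparse


def operation_id_from_location(operation_location: str) -> str:
    """Return the last path segment of an Operation-Location URL (hypothetical helper)."""
    # urlparse separates the query string, so "?api-version=..." is not included
    path = urlparse(operation_location).path
    return path.rstrip("/").rsplit("/", 1)[-1]


# Assumed header shape, with a made-up result ID
demo_location = (
    "https://example.cognitiveservices.azure.com/documentintelligence/"
    "documentModels/prebuilt-layout/analyzeResults/abc123?api-version=2024-11-30"
)
print(operation_id_from_location(demo_location))  # abc123
```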

## Client Utility Methods

```python { .api }
def send_request(
    request: HttpRequest,
    *,
    stream: bool = False,
    **kwargs: Any
) -> HttpResponse:
    """
    Sends custom HTTP request using the client's pipeline.

    Parameters:
    - request (HttpRequest): HTTP request to send
    - stream (bool): Whether to stream the response

    Returns:
    HttpResponse: Raw HTTP response
    """

def close() -> None:
    """Close the client and release resources."""
```
