Tessl Tile for pypi/azure-ai-formrecognizer@3.3.0

# Form Recognition (Legacy API)

Traditional form processing with the legacy Form Recognizer API (v2.0 and v2.1). This API provides prebuilt models for common document types and basic custom form training. It is still supported, but the modern Document Analysis API is recommended for new applications.

## Capabilities

### Receipt Recognition

Extracts key information from receipts, including merchant details, transaction amounts, dates, and line items, using the prebuilt receipt model.

```python { .api }
def begin_recognize_receipts(receipt: Union[bytes, IO[bytes]], **kwargs) -> LROPoller[List[RecognizedForm]]:
    """
    Recognize receipt data from documents.

    Parameters:
    - receipt: Receipt document as bytes or file stream
    - locale: Optional locale hint (e.g., "en-US")
    - include_field_elements: Include field elements in response
    - content_type: MIME type of the document

    Returns:
    LROPoller that yields List[RecognizedForm] with extracted receipt data
    """

def begin_recognize_receipts_from_url(receipt_url: str, **kwargs) -> LROPoller[List[RecognizedForm]]:
    """
    Recognize receipt data from document URL.

    Parameters:
    - receipt_url: Publicly accessible URL to receipt document
    - locale: Optional locale hint
    - include_field_elements: Include field elements in response

    Returns:
    LROPoller that yields List[RecognizedForm] with extracted receipt data
    """
```

#### Usage Example

```python
from azure.ai.formrecognizer import FormRecognizerClient
from azure.core.credentials import AzureKeyCredential

# Placeholder endpoint and key; replace with the values for your resource
endpoint = "https://<your-resource-name>.cognitiveservices.azure.com/"
client = FormRecognizerClient(endpoint, AzureKeyCredential("key"))

# From local file
with open("receipt.jpg", "rb") as receipt_file:
    poller = client.begin_recognize_receipts(receipt_file, locale="en-US")
    receipts = poller.result()

# Access extracted data
for receipt in receipts:
    merchant_name = receipt.fields.get("MerchantName")
    if merchant_name:
        print(f"Merchant: {merchant_name.value}")

    total = receipt.fields.get("Total")
    if total:
        print(f"Total: {total.value}")

    # Access line items; each item's value is a dict of named fields
    items = receipt.fields.get("Items")
    if items:
        for item in items.value:
            name = item.value.get("Name")
            price = item.value.get("TotalPrice")
            if name and price:
                print(f"Item: {name.value} - ${price.value}")
```

### Business Card Recognition

Extracts contact information from business cards including names, job titles, organizations, phone numbers, and email addresses.

```python { .api }
def begin_recognize_business_cards(business_card: Union[bytes, IO[bytes]], **kwargs) -> LROPoller[List[RecognizedForm]]:
    """
    Extract business card information.

    Parameters:
    - business_card: Business card document as bytes or file stream
    - locale: Optional locale hint
    - include_field_elements: Include field elements in response
    - content_type: MIME type of the document

    Returns:
    LROPoller that yields List[RecognizedForm] with contact information
    """

def begin_recognize_business_cards_from_url(business_card_url: str, **kwargs) -> LROPoller[List[RecognizedForm]]:
    """
    Extract business card information from URL.

    Parameters:
    - business_card_url: Publicly accessible URL to business card
    - locale: Optional locale hint
    - include_field_elements: Include field elements in response

    Returns:
    LROPoller that yields List[RecognizedForm] with contact information
    """
```
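As a rough illustration of reading a business card result, the sketch below gathers contact names and email addresses from one recognized card. The helper `summarize_business_card` and the stand-in objects are hypothetical and only mimic the `fields` / `value` shape of `RecognizedForm`. The field names `ContactNames`, `FirstName`, `LastName`, and `Emails` are assumed to match what the prebuilt model returns.

```python
from types import SimpleNamespace


def summarize_business_card(card):
    """Collect contact names and email addresses from one recognized card."""
    names = []
    contact_names = card.fields.get("ContactNames")
    if contact_names and contact_names.value:
        for contact in contact_names.value:
            # Each contact's value is a dict of named sub-fields
            first = contact.value.get("FirstName")
            last = contact.value.get("LastName")
            parts = [f.value for f in (first, last) if f and f.value]
            names.append(" ".join(parts))

    emails_field = card.fields.get("Emails")
    emails = []
    if emails_field and emails_field.value:
        emails = [email.value for email in emails_field.value]
    return {"names": names, "emails": emails}


def _field(value):
    # Hypothetical stand-in for a FormField, so the sketch runs offline
    return SimpleNamespace(value=value)


sample_card = SimpleNamespace(
    fields={
        "ContactNames": _field(
            [_field({"FirstName": _field("Ada"), "LastName": _field("Lovelace")})]
        ),
        "Emails": _field([_field("ada@example.com")]),
    }
)
print(summarize_business_card(sample_card))
```

With a real client, each element of the list returned by `begin_recognize_business_cards(...).result()` could be passed to the helper in place of `sample_card`.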

### Invoice Recognition

Processes invoices to extract vendor information, customer details, invoice amounts, due dates, and line item details.

```python { .api }
def begin_recognize_invoices(invoice: Union[bytes, IO[bytes]], **kwargs) -> LROPoller[List[RecognizedForm]]:
    """
    Extract invoice information using prebuilt model.

    Parameters:
    - invoice: Invoice document as bytes or file stream
    - locale: Optional locale hint
    - include_field_elements: Include field elements in response
    - content_type: MIME type of the document

    Returns:
    LROPoller that yields List[RecognizedForm] with invoice data
    """

def begin_recognize_invoices_from_url(invoice_url: str, **kwargs) -> LROPoller[List[RecognizedForm]]:
    """
    Extract invoice information from URL.

    Parameters:
    - invoice_url: Publicly accessible URL to invoice document
    - locale: Optional locale hint
    - include_field_elements: Include field elements in response

    Returns:
    LROPoller that yields List[RecognizedForm] with invoice data
    """
```
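A minimal sketch of pulling a few headline values out of a recognized invoice follows. The helper `invoice_summary` is hypothetical, and the field names (`VendorName`, `CustomerName`, `InvoiceTotal`, `DueDate`) are assumed to be among those the prebuilt invoice model returns. Missing fields are reported as `None` rather than raising.

```python
from datetime import date
from types import SimpleNamespace

DEFAULT_INVOICE_FIELDS = ("VendorName", "CustomerName", "InvoiceTotal", "DueDate")


def invoice_summary(invoice, names=DEFAULT_INVOICE_FIELDS):
    """Map each requested field name to its extracted value, or None if absent."""
    summary = {}
    for name in names:
        field = invoice.fields.get(name)
        summary[name] = field.value if field else None
    return summary


# Hypothetical stand-in for a RecognizedForm, so the sketch runs offline
sample_invoice = SimpleNamespace(
    fields={
        "VendorName": SimpleNamespace(value="Contoso"),
        "InvoiceTotal": SimpleNamespace(value=110.0),
        "DueDate": SimpleNamespace(value=date(2024, 1, 31)),
    }
)
print(invoice_summary(sample_invoice))
```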

### Identity Document Recognition

Extracts information from identity documents such as driver's licenses and passports, including personal details, document numbers, and expiration dates.

```python { .api }
def begin_recognize_identity_documents(identity_document: Union[bytes, IO[bytes]], **kwargs) -> LROPoller[List[RecognizedForm]]:
    """
    Extract identity document information.

    Parameters:
    - identity_document: ID document as bytes or file stream
    - include_field_elements: Include field elements in response
    - content_type: MIME type of the document

    Returns:
    LROPoller that yields List[RecognizedForm] with identity information
    """

def begin_recognize_identity_documents_from_url(identity_document_url: str, **kwargs) -> LROPoller[List[RecognizedForm]]:
    """
    Extract identity document information from URL.

    Parameters:
    - identity_document_url: Publicly accessible URL to ID document
    - include_field_elements: Include field elements in response

    Returns:
    LROPoller that yields List[RecognizedForm] with identity information
    """
```
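One common follow-up to ID extraction is checking the expiration date. The sketch below assumes the result exposes a `DateOfExpiration` field whose value is a `datetime.date`. The helper `expiration_status` and the stand-in objects are hypothetical and exist only to show the shape of the check.

```python
from datetime import date
from types import SimpleNamespace


def expiration_status(id_document, today):
    """Return 'expired', 'valid', or 'unknown' for one recognized ID document."""
    field = id_document.fields.get("DateOfExpiration")
    if field is None or field.value is None:
        # The field was not found or could not be parsed as a date
        return "unknown"
    return "expired" if field.value < today else "valid"


# Hypothetical stand-ins for RecognizedForm results
old_license = SimpleNamespace(
    fields={"DateOfExpiration": SimpleNamespace(value=date(2020, 1, 1))}
)
no_date = SimpleNamespace(fields={})

print(expiration_status(old_license, today=date(2024, 1, 1)))
print(expiration_status(no_date, today=date(2024, 1, 1)))
```

Passing `today` explicitly keeps the check deterministic; production code might default it to `date.today()`.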

### Content Recognition

Extracts layout information including text, tables, and selection marks without using a specific model. Useful for general document layout analysis.

```python { .api }
def begin_recognize_content(form: Union[bytes, IO[bytes]], **kwargs) -> LROPoller[List[FormPage]]:
    """
    Extract layout information from documents.

    Parameters:
    - form: Document as bytes or file stream
    - language: Language code for text recognition
    - pages: Specific page numbers to analyze
    - reading_order: Reading order algorithm
    - content_type: MIME type of the document

    Returns:
    LROPoller that yields List[FormPage] with layout information
    """

def begin_recognize_content_from_url(form_url: str, **kwargs) -> LROPoller[List[FormPage]]:
    """
    Extract layout information from document URL.

    Parameters:
    - form_url: Publicly accessible URL to document
    - language: Language code for text recognition
    - pages: Specific page numbers to analyze
    - reading_order: Reading order algorithm

    Returns:
    LROPoller that yields List[FormPage] with layout information
    """
```
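To show how table layout might be consumed, the sketch below turns one recognized table into a list of rows. It assumes each table exposes `row_count`, `column_count`, and `cells`, and that each cell carries `row_index`, `column_index`, and `text`. The helper `table_to_rows` and the stand-in objects are hypothetical, and spanning cells are not handled.

```python
from types import SimpleNamespace


def table_to_rows(table):
    """Arrange a table's cells into a row-major grid of strings."""
    grid = [["" for _ in range(table.column_count)] for _ in range(table.row_count)]
    for cell in table.cells:
        grid[cell.row_index][cell.column_index] = cell.text
    return grid


def _cell(row, col, text):
    # Hypothetical stand-in for a table cell, so the sketch runs offline
    return SimpleNamespace(row_index=row, column_index=col, text=text)


sample_table = SimpleNamespace(
    row_count=2,
    column_count=2,
    cells=[_cell(0, 0, "Item"), _cell(0, 1, "Price"), _cell(1, 0, "Coffee"), _cell(1, 1, "3.50")],
)
for row in table_to_rows(sample_table):
    print(row)
```

With real results, the helper could be applied to every table on every page, for example `for page in pages: for table in page.tables: ...`.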

### Custom Form Recognition

Uses custom trained models to extract information from domain-specific forms and documents.

```python { .api }
def begin_recognize_custom_forms(model_id: str, form: Union[bytes, IO[bytes]], **kwargs) -> LROPoller[List[RecognizedForm]]:
    """
    Recognize forms using custom trained model.

    Parameters:
    - model_id: ID of custom trained model
    - form: Form document as bytes or file stream
    - include_field_elements: Include field elements in response
    - content_type: MIME type of the document

    Returns:
    LROPoller that yields List[RecognizedForm] with extracted custom form data
    """

def begin_recognize_custom_forms_from_url(model_id: str, form_url: str, **kwargs) -> LROPoller[List[RecognizedForm]]:
    """
    Recognize forms from URL using custom model.

    Parameters:
    - model_id: ID of custom trained model
    - form_url: Publicly accessible URL to form document
    - include_field_elements: Include field elements in response

    Returns:
    LROPoller that yields List[RecognizedForm] with extracted custom form data
    """
```

#### Custom Form Usage Example

```python
# Recognize custom form
model_id = "your-custom-model-id"

with open("custom_form.pdf", "rb") as form_file:
    poller = client.begin_recognize_custom_forms(model_id, form_file)
    forms = poller.result()

# Process results
for form in forms:
    print(f"Form type: {form.form_type}")
    print(f"Confidence: {form.form_type_confidence}")

    for field_name, field in form.fields.items():
        print(f"{field_name}: {field.value} (confidence: {field.confidence})")
```
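Because every field carries a confidence score, one plausible next step is to route uncertain fields to manual review. The sketch below is a hypothetical helper, not part of the SDK. The `0.8` threshold is an arbitrary assumption, and the stand-in objects only mimic the `fields`, `value`, and `confidence` attributes used in the example above.

```python
from types import SimpleNamespace


def low_confidence_fields(form, threshold=0.8):
    """Return {field_name: value} for fields whose confidence is below threshold."""
    flagged = {}
    for name, field in form.fields.items():
        confidence = field.confidence
        # Treat a missing confidence score as uncertain
        if confidence is None or confidence < threshold:
            flagged[name] = field.value
    return flagged


# Hypothetical stand-in for a RecognizedForm, so the sketch runs offline
sample_form = SimpleNamespace(
    fields={
        "PolicyNumber": SimpleNamespace(value="A-1029", confidence=0.97),
        "Signature": SimpleNamespace(value="signed", confidence=0.41),
        "Notes": SimpleNamespace(value=None, confidence=None),
    }
)
print(low_confidence_fields(sample_form))
```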

## FormRecognizerClient

```python { .api }
class FormRecognizerClient:
    """
    Client for analyzing forms using Form Recognizer API v2.1 and below.
    """

    def __init__(
        self,
        endpoint: str,
        credential: Union[AzureKeyCredential, TokenCredential],
        **kwargs
    ):
        """
        Initialize FormRecognizerClient.

        Parameters:
        - endpoint: Cognitive Services endpoint URL
        - credential: Authentication credential
        - api_version: API version (default: FormRecognizerApiVersion.V2_1)
        """

    def close(self) -> None:
        """Close client and release resources."""

# Async version: azure.ai.formrecognizer.aio.FormRecognizerClient
# (same class name as the sync client, imported from the .aio subpackage)
class FormRecognizerClient:
    """
    Async client for analyzing forms using Form Recognizer API v2.1 and below.

    Provides the same methods as the sync FormRecognizerClient but with async/await support.
    """

    def __init__(
        self,
        endpoint: str,
        credential: Union[AzureKeyCredential, AsyncTokenCredential],
        **kwargs
    ):
        """
        Initialize the async FormRecognizerClient.

        Parameters:
        - endpoint: Cognitive Services endpoint URL
        - credential: Authentication credential (must support async operations)
        - api_version: API version (default: FormRecognizerApiVersion.V2_1)
        """

    async def begin_recognize_receipts(self, receipt: Union[bytes, IO[bytes]], **kwargs) -> AsyncLROPoller[List[RecognizedForm]]: ...
    async def begin_recognize_receipts_from_url(self, receipt_url: str, **kwargs) -> AsyncLROPoller[List[RecognizedForm]]: ...
    async def begin_recognize_business_cards(self, business_card: Union[bytes, IO[bytes]], **kwargs) -> AsyncLROPoller[List[RecognizedForm]]: ...
    async def begin_recognize_business_cards_from_url(self, business_card_url: str, **kwargs) -> AsyncLROPoller[List[RecognizedForm]]: ...
    async def begin_recognize_identity_documents(self, identity_document: Union[bytes, IO[bytes]], **kwargs) -> AsyncLROPoller[List[RecognizedForm]]: ...
    async def begin_recognize_identity_documents_from_url(self, identity_document_url: str, **kwargs) -> AsyncLROPoller[List[RecognizedForm]]: ...
    async def begin_recognize_invoices(self, invoice: Union[bytes, IO[bytes]], **kwargs) -> AsyncLROPoller[List[RecognizedForm]]: ...
    async def begin_recognize_invoices_from_url(self, invoice_url: str, **kwargs) -> AsyncLROPoller[List[RecognizedForm]]: ...
    async def begin_recognize_content(self, form: Union[bytes, IO[bytes]], **kwargs) -> AsyncLROPoller[List[FormPage]]: ...
    async def begin_recognize_content_from_url(self, form_url: str, **kwargs) -> AsyncLROPoller[List[FormPage]]: ...
    async def begin_recognize_custom_forms(self, model_id: str, form: Union[bytes, IO[bytes]], **kwargs) -> AsyncLROPoller[List[RecognizedForm]]: ...
    async def begin_recognize_custom_forms_from_url(self, model_id: str, form_url: str, **kwargs) -> AsyncLROPoller[List[RecognizedForm]]: ...

    async def close(self) -> None:
        """Close client and release resources."""
```

## Common Parameters

### Content Types
```python { .api }
class FormContentType(str, Enum):
    APPLICATION_PDF = "application/pdf"
    IMAGE_JPEG = "image/jpeg"
    IMAGE_PNG = "image/png"
    IMAGE_TIFF = "image/tiff"
    IMAGE_BMP = "image/bmp"
```
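When the content type is passed explicitly, it has to be derived from somewhere. A minimal sketch, assuming the file extension is trustworthy, maps extensions to the MIME strings listed above. The helper `guess_content_type` is hypothetical and not part of the SDK.

```python
from pathlib import Path

# Extension-to-MIME mapping that mirrors the FormContentType values above
_CONTENT_TYPES = {
    ".pdf": "application/pdf",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".tif": "image/tiff",
    ".tiff": "image/tiff",
    ".bmp": "image/bmp",
}


def guess_content_type(filename):
    """Return the MIME string for a supported file extension, else raise ValueError."""
    suffix = Path(filename).suffix.lower()
    if suffix not in _CONTENT_TYPES:
        raise ValueError(f"Unsupported file type: {suffix or filename!r}")
    return _CONTENT_TYPES[suffix]


print(guess_content_type("receipt.JPG"))
```

The returned string could then be supplied as the `content_type` keyword argument of a `begin_recognize_*` call.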

### Language Codes
Common values for the `locale` keyword argument:
- `"en-US"` - English (United States)
- `"en-AU"` - English (Australia)
- `"en-CA"` - English (Canada)
- `"en-GB"` - English (Great Britain)
- `"en-IN"` - English (India)

## Error Handling

```python { .api }
from azure.core.exceptions import HttpResponseError

try:
    poller = client.begin_recognize_receipts(receipt_data)
    result = poller.result()
except HttpResponseError as e:
    # Service failures are raised as azure.core HttpResponseError
    print(f"Recognition failed: {e.status_code} - {e.message}")
    # e.error holds the parsed service error when one is available
    if e.error is not None:
        print(f"Error code: {e.error.code}")
        for detail in e.error.details:
            print(f"Detail: {detail}")
```

## Polling Operations

All recognition operations return Long Running Operation (LRO) pollers:

```python
# Start operation
poller = client.begin_recognize_receipts(receipt_data)

# Check status
print(f"Status: {poller.status()}")

# Wait for completion (blocking)
result = poller.result()

# Wait with an upper bound instead of blocking indefinitely
result = poller.result(timeout=300)  # wait at most 5 minutes
```
