# Model Management

This section covers model lifecycle management across both the legacy Form Recognizer API and the modern Document Intelligence API: building custom models, training classifiers, copying models between resources, composing models, and monitoring operations.

## Capabilities

### Legacy Model Management (FormTrainingClient)

Model training and management for Form Recognizer API v2.1 and below, focused on custom form models trained with labels (supervised) or without labels (unsupervised).

#### Model Training

```python { .api }
def begin_training(training_files_url: str, use_training_labels: bool, **kwargs) -> LROPoller[CustomFormModel]:
    """
    Train custom form model from training data.

    Parameters:
    - training_files_url: Azure Blob Storage URL containing training documents
    - use_training_labels: Whether to use labeled training data (supervised)
    - model_name: Optional name for the model
    - prefix: Filter training files by prefix

    Returns:
    LROPoller that yields CustomFormModel when training completes
    """
```

#### Usage Example

```python
from azure.ai.formrecognizer import FormTrainingClient
from azure.core.credentials import AzureKeyCredential

# endpoint is your resource URL, e.g. "https://<resource-name>.cognitiveservices.azure.com/"
training_client = FormTrainingClient(endpoint, AzureKeyCredential("key"))

# Train with labeled data (supervised)
training_files_url = "https://yourstorageaccount.blob.core.windows.net/training-data?sas-token"

poller = training_client.begin_training(
    training_files_url=training_files_url,
    use_training_labels=True,
    model_name="Invoice Model v1"
)

model = poller.result()
print(f"Model ID: {model.model_id}")
print(f"Status: {model.status}")
# page_count is the number of pages in a training document, not an accuracy score
print(f"Pages in first training document: {model.training_documents[0].page_count}")

# Use trained model
from azure.ai.formrecognizer import FormRecognizerClient

form_client = FormRecognizerClient(endpoint, AzureKeyCredential("key"))
with open("invoice.pdf", "rb") as invoice:
    poller = form_client.begin_recognize_custom_forms(model.model_id, invoice)
    result = poller.result()
```

#### Model Information and Listing

```python { .api }
def get_custom_model(model_id: str, **kwargs) -> CustomFormModel:
    """
    Get detailed information about a custom model.

    Parameters:
    - model_id: ID of the custom model

    Returns:
    CustomFormModel with complete model details
    """

def list_custom_models(**kwargs) -> ItemPaged[CustomFormModelInfo]:
    """
    List all custom models in the resource.

    Returns:
    ItemPaged iterator of CustomFormModelInfo objects
    """

def get_account_properties(**kwargs) -> AccountProperties:
    """
    Get account information including model quotas.

    Returns:
    AccountProperties with quota and usage information
    """
```
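The listing and quota calls above can be combined into small helpers. The following is a minimal sketch, assuming `AccountProperties` exposes `custom_model_count` and `custom_model_limit` and that each `CustomFormModelInfo` carries a `model_id`; the helper names `remaining_model_quota` and `list_model_ids` are hypothetical and not part of the SDK.

```python
from typing import List


def remaining_model_quota(training_client) -> int:
    """Return how many more custom models the resource can hold.

    Assumes the returned properties object has `custom_model_limit`
    and `custom_model_count` attributes.
    """
    props = training_client.get_account_properties()
    return props.custom_model_limit - props.custom_model_count


def list_model_ids(training_client) -> List[str]:
    """Collect the IDs of all custom models by iterating the paged listing."""
    return [info.model_id for info in training_client.list_custom_models()]
```

Passing the client in as an argument keeps the helpers independent of how the `FormTrainingClient` is constructed, so they can be exercised with a stand-in object before being pointed at a real resource.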

#### Model Operations

```python { .api }
def delete_model(model_id: str, **kwargs) -> None:
    """
    Delete a custom model.

    Parameters:
    - model_id: ID of model to delete
    """

def get_copy_authorization(resource_id: str, resource_region: str, **kwargs) -> Dict[str, Union[str, int]]:
    """
    Generate authorization for copying model to another resource.
    Call this on the client for the target resource.

    Parameters:
    - resource_id: Target resource ID
    - resource_region: Target resource region

    Returns:
    Dictionary with copy authorization details
    """

def begin_copy_model(model_id: str, target: Dict[str, Union[str, int]], **kwargs) -> LROPoller[CustomFormModelInfo]:
    """
    Copy model to another Form Recognizer resource.

    Parameters:
    - model_id: Source model ID
    - target: Copy authorization from target resource

    Returns:
    LROPoller that yields CustomFormModelInfo for copied model
    """

def begin_create_composed_model(model_ids: List[str], **kwargs) -> LROPoller[CustomFormModel]:
    """
    Create composed model from multiple trained models.

    Parameters:
    - model_ids: List of model IDs to compose
    - model_name: Optional name for composed model

    Returns:
    LROPoller that yields CustomFormModel for composed model
    """
```
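The two copy calls above are meant to be used together: the target resource issues an authorization, and the source resource performs the copy with it. A minimal sketch of that flow follows, assuming both arguments are `FormTrainingClient` instances (or objects with the same methods); the function name `copy_custom_model` and its parameter names are hypothetical.

```python
def copy_custom_model(source_client, target_client, model_id,
                      target_resource_id, target_region):
    """Copy a custom model from the source resource to the target resource."""
    # Step 1: the target resource generates the copy authorization.
    authorization = target_client.get_copy_authorization(
        resource_id=target_resource_id,
        resource_region=target_region,
    )
    # Step 2: the source resource starts the copy using that authorization.
    poller = source_client.begin_copy_model(model_id, target=authorization)
    # Step 3: block until the long-running operation finishes.
    return poller.result()
```

The value returned here is whatever the poller yields, which per the signature above is a `CustomFormModelInfo` describing the copied model.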

### Modern Model Management (DocumentModelAdministrationClient)

Model management for Document Intelligence API 2022-08-31 and later, supporting template and neural build modes, model composition and copying, document classifiers, and operation monitoring.

#### Model Building

```python { .api }
def begin_build_document_model(build_mode: Union[str, ModelBuildMode], **kwargs) -> DocumentModelAdministrationLROPoller[DocumentModelDetails]:
    """
    Build custom document model from training data.

    Parameters:
    - build_mode: "template" or "neural" (ModelBuildMode enum)
    - blob_container_url: Azure Blob Storage URL with training documents
    - prefix: Filter training files by prefix
    - model_id: Optional custom model ID
    - description: Model description
    - tags: Dictionary of custom tags

    Returns:
    DocumentModelAdministrationLROPoller that yields DocumentModelDetails
    """

def begin_compose_document_model(model_ids: List[str], **kwargs) -> DocumentModelAdministrationLROPoller[DocumentModelDetails]:
    """
    Create composed model from multiple document models.

    Parameters:
    - model_ids: List of model IDs to compose (max 100)
    - model_id: Optional custom model ID for composed model
    - description: Model description
    - tags: Dictionary of custom tags

    Returns:
    DocumentModelAdministrationLROPoller that yields DocumentModelDetails
    """
```
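Because the compose call takes a list with a stated upper bound, it can help to validate the list before starting the long-running operation. The sketch below assumes the 100-model limit quoted in the docstring above and an `admin_client` that behaves like `DocumentModelAdministrationClient`; `compose_models` and `MAX_COMPONENT_MODELS` are hypothetical names.

```python
MAX_COMPONENT_MODELS = 100  # limit quoted in the docstring above


def compose_models(admin_client, model_ids, description=None, tags=None):
    """Validate the component list, then compose and wait for the result."""
    ids = list(model_ids)
    if not ids:
        raise ValueError("at least one component model ID is required")
    if len(ids) > MAX_COMPONENT_MODELS:
        raise ValueError(
            f"{len(ids)} component models given; the limit is {MAX_COMPONENT_MODELS}"
        )
    poller = admin_client.begin_compose_document_model(
        ids, description=description, tags=tags
    )
    return poller.result()
```

Checking the list locally is a design choice made for this sketch: it surfaces an obvious mistake immediately instead of after a round trip to the service.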

#### Build Modes

```python { .api }
class ModelBuildMode(str, Enum):
    """Model building approaches for different use cases."""
    TEMPLATE = "template"  # Fast training, structured forms with consistent layout
    NEURAL = "neural"      # Slower training, better for varied layouts and complex documents
```
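The trade-off described in the comments above can be written down as a small decision helper. This is only a sketch: `BuildMode` is a local stand-in that mirrors the two `ModelBuildMode` values so the snippet runs without the SDK, and `choose_build_mode` is a hypothetical function whose single-flag rule is an assumption, not SDK guidance.

```python
from enum import Enum


class BuildMode(str, Enum):
    """Local stand-in mirroring the ModelBuildMode values shown above."""
    TEMPLATE = "template"
    NEURAL = "neural"


def choose_build_mode(consistent_layout: bool) -> BuildMode:
    """Pick template mode for fixed layouts, neural mode for varied ones."""
    return BuildMode.TEMPLATE if consistent_layout else BuildMode.NEURAL
```

Since `build_mode` accepts either a string or the enum, the `.value` of the result could be passed to `begin_build_document_model` directly.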

#### Usage Example

```python
from azure.ai.formrecognizer import DocumentModelAdministrationClient, ModelBuildMode
from azure.core.credentials import AzureKeyCredential

# endpoint is your resource URL, e.g. "https://<resource-name>.cognitiveservices.azure.com/"
admin_client = DocumentModelAdministrationClient(endpoint, AzureKeyCredential("key"))

# Build neural model for complex documents
blob_container_url = "https://yourstorageaccount.blob.core.windows.net/training?sas-token"

poller = admin_client.begin_build_document_model(
    build_mode=ModelBuildMode.NEURAL,
    blob_container_url=blob_container_url,
    description="Contract Analysis Model",
    tags={"project": "legal-docs", "version": "1.0"}
)

model = poller.result()
print(f"Model ID: {model.model_id}")
print(f"Created: {model.created_on}")  # creation timestamp on DocumentModelDetails
print(f"Description: {model.description}")

# Use the model
from azure.ai.formrecognizer import DocumentAnalysisClient

doc_client = DocumentAnalysisClient(endpoint, AzureKeyCredential("key"))
with open("contract.pdf", "rb") as document:
    poller = doc_client.begin_analyze_document(model.model_id, document)
    result = poller.result()
```

#### Model Information and Management

```python { .api }
def get_document_model(model_id: str, **kwargs) -> DocumentModelDetails:
    """
    Get detailed information about a document model.

    Parameters:
    - model_id: Model identifier

    Returns:
    DocumentModelDetails with complete model information
    """

def list_document_models(**kwargs) -> ItemPaged[DocumentModelSummary]:
    """
    List all document models in the resource.

    Returns:
    ItemPaged iterator of DocumentModelSummary objects
    """

def delete_document_model(model_id: str, **kwargs) -> None:
    """
    Delete a document model.

    Parameters:
    - model_id: Model identifier to delete
    """

def get_resource_details(**kwargs) -> ResourceDetails:
    """
    Get resource information including quotas and usage.

    Returns:
    ResourceDetails with quota information
    """
```
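Since models can be tagged at build time, the listing call can be used to look models up by tag. The following is a minimal sketch, assuming each `DocumentModelSummary` exposes `model_id` and an optional `tags` dictionary; `find_models_by_tag` is a hypothetical helper name.

```python
from typing import List


def find_models_by_tag(admin_client, key: str, value: str) -> List[str]:
    """Return the IDs of models whose tags contain the given key/value pair."""
    matches = []
    for summary in admin_client.list_document_models():
        tags = summary.tags or {}  # tags may be missing on untagged models
        if tags.get(key) == value:
            matches.append(summary.model_id)
    return matches
```

The returned IDs could then be passed to `get_document_model` for full details or to `delete_document_model` for cleanup.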

#### Model Copying

```python { .api }
def get_copy_authorization(**kwargs) -> TargetAuthorization:
    """
    Generate authorization for copying model to this resource.

    Parameters:
    - model_id: Optional target model ID
    - description: Optional description for copied model
    - tags: Optional tags for copied model

    Returns:
    TargetAuthorization for model copying
    """

def begin_copy_document_model_to(model_id: str, target: TargetAuthorization, **kwargs) -> DocumentModelAdministrationLROPoller[DocumentModelDetails]:
    """
    Copy document model to another resource.

    Parameters:
    - model_id: Source model ID
    - target: TargetAuthorization from destination resource

    Returns:
    DocumentModelAdministrationLROPoller that yields DocumentModelDetails
    """
```

#### Model Copying Example

```python
# On target resource - generate authorization
target_admin_client = DocumentModelAdministrationClient(target_endpoint, target_credential)
target_auth = target_admin_client.get_copy_authorization(
    model_id="copied-model-id",
    description="Copied invoice model",
    tags={"source": "prod-resource"}
)

# On source resource - perform copy
source_admin_client = DocumentModelAdministrationClient(source_endpoint, source_credential)
copy_poller = source_admin_client.begin_copy_document_model_to(
    "source-model-id",
    target_auth
)

copied_model = copy_poller.result()
print(f"Model copied to: {copied_model.model_id}")
```

### Document Classification

Building and managing document classifiers for automatic document type detection.

```python { .api }
def begin_build_document_classifier(doc_types: Mapping[str, ClassifierDocumentTypeDetails], **kwargs) -> DocumentModelAdministrationLROPoller[DocumentClassifierDetails]:
    """
    Build custom document classifier.

    Parameters:
    - doc_types: Mapping of document type names to their training data sources
    - classifier_id: Optional custom classifier ID
    - description: Classifier description

    Returns:
    DocumentModelAdministrationLROPoller that yields DocumentClassifierDetails
    """

def get_document_classifier(classifier_id: str, **kwargs) -> DocumentClassifierDetails:
    """
    Get document classifier information.

    Parameters:
    - classifier_id: Classifier identifier

    Returns:
    DocumentClassifierDetails with classifier information
    """

def list_document_classifiers(**kwargs) -> ItemPaged[DocumentClassifierDetails]:
    """
    List all document classifiers.

    Returns:
    ItemPaged iterator of DocumentClassifierDetails
    """

def delete_document_classifier(classifier_id: str, **kwargs) -> None:
    """
    Delete document classifier.

    Parameters:
    - classifier_id: Classifier identifier to delete
    """
```

#### Classifier Building Example

```python
from azure.ai.formrecognizer import (
    BlobFileListSource,
    BlobSource,
    ClassifierDocumentTypeDetails,
)

# Define document types and training data.
# Each entry wraps a blob source in ClassifierDocumentTypeDetails rather than a plain dict.
doc_types = {
    "invoice": ClassifierDocumentTypeDetails(
        source=BlobSource(
            container_url="https://storage.blob.core.windows.net/invoices?sas",
            prefix="training/"
        )
    ),
    "receipt": ClassifierDocumentTypeDetails(
        source=BlobSource(
            container_url="https://storage.blob.core.windows.net/receipts?sas",
            prefix="training/"
        )
    ),
    "contract": ClassifierDocumentTypeDetails(
        source=BlobFileListSource(
            container_url="https://storage.blob.core.windows.net/contracts?sas",
            file_list="contract_files.json"
        )
    )
}

# Build classifier
poller = admin_client.begin_build_document_classifier(
    doc_types=doc_types,
    description="Financial Document Classifier",
    classifier_id="financial-docs-v1"
)

classifier = poller.result()
print(f"Classifier ID: {classifier.classifier_id}")
print(f"Document types: {list(classifier.doc_types.keys())}")
```

### Operation Monitoring

Track and monitor long-running operations across the service.

```python { .api }
def list_operations(**kwargs) -> ItemPaged[OperationSummary]:
    """
    List all operations for the resource.

    Returns:
    ItemPaged iterator of OperationSummary objects
    """

def get_operation(operation_id: str, **kwargs) -> OperationDetails:
    """
    Get detailed information about a specific operation.

    Parameters:
    - operation_id: Operation identifier

    Returns:
    OperationDetails with complete operation information
    """
```

#### Operation Monitoring Example

```python
# List recent operations
operations = admin_client.list_operations()

for operation in operations:
    print(f"Operation: {operation.operation_id}")
    print(f"Kind: {operation.kind}")
    print(f"Status: {operation.status}")
    print(f"Progress: {operation.percent_completed}%")
    print(f"Created: {operation.created_on}")  # creation timestamp on OperationSummary

    if operation.status == "failed":
        # Get detailed error information
        details = admin_client.get_operation(operation.operation_id)
        if details.error:
            print(f"Error: {details.error.code} - {details.error.message}")
```
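When a resource has many operations, a per-status tally is often easier to read than the full listing. The sketch below assumes each item returned by `list_operations()` has a `status` attribute, as used in the example above; `summarize_operations` is a hypothetical helper name.

```python
from collections import Counter


def summarize_operations(admin_client) -> Counter:
    """Count operations by status, e.g. how many succeeded or failed."""
    return Counter(op.status for op in admin_client.list_operations())
```

A `Counter` returns 0 for statuses that never occurred, so a check such as `summarize_operations(admin_client)["failed"] > 0` works without a prior key test.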

## FormTrainingClient

```python { .api }
class FormTrainingClient:
    """
    Client for training and managing custom models using Form Recognizer API v2.1 and below.
    """

    def __init__(
        self,
        endpoint: str,
        credential: Union[AzureKeyCredential, TokenCredential],
        **kwargs
    ):
        """
        Initialize FormTrainingClient.

        Parameters:
        - endpoint: Cognitive Services endpoint URL
        - credential: Authentication credential
        - api_version: API version (default: FormRecognizerApiVersion.V2_1)
        """

    def get_form_recognizer_client(self, **kwargs) -> FormRecognizerClient:
        """
        Get FormRecognizerClient using same configuration.

        Returns:
        FormRecognizerClient instance
        """

    def close(self) -> None:
        """Close client and release resources."""

# Async version
class AsyncFormTrainingClient:
    """
    Async client for training and managing custom models using Form Recognizer API v2.1 and below.

    Provides the same methods as FormTrainingClient but with async/await support.
    """

    def __init__(
        self,
        endpoint: str,
        credential: Union[AzureKeyCredential, AsyncTokenCredential],
        **kwargs
    ):
        """
        Initialize AsyncFormTrainingClient.

        Parameters:
        - endpoint: Cognitive Services endpoint URL
        - credential: Authentication credential (must support async operations)
        - api_version: API version (default: FormRecognizerApiVersion.V2_1)
        """

    async def begin_training(self, training_files_url: str, use_training_labels: bool, **kwargs) -> AsyncLROPoller[CustomFormModel]: ...
    async def delete_model(self, model_id: str, **kwargs) -> None: ...
    # Not a coroutine: returns an async iterator to consume with `async for`
    def list_custom_models(self, **kwargs) -> AsyncItemPaged[CustomFormModelInfo]: ...
    async def get_account_properties(self, **kwargs) -> AccountProperties: ...
    async def get_custom_model(self, model_id: str, **kwargs) -> CustomFormModel: ...
    async def get_copy_authorization(self, resource_id: str, resource_region: str, **kwargs) -> Dict[str, Union[str, int]]: ...
    async def begin_copy_model(self, model_id: str, target: Dict[str, Union[str, int]], **kwargs) -> AsyncLROPoller[CustomFormModelInfo]: ...
    async def begin_create_composed_model(self, model_ids: List[str], **kwargs) -> AsyncLROPoller[CustomFormModel]: ...

    def get_form_recognizer_client(self, **kwargs) -> AsyncFormRecognizerClient:
        """
        Get AsyncFormRecognizerClient using same configuration.

        Returns:
        AsyncFormRecognizerClient instance
        """

    async def close(self) -> None:
        """Close client and release resources."""
```

## DocumentModelAdministrationClient

```python { .api }
class DocumentModelAdministrationClient:
    """
    Client for building and managing models using Document Intelligence API 2022-08-31 and later.
    """

    def __init__(
        self,
        endpoint: str,
        credential: Union[AzureKeyCredential, TokenCredential],
        **kwargs
    ):
        """
        Initialize DocumentModelAdministrationClient.

        Parameters:
        - endpoint: Cognitive Services endpoint URL
        - credential: Authentication credential
        - api_version: API version (default: DocumentAnalysisApiVersion.V2023_07_31)
        """

    def get_document_analysis_client(self, **kwargs) -> DocumentAnalysisClient:
        """
        Get DocumentAnalysisClient using same configuration.

        Returns:
        DocumentAnalysisClient instance
        """

    def close(self) -> None:
        """Close client and release resources."""

# Async version
class AsyncDocumentModelAdministrationClient:
    """
    Async client for building and managing models using Document Intelligence API 2022-08-31 and later.

    Provides the same methods as DocumentModelAdministrationClient but with async/await support.
    """

    def __init__(
        self,
        endpoint: str,
        credential: Union[AzureKeyCredential, AsyncTokenCredential],
        **kwargs
    ):
        """
        Initialize AsyncDocumentModelAdministrationClient.

        Parameters:
        - endpoint: Cognitive Services endpoint URL
        - credential: Authentication credential (must support async operations)
        - api_version: API version (default: DocumentAnalysisApiVersion.V2023_07_31)
        """

    async def begin_build_document_model(self, build_mode: Union[str, ModelBuildMode], **kwargs) -> AsyncDocumentModelAdministrationLROPoller[DocumentModelDetails]: ...
    async def begin_compose_document_model(self, model_ids: List[str], **kwargs) -> AsyncDocumentModelAdministrationLROPoller[DocumentModelDetails]: ...
    async def get_copy_authorization(self, **kwargs) -> TargetAuthorization: ...
    async def begin_copy_document_model_to(self, model_id: str, target: TargetAuthorization, **kwargs) -> AsyncDocumentModelAdministrationLROPoller[DocumentModelDetails]: ...
    async def delete_document_model(self, model_id: str, **kwargs) -> None: ...
    # The list_* methods are not coroutines: they return async iterators to consume with `async for`
    def list_document_models(self, **kwargs) -> AsyncItemPaged[DocumentModelSummary]: ...
    async def get_resource_details(self, **kwargs) -> ResourceDetails: ...
    async def get_document_model(self, model_id: str, **kwargs) -> DocumentModelDetails: ...
    def list_operations(self, **kwargs) -> AsyncItemPaged[OperationSummary]: ...
    async def get_operation(self, operation_id: str, **kwargs) -> OperationDetails: ...
    async def begin_build_document_classifier(self, doc_types: Mapping[str, ClassifierDocumentTypeDetails], **kwargs) -> AsyncDocumentModelAdministrationLROPoller[DocumentClassifierDetails]: ...
    async def get_document_classifier(self, classifier_id: str, **kwargs) -> DocumentClassifierDetails: ...
    def list_document_classifiers(self, **kwargs) -> AsyncItemPaged[DocumentClassifierDetails]: ...
    async def delete_document_classifier(self, classifier_id: str, **kwargs) -> None: ...

    def get_document_analysis_client(self, **kwargs) -> AsyncDocumentAnalysisClient:
        """
        Get AsyncDocumentAnalysisClient using same configuration.

        Returns:
        AsyncDocumentAnalysisClient instance
        """

    async def close(self) -> None:
        """Close client and release resources."""
```

## Training Data Requirements

### Blob Storage Structure

```
container/
├── training/
│   ├── document1.pdf
│   ├── document2.pdf
│   ├── document3.pdf
│   └── ...
└── labels/                          # For supervised training
    ├── document1.pdf.labels.json
    ├── document2.pdf.labels.json
    └── ...
```
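Before starting supervised training, it can be worth checking that every training document has a matching label file. The sketch below works on a plain list of blob names and assumes exactly the layout shown above (`training/<name>.pdf` paired with `labels/<name>.pdf.labels.json`); `missing_label_files` is a hypothetical helper, and obtaining the blob names from storage is left out.

```python
from typing import Iterable, List


def missing_label_files(blob_names: Iterable[str]) -> List[str]:
    """Return training PDFs that have no matching labels/<doc>.labels.json blob."""
    names = set(blob_names)
    missing = []
    for name in sorted(names):
        if name.startswith("training/") and name.lower().endswith(".pdf"):
            document = name[len("training/"):]
            if f"labels/{document}.labels.json" not in names:
                missing.append(name)
    return missing


blobs = [
    "training/document1.pdf",
    "training/document2.pdf",
    "labels/document1.pdf.labels.json",
]
print(missing_label_files(blobs))
```

For the sample list, only `training/document2.pdf` lacks a label file, so it is the single entry reported.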

### Label Format (Legacy API)

```json
{
  "document": "document1.pdf",
  "labels": [
    {
      "label": "VendorName",
      "key": null,
      "value": [
        {
          "page": 1,
          "text": "Contoso Inc",
          "boundingBoxes": [
            [100, 200, 300, 200, 300, 250, 100, 250]
          ]
        }
      ]
    }
  ]
}
```
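Label files with this shape can be generated programmatically rather than written by hand. The following is a minimal sketch that reproduces only the fields shown in the JSON above; `make_label_file` and its tuple-based input are hypothetical, and any fields the service may require beyond those shown are not covered.

```python
import json
from typing import Dict, List, Sequence, Tuple

# (label, text, page, bounding_box) where bounding_box holds 8 coordinates
LabelSpec = Tuple[str, str, int, Sequence[float]]


def make_label_file(document: str, fields: List[LabelSpec]) -> Dict:
    """Build a label-file dictionary matching the structure shown above."""
    return {
        "document": document,
        "labels": [
            {
                "label": label,
                "key": None,  # serialized as JSON null
                "value": [
                    {"page": page, "text": text, "boundingBoxes": [list(box)]}
                ],
            }
            for label, text, page, box in fields
        ],
    }


label_file = make_label_file(
    "document1.pdf",
    [("VendorName", "Contoso Inc", 1, [100, 200, 300, 200, 300, 250, 100, 250])],
)
print(json.dumps(label_file, indent=2))
```

Following the blob layout above, the serialized output would be saved as `document1.pdf.labels.json`.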

### Modern API Training Data

```json
{
  "fields": {
    "VendorName": {
      "type": "string",
      "valueString": "Contoso Inc"
    },
    "InvoiceTotal": {
      "type": "number",
      "valueNumber": 1234.56
    }
  },
  "boundingRegions": [
    {
      "pageNumber": 1,
      "polygon": [100, 200, 300, 200, 300, 250, 100, 250]
    }
  ]
}
```

## Error Handling

```python { .api }
from azure.core.exceptions import HttpResponseError

# Service failures are raised as azure-core exceptions. FormRecognizerError and
# DocumentAnalysisError are result models, not exception classes.

# Legacy API errors
try:
    poller = training_client.begin_training(training_url, True)
    model = poller.result()
except HttpResponseError as e:
    # e.error holds the parsed service error when one is available
    code = e.error.code if e.error else e.status_code
    print(f"Training failed: {code} - {e.message}")

# Modern API errors
try:
    poller = admin_client.begin_build_document_model(ModelBuildMode.NEURAL, blob_container_url=training_url)
    model = poller.result()
except HttpResponseError as e:
    code = e.error.code if e.error else e.status_code
    print(f"Model building failed: {code} - {e.message}")
    if e.error and e.error.innererror:
        print(f"Inner error: {e.error.innererror.get('code')}")
```