tessl/pypi-azure-ai-documentintelligence

tessl install tessl/pypi-azure-ai-documentintelligence@1.0.0

Azure AI Document Intelligence client library for Python - a cloud service that uses machine learning to analyze text and structured data from documents

Agent Success: 76% (agent success rate when using this tile)

Improvement: 1.19x (agent success rate improvement when using this tile compared to baseline)

Baseline: 64% (agent success rate without this tile)

evals/scenario-8/task.md

Document Classifier Router

Create a small utility that classifies documents with a given classifier identifier and returns routing-ready metadata, including which pages belong to each detected document type.

Capabilities

Classify a complete document

  • Given a classifier identifier and a PDF, the utility returns at least one classification entry containing a document type label, a confidence score, and a page list; when no page filter is provided, the page list includes all page numbers. @test

Restrict classification to specific pages

  • When a page filter string such as "2-3" is supplied, only those pages are classified and included in the returned page list. @test
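
The page filter is passed around as a string, so a caller may want to validate it and know which page numbers it selects before sending anything to the service. The sketch below expands a filter such as "2-3" or "1-2,4" into a sorted list of 1-based page numbers. `expand_page_filter` and its `page_count` argument are hypothetical helpers for illustration; they are not part of azure-ai-documentintelligence, and the accepted syntax (comma-separated pages and inclusive ranges) is an assumption based on the examples in this task.

```python
from typing import List, Optional


def expand_page_filter(pages: Optional[str], page_count: int) -> List[int]:
    """Expand a filter such as "1-2,4" into sorted 1-based page numbers.

    Hypothetical helper, not part of azure-ai-documentintelligence.
    """
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    if pages is None or not pages.strip():
        # No filter supplied: every page of the document is in scope.
        return list(range(1, page_count + 1))

    selected = set()
    for part in pages.split(","):
        part = part.strip()
        if not part:
            raise ValueError(f"empty segment in page filter {pages!r}")
        # "2-3" is an inclusive range; a bare "4" is a single page.
        start_text, sep, end_text = part.partition("-")
        start = int(start_text)
        end = int(end_text) if sep else start
        if start < 1 or end < start or end > page_count:
            raise ValueError(f"page range {part!r} is outside 1..{page_count}")
        selected.update(range(start, end + 1))
    return sorted(selected)
```

Duplicates and overlapping ranges collapse because the pages are collected in a set, and malformed segments raise `ValueError` rather than being silently ignored.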

Per-page routing

  • When per-page routing is requested, the utility produces a separate classification entry per page, each with its own document type and confidence, so different pages in a single file can map to different types. @test
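
Whether results are grouped per document or per page, the routing metadata can be derived the same way: one entry per classified document, with its pages collected from that document's bounding regions. The sketch below assumes the classification result exposes `documents`, each with `doc_type`, `confidence`, and `bounding_regions` whose items carry `page_number`; that shape should be checked against the library's `AnalyzeResult` model. `to_routing_entries` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins with made-up values, not real service output.

```python
from types import SimpleNamespace
from typing import Any, Dict, List, Union

RoutingEntry = Dict[str, Union[str, float, List[int]]]


def to_routing_entries(result: Any) -> List[RoutingEntry]:
    """Convert a classification result into routing-ready dictionaries.

    Hypothetical helper; relies only on attribute names (duck typing).
    """
    entries: List[RoutingEntry] = []
    for doc in getattr(result, "documents", None) or []:
        regions = getattr(doc, "bounding_regions", None) or []
        # A document may span several regions; keep each page number once.
        pages = sorted({region.page_number for region in regions})
        entries.append(
            {
                "doc_type": doc.doc_type,
                "confidence": float(doc.confidence),
                "pages": pages,
            }
        )
    return entries


# Stand-in for a per-page result: two pages, two different document types.
fake_result = SimpleNamespace(
    documents=[
        SimpleNamespace(
            doc_type="invoice",
            confidence=0.93,
            bounding_regions=[SimpleNamespace(page_number=1)],
        ),
        SimpleNamespace(
            doc_type="receipt",
            confidence=0.81,
            bounding_regions=[SimpleNamespace(page_number=2)],
        ),
    ]
)
routing = to_routing_entries(fake_result)
```

With a per-page result every entry holds exactly one page, so different pages in the same file can map to different types without any extra logic in the converter.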

Implementation

@generates

API

from typing import BinaryIO, Dict, List, Optional, Union

def classify_document(
    classifier_id: str,
    source: Union[str, bytes, BinaryIO],
    pages: Optional[str] = None,
    per_page: bool = False,
) -> List[Dict[str, Union[str, float, List[int]]]]:
    """
    Runs classification using a remote classifier and returns routing metadata.

    - classifier_id: Identifier of the trained classifier to use.
    - source: Document input as a file path, URL, bytes, or open binary stream.
    - pages: Optional page filter string such as "1-2,4" to restrict classification.
    - per_page: When True, splits results per page instead of document-level groupings.

    Returns:
    A list of dictionaries with keys: doc_type (str), confidence (float), pages (list of page numbers).
    """
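
One way the signature above could be wired to the service is sketched below. It assumes the 1.0.x client exposes `DocumentIntelligenceClient.begin_classify_document` with `split` and `pages` keyword arguments and a `ClassifyDocumentRequest` model with `url_source` and `bytes_source` fields; verify these against the package reference before relying on them. The helper names (`load_source`, `build_classify_kwargs`, `run_classification`), the environment variable names, and the choice of `"auto"` splitting when `per_page` is False are all assumptions made for this sketch.

```python
import os
from typing import Any, BinaryIO, Dict, Optional, Tuple, Union


def load_source(source: Union[str, bytes, BinaryIO]) -> Tuple[str, Union[str, bytes]]:
    """Normalize the input to ("url", str) or ("bytes", bytes)."""
    if isinstance(source, str):
        if source.startswith(("http://", "https://")):
            return "url", source
        # Any other string is treated as a local file path.
        with open(source, "rb") as handle:
            return "bytes", handle.read()
    if isinstance(source, (bytes, bytearray)):
        return "bytes", bytes(source)
    # Anything else is assumed to be an open binary stream.
    return "bytes", source.read()


def build_classify_kwargs(pages: Optional[str], per_page: bool) -> Dict[str, str]:
    """Keyword arguments for the classify call (assumed parameter names)."""
    kwargs = {"split": "perPage" if per_page else "auto"}
    if pages:
        kwargs["pages"] = pages
    return kwargs


def run_classification(
    classifier_id: str,
    source: Union[str, bytes, BinaryIO],
    pages: Optional[str] = None,
    per_page: bool = False,
) -> Any:
    """Call the service and return its raw result (requires the SDK)."""
    # Imported lazily so the helpers above work without the SDK installed.
    from azure.ai.documentintelligence import DocumentIntelligenceClient
    from azure.ai.documentintelligence.models import ClassifyDocumentRequest
    from azure.core.credentials import AzureKeyCredential

    kind, value = load_source(source)
    if kind == "url":
        request = ClassifyDocumentRequest(url_source=value)
    else:
        request = ClassifyDocumentRequest(bytes_source=value)

    # Environment variable names are an assumption of this sketch.
    client = DocumentIntelligenceClient(
        endpoint=os.environ["DOCUMENTINTELLIGENCE_ENDPOINT"],
        credential=AzureKeyCredential(os.environ["DOCUMENTINTELLIGENCE_API_KEY"]),
    )
    poller = client.begin_classify_document(
        classifier_id, request, **build_classify_kwargs(pages, per_page)
    )
    return poller.result()
```

Keeping input normalization and argument building separate from the network call lets both be unit tested without credentials; only `run_classification` needs a deployed classifier.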

Dependencies { .dependencies }

azure-ai-documentintelligence { .dependency }

Python client library for cloud-hosted document intelligence classification and routing.

Workspace: tessl
Visibility: Public
Describes: pypipkg:pypi/azure-ai-documentintelligence@1.0.x