Tessl Tile for pypi/mistralai@1.9.0

# OCR (Optical Character Recognition)

Process documents and images to extract text and structured data using optical character recognition. The OCR API can analyze various document formats and extract text with position information.

## Capabilities

### Document Processing

Process documents and images to extract text and structural information.

```python { .api }
def process(
    model: str,
    document: Document,
    pages: Optional[List[int]] = None,
    **kwargs
) -> OCRResponse:
    """
    Process a document with OCR.

    Parameters:
    - model: OCR model identifier
    - document: Document to process (image or PDF)
    - pages: Optional list of page numbers to process

    Returns:
    OCRResponse with extracted text and structure information
    """
```

## Usage Examples

### Process a PDF Document

```python
from mistralai import Mistral
from mistralai.models import Document

client = Mistral(api_key="your-api-key")

# Read a PDF from disk and wrap its bytes in a Document
with open("document.pdf", "rb") as f:
    document = Document(
        type="application/pdf",
        data=f.read()
    )

response = client.ocr.process(
    model="ocr-model",
    document=document,
    pages=[1, 2, 3]  # Process first 3 pages
)

# Print the text and dimensions of each returned page
for page in response.pages:
    print(f"Page {page.page_number}:")
    print(f"Text: {page.text}")
    print(f"Dimensions: {page.dimensions.width}x{page.dimensions.height}")
    print()
```
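Downstream code often needs the extracted text as one string rather than page by page. The following is a minimal sketch of that step, assuming pages expose the `page_number` and `text` fields used above. `Page` and `join_pages` are hypothetical names introduced for illustration, not part of the SDK, so the block runs without an API key.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Page:
    """Hypothetical stand-in for a page object (only the fields used here)."""
    page_number: int
    text: str


def join_pages(pages: Iterable[Page], separator: str = "\n\n") -> str:
    """Concatenate page text in page order, skipping pages with no text."""
    ordered = sorted(pages, key=lambda p: p.page_number)
    return separator.join(p.text.strip() for p in ordered if p.text.strip())


# Pages may arrive out of order or contain only whitespace.
pages: List[Page] = [
    Page(2, "Second page."),
    Page(1, "First page.\n"),
    Page(3, "   "),
]
full_text = join_pages(pages)
print(full_text)
```

With a real response, `join_pages(response.pages)` would work the same way as long as the page objects carry those two attributes.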

### Process with Structure Analysis

```python
# Reuses `client` and `document` from the previous example.
# No `pages` argument is passed, so no page filter is applied.
response = client.ocr.process(
    model="ocr-model",
    document=document
)

# Access structured information
for page in response.pages:
    print(f"Page {page.page_number}:")

    # Extract images if present
    for image in page.images:
        print(f"  Image: {image.width}x{image.height} at ({image.x}, {image.y})")

    # Get text content
    print(f"  Text content: {len(page.text)} characters")
    print(f"  Preview: {page.text[:200]}...")
```

## Types

### Request Types

```python { .api }
class OCRRequest:
    model: str
    document: Document
    pages: Optional[List[int]]

class Document:
    type: str  # MIME type (e.g., "application/pdf", "image/jpeg")
    data: bytes  # Document content as bytes
```
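Because `Document.type` is a MIME type, it can be derived from the file name instead of being hard-coded. The sketch below uses the standard-library `mimetypes` module. `DocumentPayload`, `SUPPORTED_TYPES`, `guess_document_type`, and `load_document` are hypothetical names for illustration, and the accepted MIME set is an assumption based on the formats this page lists.

```python
import mimetypes
from dataclasses import dataclass
from pathlib import Path

# Assumed from the "Supported Formats" list; the service may accept more or fewer.
SUPPORTED_TYPES = {"application/pdf", "image/jpeg", "image/png", "image/tiff"}


@dataclass
class DocumentPayload:
    """Hypothetical stand-in mirroring the Document fields above."""
    type: str
    data: bytes


def guess_document_type(path: str) -> str:
    """Return the MIME type for a path, or raise if it is not a supported format."""
    mime, _ = mimetypes.guess_type(path)
    if mime not in SUPPORTED_TYPES:
        raise ValueError(f"Unsupported document type for {path!r}: {mime}")
    return mime


def load_document(path: str) -> DocumentPayload:
    """Read a file from disk and pair its bytes with the guessed MIME type."""
    return DocumentPayload(
        type=guess_document_type(path),
        data=Path(path).read_bytes(),
    )


print(guess_document_type("scan.pdf"))
print(guess_document_type("photo.jpg"))
```

Rejecting unknown types before reading the file keeps unsupported uploads from reaching the API call.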

### Response Types

```python { .api }
class OCRResponse:
    id: str
    object: str
    model: str
    pages: List[OCRPageObject]
    usage: Optional[OCRUsageInfo]

class OCRPageObject:
    page_number: int
    text: str
    dimensions: OCRPageDimensions
    images: List[OCRImageObject]

class OCRPageDimensions:
    width: float
    height: float

class OCRImageObject:
    x: float
    y: float
    width: float
    height: float

class OCRUsageInfo:
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int
```
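The image and page dimension fields can be combined to estimate how much of a page an embedded image occupies. This is only a sketch under stated assumptions: it treats `(x, y)` as the top-left corner of the image and assumes image and page values share the same units, neither of which this page confirms. `Dimensions`, `ImageBox`, and `image_coverage` are hypothetical stand-ins, not SDK names.

```python
from dataclasses import dataclass


@dataclass
class Dimensions:
    """Hypothetical stand-in for the page dimensions type."""
    width: float
    height: float


@dataclass
class ImageBox:
    """Hypothetical stand-in for the image object; (x, y) assumed top-left."""
    x: float
    y: float
    width: float
    height: float


def image_coverage(image: ImageBox, page: Dimensions) -> float:
    """Fraction of the page area covered by the image, clipped to the page."""
    left = max(image.x, 0.0)
    top = max(image.y, 0.0)
    right = min(image.x + image.width, page.width)
    bottom = min(image.y + image.height, page.height)
    # A box that lies entirely off the page has a negative extent; clamp to zero.
    visible = max(right - left, 0.0) * max(bottom - top, 0.0)
    page_area = page.width * page.height
    return visible / page_area if page_area > 0 else 0.0


page = Dimensions(width=200.0, height=100.0)
print(image_coverage(ImageBox(0.0, 0.0, 100.0, 50.0), page))
```

A ratio like this can help flag pages that are mostly figures, where the extracted text is likely to be sparse.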

## Supported Formats

### Document Types

- **PDF**: Multi-page PDF documents
- **Images**: JPEG, PNG, TIFF formats
- **Scanned Documents**: Digital scans of physical documents

### Output Information

- **Text Content**: Extracted text with reading order
- **Layout Information**: Page dimensions and structure
- **Image Detection**: Embedded images and their positions
- **Coordinate Information**: Position data for text elements

## Best Practices

### Document Quality

- Use high-resolution images for better accuracy
- Ensure good contrast between text and background
- Minimize skew and rotation in source documents
- Clean, well-lit scans produce better results

### Processing Optimization

- Specify page ranges for large documents to reduce processing time
- Consider document orientation and layout complexity
- Test with representative samples before batch processing
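
The page-range advice above can be sketched as a small batching helper that splits a long document into fixed-size lists of page numbers. `page_batches` is a hypothetical helper, not part of the SDK. Its default of 1-based numbering follows the `pages=[1, 2, 3]` usage example; whether the service counts pages from 0 or 1 should be verified before relying on it.

```python
from typing import Iterator, List


def page_batches(
    total_pages: int, batch_size: int, first_page: int = 1
) -> Iterator[List[int]]:
    """Yield consecutive lists of page numbers, at most batch_size pages each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    end = first_page + total_pages  # one past the last page number
    for start in range(first_page, end, batch_size):
        yield list(range(start, min(start + batch_size, end)))


for batch in page_batches(total_pages=7, batch_size=3):
    print(batch)
```

Each yielded list could be passed as the `pages` argument of a separate `process` call, which also makes it easy to try one batch as a representative sample before running the rest.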
