Tessl Tile for pypi/tencentcloud-sdk-python@3.0.0

# Image Translation

OCR-based translation of text embedded in images, with results returned line by line. Two API endpoints take different approaches: standard OCR translation, which covers 13 languages, and enhanced LLM-powered translation, which covers 18.

## Capabilities

### Standard Image Translation

Recognizes and translates text in images line by line for 13 languages using OCR technology. Suitable for documents, signs, and text-heavy images.

```python { .api }
def ImageTranslate(self, request: models.ImageTranslateRequest) -> models.ImageTranslateResponse:
    """
    Translate text within images using OCR recognition.

    Args:
        request: ImageTranslateRequest with image data and parameters

    Returns:
        ImageTranslateResponse with translated text records and positions

    Raises:
        TencentCloudSDKException: For various error conditions
    """
```

**Usage Example:**

```python
import base64
from tencentcloud.common import credential
from tencentcloud.tmt.v20180321.tmt_client import TmtClient
from tencentcloud.tmt.v20180321 import models

# Initialize client
cred = credential.Credential("SecretId", "SecretKey")
client = TmtClient(cred, "ap-beijing")

# Read and encode image
with open("document.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode()

# Create image translation request
req = models.ImageTranslateRequest()
req.SessionUuid = "unique-session-id"
req.Scene = "doc"  # Document scene
req.Data = image_data
req.Source = "en"
req.Target = "zh"
req.ProjectId = 0

# Perform image translation
resp = client.ImageTranslate(req)
print(f"Session: {resp.SessionUuid}")
print(f"Translation: {resp.Source} -> {resp.Target}")

# Process translated text records
for item in resp.ImageRecord.Value:
    print(f"Original: {item.SourceText}")
    print(f"Translated: {item.TargetText}")
    print(f"Position: ({item.X}, {item.Y}) {item.W}x{item.H}")
```
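
The preparation steps in the example above (reading the file, Base64-encoding it, and choosing a session identifier) could be wrapped in small helpers. The sketch below uses only the standard library; `encode_image` and `new_session_uuid` are hypothetical names that are not part of the SDK, and the `session-` prefix is an arbitrary choice.

```python
import base64
import uuid
from pathlib import Path


def encode_image(path):
    """Read an image file and return its bytes as a Base64 string."""
    return base64.b64encode(Path(path).read_bytes()).decode("ascii")


def new_session_uuid():
    """Return a fresh identifier that can be assigned to SessionUuid."""
    # Hypothetical format: any string that is unique per session would do.
    return f"session-{uuid.uuid4()}"
```

With these in place, the request fields could be filled with `req.Data = encode_image("document.png")` and `req.SessionUuid = new_session_uuid()`.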

### Enhanced LLM Image Translation

Advanced image translation for 18 languages using LLM technology, providing improved accuracy and context understanding.

```python { .api }
def ImageTranslateLLM(self, request: models.ImageTranslateLLMRequest) -> models.ImageTranslateLLMResponse:
    """
    Translate text within images using enhanced LLM processing.

    Args:
        request: ImageTranslateLLMRequest with image data and parameters

    Returns:
        ImageTranslateLLMResponse with translated results and output image URL

    Raises:
        TencentCloudSDKException: For various error conditions
    """
```

**Usage Example:**

```python
import base64

# Reuses `client`, `models`, and `image_data` from the standard example above

# Create enhanced image translation request
req = models.ImageTranslateLLMRequest()
req.Data = image_data  # Base64 encoded image
req.Target = "zh"
# Alternatively, use URL instead of Data:
# req.Url = "https://example.com/image.jpg"

# Perform enhanced translation
resp = client.ImageTranslateLLM(req)
print("Enhanced translation completed")
print(f"Source language: {resp.Source}")
print(f"Full source text: {resp.SourceText}")
print(f"Full translated text: {resp.TargetText}")

# Save result image (resp.Data is a Base64 encoded JPG)
with open("translated_image.jpg", "wb") as f:
    f.write(base64.b64decode(resp.Data))

# Process translation details
for detail in resp.TransDetails:
    print(f"Line: {detail.SourceLineText} -> {detail.TargetLineText}")
    print(f"Position: ({detail.BoundingBox.X}, {detail.BoundingBox.Y})")
```

## Request/Response Models

### ImageTranslateRequest

```python { .api }
class ImageTranslateRequest:
    """
    Request parameters for standard image translation.

    Attributes:
        SessionUuid (str): Unique session identifier
        Scene (str): Scene type (e.g., "doc" for documents)
        Data (str): Base64 encoded image data
        Source (str): Source language code
        Target (str): Target language code
        ProjectId (int): Project ID (default: 0)
    """
```

### ImageTranslateResponse

```python { .api }
class ImageTranslateResponse:
    """
    Response from standard image translation.

    Attributes:
        SessionUuid (str): Session identifier from request
        Source (str): Source language
        Target (str): Target language
        ImageRecord (ImageRecord): Image translation result
        RequestId (str): Unique request identifier
    """
```

### ImageTranslateLLMRequest

```python { .api }
class ImageTranslateLLMRequest:
    """
    Request parameters for enhanced LLM image translation.

    Attributes:
        Data (str): Base64 encoded image data (PNG, JPG, JPEG)
        Target (str): Target language code
        Url (str): Image URL (alternative to Data)
    """
```

### ImageTranslateLLMResponse

```python { .api }
class ImageTranslateLLMResponse:
    """
    Response from enhanced LLM image translation.

    Attributes:
        Data (str): Base64 encoded result image (JPG format)
        Source (str): Detected source language
        Target (str): Target language
        SourceText (str): All original text from image
        TargetText (str): All translated text
        Angle (float): Image rotation angle (0-359 degrees)
        TransDetails (list[TransDetail]): Translation detail information
        RequestId (str): Unique request identifier
    """
```

### ImageRecord

```python { .api }
class ImageRecord:
    """
    Image translation record container.

    Attributes:
        Value (list[ItemValue]): List of translated text items with positions
    """
```

### ItemValue

```python { .api }
class ItemValue:
    """
    Individual translated text item with position information.

    Attributes:
        SourceText (str): Original text
        TargetText (str): Translated text
        X (int): X coordinate
        Y (int): Y coordinate
        W (int): Width
        H (int): Height
    """
```
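
Because each item carries its own position, the records can be rearranged into reading order before display. The following is a minimal sketch under stated assumptions: `Item` is a stand-in dataclass that mirrors the `ItemValue` fields (it is not the SDK class), the coordinate origin is assumed to be the top-left corner of the image, and `line_tolerance` is a hypothetical tuning parameter.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """Stand-in with the same field names as ItemValue (not the SDK class)."""
    SourceText: str
    TargetText: str
    X: int
    Y: int
    W: int
    H: int


def in_reading_order(items, line_tolerance=10):
    """Group items into rows by Y, then order each row left to right by X."""
    rows = []
    for item in sorted(items, key=lambda it: it.Y):
        # An item joins the current row if its Y is close to the row's first item.
        if rows and abs(item.Y - rows[-1][0].Y) <= line_tolerance:
            rows[-1].append(item)
        else:
            rows.append([item])
    return [it for row in rows for it in sorted(row, key=lambda it: it.X)]
```

Since the function only reads the `X` and `Y` attributes, it should also accept the objects in `resp.ImageRecord.Value` directly, provided they expose those fields as documented above.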

### TransDetail

```python { .api }
class TransDetail:
    """
    LLM translation detail for each text line.

    Attributes:
        SourceLineText (str): Original line text
        TargetLineText (str): Translated line text
        BoundingBox (BoundingBox): Text position and dimensions
        LinesCount (int): Number of lines
        LineHeight (int): Line height in pixels
        SpamCode (int): Content safety check result (0=normal)
    """
```

### BoundingBox

```python { .api }
class BoundingBox:
    """
    Bounding box coordinates for text positioning.

    Attributes:
        X (int): Left edge X coordinate
        Y (int): Top edge Y coordinate
        Width (int): Box width in pixels
        Height (int): Box height in pixels
    """
```
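
Many drawing libraries expect a rectangle as corner coordinates rather than an origin plus a size. As an illustration, the sketch below converts the `X`/`Y`/`Width`/`Height` form into `(left, top, right, bottom)`. `Box` is a stand-in dataclass with the same field names as `BoundingBox`, and `box_corners` is a hypothetical helper, not an SDK function.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Stand-in with the same field names as BoundingBox (not the SDK class)."""
    X: int
    Y: int
    Width: int
    Height: int


def box_corners(box):
    """Return (left, top, right, bottom) for an X/Y/Width/Height box."""
    return (box.X, box.Y, box.X + box.Width, box.Y + box.Height)
```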

## Supported Image Formats

### Input Formats (Both APIs)
- **PNG**: Portable Network Graphics
- **JPG/JPEG**: Joint Photographic Experts Group

### Output Formats
- **Standard API**: Text records with position data
- **LLM API**: JPG image with translated text + text records
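
Since only PNG and JPEG input is listed, it can be useful to check a file locally before encoding and sending it. The sketch below inspects the leading bytes of the data: PNG files begin with a fixed 8-byte signature and JPEG files begin with the bytes `FF D8 FF`. `sniff_image_format` is a hypothetical helper name, and the check covers the file header only, not whether the rest of the image is valid.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_PREFIX = b"\xff\xd8\xff"


def sniff_image_format(data):
    """Return "png" or "jpeg" based on the leading bytes, else None."""
    if data.startswith(PNG_SIGNATURE):
        return "png"
    if data.startswith(JPEG_PREFIX):
        return "jpeg"
    return None
```

A caller might read the file once, reject it early when the result is `None`, and only then Base64-encode the same bytes for the request.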

## Language Support

### Standard Image Translation (13 languages)
Core language support for document translation:
- Chinese (zh, zh-TW, zh-HK, zh-TR)
- English (en), Japanese (ja), Korean (ko)
- European: French (fr), German (de), Spanish (es), Italian (it)
- Others: Russian (ru), Arabic (ar)
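
The codes listed above can be checked on the client side before a request is sent. This is a minimal sketch, assuming the list is complete for the standard endpoint; `check_language_pair` is a hypothetical helper, and passing the check does not guarantee that the service supports every source/target combination.

```python
# The 13 codes listed for the standard endpoint.
STANDARD_IMAGE_LANGUAGES = frozenset({
    "zh", "zh-TW", "zh-HK", "zh-TR",
    "en", "ja", "ko",
    "fr", "de", "es", "it",
    "ru", "ar",
})


def check_language_pair(source, target):
    """Raise ValueError if either code is not in the listed set."""
    unknown = [c for c in (source, target) if c not in STANDARD_IMAGE_LANGUAGES]
    if unknown:
        raise ValueError(f"Unlisted language code(s): {', '.join(unknown)}")
    if source == target:
        raise ValueError("Source and target language are the same")
```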

### Enhanced LLM Translation (18 languages)
Extended language support with improved accuracy:
- All standard languages plus additional coverage
- Better context understanding for complex layouts
- Improved handling of mixed-language content

## Scene Types

### Document Scene ("doc")
Optimized for:
- Text documents and PDFs
- Business documents
- Academic papers
- Technical documentation
- Forms and contracts

### General Scene
Suitable for:
- Street signs and signage
- Product labels
- Handwritten notes
- Mixed content images

## Best Practices

### Image Quality
- Use high-resolution images (minimum 300 DPI recommended)
- Ensure good contrast between text and background
- Avoid blurry or distorted images
- Minimize image compression artifacts

### Text Layout
- Works best with horizontal text layouts
- Supports line-by-line processing
- Handles multiple text blocks per image
- Preserves relative positioning information

### API Selection
- **Use ImageTranslate** for: Simple document translation, cost-sensitive applications
- **Use ImageTranslateLLM** for: Complex layouts, mixed languages, higher accuracy requirements
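
The selection guidance above can be written down as a small decision function. This is only one possible reading of the two bullets: `choose_image_api` and its flag names are hypothetical, and the function returns the method name as a string rather than calling the SDK.

```python
def choose_image_api(complex_layout=False, mixed_languages=False,
                     needs_high_accuracy=False):
    """Return the name of the client method suggested by the guidance above."""
    if complex_layout or mixed_languages or needs_high_accuracy:
        return "ImageTranslateLLM"
    # Simple documents and cost-sensitive workloads stay on the standard API.
    return "ImageTranslate"
```

The returned name could then be resolved on a client instance with `getattr(client, choose_image_api(...))`, keeping in mind that the two methods take different request types.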

## Error Handling

Common error scenarios for image translation:

- **FAILEDOPERATION_DOWNLOADERR**: Image data processing error
- **FAILEDOPERATION_LANGUAGERECOGNITIONERR**: Language detection failure
- **UNSUPPORTEDOPERATION_UNSUPPORTEDLANGUAGE**: Language pair not supported
- **INVALIDPARAMETER**: Invalid image data or parameters

Example error handling:

```python
from tencentcloud.common.exception.tencent_cloud_sdk_exception import TencentCloudSDKException
# Assumption: the names listed above are constants in the SDK's errorcodes module
from tencentcloud.tmt.v20180321 import errorcodes

try:
    resp = client.ImageTranslate(req)
    # Translated items live in ImageRecord.Value, not on ImageRecord itself
    for item in resp.ImageRecord.Value:
        print(f"Translated: {item.TargetText}")
except TencentCloudSDKException as e:
    if e.code == errorcodes.FAILEDOPERATION_LANGUAGERECOGNITIONERR:
        print("Could not detect the language of the text in the image")
    elif e.code == errorcodes.UNSUPPORTEDOPERATION_UNSUPPORTEDLANGUAGE:
        print("Language pair not supported for image translation")
    else:
        print(f"Image translation error: {e.code} - {e.message}")
```
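
When more than a couple of codes need handling, a lookup table can replace the `if`/`elif` chain. The sketch below is standalone and does not import the SDK: `describe_error` is a hypothetical helper, the messages restate the descriptions listed above, and the normalization step is an assumption that lets a dotted spelling such as `FailedOperation.DownloadErr` match the upper-case names used in this section.

```python
FRIENDLY_MESSAGES = {
    "FAILEDOPERATION_DOWNLOADERR": "Image data processing error",
    "FAILEDOPERATION_LANGUAGERECOGNITIONERR": "Language detection failure",
    "UNSUPPORTEDOPERATION_UNSUPPORTEDLANGUAGE": "Language pair not supported",
    "INVALIDPARAMETER": "Invalid image data or parameters",
}


def describe_error(code, message=""):
    """Map an error code to a short description, with a generic fallback."""
    # Treat "FailedOperation.DownloadErr" and "FAILEDOPERATION_DOWNLOADERR" alike.
    key = code.replace(".", "_").upper()
    return FRIENDLY_MESSAGES.get(
        key, f"Image translation error: {code} - {message}"
    )
```

Inside an `except` block, this could be called as `print(describe_error(e.code, e.message))`.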
