Tencent Cloud Machine Translation (TMT) SDK for Python providing comprehensive text, file, image, and speech translation capabilities
OCR-based image translation for text content within images, with line-by-line results. Two API endpoints offer different processing approaches: standard OCR translation (13 languages) and enhanced LLM-powered translation (18 languages).
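Both endpoints take the image as Base64-encoded text in the request's `Data` field. A minimal sketch of preparing that value follows. The helper name `encode_image_for_tmt` and the extension check are illustrative assumptions, not part of the SDK; the allowed extensions mirror the PNG, JPG, and JPEG formats listed for the LLM request and may not match what the service actually enforces.

```python
import base64
from pathlib import Path

# Assumption: formats listed for the LLM request's Data field, applied here to both endpoints.
ALLOWED_SUFFIXES = {".png", ".jpg", ".jpeg"}


def encode_image_for_tmt(path: str) -> str:
    """Hypothetical helper: return the file's bytes as a Base64 ASCII string."""
    file_path = Path(path)
    # Reject files whose extension is not in the assumed allow-list.
    if file_path.suffix.lower() not in ALLOWED_SUFFIXES:
        raise ValueError(f"Unsupported image type: {file_path.suffix!r}")
    return base64.b64encode(file_path.read_bytes()).decode("ascii")
```

The returned string can be assigned directly to `req.Data` on either request object.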
Recognizes and translates text in images line by line for 13 languages using OCR technology. Suitable for documents, signs, and text-heavy images.
def ImageTranslate(self, request: models.ImageTranslateRequest) -> models.ImageTranslateResponse:
    """
    Translate text within images using OCR recognition.

    Args:
        request: ImageTranslateRequest with image data and parameters

    Returns:
        ImageTranslateResponse with translated text records and positions

    Raises:
        TencentCloudSDKException: For various error conditions
    """

Usage Example:
import base64
from tencentcloud.common import credential
from tencentcloud.tmt.v20180321.tmt_client import TmtClient
from tencentcloud.tmt.v20180321 import models
# Initialize client
cred = credential.Credential("SecretId", "SecretKey")
client = TmtClient(cred, "ap-beijing")
# Read and encode image
with open("document.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode()
# Create image translation request
req = models.ImageTranslateRequest()
req.SessionUuid = "unique-session-id"
req.Scene = "doc" # Document scene
req.Data = image_data
req.Source = "en"
req.Target = "zh"
req.ProjectId = 0
# Perform image translation
resp = client.ImageTranslate(req)
print(f"Session: {resp.SessionUuid}")
print(f"Translation: {resp.Source} -> {resp.Target}")
# Process translated text records
for item in resp.ImageRecord.Value:
    print(f"Original: {item.SourceText}")
    print(f"Translated: {item.TargetText}")
    print(f"Position: ({item.X}, {item.Y}) {item.W}x{item.H}")

Advanced image translation for 18 languages using LLM technology, providing improved accuracy and context understanding.
def ImageTranslateLLM(self, request: models.ImageTranslateLLMRequest) -> models.ImageTranslateLLMResponse:
    """
    Translate text within images using enhanced LLM processing.

    Args:
        request: ImageTranslateLLMRequest with image data and parameters

    Returns:
        ImageTranslateLLMResponse with translated results and output image URL

    Raises:
        TencentCloudSDKException: For various error conditions
    """

Usage Example:
# Create enhanced image translation request
req = models.ImageTranslateLLMRequest()
req.Data = image_data # Base64 encoded image
req.Target = "zh"
# Alternatively, use URL instead of Data:
# req.Url = "https://example.com/image.jpg"
# Perform enhanced translation
resp = client.ImageTranslateLLM(req)
print("Enhanced translation completed")
print(f"Source language: {resp.Source}")
print(f"Full source text: {resp.SourceText}")
print(f"Full translated text: {resp.TargetText}")
# Save the result image (resp.Data is a Base64-encoded JPG)
import base64
with open("translated_image.jpg", "wb") as f:
    f.write(base64.b64decode(resp.Data))
# Process translation details
for detail in resp.TransDetails:
    print(f"Line: {detail.SourceLineText} -> {detail.TargetLineText}")
    print(f"Position: ({detail.BoundingBox.X}, {detail.BoundingBox.Y})")

class ImageTranslateRequest:
    """
    Request parameters for standard image translation.

    Attributes:
        SessionUuid (str): Unique session identifier
        Scene (str): Scene type (e.g., "doc" for documents)
        Data (str): Base64 encoded image data
        Source (str): Source language code
        Target (str): Target language code
        ProjectId (int): Project ID (default: 0)
    """

class ImageTranslateResponse:
    """
    Response from standard image translation.

    Attributes:
        SessionUuid (str): Session identifier from request
        Source (str): Source language
        Target (str): Target language
        ImageRecord (ImageRecord): Image translation result
        RequestId (str): Unique request identifier
    """

class ImageTranslateLLMRequest:
    """
    Request parameters for enhanced LLM image translation.

    Attributes:
        Data (str): Base64 encoded image data (PNG, JPG, JPEG)
        Target (str): Target language code
        Url (str): Image URL (alternative to Data)
    """

class ImageTranslateLLMResponse:
    """
    Response from enhanced LLM image translation.

    Attributes:
        Data (str): Base64 encoded result image (JPG format)
        Source (str): Detected source language
        Target (str): Target language
        SourceText (str): All original text from image
        TargetText (str): All translated text
        Angle (float): Image rotation angle (0-359 degrees)
        TransDetails (list[TransDetail]): Translation detail information
        RequestId (str): Unique request identifier
    """

class ImageRecord:
    """
    Image translation record container.

    Attributes:
        Value (list[ItemValue]): List of translated text items with positions
    """

class ItemValue:
    """
    Individual translated text item with position information.

    Attributes:
        SourceText (str): Original text
        TargetText (str): Translated text
        X (int): X coordinate
        Y (int): Y coordinate
        W (int): Width
        H (int): Height
    """

class TransDetail:
    """
    LLM translation detail for each text line.

    Attributes:
        SourceLineText (str): Original line text
        TargetLineText (str): Translated line text
        BoundingBox (BoundingBox): Text position and dimensions
        LinesCount (int): Number of lines
        LineHeight (int): Line height in pixels
        SpamCode (int): Content safety check result (0=normal)
    """

class BoundingBox:
    """
    Bounding box coordinates for text positioning.

    Attributes:
        X (int): Left edge X coordinate
        Y (int): Top edge Y coordinate
        Width (int): Box width in pixels
        Height (int): Box height in pixels
    """

Core language support for document translation:
Extended language support with improved accuracy:
Optimized for:
Suitable for:
Common error scenarios for image translation:
Example error handling:
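from tencentcloud.common.exception.tencent_cloud_sdk_exception import TencentCloudSDKException

try:
    resp = client.ImageTranslate(req)
    # The translated items live in ImageRecord.Value
    for item in resp.ImageRecord.Value:
        print(f"Translated: {item.TargetText}")
except TencentCloudSDKException as e:
    # e.code carries the API error code in dotted form
    if e.code == "FailedOperation.LanguageRecognitionErr":
        print("Could not recognize the language of the text in the image")
    elif e.code == "UnsupportedOperation.UnsupportedLanguage":
        print("Language pair not supported for image translation")
    else:
        print(f"Image translation error: {e.code} - {e.message}")

Install with Tessl CLI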
npx tessl i tessl/pypi-tencentcloud-sdk-python-tmt
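
The position fields returned by `ImageTranslate` (`X`, `Y`, `W`, `H` on each item in `ImageRecord.Value`) can be used to put translated lines in reading order before joining them. The sketch below is illustrative only: `Item` is a stand-in dataclass with the same attribute names as `ItemValue`, the helper names are hypothetical, and it assumes the coordinate origin is the top-left corner with `Y` increasing downward, which the reference above does not state for `ItemValue`.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Item:
    """Stand-in for the SDK's ItemValue, using the same attribute names."""
    SourceText: str
    TargetText: str
    X: int
    Y: int
    W: int
    H: int


def in_reading_order(items: Iterable[Item]) -> List[Item]:
    """Sort items top-to-bottom, then left-to-right, by their (Y, X) corner."""
    return sorted(items, key=lambda item: (item.Y, item.X))


def join_translation(items: Iterable[Item]) -> str:
    """Join the translated text of all items, one item per line."""
    return "\n".join(item.TargetText for item in in_reading_order(items))
```

With a real response, `join_translation(resp.ImageRecord.Value)` should work unchanged, since only the attribute names are used. Items that sit on the same visual line but differ slightly in `Y` would need a tolerance-based grouping, which this sketch does not attempt.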