Azure Text Translation client library for Python that provides neural machine translation technology for quick and accurate source-to-target text translation in real time across all supported languages
Identify sentence boundaries in text with automatic language detection and script-specific processing. This service determines where sentences begin and end in input text, providing length information for proper text segmentation and analysis.
Analyzes input text to identify sentence boundaries and returns length information for each detected sentence with optional language detection.
def find_sentence_boundaries(
    body: Union[List[str], List[InputTextItem], IO[bytes]],
    *,
    client_trace_id: Optional[str] = None,
    language: Optional[str] = None,
    script: Optional[str] = None,
    **kwargs: Any
) -> List[BreakSentenceItem]

Parameters:
body: Text to analyze (strings, InputTextItem objects, or binary data)
client_trace_id: Client-generated GUID for request tracking
language: Language code for the text (auto-detected if omitted)
script: Script identifier for the text (default script assumed if omitted)

Returns: List of sentence boundary analysis results
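The binary form of `body` is the least obvious of the three accepted inputs. The sketch below shows one way such a payload might be built, assuming the service expects a UTF-8 JSON array of objects with a `text` field, mirroring `InputTextItem`. The helper name `build_binary_body` is hypothetical and not part of the library, and the payload shape is an assumption to verify against the service reference.

```python
import io
import json
from typing import List


def build_binary_body(texts: List[str]) -> io.BytesIO:
    """Encode texts as a JSON array of {"text": ...} objects in a byte stream.

    Assumption: this mirrors the shape of InputTextItem; it is not confirmed
    by the documentation above.
    """
    payload = [{"text": text} for text in texts]
    return io.BytesIO(json.dumps(payload).encode("utf-8"))


# Build a stream that could be passed as the `body` argument.
binary_body = build_binary_body(["First sentence. Second sentence."])
```

In most cases passing a plain list of strings is simpler; a pre-encoded stream is mainly useful when the payload is already serialized elsewhere.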
from azure.ai.translation.text import TextTranslationClient
from azure.core.credentials import AzureKeyCredential

client = TextTranslationClient(
    credential=AzureKeyCredential("your-api-key"),
    region="your-region"
)

# Basic sentence boundary detection with auto-detection
response = client.find_sentence_boundaries(
    body=["The answer lies in machine translation. This is a test. How are you?"]
)
result = response[0]
print(f"Detected language: {result.detected_language.language}")
print(f"Detection confidence: {result.detected_language.score}")
print(f"Sentence lengths: {result.sent_len}")
# sent_len is a list with one character count per detected sentence

# Multi-text analysis
multi_response = client.find_sentence_boundaries(
    body=[
        "First text with multiple sentences. This is sentence two.",
        "Second text. Also has multiple parts. Three sentences total."
    ]
)
for i, result in enumerate(multi_response):
    print(f"\nText {i+1}:")
    print(f"  Language: {result.detected_language.language}")
    print(f"  Sentence lengths: {result.sent_len}")

# Specify language and script explicitly
explicit_response = client.find_sentence_boundaries(
    body=["¡Hola mundo! ¿Cómo estás hoy? Me alegro de verte."],
    language="es",
    script="Latn"
)

# Complex punctuation handling
complex_response = client.find_sentence_boundaries(
    body=["Dr. Smith went to the U.S.A. yesterday. He said 'Hello!' to everyone."]
)

# Mixed language content (relies on auto-detection)
mixed_response = client.find_sentence_boundaries(
    body=["English sentence. Sentence en français. Back to English."]
)

class InputTextItem:
    text: str  # Text content to analyze for sentence boundaries

class BreakSentenceItem:
    sent_len: List[int]  # Character lengths of each detected sentence
    detected_language: Optional[DetectedLanguage]  # Auto-detected language info

class DetectedLanguage:
    language: str  # Detected language code (ISO 639-1/639-3)
    score: float  # Detection confidence score (0.0 to 1.0)

The service applies language-specific and script-specific rules for sentence boundary detection.
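Since `sent_len` holds lengths rather than sentence text, recovering the sentences means slicing the input by cumulative offsets. The following is a minimal sketch of that step. The helper `split_by_lengths` is a hypothetical name, not part of the library; the lengths in the example are hand-picked for illustration rather than real service output; and the sketch assumes the lengths count Python string characters, which may not hold for text outside the Basic Multilingual Plane.

```python
from itertools import accumulate
from typing import List


def split_by_lengths(text: str, lengths: List[int]) -> List[str]:
    """Slice text into consecutive pieces whose sizes are given by lengths."""
    ends = list(accumulate(lengths))   # running end offset of each piece
    starts = [0] + ends[:-1]           # each piece starts where the previous ended
    return [text[start:end] for start, end in zip(starts, ends)]


# Hand-picked lengths for illustration only (not output from the service).
text = "Hello there. How are you?"
sentences = split_by_lengths(text, [13, 12])
print(sentences)
```

The same helper could be applied to the source and target lengths returned by a translation request to line up sentence pairs, under the same character-counting assumption.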
Sentence boundary detection is automatically used when include_sentence_length=True in translation requests:
# Translation with automatic sentence boundary detection
translation_response = client.translate(
    body=["First sentence. Second sentence. Third sentence."],
    to_language=["es"],
    include_sentence_length=True
)

translation = translation_response[0].translations[0]
if translation.sent_len:
    print(f"Source sentence lengths: {translation.sent_len.src_sent_len}")
    print(f"Target sentence lengths: {translation.sent_len.trans_sent_len}")

from azure.core.exceptions import HttpResponseError

try:
    response = client.find_sentence_boundaries(
        body=["Text to analyze"],
        language="invalid-code"  # Invalid language code
    )
except HttpResponseError as error:
    # error.error carries the structured service error, when one is available
    if error.error:
        print(f"Error Code: {error.error.code}")
        print(f"Message: {error.error.message}")

Install with Tessl CLI
npx tessl i tessl/pypi-azure-ai-translation-text