
tessl/pypi-argostranslate

Open-source neural machine translation library based on OpenNMT's CTranslate2

Workspace: tessl
Visibility: Public
Describes: pkg:pypi/argostranslate@1.9.x

To install, run

npx @tessl/cli install tessl/pypi-argostranslate@1.9.0


Argostranslate

Argostranslate is an open-source, offline neural machine translation library: it translates text without internet connectivity or external API calls. Built on OpenNMT's CTranslate2 framework, it supports automatic language detection, pivots through intermediate languages when no direct model exists, and manages installable language model packages. It can be used through a Python API, command-line tools, or GUI applications, so applications can integrate translation while staying offline and preserving user privacy.

Package Information

  • Package Name: argostranslate
  • Language: Python
  • Installation: pip install argostranslate
  • License: MIT
  • Python Version: >=3.5

Core Imports

import argostranslate.translate
import argostranslate.package

For translation functionality:

from argostranslate import translate

For package management:

from argostranslate import package

Basic Usage

from argostranslate import translate, package

# Install a translation package first (if not already installed)
package.update_package_index()  # refresh the local copy of the remote package index
available_packages = package.get_available_packages()
en_to_es_packages = [p for p in available_packages if p.from_code == "en" and p.to_code == "es"]
if en_to_es_packages:
    en_to_es_packages[0].install()  # downloads, then installs the model

# Perform translation
translated_text = translate.translate("Hello world", "en", "es")
print(translated_text)  # e.g. "Hola mundo" (exact output depends on the installed model)

# Get available languages
installed_languages = translate.get_installed_languages()
for lang in installed_languages:
    print(f"{lang.code}: {lang.name}")

# Get translation object for reuse
translation = translate.get_translation_from_codes("en", "es")
if translation:
    result = translation.translate("How are you?")
    print(result)  # e.g. "¿Cómo estás?"

Architecture

Argostranslate uses a modular architecture built around several key components:

  • Translation Engine: Core translation functionality with support for multiple backends (OpenNMT, LibreTranslate, OpenAI)
  • Package System: Manages downloadable language model packages with automatic dependency resolution
  • Language Detection: Automatic source language identification when not specified
  • Translation Pivoting: Enables indirect translation through intermediate languages when direct models aren't available
  • Caching Layer: Performance optimization through translation result caching
  • CLI Tools: Command-line interfaces for both translation (argos-translate) and package management (argospm)

The library supports multiple translation backends, allowing users to choose between offline neural models (default), remote API services, or large language model providers based on their needs.
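The pivoting idea can be illustrated with a breadth-first search over the installed language pairs. This is a sketch of the concept only, not the library's internal algorithm; `find_pivot_path` and the sample pairs are hypothetical names introduced for illustration.

```python
from collections import deque


def find_pivot_path(pairs, from_code, to_code):
    """Return the shortest chain of language codes from from_code to to_code.

    `pairs` is an iterable of (from_code, to_code) tuples, one per installed
    direct model. Returns None when no chain exists.
    """
    graph = {}
    for src, dst in pairs:
        graph.setdefault(src, []).append(dst)

    queue = deque([[from_code]])
    seen = {from_code}
    while queue:
        path = queue.popleft()
        if path[-1] == to_code:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


# Hypothetical installed models: es->en and en->fr exist, es->fr does not.
installed_pairs = [("es", "en"), ("en", "fr"), ("en", "es")]
print(find_pivot_path(installed_pairs, "es", "fr"))  # ['es', 'en', 'fr']
```

Each hop in the returned path corresponds to one installed model, so a longer path means more chained translations and, typically, more accumulated error.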

Capabilities

Core Translation

Primary translation functionality including simple text translation, multiple translation hypotheses, language detection, and translation chaining through intermediate languages when direct translation models are unavailable.

def translate(q: str, from_code: str, to_code: str) -> str:
    """Main translation function for simple text translation."""

def get_installed_languages() -> list[Language]:
    """Get list of installed languages."""

def get_language_from_code(code: str) -> Language | None:
    """Get language object from ISO code."""

def get_translation_from_codes(from_code: str, to_code: str) -> ITranslation | None:
    """Get translation object for reuse."""

class Language:
    def __init__(self, code: str, name: str): ...
    def get_translation(self, to: Language) -> ITranslation | None: ...

class ITranslation:
    def translate(self, input_text: str) -> str: ...
    def hypotheses(self, input_text: str, num_hypotheses: int = 4) -> list[Hypothesis]: ...

class Hypothesis:
    def __init__(self, value: str, score: float): ...

Full reference: translation.md
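As a rough sketch of how the reusable translation object and `hypotheses()` fit together, the helper below ranks candidate translations by score. `top_hypotheses` and `demo` are hypothetical names; the sketch assumes `Hypothesis` exposes `value` and `score` attributes matching its constructor, and that an en->es model is installed when `demo()` is called.

```python
def top_hypotheses(translation, text, count=3):
    """Return (value, score) tuples for up to `count` candidates.

    `translation` is any object with the ITranslation.hypotheses() method;
    results are sorted so the highest score comes first.
    """
    candidates = translation.hypotheses(text, count)
    ranked = sorted(candidates, key=lambda h: h.score, reverse=True)
    return [(h.value, h.score) for h in ranked]


def demo():
    # Requires argostranslate and an installed en->es model.
    from argostranslate import translate

    from_lang = translate.get_language_from_code("en")
    to_lang = translate.get_language_from_code("es")
    if from_lang is None or to_lang is None:
        print("English or Spanish is not installed")
        return
    translation = from_lang.get_translation(to_lang)
    if translation is None:
        print("No en->es translation path is installed")
        return
    for value, score in top_hypotheses(translation, "How are you?"):
        print(f"{score:.3f}  {value}")
```

Call `demo()` on a machine with the model installed; `top_hypotheses` itself works with any object that follows the interface.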

Package Management

Installation and management of translation model packages, including downloading from remote repositories, installing from local files, and managing package dependencies and updates.

def get_installed_packages(path: Path = None) -> list[Package]:
    """Get list of installed translation packages."""

def get_available_packages() -> list[AvailablePackage]:
    """Get list of packages available for download."""

def install_from_path(path: Path):
    """Install package from local file path."""

def uninstall(pkg: Package):
    """Remove installed package."""

class Package:
    def __init__(self, package_path: Path): ...
    def update(self): ...
    def get_readme(self) -> str | None: ...

class AvailablePackage:
    def __init__(self, metadata): ...
    def download(self) -> Path: ...
    def install(self): ...

Full reference: package-management.md
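The functions above can be combined into a small install helper. This is a minimal sketch: `find_package` and `install_language_pair` are hypothetical names, and the sketch assumes `package.update_package_index()` is available to refresh the local index before listing packages.

```python
def find_package(packages, from_code, to_code):
    """Return the first package matching the language pair, or None."""
    return next(
        (p for p in packages if p.from_code == from_code and p.to_code == to_code),
        None,
    )


def install_language_pair(from_code, to_code):
    """Download and install one model; return True if a package was found."""
    # Requires argostranslate and network access.
    from argostranslate import package

    package.update_package_index()  # refresh the local copy of the remote index
    match = find_package(package.get_available_packages(), from_code, to_code)
    if match is None:
        return False
    # download() returns the local file path, which install_from_path() consumes.
    package.install_from_path(match.download())
    return True
```

Splitting the lookup from the install step keeps the matching logic testable without network access.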

External API Integration

Integration with external translation services including LibreTranslate API and OpenAI language models, providing alternative translation backends when offline models are insufficient or unavailable.

class LibreTranslateAPI:
    def __init__(self, url: str = None, api_key: str = None): ...
    def translate(self, q: str, source: str = "en", target: str = "es") -> str: ...
    def languages(self): ...
    def detect(self, q: str): ...

class OpenAIAPI:
    def __init__(self, api_key: str): ...
    def infer(self, prompt: str) -> str | None: ...

Full reference: external-apis.md
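One way to use a remote backend "when offline models are insufficient or unavailable" is a simple fallback wrapper. The sketch below is illustrative: `translate_with_fallback` and `build_backends` are hypothetical helpers, and the import path `argostranslate.apis` is an assumption that should be checked against the installed version.

```python
def translate_with_fallback(text, primary, fallback):
    """Try `primary(text)`; on any exception, return `fallback(text)`."""
    try:
        return primary(text)
    except Exception as exc:  # broad on purpose: backends fail in different ways
        print(f"Primary backend failed ({exc!r}); using fallback")
        return fallback(text)


def build_backends(url, api_key=None):
    """Return (offline, remote) callables for an en->es translation."""
    # Assumption: LibreTranslateAPI is importable from argostranslate.apis.
    from argostranslate import translate
    from argostranslate.apis import LibreTranslateAPI

    api = LibreTranslateAPI(url, api_key)

    def offline(text):
        return translate.translate(text, "en", "es")

    def remote(text):
        return api.translate(text, "en", "es")

    return offline, remote
```

With both backends built, `translate_with_fallback("Hello", offline, remote)` prefers the local model and only contacts the remote service when the local call raises.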

Text Processing

Advanced text processing capabilities including tokenization, sentence boundary detection, format preservation during translation, and byte pair encoding support for high-quality neural machine translation.

# Tokenization interfaces
class Tokenizer:
    def encode(self, sentence: str) -> list[str]: ...
    def decode(self, tokens: list[str]) -> str: ...

class SentencePieceTokenizer(Tokenizer):
    def __init__(self, model_file: Path): ...

class BPETokenizer(Tokenizer):
    def __init__(self, model_file: Path, from_code: str, to_code: str): ...

# Sentence boundary detection
def get_sbd_package() -> Package | None: ...
def detect_sentence(input_text: str, sbd_translation, sentence_guess_length: int = 150) -> int: ...

# Format preservation
class ITag:
    translateable: bool
    def text(self) -> str: ...

class Tag(ITag):
    def __init__(self, children: ITag | str, translateable: bool = True): ...

def translate_preserve_formatting(underlying_translation: ITranslation, input_text: str) -> str: ...

# Byte Pair Encoding
class BPE:
    def __init__(self, codes, merges: int = -1, separator: str = '@@', vocab = None, glossaries = None): ...
    def segment(self, sentence): ...

Full reference: text-processing.md
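To make the byte pair encoding step concrete, here is a toy segmenter that applies an ordered list of merge rules to a single word. It is a simplified illustration, not the library's `BPE` class: `apply_bpe` and the merge table are hypothetical, and real implementations also track word-boundary markers.

```python
def apply_bpe(word, merges, separator="@@"):
    """Segment `word` using an ordered list of merge rules.

    `merges` is a list of (left, right) symbol pairs, highest priority first.
    Every piece except the last gets `separator` appended.
    """
    symbols = list(word)
    ranks = {pair: i for i, pair in enumerate(merges)}
    while len(symbols) > 1:
        pairs = [(symbols[i], symbols[i + 1]) for i in range(len(symbols) - 1)]
        best = min(pairs, key=lambda p: ranks.get(p, float("inf")))
        if best not in ranks:
            break  # no applicable merge rule is left
        merged = []
        i = 0
        while i < len(symbols):
            if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                merged.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                merged.append(symbols[i])
                i += 1
        symbols = merged
    return [s + separator for s in symbols[:-1]] + symbols[-1:]


# Hypothetical merge table learned from a corpus.
merges = [("l", "o"), ("lo", "w"), ("e", "r")]
print(apply_bpe("lower", merges))  # ['low@@', 'er']
```

Frequent character sequences collapse into single tokens while rare words fall back to smaller pieces, which is what lets a fixed vocabulary cover open-ended text.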

Configuration and Settings

Configuration management including data directories, cache settings, remote repository URLs, device selection (CPU/CUDA), API keys, and experimental feature flags.

# Configuration variables
debug: bool
data_dir: Path
package_data_dir: Path
cache_dir: Path
remote_repo: str
device: str
libretranslate_api_key: str
openai_api_key: str

# Model provider selection
class ModelProvider:
    OPENNMT = 0
    LIBRETRANSLATE = 1
    OPENAI = 2

model_provider: ModelProvider

Full reference: configuration.md
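A sketch of selecting the device before the library loads. It assumes the settings module reads the `ARGOS_DEVICE_TYPE` environment variable at import time, which should be verified for the installed version; `configure_device` and `show_settings` are hypothetical helpers.

```python
import os


def configure_device(device="cpu"):
    """Record the desired device in the environment and return it."""
    if device not in {"cpu", "cuda"}:
        raise ValueError(f"unsupported device: {device!r}")
    # Assumption: argostranslate.settings reads ARGOS_DEVICE_TYPE on import,
    # so this must run before the first argostranslate import.
    os.environ["ARGOS_DEVICE_TYPE"] = device
    return device


def show_settings():
    # Requires argostranslate; prints a few of the variables listed above.
    from argostranslate import settings

    print("device:", settings.device)
    print("data_dir:", settings.data_dir)
    print("package_data_dir:", settings.package_data_dir)


configure_device("cpu")
```

Setting the variable in the shell before starting Python has the same effect and avoids any import-order concerns.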

Command Line Interfaces

argos-translate CLI

Main command-line interface for performing translations directly from the terminal.

Entry Point: argos-translate

Usage: Provides command-line access to translation functionality with support for different languages and input/output options.

argospm CLI

Package manager for installing and managing translation model packages.

Entry Point: argospm

Common Commands:

  • Update package index
  • Install packages
  • List installed packages
  • Search available packages
  • Remove packages
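The commands listed above can also be driven from Python. In this sketch the subcommand names (`update`, `install`, `list`) and the package name `translate-en_es` are assumptions that mirror the list rather than verified CLI syntax; `argospm_args` and `run_if_available` are hypothetical helpers.

```python
import shutil
import subprocess


def argospm_args(action, *extra):
    """Build an argv list for argospm, e.g. argospm_args('install', name)."""
    return ["argospm", action, *extra]


def run_if_available(argv):
    """Run the command when its executable is on PATH; return None otherwise."""
    if shutil.which(argv[0]) is None:
        return None
    return subprocess.run(argv, capture_output=True, text=True, check=False)


# Assumed subcommands and package name; check `argospm --help` before relying on them.
commands = [
    argospm_args("update"),
    argospm_args("install", "translate-en_es"),
    argospm_args("list"),
]
```

Passing each entry of `commands` to `run_if_available` executes it only where the CLI is installed, and the returned object carries the exit code and captured output.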

Error Handling

The library uses standard Python exceptions. Common error scenarios include:

  • FileNotFoundError: When package files or directories are not found
  • Network errors during package downloads
  • Translation errors when unsupported language pairs are requested
  • API authentication errors for external services
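For unsupported language pairs, it can help to check each lookup explicitly and raise one clear error. The helper below is a hedged sketch: `resolve_translation` is a hypothetical function, and it assumes the language lookup returns None for codes that are not installed, as the signatures in this document indicate.

```python
def resolve_translation(get_language, from_code, to_code):
    """Return a translation object or raise ValueError with a clear reason.

    `get_language` is a lookup such as translate.get_language_from_code.
    """
    from_lang = get_language(from_code)
    if from_lang is None:
        raise ValueError(f"source language not installed: {from_code}")
    to_lang = get_language(to_code)
    if to_lang is None:
        raise ValueError(f"target language not installed: {to_code}")
    translation = from_lang.get_translation(to_lang)
    if translation is None:
        raise ValueError(f"no translation path from {from_code} to {to_code}")
    return translation
```

With the library installed, `resolve_translation(translate.get_language_from_code, "en", "es").translate("Hello")` either translates or fails with a message that names the missing piece.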

Supported Languages

Argostranslate offers translation packages for over 30 languages, and automatic pivoting extends coverage to pairs that have no direct model. Which languages are actually usable depends on the installed translation packages. The library also includes a language database (languages.csv) with ISO codes and names for 184 languages.
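Because support depends on what is installed, a quick way to see the usable pairs is to test every ordered pair of installed languages. A minimal sketch, where `supported_pairs` is a hypothetical helper built on the `Language.get_translation` method:

```python
from itertools import permutations


def supported_pairs(languages):
    """Return sorted (from_code, to_code) tuples that have a translation path."""
    return sorted(
        (src.code, dst.code)
        for src, dst in permutations(languages, 2)
        if src.get_translation(dst) is not None
    )
```

With the library installed, `supported_pairs(translate.get_installed_languages())` lists every pair that can currently be translated, including any reached through pivoting.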

Advanced Features

  • Translation Caching: Automatic caching of translation results for improved performance
  • Sentence Boundary Detection: Intelligent text segmentation for better translation quality
  • Format Preservation: Maintains text formatting during translation using tag-based processing
  • BPE Tokenization: Byte Pair Encoding support for improved translation accuracy
  • Few-shot Translation: Integration with large language models for advanced translation scenarios