tessl/pypi-argostranslate

Open-source neural machine translation library based on OpenNMT's CTranslate2

- **Workspace**: tessl
- **Visibility**: Public
- **Describes**: `pypipkg:pypi/argostranslate@1.9.x`

To install, run `npx @tessl/cli install tessl/pypi-argostranslate@1.9.0`

# Argostranslate

An open-source offline neural machine translation library that enables developers to perform language translation without requiring internet connectivity or external API calls. Built on top of OpenNMT's CTranslate2 framework, argostranslate supports automatic language detection, pivoting through intermediate languages for indirect translation paths, and manages installable language model packages. The library offers multiple interfaces including Python API, command-line tools, and GUI applications, making it suitable for integration into various applications while maintaining offline functionality and user privacy.

## Package Information

- **Package Name**: argostranslate
- **Language**: Python
- **Installation**: `pip install argostranslate`
- **License**: MIT
- **Python Version**: >=3.5

## Core Imports

```python
import argostranslate.translate
import argostranslate.package
```

For translation functionality:

```python
from argostranslate import translate
```

For package management:

```python
from argostranslate import package
```

## Basic Usage

```python
from argostranslate import translate, package

# Install a translation package first (if not already installed)
package.update_package_index()  # refresh the remote index before listing packages
available_packages = package.get_available_packages()
en_to_es_packages = [p for p in available_packages if p.from_code == "en" and p.to_code == "es"]
if en_to_es_packages:
    en_to_es_packages[0].install()

# Perform translation
translated_text = translate.translate("Hello world", "en", "es")
print(translated_text)  # e.g. "Hola mundo"

# Get available languages
installed_languages = translate.get_installed_languages()
for lang in installed_languages:
    print(f"{lang.code}: {lang.name}")

# Get translation object for reuse
translation = translate.get_translation_from_codes("en", "es")
if translation:
    result = translation.translate("How are you?")
    print(result)  # e.g. "¿Cómo estás?"
```

## Architecture

Argostranslate uses a modular architecture built around several key components:

- **Translation Engine**: Core translation functionality with support for multiple backends (OpenNMT, LibreTranslate, OpenAI)
- **Package System**: Manages downloadable language model packages with automatic dependency resolution
- **Language Detection**: Automatic source language identification when not specified
- **Translation Pivoting**: Enables indirect translation through intermediate languages when direct models aren't available
- **Caching Layer**: Performance optimization through translation result caching
- **CLI Tools**: Command-line interfaces for both translation (`argos-translate`) and package management (`argospm`)

The library supports multiple translation backends, allowing users to choose between offline neural models (default), remote API services, or large language model providers based on their needs.
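
To make the pivoting idea concrete, here is a minimal standalone sketch of how an indirect translation path could be found when no direct model is installed. It is not the library's actual implementation: `find_pivot_path` and the `installed_pairs` data are hypothetical names invented for this illustration, and a plain breadth-first search is assumed as the path-finding strategy.

```python
from collections import deque


def find_pivot_path(pairs, from_code, to_code):
    """Return the shortest chain of language codes from from_code to to_code, or None."""
    # Build an adjacency map from the available (source, target) pairs.
    graph = {}
    for src, dst in pairs:
        graph.setdefault(src, []).append(dst)

    # Breadth-first search yields the path with the fewest intermediate hops.
    queue = deque([[from_code]])
    visited = {from_code}
    while queue:
        path = queue.popleft()
        if path[-1] == to_code:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


# Hypothetical set of installed language pairs (illustrative data only).
installed_pairs = [("es", "en"), ("en", "fr"), ("en", "de")]
print(find_pivot_path(installed_pairs, "es", "fr"))  # ['es', 'en', 'fr']
```

With only `es→en` and `en→fr` available, Spanish text would be translated to English first and then to French; a pair with no connecting chain returns `None`.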

## Capabilities

### Core Translation

Primary translation functionality including simple text translation, multiple translation hypotheses, language detection, and translation chaining through intermediate languages when direct translation models are unavailable.

```python { .api }
def translate(q: str, from_code: str, to_code: str) -> str:
    """Main translation function for simple text translation."""

def get_installed_languages() -> list[Language]:
    """Get list of installed languages."""

def get_language_from_code(code: str) -> Language | None:
    """Get language object from ISO code."""

def get_translation_from_codes(from_code: str, to_code: str) -> ITranslation | None:
    """Get translation object for reuse."""
```

```python { .api }
class Language:
    def __init__(self, code: str, name: str): ...
    def get_translation(self, to: Language) -> ITranslation | None: ...

class ITranslation:
    def translate(self, input_text: str) -> str: ...
    def hypotheses(self, input_text: str, num_hypotheses: int = 4) -> list[Hypothesis]: ...

class Hypothesis:
    def __init__(self, value: str, score: float): ...
```

[Translation](./translation.md)
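
As a rough illustration of working with multiple translation hypotheses, the sketch below picks the best candidate from a list of scored results. The `Hypothesis` dataclass here is a stand-in that only mirrors the `value`/`score` shape listed above so the example runs without the library installed; `best_hypothesis` is a hypothetical helper, the scores are made up, and the sketch assumes that a higher score means a better candidate.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """Stand-in mirroring the Hypothesis(value, score) shape listed above."""
    value: str
    score: float


def best_hypothesis(hypotheses):
    """Return the highest-scoring hypothesis, or None for an empty list."""
    # Assumption: a higher score indicates a more likely translation.
    return max(hypotheses, key=lambda h: h.score, default=None)


# Illustrative candidates with invented scores.
candidates = [Hypothesis("Hola mundo", -0.2), Hypothesis("Hola, mundo", -0.9)]
best = best_hypothesis(candidates)
print(best.value)  # Hola mundo
```

In real use the list would presumably come from `ITranslation.hypotheses(...)`; the selection logic stays the same.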

### Package Management

Installation and management of translation model packages, including downloading from remote repositories, installing from local files, and managing package dependencies and updates.

```python { .api }
def get_installed_packages(path: Path = None) -> list[Package]:
    """Get list of installed translation packages."""

def get_available_packages() -> list[AvailablePackage]:
    """Get list of packages available for download."""

def install_from_path(path: Path):
    """Install package from local file path."""

def uninstall(pkg: Package):
    """Remove installed package."""
```

```python { .api }
class Package:
    def __init__(self, package_path: Path): ...
    def update(self): ...
    def get_readme(self) -> str | None: ...

class AvailablePackage:
    def __init__(self, metadata): ...
    def download(self) -> Path: ...
    def install(self): ...
```

[Package Management](./package-management.md)
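
Selecting the right package for a language pair is a common first step before installing. The following sketch shows one way to do that lookup; `find_package` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins that only carry the `from_code` and `to_code` attributes used in the Basic Usage example, so the code runs without network access or the library itself.

```python
from types import SimpleNamespace


def find_package(packages, from_code, to_code):
    """Return the first package matching the language pair, or None."""
    for pkg in packages:
        if pkg.from_code == from_code and pkg.to_code == to_code:
            return pkg
    return None


# Stand-in objects; real entries would come from get_available_packages().
available = [
    SimpleNamespace(from_code="en", to_code="es"),
    SimpleNamespace(from_code="en", to_code="fr"),
]

match = find_package(available, "en", "fr")
print(match.to_code if match else "no package for this pair")
```

Returning `None` rather than raising keeps the caller in control of what to do when a pair is unavailable, for example trying a pivot language instead.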

### External API Integration

Integration with external translation services including LibreTranslate API and OpenAI language models, providing alternative translation backends when offline models are insufficient or unavailable.

```python { .api }
class LibreTranslateAPI:
    def __init__(self, url: str = None, api_key: str = None): ...
    def translate(self, q: str, source: str = "en", target: str = "es") -> str: ...
    def languages(self): ...
    def detect(self, q: str): ...

class OpenAIAPI:
    def __init__(self, api_key: str): ...
    def infer(self, prompt: str) -> str | None: ...
```

[External APIs](./external-apis.md)

### Text Processing

Advanced text processing capabilities including tokenization, sentence boundary detection, format preservation during translation, and byte pair encoding support for high-quality neural machine translation.

```python { .api }
# Tokenization interfaces
class Tokenizer:
    def encode(self, sentence: str) -> List[str]: ...
    def decode(self, tokens: List[str]) -> str: ...

class SentencePieceTokenizer(Tokenizer):
    def __init__(self, model_file: Path): ...

class BPETokenizer(Tokenizer):
    def __init__(self, model_file: Path, from_code: str, to_code: str): ...

# Sentence boundary detection
def get_sbd_package() -> Package | None: ...
def detect_sentence(input_text: str, sbd_translation, sentence_guess_length: int = 150) -> int: ...

# Format preservation
class ITag:
    translateable: bool
    def text(self) -> str: ...

class Tag(ITag):
    def __init__(self, children: ITag | str, translateable: bool = True): ...

def translate_preserve_formatting(underlying_translation: ITranslation, input_text: str) -> str: ...

# Byte Pair Encoding
class BPE:
    def __init__(self, codes, merges: int = -1, separator: str = '@@', vocab = None, glossaries = None): ...
    def segment(self, sentence): ...
```

[Text Processing](./text-processing.md)
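
For readers unfamiliar with byte pair encoding, the sketch below shows the core idea in simplified form: a word is split into characters, and learned merges are applied in priority order to build subword units. This is a teaching sketch, not the library's `BPE` class. `bpe_segment` and the `merges` list are hypothetical, and details that real implementations handle (end-of-word markers, vocabulary filtering, glossaries) are deliberately left out. Only the `@@` separator is taken from the signature above.

```python
def bpe_segment(word, merges, separator="@@"):
    """Split a word into subword units by applying learned merges in priority order."""
    if not word:
        return []
    # Earlier merges in the list have higher priority (lower rank).
    ranks = {pair: i for i, pair in enumerate(merges)}
    symbols = list(word)
    while len(symbols) > 1:
        # Find the adjacent pair with the best (lowest) merge rank.
        pairs = [(symbols[i], symbols[i + 1]) for i in range(len(symbols) - 1)]
        best = min(pairs, key=lambda p: ranks.get(p, float("inf")))
        if best not in ranks:
            break  # no learned merge applies any more
        # Merge every occurrence of the chosen pair, scanning left to right.
        merged = []
        i = 0
        while i < len(symbols):
            if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                merged.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                merged.append(symbols[i])
                i += 1
        symbols = merged
    # Mark every non-final unit with the separator so decoding can rejoin them.
    return [s + separator for s in symbols[:-1]] + [symbols[-1]]


# Hypothetical merge table, ordered from highest to lowest priority.
merges = [("l", "o"), ("lo", "w"), ("e", "r")]
print(bpe_segment("lower", merges))  # ['low@@', 'er']
```

Decoding is the reverse step: concatenate the units and drop each trailing `@@`, which is why the separator is attached only to non-final pieces.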

### Configuration and Settings

Configuration management including data directories, cache settings, remote repository URLs, device selection (CPU/CUDA), API keys, and experimental feature flags.

```python { .api }
# Configuration variables
debug: bool
data_dir: Path
package_data_dir: Path
cache_dir: Path
remote_repo: str
device: str
libretranslate_api_key: str
openai_api_key: str

# Model provider selection
class ModelProvider:
    OPENNMT = 0
    LIBRETRANSLATE = 1
    OPENAI = 2

model_provider: ModelProvider
```

[Configuration](./configuration.md)
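
The provider selection above can be illustrated with a small sketch that maps a user-supplied name to one of the three providers. The `ModelProvider` values are copied from the listing, but it is re-declared here as a standard `Enum` so the example runs standalone; whether the library itself uses `Enum` is an assumption. `parse_provider` is a hypothetical helper and not part of the documented API.

```python
from enum import Enum


class ModelProvider(Enum):
    """Stand-in for the provider constants listed above."""
    OPENNMT = 0
    LIBRETRANSLATE = 1
    OPENAI = 2


def parse_provider(name, default=ModelProvider.OPENNMT):
    """Map a case-insensitive provider name to a ModelProvider member."""
    if not name:
        # Fall back to the offline default when nothing is configured.
        return default
    try:
        return ModelProvider[name.strip().upper()]
    except KeyError:
        raise ValueError(f"Unknown model provider: {name!r}") from None


print(parse_provider("libretranslate"))
print(parse_provider(None))
```

Defaulting to `OPENNMT` mirrors the statement that offline neural models are the default backend; rejecting unknown names early avoids silently sending text to an unintended service.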

## Command Line Interfaces

### argos-translate CLI

Main command-line interface for performing translations directly from the terminal.

**Entry Point**: `argos-translate`

**Usage**: Provides command-line access to translation functionality with support for different languages and input/output options.

### argospm CLI

Package manager for installing and managing translation model packages.

**Entry Point**: `argospm`

**Common Commands**:

- Update package index
- Install packages
- List installed packages
- Search available packages
- Remove packages

## Error Handling

The library uses standard Python exceptions. Common error scenarios include:

- `FileNotFoundError`: When package files or directories are not found
- Network errors during package downloads
- Translation errors when unsupported language pairs are requested
- API authentication errors for external services
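
One defensive pattern for the scenarios listed above is to wrap the translation call and turn failures into a value the caller can inspect. The sketch below assumes only what the list states (standard Python exceptions); `safe_translate` and the stub functions are hypothetical. Because the exact exception types for unsupported pairs and API failures are not specified here, the final `except Exception` acts as a labelled catch-all and is not a documented contract.

```python
def safe_translate(translate_fn, text, from_code, to_code):
    """Call translate_fn and return (result, error_message); exactly one is None."""
    try:
        return translate_fn(text, from_code, to_code), None
    except FileNotFoundError as exc:
        # Must come before OSError, since FileNotFoundError is a subclass of it.
        return None, f"package files missing: {exc}"
    except OSError as exc:
        return None, f"I/O or network problem: {exc}"
    except Exception as exc:  # catch-all: specific types are not documented here
        return None, f"translation failed: {exc}"


# Stub translators standing in for a real translate function.
def working_stub(text, from_code, to_code):
    return f"[{from_code}->{to_code}] {text}"


def missing_files_stub(text, from_code, to_code):
    raise FileNotFoundError("model directory not found")


print(safe_translate(working_stub, "Hello", "en", "es"))
print(safe_translate(missing_files_stub, "Hello", "en", "es"))
```

Passing the translate function in as an argument keeps the wrapper testable without installed models; in an application it would presumably be `translate.translate`.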

## Supported Languages

Argostranslate supports over 30 languages with automatic pivoting capabilities. Language support depends on installed translation packages. The library includes a comprehensive language database (`languages.csv`) with ISO codes and names for 184 languages.

## Advanced Features

- **Translation Caching**: Automatic caching of translation results for improved performance
- **Sentence Boundary Detection**: Intelligent text segmentation for better translation quality
- **Format Preservation**: Maintains text formatting during translation using tag-based processing
- **BPE Tokenization**: Byte Pair Encoding support for improved translation accuracy
- **Fewshot Translation**: Integration with large language models for advanced translation scenarios
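
To illustrate the translation caching idea from the list above, here is a minimal sketch that memoizes results by input text and language pair using the standard library's `functools.lru_cache`. It is not how the library's own cache is implemented; `make_cached_translator` and `fake_translate` are hypothetical names, and the stub simply upper-cases text so the example runs without installed models.

```python
from functools import lru_cache


def make_cached_translator(translate_fn, maxsize=1024):
    """Wrap translate_fn so repeated (text, from_code, to_code) calls reuse results."""
    @lru_cache(maxsize=maxsize)
    def cached(text, from_code, to_code):
        return translate_fn(text, from_code, to_code)
    return cached


calls = []


def fake_translate(text, from_code, to_code):
    """Stub translator that records each real invocation."""
    calls.append(text)
    return text.upper()


cached_translate = make_cached_translator(fake_translate)
cached_translate("hello", "en", "es")
cached_translate("hello", "en", "es")  # served from the cache
print(len(calls))  # 1
```

The second call never reaches the underlying function, which is the effect a result cache aims for when the same strings are translated repeatedly.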