# Language and Dialect Support

Multi-language keyword support for international BDD development with dynamic dialect loading. The Gherkin parser supports over 70 natural languages, allowing teams to write specifications in their native language while maintaining compatibility with testing frameworks.

## Capabilities

### Dialect Class

Language-specific keyword sets with property-based access to translated Gherkin keywords.

```python { .api }
class Dialect:
    @classmethod
    def for_name(cls, name: str) -> Self | None:
        """
        Load dialect by language code.

        Parameters:
        - name: Language code (e.g., 'en', 'fr', 'de', 'es')

        Returns:
        - Dialect: Language dialect instance or None if not found
        """

    def __init__(self, spec: DialectSpec) -> None:
        """
        Create dialect from specification.

        Parameters:
        - spec: Dialect specification with keyword translations
        """

    @property
    def feature_keywords(self) -> list[str]:
        """Feature keywords (e.g., ['Feature', 'Fonctionnalité'])"""

    @property
    def rule_keywords(self) -> list[str]:
        """Rule keywords (e.g., ['Rule', 'Règle'])"""

    @property
    def scenario_keywords(self) -> list[str]:
        """Scenario keywords (e.g., ['Scenario', 'Scénario'])"""

    @property
    def scenario_outline_keywords(self) -> list[str]:
        """Scenario Outline keywords (e.g., ['Scenario Outline', 'Plan du scénario'])"""

    @property
    def background_keywords(self) -> list[str]:
        """Background keywords (e.g., ['Background', 'Contexte'])"""

    @property
    def examples_keywords(self) -> list[str]:
        """Examples keywords (e.g., ['Examples', 'Exemples'])"""

    @property
    def given_keywords(self) -> list[str]:
        """Given step keywords (e.g., ['Given', 'Soit'])"""

    @property
    def when_keywords(self) -> list[str]:
        """When step keywords (e.g., ['When', 'Quand'])"""

    @property
    def then_keywords(self) -> list[str]:
        """Then step keywords (e.g., ['Then', 'Alors'])"""

    @property
    def and_keywords(self) -> list[str]:
        """And conjunction keywords (e.g., ['And', 'Et'])"""

    @property
    def but_keywords(self) -> list[str]:
        """But conjunction keywords (e.g., ['But', 'Mais'])"""

    spec: DialectSpec
    """Raw dialect specification"""
```

### Dialect Specification

Structured definition of language-specific keywords for all Gherkin constructs.

```python { .api }
# Schematic listing: `and` is a reserved word in Python, so a runnable
# declaration needs the functional form TypedDict("DialectSpec", {...}).
class DialectSpec(TypedDict):
    and: list[str]
    """And conjunction keywords"""

    background: list[str]
    """Background section keywords"""

    but: list[str]
    """But conjunction keywords"""

    examples: list[str]
    """Examples table keywords"""

    feature: list[str]
    """Feature definition keywords"""

    given: list[str]
    """Given step keywords"""

    rule: list[str]
    """Rule section keywords"""

    scenario: list[str]
    """Scenario keywords"""

    scenarioOutline: list[str]
    """Scenario Outline keywords"""

    then: list[str]
    """Then step keywords"""

    when: list[str]
    """When step keywords"""

DIALECTS: dict[str, DialectSpec]
"""Global registry of all available language dialects"""
```
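
Because `and` is a reserved word in Python, the class-style listing above is best read as a schematic. A minimal sketch of a declaration that Python accepts uses the functional `TypedDict` form. The name `DialectSpecSketch` is hypothetical and only mirrors the keys listed above; it is not the library's own definition.

```python
from typing import TypedDict

# Hypothetical stand-in mirroring the documented keys; not the library's own type.
DialectSpecSketch = TypedDict(
    "DialectSpecSketch",
    {
        "and": list[str],
        "background": list[str],
        "but": list[str],
        "examples": list[str],
        "feature": list[str],
        "given": list[str],
        "rule": list[str],
        "scenario": list[str],
        "scenarioOutline": list[str],
        "then": list[str],
        "when": list[str],
    },
)

# A TypedDict is a plain dict at runtime, so every key is read by subscript.
sample: DialectSpecSketch = {
    "and": ["And"],
    "background": ["Background"],
    "but": ["But"],
    "examples": ["Examples"],
    "feature": ["Feature"],
    "given": ["Given"],
    "rule": ["Rule"],
    "scenario": ["Scenario"],
    "scenarioOutline": ["Scenario Outline"],
    "then": ["Then"],
    "when": ["When"],
}

print(sample["and"])
```

The functional form is the usual choice whenever a key is not a valid Python identifier or collides with a keyword.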

## Usage Examples

### Basic Language Support

```python
from gherkin.dialect import Dialect, DIALECTS

# List available languages
print(f"Supported languages: {list(DIALECTS.keys())}")

# Load English dialect (default)
english = Dialect.for_name("en")
print(f"Feature keywords: {english.feature_keywords}")
print(f"Given keywords: {english.given_keywords}")

# Load French dialect
french = Dialect.for_name("fr")
print(f"Feature keywords: {french.feature_keywords}")
print(f"Given keywords: {french.given_keywords}")

# Handle unknown dialect
unknown = Dialect.for_name("xyz")
if unknown is None:
    print("Dialect not found")
```
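
Since `for_name` returns `None` for unknown codes, callers may want a fallback chain before giving up. A minimal sketch, assuming a hypothetical helper `resolve_language_code` and a small stand-in set of codes in place of `DIALECTS.keys()`:

```python
def resolve_language_code(requested: str, available: set[str], default: str = "en") -> str:
    """Return the requested code if available, else its base language, else the default."""
    if requested in available:
        return requested
    base = requested.split("-", 1)[0]  # e.g. "fr-CA" -> "fr"
    if base in available:
        return base
    return default


# Stand-in for set(DIALECTS); the real registry is much larger.
available_codes = {"en", "fr", "de", "zh-CN"}

print(resolve_language_code("fr-CA", available_codes))  # fr
print(resolve_language_code("zh-CN", available_codes))  # zh-CN
print(resolve_language_code("xyz", available_codes))    # en
```

The resolved code can then be passed to `Dialect.for_name`, which keeps the `None` check in one place.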

### Parsing with Different Languages

```python
from gherkin import Parser
from gherkin.token_matcher import TokenMatcher

# French Gherkin content
french_gherkin = """
Fonctionnalité: Connexion utilisateur
  Scénario: Connexion valide
    Soit un utilisateur existant
    Quand il saisit des identifiants valides
    Alors il devrait être connecté
"""

# Create French token matcher
french_matcher = TokenMatcher("fr")
parser = Parser()

# Parse French content
document = parser.parse(french_gherkin, french_matcher)
feature = document['feature']
print(f"Feature name: {feature['name']}")
print(f"Language: {feature['language']}")

# German example
german_gherkin = """
Funktionalität: Benutzeranmeldung
  Szenario: Gültige Anmeldung
    Gegeben sei ein existierender Benutzer
    Wenn er gültige Anmeldedaten eingibt
    Dann sollte er angemeldet sein
"""

german_matcher = TokenMatcher("de")
german_document = parser.parse(german_gherkin, german_matcher)
```

### Language Detection

```python
from gherkin.dialect import DIALECTS

def detect_language_from_keywords(gherkin_text: str) -> str | None:
    """Detect language from Gherkin keywords"""

    lines = gherkin_text.strip().split('\n')
    first_line = lines[0].strip() if lines else ""

    # Check feature keywords across languages; the first match in
    # registry order wins
    for lang_code, dialect_spec in DIALECTS.items():
        for feature_keyword in dialect_spec['feature']:
            if first_line.startswith(feature_keyword + ':'):
                return lang_code

    return None

# Auto-detect language
french_text = "Fonctionnalité: Test automatique"
detected = detect_language_from_keywords(french_text)
print(f"Detected language: {detected}")  # Expected: fr

# A feature keyword can be shared by several dialects, so the code
# returned here is the first match and is not guaranteed to be "es"
spanish_text = "Característica: Prueba automática"
detected = detect_language_from_keywords(spanish_text)
print(f"Detected language: {detected}")
```
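
Gherkin files can also declare their dialect explicitly with a `# language: xx` comment at the top, which is less ambiguous than guessing from keywords. A minimal sketch of reading that header, assuming a hypothetical helper `language_from_header`; the regular expression is an approximation and not the parser's own:

```python
import re
from typing import Optional

# Approximate pattern for a "# language: xx" header line (an assumption,
# not the regular expression used by the library).
_LANGUAGE_HEADER = re.compile(r"^\s*#\s*language\s*:\s*([A-Za-z_\-]+)\s*$")


def language_from_header(gherkin_text: str) -> Optional[str]:
    """Return the code from a leading '# language:' comment, or None if absent."""
    for line in gherkin_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip blank lines before the header
        match = _LANGUAGE_HEADER.match(line)
        if match:
            return match.group(1)
        if not stripped.startswith("#"):
            return None  # reached real content without finding a header
    return None


print(language_from_header("# language: fr\nFonctionnalité: Connexion"))  # fr
print(language_from_header("Feature: Login"))  # None
```

Checking the header first and falling back to keyword matching only when it is missing keeps detection deterministic for files that declare their language.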

### Multi-Language Processing

```python
from gherkin.stream.gherkin_events import GherkinEvents

# Reuses detect_language_from_keywords() from the Language Detection example

def process_multilingual_features(features: dict[str, str]) -> None:
    """Process features in multiple languages"""

    options = GherkinEvents.Options(
        print_source=False,
        print_ast=True,
        print_pickles=True
    )

    processor = GherkinEvents(options)

    for file_name, content in features.items():
        # Auto-detect the language for reporting. Gherkin selects the dialect
        # from a "# language: xx" header and otherwise assumes English, so
        # non-English sources without that header may yield parseError envelopes.
        language = detect_language_from_keywords(content) or "en"

        source_event = {
            "source": {
                "uri": file_name,
                "location": {"line": 1},
                "data": content,
                "mediaType": "text/x.cucumber.gherkin+plain"
            }
        }

        for envelope in processor.enum(source_event):
            if "gherkinDocument" in envelope:
                doc = envelope["gherkinDocument"]
                feature = doc['feature']
                print(f"{file_name} ({language}): {feature['name']}")
            elif "parseError" in envelope:
                error = envelope["parseError"]
                print(f"Error in {file_name}: {error['message']}")

# Process mixed language features
multilingual_features = {
    "login_en.feature": """
    Feature: User Login
      Scenario: Valid login
        Given a user exists
        When they enter credentials
        Then they are logged in
    """,

    "login_fr.feature": """
    Fonctionnalité: Connexion utilisateur
      Scénario: Connexion valide
        Soit un utilisateur existant
        Quand il saisit des identifiants
        Alors il devrait être connecté
    """,

    "login_es.feature": """
    Característica: Inicio de sesión
      Escenario: Inicio válido
        Dado que existe un usuario
        Cuando ingresa credenciales
        Entonces debería estar conectado
    """
}

process_multilingual_features(multilingual_features)
```

### Custom Dialect Creation

```python
from gherkin.dialect import Dialect, DialectSpec

# Create custom dialect (hypothetical Pirate English)
pirate_spec: DialectSpec = {
    "feature": ["Treasure Map"],
    "scenario": ["Adventure", "Quest"],
    "given": ["Ahoy", "Avast"],
    "when": ["When ye"],
    "then": ["Then ye shall"],
    "and": ["An'", "And"],
    "but": ["But ye"],
    "background": ["Ship's Log"],
    "examples": ["Tales"],
    "rule": ["Pirate Code"],
    "scenarioOutline": ["Legend"]
}

# Create dialect instance
pirate_dialect = Dialect(pirate_spec)
print(f"Feature keywords: {pirate_dialect.feature_keywords}")
print(f"Given keywords: {pirate_dialect.given_keywords}")
```
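
A hand-written spec is easy to leave incomplete, and a missing key would only surface later when a keyword property is read. A minimal sketch of a pre-flight check, assuming a hypothetical helper `find_spec_problems` and taking the required keys from the `DialectSpec` listing in this document:

```python
# Keys taken from the DialectSpec listing in this document.
REQUIRED_KEYS = (
    "and", "background", "but", "examples", "feature", "given",
    "rule", "scenario", "scenarioOutline", "then", "when",
)


def find_spec_problems(spec: dict[str, list[str]]) -> list[str]:
    """Return a description for every missing key or empty keyword list."""
    problems = []
    for key in REQUIRED_KEYS:
        keywords = spec.get(key)
        if keywords is None:
            problems.append(f"missing key: {key}")
        elif not keywords:
            problems.append(f"empty keyword list: {key}")
    return problems


incomplete_spec = {"feature": ["Treasure Map"], "scenario": []}
for problem in find_spec_problems(incomplete_spec):
    print(problem)
```

Running the check before calling `Dialect(spec)` reports all gaps at once rather than one failure at a time.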

### Language-Aware Error Messages

```python
from typing import Callable

from gherkin.dialect import Dialect
from gherkin.errors import ParserException

def create_localized_error_handler(language: str) -> Callable[[ParserException], str]:
    """Create error handler with language context"""

    dialect = Dialect.for_name(language)
    if not dialect:
        dialect = Dialect.for_name("en")  # Fallback to English

    def handle_parse_error(error: ParserException) -> str:
        """Format error with language context"""

        location = error.location
        message = str(error)

        # Add dialect context to error
        expected_keywords = {
            'Feature': dialect.feature_keywords,
            'Scenario': dialect.scenario_keywords,
            'Given': dialect.given_keywords,
            'When': dialect.when_keywords,
            'Then': dialect.then_keywords
        }

        localized_message = f"Parse error at line {location['line']}: {message}"
        localized_message += f"\nExpected keywords in {language}:"

        for keyword_type, keywords in expected_keywords.items():
            localized_message += f"\n  {keyword_type}: {', '.join(keywords)}"

        return localized_message

    return handle_parse_error

# Use with different languages
french_error_handler = create_localized_error_handler("fr")
german_error_handler = create_localized_error_handler("de")
```

## Supported Languages

Common language dialects with their feature keywords:

| Language | Code | Feature Keywords | Example |
|----------|------|------------------|---------|
| English | en | Feature | Feature: User login |
| French | fr | Fonctionnalité | Fonctionnalité: Connexion |
| German | de | Funktionalität | Funktionalität: Benutzer |
| Spanish | es | Característica | Característica: Usuario |
| Italian | it | Funzionalità | Funzionalità: Utente |
| Portuguese | pt | Funcionalidade | Funcionalidade: Usuário |
| Russian | ru | Функция | Функция: Пользователь |
| Chinese | zh-CN | 功能 | 功能: 用户登录 |
| Japanese | ja | フィーチャ | フィーチャ: ユーザーログイン |
| Korean | ko | 기능 | 기능: 사용자 로그인 |

Access the complete list programmatically:

```python
from gherkin.dialect import DIALECTS

for lang_code in sorted(DIALECTS.keys()):
    dialect_spec = DIALECTS[lang_code]
    feature_keywords = dialect_spec['feature']
    print(f"{lang_code}: {', '.join(feature_keywords)}")
```
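
The same registry can be inverted to see which language codes share a feature keyword, which matters when a language is guessed from keywords alone. A minimal sketch, assuming a hypothetical helper `codes_by_feature_keyword` and a small illustrative dictionary in place of the real `DIALECTS` registry (its contents are sample data, not a copy of the registry):

```python
from collections import defaultdict


def codes_by_feature_keyword(dialects: dict) -> dict:
    """Map each feature keyword to the sorted list of language codes that use it."""
    index = defaultdict(list)
    for code in sorted(dialects):
        for keyword in dialects[code]["feature"]:
            index[keyword].append(code)
    return dict(index)


# Illustrative stand-in with the same shape as DIALECTS; not the real data.
sample_dialects = {
    "en": {"feature": ["Feature"]},
    "es": {"feature": ["Característica"]},
    "pt": {"feature": ["Funcionalidade", "Característica"]},
}

index = codes_by_feature_keyword(sample_dialects)
shared = {keyword: codes for keyword, codes in index.items() if len(codes) > 1}
print(shared)  # {'Característica': ['es', 'pt']}
```

Passing `DIALECTS` instead of the sample dictionary would list every keyword that maps to more than one language code.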