# Language and Dialect Support

Multi-language keyword support for international BDD development, with dialects loaded dynamically by language code. The Gherkin parser supports over 70 natural languages, so teams can write specifications in their native language while remaining compatible with testing frameworks.

## Capabilities

### Dialect Class

Language-specific keyword sets, exposed as properties that return the translated Gherkin keywords.

```python { .api }
class Dialect:
    @classmethod
    def for_name(cls, name: str) -> Self | None:
        """
        Load dialect by language code.

        Parameters:
        - name: Language code (e.g., 'en', 'fr', 'de', 'es')

        Returns:
        - Dialect: Language dialect instance or None if not found
        """

    def __init__(self, spec: DialectSpec) -> None:
        """
        Create dialect from specification.

        Parameters:
        - spec: Dialect specification with keyword translations
        """

    @property
    def feature_keywords(self) -> list[str]:
        """Feature keywords (e.g., ['Feature', 'Fonctionnalité'])"""

    @property
    def rule_keywords(self) -> list[str]:
        """Rule keywords (e.g., ['Rule', 'Règle'])"""

    @property
    def scenario_keywords(self) -> list[str]:
        """Scenario keywords (e.g., ['Scenario', 'Scénario'])"""

    @property
    def scenario_outline_keywords(self) -> list[str]:
        """Scenario Outline keywords (e.g., ['Scenario Outline', 'Plan du scénario'])"""

    @property
    def background_keywords(self) -> list[str]:
        """Background keywords (e.g., ['Background', 'Contexte'])"""

    @property
    def examples_keywords(self) -> list[str]:
        """Examples keywords (e.g., ['Examples', 'Exemples'])"""

    @property
    def given_keywords(self) -> list[str]:
        """Given step keywords (e.g., ['Given', 'Soit'])"""

    @property
    def when_keywords(self) -> list[str]:
        """When step keywords (e.g., ['When', 'Quand'])"""

    @property
    def then_keywords(self) -> list[str]:
        """Then step keywords (e.g., ['Then', 'Alors'])"""

    @property
    def and_keywords(self) -> list[str]:
        """And conjunction keywords (e.g., ['And', 'Et'])"""

    @property
    def but_keywords(self) -> list[str]:
        """But conjunction keywords (e.g., ['But', 'Mais'])"""

    spec: DialectSpec
    """Raw dialect specification"""
```

### Dialect Specification

Structured definition of language-specific keywords for all Gherkin constructs.

```python { .api }
from typing import TypedDict

# Functional TypedDict syntax is used because "and" is a reserved word in
# Python and cannot be declared as a field with class syntax.
DialectSpec = TypedDict(
    "DialectSpec",
    {
        "and": list[str],  # And conjunction keywords
        "background": list[str],  # Background section keywords
        "but": list[str],  # But conjunction keywords
        "examples": list[str],  # Examples table keywords
        "feature": list[str],  # Feature definition keywords
        "given": list[str],  # Given step keywords
        "rule": list[str],  # Rule section keywords
        "scenario": list[str],  # Scenario keywords
        "scenarioOutline": list[str],  # Scenario Outline keywords
        "then": list[str],  # Then step keywords
        "when": list[str],  # When step keywords
    },
)

DIALECTS: dict[str, DialectSpec]
"""Global registry of all available language dialects"""
```

## Usage Examples

### Basic Language Support

```python
from gherkin.dialect import Dialect, DIALECTS

# List available languages
print(f"Supported languages: {list(DIALECTS.keys())}")

# Load English dialect (default)
english = Dialect.for_name("en")
print(f"Feature keywords: {english.feature_keywords}")
print(f"Given keywords: {english.given_keywords}")

# Load French dialect
french = Dialect.for_name("fr")
print(f"Feature keywords: {french.feature_keywords}")
print(f"Given keywords: {french.given_keywords}")

# Handle unknown dialect
unknown = Dialect.for_name("xyz")
if unknown is None:
    print("Dialect not found")
```

### Parsing with Different Languages

```python
from gherkin.parser import Parser
from gherkin.token_matcher import TokenMatcher

# French Gherkin content
french_gherkin = """
Fonctionnalité: Connexion utilisateur
  Scénario: Connexion valide
    Soit un utilisateur existant
    Quand il saisit des identifiants valides
    Alors il devrait être connecté
"""

# Create French token matcher
french_matcher = TokenMatcher("fr")
parser = Parser()

# Parse French content
document = parser.parse(french_gherkin, french_matcher)
feature = document['feature']
print(f"Feature name: {feature['name']}")
print(f"Language: {feature['language']}")

# German example
german_gherkin = """
Funktionalität: Benutzeranmeldung
  Szenario: Gültige Anmeldung
    Gegeben sei ein existierender Benutzer
    Wenn er gültige Anmeldedaten eingibt
    Dann sollte er angemeldet sein
"""

german_matcher = TokenMatcher("de")
german_document = parser.parse(german_gherkin, german_matcher)
```

### Language Detection

```python
from gherkin.dialect import DIALECTS


def detect_language_from_keywords(gherkin_text: str) -> str | None:
    """Detect language from the feature keyword on the first line"""
    lines = gherkin_text.strip().split('\n')
    first_line = lines[0].strip() if lines else ""

    # Check feature keywords across languages. A keyword may appear in more
    # than one dialect; the first matching entry in DIALECTS wins.
    for lang_code, dialect_spec in DIALECTS.items():
        for feature_keyword in dialect_spec['feature']:
            if first_line.startswith(feature_keyword + ':'):
                return lang_code

    return None


# Auto-detect language
french_text = "Fonctionnalité: Test automatique"
detected = detect_language_from_keywords(french_text)
print(f"Detected language: {detected}")  # Output: fr

spanish_text = "Característica: Prueba automática"
detected = detect_language_from_keywords(spanish_text)
print(f"Detected language: {detected}")  # Output: es
```
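
Keyword sniffing is a heuristic. Gherkin files can also declare their dialect explicitly with a `# language: <code>` comment placed before the feature keyword, and checking for that header first avoids ambiguity between dialects that share keywords. The following is a minimal, standard-library-only sketch of that check; the helper name `language_from_header` and its regular expression are illustrative assumptions, not part of the gherkin package.

```python
from __future__ import annotations

import re

# Assumed pattern for a "# language: xx" header comment (illustrative only;
# the parser's own matching rules may differ in detail).
_LANGUAGE_HEADER = re.compile(r"^\s*#\s*language\s*:\s*([A-Za-z_-]+)\s*$")


def language_from_header(gherkin_text: str) -> str | None:
    """Return the declared language code, or None if no header is found.

    Only blank lines and comment lines ahead of the first content line
    are examined.
    """
    for line in gherkin_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip blank lines
        match = _LANGUAGE_HEADER.match(line)
        if match:
            return match.group(1)
        if not stripped.startswith("#"):
            break  # reached real content without seeing a header
    return None


print(language_from_header("# language: fr\nFonctionnalité: Connexion"))  # fr
print(language_from_header("Feature: User login"))  # None
```

A caller could try the header first and fall back to keyword detection only when it returns `None`.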
213
214
### Multi-Language Processing
215
216
```python
217
from gherkin.stream.gherkin_events import GherkinEvents
218
219
def process_multilingual_features(features: dict[str, str]) -> None:
220
"""Process features in multiple languages"""
221
222
options = GherkinEvents.Options(
223
print_source=False,
224
print_ast=True,
225
print_pickles=True
226
)
227
228
processor = GherkinEvents(options)
229
230
for file_name, content in features.items():
231
# Auto-detect or specify language
232
language = detect_language_from_keywords(content) or "en"
233
234
source_event = {
235
"source": {
236
"uri": file_name,
237
"location": {"line": 1},
238
"data": content,
239
"mediaType": "text/x.cucumber.gherkin+plain"
240
}
241
}
242
243
for envelope in processor.enum(source_event):
244
if "gherkinDocument" in envelope:
245
doc = envelope["gherkinDocument"]
246
feature = doc['feature']
247
print(f"{file_name} ({language}): {feature['name']}")
248
elif "parseError" in envelope:
249
error = envelope["parseError"]
250
print(f"Error in {file_name}: {error['message']}")
251
252
# Process mixed language features
253
multilingual_features = {
254
"login_en.feature": """
255
Feature: User Login
256
Scenario: Valid login
257
Given a user exists
258
When they enter credentials
259
Then they are logged in
260
""",
261
262
"login_fr.feature": """
263
Fonctionnalité: Connexion utilisateur
264
Scénario: Connexion valide
265
Soit un utilisateur existant
266
Quand il saisit des identifiants
267
Alors il devrait être connecté
268
""",
269
270
"login_es.feature": """
271
Característica: Inicio de sesión
272
Escenario: Inicio válido
273
Dado que existe un usuario
274
Cuando ingresa credenciales
275
Entonces debería estar conectado
276
"""
277
}
278
279
process_multilingual_features(multilingual_features)
280
```

### Custom Dialect Creation

```python
from gherkin.dialect import Dialect, DialectSpec

# Create custom dialect (hypothetical Pirate English).
# Step keywords in the built-in dialects end with a trailing space
# (e.g. "Given "), so the step keywords here follow the same convention.
pirate_spec: DialectSpec = {
    "feature": ["Treasure Map"],
    "scenario": ["Adventure", "Quest"],
    "given": ["Ahoy ", "Avast "],
    "when": ["When ye "],
    "then": ["Then ye shall "],
    "and": ["An' ", "And "],
    "but": ["But ye "],
    "background": ["Ship's Log"],
    "examples": ["Tales"],
    "rule": ["Pirate Code"],
    "scenarioOutline": ["Legend"]
}

# Create dialect instance
pirate_dialect = Dialect(pirate_spec)
print(f"Feature keywords: {pirate_dialect.feature_keywords}")
print(f"Given keywords: {pirate_dialect.given_keywords}")
```
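
A hand-written spec is easy to get subtly wrong, for example by omitting a key or passing a bare string where a list is expected. The sketch below checks a mapping against the eleven keys listed under `DialectSpec`. `validate_dialect_spec` and `REQUIRED_KEYS` are hypothetical helpers written for this illustration, not something the gherkin package provides.

```python
from __future__ import annotations

from collections.abc import Mapping

# The keys of DialectSpec as documented in the Dialect Specification section.
REQUIRED_KEYS = (
    "and", "background", "but", "examples", "feature", "given",
    "rule", "scenario", "scenarioOutline", "then", "when",
)


def validate_dialect_spec(spec: Mapping[str, object]) -> list[str]:
    """Return a list of problems found in spec (empty if it looks well-formed)."""
    problems = []
    for key in REQUIRED_KEYS:
        if key not in spec:
            problems.append(f"missing key: {key}")
            continue
        value = spec[key]
        if not isinstance(value, list) or not value:
            problems.append(f"{key}: expected a non-empty list of keywords")
        elif not all(isinstance(item, str) and item for item in value):
            problems.append(f"{key}: keywords must be non-empty strings")
    return problems


# A deliberately broken spec: most keys missing, "given" is a bare string.
incomplete_spec = {"feature": ["Treasure Map"], "given": "Ahoy "}
for problem in validate_dialect_spec(incomplete_spec):
    print(problem)
```

Running a check like this before constructing a `Dialect` surfaces mistakes early, rather than as a `KeyError` or odd matching behaviour later on.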

### Language-Aware Error Messages

```python
from collections.abc import Callable

from gherkin.dialect import Dialect
from gherkin.errors import ParserException


def create_localized_error_handler(language: str) -> Callable[[ParserException], str]:
    """Create error handler with language context"""
    dialect = Dialect.for_name(language)
    if dialect is None:
        dialect = Dialect.for_name("en")  # Fallback to English

    def handle_parse_error(error: ParserException) -> str:
        """Format error with language context"""
        location = error.location
        message = str(error)

        # Add dialect context to error
        expected_keywords = {
            'Feature': dialect.feature_keywords,
            'Scenario': dialect.scenario_keywords,
            'Given': dialect.given_keywords,
            'When': dialect.when_keywords,
            'Then': dialect.then_keywords
        }

        localized_message = f"Parse error at line {location['line']}: {message}"
        localized_message += f"\nExpected keywords in {language}:"

        for keyword_type, keywords in expected_keywords.items():
            localized_message += f"\n {keyword_type}: {', '.join(keywords)}"

        return localized_message

    return handle_parse_error


# Use with different languages
french_error_handler = create_localized_error_handler("fr")
german_error_handler = create_localized_error_handler("de")
```

## Supported Languages

A selection of common dialects, each shown with one of its feature keywords:

| Language | Code | Feature Keyword | Example |
|----------|------|-----------------|---------|
| English | en | Feature | Feature: User login |
| French | fr | Fonctionnalité | Fonctionnalité: Connexion |
| German | de | Funktionalität | Funktionalität: Benutzer |
| Spanish | es | Característica | Característica: Usuario |
| Italian | it | Funzionalità | Funzionalità: Utente |
| Portuguese | pt | Funcionalidade | Funcionalidade: Usuário |
| Russian | ru | Функция | Функция: Пользователь |
| Chinese | zh-CN | 功能 | 功能: 用户登录 |
| Japanese | ja | フィーチャ | フィーチャ: ユーザーログイン |
| Korean | ko | 기능 | 기능: 사용자 로그인 |

Access the complete list programmatically:

```python
from gherkin.dialect import DIALECTS

for lang_code in sorted(DIALECTS.keys()):
    dialect_spec = DIALECTS[lang_code]
    feature_keywords = dialect_spec['feature']
    print(f"{lang_code}: {', '.join(feature_keywords)}")
```
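
Because more than one dialect can use the same keyword, it can be useful to invert the registry and look up which language codes use a given feature keyword. The sketch below does this over a small stand-in dictionary with the same shape as `DIALECTS`. `SAMPLE_DIALECTS` (including the made-up code `en-example`) and `index_feature_keywords` are illustrative assumptions; the real registry can be passed in instead.

```python
from __future__ import annotations

from collections import defaultdict
from collections.abc import Mapping

# Stand-in data shaped like DIALECTS (hypothetical and heavily abbreviated).
SAMPLE_DIALECTS = {
    "en": {"feature": ["Feature", "Ability"]},
    "en-example": {"feature": ["Feature"]},  # made-up language code
    "fr": {"feature": ["Fonctionnalité"]},
}


def index_feature_keywords(
    dialects: Mapping[str, Mapping[str, list[str]]],
) -> dict[str, list[str]]:
    """Map each feature keyword to the language codes that use it."""
    index = defaultdict(list)
    for lang_code, spec in dialects.items():
        for keyword in spec["feature"]:
            index[keyword].append(lang_code)
    return dict(index)


index = index_feature_keywords(SAMPLE_DIALECTS)

# Keywords used by more than one dialect cannot identify a language alone.
shared = {kw: codes for kw, codes in index.items() if len(codes) > 1}
print(shared)  # {'Feature': ['en', 'en-example']}
```

Keywords that appear in `shared` are the ones for which keyword-based language detection is ambiguous.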