# Core Parsing

Primary parsing functionality that converts Gherkin text into structured Abstract Syntax Tree (AST) format. The parser handles tokenization, syntax analysis, and error recovery while supporting multiple input formats and comprehensive error reporting.

## Capabilities

### Parser Class

Main parser class that transforms Gherkin text into structured AST with configurable error handling and AST building.

```python { .api }
class Parser:
    def __init__(self, ast_builder: AstBuilder | None = None) -> None:
        """
        Create a new parser instance.

        Parameters:
        - ast_builder: Optional custom AST builder, defaults to AstBuilder()
        """

    def parse(
        self,
        token_scanner_or_str: TokenScanner | str,
        token_matcher: TokenMatcher | None = None,
    ) -> GherkinDocument:
        """
        Parse Gherkin text or token stream into AST.

        Parameters:
        - token_scanner_or_str: Either raw Gherkin text string or TokenScanner instance
        - token_matcher: Optional token matcher, defaults to TokenMatcher()

        Returns:
        - GherkinDocument: Parsed AST representation

        Raises:
        - CompositeParserException: Multiple parsing errors occurred
        - ParserException: Single parsing error occurred
        """

    stop_at_first_error: bool
    """Whether to stop parsing at the first error or collect all errors"""
```
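
The `stop_at_first_error` attribute controls how failures surface. A minimal sketch of the fail-fast mode (assuming the default is `False`, i.e. errors are collected and raised together at the end of parsing):

```python
from gherkin import Parser
from gherkin.errors import ParserException

parser = Parser()
parser.stop_at_first_error = True  # raise on the first error instead of collecting

try:
    parser.parse("this is not gherkin")
except ParserException as e:
    # With fail-fast enabled, the first error propagates directly
    print(f"Stopped at line {e.location['line']}: {e}")
```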

### AST Builder

Builds Abstract Syntax Tree nodes during parsing with ID generation and comment tracking.

```python { .api }
class AstBuilder:
    def __init__(self, id_generator: IdGenerator | None = None) -> None:
        """
        Create AST builder with optional ID generator.

        Parameters:
        - id_generator: Optional ID generator, defaults to IdGenerator()
        """

    def reset(self) -> None:
        """Reset builder state for new parsing session"""

    def start_rule(self, rule_type: str) -> None:
        """Start processing a grammar rule"""

    def end_rule(self, rule_type: str) -> None:
        """End processing a grammar rule"""

    def build(self, token: Token) -> None:
        """Build AST node from token"""

    def get_result(self) -> Any:
        """Get final parsed result"""

    id_generator: IdGenerator
    stack: list[AstNode]
    comments: list[Comment]
```
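
Comments are not attached to individual nodes; the builder collects them and exposes them on the finished document. A short sketch (assuming the standard Gherkin AST shape, where each comment carries a `text` and a `location`):

```python
from gherkin import Parser

commented_text = """
# file-level comment
Feature: Commented
  # explains the scenario below
  Scenario: Example
    Given a step
"""

document = Parser().parse(commented_text)
for comment in document['comments']:
    # comments are reported at the document level, not inside the feature tree
    print(comment['location']['line'], comment['text'].strip())
```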

### Token Processing

Low-level tokenization and scanning functionality for lexical analysis.

```python { .api }
class TokenScanner:
    def __init__(self, source: str) -> None:
        """
        Create token scanner for Gherkin source text.

        Parameters:
        - source: Raw Gherkin text to tokenize
        """

    def read(self) -> Token:
        """Read next token from source"""

class TokenMatcher:
    def __init__(self, dialect_name: str = "en") -> None:
        """
        Create token matcher for specified language dialect.

        Parameters:
        - dialect_name: Language dialect code (default: "en")
        """

    def reset(self) -> None:
        """Reset matcher state"""

    def match_FeatureLine(self, token: Token) -> bool:
        """Match feature line tokens"""

    def match_ScenarioLine(self, token: Token) -> bool:
        """Match scenario line tokens"""

    def match_StepLine(self, token: Token) -> bool:
        """Match step line tokens"""

class GherkinInMarkdownTokenMatcher(TokenMatcher):
    """Token matcher for Gherkin embedded in Markdown documents"""
```
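
The `dialect_name` parameter selects which keyword set the matcher uses, so non-English feature files can be parsed without a `# language:` header. A hedged sketch using the French dialect (keyword spellings follow the standard Gherkin dialect table):

```python
from gherkin import Parser
from gherkin.token_matcher import TokenMatcher

french_text = """
Fonctionnalité: Calculatrice
  Scénario: Addition
    Soit les nombres 2 et 3
    Quand je les additionne
    Alors j'obtiens 5
"""

# Match keywords against the "fr" dialect instead of the default "en"
document = Parser().parse(french_text, TokenMatcher("fr"))
print(document['feature']['name'])  # Calculatrice
```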

## Usage Examples

### Basic Text Parsing

```python
from gherkin import Parser

parser = Parser()
gherkin_text = """
Feature: Calculator
  Scenario: Addition
    Given I have 2 and 3
    When I add them
    Then I get 5
"""

document = parser.parse(gherkin_text)
feature = document['feature']
print(f"Feature: {feature['name']}")
print(f"Scenarios: {len(feature['children'])}")
```
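
The returned document is a plain dictionary tree following the Gherkin AST schema, so it can be walked directly. Continuing from the example above (a sketch; each entry in `children` wraps its node under a `scenario`, `background`, or `rule` key):

```python
for child in feature['children']:
    scenario = child.get('scenario')
    if scenario is None:
        continue  # skip backgrounds and rules
    print(scenario['keyword'], scenario['name'])
    for step in scenario['steps']:
        # step keywords keep their trailing space, e.g. "Given "
        print(f"  {step['keyword']}{step['text']}")
```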

### Custom AST Builder

```python
from gherkin import Parser
from gherkin.ast_builder import AstBuilder
from gherkin.stream.id_generator import IdGenerator

# Create custom ID generator
id_gen = IdGenerator()
ast_builder = AstBuilder(id_gen)
parser = Parser(ast_builder)

document = parser.parse(gherkin_text)  # gherkin_text from the basic example above
```
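
One reason to pass a builder explicitly is to share a single `IdGenerator` across several parses so that node IDs stay unique over a batch of documents. A sketch under that assumption (the default generator is assumed to be a simple incrementing counter that is not reset between parses):

```python
from gherkin import Parser
from gherkin.ast_builder import AstBuilder
from gherkin.stream.id_generator import IdGenerator

# One generator shared by every document parsed through this builder,
# so IDs are assumed not to restart for each file
shared_ids = IdGenerator()
parser = Parser(AstBuilder(shared_ids))

doc_a = parser.parse("Feature: A\n  Scenario: S\n    Given a step\n")
doc_b = parser.parse("Feature: B\n  Scenario: S\n    Given a step\n")
```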
156
157
### Error Handling
158
159
```python
160
from gherkin import Parser
161
from gherkin.errors import CompositeParserException, ParserException
162
163
parser = Parser()
164
invalid_gherkin = """
165
Feature: Invalid
166
Scenario:
167
Given step without scenario name
168
"""
169
170
try:
171
document = parser.parse(invalid_gherkin)
172
except CompositeParserException as e:
173
print(f"Multiple errors: {len(e.errors)}")
174
for error in e.errors:
175
print(f" Line {error.location['line']}: {error}")
176
except ParserException as e:
177
print(f"Parse error at line {e.location['line']}: {e}")
178
```

### Token Stream Processing

```python
from gherkin import Parser
from gherkin.token_scanner import TokenScanner
from gherkin.token_matcher import TokenMatcher

# Manual token processing: construct the scanner and matcher explicitly
scanner = TokenScanner(gherkin_text)  # gherkin_text from the basic example above
matcher = TokenMatcher("en")  # English dialect
parser = Parser()

document = parser.parse(scanner, matcher)
```
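
The `GherkinInMarkdownTokenMatcher` listed under Token Processing can be swapped in the same way to parse Gherkin keywords embedded in Markdown. A heavily hedged sketch: the import path and the header/bullet layout below follow the Gherkin-in-Markdown convention and should be checked against the installed version:

```python
from gherkin import Parser
from gherkin.token_matcher_markdown import GherkinInMarkdownTokenMatcher  # assumed module path

markdown_text = """# Feature: Calculator

## Scenario: Addition

* Given I have 2 and 3
* When I add them
* Then I get 5
"""

# The Markdown-aware matcher recognises keywords in headers and bullet lists
document = Parser().parse(markdown_text, GherkinInMarkdownTokenMatcher())
```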