# Core Parsing

Primary parsing functionality that converts Gherkin text into structured Abstract Syntax Tree (AST) format. The parser handles tokenization, syntax analysis, and error recovery while supporting multiple input formats and comprehensive error reporting.

## Capabilities

### Parser Class

Main parser class that transforms Gherkin text into structured AST with configurable error handling and AST building.

```python { .api }
class Parser:
    def __init__(self, ast_builder: AstBuilder | None = None) -> None:
        """
        Create a new parser instance.

        Parameters:
        - ast_builder: Optional custom AST builder, defaults to AstBuilder()
        """

    def parse(
        self,
        token_scanner_or_str: TokenScanner | str,
        token_matcher: TokenMatcher | None = None,
    ) -> GherkinDocument:
        """
        Parse Gherkin text or token stream into AST.

        Parameters:
        - token_scanner_or_str: Either raw Gherkin text string or TokenScanner instance
        - token_matcher: Optional token matcher, defaults to TokenMatcher()

        Returns:
        - GherkinDocument: Parsed AST representation

        Raises:
        - CompositeParserException: Multiple parsing errors occurred
        - ParserException: Single parsing error occurred
        """

    stop_at_first_error: bool
    """Whether to stop parsing at the first error or collect all errors"""
```
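
The `stop_at_first_error` flag selects between failing fast and collecting every problem into a `CompositeParserException`. A minimal sketch, assuming the attribute can be toggled on the instance before calling `parse()` (the stray first line is deliberately unparseable):

```python
from gherkin import Parser
from gherkin.errors import ParserException

parser = Parser()
# Assumption: setting this attribute before parse() makes the parser raise
# the first ParserException instead of accumulating errors.
parser.stop_at_first_error = True

try:
    parser.parse("stray text before the feature\nFeature: Broken\n")
except ParserException as e:
    print(f"Stopped at first error: {e}")
```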

### AST Builder

Builds Abstract Syntax Tree nodes during parsing with ID generation and comment tracking.

```python { .api }
class AstBuilder:
    def __init__(self, id_generator: IdGenerator | None = None) -> None:
        """
        Create AST builder with optional ID generator.

        Parameters:
        - id_generator: Optional ID generator, defaults to IdGenerator()
        """

    def reset(self) -> None:
        """Reset builder state for new parsing session"""

    def start_rule(self, rule_type: str) -> None:
        """Start processing a grammar rule"""

    def end_rule(self, rule_type: str) -> None:
        """End processing a grammar rule"""

    def build(self, token: Token) -> None:
        """Build AST node from token"""

    def get_result(self) -> Any:
        """Get final parsed result"""

    id_generator: IdGenerator
    stack: list[AstNode]
    comments: list[Comment]
```
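
A single builder can be shared across several parses. The sketch below assumes `Parser.parse()` resets the builder's stack and comment list at the start of each run; if it does not, call `reset()` yourself between documents:

```python
from gherkin import Parser
from gherkin.ast_builder import AstBuilder

builder = AstBuilder()
parser = Parser(builder)

for source in ["Feature: First\n", "Feature: Second\n"]:
    # Assumption: parse() invokes builder.reset() internally; otherwise
    # call builder.reset() here before each document.
    document = parser.parse(source)
    print(document['feature']['name'])
```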

### Token Processing

Low-level tokenization and scanning functionality for lexical analysis.

```python { .api }
class TokenScanner:
    def __init__(self, source: str) -> None:
        """
        Create token scanner for Gherkin source text.

        Parameters:
        - source: Raw Gherkin text to tokenize
        """

    def read(self) -> Token:
        """Read next token from source"""

class TokenMatcher:
    def __init__(self, dialect_name: str = "en") -> None:
        """
        Create token matcher for specified language dialect.

        Parameters:
        - dialect_name: Language dialect code (default: "en")
        """

    def reset(self) -> None:
        """Reset matcher state"""

    def match_FeatureLine(self, token: Token) -> bool:
        """Match feature line tokens"""

    def match_ScenarioLine(self, token: Token) -> bool:
        """Match scenario line tokens"""

    def match_StepLine(self, token: Token) -> bool:
        """Match step line tokens"""

class GherkinInMarkdownTokenMatcher(TokenMatcher):
    """Token matcher for Gherkin embedded in Markdown documents"""
```
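
`GherkinInMarkdownTokenMatcher` can be passed to `parse()` in place of the default matcher to read Gherkin keywords out of a Markdown document. A hedged sketch; the import path `gherkin.token_matcher_markdown` and the exact Markdown conventions (ATX headings for Feature/Scenario, list items for steps) are assumptions, not confirmed by this page:

```python
from gherkin import Parser
from gherkin.token_scanner import TokenScanner
from gherkin.token_matcher_markdown import GherkinInMarkdownTokenMatcher  # assumed module path

markdown_source = """# Feature: Calculator

## Scenario: Addition

* Given I have 2 and 3
* When I add them
* Then I get 5
"""

parser = Parser()
document = parser.parse(TokenScanner(markdown_source), GherkinInMarkdownTokenMatcher())
print(document['feature']['name'])
```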

## Usage Examples

### Basic Text Parsing

```python
from gherkin import Parser

parser = Parser()
gherkin_text = """
Feature: Calculator
  Scenario: Addition
    Given I have 2 and 3
    When I add them
    Then I get 5
"""

document = parser.parse(gherkin_text)
feature = document['feature']
print(f"Feature: {feature['name']}")
print(f"Scenarios: {len(feature['children'])}")
```
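
Because the result is plain dictionaries, the AST can be walked directly to pull out individual steps. The keys used below (`scenario`, `steps`, `keyword`, `text`) follow the Gherkin AST message format but are assumptions not shown in the snippet above:

```python
# Walk each child of the feature and print the steps of every scenario.
for child in feature['children']:
    scenario = child.get('scenario')
    if scenario is None:
        continue  # skip background/rule children
    print(scenario['name'])
    for step in scenario['steps']:
        print(f"  {step['keyword']}{step['text']}")
```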

### Custom AST Builder

```python
from gherkin import Parser
from gherkin.ast_builder import AstBuilder
from gherkin.stream.id_generator import IdGenerator

# Create custom ID generator
id_gen = IdGenerator()
ast_builder = AstBuilder(id_gen)
parser = Parser(ast_builder)

document = parser.parse(gherkin_text)
```

### Error Handling

```python
from gherkin import Parser
from gherkin.errors import CompositeParserException, ParserException

parser = Parser()
# Stray lines after a step are not valid Gherkin and trigger parse errors.
invalid_gherkin = """
Feature: Invalid
  Scenario: Broken
    Given a step
    this stray line is not a step
    neither is this one
"""

try:
    document = parser.parse(invalid_gherkin)
except CompositeParserException as e:
    print(f"Multiple errors: {len(e.errors)}")
    for error in e.errors:
        print(f"  Line {error.location['line']}: {error}")
except ParserException as e:
    print(f"Parse error at line {e.location['line']}: {e}")
```

### Token Stream Processing

```python
from gherkin import Parser
from gherkin.token_scanner import TokenScanner
from gherkin.token_matcher import TokenMatcher

# Manual token processing
scanner = TokenScanner(gherkin_text)
matcher = TokenMatcher("en")  # English dialect
parser = Parser()

document = parser.parse(scanner, matcher)
```