# Pre-defined Parsers

Ready-to-use parser constants for common tasks: matching whitespace and character classes, detecting the end of input, and reporting the current parse position.

## Capabilities

### Character Class Parsers

Parse common character categories.

```python { .api }
any_char: Parser
"""Parse any single character."""

letter: Parser
"""Parse any alphabetic character (using str.isalpha())."""

digit: Parser
"""Parse any digit character (using str.isdigit())."""

decimal_digit: Parser
"""Parse decimal digits 0-9 specifically."""
```

### Whitespace Parsing

Parse whitespace characters using regular expressions.

```python { .api }
whitespace: Parser
"""Parse one or more whitespace characters (regex r'\s+')."""
```

### Position and Control Parsers

Match the end of input and report the current parse position. None of these parsers consumes input.

```python { .api }
eof: Parser
"""Parse end of input - succeeds only when no more input remains."""

index: Parser
"""Get the current parse position as an integer."""

line_info: Parser
"""Get current line and column information as (line, column) tuple."""
```

## Usage Examples

### Basic Character Parsing

```python
from parsy import any_char, letter, digit, decimal_digit

# Parse any character
result = any_char.parse('x')  # Returns 'x'
result = any_char.parse('5')  # Returns '5'
result = any_char.parse('@')  # Returns '@'

# Parse letters only
result = letter.parse('a')  # Returns 'a'
result = letter.parse('Z')  # Returns 'Z'
# letter.parse('5')  # Raises ParseError

# Parse digits
result = digit.parse('7')  # Returns '7'
result = decimal_digit.parse('3')  # Returns '3'
```

### Whitespace Handling

```python
from parsy import generate, regex, string, whitespace

# Parse whitespace
result = whitespace.parse(' ')  # Returns ' '
result = whitespace.parse('\t\n ')  # Returns '\t\n '

# Common pattern: optional whitespace
optional_ws = whitespace.optional()

# Lexeme pattern: parse something followed by optional whitespace
def lexeme(parser):
    return parser << optional_ws

# Parse tokens with automatic whitespace handling
number = lexeme(regex(r'\d+').map(int))
plus = lexeme(string('+'))

# Parse "123 + 456" with automatic whitespace handling
@generate
def addition():
    left = yield number
    yield plus
    right = yield number
    return left + right

result = addition.parse('123 + 456')  # Returns 579
```

### Position Tracking

```python
from parsy import index, line_info, string, regex, generate

# Track parse position
@generate
def positioned_parse():
    start_pos = yield index
    content = yield string('hello')
    end_pos = yield index
    return (start_pos, content, end_pos)

result = positioned_parse.parse('hello')  # Returns (0, 'hello', 5)

# Track line and column information
@generate
def line_aware_parse():
    start_line_info = yield line_info
    content = yield regex(r'[^\n]+')
    end_line_info = yield line_info
    return {
        'content': content,
        'start': start_line_info,
        'end': end_line_info
    }

multiline_input = "line1\nline2\nline3"
# Slicing drops "line1\n", so positions are relative to the slice, not to the
# original text. parse_partial returns a (result, remaining_input) tuple.
result, remainder = line_aware_parse.parse_partial(multiline_input[6:])
# result == {'content': 'line2', 'start': (0, 0), 'end': (0, 5)}
# remainder == '\nline3'
```

### End-of-Input Validation

```python
from parsy import eof, generate, regex, string

# Ensure complete parsing
complete_number = regex(r'\d+').map(int) << eof
result = complete_number.parse('123')  # Returns 123
# complete_number.parse('123abc')  # Raises ParseError - input not fully consumed

# Parse complete expressions
@generate
def complete_expression():
    expr = yield regex(r'[^;]+').desc('expression')
    yield string(';')
    yield eof
    return expr.strip()

result = complete_expression.parse('x = 5 + 3;')  # Returns 'x = 5 + 3'
# complete_expression.parse('x = 5 + 3; extra')  # Raises ParseError
```
154
155
### Combining Pre-defined Parsers
156
157
```python
from parsy import alt, any_char, digit, eof, generate, letter, string, test_char, whitespace

# Build identifier parser
@generate
def identifier():
    first = yield letter
    rest = yield (letter | digit).many()
    return first + ''.join(rest)

result = identifier.parse('var123')  # Returns 'var123'

# Build word parser (words separated by whitespace)
word = letter.at_least(1).concat()
words = word.sep_by(whitespace)  # whitespace already matches one or more
result = words.parse('hello world python')  # Returns ['hello', 'world', 'python']

# Build line parser: everything up to a newline or the end of input
@generate
def line_with_ending():
    # any_char would also consume the newline, so exclude it explicitly
    content = yield test_char(lambda c: c != '\n', 'non-newline character').many().concat()
    yield alt(string('\n'), eof)
    return content

# Parse quoted string with escape sequences
@generate
def quoted_string():
    yield string('"')
    chars = yield (
        string('\\') >> any_char |  # Escaped character
        test_char(lambda c: c not in '"\\', 'non-quote character')  # Anything except quote or backslash
    ).many().concat()
    yield string('"')
    return chars

result = quoted_string.parse('"hello \\"world\\""')  # Returns 'hello "world"'
```
194
195
### Utility Functions
196
197
```python
from parsy import ParseError, line_info_at

# Get line info for any position
text = "line1\nline2\nline3"
line_col = line_info_at(text, 8)  # Position after "line1\nli"
print(line_col)  # Prints (1, 2) - line 1, column 2 (0-indexed)

# Error handling with position info
def parse_with_position_info(parser, text):
    try:
        return parser.parse(text)
    except ParseError as e:
        line, col = line_info_at(text, e.index)
        print(f"Parse error at line {line}, column {col}: {e}")
        raise
```