A Python parsing module providing an alternative approach to creating and executing simple grammars
npx @tessl/cli install tessl/pypi-pyparsing@3.2.00
# PyParsing

A comprehensive Python library that provides an alternative approach to creating and executing simple grammars, compared with traditional lex/yacc tools or regular expressions. PyParsing lets developers construct grammars directly in Python code from a library of classes, and its self-explanatory class names and operator definitions keep those grammars readable.

## Package Information

- **Package Name**: pyparsing
- **Language**: Python
- **Installation**: `pip install pyparsing`

## Core Imports

```python
import pyparsing
```

Common usage pattern:

```python
from pyparsing import Word, Literal, alphas, nums, Suppress, Optional
```

For type annotations:

```python
from typing import Union, Optional, Iterable, NamedTuple
from pyparsing import ParserElement, ParseResults
```

For complete access to all elements:

```python
from pyparsing import *
```

## Basic Usage

```python
from pyparsing import Word, alphas

# Define a simple grammar for "Hello, World!" pattern
greet = Word(alphas) + "," + Word(alphas) + "!"

# Parse a string
hello = "Hello, World!"
result = greet.parse_string(hello)
print(result)  # ['Hello', ',', 'World', '!']

# More complex example: parsing simple arithmetic
from pyparsing import Word, nums, one_of, infix_notation, OpAssoc

# Define number and operators (highest-precedence operators listed first)
number = Word(nums)
arithmetic_expr = infix_notation(
    number,
    [
        (one_of("* /"), 2, OpAssoc.LEFT),
        (one_of("+ -"), 2, OpAssoc.LEFT),
    ],
)

# Parse arithmetic expressions
expr = "3 + 4 * 2"
result = arithmetic_expr.parse_string(expr)
print(result)  # [['3', '+', ['4', '*', '2']]]
```

## Architecture

PyParsing uses a compositional approach where complex grammars are built from simple parsing elements:

- **ParserElement**: Base class for all parsing components
- **Terminal Elements**: Match specific text patterns (Literal, Word, Regex)
- **Expression Classes**: Combine other elements (And, Or, MatchFirst, Each)
- **Enhancement Classes**: Modify parsing behavior (Optional, ZeroOrMore, OneOrMore)
- **Token Converters**: Transform results (Group, Suppress, Combine)
- **ParseResults**: Container for parsed data with list/dict/object access patterns

This design enables building parsers through composition, making grammars self-documenting and easily maintainable while handling common parsing challenges like whitespace, quoted strings, and comments.

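The layering described above can be sketched with a small key/value grammar. This is a minimal illustration rather than a canonical pattern; the grammar and the results names (`key`, `value`, `first`, `second`) are made up for the example.

```python
from pyparsing import Group, Optional, Suppress, Word, alphas, nums

# Terminal elements
key = Word(alphas)
value = Word(nums)

# Token converters: Suppress drops "=" from the output, Group nests each pair.
# Calling an element with a string attaches a results name (names are illustrative).
pair = Group(key("key") + Suppress("=") + value("value"))

# "+" builds an And expression; Optional makes the second pair non-mandatory
entry = pair("first") + Optional(Suppress(",") + pair("second"))

result = entry.parse_string("width=80, height=24")
print(result.as_list())        # [['width', '80'], ['height', '24']]
print(result["first"]["key"])  # width
print(result.second.value)     # 24
```

The same `ParseResults` object supports list-style, dict-style, and attribute-style access, which is what the last three lines exercise.
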
## Capabilities

### Core Parser Elements

Fundamental building blocks for creating parsing expressions including terminal elements, expression combinators, and structural components that form the foundation of all pyparsing grammars.

```python { .api }
class ParserElement: ...
class Literal: ...
class Word: ...
class Regex: ...
class And: ...
class Or: ...
class MatchFirst: ...
```

[Core Elements](./core-elements.md)

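As a rough sketch of how these classes fit together, the snippet below builds the same toy assignment grammar twice: once with operators and once with the combinator classes spelled out. The `let` syntax is invented for the example.

```python
from pyparsing import And, Literal, MatchFirst, Regex, Word, alphas

keyword = Literal("let")
name = Word(alphas)
number = Regex(r"\d+")

# Operators construct the combinators: "+" -> And, "|" -> MatchFirst, "^" -> Or
assignment = keyword + name + "=" + (number | name)

# The same grammar written with explicit classes
explicit = And([keyword, name, Literal("="), MatchFirst([number, name])])

tokens = assignment.parse_string("let x = 42").as_list()
print(tokens)  # ['let', 'x', '=', '42']
```
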
### Expression Enhancement

Modifiers and enhancers that control repetition, optionality, lookahead/lookbehind, and token transformation to create sophisticated parsing behavior from basic elements.

```python { .api }
class Optional: ...
class ZeroOrMore: ...
class OneOrMore: ...
class FollowedBy: ...
class NotAny: ...
class Group: ...
class Suppress: ...
```

[Enhancement](./enhancement.md)

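A short sketch of optionality, lookahead, and grouping follows. The three mini-grammars (`signed`, `label`, `block`) are hypothetical and exist only to show one enhancer each.

```python
from pyparsing import (
    FollowedBy, Group, Literal, NotAny, Optional, Suppress, Word, ZeroOrMore,
    alphas, nums,
)

# Optional: the leading "-" may or may not be present
signed = Optional("-") + Word(nums)
signed_tokens = signed.parse_string("-5").as_list()
print(signed_tokens)  # ['-', '5']

# FollowedBy: positive lookahead, requires ":" next but does not consume it
label = Word(alphas) + FollowedBy(":")
label_tokens = label.parse_string("name: x").as_list()
print(label_tokens)  # ['name']

# NotAny: negative lookahead, accepts any word except the terminator "end"
end = Literal("end")
item = NotAny(end) + Word(alphas)
block = Suppress("begin") + Group(ZeroOrMore(item)) + Suppress(end)
block_tokens = block.parse_string("begin a b c end").as_list()
print(block_tokens)  # [['a', 'b', 'c']]
```
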
### Helper Functions and Utilities

High-level helper functions for common parsing patterns including delimited lists, nested expressions, HTML/XML parsing, infix notation, and specialized constructs.

```python { .api }
def one_of(strs: str) -> ParserElement: ...
def delimited_list(expr: ParserElement) -> ParserElement: ...
def nested_expr() -> ParserElement: ...
def infix_notation(base_expr: ParserElement, op_list: list) -> ParserElement: ...
def counted_array(expr: ParserElement) -> ParserElement: ...
```

[Helpers](./helpers.md)

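The sketch below exercises three of the helpers listed above with their default arguments. It assumes a pyparsing 3.x release in which `delimited_list` is still available; newer releases may emit a deprecation warning for that name.

```python
from pyparsing import Word, alphas, delimited_list, nested_expr, one_of

# one_of: match any one of a space-separated set of literal strings
operator = one_of("+ - * /")
op_tokens = operator.parse_string("*").as_list()
print(op_tokens)  # ['*']

# delimited_list: comma-separated items, with the delimiters dropped
names = delimited_list(Word(alphas))
name_tokens = names.parse_string("a, b, c").as_list()
print(name_tokens)  # ['a', 'b', 'c']

# nested_expr: arbitrarily nested groups, "(" and ")" by default
nested = nested_expr()
nested_tokens = nested.parse_string("(a (b c) d)").as_list()
print(nested_tokens)  # [['a', ['b', 'c'], 'd']]
```
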
### Common Parser Expressions

Pre-built parser expressions for frequently used patterns including numeric types, identifiers, network addresses, dates, and parse actions for data conversion.

```python { .api }
class pyparsing_common:
    integer: ParserElement
    real: ParserElement
    identifier: ParserElement
    ipv4_address: ParserElement
    uuid: ParserElement

    @staticmethod
    def convert_to_integer(): ...

    @staticmethod
    def convert_to_float(): ...
```

[Common Expressions](./common-expressions.md)

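A brief sketch of the pre-built expressions in use. The alias `ppc` is just a local convention, and the input strings are arbitrary.

```python
from pyparsing import pyparsing_common as ppc

# The numeric expressions carry parse actions that convert the matched text
int_value = ppc.integer.parse_string("42")[0]
real_value = ppc.real.parse_string("3.14")[0]
print(type(int_value).__name__, int_value)    # int 42
print(type(real_value).__name__, real_value)  # float 3.14

# Identifier and IPv4 expressions return the matched text as a string
ident = ppc.identifier.parse_string("user_id")[0]
address = ppc.ipv4_address.parse_string("192.168.0.1")[0]
print(ident, address)  # user_id 192.168.0.1
```
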
### Exception Handling

Exception classes for parsing errors with detailed location information and error recovery mechanisms for robust parser development.

```python { .api }
class ParseBaseException(Exception): ...
class ParseException(ParseBaseException): ...
class ParseFatalException(ParseBaseException): ...
class RecursiveGrammarException(Exception): ...
```

[Exceptions](./exceptions.md)

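One way to surface the location information is to catch `ParseException` and read its `lineno` and `col` attributes. The helper `try_parse` below is hypothetical, written only for this sketch.

```python
from pyparsing import ParseException, Word, nums

integer = Word(nums)

def try_parse(text):
    """Hypothetical helper: return (tokens, None) on success or (None, (line, column)) on failure."""
    try:
        return integer.parse_string(text, parse_all=True).as_list(), None
    except ParseException as err:
        # lineno and col are 1-based; err.explain() returns a longer, human-readable report
        return None, (err.lineno, err.col)

ok = try_parse("42")
bad = try_parse("abc")
print(ok)   # (['42'], None)
print(bad)  # (None, (1, 1))
```
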
### Testing and Debugging

Testing utilities and debugging tools for parser development including test runners, trace functions, and diagnostic configuration options.

```python { .api }
class pyparsing_test:
    @staticmethod
    def with_line_numbers(s: str) -> str: ...

def trace_parse_action(f: callable) -> callable: ...
def null_debug_action(*args) -> None: ...
```

[Testing & Debugging](./testing-debugging.md)

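A minimal sketch of the testing utilities. It also uses `ParserElement.run_tests`, which is not listed in the summary above but is assumed here to be the test runner the description refers to; the exact layout of the `with_line_numbers` output is not shown because it may vary between releases.

```python
from pyparsing import Word, nums, pyparsing_test as ppt

# Annotate a multi-line string with line and column markers for easier debugging
text = "first line\nsecond line"
numbered = ppt.with_line_numbers(text)
print(numbered)

# run_tests parses each non-blank line and reports whether all of them passed
integer = Word(nums)
success, report = integer.run_tests(
    """
    123
    456
    """,
    print_results=False,
)
print(success)  # True
```
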
## Unicode Support

Unicode character set definitions organized by language/script families.

```python { .api }
class pyparsing_unicode:
    """Unicode character sets organized by language/script."""

    # Language-specific character sets
    Arabic: UnicodeRangeList
    Chinese: UnicodeRangeList
    Greek: UnicodeRangeList
    Hebrew: UnicodeRangeList
    Latin1: UnicodeRangeList
    # ... many more language/script sets

def unicode_set(s: str) -> str:
    """Create character set from Unicode categories/ranges."""

class UnicodeRangeList:
    """Container for Unicode character ranges."""
```

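As an illustration, each script set exposes character-class strings such as `alphas` that can be passed to `Word`. The sample text is arbitrary, and `ppu` is only a local alias.

```python
from pyparsing import Word, pyparsing_unicode as ppu

# Build a Word from the Greek alphabetic characters
greek_word = Word(ppu.Greek.alphas)
pair = greek_word + "," + greek_word

greek_tokens = pair.parse_string("αβγ, δεζ").as_list()
print(greek_tokens)  # ['αβγ', ',', 'δεζ']
```
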
## Configuration

Global configuration and diagnostic settings for pyparsing behavior.

```python { .api }
class __diag__:
    """Diagnostic configuration flags."""
    warn_multiple_tokens_in_named_alternation: bool
    warn_ungrouped_named_tokens_in_collection: bool
    warn_name_set_on_empty_Forward: bool
    warn_on_parse_using_empty_Forward: bool
    warn_on_assignment_to_Forward: bool
    warn_on_multiple_string_args_to_oneof: bool
    enable_debug_on_named_expressions: bool

class __compat__:
    """Compatibility configuration flags."""
    collect_all_And_tokens: bool

def enable_diag(diag_enum) -> None:
    """Enable specific diagnostic option."""

def disable_diag(diag_enum) -> None:
    """Disable specific diagnostic option."""
```

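A sketch of toggling one diagnostic flag. It assumes the `Diagnostics` enum that pyparsing 3.x exposes alongside `enable_diag`; that enum is not listed in the summary above. Diagnostics generally need to be enabled before the grammar is constructed to have any effect.

```python
import pyparsing as pp

# Turn on one warning flag (Diagnostics is assumed to be available as pp.Diagnostics)
pp.enable_diag(pp.Diagnostics.warn_on_parse_using_empty_Forward)
enabled = pp.__diag__.warn_on_parse_using_empty_Forward
print(enabled)   # True

# Turn it back off
pp.disable_diag(pp.Diagnostics.warn_on_parse_using_empty_Forward)
disabled = pp.__diag__.warn_on_parse_using_empty_Forward
print(disabled)  # False
```
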
## Built-in Constants

```python { .api }
# Character sets for Word() construction
alphas: str          # 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
nums: str            # '0123456789'
alphanums: str       # alphas + nums
hexnums: str         # nums + 'ABCDEFabcdef'
printables: str      # All printable, non-whitespace ASCII characters
alphas8bit: str      # Extended ASCII letters
punc8bit: str        # Extended ASCII punctuation
identchars: str      # Valid identifier start characters
identbodychars: str  # Valid identifier body characters

# Pre-built parser expressions
empty: ParserElement              # Always matches, consumes nothing
line_start: ParserElement         # Matches at start of line
line_end: ParserElement           # Matches at end of line
string_start: ParserElement       # Matches at start of string
string_end: ParserElement         # Matches at end of string
quoted_string: ParserElement      # Any quoted string (single or double)
sgl_quoted_string: ParserElement  # Single quoted strings
dbl_quoted_string: ParserElement  # Double quoted strings
unicode_string: ParserElement     # Unicode string literals
```

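The character-set constants are plain strings, so they can be concatenated before being handed to `Word`. The sketch below is illustrative; the `identifier` and `hex_color` grammars are made up for the example.

```python
from pyparsing import Word, alphanums, alphas, hexnums, quoted_string, string_end

# First argument: allowed initial characters; second: allowed body characters
identifier = Word(alphas + "_", alphanums + "_")
ident_tokens = (identifier + string_end).parse_string("my_var2").as_list()
print(ident_tokens)  # ['my_var2']

# exact=6 requires exactly six hex digits
hex_color = "#" + Word(hexnums, exact=6)
color_tokens = hex_color.parse_string("#ff8800").as_list()
print(color_tokens)  # ['#', 'ff8800']

# quoted_string keeps the surrounding quotes in the matched token
quoted_tokens = quoted_string.parse_string('"hi there"').as_list()
print(quoted_tokens)  # ['"hi there"']
```
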
## Version Information

```python { .api }
__version__: str = "3.2.3"
__version_time__: str = "25 Mar 2025 01:38 UTC"
__author__: str = "Paul McGuire <ptmcg.gm+pyparsing@gmail.com>"

class version_info(NamedTuple):
    major: int
    minor: int
    micro: int
    releaselevel: str
    serial: int

    @property
    def __version__(self) -> str:
        """Return version string."""

    def __str__(self) -> str:
        """Return formatted version info."""

    def __repr__(self) -> str:
        """Return detailed version representation."""
```
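A small sketch of a runtime version check. It reads the module-level `__version_info__` instance of the `version_info` tuple; that attribute name is not listed above and is stated here as an assumption about the installed release.

```python
import pyparsing

# Version string, e.g. "3.2.3" for the release documented here
version = pyparsing.__version__
print(version)

# Structured form: an instance of the version_info named tuple
info = pyparsing.__version_info__
uses_pep8_names = (info.major, info.minor) >= (3, 0)
print(info.major, info.minor, uses_pep8_names)
```
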