# BibTeX Expression Parsing

The `BibtexExpression` class exposes the pyparsing grammar that underlies the BibTeX parser. Use it to customize parsing behavior, parse actions, and error handling for specialized BibTeX processing needs.

## Capabilities

### Expression Parser

The core `BibtexExpression` class provides access to the pyparsing-based grammar that powers the BibTeX parser, enabling low-level customization and control.

```python { .api }
class BibtexExpression:
    """
    Low-level BibTeX parsing expression using pyparsing grammar.

    Provides access to the underlying parsing components and allows
    for advanced customization of parsing behavior through parse actions
    and grammar modifications.
    """

    def __init__(self):
        """Create a new BibtexExpression parser instance."""

    def parseFile(self, file_obj):
        """
        Parse a BibTeX file using the expression grammar.

        Parameters:
        - file_obj: File object to parse

        Returns:
        Parsed result from pyparsing
        """

    def add_log_function(self, log_fun):
        """
        Add logging function for parsing events.

        Parameters:
        - log_fun (callable): Function to call for logging parse events
        """

    def set_string_name_parse_action(self, fun):
        """
        Set parse action for string name processing.

        Parameters:
        - fun (callable): Function to process string names during parsing
        """
```

### Parser Grammar Components

Individual grammar components are exposed as attributes for fine-grained parsing control and customization.

```python { .api }
# Grammar component attributes available on BibtexExpression instances:

# entry: pyparsing.ParserElement
# Grammar for parsing BibTeX entries (@article, @book, etc.)

# explicit_comment: pyparsing.ParserElement
# Grammar for parsing explicit @comment entries

# implicit_comment: pyparsing.ParserElement
# Grammar for parsing implicit comments (text outside entries)

# string_def: pyparsing.ParserElement
# Grammar for parsing @string definitions

# preamble_decl: pyparsing.ParserElement
# Grammar for parsing @preamble declarations

# main_expression: pyparsing.ParserElement
# Main grammar expression combining all BibTeX components

# ParseException: Exception
# Exception class for parsing errors (from pyparsing)
```

### Helper Functions

Utility functions for processing parsed content and manipulating parsing behavior.

```python { .api }
def strip_after_new_lines(s: str) -> str:
    """
    Strip whitespace from continuation lines in multi-line strings.

    Parameters:
    - s (str): Input string with potential continuation lines

    Returns:
    str: String with cleaned continuation lines
    """

def add_logger_parse_action(expr, log_func):
    """
    Add logging parse action to a pyparsing expression.

    Parameters:
    - expr: pyparsing expression to add logging to
    - log_func (callable): Function to call for logging

    Returns:
    Modified pyparsing expression with logging
    """
```
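To make the effect of continuation-line stripping concrete, the following standalone sketch may help. It assumes that "continuation lines" means every line after the first and that only leading whitespace is removed. The function name `strip_continuation_indent` is hypothetical, and this is not the library's implementation.

```python
def strip_continuation_indent(s: str) -> str:
    """Hypothetical stand-in: drop leading whitespace on every line after the first."""
    lines = s.splitlines()
    if len(lines) <= 1:
        # Single-line (or empty) input is returned unchanged.
        return s
    # Keep the first line as-is; left-strip each continuation line.
    return "\n".join([lines[0]] + [line.lstrip() for line in lines[1:]])


# A field value wrapped across two lines, as often seen in .bib files
value = "A long title\n        continued on a second line"
print(strip_continuation_indent(value))
```

Field values in `.bib` files are often wrapped and indented for readability, so a cleanup step of this kind keeps the source indentation out of the parsed value.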

## Usage Examples

### Custom Parse Actions

```python
from bibtexparser.bibtexexpression import BibtexExpression

# Create expression parser
expr = BibtexExpression()

# Add custom logging.
# Assumption (bibtexparser 1.x): the log function receives one preformatted
# message string per parsed element, not the pyparsing tokens.
def log_event(message):
    print(f"Parse event: {message}")

expr.add_log_function(log_event)

# Parse with the logging action attached
with open('bibliography.bib') as f:
    result = expr.parseFile(f)
```
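Any callable can serve as the log function, so a small collector object is one way to keep parse events for later inspection. The sketch below assumes the log function is called with a single message per event. `ParseEventCollector` and the logger name `bibtex.parse` are hypothetical names introduced for illustration, and the parse event is simulated so the block runs without bibtexparser installed.

```python
import logging


class ParseEventCollector:
    """Hypothetical helper: records each message and forwards it to a logger."""

    def __init__(self, logger_name="bibtex.parse"):
        self.messages = []
        self._logger = logging.getLogger(logger_name)

    def __call__(self, message):
        # Store the message and also emit it at DEBUG level.
        self.messages.append(str(message))
        self._logger.debug("%s", message)


collector = ParseEventCollector()

# With bibtexparser installed, the collector would be registered like this:
#   expr.add_log_function(collector)
# Here a parse event is simulated by calling the collector directly.
collector("Found entry: example")
print(len(collector.messages))
```

Collecting messages in a list makes it easy to assert on parse events in tests, while the `logging` call keeps normal log configuration in control of what is displayed.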

### Low-Level Parsing Control

```python
from bibtexparser.bibtexexpression import BibtexExpression
from bibtexparser.bparser import BibTexParser

# Create custom parser with expression control
expr = BibtexExpression()

# Customize string name processing.
# Assumption (bibtexparser 1.x): the callback is invoked with the three
# pyparsing parse-action arguments (source string, location, tokens).
def process_string_names(s, loc, tokens):
    # Custom processing of @string names
    return tokens[0].lower()

expr.set_string_name_parse_action(process_string_names)

# Use with main parser
parser = BibTexParser()
# Note: BibTexParser builds its own BibtexExpression internally, so the
# customized `expr` above is not picked up by `parser`.
# This example shows the conceptual usage only.
```

### Error Handling with ParseException

```python
from bibtexparser.bibtexexpression import BibtexExpression

expr = BibtexExpression()

try:
    with open('malformed.bib') as f:
        result = expr.parseFile(f)
except expr.ParseException as e:
    print(f"Parse error at line {e.lineno}: {e.msg}")
    print(f"Context: {e.line}")
```
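For longer files it can help to point at the exact column of the failure. The sketch below is one possible formatter, assuming the exception exposes `lineno`, `col` (1-based), `line`, and `msg` as pyparsing's `ParseException` does. `format_parse_error` is a hypothetical helper, and a `SimpleNamespace` stands in for a real exception so the block runs without pyparsing installed.

```python
from types import SimpleNamespace


def format_parse_error(exc) -> str:
    """Hypothetical helper: render a parse error with a caret under the column."""
    lineno = getattr(exc, "lineno", "?")
    col = getattr(exc, "col", 1)
    line = getattr(exc, "line", "")
    msg = getattr(exc, "msg", str(exc))
    # col is treated as 1-based, so the caret is indented by col - 1 spaces.
    pointer = " " * (max(col, 1) - 1) + "^"
    return f"line {lineno}, col {col}: {msg}\n{line}\n{pointer}"


# Stand-in for a pyparsing ParseException (values are illustrative only)
fake_error = SimpleNamespace(lineno=3, col=9, line="@article{key", msg="Expected ','")
print(format_parse_error(fake_error))
```

In real use the helper would be called inside the `except expr.ParseException as e:` branch with `e` as its argument.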

### Grammar Component Access

```python
from bibtexparser.bibtexexpression import BibtexExpression

expr = BibtexExpression()

# Access specific grammar components
entry_grammar = expr.entry
comment_grammar = expr.explicit_comment
string_grammar = expr.string_def

# Use individual components for specialized parsing
test_string = "@string{jan = \"January\"}"
try:
    result = string_grammar.parseString(test_string)
    print(f"Parsed string definition: {result}")
except expr.ParseException as e:
    print(f"Failed to parse string: {e}")
```

## Integration with Main Parser

The `BibtexExpression` class is used internally by `BibTexParser` but can be accessed for advanced customization:

```python
from bibtexparser.bparser import BibTexParser
from bibtexparser.bibtexexpression import BibtexExpression

# Create parser with custom expression handling
parser = BibTexParser()

# The parser uses BibtexExpression internally
# Advanced users can subclass BibTexParser to access and modify
# the underlying expression grammar for specialized needs

class CustomBibTexParser(BibTexParser):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Access and customize the internal expression parser
        # Note: This requires understanding of the internal implementation
```