# Parser System

Breathe's parser system reads Doxygen's XML output and provides cached access to the parsed data. Parse and I/O failures are reported through exceptions that carry the name of the offending file. The system handles both index files (project overview) and compound files (detailed element documentation).

## Capabilities

### DoxygenParserFactory Class

Factory that creates XML parsers sharing a single cache, so repeated parsing operations can reuse earlier results.

```python { .api }
class DoxygenParserFactory:
    def __init__(self, app: Sphinx):
        """
        Initialize parser factory with Sphinx application.

        Args:
            app: Sphinx application instance
        """

    def create_index_parser(self) -> DoxygenIndexParser:
        """
        Create parser for Doxygen index.xml files.

        Returns:
            DoxygenIndexParser instance with shared cache
        """

    def create_compound_parser(self, project_info: ProjectInfo) -> DoxygenCompoundParser:
        """
        Create parser for Doxygen compound XML files.

        Args:
            project_info: Project configuration for XML location

        Returns:
            DoxygenCompoundParser instance for the specified project
        """
```

### DoxygenIndexParser Class

Parses Doxygen's index.xml files which contain the project-wide overview and references to all documented elements.

```python { .api }
class DoxygenIndexParser(Parser):
    def parse(self, project_info: ProjectInfo):
        """
        Parse the index.xml file for a project.

        Handles caching and file state tracking. The index.xml file contains
        the master list of all documented compounds (classes, files, etc.)
        with their reference IDs.

        Args:
            project_info: Project configuration specifying XML location

        Returns:
            Parsed index data structure

        Raises:
            ParserError: If XML parsing fails
            FileIOError: If file cannot be read
        """
```

### DoxygenCompoundParser Class

Parses individual compound XML files containing detailed documentation for specific elements (classes, files, namespaces, etc.).

```python { .api }
class DoxygenCompoundParser(Parser):
    def __init__(self, app: Sphinx, cache: dict, project_info: ProjectInfo):
        """
        Initialize compound parser for specific project.

        Args:
            app: Sphinx application instance
            cache: Shared parser cache
            project_info: Project configuration for XML location
        """

    def parse(self, refid: str):
        """
        Parse a compound XML file by reference ID.

        Compound files contain detailed documentation for specific elements
        like classes, files, namespaces, etc. Each compound has a unique
        reference ID used as the filename (e.g., "classMyClass.xml").

        Args:
            refid: Reference ID of the compound to parse

        Returns:
            Parsed compound data structure

        Raises:
            ParserError: If XML parsing fails
            FileIOError: If file cannot be read
        """
```
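As a rough illustration of the refid-to-filename convention described in the docstring above, the sketch below builds the path of a compound file from a reference ID. The helper `compound_xml_path` and the `xml` directory are hypothetical names chosen for this example; Breathe derives the actual location from the project configuration, which is not shown here.

```python
from pathlib import Path


def compound_xml_path(xml_dir: Path, refid: str) -> Path:
    """Return the compound file path for `refid` (hypothetical helper)."""
    # Doxygen writes one XML file per compound, named after its refid.
    return xml_dir / f"{refid}.xml"


path = compound_xml_path(Path("xml"), "classMyClass")
print(path.name)  # classMyClass.xml
```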

### Base Parser Class

Common functionality shared by all parser types.

```python { .api }
class Parser:
    def __init__(self, app: Sphinx, cache: dict):
        """
        Initialize base parser functionality.

        Args:
            app: Sphinx application instance
            cache: Shared cache for parsed results
        """
```

### Exception Classes

Exceptions raised by the parsers. Each one wraps the underlying error together with the path of the file that triggered it.

```python { .api }
class ParserError(Exception):
    def __init__(self, error: Exception, filename: Path):
        """
        XML parsing error with file context.

        Args:
            error: Underlying parsing exception
            filename: Path to file that failed to parse
        """

    def __str__(self) -> str:
        """Return string representation including filename and error."""

    @property
    def error(self) -> Exception:
        """Get the underlying parsing error."""

    @property
    def filename(self) -> Path:
        """Get the path to file that failed to parse."""

class FileIOError(Exception):
    def __init__(self, error: Exception, filename: Path):
        """
        File I/O error with file context.

        Args:
            error: Underlying I/O exception
            filename: Path to file that couldn't be accessed
        """

    def __str__(self) -> str:
        """Return string representation including filename and error."""

    @property
    def error(self) -> Exception:
        """Get the underlying I/O error."""

    @property
    def filename(self) -> Path:
        """Get the path to file that couldn't be accessed."""
```
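The pattern these classes describe, wrapping a low-level exception together with the file it came from, can be sketched with the standard library alone. `SketchParserError`, `SketchFileIOError`, and `load_xml` below are hypothetical stand-ins written for this illustration. They mirror the documented constructor signature but are not Breathe's implementation.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


class _FileContextError(Exception):
    """Hypothetical base: keeps the original error and the file path."""

    def __init__(self, error: Exception, filename: Path):
        super().__init__(error)
        self.error = error
        self.filename = filename

    def __str__(self) -> str:
        return f"file {self.filename}: {self.error}"


class SketchParserError(_FileContextError):
    """Stand-in for a parse failure (malformed XML)."""


class SketchFileIOError(_FileContextError):
    """Stand-in for an I/O failure (missing or unreadable file)."""


def load_xml(filename: Path) -> ET.Element:
    """Parse `filename`, translating low-level errors into the wrappers."""
    try:
        return ET.parse(filename).getroot()
    except ET.ParseError as exc:
        raise SketchParserError(exc, filename) from exc
    except OSError as exc:
        raise SketchFileIOError(exc, filename) from exc


try:
    load_xml(Path("missing-dir") / "index.xml")
except SketchFileIOError as exc:
    print(f"could not read {exc.filename}")
```

Keeping the original exception on the wrapper lets callers report the file and still inspect the root cause.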

## XML File Structure

The parser system works with Doxygen's standard XML output structure:

### Index File (index.xml)
Contains project overview and compound references:

```xml
<?xml version='1.0' encoding='UTF-8' standalone='no'?>
<doxygenindex>
  <compound refid="classMyClass" kind="class">
    <name>MyClass</name>
    <member refid="classMyClass_1a123" kind="function">
      <name>myMethod</name>
    </member>
  </compound>
  <compound refid="namespaceMyNamespace" kind="namespace">
    <name>MyNamespace</name>
  </compound>
</doxygenindex>
```
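To make the layout concrete, the following sketch reads an index document shaped like the one above with Python's standard `xml.etree.ElementTree`. It only illustrates the file structure: Breathe uses its own generated parser modules, and the `list_compounds` helper is a hypothetical name introduced for this example.

```python
import xml.etree.ElementTree as ET

# Sample index content, matching the structure shown above.
INDEX_XML = """<doxygenindex>
  <compound refid="classMyClass" kind="class">
    <name>MyClass</name>
    <member refid="classMyClass_1a123" kind="function">
      <name>myMethod</name>
    </member>
  </compound>
  <compound refid="namespaceMyNamespace" kind="namespace">
    <name>MyNamespace</name>
  </compound>
</doxygenindex>
"""


def list_compounds(xml_text):
    """Return (refid, kind, name) for each top-level compound entry."""
    root = ET.fromstring(xml_text)
    return [
        (compound.get("refid"), compound.get("kind"), compound.findtext("name"))
        for compound in root.findall("compound")
    ]


for refid, kind, name in list_compounds(INDEX_XML):
    print(f"{kind}: {name} -> {refid}.xml")
```

Each `refid` collected this way is the key that a compound parser would later take as its argument.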

### Compound Files (e.g., classMyClass.xml)
Contain detailed documentation for specific elements:

```xml
<?xml version='1.0' encoding='UTF-8' standalone='no'?>
<doxygen>
  <compounddef id="classMyClass" kind="class">
    <compoundname>MyClass</compoundname>
    <briefdescription>Brief description of MyClass</briefdescription>
    <detaileddescription>Detailed description...</detaileddescription>
    <sectiondef kind="public-func">
      <memberdef kind="function" id="classMyClass_1a123">
        <name>myMethod</name>
        <type>void</type>
        <definition>void MyClass::myMethod</definition>
      </memberdef>
    </sectiondef>
  </compounddef>
</doxygen>
```
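A compound file can be walked in the same spirit. The sketch below, again using only the standard library rather than Breathe's generated parser, collects the members of each section. `list_members` is a hypothetical helper, and the sample is trimmed to the elements the code actually reads.

```python
import xml.etree.ElementTree as ET

# Trimmed sample: one class compound with a single public function.
COMPOUND_XML = """<doxygen>
  <compounddef id="classMyClass" kind="class">
    <sectiondef kind="public-func">
      <memberdef kind="function" id="classMyClass_1a123">
        <name>myMethod</name>
        <type>void</type>
        <definition>void MyClass::myMethod</definition>
      </memberdef>
    </sectiondef>
  </compounddef>
</doxygen>
"""


def list_members(xml_text):
    """Return (section kind, member name, definition) for every memberdef."""
    root = ET.fromstring(xml_text)
    members = []
    for section in root.iter("sectiondef"):
        for member in section.findall("memberdef"):
            members.append(
                (
                    section.get("kind"),
                    member.findtext("name"),
                    member.findtext("definition"),
                )
            )
    return members


print(list_members(COMPOUND_XML))
```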

## Caching System

The parser system caches parsed results so that an XML file is not parsed again until it changes:

### File State Tracking
- Monitors XML file modification times
- Invalidates cache when files change
- Automatically re-parses updated files

### Shared Cache
- Single cache shared across all parser instances
- Keyed by absolute file path
- Stores parsed data structures, not raw XML

### Cache Integration

```python { .api }
# Cache is automatically managed
factory = DoxygenParserFactory(app)
parser = factory.create_index_parser()

# First parse - loads and caches
result1 = parser.parse(project_info)

# Second parse - returns cached result
result2 = parser.parse(project_info)  # Fast cached access

# If XML file changes, cache is invalidated automatically
```
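The behavior listed in this section (entries keyed by absolute path, invalidated when the modification time changes) can be approximated in a few lines. The `MtimeCache` class below is a hypothetical sketch under those assumptions, not Breathe's code; how Breathe wires file state tracking and the parser cache together internally is not covered in this document.

```python
import tempfile
from pathlib import Path
from typing import Any, Callable, Dict, Tuple


class MtimeCache:
    """Hypothetical cache: one entry per absolute path, refreshed on mtime change."""

    def __init__(self, parse: Callable[[Path], Any]) -> None:
        self._parse = parse
        self._entries: Dict[Path, Tuple[int, Any]] = {}

    def get(self, filename: Path) -> Any:
        path = Path(filename).resolve()  # key by absolute path
        mtime = path.stat().st_mtime_ns
        entry = self._entries.get(path)
        if entry is not None and entry[0] == mtime:
            return entry[1]  # file unchanged: reuse the parsed result
        result = self._parse(path)  # new or modified file: parse again
        self._entries[path] = (mtime, result)
        return result


with tempfile.TemporaryDirectory() as tmp:
    xml_file = Path(tmp) / "index.xml"
    xml_file.write_text("<doxygenindex/>")

    parsed = []  # records every real parse

    def parse(path: Path) -> str:
        parsed.append(path)
        return path.read_text()

    cache = MtimeCache(parse)
    cache.get(xml_file)
    cache.get(xml_file)
    print(len(parsed))  # 1: the second call was served from the cache
```

Storing the parsed result rather than the file contents means a cache hit skips both the read and the parse.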

## Usage Examples

### Basic Parsing Setup

```python
from breathe.parser import DoxygenParserFactory
from breathe.project import ProjectInfo

# `app` is the running Sphinx application and `project_info` is the
# ProjectInfo of the project whose XML should be read.

# Create factory
factory = DoxygenParserFactory(app)

# Parse project index
index_parser = factory.create_index_parser()
index_data = index_parser.parse(project_info)

# Parse specific compound
compound_parser = factory.create_compound_parser(project_info)
class_data = compound_parser.parse("classMyClass")
```

### Error Handling

```python
from breathe.parser import ParserError, FileIOError

# `index_parser` and `project_info` are created as in Basic Parsing Setup
try:
    index_data = index_parser.parse(project_info)
except ParserError as e:
    print(f"XML parsing failed: {e.error}")
    print(f"File: {e.filename}")
except FileIOError as e:
    print(f"File I/O error: {e.error}")
    print(f"File: {e.filename}")
```

### Integration with Directives

```python
# In directive implementation
try:
    parser = self.parser_factory.create_compound_parser(project_info)
    compound_data = parser.parse(refid)
    # Process parsed data...
except ParserError as e:
    return format_parser_error(
        self.name, e.error, e.filename, self.state, self.lineno, True
    )
```

### Custom Parser Usage

```python
# Access parser factory from directive
factory = self.env.temp_data["breathe_parser_factory"]

# Create parsers as needed
index_parser = factory.create_index_parser()
compound_parser = factory.create_compound_parser(project_info)

# Parse XML data
index_data = index_parser.parse(project_info)
for compound in index_data.compounds:
    compound_data = compound_parser.parse(compound.refid)
    # Process compound data...
```

## Generated Parser Modules

The parser system includes generated modules for XML schema handling:

### breathe.parser.index
- Generated parser for index.xml schema
- Handles project-wide compound listings
- Provides structured access to compound references

### breathe.parser.compound
- Generated parser for compound XML schema
- Handles detailed element documentation
- Supports all Doxygen compound types (class, file, namespace, etc.)

### breathe.parser.indexsuper
- Base classes for index parsing
- Common functionality for index operations

### breathe.parser.compoundsuper
- Base classes for compound parsing
- Common functionality for compound operations

## Performance Considerations

### Caching Strategy
- Parse results cached by absolute file path
- File modification time tracking prevents stale data
- Shared cache across all parser instances reduces memory usage

### Lazy Loading
- XML files only parsed when specifically requested
- Compound files parsed on demand, not in bulk
- Index parsed once per project, compounds as needed

### Memory Management
- Cache stores parsed data structures, not raw XML
- Cache invalidation discards stale entries when a file changes
- Shared cache avoids keeping duplicate copies of parsed data

### Optimization Tips
- Use the same ProjectInfoFactory across directives
- Let caching handle repeated access to the same files
- Monitor XML file sizes; very large files may slow parsing