# bibtexparser

A BibTeX parser library for Python 3 for parsing and manipulating bibliographic data files. It offers a simple JSON-like API that loads BibTeX from strings or files into BibDatabase objects and writes them back to BibTeX format.

## Package Information

- **Package Name**: bibtexparser
- **Language**: Python
- **Installation**: `pip install bibtexparser`
- **Dependencies**: pyparsing>=2.0.3

## Core Imports

```python
import bibtexparser
```

For advanced usage with custom parsers and writers:

```python
from bibtexparser.bparser import BibTexParser
from bibtexparser.bwriter import BibTexWriter
from bibtexparser.bibdatabase import BibDatabase
```

## Basic Usage

```python
import bibtexparser

# Parse a BibTeX file
with open('bibliography.bib') as bibtex_file:
    bib_database = bibtexparser.load(bibtex_file)

# Access entries
for entry in bib_database.entries:
    print(f"{entry['ID']}: {entry.get('title', 'No title')}")

# Parse from string
bibtex_str = """
@article{Einstein1905,
  title={On the electrodynamics of moving bodies},
  author={Einstein, Albert},
  journal={Annalen der Physik},
  year={1905}
}
"""
bib_database = bibtexparser.loads(bibtex_str)

# Write back to BibTeX format
bibtex_output = bibtexparser.dumps(bib_database)
print(bibtex_output)

# Write to file
with open('output.bib', 'w') as bibtex_file:
    bibtexparser.dump(bib_database, bibtex_file)
```

## Architecture

bibtexparser is organized into four layers:

- **High-level API**: Simple functions (loads, load, dumps, dump) for common use cases
- **Parser/Writer Layer**: Configurable BibTexParser and BibTexWriter classes for advanced control
- **Data Model**: BibDatabase and related classes for representing bibliographic data
- **Expression Layer**: pyparsing-based BibtexExpression for low-level parsing control

The library supports string interpolation, cross-reference resolution, and extensive customization through parsing hooks and field processing functions.

## Capabilities

### High-Level Parsing and Writing

Simple interface for parsing BibTeX strings and files into BibDatabase objects, and writing them back to BibTeX format. These functions handle the most common use cases with sensible defaults.

```python { .api }
def loads(bibtex_str: str, parser=None) -> BibDatabase: ...
def load(bibtex_file, parser=None) -> BibDatabase: ...
def dumps(bib_database: BibDatabase, writer=None) -> str: ...
def dump(bib_database: BibDatabase, bibtex_file, writer=None) -> None: ...
```

[Basic Operations](./basic-operations.md)

### Advanced Parsing Configuration

Configurable parser with options for handling non-standard entries, field homogenization, string interpolation, and cross-reference resolution. Includes customization hooks for processing entries during parsing.

```python { .api }
class BibTexParser:
    def __init__(
        self,
        customization=None,
        ignore_nonstandard_types: bool = True,
        homogenize_fields: bool = False,
        interpolate_strings: bool = True,
        common_strings: bool = True,
        add_missing_from_crossref: bool = False
    ): ...

    def parse(self, bibtex_str: str, partial: bool = False) -> BibDatabase: ...
    def parse_file(self, file, partial: bool = False) -> BibDatabase: ...
```

[Advanced Parsing](./advanced-parsing.md)

### Advanced Writing Configuration

Configurable writer with extensive formatting options including field ordering, indentation, alignment, and entry sorting. Supports various BibTeX syntax styles and output customization.

```python { .api }
from enum import Enum, auto

class BibTexWriter:
    def __init__(self, write_common_strings: bool = False): ...
    def write(self, bib_database: BibDatabase) -> str: ...

class SortingStrategy(Enum):
    ALPHABETICAL_ASC = auto()
    ALPHABETICAL_DESC = auto()
    PRESERVE = auto()
```

[Advanced Writing](./advanced-writing.md)

### Data Model and Database Operations

Core data structures for representing bibliographic databases including entries, comments, preambles, and string definitions. Supports cross-reference resolution and string expansion.

```python { .api }
class BibDatabase:
    entries: list
    comments: list
    strings: dict
    preambles: list

    def load_common_strings(self) -> None: ...
    def get_entry_dict(self) -> dict: ...
    def expand_string(self, name: str) -> str: ...
    def add_missing_from_crossref(self) -> None: ...
```

[Data Model](./data-model.md)

### Entry Customization and Processing

Collection of functions for customizing and processing bibliographic entries including name parsing, field normalization, LaTeX encoding conversion, and specialized field handling.

```python { .api }
def author(record: dict) -> dict: ...
def editor(record: dict) -> dict: ...
def journal(record: dict) -> dict: ...
def keyword(record: dict, sep: str = ',|;') -> dict: ...
def convert_to_unicode(record: dict) -> dict: ...
def homogenize_latex_encoding(record: dict) -> dict: ...
```

[Entry Customization](./entry-customization.md)

### LaTeX Encoding Utilities

Utilities for converting between LaTeX-encoded text and Unicode, supporting a comprehensive range of special characters, accents, and symbols commonly found in bibliographic data.

```python { .api }
def latex_to_unicode(string: str) -> str: ...
def string_to_latex(string: str) -> str: ...
def protect_uppercase(string: str) -> str: ...
```

[LaTeX Encoding](./latex-encoding.md)

## Types

```python { .api }
class BibDatabase:
    """Main bibliographic database container."""
    entries: list    # List of entry dictionaries
    comments: list   # List of comment strings
    strings: dict    # Dictionary of string definitions
    preambles: list  # List of preamble strings

class BibDataString:
    """Represents a BibTeX string definition."""
    def __init__(self, bibdatabase: BibDatabase, name: str): ...
    def get_value(self) -> str: ...

class BibDataStringExpression:
    """Represents BibTeX string expressions (concatenated strings)."""
    def __init__(self, expression: list): ...
    def get_value(self) -> str: ...

class UndefinedString(KeyError):
    """Exception raised when referencing undefined string."""
    pass

class InvalidName(ValueError):
    """Exception raised for invalid name format."""
    pass
```