# Utility Functions

File handling, URL processing, and path resolution utilities that support remote BibTeX file downloading, Zotero API integration, and MkDocs path resolution.

## Capabilities

### File Download Functions

Functions for downloading bibliography and CSL files from remote URLs, with automatic retries and storage in temporary files.

```python { .api }
def tempfile_from_url(name: str, url: str, suffix: str) -> str:
    """
    Download bibfile from a URL to a temporary file.

    Features:
    - Automatic retry (up to 3 attempts)
    - Special handling for Zotero API URLs
    - UTF-8 encoding for downloaded content
    - Temporary file with specified suffix

    Args:
        name (str): Descriptive name for logging purposes
        url (str): URL to download from
        suffix (str): File suffix for temporary file (e.g., ".bib")

    Returns:
        str: Path to downloaded temporary file

    Raises:
        RuntimeError: If download fails after retries or HTTP status != 200

    Note:
        Temporary files are not automatically deleted
    """

def tempfile_from_zotero_url(name: str, url: str, suffix: str) -> str:
    """
    Download bibfile from the Zotero API with pagination support.

    Features:
    - Automatic pagination following "next" links
    - Query parameter sanitization for BibTeX format
    - Maximum 999 pages (supports ~100k items)
    - Retry logic for network errors

    Args:
        name (str): Descriptive name for logging purposes
        url (str): Zotero API URL
        suffix (str): File suffix for temporary file

    Returns:
        str: Path to downloaded temporary file containing all pages

    Raises:
        RuntimeError: If download fails or HTTP status != 200
    """
```

### URL Processing Functions

Functions for processing and sanitizing URLs, particularly for Zotero API integration.

```python { .api }
def sanitize_zotero_query(url: str) -> str:
    """
    Sanitize query params in the Zotero URL.

    The query params are amended to meet the following requirements:
    - mkdocs-bibtex expects all bib data to be in bibtex format.
    - Requesting the maximum number of items (100) reduces the requests
      required, hence reducing load times.

    Args:
        url (str): Original Zotero API URL

    Returns:
        str: Sanitized URL with format=bibtex and limit=100 parameters

    Note:
        Preserves existing query parameters while overriding format and limit
    """
```

### Path Resolution Functions

Functions for resolving file paths relative to MkDocs configuration.

```python { .api }
def get_path_relative_to_mkdocs_yaml(path: str, config: MkDocsConfig) -> str:
    """
    Get the relative path of a file to the mkdocs.yaml file.

    Args:
        path (str): File path (relative or absolute)
        config (MkDocsConfig): MkDocs configuration object

    Returns:
        str: Normalized absolute path relative to mkdocs.yml location

    Note:
        Uses os.path.normpath for cross-platform compatibility
    """
```
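
The resolution described above can be approximated with the standard library alone. The following is a minimal sketch, not the plugin's confirmed implementation: it assumes the configuration object exposes the location of `mkdocs.yml` as `config_file_path`, and `FakeConfig` and `resolve_relative_to_config` are hypothetical names used only for illustration.

```python
import os
from dataclasses import dataclass


@dataclass
class FakeConfig:
    """Hypothetical stand-in for MkDocsConfig; only the config file path is modeled."""
    config_file_path: str


def resolve_relative_to_config(path: str, config: FakeConfig) -> str:
    """Join `path` onto the directory holding mkdocs.yml, then normalize it."""
    base_dir = os.path.dirname(config.config_file_path)
    # os.path.join returns `path` unchanged when it is already absolute
    return os.path.normpath(os.path.join(base_dir, path))


config = FakeConfig(config_file_path=os.path.join("project", "mkdocs.yml"))
resolved = resolve_relative_to_config("docs/../bibliography/refs.bib", config)
print(resolved)
```

Normalizing after the join collapses segments such as `docs/..`, so equivalent spellings of the same location compare equal.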

### Logging Utilities

Module-level logging configuration for the plugin.

```python { .api }
import logging

log: logging.Logger
"""
Logger instance for the mkdocs-bibtex plugin.

Logger name: "mkdocs.plugins.mkdocs-bibtex"
Used throughout the plugin for consistent logging.
"""
```

## Usage Examples

### Basic File Download

```python
from mkdocs_bibtex.utils import tempfile_from_url

# Download BibTeX file from URL
url = "https://example.com/references.bib"
temp_file = tempfile_from_url("bibliography", url, ".bib")

print(f"Downloaded to: {temp_file}")
# Downloaded to: /tmp/tmpXXXXXX.bib

# File can now be used with pybtex
from pybtex.database import parse_file
bib_data = parse_file(temp_file)
```

### Zotero API Integration

```python
from mkdocs_bibtex.utils import tempfile_from_zotero_url

# Download from Zotero group library
zotero_url = "https://api.zotero.org/groups/12345/items?format=bibtex"
temp_file = tempfile_from_zotero_url("Zotero library", zotero_url, ".bib")

print(f"Downloaded {temp_file}")
# Automatically handles pagination and downloads all items
```

### URL Sanitization

```python
from mkdocs_bibtex.utils import sanitize_zotero_query

# Original URL with custom parameters
original_url = "https://api.zotero.org/groups/12345/items?tag=research"

# Sanitize for mkdocs-bibtex requirements
sanitized_url = sanitize_zotero_query(original_url)
print(sanitized_url)
# https://api.zotero.org/groups/12345/items?tag=research&format=bibtex&limit=100
```

### Path Resolution

```python
from mkdocs_bibtex.utils import get_path_relative_to_mkdocs_yaml

# mkdocs_config is the MkDocsConfig object for the current build
# Resolve relative path
relative_path = "bibliography/refs.bib"
absolute_path = get_path_relative_to_mkdocs_yaml(relative_path, mkdocs_config)

print(f"Resolved path: {absolute_path}")
# /project/docs/bibliography/refs.bib (example)
```

### Plugin Integration Example

```python
from mkdocs_bibtex.utils import tempfile_from_url, get_path_relative_to_mkdocs_yaml
import validators

def load_bibliography_file(bib_file_config, mkdocs_config):
    """Load bibliography file from local path or URL."""

    if validators.url(bib_file_config):
        # Remote file - download to temporary location
        return tempfile_from_url("bib file", bib_file_config, ".bib")
    else:
        # Local file - resolve relative to mkdocs.yml
        return get_path_relative_to_mkdocs_yaml(bib_file_config, mkdocs_config)

# Usage in plugin
bib_file_path = load_bibliography_file(config.bib_file, mkdocs_config)
```

## Error Handling

### Network Error Handling

The download functions retry failed requests and raise `RuntimeError` when a download cannot be completed:

```python
import requests

def download_with_retries(url, max_retries=3):
    """Example of retry logic used in the utility functions."""

    for attempt in range(max_retries):
        try:
            response = requests.get(url)
            if response.status_code == 200:
                return response.text
            else:
                # Non-200 responses are not retried; the error propagates
                raise RuntimeError(f"HTTP {response.status_code}")
        except requests.exceptions.RequestException:
            # Network-level failure: retry unless this was the last attempt
            if attempt == max_retries - 1:
                raise RuntimeError(f"Failed to download after {max_retries} attempts")
            continue
```

### File System Error Handling

```python
import os

from mkdocs_bibtex.utils import get_path_relative_to_mkdocs_yaml, log

def safe_path_resolution(path, config):
    """Safe path resolution with error handling."""

    try:
        resolved_path = get_path_relative_to_mkdocs_yaml(path, config)
        if not os.path.exists(resolved_path):
            raise FileNotFoundError(f"File not found: {resolved_path}")
        return resolved_path
    except Exception as e:
        # Log through the plugin's logger, then re-raise for the caller
        log.error(f"Path resolution failed for {path}: {e}")
        raise
```
241
242
## URL Processing Details
243
244
### Zotero API Pagination
245
246
The Zotero integration handles API pagination automatically:
247
248
1. **Initial Request**: Makes first request to provided URL
249
2. **Link Header Processing**: Checks for "next" link in response headers
250
3. **Pagination Loop**: Continues until no more pages or max pages reached
251
4. **Content Aggregation**: Combines all pages into single BibTeX file
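
The four steps above can be sketched as a simple loop. This is an illustrative sketch rather than the plugin's code: `fetch_page` is a hypothetical callable standing in for an HTTP request, returning the page body and the URL of the "next" link (or `None`), so the example runs without network access. The cap of 999 pages is taken from the function description earlier in this document.

```python
from typing import Callable, Optional, Tuple

# Hypothetical fetcher type: returns (page text, next page URL or None)
FetchPage = Callable[[str], Tuple[str, Optional[str]]]

MAX_PAGES = 999  # page cap from the function description


def collect_all_pages(url: str, fetch_page: FetchPage, max_pages: int = MAX_PAGES) -> str:
    """Follow "next" links starting at `url`, concatenating every page body."""
    contents = []
    next_url: Optional[str] = url
    for _ in range(max_pages):
        if next_url is None:
            break  # no "next" link: every page has been read
        text, next_url = fetch_page(next_url)
        contents.append(text)
    return "".join(contents)


# Fake three-page library keyed by URL, used in place of real HTTP responses
FAKE_PAGES = {
    "page1": ("@book{a}\n", "page2"),
    "page2": ("@book{b}\n", "page3"),
    "page3": ("@book{c}\n", None),
}

combined = collect_all_pages("page1", FAKE_PAGES.__getitem__)
print(combined)
```

When the `requests` library is used for the real call, the next URL is typically available as `response.links["next"]["url"]`, which `requests` parses from the `Link` response header.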

### Query Parameter Management

The `sanitize_zotero_query` function ensures proper API parameters:

- **format=bibtex**: Ensures response is in BibTeX format
- **limit=100**: Maximum items per page for efficiency
- **Preservation**: Keeps existing parameters like filters and tags
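
The override-and-preserve behaviour listed above can be reproduced with `urllib.parse`. The sketch below is one way to do it, under the assumption that each parameter appears at most once; `force_bibtex_params` is a hypothetical name, not part of the plugin.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def force_bibtex_params(url: str) -> str:
    """Return `url` with format=bibtex and limit=100, keeping other params."""
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query))
    # update() overwrites any user-supplied format/limit values
    params.update({"format": "bibtex", "limit": "100"})
    return urlunsplit(parts._replace(query=urlencode(params)))


print(force_bibtex_params("https://api.zotero.org/groups/12345/items?tag=research&limit=25"))
# https://api.zotero.org/groups/12345/items?tag=research&limit=100&format=bibtex
```

Converting the parsed pairs to a `dict` keeps the sketch short, but it collapses repeated keys and `parse_qsl` drops blank values by default, so a URL that relies on either would need a list-based merge instead.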

## Performance Considerations

### Caching Strategy

- **Temporary Files**: Downloaded files persist for the build session
- **No Automatic Cleanup**: Relies on OS temporary file cleanup
- **Single Download**: Each URL downloaded once per build

### Memory Management

- **Streaming Downloads**: Large files handled efficiently
- **UTF-8 Encoding**: Consistent encoding for international content
- **Pagination Buffer**: Zotero content accumulated in memory then written
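
The accumulate-then-write pattern and the UTF-8 handling can be illustrated with `tempfile.NamedTemporaryFile`. This is a minimal sketch under the assumption that all pages fit in memory; `write_pages_to_tempfile` is a hypothetical helper, not a function exported by the plugin.

```python
import tempfile


def write_pages_to_tempfile(pages: list, suffix: str = ".bib") -> str:
    """Join page bodies in memory and write them to one UTF-8 temporary file."""
    contents = "".join(pages)  # the whole library is held in memory here
    # delete=False keeps the file on disk after it is closed
    with tempfile.NamedTemporaryFile(
        mode="wt", encoding="utf-8", suffix=suffix, delete=False
    ) as file:
        file.write(contents)
    return file.name


path = write_pages_to_tempfile(["@book{müller2020}\n", "@article{岡田2021}\n"])
with open(path, encoding="utf-8") as handle:
    print(handle.read())
```

Passing `encoding="utf-8"` explicitly avoids depending on the platform's default encoding, which matters for entries containing non-ASCII author names.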

### Network Optimization

- **Retry Logic**: Handles transient network failures
- **Connection Reuse**: Uses requests library connection pooling
- **Timeout Handling**: Built-in timeout management via requests

## Logging Integration

All utility functions log through the shared `log` instance:

```python
from mkdocs_bibtex.utils import log

# Placeholder values so the snippet runs on its own
name, url, page_num = "bib file", "https://example.com/references.bib", 999

# Usage patterns in utility functions
log.debug(f"Downloading {name} from URL {url}")
log.info(f"{name} downloaded to temporary file")
log.warning(f"Exceeded maximum pages. Found: {page_num} pages")
log.error(f"Failed to download: {url}")
```

This provides detailed debugging information when MkDocs is run in verbose mode (`mkdocs build -v`).
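
When the utilities are called from a script or a test rather than through `mkdocs build -v`, the same messages can be surfaced by configuring the logger directly. The sketch below uses only the standard `logging` module and the logger name given earlier in this document; the in-memory handler and the format string are illustrative choices.

```python
import io
import logging

# Logger name as documented for the plugin
plugin_log = logging.getLogger("mkdocs.plugins.mkdocs-bibtex")
plugin_log.setLevel(logging.DEBUG)

# Capture records in memory; StreamHandler() with no argument writes to stderr
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(levelname)s - %(name)s: %(message)s"))
plugin_log.addHandler(handler)

plugin_log.debug("Downloading bib file from URL https://example.com/references.bib")
print(buffer.getvalue())
```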