# HTML and Markdown Processing

Text processing utilities for converting docstrings, handling multiple documentation formats, generating HTML output, and processing mathematical expressions with cross-linking support.

## Capabilities

### Docstring Format Conversion

Convert docstrings written in NumPy, Google, or reStructuredText style to HTML or Markdown.

```python { .api }
def to_html(text, docformat=None, module=None, link=None, latex_math=False) -> str:
    """
    Convert docstring text to HTML with formatting and cross-linking.

    Parameters:
    - text: Input docstring text
    - docformat: Documentation format ('numpy', 'google', None for auto-detect)
    - module: Module object for identifier resolution and cross-linking
    - link: Function to generate cross-reference links (takes Doc object, returns str)
    - latex_math: If True, enable LaTeX math processing with MathJax

    Returns:
    str: Formatted HTML with syntax highlighting and cross-links
    """

def to_markdown(text, docformat=None, module=None, link=None) -> str:
    """
    Convert docstring text to Markdown format.

    Parameters:
    - text: Input docstring text
    - docformat: Documentation format ('numpy', 'google', None for auto-detect)
    - module: Module object for identifier resolution
    - link: Function to generate cross-reference links

    Returns:
    str: Formatted Markdown text
    """
```

### HTML and CSS Optimization

Minify and optimize HTML and CSS content for production deployment.

```python { .api }
def minify_html(html: str) -> str:
    """
    Minify HTML by replacing consecutive whitespace with single characters.

    Parameters:
    - html: Input HTML string

    Returns:
    str: Minified HTML (preserves <pre> tag contents)
    """

def minify_css(css: str) -> str:
    """
    Minify CSS by removing whitespace, comments, and trailing semicolons.

    Parameters:
    - css: Input CSS string

    Returns:
    str: Minified CSS string
    """
```
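
To make the described behaviour concrete, here is a minimal standalone sketch built only on the standard `re` module. It is not pdoc's implementation: the names `sketch_minify_html` and `sketch_minify_css` are hypothetical, and the CSS rules are deliberately naive (for example, they also trim whitespace around `:` in selectors).

```python
import re

# Hypothetical helpers for illustration; not pdoc's actual implementation.
# Either a whole <pre>...</pre> block, or a run of two or more whitespace characters.
_PRE_OR_WS = re.compile(r"(<pre\b.*?</pre>)|(\s)\s+", re.DOTALL | re.IGNORECASE)


def sketch_minify_html(html: str) -> str:
    """Collapse whitespace runs to their first character, leaving <pre> blocks intact."""
    def repl(m):
        # Group 1 is a complete <pre> block: return it unchanged.
        # Otherwise keep only the first whitespace character of the run.
        return m.group(1) or m.group(2)
    return _PRE_OR_WS.sub(repl, html)


def sketch_minify_css(css: str) -> str:
    """Strip comments, collapse whitespace, and drop the last semicolon of each rule."""
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.DOTALL)  # remove comments
    css = re.sub(r"\s+", " ", css)                         # collapse whitespace
    css = re.sub(r"\s*([{};:,])\s*", r"\1", css)           # trim around punctuation
    css = css.replace(";}", "}")                           # drop trailing semicolons
    return css.strip()


html_in = "<div>\n\n   <p>a   b</p>\n<pre>x\n\n  y</pre></div>"
css_in = ".doc {\n  margin: 10px;\n  /* c */\n  color: red;\n}\n"
html_out = sketch_minify_html(html_in)
css_out = sketch_minify_css(css_in)
```

Handling `<pre>` in the same pattern as the whitespace runs means preformatted content is matched first and passed through untouched, so no separate protect-and-restore step is needed.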

### Text Processing Utilities

Extract and format text content for summaries, previews, and content organization.

```python { .api }
def glimpse(text: str, max_length=153, paragraph=True) -> str:
    """
    Extract short excerpt from text for previews and summaries.

    Parameters:
    - text: Input text
    - max_length: Maximum excerpt length in characters
    - paragraph: If True, break at paragraph boundaries when possible

    Returns:
    str: Text excerpt, potentially truncated with ellipsis
    """

def extract_toc(text: str) -> str:
    """
    Extract table of contents from Markdown text based on headers.

    Parameters:
    - text: Markdown text with headers

    Returns:
    str: Table of contents as nested list
    """
```
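
The two ideas above, cutting an excerpt at a sensible boundary and collecting headers, can be sketched with the standard library alone. This is an illustrative approximation under stated assumptions, not pdoc's code: `sketch_glimpse` and `sketch_headings` are hypothetical names, and the exact truncation rules of the real `glimpse` may differ.

```python
import re


def sketch_glimpse(text: str, max_length: int = 153, paragraph: bool = True) -> str:
    """Return a short excerpt of `text`, cut at a word boundary when too long."""
    text = text.strip()
    if paragraph:
        # Keep only the first paragraph (text up to the first blank line).
        text = re.split(r"\n\s*\n", text, maxsplit=1)[0]
    text = " ".join(text.split())  # normalise internal whitespace
    if len(text) <= max_length:
        return text
    # Cut at the last space that still fits, leaving room for the ellipsis.
    cut = text.rfind(" ", 0, max_length - 1)
    if cut <= 0:
        cut = max_length - 1
    return text[:cut] + "…"


def sketch_headings(markdown_text: str):
    """Return (level, title) pairs for ATX headings, skipping fenced code blocks."""
    fence = "`" * 3
    headings, in_fence = [], False
    for line in markdown_text.splitlines():
        if line.lstrip().startswith(fence):
            in_fence = not in_fence
            continue
        m = re.match(r"(#{1,6})\s+(.*\S)", line)
        if m and not in_fence:
            headings.append((len(m.group(1)), m.group(2)))
    return headings


excerpt = sketch_glimpse("alpha beta gamma delta", max_length=12)
outline = sketch_headings("# A\n\ntext\n\n## B\n" + "`" * 3 + "\n# not a heading\n" + "`" * 3 + "\n")
```

The `(level, title)` pairs are enough to render a nested list in whatever markup the caller needs.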

### Git Integration

Generate links to source code in version control systems.

```python { .api }
def format_git_link(template: str, dobj: 'Doc') -> Optional[str]:
    """
    Format git repository links using template and documentation object.

    Parameters:
    - template: URL template with placeholders like {path}, {start_line}, {end_line}
    - dobj: Documentation object with source information

    Returns:
    Optional[str]: Formatted URL or None if template cannot be resolved

    Template placeholders:
    - {path}: File path relative to repository root
    - {start_line}: Starting line number of object definition
    - {end_line}: Ending line number of object definition
    """
```
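
The placeholder substitution itself can be illustrated with plain `str.format`. The sketch below is an assumption-laden stand-in, not pdoc's implementation: `sketch_format_link` is a hypothetical name, the path and line numbers are passed in directly instead of being read from a documentation object, and the URL is a made-up example.

```python
from typing import Optional


def sketch_format_link(template: str, path: str,
                       start_line: int, end_line: int) -> Optional[str]:
    """Fill a URL template; return None if it uses an unknown placeholder."""
    try:
        return template.format(path=path, start_line=start_line, end_line=end_line)
    except (KeyError, IndexError):
        # The template referenced a placeholder this sketch does not provide.
        return None


# Hypothetical template and values, for illustration only.
url = sketch_format_link(
    "https://example.com/repo/blob/main/{path}#L{start_line}-L{end_line}",
    "pkg/mod.py", 10, 42,
)
unresolved = sketch_format_link("https://example.com/{missing}", "pkg/mod.py", 10, 42)
```

Returning `None` instead of raising lets a caller simply skip the source link when a template cannot be filled.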

### Docstring Format Processors

Specialized processors for different docstring formats and markup languages.

```python { .api }
class _ToMarkdown:
    @staticmethod
    def numpy(text: str) -> str:
        """
        Convert NumPy-style docstrings to Markdown.

        Parameters:
        - text: NumPy docstring text

        Returns:
        str: Markdown formatted text
        """

    @staticmethod
    def google(text: str) -> str:
        """
        Convert Google-style docstrings to Markdown.

        Parameters:
        - text: Google docstring text

        Returns:
        str: Markdown formatted text
        """

    @staticmethod
    def admonitions(text: str, module: 'Module', limit_types=None) -> str:
        """
        Process reStructuredText directives and admonitions.

        Parameters:
        - text: Text with reST directives
        - module: Module for file path resolution
        - limit_types: Optional tuple of directive types to process

        Returns:
        str: Processed text with directives converted to Markdown
        """

    @staticmethod
    def doctests(text: str) -> str:
        """
        Fence Python doctest blocks for proper syntax highlighting.

        Parameters:
        - text: Text containing doctest examples

        Returns:
        str: Text with doctests wrapped in code fences
        """

    @staticmethod
    def raw_urls(text: str) -> str:
        """
        Wrap raw URLs in angle brackets for Markdown processing.

        Parameters:
        - text: Text containing URLs

        Returns:
        str: Text with URLs properly formatted
        """

    @staticmethod
    def indent(indent: str, text: str, clean_first=False) -> str:
        """
        Add indentation to text lines.

        Parameters:
        - indent: Indentation string to add
        - text: Text to indent
        - clean_first: If True, remove existing indentation first

        Returns:
        str: Indented text
        """
```
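
Two of these transformations, fencing doctests and wrapping raw URLs, are easy to picture with a small standalone sketch. The code below is a simplified approximation and not the `_ToMarkdown` implementation: the function names are hypothetical, the `python-repl` fence label is an assumption, leading indentation of doctest lines is dropped, and the URL pattern is intentionally loose (it would also swallow trailing punctuation).

```python
import re

FENCE = "`" * 3  # three backticks, built this way to keep this example fence-safe


def sketch_fence_doctests(text: str) -> str:
    """Wrap each doctest block (a `>>>` prompt up to the next blank line) in a fence."""
    out, block = [], []
    for line in text.splitlines():
        stripped = line.strip()
        # A block starts at a prompt and continues over non-blank lines.
        if stripped.startswith(">>>") or (block and stripped):
            block.append(stripped)
            continue
        if block:
            out.extend([FENCE + "python-repl", *block, FENCE])
            block = []
        out.append(line)
    if block:
        out.extend([FENCE + "python-repl", *block, FENCE])
    return "\n".join(out)


def sketch_wrap_urls(text: str) -> str:
    """Wrap bare http(s) URLs in angle brackets unless already bracketed or linked."""
    return re.sub(r"(?<![<(\[])\bhttps?://[^\s<>)\]]+", r"<\g<0>>", text)


fenced = sketch_fence_doctests("Example:\n\n>>> 1 + 1\n2\n\nDone.")
wrapped = sketch_wrap_urls("See https://example.com/a for details and <https://x.org>.")
```

The negative lookbehind is what keeps URLs that are already inside `<...>`, `(...)`, or `[...]` from being wrapped a second time.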

### Mathematical Expression Processing

Process LaTeX mathematical expressions for web display.

```python { .api }
class _MathPattern(InlineProcessor):
    """
    Markdown processor for LaTeX math expressions.

    Processes:
    - Inline math: $expression$
    - Display math: $$expression$$

    Converts to MathJax-compatible HTML elements.
    """
```
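
To show how the two delimiter styles listed above can be told apart, here is a minimal regex sketch. It is independent of Python-Markdown and is not the pattern `_MathPattern` uses: `MATH_RE` and `sketch_find_math` are hypothetical names, and treating `\$` as an escaped literal dollar sign is an assumption of this sketch.

```python
import re

# Display math ($$...$$) is tried before inline math ($...$) so that "$$" is
# never misread as the start of an inline expression. A backslash before the
# opening "$" marks a literal dollar sign and is skipped.
MATH_RE = re.compile(
    r"(?<!\\)\$\$(?P<display>.+?)\$\$|(?<!\\)\$(?P<inline>[^$\n]+?)\$",
    re.DOTALL,
)


def sketch_find_math(text: str):
    """Return (kind, expression) pairs for each math span found in `text`."""
    found = []
    for m in MATH_RE.finditer(text):
        if m.group("display") is not None:
            found.append(("display", m.group("display").strip()))
        else:
            found.append(("inline", m.group("inline").strip()))
    return found


spans = sketch_find_math(r"Energy $E = mc^2$ and $$\int_0^1 x\,dx$$ cost \$5")
```

A real inline processor would go one step further and replace each match with an element that MathJax can pick up, instead of just collecting the spans.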

### Warning System

Custom warnings for documentation processing issues.

```python { .api }
class ReferenceWarning(UserWarning):
    """
    Warning raised when Markdown object references don't match documented objects.

    Occurs when cross-references in docstrings cannot be resolved to actual
    documented identifiers.
    """
```
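
Because the warning is a `UserWarning` subclass, it works with the standard `warnings` machinery. The sketch below demonstrates that pattern with a stand-in class so it runs without pdoc installed: `SketchReferenceWarning` and `resolve` are hypothetical, and the message text is invented for the example.

```python
import warnings


class SketchReferenceWarning(UserWarning):
    """Hypothetical stand-in for a custom documentation warning."""


def resolve(name: str, known: set) -> str:
    """Return `name`; emit a warning first if it is not a documented identifier."""
    if name not in known:
        warnings.warn(
            f"Reference {name!r} does not match any documented object",
            SketchReferenceWarning,
            stacklevel=2,
        )
    return name


# Record warnings instead of printing them, then filter by category.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    resolve("pkg.Missing", known={"pkg.Present"})
    resolve("pkg.Present", known={"pkg.Present"})

messages = [
    str(w.message) for w in caught
    if issubclass(w.category, SketchReferenceWarning)
]
```

The same category can be passed to `warnings.simplefilter("error", ...)` to turn unresolved references into hard failures, for example in a CI documentation build.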

### Markdown Configuration

Pre-configured Markdown processor with extensions for documentation.

```python { .api }
_md: markdown.Markdown  # Configured with extensions for code, tables, etc.
```

## Usage Examples

### Basic HTML Generation

```python
from pdoc.html_helpers import to_html, to_markdown
import pdoc

# Convert docstring to HTML
module = pdoc.Module('mymodule')
docstring = """
This is a function that does something important.

Args:
    x (int): The input value
    y (str): The string parameter

Returns:
    bool: True if successful
"""

# The docstring above is Google-style, so select the Google format
html_output = to_html(docstring, docformat='google', module=module)
markdown_output = to_markdown(docstring, docformat='google')
```

### Advanced Cross-Linking

```python
from pdoc.html_helpers import to_html
import pdoc

def create_link_function(base_url):
    """Create a link function for cross-references."""
    def link_func(doc_obj):
        return f"{base_url}/{doc_obj.url()}"
    return link_func

module = pdoc.Module('mypackage')
pdoc.link_inheritance()

# Convert with cross-linking
link_fn = create_link_function('https://docs.myproject.com')
html_doc = to_html(
    module.docstring,
    module=module,
    link=link_fn,
    latex_math=True
)
```

### Content Optimization

```python
import pdoc
from pdoc.html_helpers import minify_html, minify_css, glimpse

# Generate documentation
module = pdoc.Module('mymodule')
html_doc = pdoc.html('mymodule')

# Optimize for production
minified_html = minify_html(html_doc)

# Extract preview text from the module docstring
preview = glimpse(module.docstring, max_length=200)

# Process CSS
css_content = """
.doc {
    margin: 10px;
    padding: 20px;
    /* This is a comment */
    background-color: #ffffff;
}
"""
minified_css = minify_css(css_content)
```

### Git Integration Setup

```python
from pdoc.html_helpers import format_git_link
import pdoc

# Configure git link template
git_template = "https://github.com/user/repo/blob/main/{path}#L{start_line}-L{end_line}"

module = pdoc.Module('mypackage')

# Generate git links for all functions
for name, obj in module.doc.items():
    if isinstance(obj, pdoc.Function):
        git_url = format_git_link(git_template, obj)
        if git_url:
            print(f"{obj.name}: {git_url}")
```

### Custom Docstring Processing

```python
from pdoc.html_helpers import _ToMarkdown

# Process different docstring formats
numpy_docstring = """
Brief description.

Parameters
----------
x : int
    Parameter description
y : str, optional
    Another parameter

Returns
-------
bool
    Return description
"""

google_docstring = """
Brief description.

Args:
    x (int): Parameter description
    y (str, optional): Another parameter

Returns:
    bool: Return description
"""

# Convert to Markdown
numpy_md = _ToMarkdown.numpy(numpy_docstring)
google_md = _ToMarkdown.google(google_docstring)

# Process doctests
doctest_text = """
Example usage:

>>> x = 5
>>> result = my_function(x)
>>> print(result)
True
"""

processed_doctests = _ToMarkdown.doctests(doctest_text)
```

### Table of Contents Generation

```python
from pdoc.html_helpers import extract_toc

markdown_text = """
# Main Title

Some introduction text.

## Section 1

Content for section 1.

### Subsection 1.1

More detailed content.

## Section 2

Content for section 2.
"""

toc = extract_toc(markdown_text)
print(toc)
# Outline of the resulting table of contents (exact markup may differ):
# - [Main Title](#main-title)
#   - [Section 1](#section-1)
#     - [Subsection 1.1](#subsection-1-1)
#   - [Section 2](#section-2)
```

### Error Handling and Warnings

```python
from pdoc.html_helpers import to_html, ReferenceWarning
import warnings
import pdoc

# Capture reference warnings
with warnings.catch_warnings(record=True) as w:
    warnings.simplefilter("always")

    docstring_with_refs = """
    This function uses `nonexistent.Class` and `undefined_function()`.
    """

    module = pdoc.Module('mymodule')
    html_result = to_html(docstring_with_refs, module=module)

    # Check for reference warnings
    ref_warnings = [warning for warning in w if issubclass(warning.category, ReferenceWarning)]
    for warning in ref_warnings:
        print(f"Reference warning: {warning.message}")
```