# Syntax Analysis

Advanced docstring syntax analysis and formatting for field lists, code blocks, URLs, and docstring styles including Sphinx, Epytext, Google, and NumPy. This module implements the parsing and formatting rules that keep docformatter compatible with different documentation conventions.

## Capabilities

### Regular Expression Patterns

Constants defining the patterns for various docstring elements.

```python { .api }
DEFAULT_INDENT = 4

# Sphinx field names (defined first because SPHINX_REGEX interpolates them)
SPHINX_FIELD_PATTERNS = (
    "arg|cvar|except|ivar|key|meta|param|raise|return|rtype|type|var|yield"
)

# URL schemes (defined first because URL_REGEX interpolates them)
URL_PATTERNS = (
    "afp|apt|bitcoin|chrome|cvs|dav|dns|file|finger|fish|ftp|ftps|git|"
    "http|https|imap|ipp|ldap|mailto|news|nfs|nntp|pop|rsync|rtsp|sftp|"
    "smb|smtp|ssh|svn|tcp|telnet|tftp|udp|vnc|ws|wss"
)

# Field list patterns
ALEMBIC_REGEX = r"^ *[a-zA-Z0-9_\- ]*: "
BULLET_REGEX = r"\s*[*\-+] [\S ]+"
ENUM_REGEX = r"\s*\d\."
EPYTEXT_REGEX = r"@[a-zA-Z0-9_\-\s]+:"
GOOGLE_REGEX = r"^ *[a-zA-Z0-9_\- ]*:$"
NUMPY_REGEX = r"^\s[a-zA-Z0-9_\- ]+ ?: [\S ]+"
OPTION_REGEX = r"^-{1,2}[\S ]+ {2}\S+"
SPHINX_REGEX = rf":({SPHINX_FIELD_PATTERNS})[a-zA-Z0-9_\-.() ]*:"

# Content patterns
LITERAL_REGEX = r"[\S ]*::"
REST_REGEX = r"((\.{2}|`{2}) ?[\w.~-]+(:{2}|`{2})?[\w ]*?|`[\w.~]+`)"
URL_REGEX = rf"({URL_PATTERNS})://[^\s]+"
```
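
To make the list-related patterns concrete, the following sketch applies four of the constants above to individual lines using only the standard `re` module. The helper `classify_line` is hypothetical and is not part of docformatter; it only illustrates which kinds of lines each pattern is meant to catch.

```python
import re

# Patterns copied from the constants listed above.
BULLET_REGEX = r"\s*[*\-+] [\S ]+"
ENUM_REGEX = r"\s*\d\."
OPTION_REGEX = r"^-{1,2}[\S ]+ {2}\S+"
LITERAL_REGEX = r"[\S ]*::"


def classify_line(line: str) -> str:
    """Return a rough label for one docstring line (hypothetical helper)."""
    # re.match anchors at the start of the line, so each pattern is
    # tested against the line's leading characters.
    if re.match(OPTION_REGEX, line):
        return "option"
    if re.match(BULLET_REGEX, line):
        return "bullet"
    if re.match(ENUM_REGEX, line):
        return "enumerated"
    if re.match(LITERAL_REGEX, line):
        return "literal"
    return "text"


for sample in [
    "- Bullet point item",
    "1. Enumerated item",
    "--verbose  Print more output",
    "Example::",
    "Regular paragraph text",
]:
    print(f"{sample!r} -> {classify_line(sample)}")
```

Note that `ENUM_REGEX` as listed matches a single digit followed by a period, so a line starting with `10.` would fall through to `"text"` in this sketch.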

### Field List Detection

Functions for detecting and analyzing field lists in docstrings.

```python { .api }
def do_find_field_lists(text: str, style: str) -> List[Tuple[int, int]]:
    """
    Find field list positions in text.

    Args:
        text (str): Text to search for field lists
        style (str): Field list style ('sphinx', 'epytext', 'google', 'numpy')

    Returns:
        List[Tuple[int, int]]: List of (start_pos, end_pos) tuples for field lists
    """

def is_some_sort_of_field_list(line: str, style: str) -> bool:
    """
    Determine if line contains a field list.

    Args:
        line (str): Line to check
        style (str): Field list style to check against

    Returns:
        bool: True if line contains a field list
    """
```

### List Detection

Functions for detecting various types of lists in docstrings.

```python { .api }
def is_some_sort_of_list(text, strict=True):
    """
    Determine if text contains any type of list.

    Detects bullet lists, enumerated lists, field lists, option lists,
    and other structured content that should not be reflowed.

    Args:
        text: Text to analyze
        strict (bool): Whether to use strict reST syntax checking

    Returns:
        bool: True if text contains list-like content
    """
```

### Code Block Detection

Functions for identifying code blocks and literal content.

```python { .api }
def is_some_sort_of_code(text: str) -> bool:
    """
    Determine if text contains code or literal blocks.

    Args:
        text (str): Text to analyze

    Returns:
        bool: True if text appears to contain code
    """
```

### URL and Link Processing

Functions for handling URLs and links in docstrings.

```python { .api }
def do_find_links(text: str) -> List[Tuple[int, int]]:
    """
    Find link positions in text.

    Args:
        text (str): Text to search for links

    Returns:
        List[Tuple[int, int]]: List of (start_pos, end_pos) tuples for links
    """

def do_skip_link(text: str, index: Tuple[int, int]) -> bool:
    """
    Determine if link should be skipped during wrapping.

    Args:
        text (str): Text containing the link
        index (Tuple[int, int]): Link position (start, end)

    Returns:
        bool: True if link should not be wrapped
    """

def do_clean_url(url: str, indentation: str) -> str:
    """
    Clean and format URL for proper display.

    Args:
        url (str): URL to clean
        indentation (str): Indentation to apply

    Returns:
        str: Cleaned and formatted URL
    """
```
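
The `(start_pos, end_pos)` convention used for link positions can be illustrated with a short standalone sketch. It assumes a URL pattern of the same shape as `URL_REGEX` but with a shortened scheme list; `find_url_spans` is a hypothetical name and not the library's implementation of `do_find_links`.

```python
import re
from typing import List, Tuple

# Shortened scheme list for illustration only; the documented
# URL_PATTERNS constant contains many more schemes.
URL_SCHEMES = "ftp|http|https|ssh"
URL_REGEX = rf"({URL_SCHEMES})://[^\s]+"


def find_url_spans(text: str) -> List[Tuple[int, int]]:
    """Return (start, end) offsets of every URL-like match in text."""
    return [(m.start(0), m.end(0)) for m in re.finditer(URL_REGEX, text)]


text = "See https://example.com/docs for details and ftp://example.org/file too."
spans = find_url_spans(text)

# Slicing the original text with each span recovers the URL itself.
urls = [text[start:end] for start, end in spans]
print(spans)
print(urls)
```

Because the pattern consumes every non-whitespace character after the scheme, a URL that ends a sentence would also capture the trailing period; a caller relying on a pattern like this may need to trim punctuation afterwards.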

### Text Wrapping and Formatting

Core text wrapping functions with syntax awareness.

```python { .api }
def wrap_description(text, indentation, wrap_length, force_wrap, strict,
                     rest_sections, style="sphinx"):
    """
    Wrap description text while preserving syntax elements.

    Args:
        text: Text to wrap
        indentation: Base indentation string
        wrap_length (int): Maximum line length
        force_wrap (bool): Force wrapping even if messy
        strict (bool): Whether to use strict reST syntax checking
        rest_sections: Regular expression for reST section adornments
        style (str): Docstring style for field list handling (default: "sphinx")

    Returns:
        str: Wrapped text with syntax preservation
    """

def wrap_summary(summary, initial_indent, subsequent_indent, wrap_length):
    """
    Wrap summary text with proper indentation.

    Args:
        summary: Summary text to wrap
        initial_indent: Indentation for first line
        subsequent_indent: Indentation for continuation lines
        wrap_length (int): Maximum line length

    Returns:
        str: Wrapped summary text
    """
```

### Field List Wrapping

Specialized wrapping for field lists.

```python { .api }
def do_wrap_field_lists(text: str, field_idx: List[Tuple[int, int]],
                        lines: List[str], text_idx: int, indentation: str,
                        wrap_length: int) -> Tuple[List[str], int]:
    """
    Wrap field lists in the long description.

    Args:
        text (str): The long description text
        field_idx (List[Tuple[int, int]]): List of field list indices in description
        lines (List[str]): List of text lines
        text_idx (int): Current text index
        indentation (str): Base indentation string
        wrap_length (int): Maximum line length

    Returns:
        Tuple[List[str], int]: Wrapped lines and updated text index
    """
```

### URL Wrapping

Specialized wrapping for URLs and links.

```python { .api }
def do_wrap_urls(text: str, url_idx: Iterable, text_idx: int,
                 indentation: str, wrap_length: int) -> Tuple[List[str], int]:
    """
    Wrap URLs in the long description.

    Args:
        text (str): The long description text
        url_idx (Iterable): List of URL indices found in the description text
        text_idx (int): Current text index
        indentation (str): Base indentation string
        wrap_length (int): Maximum line length

    Returns:
        Tuple[List[str], int]: Wrapped lines and updated text index
    """
```

### Text Transformation

Utility functions for text transformation.

```python { .api }
def reindent(text, indentation):
    """
    Apply indentation to text lines.

    Args:
        text: Text to reindent
        indentation: Indentation string to apply

    Returns:
        str: Reindented text
    """

def remove_section_header(text):
    """
    Remove section headers from text.

    Args:
        text: Text potentially containing section headers

    Returns:
        str: Text with section headers removed
    """

def strip_leading_blank_lines(text):
    """
    Remove leading blank lines from text.

    Args:
        text: Text to process

    Returns:
        str: Text without leading blank lines
    """

def unwrap_summary(summary):
    """
    Remove line breaks from summary text.

    Args:
        summary: Summary text to unwrap

    Returns:
        str: Summary as single line
    """
```

### Description Processing

Functions for processing description content.

```python { .api }
def description_to_list(text, indentation, wrap_length, force_wrap, tab_width, style):
    """
    Convert description text to properly formatted list.

    Args:
        text: Description text
        indentation: Base indentation
        wrap_length (int): Maximum line length
        force_wrap (bool): Force wrapping mode
        tab_width (int): Tab width
        style (str): Docstring style

    Returns:
        List[str]: Formatted description lines
    """

def do_split_description(text, indentation, wrap_length, force_wrap, tab_width, style):
    """
    Split and format description text.

    Args:
        text: Description text to split
        indentation: Base indentation
        wrap_length (int): Maximum line length
        force_wrap (bool): Force wrapping mode
        tab_width (int): Tab width
        style (str): Docstring style

    Returns:
        str: Split and formatted description
    """
```

### Directive Detection

Functions for detecting reStructuredText directives.

```python { .api }
def do_find_directives(text: str) -> bool:
    """
    Find reStructuredText directives in text.

    Args:
        text (str): Text to search

    Returns:
        bool: True if text contains reST directives
    """
```

## Usage Examples

### Field List Detection and Processing

```python
from docformatter import do_find_field_lists, is_some_sort_of_field_list

# Sphinx-style field list
sphinx_text = """
:param param1: First parameter
:type param1: str
:param param2: Second parameter
:type param2: int
:returns: Success status
:rtype: bool
"""

# Find field lists
field_positions = do_find_field_lists(sphinx_text, style="sphinx")
print(f"Found {len(field_positions)} field lists")

# Check individual lines
lines = sphinx_text.strip().split("\n")
for line in lines:
    is_field = is_some_sort_of_field_list(line, style="sphinx")
    print(f"'{line.strip()}' -> {is_field}")
```

### List Detection

```python
from docformatter import is_some_sort_of_list

# Test various list types
test_texts = [
    "- Bullet point item",
    "1. Enumerated item",
    ":param name: Parameter description",
    "@param name: Epytext parameter",
    "Regular paragraph text",
    " * Indented bullet",
    "Args:",
    " argument (str): Description",
]

for text in test_texts:
    is_list = is_some_sort_of_list(text)
    print(f"'{text}' -> {is_list}")
```

### URL and Link Processing

```python
from docformatter import do_find_links, do_clean_url

# Text with URLs
text_with_urls = """
See https://example.com for details.
Also check http://docs.python.org/library/re.html
for regular expression documentation.
"""

# Find links
links = do_find_links(text_with_urls)
print(f"Found {len(links)} links")

for start, end in links:
    url = text_with_urls[start:end]
    cleaned = do_clean_url(url, " ")
    print(f"Original: {url}")
    print(f"Cleaned: {cleaned}")
```

### Text Wrapping with Syntax Awareness

```python
from docformatter import wrap_description

# Description with field lists
description = """
This function processes data according to parameters.

Args:
    data (list): Input data to process
    options (dict): Processing options including:
        - timeout: Maximum processing time
        - format: Output format ('json' or 'xml')

Returns:
    dict: Processing results with metadata

Raises:
    ValueError: If data format is invalid
    TimeoutError: If processing exceeds timeout
"""

# Wrap while preserving field lists.
# The arguments follow the wrap_description signature documented above.
wrapped = wrap_description(
    description,
    indentation=" ",
    wrap_length=72,
    force_wrap=False,
    strict=False,
    # Assumed section adornment pattern; substitute your configured value.
    rest_sections=r"[=\-`:'\"~^_*+#<>]{4,}",
    style="sphinx",
)

print("Wrapped description:")
print(wrapped)
```

### Code Block Detection

```python
from docformatter import is_some_sort_of_code

# Test code detection
code_examples = [
    "def function():\n pass",
    ">>> print('hello')\nhello",
    ".. code-block:: python\n\n import os",
    " if condition::\n do_something()",
    "Regular text without code",
    " Indented text block::\n Code follows",
]

for example in code_examples:
    is_code = is_some_sort_of_code(example)
    print(f"Code detected: {is_code}")
    print(f"Text: {repr(example[:50])}...")
    print()
```

### Field List Wrapping

```python
from docformatter import do_find_field_lists, do_wrap_field_lists

# Long field list descriptions
field_text = """
:param very_long_parameter_name: This is a very long parameter description that should be wrapped properly while maintaining the field list format and indentation structure.
:type very_long_parameter_name: str
:returns: A very long return description that explains what this function returns and provides detailed information about the return value format and structure.
:rtype: dict
"""

# Locate the field lists first; do_wrap_field_lists needs their positions.
# Both calls follow the signatures documented above.
field_idx = do_find_field_lists(field_text, style="sphinx")

# Wrap field lists, starting from the beginning of the text
wrapped_lines, text_idx = do_wrap_field_lists(
    field_text,
    field_idx,
    lines=[],
    text_idx=0,
    indentation="",
    wrap_length=72,
)

print("Wrapped field lists:")
print("\n".join(wrapped_lines))
```

### Complex Syntax Processing

```python
from docformatter import (
    do_find_field_lists,
    do_find_links,
    is_some_sort_of_code,
)

def analyze_docstring_syntax(text, style="sphinx"):
    """Comprehensive syntax analysis of docstring."""
    # Run each search once and reuse the results.
    # do_find_field_lists requires the style argument.
    field_list_positions = do_find_field_lists(text, style)
    link_positions = do_find_links(text)

    analysis = {
        "has_field_lists": bool(field_list_positions),
        "has_links": bool(link_positions),
        "has_code": is_some_sort_of_code(text),
        "field_list_positions": field_list_positions,
        "link_positions": link_positions,
    }

    return analysis

# Example complex docstring
complex_docstring = """
Process data with advanced options.

This function handles data processing with support for various
formats. See https://example.com/docs for details.

Args:
    data (list): Input data
    options (dict): Configuration options

Example:
    >>> process_data([1, 2, 3], {'format': 'json'})
    {'result': [1, 2, 3], 'format': 'json'}

Returns:
    dict: Processed results
"""

analysis = analyze_docstring_syntax(complex_docstring)
for key, value in analysis.items():
    print(f"{key}: {value}")
```

## Style Support

The syntax analysis module supports multiple docstring styles:

### Sphinx Style (Default)

```python
"""
Function description.

:param name: Parameter description
:type name: str
:returns: Return description
:rtype: bool
:raises ValueError: Error condition
"""
```

### Epytext Style

```python
"""
Function description.

@param name: Parameter description
@type name: str
@return: Return description
@rtype: bool
@raise ValueError: Error condition
"""
```

### Google Style

```python
"""
Function description.

Args:
    name (str): Parameter description

Returns:
    bool: Return description

Raises:
    ValueError: Error condition
"""
```

### NumPy Style

```python
"""
Function description.

Parameters
----------
name : str
    Parameter description

Returns
-------
bool
    Return description

Raises
------
ValueError
    Error condition
"""
```

## Integration with Formatting

The syntax analysis functions integrate with the core formatting engine to:

- **Preserve Structure**: Maintain field list formatting during text wrapping
- **Handle Code Blocks**: Avoid reflowing code examples and literal blocks
- **Process URLs**: Handle long URLs appropriately during line wrapping
- **Support Styles**: Apply style-specific formatting rules
- **Maintain Indentation**: Preserve relative indentation in complex structures
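
The idea of preserving structure while wrapping can be sketched with the standard library alone. This is a minimal illustration, assuming a simplified list marker pattern derived from `BULLET_REGEX` and `ENUM_REGEX`; the function `wrap_preserving_lists` is hypothetical and is not how docformatter itself is implemented.

```python
import re
import textwrap
from typing import List

# Simplified list markers, loosely based on BULLET_REGEX and ENUM_REGEX.
LIST_LINE = re.compile(r"\s*([*\-+] |\d\.)")


def wrap_preserving_lists(text: str, width: int = 40) -> str:
    """Reflow prose paragraphs but copy list-like lines through unchanged."""
    out: List[str] = []
    paragraph: List[str] = []

    def flush() -> None:
        # Wrap the accumulated prose lines as a single paragraph.
        if paragraph:
            out.extend(textwrap.wrap(" ".join(paragraph), width=width))
            paragraph.clear()

    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current paragraph.
            flush()
            out.append("")
        elif LIST_LINE.match(line):
            # List items keep their original formatting and indentation.
            flush()
            out.append(line)
        else:
            paragraph.append(line.strip())
    flush()
    return "\n".join(out)


sample = (
    "This sentence is long enough that it must be wrapped onto two lines.\n"
    "\n"
    "- keep this bullet exactly as written, however long it happens to be\n"
    "1. and this one"
)
print(wrap_preserving_lists(sample, width=40))
```

The design choice here is to decide per line whether reflowing is safe and to flush pending prose before any structured line, so list items are never merged into the paragraph that precedes them.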

## Error Handling

Syntax analysis functions handle various edge cases:

- **Malformed Field Lists**: Graceful handling of incomplete or malformed field syntax
- **Mixed Styles**: Detection and handling of multiple docstring styles in one docstring
- **Complex Nesting**: Proper handling of nested lists and field structures
- **Edge Cases**: Robust handling of unusual formatting patterns
- **Unicode Content**: Full Unicode support for international documentation
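
As an illustration of the mixed-style case, the sketch below reports which field list styles appear in a docstring by testing each line against the Sphinx, Epytext, and Google patterns listed under Regular Expression Patterns. The helper `detect_styles` is hypothetical and only approximates the kind of check involved; it is not part of docformatter.

```python
import re
from typing import Set

# Patterns copied from the constants section of this document.
SPHINX_FIELD_PATTERNS = (
    "arg|cvar|except|ivar|key|meta|param|raise|return|rtype|type|var|yield"
)
STYLE_PATTERNS = {
    "sphinx": rf":({SPHINX_FIELD_PATTERNS})[a-zA-Z0-9_\-.() ]*:",
    "epytext": r"@[a-zA-Z0-9_\-\s]+:",
    "google": r"^ *[a-zA-Z0-9_\- ]*:$",
}


def detect_styles(docstring: str) -> Set[str]:
    """Return the set of style names whose pattern matches any line."""
    found: Set[str] = set()
    for line in docstring.splitlines():
        stripped = line.strip()
        for style, pattern in STYLE_PATTERNS.items():
            # re.match anchors at the start of the stripped line.
            if re.match(pattern, stripped):
                found.add(style)
    return found


mixed = """Function description.

:param name: Parameter description
@type name: str

Returns:
    bool: Return description
"""
styles = detect_styles(mixed)
print(sorted(styles))
if len(styles) > 1:
    print("Docstring mixes several field list styles")
```

One caveat of this approach: any prose line that ends in a colon also matches the Google section pattern, so a real check would need more context than a single line.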