
tessl/pypi-docformatter

Formats docstrings to follow PEP 257 conventions with support for various docstring styles and Black formatter compatibility



Syntax Analysis

Advanced docstring syntax analysis and formatting for field lists, code blocks, URLs, and various docstring styles including Sphinx, Epytext, Google, and NumPy formats. This module handles the complex parsing and formatting rules that make docformatter compatible with different documentation conventions.

Capabilities

Regular Expression Patterns

Constants defining patterns for various docstring elements.

DEFAULT_INDENT = 4

# Sphinx field patterns (defined first because SPHINX_REGEX interpolates them)
SPHINX_FIELD_PATTERNS = (
    "arg|cvar|except|ivar|key|meta|param|raise|return|rtype|type|var|yield"
)

# URL scheme patterns (defined first because URL_REGEX interpolates them)
URL_PATTERNS = (
    "afp|apt|bitcoin|chrome|cvs|dav|dns|file|finger|fish|ftp|ftps|git|"
    "http|https|imap|ipp|ldap|mailto|news|nfs|nntp|pop|rsync|rtsp|sftp|"
    "smb|smtp|ssh|svn|tcp|telnet|tftp|udp|vnc|ws|wss"
)

# Field list patterns
ALEMBIC_REGEX = r"^ *[a-zA-Z0-9_\- ]*: "
BULLET_REGEX = r"\s*[*\-+] [\S ]+"
ENUM_REGEX = r"\s*\d\."
EPYTEXT_REGEX = r"@[a-zA-Z0-9_\-\s]+:"
GOOGLE_REGEX = r"^ *[a-zA-Z0-9_\- ]*:$"
NUMPY_REGEX = r"^\s[a-zA-Z0-9_\- ]+ ?: [\S ]+"
OPTION_REGEX = r"^-{1,2}[\S ]+ {2}\S+"
# The rf prefix is needed so that {SPHINX_FIELD_PATTERNS} is substituted.
SPHINX_REGEX = rf":({SPHINX_FIELD_PATTERNS})[a-zA-Z0-9_\-.() ]*:"

# Content patterns
LITERAL_REGEX = r"[\S ]*::"
REST_REGEX = r"((\.{2}|`{2}) ?[\w.~-]+(:{2}|`{2})?[\w ]*?|`[\w.~]+`)"
# The rf prefix is needed so that {URL_PATTERNS} is substituted.
URL_REGEX = rf"({URL_PATTERNS})://[^\s]+"
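
As a rough illustration of how these patterns behave, the sketch below copies a few of them locally and applies them to sample lines with `re.match`. The sample lines and the `results` dictionary are made up for this example, and the sketch does not import docformatter itself.

```python
import re

# Local copies of a few patterns listed above, so the sketch runs on its own.
SPHINX_FIELD_PATTERNS = (
    "arg|cvar|except|ivar|key|meta|param|raise|return|rtype|type|var|yield"
)
SPHINX_REGEX = rf":({SPHINX_FIELD_PATTERNS})[a-zA-Z0-9_\-.() ]*:"
EPYTEXT_REGEX = r"@[a-zA-Z0-9_\-\s]+:"
GOOGLE_REGEX = r"^ *[a-zA-Z0-9_\- ]*:$"
BULLET_REGEX = r"\s*[*\-+] [\S ]+"
ENUM_REGEX = r"\s*\d\."

# Hypothetical sample lines, each paired with the pattern it is tested against.
samples = {
    ":param name: The name to use.": SPHINX_REGEX,
    "@param name: The name to use.": EPYTEXT_REGEX,
    "Args:": GOOGLE_REGEX,
    "- first item": BULLET_REGEX,
    "1. first step": ENUM_REGEX,
    "Plain sentence text.": SPHINX_REGEX,
}

# re.match anchors at the start of the line, which suits field markers.
results = {line: bool(re.match(pattern, line)) for line, pattern in samples.items()}
for line, matched in results.items():
    print(f"{line!r}: {matched}")
```

Every sample except the plain sentence matches its pattern.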

Field List Detection

Functions for detecting and analyzing field lists in docstrings.

def do_find_field_lists(text: str, style: str) -> List[Tuple[int, int]]:
    """
    Find field list positions in text.
    
    Args:
        text (str): Text to search for field lists
        style (str): Field list style ('sphinx', 'epytext', 'google', 'numpy')
        
    Returns:
        List[Tuple[int, int]]: List of (start_pos, end_pos) tuples for field lists
    """

def is_some_sort_of_field_list(line: str, style: str) -> bool:
    """
    Determine if line contains a field list.
    
    Args:
        line (str): Line to check
        style (str): Field list style to check against
        
    Returns:
        bool: True if line contains a field list
    """

List Detection

Functions for detecting various types of lists in docstrings.

def is_some_sort_of_list(text, strict=True):
    """
    Determine if text contains any type of list.
    
    Detects bullet lists, enumerated lists, field lists, option lists,
    and other structured content that should not be reflowed.
    
    Args:
        text: Text to analyze
        strict (bool): Whether to use strict reST syntax checking
        
    Returns:
        bool: True if text contains list-like content
    """

Code Block Detection

Functions for identifying code blocks and literal content.

def is_some_sort_of_code(text: str) -> bool:
    """
    Determine if text contains code or literal blocks.
    
    Args:
        text (str): Text to analyze
        
    Returns:
        bool: True if text appears to contain code
    """

URL and Link Processing

Functions for handling URLs and links in docstrings.

def do_find_links(text: str) -> List[Tuple[int, int]]:
    """
    Find link positions in text.
    
    Args:
        text (str): Text to search for links
        
    Returns:
        List[Tuple[int, int]]: List of (start_pos, end_pos) tuples for links
    """

def do_skip_link(text: str, index: Tuple[int, int]) -> bool:
    """
    Determine if link should be skipped during wrapping.
    
    Args:
        text (str): Text containing the link
        index (Tuple[int, int]): Link position (start, end)
        
    Returns:
        bool: True if link should not be wrapped
    """

def do_clean_url(url: str, indentation: str) -> str:
    """
    Clean and format URL for proper display.
    
    Args:
        url (str): URL to clean
        indentation (str): Indentation to apply
        
    Returns:
        str: Cleaned and formatted URL
    """

Text Wrapping and Formatting

Core text wrapping functions with syntax awareness.

def wrap_description(text, indentation, wrap_length, force_wrap, strict, 
                    rest_sections, style="sphinx"):
    """
    Wrap description text while preserving syntax elements.
    
    Args:
        text: Text to wrap
        indentation: Base indentation string
        wrap_length (int): Maximum line length
        force_wrap (bool): Force wrapping even if messy
        strict (bool): Whether to use strict reST syntax checking
        rest_sections: Regular expression for reST section adornments
        style (str): Docstring style for field list handling (default: "sphinx")
        
    Returns:
        str: Wrapped text with syntax preservation
    """

def wrap_summary(summary, initial_indent, subsequent_indent, wrap_length):
    """
    Wrap summary text with proper indentation.
    
    Args:
        summary: Summary text to wrap
        initial_indent: Indentation for first line
        subsequent_indent: Indentation for continuation lines
        wrap_length (int): Maximum line length
        
    Returns:
        str: Wrapped summary text
    """

Field List Wrapping

Specialized wrapping for field lists.

def do_wrap_field_lists(text: str, field_idx: List[Tuple[int, int]], 
                       lines: List[str], text_idx: int, indentation: str, 
                       wrap_length: int) -> Tuple[List[str], int]:
    """
    Wrap field lists in the long description.
    
    Args:
        text (str): The long description text
        field_idx (List[Tuple[int, int]]): List of field list indices in description
        lines (List[str]): List of text lines
        text_idx (int): Current text index
        indentation (str): Base indentation string
        wrap_length (int): Maximum line length
        
    Returns:
        Tuple[List[str], int]: Wrapped lines and updated text index
    """

URL Wrapping

Specialized wrapping for URLs and links.

def do_wrap_urls(text: str, url_idx: Iterable, text_idx: int, 
                indentation: str, wrap_length: int) -> Tuple[List[str], int]:
    """
    Wrap URLs in the long description.
    
    Args:
        text (str): The long description text
        url_idx (Iterable): List of URL indices found in the description text
        text_idx (int): Current text index
        indentation (str): Base indentation string
        wrap_length (int): Maximum line length
        
    Returns:
        Tuple[List[str], int]: Wrapped lines and updated text index
    """

Text Transformation

Utility functions for text transformation.

def reindent(text, indentation):
    """
    Apply indentation to text lines.
    
    Args:
        text: Text to reindent
        indentation: Indentation string to apply
        
    Returns:
        str: Reindented text
    """

def remove_section_header(text):
    """
    Remove section headers from text.
    
    Args:
        text: Text potentially containing section headers
        
    Returns:
        str: Text with section headers removed
    """

def strip_leading_blank_lines(text):
    """
    Remove leading blank lines from text.
    
    Args:
        text: Text to process
        
    Returns:
        str: Text without leading blank lines
    """

def unwrap_summary(summary):
    """
    Remove line breaks from summary text.
    
    Args:
        summary: Summary text to unwrap
        
    Returns:
        str: Summary as single line
    """

Description Processing

Functions for processing description content.

def description_to_list(text, indentation, wrap_length, force_wrap, tab_width, style):
    """
    Convert description text to properly formatted list.
    
    Args:
        text: Description text
        indentation: Base indentation
        wrap_length (int): Maximum line length
        force_wrap (bool): Force wrapping mode
        tab_width (int): Tab width
        style (str): Docstring style
        
    Returns:
        List[str]: Formatted description lines
    """

def do_split_description(text, indentation, wrap_length, force_wrap, tab_width, style):
    """
    Split and format description text.
    
    Args:
        text: Description text to split
        indentation: Base indentation
        wrap_length (int): Maximum line length
        force_wrap (bool): Force wrapping mode
        tab_width (int): Tab width
        style (str): Docstring style
        
    Returns:
        str: Split and formatted description
    """

Directive Detection

Functions for detecting reStructuredText directives.

def do_find_directives(text: str) -> bool:
    """
    Find reStructuredText directives in text.
    
    Args:
        text (str): Text to search
        
    Returns:
        bool: True if text contains reST directives
    """

Usage Examples

Field List Detection and Processing

from docformatter import do_find_field_lists, is_some_sort_of_field_list

# Sphinx-style field list
sphinx_text = """
:param param1: First parameter
:type param1: str
:param param2: Second parameter
:type param2: int
:returns: Success status
:rtype: bool
"""

# Find field lists
field_positions = do_find_field_lists(sphinx_text, style="sphinx")
print(f"Found {len(field_positions)} field lists")

# Check individual lines
lines = sphinx_text.strip().split('\n')
for line in lines:
    is_field = is_some_sort_of_field_list(line, style="sphinx")
    print(f"'{line.strip()}' -> {is_field}")

List Detection

from docformatter import is_some_sort_of_list

# Test various list types
test_texts = [
    "- Bullet point item",
    "1. Enumerated item", 
    ":param name: Parameter description",
    "@param name: Epytext parameter",
    "Regular paragraph text",
    "    * Indented bullet",
    "Args:",
    "    argument (str): Description"
]

for text in test_texts:
    is_list = is_some_sort_of_list(text)
    print(f"'{text}' -> {is_list}")

URL and Link Processing

from docformatter import do_find_links, do_clean_url

# Text with URLs
text_with_urls = """
See https://example.com for details.
Also check http://docs.python.org/library/re.html
for regular expression documentation.
"""

# Find links
links = do_find_links(text_with_urls)
print(f"Found {len(links)} links")

for start, end in links:
    url = text_with_urls[start:end]
    cleaned = do_clean_url(url, "    ")
    print(f"Original: {url}")
    print(f"Cleaned: {cleaned}")

Text Wrapping with Syntax Awareness

from docformatter import wrap_description

# Description with field lists
description = """
This function processes data according to parameters.

Args:
    data (list): Input data to process
    options (dict): Processing options including:
        - timeout: Maximum processing time
        - format: Output format ('json' or 'xml')

Returns:
    dict: Processing results with metadata

Raises:
    ValueError: If data format is invalid
    TimeoutError: If processing exceeds timeout
"""

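# Wrap while preserving field lists. The arguments follow the signature
# documented above; the rest_sections pattern is an assumed value that
# matches runs of four or more section adornment characters.
wrapped = wrap_description(
    description,
    indentation="    ",
    wrap_length=72,
    force_wrap=False,
    strict=False,
    rest_sections=r"[=\-`:'\"~^_*+#<>]{4,}",
    style="sphinx"
)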

print("Wrapped description:")
print(wrapped)

Code Block Detection

from docformatter import is_some_sort_of_code

# Test code detection
code_examples = [
    "def function():\n    pass",
    ">>> print('hello')\nhello",
    ".. code-block:: python\n\n    import os",
    "    if condition::\n        do_something()",
    "Regular text without code",
    "    Indented text block::\n        Code follows"
]

for example in code_examples:
    is_code = is_some_sort_of_code(example)
    print(f"Code detected: {is_code}")
    print(f"Text: {repr(example[:50])}...")
    print()

Field List Wrapping

from docformatter import do_find_field_lists, do_wrap_field_lists

# Long field list descriptions
field_text = """
:param very_long_parameter_name: This is a very long parameter description that should be wrapped properly while maintaining the field list format and indentation structure.
:type very_long_parameter_name: str
:returns: A very long return description that explains what this function returns and provides detailed information about the return value format and structure.
:rtype: dict
"""

# Locate the field lists, then wrap them. Arguments and return shapes follow
# the signatures documented above; verify them against the installed release.
field_idx = do_find_field_lists(field_text, style="sphinx")
wrapped_lines, text_idx = do_wrap_field_lists(
    field_text,
    field_idx,
    lines=[],
    text_idx=0,
    indentation="",
    wrap_length=72,
)

print("Wrapped field lists:")
print("\n".join(wrapped_lines))

Complex Syntax Processing

from docformatter import (
    wrap_description,
    do_find_field_lists,
    do_find_links,
    is_some_sort_of_code
)

def analyze_docstring_syntax(text, style="sphinx"):
    """Comprehensive syntax analysis of docstring."""
    # do_find_field_lists requires a style; run each scan once and reuse it.
    field_lists = do_find_field_lists(text, style)
    links = do_find_links(text)
    analysis = {
        'has_field_lists': bool(field_lists),
        'has_links': bool(links),
        'has_code': is_some_sort_of_code(text),
        'field_list_positions': field_lists,
        'link_positions': links
    }

    return analysis

# Example complex docstring
complex_docstring = """
Process data with advanced options.

This function handles data processing with support for various
formats. See https://example.com/docs for details.

Args:
    data (list): Input data
    options (dict): Configuration options

Example:
    >>> process_data([1, 2, 3], {'format': 'json'})
    {'result': [1, 2, 3], 'format': 'json'}

Returns:
    dict: Processed results
"""

analysis = analyze_docstring_syntax(complex_docstring)
for key, value in analysis.items():
    print(f"{key}: {value}")

Style Support

The syntax analysis module supports multiple docstring styles:

Sphinx Style (Default)

"""
Function description.

:param name: Parameter description
:type name: str
:returns: Return description
:rtype: bool
:raises ValueError: Error condition
"""

Epytext Style

"""
Function description.

@param name: Parameter description
@type name: str
@return: Return description
@rtype: bool
@raise ValueError: Error condition
"""

Google Style

"""
Function description.

Args:
    name (str): Parameter description

Returns:
    bool: Return description

Raises:
    ValueError: Error condition
"""

NumPy Style

"""
Function description.

Parameters
----------
name : str
    Parameter description

Returns
-------
bool
    Return description

Raises
------
ValueError
    Error condition
"""

Integration with Formatting

The syntax analysis functions integrate with the core formatting engine to:

  • Preserve Structure: Maintain field list formatting during text wrapping
  • Handle Code Blocks: Avoid reflowing code examples and literal blocks
  • Process URLs: Handle long URLs appropriately during line wrapping
  • Support Styles: Apply style-specific formatting rules
  • Maintain Indentation: Preserve relative indentation in complex structures
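
The first two points can be sketched as a wrapper that reflows ordinary paragraphs but passes list-like blocks through untouched. This is an illustration only: `wrap_preserving_lists` is a hypothetical helper, it treats blank lines as block separators, and it recognizes only bullet and enumerated items.

```python
import re
import textwrap

# Patterns copied from the constants listed earlier.
BULLET_REGEX = r"\s*[*\-+] [\S ]+"
ENUM_REGEX = r"\s*\d\."


def wrap_preserving_lists(text, wrap_length):
    """Reflow plain paragraphs; leave blocks containing list items as-is."""
    wrapped_blocks = []
    for block in text.strip().split("\n\n"):
        is_list = any(
            re.match(BULLET_REGEX, line) or re.match(ENUM_REGEX, line)
            for line in block.splitlines()
        )
        if is_list:
            wrapped_blocks.append(block)
        else:
            paragraph = " ".join(block.split())
            wrapped_blocks.append(textwrap.fill(paragraph, width=wrap_length))
    return "\n\n".join(wrapped_blocks)


text = (
    "This paragraph is long enough that it needs to be wrapped onto several lines.\n"
    "\n"
    "- keep this bullet exactly as written, even though it is long\n"
    "- and this one"
)
result = wrap_preserving_lists(text, 40)
print(result)
```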

Error Handling

Syntax analysis functions handle various edge cases:

  • Malformed Field Lists: Graceful handling of incomplete or malformed field syntax
  • Mixed Styles: Detection and handling of multiple docstring styles in one docstring
  • Complex Nesting: Proper handling of nested lists and field structures
  • Edge Cases: Robust handling of unusual formatting patterns
  • Unicode Content: Full Unicode support for international documentation

Install with Tessl CLI

npx tessl i tessl/pypi-docformatter
