tessl/pypi-pygments

A syntax highlighting package that supports over 500 programming languages and text formats with extensive output format options

docs/filter-system.md

Filter System

Token stream filters post-process highlighted code with transformations such as tag highlighting, keyword case conversion, and whitespace visualization. Filters modify token streams after lexical analysis but before formatting.

Capabilities

Filter Discovery

Get filter instances by name.

def get_filter_by_name(filtername: str, **options):
    """
    Get filter instance by name with options.
    
    Parameters:
    - filtername: Filter name (e.g., 'codetagify', 'keywordcase')
    - **options: Filter-specific options
    
    Returns:
    Filter instance configured with the given options
    
    Raises:
    ClassNotFound: If no filter with that name is found
    """
def get_all_filters() -> Iterator[str]:
    """
    Return generator of all available filter names.
    
    Yields:
    Filter name strings for both built-in and plugin filters
    """
def find_filter_class(filtername: str):
    """
    Find filter class by name without instantiation.
    
    Parameters:
    - filtername: Filter name
    
    Returns:
    Filter class or None if not found
    """

Usage example:

from pygments.filters import get_filter_by_name, get_all_filters

# Get specific filter
codetag_filter = get_filter_by_name('codetagify', codetags=['TODO', 'FIXME', 'XXX'])

# List all available filters
print("Available filters:")
for filter_name in get_all_filters():
    print(f"  {filter_name}")
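`find_filter_class` complements the two functions above when you need the class itself, for example to inspect it or instantiate it later. A short sketch:

```python
from pygments.filters import find_filter_class

# Look up a filter class without instantiating it
cls = find_filter_class('keywordcase')
print(cls.__name__)  # KeywordCaseFilter

# Unknown names return None instead of raising ClassNotFound
missing = find_filter_class('no-such-filter')
print(missing)  # None
```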

Filter Application

Apply filters to lexers:

from pygments.lexers import PythonLexer
from pygments.filters import get_filter_by_name

# Create lexer and add filters
lexer = PythonLexer()
lexer.add_filter(get_filter_by_name('codetagify'))
lexer.add_filter(get_filter_by_name('whitespace', spaces=True))

# Or use filter names directly
lexer.add_filter('keywordcase', case='upper')
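Both forms are equivalent: passing a name and options to `add_filter` instantiates the filter for you, and either way the instance lands on the lexer's `filters` list. A quick check:

```python
from pygments.lexers import PythonLexer
from pygments.filters import get_filter_by_name

lexer = PythonLexer()
# Instance form and name-plus-options form both append a filter instance
lexer.add_filter(get_filter_by_name('codetagify'))
lexer.add_filter('keywordcase', case='upper')

print(len(lexer.filters))  # 2
```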

Built-in Filters

Code Tag Filter

Highlights special code tags in comments and docstrings.

class CodeTagFilter:
    """
    Highlight code tags like TODO, FIXME in comments.
    
    Options:
    - codetags: List of strings to highlight (default: ['XXX', 'TODO', 'FIXME', 'BUG', 'NOTE'])
    """

Usage example:

from pygments.filters import get_filter_by_name

# Default tags: XXX, TODO, FIXME, BUG, NOTE
codetag_filter = get_filter_by_name('codetagify')

# Custom tags
custom_filter = get_filter_by_name('codetagify', 
                                  codetags=['TODO', 'HACK', 'REVIEW', 'OPTIMIZE'])
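The effect on the token stream can be observed directly: matched tags are re-emitted with the Comment.Special token type, which most styles render with distinct highlighting.

```python
from pygments.lexers import PythonLexer
from pygments.token import Comment

lexer = PythonLexer()
lexer.add_filter('codetagify')

# The matched tag is split out of the comment as Comment.Special
tokens = list(lexer.get_tokens('# TODO: fix this\n'))
special = [value for ttype, value in tokens if ttype is Comment.Special]
print(special)  # ['TODO']
```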

Keyword Case Filter

Changes the case of language keywords.

class KeywordCaseFilter:
    """
    Convert keywords to upper or lower case.
    
    Options:
    - case: 'upper', 'lower', or 'capitalize' (default: 'lower')
    """

Usage example:

# Make all keywords uppercase
upper_filter = get_filter_by_name('keywordcase', case='upper')

# Make all keywords lowercase  
lower_filter = get_filter_by_name('keywordcase', case='lower')
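Only tokens in the Keyword hierarchy are affected; names, strings, and comments pass through unchanged, as a quick tokenization shows:

```python
from pygments.lexers import PythonLexer
from pygments.token import Keyword

lexer = PythonLexer()
lexer.add_filter('keywordcase', case='upper')

# Only the keyword tokens ('def', 'pass') are upper-cased
tokens = list(lexer.get_tokens('def f(): pass\n'))
keywords = [value for ttype, value in tokens if ttype in Keyword]
print(keywords)  # ['DEF', 'PASS']
```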

Name Highlight Filter

Highlights specific names/identifiers.

class NameHighlightFilter:
    """
    Highlight specific names in the code.
    
    Options:
    - names: List of names to highlight
    - tokentype: Token type to assign to matched names (default: Name.Function)
    """

Usage example:

# Highlight specific function/variable names
name_filter = get_filter_by_name('highlight', 
                                names=['main', 'process_data', 'API_KEY'])

Visible Whitespace Filter

Makes whitespace characters visible.

class VisibleWhitespaceFilter:
    """
    Make whitespace visible by replacing with symbols.
    
    Options:
    - spaces: True to replace spaces with '·', or a custom replacement string (default: False)
    - tabs: True to replace tabs with '»', or a custom replacement string (default: False)
    - newlines: True to replace newlines with '¶', or a custom replacement string (default: False)
    - tabsize: Number of characters a replaced tab expands to (default: 8)
    - wstokentype: Emit replaced characters with the Whitespace token type (default: True)
    """

Usage example:

# Show spaces, tabs and newlines with the default symbols
ws_filter = get_filter_by_name('whitespace', 
                              spaces=True, tabs=True, newlines=True)

# Show only spaces and tabs
ws_filter = get_filter_by_name('whitespace', 
                              spaces='·', tabs='»')
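Because filters preserve the text apart from the substitutions, the replacement is visible when the filtered token values are joined back together:

```python
from pygments.lexers import PythonLexer

lexer = PythonLexer()
# True selects the default symbol for each class ('·' for spaces)
lexer.add_filter('whitespace', spaces=True)

text = ''.join(value for _, value in lexer.get_tokens('a = 1\n'))
print(text)  # a·=·1
```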

Gobble Filter

Removes a fixed number of leading characters from every line, e.g. to strip common indentation.

class GobbleFilter:
    """
    Remove `n` characters from the start of every line.
    
    Options:
    - n: Number of characters to remove from each line (default: 0)
    """

Usage example:

# Remove a specific number of leading characters from every line
gobble_filter = get_filter_by_name('gobble', n=4)
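For example, stripping one level of four-space indentation from an indented snippet:

```python
from pygments.lexers import PythonLexer

lexer = PythonLexer()
lexer.add_filter('gobble', n=4)

# Each line starts with four spaces that the filter removes
code = '    x = 1\n    y = 2\n'
text = ''.join(value for _, value in lexer.get_tokens(code))
print(text)  # the two lines with their indentation stripped
```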

Token Merge Filter

Merges consecutive tokens of the same type.

class TokenMergeFilter:
    """
    Merge consecutive tokens of the same type to reduce token count.
    """

Usage example:

merge_filter = get_filter_by_name('tokenmerge')
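Merging never changes the highlighted text, only the number of tokens the formatter has to process, which a side-by-side count illustrates:

```python
from pygments.lexers import PythonLexer

plain = PythonLexer()
merged = PythonLexer()
merged.add_filter('tokenmerge')

code = 'x  =  1\n'
n_plain = len(list(plain.get_tokens(code)))
n_merged = len(list(merged.get_tokens(code)))

# The text round-trips unchanged; only the token count can shrink
print(''.join(v for _, v in merged.get_tokens(code)) == code)  # True
print(n_merged <= n_plain)                                     # True
```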

Raise on Error Token Filter

Raises an exception when error tokens are encountered.

class RaiseOnErrorTokenFilter:
    """
    Raise an exception when an Error token is encountered.
    
    Options:
    - excclass: Exception class to raise (default: pygments.filters.ErrorToken)
    """

Usage example:

error_filter = get_filter_by_name('raiseonerror')
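This is useful for validating that a snippet lexes cleanly. A sketch, using `?` (which is not valid Python and lexes as an Error token) to trigger the default ErrorToken exception:

```python
from pygments.lexers import PythonLexer
from pygments.filters import ErrorToken

lexer = PythonLexer()
lexer.add_filter('raiseonerror')

try:
    # '?' has no rule in the Python lexer, so it becomes an Error token
    list(lexer.get_tokens('?\n'))
    raised = False
except ErrorToken:
    raised = True
print(raised)  # True
```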

Symbol Filter

Converts ASCII symbol notation, such as Isabelle's \<longrightarrow> or LaTeX's \longrightarrow, into the corresponding Unicode characters.

class SymbolFilter:
    """
    Convert ASCII symbol names into Unicode characters.
    
    Options:
    - lang: Source notation, 'isabelle' or 'latex' (default: 'isabelle')
    """

Usage example:

# Convert LaTeX symbol commands to their Unicode equivalents
symbol_filter = get_filter_by_name('symbols', lang='latex')

Filter Usage Examples

Basic Filter Application

from pygments import highlight
from pygments.lexers import PythonLexer
from pygments.formatters import HtmlFormatter

code = '''
def process_data():
    # TODO: Optimize this function
    # FIXME: Handle edge cases
    data = "hello world"  # Some data
    return data
'''

# Create lexer and add filters
lexer = PythonLexer()
lexer.add_filter('codetagify')  # Highlight TODO, FIXME
lexer.add_filter('whitespace', spaces='·')  # Show spaces

# Highlight with filters applied
result = highlight(code, lexer, HtmlFormatter())

Multiple Filters

# Chain multiple filters
lexer = PythonLexer()
lexer.add_filter('gobble')  # Remove common indentation
lexer.add_filter('codetagify', codetags=['TODO', 'HACK'])
lexer.add_filter('keywordcase', case='upper')
lexer.add_filter('tokenmerge')  # Optimize token stream

result = highlight(code, lexer, HtmlFormatter())

Custom Filter Chain

from pygments.filter import Filter

class CustomFilter(Filter):
    def filter(self, lexer, stream):
        for ttype, value in stream:
            # Custom token processing
            if 'secret' in value.lower():
                value = value.replace('secret', '***')
            yield ttype, value

# Use custom filter
lexer = PythonLexer()
lexer.add_filter(CustomFilter())
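Running a filter like this end-to-end shows the token values being rewritten; the `RedactFilter` name below is a hypothetical example, a self-contained variant of the idea above:

```python
from pygments.filter import Filter
from pygments.lexers import PythonLexer

class RedactFilter(Filter):
    """Replace the word 'secret' in token values with '***'."""
    def filter(self, lexer, stream):
        for ttype, value in stream:
            if 'secret' in value.lower():
                value = value.replace('secret', '***')
            yield ttype, value

lexer = PythonLexer()
lexer.add_filter(RedactFilter())
text = ''.join(value for _, value in lexer.get_tokens('secret = 1\n'))
print(text)  # *** = 1
```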

Filter Order

Filters are applied in the order they are added to the lexer. The order can affect the final result:

lexer = PythonLexer()

# Order matters:
lexer.add_filter('gobble')        # 1. Remove indentation first
lexer.add_filter('codetagify')    # 2. Then highlight code tags
lexer.add_filter('tokenmerge')    # 3. Finally merge tokens

Error Handling

  • ClassNotFound: No filter found with the specified name
  • OptionError: Invalid filter options provided
  • ErrorToken: Raised by RaiseOnErrorTokenFilter when an Error token is encountered

Install with Tessl CLI

npx tessl i tessl/pypi-pygments
