tessl/pypi-regex

Alternative regular expression module providing enhanced pattern matching, fuzzy matching, and advanced Unicode support as a replacement for Python's re module.

Workspace: tessl
Visibility: Public
Describes: pkg:pypi/regex@2025.9.x

To install, run

npx @tessl/cli install tessl/pypi-regex@2025.9.0

regex

An advanced regular expression library that is backwards-compatible with Python's standard re module and adds features that re lacks. It provides full Unicode 16.0.0 support, fuzzy (approximate) matching, additional flags for fine-grained pattern control, and multithreading support: the GIL is released during matching so that other threads can keep running.

Package Information

  • Package Name: regex
  • Language: Python
  • Installation: pip install regex
  • Version: 2025.9.1

Core Imports

import regex

Common usage pattern:

import regex as re  # Drop-in replacement for standard re module

Specific imports:

from regex import match, search, sub, findall, compile
from regex import IGNORECASE, MULTILINE, DOTALL, VERBOSE
from regex import BESTMATCH, ENHANCEMATCH, FULLCASE

Basic Usage

import regex

# Basic pattern matching
pattern = r'\b\w+@\w+\.\w+\b'
text = "Contact us at support@example.com or sales@company.org"
matches = regex.findall(pattern, text)
print(matches)  # ['support@example.com', 'sales@company.org']

# Case-insensitive matching with the standard IGNORECASE flag
result = regex.search(r'hello', 'Hello World', regex.IGNORECASE)
if result:
    print(f"Found: {result.group()}")  # Found: Hello

# Fuzzy matching for approximate matches
pattern = r'(?e)(hello){i<=1,d<=1,s<=1}'  # Allow at most 1 insertion, 1 deletion and 1 substitution
result = regex.search(pattern, 'helo world')  # Matches with 1 deletion
if result:
    print(f"Fuzzy match: {result.group()}")  # Fuzzy match: helo

# Pattern compilation for reuse
compiled = regex.compile(r'\d{4}-\d{2}-\d{2}')  # No flags needed for this pattern
dates = compiled.findall('Dates: 2023-12-25 and 2024-01-01')
print(dates)  # ['2023-12-25', '2024-01-01']

Architecture

The regex module extends Python's regular expression capabilities through several key components:

  • Enhanced Pattern Engine: Provides backwards compatibility with re while adding advanced features
  • Fuzzy Matching System: Supports approximate matching with configurable error limits
  • Unicode Support: Full Unicode 16.0.0 support with proper case-folding
  • Flag System: Scoped and global flags for fine-grained pattern control
  • Multithreading: Releases the GIL during matching so that other Python threads can run while a match is in progress (controllable through the concurrent argument)

The module supports both VERSION0 (legacy re-compatible) and VERSION1 (enhanced) behaviors, allowing gradual migration while maintaining compatibility.
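As a rough illustration of the version switch and the Unicode support described above, the sketch below enables VERSION1 with the inline (?V1) flag to use a set difference, and then uses a Unicode property class. The sample strings are invented for the example.

```python
import regex

# Set difference is a VERSION1 feature:
# lowercase ASCII letters minus the vowels, i.e. runs of consonants.
consonant_runs = regex.findall(r'(?V1)[[a-z]--[aeiou]]+', 'hello world')
print(consonant_runs)  # ['h', 'll', 'w', 'rld']

# Unicode property classes: \p{Lu} matches any uppercase letter.
sample = '\u00dcber \u03a9mega Ascii'  # U+00DC and U+03A9 are uppercase letters
uppercase = regex.findall(r'\p{Lu}', sample)
print(uppercase)
```

Selecting the version inside the pattern keeps the choice local to that pattern, which is one way to migrate individual expressions without changing the module default.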

Capabilities

Pattern Matching Functions

Core functions for finding patterns in text including match, search, findall, and finditer with enhanced parameters for position control, partial matching, concurrency, and timeout handling.

def match(pattern, string, flags=0, pos=None, endpos=None, partial=False, 
          concurrent=None, timeout=None, ignore_unused=False, **kwargs):
    """Try to apply pattern at start of string, returning Match object or None"""

def search(pattern, string, flags=0, pos=None, endpos=None, partial=False,
           concurrent=None, timeout=None, ignore_unused=False, **kwargs):
    """Search through string for pattern match, returning Match object or None"""

def findall(pattern, string, flags=0, pos=None, endpos=None, overlapped=False,
            concurrent=None, timeout=None, ignore_unused=False, **kwargs):
    """Return list of all matches in string"""

def finditer(pattern, string, flags=0, pos=None, endpos=None, overlapped=False,
             partial=False, concurrent=None, timeout=None, ignore_unused=False, **kwargs):
    """Return iterator over all matches in string"""

def fullmatch(pattern, string, flags=0, pos=None, endpos=None, partial=False,
              concurrent=None, timeout=None, ignore_unused=False, **kwargs):
    """Try to apply pattern against all of string, returning Match object or None"""

Pattern Matching
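A short sketch of the extra parameters in the signatures above, using made-up input strings. It covers overlapped, pos and partial; timeout (given in seconds) is not exercised here.

```python
import regex

# overlapped=True also reports matches that start inside an earlier match.
pairs = regex.findall(r'..', 'abcde', overlapped=True)
print(pairs)  # ['ab', 'bc', 'cd', 'de']

# pos sets where the search begins (endpos would set where it stops).
found = regex.search(r'\d+', 'a1 b22 c333', pos=3)
print(found.group(), found.span())  # 22 (4, 6)

# partial=True accepts input that is too short but could still grow
# into a full match; the Match object records this in .partial.
incomplete = regex.fullmatch(r'\d{4}', '123', partial=True)
complete = regex.fullmatch(r'\d{4}', '1234', partial=True)
print(incomplete.partial, complete.partial)  # True False
```

Partial matching is mainly useful for validating input as it is typed, where a rejected prefix and a not-yet-complete prefix need to be told apart.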

String Substitution Functions

Advanced string replacement capabilities including standard substitution, format-based replacement, and variants that return substitution counts. Supports concurrent execution and timeout handling.

def sub(pattern, repl, string, count=0, flags=0, pos=None, endpos=None,
        concurrent=None, timeout=None, ignore_unused=False, **kwargs):
    """Replace pattern occurrences with replacement string"""

def subf(pattern, format, string, count=0, flags=0, pos=None, endpos=None,
         concurrent=None, timeout=None, ignore_unused=False, **kwargs):
    """Replace pattern occurrences using format string"""

def subn(pattern, repl, string, count=0, flags=0, pos=None, endpos=None,
         concurrent=None, timeout=None, ignore_unused=False, **kwargs):
    """Return (new_string, number_of_substitutions_made) tuple"""

def subfn(pattern, format, string, count=0, flags=0, pos=None, endpos=None,
          concurrent=None, timeout=None, ignore_unused=False, **kwargs):
    """Return (new_string, number_of_substitutions_made) tuple using format string"""

String Substitution
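To make the difference between the template-based and the format-based functions concrete, here is a minimal sketch with invented inputs. In a format string, {0} refers to the whole match and {1}, {2} or {name} refer to groups.

```python
import regex

# sub: replacement template with backreferences.
swapped = regex.sub(r'(\w+) (\w+)', r'\2 \1', 'foo bar')
print(swapped)  # bar foo

# subf: replacement written as a str.format-style string.
formatted = regex.subf(r'(\w+) (\w+)', '{0} => {2} {1}', 'foo bar')
print(formatted)  # foo bar => bar foo

# Named groups can be used as format fields.
named = regex.subf(r'(?P<first>\w+) (?P<second>\w+)', '{second} {first}', 'foo bar')
print(named)  # bar foo

# subn and subfn additionally return how many substitutions were made.
masked, count = regex.subn(r'\d', '#', 'a1b2c3')
print(masked, count)  # a#b#c# 3
flipped, flips = regex.subfn(r'(\w+) (\w+)', '{2} {1}', 'foo bar')
print(flipped, flips)  # bar foo 1
```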

String Splitting Functions

Pattern-based string splitting with support for maximum splits, concurrent execution, and iterator-based processing for memory efficiency with large texts.

def split(pattern, string, maxsplit=0, flags=0, concurrent=None,
          timeout=None, ignore_unused=False, **kwargs):
    """Split string by pattern occurrences, returning list of substrings"""

def splititer(pattern, string, maxsplit=0, flags=0, concurrent=None,
              timeout=None, ignore_unused=False, **kwargs):
    """Return iterator yielding split string parts"""

String Splitting
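The following sketch contrasts the list-returning and the iterator-returning variants on an invented comma-separated record.

```python
import regex

record = 'alpha, beta ,gamma,delta'
separator = r'\s*,\s*'  # a comma with optional surrounding whitespace

# split returns every field at once.
fields = regex.split(separator, record)
print(fields)  # ['alpha', 'beta', 'gamma', 'delta']

# maxsplit limits the number of splits; the remainder is left untouched.
head_tail = regex.split(separator, record, maxsplit=1)
print(head_tail)  # ['alpha', 'beta ,gamma,delta']

# splititer yields the same fields one at a time.
parts = iter(regex.splititer(separator, record))
first = next(parts)
rest = list(parts)
print(first, rest)  # alpha ['beta', 'gamma', 'delta']
```

With very large inputs, consuming splititer lazily avoids building the full list of substrings in memory.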

Pattern Compilation and Utilities

Pattern compilation, caching control, template support, and string escaping utilities for preparing literal strings for use in patterns.

def compile(pattern, flags=0, ignore_unused=False, cache_pattern=None, **kwargs):
    """Compile regular expression pattern, returning Pattern object"""

def escape(pattern, special_only=True, literal_spaces=False):
    """Escape string for use as literal in pattern"""

def purge():
    """Clear the regular expression cache"""

def cache_all(value=True):
    """Set whether to cache all patterns; pass None to return the current setting"""

def template(pattern, flags=0):
    """Compile a template pattern, returning a Pattern object"""

Compilation and Utilities
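A small sketch of how these utilities might be combined: escaping a literal before embedding it in a pattern, compiling once for reuse, and clearing the cache. The sample text is invented.

```python
import regex

# escape makes arbitrary text safe to use as a literal pattern.
literal = 'price: $4.99 (approx.)'
escaped = regex.escape(literal)
round_trip = regex.fullmatch(escaped, literal)
print(round_trip is not None)  # True

# compile once, reuse many times.
date_re = regex.compile(r'(\d{4})-(\d{2})-(\d{2})')
dates = date_re.findall('2023-12-25, 2024-01-01')
print(dates)  # [('2023', '12', '25'), ('2024', '01', '01')]

# purge empties the module-level pattern cache; Pattern objects you
# already hold, such as date_re, keep working.
regex.purge()
still_works = date_re.search('due 2025-09-01') is not None
print(still_works)  # True
```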

Advanced Classes and Types

Pattern and Match objects providing compiled pattern functionality and match result access, plus Scanner for tokenization and RegexFlag enumeration for proper flag handling.

class Pattern:
    """Compiled regular expression pattern object"""
    def match(self, string, pos=None, endpos=None, concurrent=None, partial=False, timeout=None): ...
    def search(self, string, pos=None, endpos=None, concurrent=None, partial=False, timeout=None): ...
    # Additional methods: findall, finditer, sub, split, etc.

class Match:
    """Match object containing match information"""
    def group(self, *groups): ...
    def groups(self, default=None): ...
    def groupdict(self, default=None): ...
    def start(self, group=0): ...
    def end(self, group=0): ...
    def span(self, group=0): ...

class Scanner:
    """Tokenizing scanner using pattern-action pairs"""
    def __init__(self, lexicon, flags=0): ...
    def scan(self, string): ...

Classes and Types
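The sketch below exercises the Match accessors listed above and then builds a tiny tokenizer with Scanner. The token names ('INT', 'NAME') and the input strings are made up for illustration; an action of None discards the matched text.

```python
import regex

# Match accessors on a pattern with named groups.
m = regex.search(r'(?P<year>\d{4})-(?P<month>\d{2})', 'released 2025-09')
print(m.group(0))       # 2025-09
print(m.groups())       # ('2025', '09')
print(m.groupdict())    # {'year': '2025', 'month': '09'}
print(m.span('month'))  # (14, 16)

# Scanner: each lexicon entry pairs a pattern with an action.
# A callable action receives the scanner and the matched text.
scanner = regex.Scanner([
    (r'\d+', lambda scanner, text: ('INT', int(text))),
    (r'[A-Za-z_]\w*', lambda scanner, text: ('NAME', text)),
    (r'\s+', None),  # skip whitespace
])
tokens, remainder = scanner.scan('width 42 height 7')
print(tokens)
print(repr(remainder))  # '' when the whole input was consumed
```

scan returns the collected action results together with whatever trailing text no lexicon entry could match, so a non-empty remainder signals a tokenization failure.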

Flags and Constants

Comprehensive flag system including standard regex flags, enhanced flags for fuzzy matching and Unicode handling, version control flags, and global constants for controlling library behavior.

# Standard flags
IGNORECASE = I = 0x2      # Case-insensitive matching
MULTILINE = M = 0x8       # Multi-line mode for ^ and $
DOTALL = S = 0x10         # Make . match any character including newline
VERBOSE = X = 0x40        # Verbose mode allowing comments

# Enhanced flags
BESTMATCH = B = 0x1000    # Find best fuzzy match instead of first
ENHANCEMATCH = E = 0x8000 # Improve fuzzy match fit after finding first
FULLCASE = F = 0x4000     # Full case-folding for Unicode
WORD = W = 0x800          # Unicode word boundaries and line breaks

# Version control
VERSION0 = V0 = 0x2000    # Legacy re-compatible behavior
VERSION1 = V1 = 0x100     # Enhanced behavior mode
DEFAULT_VERSION = VERSION0  # Version applied when a pattern specifies none

Flags and Constants

Types

class error(Exception):
    """Exception raised for invalid regular expressions"""
    msg: str        # Unformatted error message
    pattern: str    # Regular expression pattern
    pos: int        # Position where compilation failed
    lineno: int     # Line number where compilation failed
    colno: int      # Column number where compilation failed

class RegexFlag(enum.IntFlag):
    """Enumeration of regex flags; members combine with | and remain RegexFlag values"""
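One way the error attributes might be used is to report an invalid pattern without crashing. In this sketch try_compile is a hypothetical helper written for the example, not part of the regex module.

```python
import regex

def try_compile(source):
    """Hypothetical helper: return a compiled pattern, or None if it is invalid."""
    try:
        return regex.compile(source)
    except regex.error as exc:
        # msg, pattern and pos are the attributes documented above.
        print(f"invalid pattern {exc.pattern!r}: {exc.msg} (position {exc.pos})")
        return None

ok = try_compile(r'\d+')
bad = try_compile(r'(unclosed')  # the group is never closed
print(ok is not None, bad is None)  # True True
```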