# Google RE2

A fast, safe, thread-friendly regular expression library with guaranteed linear-time matching. Unlike backtracking regex engines, which can take exponential time on some inputs, RE2 matches with finite automata, so running time grows linearly with input length even on adversarial input. This makes it well suited to production environments where patterns or input text come from untrusted sources.

## Package Information

- **Package Name**: google-re2
- **Language**: Python (C++ library with Python bindings)
- **Installation**: `pip install google-re2`
- **Version**: 1.1.20250805
- **License**: BSD-3-Clause

## Core Imports

```python
import re2
```

Individual functions can also be imported directly:

```python
from re2 import compile, search, match, fullmatch, findall, split, sub
```

## Basic Usage

```python
import re2

# Basic pattern matching
pattern = r'\d{3}-\d{2}-\d{4}'
text = "My SSN is 123-45-6789"

# Search for the pattern anywhere in the text
match_obj = re2.search(pattern, text)
if match_obj:
    print(f"Found: {match_obj.group()}")  # "Found: 123-45-6789"

# Compile the pattern once for reuse (more efficient)
compiled_pattern = re2.compile(r'(\w+)@(\w+\.\w+)')
email_text = "Contact john@example.com for info"
match = compiled_pattern.search(email_text)
if match:
    username, domain = match.groups()
    print(f"User: {username}, Domain: {domain}")  # "User: john, Domain: example.com"

# Replace each run of digits with a single 'X'
result = re2.sub(r'\d+', 'X', "Phone: 555-1234")
print(result)  # "Phone: X-X"
```

## Architecture

RE2 provides three main interfaces:

- **Python Module Interface**: A near drop-in replacement for Python's `re` module (it takes `options` where `re` takes `flags`) with familiar functions like `search`, `match`, `findall`, `sub`
- **Compiled Pattern Objects**: Pre-compiled patterns for better performance in repeated operations
- **Advanced Features**: Pattern sets for multi-pattern matching and filtered matching for high-performance scenarios

The library prioritizes safety and predictable worst-case behavior over peak speed, which makes it suitable for untrusted input while remaining fast in normal use.

## Capabilities

### Core Pattern Matching

Essential pattern matching functions that provide the primary interface for regular expressions. These functions support searching, matching, and extracting subpatterns from text.

```python { .api }
def search(pattern, text, options=None): ...
def match(pattern, text, options=None): ...
def fullmatch(pattern, text, options=None): ...
def findall(pattern, text, options=None): ...
def finditer(pattern, text, options=None): ...
```

[Core Pattern Matching](./core-matching.md)

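The differences between these functions are easiest to see side by side. The following is a minimal sketch, assuming the module-level functions behave like their `re` counterparts for simple patterns. The fallback to the standard-library `re` module is an assumption added only so the snippet runs where `google-re2` is not installed.

```python
try:
    import re2 as engine  # google-re2, if installed
except ImportError:
    import re as engine  # assumption: stdlib fallback so the sketch still runs

text = "order 42 shipped, order 7 pending"

# search: first match anywhere in the text
first = engine.search(r"\d+", text)
print(first.group())  # 42

# match: anchored at the start, so a digit pattern does not match here
print(engine.match(r"\d+", text))  # None

# fullmatch: the whole text must match the pattern
print(engine.fullmatch(r"\d+", "42") is not None)  # True

# findall: every non-overlapping match as a list of strings
numbers = engine.findall(r"\d+", text)
print(numbers)  # ['42', '7']

# finditer: match objects, useful when positions are needed
spans = [m.span() for m in engine.finditer(r"\d+", text)]
print(spans)  # [(6, 8), (24, 25)]
```
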
### Text Processing

Functions for splitting text and performing substitutions using regular expressions. These operations are fundamental for text processing and data cleaning tasks.

```python { .api }
def split(pattern, text, maxsplit=0, options=None): ...
def sub(pattern, repl, text, count=0, options=None): ...
def subn(pattern, repl, text, count=0, options=None): ...
```

[Text Processing](./text-processing.md)

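As a rough illustration of how `maxsplit` and `count` bound the work done, here is a small sketch. It uses only plain replacement strings and passes `maxsplit` and `count` as keywords. The fallback to the standard-library `re` module is an assumption so the snippet runs without `google-re2`.

```python
try:
    import re2 as engine  # google-re2, if installed
except ImportError:
    import re as engine  # assumption: stdlib fallback so the sketch still runs

line = "alpha, beta;gamma ,  delta"
separator = r"\s*[,;]\s*"  # comma or semicolon with optional whitespace

# split on every separator
fields = engine.split(separator, line)
print(fields)  # ['alpha', 'beta', 'gamma', 'delta']

# maxsplit=1 splits once and leaves the remainder intact
head, rest = engine.split(separator, line, maxsplit=1)
print(head)  # alpha
print(rest)  # beta;gamma ,  delta

# count=1 replaces only the first run of digits
masked = engine.sub(r"\d+", "#", "pin 1234, code 5678", count=1)
print(masked)  # pin #, code 5678

# subn also reports how many replacements were made
cleaned, replaced = engine.subn(r"\s+", " ", "too   many    spaces")
print(cleaned, replaced)  # too many spaces 2
```
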
### Pattern Compilation

Pre-compilation of regular expressions for improved performance when patterns are used repeatedly. Compiled patterns provide access to advanced features and optimization options.

```python { .api }
def compile(pattern, options=None): ...

class _Regexp:
    def search(text, pos=None, endpos=None): ...
    def match(text, pos=None, endpos=None): ...
    def fullmatch(text, pos=None, endpos=None): ...
    # ... additional methods
```

[Pattern Compilation](./pattern-compilation.md)

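The `pos` and `endpos` parameters restrict matching to a slice of the text without copying it. The sketch below assumes they behave as in Python's `re` module, with spans reported relative to the full text. The fallback import is an assumption so the snippet runs without `google-re2`.

```python
try:
    import re2 as engine  # google-re2, if installed
except ImportError:
    import re as engine  # assumption: stdlib fallback so the sketch still runs

# Compile once, then reuse the pattern object across many calls
word = engine.compile(r"[a-z]+")
text = "123 abc def"

# pos=4 starts the search after the leading digits and the space
found = word.search(text, pos=4)
print(found.group(), found.span())  # abc (4, 7)

# endpos=6 stops matching before index 6
clipped = word.search(text, pos=4, endpos=6)
print(clipped.group())  # ab

# match anchors at pos rather than at index 0
print(word.match(text) is None)  # True: the text starts with digits
anchored = word.match(text, pos=8)
print(anchored.group())  # def
```
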
### Options and Configuration

Configuration options that control how RE2 processes regular expressions, including encoding, syntax modes, memory limits, and performance tuning.

```python { .api }
class Options:
    max_mem: int
    encoding: Options.Encoding
    posix_syntax: bool
    longest_match: bool
    case_sensitive: bool
    # ... additional options
```

[Options and Configuration](./options-configuration.md)

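Options are set as attributes on an `Options` instance and passed to `compile`. The sketch below shows one plausible way to request case-insensitive matching. `compile_ignoring_case` is a hypothetical helper, not part of the library, and the `re.IGNORECASE` branch is an assumption so the snippet runs without `google-re2`.

```python
import re

try:
    import re2  # google-re2, if installed
except ImportError:
    re2 = None  # the helper below falls back to the standard library


def compile_ignoring_case(pattern):
    """Hypothetical helper: compile `pattern` for case-insensitive matching."""
    if re2 is not None:
        options = re2.Options()
        options.case_sensitive = False  # attribute from the Options summary
        return re2.compile(pattern, options)
    # Assumption: stdlib equivalent, used only when google-re2 is missing
    return re.compile(pattern, re.IGNORECASE)


greeting = compile_ignoring_case(r"hello")
print(greeting.search("Say HELLO") is not None)  # True
print(greeting.search("goodbye") is None)        # True
```
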
### Advanced Features

Specialized functionality for high-performance scenarios including pattern sets for matching multiple patterns simultaneously and filtered matching for optimized multi-pattern operations.

```python { .api }
class Set:
    def Add(pattern): ...
    def Compile(): ...
    def Match(text): ...

class Filter:
    def Add(pattern, options=None): ...
    def Compile(): ...
    def Match(text, potential=False): ...
```

[Advanced Features](./advanced-features.md)

## Common Types

```python { .api }
class Options:
    """Configuration options for RE2 compilation and matching."""

    class Encoding:
        UTF8: int
        LATIN1: int

    def __init__(self):
        self.max_mem: int = 8388608  # 8 MiB default
        self.encoding: Options.Encoding = Options.Encoding.UTF8
        self.posix_syntax: bool = False
        self.longest_match: bool = False
        self.log_errors: bool = True
        self.literal: bool = False
        self.never_nl: bool = False
        self.dot_nl: bool = False
        self.never_capture: bool = False
        self.case_sensitive: bool = True
        # The next three are only consulted when posix_syntax is True;
        # otherwise the corresponding Perl features are always enabled.
        self.perl_classes: bool = False
        self.word_boundary: bool = False
        self.one_line: bool = False

class error(Exception):
    """Exception raised for RE2 compilation and matching errors."""
    pass
```
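
When patterns come from untrusted sources, compilation is the natural place to catch `error`. The following is a minimal sketch; `try_compile` is a hypothetical helper, and the fallback to the standard-library `re` module (which also exposes `error`) is an assumption so the snippet runs without `google-re2`.

```python
try:
    import re2 as engine  # google-re2, if installed
except ImportError:
    import re as engine  # assumption: stdlib fallback so the sketch still runs


def try_compile(pattern):
    """Hypothetical helper: return a compiled pattern, or None if invalid."""
    try:
        return engine.compile(pattern)
    except engine.error as exc:
        print(f"rejected {pattern!r}: {exc}")
        return None


valid = try_compile(r"(\d+)")
invalid = try_compile(r"(\d+")  # unbalanced parenthesis

print(valid is not None)  # True
print(invalid is None)    # True
```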