# Pattern System

An extensible pattern implementation system with built-in Git wildmatch support and custom pattern registration. The pattern system allows pluggable pattern implementations and provides the foundation for pathspec's flexibility.

## Imports

```python
from pathspec import lookup_pattern
from pathspec.util import register_pattern, AlreadyRegisteredError
from pathspec.pattern import Pattern, RegexPattern, RegexMatchResult
from pathspec.patterns import GitWildMatchPattern, GitIgnorePattern
from pathspec.patterns.gitwildmatch import GitWildMatchPatternError
from typing import Any, AnyStr, Callable, Iterable, Iterator, List, Match, Optional, Tuple, Union
import re
```

## Capabilities

### Pattern Base Class

Abstract base class for all pattern implementations. Defines the interface that all patterns must implement.

```python { .api }
class Pattern:
    def __init__(self, include: Optional[bool]) -> None:
        """
        Initialize pattern with include/exclude flag.

        Parameters:
        - include: True for include patterns, False for exclude patterns, None for a null-operation pattern (for example a comment or blank line)
        """

    def match_file(self, file: str) -> Optional[Any]:
        """
        Abstract method to match pattern against a single file.
        Must be implemented by subclasses.

        Parameters:
        - file: File path to test against pattern

        Returns:
        Match result (implementation-specific) or None if no match
        """

    def match(self, files: Iterable[str]) -> Iterator[str]:
        """
        DEPRECATED: Match pattern against multiple files.

        Parameters:
        - files: Iterable of file paths to test

        Yields:
        File paths that match the pattern
        """
```
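To make the role of the `include` flag concrete, here is a minimal, library-independent sketch of how a list of patterns could be combined, assuming the rule that the last pattern matching a file decides whether it is included. `ToyPattern` and `is_included` are hypothetical names used only for illustration; they are not part of pathspec, and the library's own matching logic may differ in detail.

```python
import re
from typing import Iterable, Optional


class ToyPattern:
    """Hypothetical stand-in for a pattern: a regex plus an include flag."""

    def __init__(self, regex: str, include: Optional[bool]) -> None:
        self.regex = re.compile(regex)
        self.include = include

    def match_file(self, file: str) -> Optional[re.Match]:
        # Return the match object, or None when the file does not match
        return self.regex.match(file)


def is_included(patterns: Iterable[ToyPattern], file: str) -> bool:
    """Assumed rule: the last matching pattern's include flag wins."""
    included = False
    for pattern in patterns:
        # Patterns whose include flag is None are skipped entirely
        if pattern.include is not None and pattern.match_file(file) is not None:
            included = pattern.include
    return included


patterns = [
    ToyPattern(r".*\.py$", True),                 # include every .py file
    ToyPattern(r"(.*/)?test_[^/]*\.py$", False),  # then exclude test modules
]
print(is_included(patterns, "src/app.py"))       # True
print(is_included(patterns, "src/test_app.py"))  # False
```

Because later patterns override earlier ones in this sketch, the order of the pattern list matters: swapping the two entries would include the test module again.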

### RegexPattern Class

Concrete pattern implementation using regular expressions. Serves as the base for most pattern types.

```python { .api }
class RegexPattern(Pattern):
    def __init__(
        self,
        pattern: Union[str, bytes, re.Pattern, None],
        include: Optional[bool] = None
    ) -> None:
        """
        Initialize regex pattern from string or compiled regex.

        Parameters:
        - pattern: Pattern string, compiled regex, or None
        - include: Include/exclude flag; if None, it is taken from pattern_to_regex()
        """

    def __eq__(self, other: 'RegexPattern') -> bool:
        """
        Test equality by comparing include flag and regex pattern.
        """

    def match_file(self, file: str) -> Optional[RegexMatchResult]:
        """
        Match file against regex pattern.

        Parameters:
        - file: File path to test

        Returns:
        RegexMatchResult if match found, None otherwise
        """

    @classmethod
    def pattern_to_regex(cls, pattern: str) -> Tuple[str, bool]:
        """
        Convert pattern string to regex and include flag.
        The base implementation treats the pattern as a regular
        expression and returns it unchanged with an include flag of True.

        Parameters:
        - pattern: Pattern string to convert

        Returns:
        Tuple of (regex_string, include_flag)
        """
```

### GitWildMatchPattern Class

Git-style wildcard pattern implementation that converts Git wildmatch patterns to regular expressions.

```python { .api }
class GitWildMatchPattern(RegexPattern):
    @classmethod
    def pattern_to_regex(cls, pattern: Union[str, bytes]) -> Tuple[Optional[Union[str, bytes]], Optional[bool]]:
        """
        Convert Git wildmatch pattern to regex.
        Handles Git-specific wildcards, character classes, and negation.

        Parameters:
        - pattern: Git wildmatch pattern string

        Returns:
        Tuple of (regex_string, include_flag), or (None, None) for
        patterns that match nothing, such as blank lines and comments

        Raises:
        GitWildMatchPatternError: For invalid pattern syntax
        """

    @staticmethod
    def escape(s: Union[str, bytes]) -> Union[str, bytes]:
        """
        Escape special characters in strings for use in Git patterns.

        Parameters:
        - s: String to escape

        Returns:
        Escaped string safe for use in patterns
        """
```
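As an illustration of what escaping means here, the sketch below backslash-escapes characters that carry special meaning in gitignore-style patterns. The character set (`[`, `]`, `!`, `*`, `#`, `?`) is an assumption drawn from the pattern syntax, and `escape_git_pattern` is a hypothetical helper, not the library's `escape` implementation.

```python
# Characters assumed to be special in gitignore-style patterns
META_CHARACTERS = "[]!*#?"


def escape_git_pattern(s: str) -> str:
    """Prefix each assumed meta character with a backslash."""
    return "".join("\\" + ch if ch in META_CHARACTERS else ch for ch in s)


print(escape_git_pattern("notes[draft]*.txt"))  # notes\[draft\]\*.txt
print(escape_git_pattern("#todo!.md"))          # \#todo\!.md
```

Escaping of this kind is useful when a literal file name that happens to contain wildcard characters needs to be used as a pattern.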

### GitIgnorePattern Class (Deprecated)

Backward compatibility alias for GitWildMatchPattern.

```python { .api }
class GitIgnorePattern(GitWildMatchPattern):
    """
    DEPRECATED: Use GitWildMatchPattern instead.

    This class is identical to GitWildMatchPattern and exists only
    for backward compatibility with older code.
    """
```

### Pattern Factory System

Functions for registering and looking up pattern implementations by name.

```python { .api }
def register_pattern(
    name: str,
    pattern_factory: Callable[[Union[str, bytes]], Pattern],
    override: Optional[bool] = None
) -> None:
    """
    Register a pattern factory under a specified name.

    Parameters:
    - name: Name to register the factory under
    - pattern_factory: Callable that creates Pattern instances from strings
    - override: Allow overriding existing registrations if True

    Raises:
    AlreadyRegisteredError: If name already registered and override is False
    """

def lookup_pattern(name: str) -> Callable[[Union[str, bytes]], Pattern]:
    """
    Look up a registered pattern factory by name.

    Parameters:
    - name: Name of the pattern factory to look up

    Returns:
    Pattern factory callable

    Raises:
    KeyError: If pattern name is not registered
    """
```
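The register/lookup contract described above can be pictured as a name-to-factory dictionary. The following is a minimal sketch under that assumption; `_REGISTRY`, `register`, `lookup`, and `DuplicateNameError` are hypothetical names and do not reflect how pathspec stores its factories internally.

```python
from typing import Callable, Dict, Optional

# Hypothetical module-level registry mapping names to factories
_REGISTRY: Dict[str, Callable] = {}


class DuplicateNameError(Exception):
    """Hypothetical stand-in for AlreadyRegisteredError."""


def register(name: str, factory: Callable, override: Optional[bool] = None) -> None:
    # Refuse to replace an existing entry unless override is truthy
    if name in _REGISTRY and not override:
        raise DuplicateNameError(name)
    _REGISTRY[name] = factory


def lookup(name: str) -> Callable:
    # A plain dict lookup raises KeyError for unknown names
    return _REGISTRY[name]


register("upper", str.upper)
print(lookup("upper")("abc"))       # ABC

try:
    register("upper", str.lower)    # same name, no override
except DuplicateNameError:
    print("already registered")

register("upper", str.lower, override=True)
print(lookup("upper")("ABC"))       # abc
```

Requiring an explicit `override` keeps an accidental name clash from silently replacing a factory that other code depends on.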

### Pattern Match Results

Data classes that hold pattern match information.

```python { .api }
class RegexMatchResult:
    """
    Contains information about a regex pattern match.

    Attributes:
    - match: The regex match object from re.match()
    """
    match: Match[str]
```

### Pattern Exceptions

Exception classes for pattern-related errors.

```python { .api }
class GitWildMatchPatternError(ValueError):
    """
    Raised when a Git wildmatch pattern is invalid or cannot be parsed.
    """

class AlreadyRegisteredError(Exception):
    """
    Raised when attempting to register a pattern factory name that already exists.
    """
```

## Built-in Pattern Factories

PathSpec includes several pre-registered pattern factories:

- **`'gitwildmatch'`**: GitWildMatchPattern factory for Git wildcard patterns
- **`'gitignore'`**: Alias for GitWildMatchPattern (deprecated, use 'gitwildmatch')

## Usage Examples

### Using Built-in Patterns

```python
import pathspec
from pathspec.patterns import GitWildMatchPattern

pattern_lines = [
    "*.py",
    "!test_*.py",
    "src/**/*.js",
]

# Use the built-in gitwildmatch factory
spec = pathspec.PathSpec.from_lines('gitwildmatch', pattern_lines)

# Equivalent to using the class directly
patterns = [GitWildMatchPattern(line) for line in pattern_lines]
spec = pathspec.PathSpec(patterns)
```

### Creating Custom Patterns

```python
import pathspec
import re
from pathspec.pattern import RegexPattern

class SimpleGlobPattern(RegexPattern):
    """Simple glob pattern supporting * and ? wildcards only."""

    @classmethod
    def pattern_to_regex(cls, pattern):
        # A leading "!" marks an exclude pattern; strip it before converting
        include = not pattern.startswith('!')
        if not include:
            pattern = pattern[1:]
        # Escape regex metacharacters, then restore the two glob wildcards
        regex = re.escape(pattern).replace(r'\*', '.*').replace(r'\?', '.')
        return f'^{regex}$', include

# Register the custom pattern
pathspec.util.register_pattern('simpleglob', SimpleGlobPattern)

# Use the custom pattern
spec = pathspec.PathSpec.from_lines('simpleglob', [
    "*.txt",
    "!temp.*"
])
```

### Advanced Pattern Factory

```python
import pathspec

def create_case_insensitive_factory(base_class):
    """Create a case-insensitive version of a pattern class."""

    # Subclass instead of patching match_file on an instance: the built-in
    # pattern classes define __slots__, so instance attributes cannot be added
    class CaseInsensitivePattern(base_class):
        def __init__(self, pattern, include=None):
            # Lowercase the pattern before the base class compiles it
            super().__init__(pattern.lower(), include)

        def match_file(self, file):
            # Lowercase the path so it compares against the lowercased pattern
            return super().match_file(file.lower())

    return CaseInsensitivePattern

# Create and register case-insensitive version
case_insensitive_git = create_case_insensitive_factory(
    pathspec.lookup_pattern('gitwildmatch')
)
pathspec.util.register_pattern('gitwildmatch_ci', case_insensitive_git)

# Use case-insensitive matching
spec = pathspec.PathSpec.from_lines('gitwildmatch_ci', [
    "*.PY",  # Will match .py, .Py, .PY, etc.
    "SRC/"
])
```

### Pattern Inspection

```python
import pathspec
from pathspec.patterns import GitWildMatchPattern

# Examine how patterns are converted
pattern_str = "src/**/*.py"
regex, include = GitWildMatchPattern.pattern_to_regex(pattern_str)
print(f"Pattern: {pattern_str}")
print(f"Regex: {regex}")
print(f"Include: {include}")

# Test individual pattern
pattern = GitWildMatchPattern(pattern_str)
result = pattern.match_file("src/utils/helper.py")
if result:
    print(f"Match: {result.match.group()}")
```

### Error Handling

```python
import pathspec
from pathspec.patterns.gitwildmatch import GitWildMatchPattern, GitWildMatchPatternError
from pathspec.util import AlreadyRegisteredError, register_pattern

try:
    # Invalid pattern syntax
    pattern = GitWildMatchPattern("[invalid")
except GitWildMatchPatternError as e:
    print(f"Invalid pattern: {e}")

try:
    # Attempt to register existing name
    register_pattern('gitwildmatch', lambda x: None)
except AlreadyRegisteredError as e:
    print(f"Pattern already registered: {e}")

try:
    # Look up non-existent pattern
    factory = pathspec.lookup_pattern('nonexistent')
except KeyError as e:
    print(f"Pattern not found: {e}")
```

### Pattern Combination Strategies

```python
import pathspec

# Combine different pattern types
def create_multi_pattern_spec(pattern_groups):
    """Create PathSpec from multiple pattern types."""
    all_patterns = []

    for factory_name, patterns in pattern_groups.items():
        factory = pathspec.lookup_pattern(factory_name)
        for pattern_str in patterns:
            all_patterns.append(factory(pattern_str))

    return pathspec.PathSpec(all_patterns)

# Use multiple pattern types together
# ('simpleglob' must already be registered, see "Creating Custom Patterns")
spec = create_multi_pattern_spec({
    'gitwildmatch': ["*.py", "src/"],
    'simpleglob': ["*.txt"],
})
```

## Git Wildmatch Pattern Syntax

The GitWildMatchPattern supports full Git wildmatch syntax:

- **`*`**: Matches any number of characters except path separators
- **`**`**: Matches any number of characters including path separators
- **`?`**: Matches exactly one character except path separators
- **`[abc]`**: Matches any character in the set
- **`[a-z]`**: Matches any character in the range
- **`[!abc]`**: Matches any character not in the set
- **`\`**: Escapes the next character
- **`!pattern`**: Negation (exclude) pattern
- **`/`**: Directory separator (normalized across platforms)

Patterns ending with `/` match directories only. Patterns starting with `/` are anchored to the root.
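
The rules above can be approximated with a small translator to regular expressions. The sketch below is a deliberately simplified illustration that covers only `*`, `**`, `?`, a leading `!`, and leading or trailing `/`; character classes and backslash escapes are left out. `simple_wildmatch_to_regex` is a hypothetical helper, and the translation pathspec actually performs is more thorough. It also assumes that a slash in the middle of a pattern anchors it to the root, as a leading slash does.

```python
import re
from typing import Tuple


def simple_wildmatch_to_regex(pattern: str) -> Tuple[str, bool]:
    """Translate a small subset of the syntax into (regex, include)."""
    # A leading "!" marks an exclude pattern
    include = not pattern.startswith("!")
    if not include:
        pattern = pattern[1:]

    dir_only = pattern.endswith("/")
    body = pattern.strip("/")
    # Assumption: a slash at the start or in the middle anchors to the root
    anchored = pattern.startswith("/") or "/" in body

    out = []
    i = 0
    while i < len(body):
        if body.startswith("**/", i):
            out.append("(?:.*/)?")   # zero or more whole directories
            i += 3
        elif body.startswith("**", i):
            out.append(".*")         # anything, including "/"
            i += 2
        elif body[i] == "*":
            out.append("[^/]*")      # anything except "/"
            i += 1
        elif body[i] == "?":
            out.append("[^/]")       # one character except "/"
            i += 1
        else:
            out.append(re.escape(body[i]))
            i += 1

    # Unanchored patterns may match at any directory depth
    prefix = "^" if anchored else "^(?:.*/)?"
    # Directory patterns match only paths below the directory
    suffix = "/.*$" if dir_only else "(?:/.*)?$"
    return prefix + "".join(out) + suffix, include


regex, include = simple_wildmatch_to_regex("src/**/*.js")
print(include)                                 # True
print(bool(re.match(regex, "src/a/b/c.js")))   # True
print(bool(re.match(regex, "lib/src/c.js")))   # False
```

Returning the include flag alongside the regex mirrors the `(regex_string, include_flag)` tuple that `pattern_to_regex` is documented to return, so the negation prefix never ends up inside the regular expression itself.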