# Filter System

Filters post-process the token stream a lexer produces: they can retag, rewrite, merge, or drop tokens. They run after lexical analysis and before formatting.

## Capabilities

### Filter Discovery

Look up filter instances by name, list the available names, or resolve a name to its class.

```python { .api }
def get_filter_by_name(filtername: str, **options):
    """
    Get filter instance by name with options.

    Parameters:
    - filtername: Filter name (e.g., 'codetagify', 'keywordcase')
    - **options: Filter-specific options

    Returns:
    Filter instance configured with the given options

    Raises:
    ClassNotFound: If no filter with that name is found
    """
```

```python { .api }
def get_all_filters() -> Iterator[str]:
    """
    Return generator of all available filter names.

    Yields:
    Filter name strings for both built-in and plugin filters
    """
```

```python { .api }
def find_filter_class(filtername: str):
    """
    Find filter class by name without instantiation.

    Parameters:
    - filtername: Filter name

    Returns:
    Filter class or None if not found
    """
```

Usage example:

```python
from pygments.filters import get_filter_by_name, get_all_filters

# Get specific filter
codetag_filter = get_filter_by_name('codetagify', codetags=['TODO', 'FIXME', 'XXX'])

# List all available filters
print("Available filters:")
for filter_name in get_all_filters():
    print(f"  {filter_name}")
```
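The two lookup functions report an unknown name differently, which matters when the name comes from user input. The sketch below contrasts them; `'no-such-filter'` is a made-up name used only for illustration.

```python
from pygments.filters import find_filter_class, get_filter_by_name
from pygments.util import ClassNotFound

# find_filter_class returns the class itself, or None for an unknown name.
cls = find_filter_class('keywordcase')
missing = find_filter_class('no-such-filter')  # hypothetical name

# The class accepts the same keyword options as get_filter_by_name.
upper = cls(case='upper')

# get_filter_by_name raises instead of returning None.
try:
    get_filter_by_name('no-such-filter')
    lookup_failed = False
except ClassNotFound:
    lookup_failed = True
```

`find_filter_class` suits "is this available?" checks, while `get_filter_by_name` suits code paths where a missing filter is an error.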

### Filter Application

Apply filters to lexers:

```python
from pygments.lexers import PythonLexer
from pygments.filters import get_filter_by_name

# Create lexer and add filters
lexer = PythonLexer()
lexer.add_filter(get_filter_by_name('codetagify'))
lexer.add_filter(get_filter_by_name('whitespace'))

# Or use filter names directly
lexer.add_filter('keywordcase', case='upper')
```
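Once added, filters run whenever the lexer's tokens are requested. A minimal sketch, assuming a short Python snippet as input, that compares the filtered stream with the raw one obtained through the `unfiltered` argument of `get_tokens`:

```python
from pygments.lexers import PythonLexer

lexer = PythonLexer()
lexer.add_filter('keywordcase', case='upper')

code = "def f():\n    pass\n"

# get_tokens() runs the registered filters by default ...
filtered = [value for _, value in lexer.get_tokens(code)]
# ... and skips them when unfiltered=True.
raw = [value for _, value in lexer.get_tokens(code, unfiltered=True)]
```

Comparing the two lists is a quick way to check what a filter chain actually changed.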

## Built-in Filters

### Code Tag Filter

Highlights special code tags in comments and docstrings.

```python { .api }
class CodeTagFilter:
    """
    Highlight code tags like TODO, FIXME in comments.

    Options:
    - codetags: List of strings to highlight (default: ['XXX', 'TODO', 'FIXME', 'BUG', 'NOTE'])
    """
```

Usage example:

```python
from pygments.filters import get_filter_by_name

# Default tags: XXX, TODO, FIXME, BUG, NOTE
codetag_filter = get_filter_by_name('codetagify')

# Custom tags
custom_filter = get_filter_by_name('codetagify',
                                   codetags=['TODO', 'HACK', 'REVIEW', 'OPTIMIZE'])
```
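To see what "highlighting" means at the token level, the following sketch runs the filter over a single commented line and collects the tokens it retagged as `Comment.Special`. The input line is an arbitrary example.

```python
from pygments.lexers import PythonLexer
from pygments.token import Comment

lexer = PythonLexer()
lexer.add_filter('codetagify', codetags=['TODO', 'HACK'])

tokens = list(lexer.get_tokens("x = 1  # TODO: tidy up\n"))

# The comment is split so that the tag becomes its own Comment.Special token.
tagged = [value for ttype, value in tokens if ttype is Comment.Special]
# tagged == ['TODO']
```

Formatters then style `Comment.Special` according to the active style, which is what makes the tag stand out.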

### Keyword Case Filter

Changes the case of language keywords.

```python { .api }
class KeywordCaseFilter:
    """
    Convert keywords to lower case, upper case, or capitalized form.

    Options:
    - case: 'lower', 'upper', or 'capitalize' (default: 'lower')
    """
```

Usage example:

```python
from pygments.filters import get_filter_by_name

# Make all keywords uppercase
upper_filter = get_filter_by_name('keywordcase', case='upper')

# Make all keywords lowercase
lower_filter = get_filter_by_name('keywordcase', case='lower')
```

### Name Highlight Filter

Highlights specific names/identifiers by giving them a different token type.

```python { .api }
class NameHighlightFilter:
    """
    Highlight specific names in the code.

    Options:
    - names: List of names to highlight (no default)
    - tokentype: Token type, or token type name, assigned to matching names (default: Name.Function)
    """
```

Usage example:

```python
from pygments.filters import get_filter_by_name

# Highlight specific function/variable names
name_filter = get_filter_by_name('highlight',
                                 names=['main', 'process_data', 'API_KEY'])
```
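A small sketch of the effect on the token stream, assuming the filter's default replacement type of `Name.Function`; the variable names in the input are arbitrary.

```python
from pygments.lexers import PythonLexer
from pygments.token import Name

lexer = PythonLexer()
lexer.add_filter('highlight', names=['total'])

tokens = list(lexer.get_tokens("total = price + tax\n"))

# Only the listed name changes type; other identifiers stay plain Name tokens.
highlighted = [value for ttype, value in tokens if ttype is Name.Function]
untouched = [value for ttype, value in tokens if ttype is Name]
```

Because only the token type changes, the visual result depends on how the chosen style renders that type.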

### Visible Whitespace Filter

Makes whitespace characters visible.

```python { .api }
class VisibleWhitespaceFilter:
    """
    Make whitespace visible by replacing it with symbols.

    Options:
    - spaces: One-character replacement for spaces, or True to use '·' (default: False)
    - tabs: One-character replacement for tabs, or True to use '»' (default: False)
    - tabsize: Tab size for display (default: 8)
    - newlines: One-character replacement for newlines, or True to use '¶' (default: False)
    - wsnl: If true, emit the replaced whitespace as Whitespace tokens so it can be styled separately (default: False)
    """
```

Usage example:

```python
from pygments.filters import get_filter_by_name

# Show spaces, tabs, and newlines, tagged as Whitespace tokens
ws_filter = get_filter_by_name('whitespace',
                               spaces='·', tabs='»', newlines='¶', wsnl=True)

# Show only spaces and tabs
ws_filter = get_filter_by_name('whitespace',
                               spaces='·', tabs='»')
```
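As a quick check of the replacement behaviour, this sketch enables only the `spaces` option and joins the filtered token values back into text. The input line is an arbitrary example.

```python
from pygments.lexers import PythonLexer

lexer = PythonLexer()
lexer.add_filter('whitespace', spaces='·')

# Join the filtered token values back into a single string.
visible = ''.join(value for _, value in lexer.get_tokens("a = 1\n"))
# visible == 'a·=·1\n'
```

Tabs and newlines are left alone here because their options were not set.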

### Gobble Filter

Removes a fixed number of leading characters from every line.

```python { .api }
class GobbleFilter:
    """
    Drop the first n characters of every line.

    Options:
    - n: Number of characters to remove from the start of each line (default: 0, which removes nothing)
    """
```

Usage example:

```python
from pygments.filters import get_filter_by_name

# Without options, n is 0 and lines pass through unchanged
gobble_filter = get_filter_by_name('gobble')

# Remove specific number of characters
gobble_filter = get_filter_by_name('gobble', n=4)
```
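The sketch below applies the filter with `n=4` to uniformly indented text. It uses `TextLexer` so that the whole input arrives as plain text; the sample text is illustrative.

```python
from pygments.lexers import TextLexer

lexer = TextLexer()
lexer.add_filter('gobble', n=4)

indented = "    first\n    second\n"

# Each line loses its first four characters, whatever they are.
dedented = ''.join(value for _, value in lexer.get_tokens(indented))
# dedented == 'first\nsecond\n'
```

The filter counts characters rather than detecting indentation, so lines with less than `n` characters of indent lose real content.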

### Token Merge Filter

Merges consecutive tokens of the same type.

```python { .api }
class TokenMergeFilter:
    """
    Merge consecutive tokens of the same type to reduce token count.
    """
```

Usage example:

```python
from pygments.filters import get_filter_by_name

merge_filter = get_filter_by_name('tokenmerge')
```
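Merging is easiest to see on a hand-built stream. This sketch calls the filter's `filter()` method directly and passes `None` for the lexer argument, on the assumption that this filter does not consult the lexer; the token values are made up.

```python
from pygments.filters import get_filter_by_name
from pygments.token import Name, Text

merge = get_filter_by_name('tokenmerge')

# Hand-built stream: two adjacent Name tokens, then two adjacent Text tokens.
stream = [(Name, 'foo'), (Name, 'bar'), (Text, ' '), (Text, '\n')]

merged = list(merge.filter(None, stream))
# merged == [(Name, 'foobar'), (Text, ' \n')]
```

Fewer, longer tokens mean less per-token overhead in the formatter, at the cost of losing the original token boundaries.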

### Raise on Error Token Filter

Raises an exception when the lexer produces an error token.

```python { .api }
class RaiseOnErrorTokenFilter:
    """
    Raise an exception on Error tokens.

    Options:
    - excclass: Exception class to raise (default: pygments.filters.ErrorToken)
    """
```

Usage example:

```python
from pygments.filters import get_filter_by_name

error_filter = get_filter_by_name('raiseonerror')
```
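A minimal sketch of catching the exception, again using a hand-built stream and `None` for the lexer argument (an assumption that holds because this filter only inspects token types). The stream contents are illustrative.

```python
from pygments.filters import ErrorToken, get_filter_by_name
from pygments.token import Error, Name

stream = [(Name, 'ok'), (Error, '$')]

# Default behaviour: ErrorToken is raised when the Error token is reached.
strict = get_filter_by_name('raiseonerror')
try:
    list(strict.filter(None, stream))
    raised_default = False
except ErrorToken:
    raised_default = True

# With excclass, the same condition raises the exception type you choose.
custom = get_filter_by_name('raiseonerror', excclass=ValueError)
try:
    list(custom.filter(None, stream))
    raised_custom = False
except ValueError:
    raised_custom = True
```

The stream is lazy, so the exception surfaces only when the tokens are consumed, for example inside `highlight()`.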

### Symbol Filter

Converts mathematical symbol commands, written in Isabelle or LaTeX notation, into the corresponding Unicode characters.

```python { .api }
class SymbolFilter:
    """
    Convert symbol commands such as \<longrightarrow> (Isabelle)
    or \longrightarrow (LaTeX) into Unicode characters.

    Options:
    - lang: Symbol language, 'isabelle' or 'latex' (default: 'isabelle')
    """
```

Usage example:

```python
from pygments.filters import get_filter_by_name

# Convert LaTeX symbol commands to Unicode
symbol_filter = get_filter_by_name('symbols', lang='latex')
```
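The following sketch feeds the filter a hand-built stream of LaTeX commands, passing `None` for the lexer argument on the assumption that the filter does not use it. In practice the tokens would come from a lexer such as the TeX lexer.

```python
from pygments.filters import get_filter_by_name
from pygments.token import Text

to_unicode = get_filter_by_name('symbols', lang='latex')

# A token is replaced only when its entire value is a known symbol command.
stream = [(Text, '\\alpha'), (Text, ' + '), (Text, '\\beta')]

converted = ''.join(value for _, value in to_unicode.filter(None, stream))
# converted == 'α + β'
```

Token types are left unchanged; only the token text is swapped.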

## Filter Usage Examples

### Basic Filter Application

```python
from pygments import highlight
from pygments.lexers import PythonLexer
from pygments.formatters import HtmlFormatter

code = '''
def process_data():
    # TODO: Optimize this function
    # FIXME: Handle edge cases
    data = "hello world"  # Some data
    return data
'''

# Create lexer and add filters
lexer = PythonLexer()
lexer.add_filter('codetagify')              # Highlight TODO, FIXME
lexer.add_filter('whitespace', spaces='·')  # Show spaces

# Highlight with filters applied
result = highlight(code, lexer, HtmlFormatter())
```

### Multiple Filters

```python
from textwrap import indent

# Chain multiple filters (reuses `code` and the imports from the previous example)
indented = indent(code, '    ')  # e.g. a snippet copied out of a nested block

lexer = PythonLexer()
lexer.add_filter('gobble', n=4)  # Drop the 4-character indent from every line
lexer.add_filter('codetagify', codetags=['TODO', 'HACK'])
lexer.add_filter('keywordcase', case='upper')
lexer.add_filter('tokenmerge')   # Merge adjacent tokens of the same type

result = highlight(indented, lexer, HtmlFormatter())
```

### Custom Filter Chain

```python
import re

from pygments.filter import Filter
from pygments.lexers import PythonLexer

class CustomFilter(Filter):
    def filter(self, lexer, stream):
        for ttype, value in stream:
            # Mask every occurrence of "secret", regardless of case
            value = re.sub('secret', '***', value, flags=re.IGNORECASE)
            yield ttype, value

# Use custom filter
lexer = PythonLexer()
lexer.add_filter(CustomFilter())
```
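For small filters, Pygments also offers the `simplefilter` decorator in `pygments.filter`, which turns a generator function into a filter class. The sketch below is one way to use it; `strip_comments` is a hypothetical example filter, not part of Pygments.

```python
from pygments.filter import simplefilter
from pygments.lexers import PythonLexer
from pygments.token import Comment

@simplefilter
def strip_comments(self, lexer, stream, options):
    """Drop every comment token (hypothetical example filter)."""
    for ttype, value in stream:
        if ttype not in Comment:
            yield ttype, value

lexer = PythonLexer()
lexer.add_filter(strip_comments())  # calling the decorated name creates an instance

text = ''.join(value for _, value in lexer.get_tokens("x = 1  # note\n"))
```

Keyword arguments passed when instantiating the filter arrive in the `options` dictionary.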

## Filter Order

Filters are applied in the order they are added to the lexer. Each filter sees the output of the one before it, so the order can change the final result:

```python
from pygments.lexers import PythonLexer

lexer = PythonLexer()

# Order matters:
lexer.add_filter('gobble', n=4)  # 1. Strip leading characters first
lexer.add_filter('codetagify')   # 2. Then highlight code tags
lexer.add_filter('tokenmerge')   # 3. Finally merge tokens
```
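A compact way to observe the ordering is to add two filters that undo each other and swap them. The helper `first_keyword` below is a hypothetical name introduced for this sketch.

```python
from pygments.lexers import PythonLexer

def first_keyword(cases):
    """Add one keywordcase filter per entry, in order, and return the first token's text."""
    lexer = PythonLexer()
    for case in cases:
        lexer.add_filter('keywordcase', case=case)
    return next(value for _, value in lexer.get_tokens("def f(): pass\n"))

upper_then_lower = first_keyword(['upper', 'lower'])  # 'def'
lower_then_upper = first_keyword(['lower', 'upper'])  # 'DEF'
```

In both cases the filter added last determines the final text, because it runs on the output of the earlier one.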

## Error Handling

- **ClassNotFound** (`pygments.util`): No filter found with the specified name
- **OptionError** (`pygments.util`): Invalid filter options provided
- **ErrorToken** (`pygments.filters`): Raised by `RaiseOnErrorTokenFilter` when the stream contains an error token, unless a different `excclass` is configured
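
`ClassNotFound` and `OptionError` can both occur while a filter is being constructed, so it helps to handle them separately. A minimal sketch; `describe` is a hypothetical helper, and the bad name and bad option value are made up.

```python
from pygments.filters import get_filter_by_name
from pygments.util import ClassNotFound, OptionError

def describe(name, **options):
    """Try to build a filter and report which kind of failure occurred, if any."""
    try:
        get_filter_by_name(name, **options)
    except ClassNotFound:
        return 'unknown filter'
    except OptionError:
        return 'bad option'
    return 'ok'

ok = describe('keywordcase', case='upper')
unknown = describe('no-such-filter')                  # hypothetical name
bad_option = describe('keywordcase', case='sideways')  # not an accepted value
```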