# Lexer Management

Functions for discovering, loading, and working with syntax lexers. Pygments includes lexers for over 500 programming languages and text formats.

## Capabilities

### Lexer Discovery by Name

Get a lexer instance by one of its aliases.

```python { .api }
def get_lexer_by_name(_alias: str, **options):
    """
    Get a lexer instance by alias with options.

    Parameters:
    - _alias: Short identifier for the lexer (e.g., 'python', 'javascript', 'c++')
    - **options: Lexer-specific options (stripnl, stripall, ensurenl, tabsize, encoding)

    Returns:
    Lexer instance configured with the given options

    Raises:
    ClassNotFound: If no lexer with that alias is found
    """
```

Usage example:

```python
from pygments.lexers import get_lexer_by_name

# Get Python lexer
python_lexer = get_lexer_by_name('python')

# Get JavaScript lexer with custom options
js_lexer = get_lexer_by_name('javascript', tabsize=4, stripall=True)

# Get C++ lexer
cpp_lexer = get_lexer_by_name('cpp')
```
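
As a small illustration of alias lookup, the sketch below assumes that `'cpp'` and `'c++'` are both registered aliases of the C++ lexer, as the examples above suggest, and that lookup ignores letter case. Each call is expected to return a fresh instance of the same class.

```python
from pygments.lexers import get_lexer_by_name

# Different aliases (and different letter case) should resolve to one lexer class.
by_cpp = get_lexer_by_name("cpp")
by_cxx = get_lexer_by_name("c++")
by_upper = get_lexer_by_name("CPP")

print(type(by_cpp).__name__)
print(type(by_cpp) is type(by_cxx) is type(by_upper))

# Every call builds a new instance, so options never leak between callers.
print(by_cpp is by_cxx)
```

Because instances are independent, per-call options such as `tabsize` only affect the lexer they were passed to.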

### Lexer Discovery by Filename

Get a lexer based on filename patterns, with optional code analysis.

```python { .api }
def get_lexer_for_filename(_fn: str, code=None, **options):
    """
    Get a lexer for a filename, optionally using code analysis.

    Parameters:
    - _fn: File name or path to determine lexer
    - code: Optional code content for disambiguation
    - **options: Lexer-specific options

    Returns:
    Lexer instance best matching the filename and code

    Raises:
    ClassNotFound: If no suitable lexer is found
    """
```

Usage example:

```python
from pygments.lexers import get_lexer_for_filename

# Get lexer based on file extension
lexer = get_lexer_for_filename('script.py')  # Returns Python lexer
lexer = get_lexer_for_filename('app.js')  # Returns JavaScript lexer
lexer = get_lexer_for_filename('main.cpp')  # Returns C++ lexer

# Use code content for disambiguation: '*.h' is claimed by several lexers
# (for example C and Objective-C), so the content helps pick one
with open('header.h', 'r') as f:
    code = f.read()
lexer = get_lexer_for_filename('header.h', code=code)
```
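
Filename lookup works on the name alone, so the file does not need to exist on disk. The following sketch wraps the call in a hypothetical helper, `lexer_for_path`, that returns `None` instead of raising when no pattern matches; the helper name and the made-up extension are illustrative assumptions.

```python
from pygments.lexers import get_lexer_for_filename
from pygments.util import ClassNotFound


def lexer_for_path(path, **options):
    """Hypothetical helper: return a lexer for ``path`` or None if nothing matches."""
    try:
        # Only the file name is inspected; the file itself is never opened.
        return get_lexer_for_filename(path, **options)
    except ClassNotFound:
        return None


python_lexer = lexer_for_path("src/app/main.py")
no_lexer = lexer_for_path("notes.zzz-not-a-real-ext")

print(python_lexer.name if python_lexer else "no match")
print(no_lexer)
```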

### Lexer Guessing

Guess the appropriate lexer based on code analysis.

```python { .api }
def guess_lexer(_text: str, **options):
    """
    Guess lexer based on text analysis using analyse_text() methods.

    Parameters:
    - _text: Source code to analyze
    - **options: Lexer-specific options

    Returns:
    Lexer instance with highest confidence score

    Raises:
    ClassNotFound: If no lexer can analyze the text
    """
```

Usage example:

```python
from pygments.lexers import guess_lexer

# Guess based on code content only
code = '''
function hello() {
    console.log("Hello, World!");
}
'''
lexer = guess_lexer(code)  # Heuristic result; not guaranteed to be the JavaScript lexer
```
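
Guessing tends to be most reliable when the text carries an explicit hint. The sketch below assumes that a `python` shebang line is enough for the Python lexer's `analyse_text()` to claim the input; for text without such hints the result should be treated as a best effort.

```python
from pygments.lexers import guess_lexer

# A shebang line gives the guessing heuristics a strong signal.
script = "#!/usr/bin/env python\nprint('Hello, World!')\n"

lexer = guess_lexer(script)
print(lexer.name, lexer.aliases)
```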

### Advanced Lexer Guessing

Combined filename and content analysis for better accuracy.

```python { .api }
def guess_lexer_for_filename(_fn: str, _text: str, **options):
    """
    Guess lexer using both filename and text analysis.

    Parameters:
    - _fn: File name for initial lexer detection
    - _text: Source code for analysis and disambiguation
    - **options: Lexer-specific options

    Returns:
    Lexer instance with highest combined confidence score

    Raises:
    ClassNotFound: If no suitable lexer is found
    """
```

Lexers can also be looked up by MIME type.

```python { .api }
def get_lexer_for_mimetype(_mime: str, **options):
    """
    Get lexer instance by MIME type.

    Parameters:
    - _mime: MIME type (e.g., 'text/x-python', 'application/javascript')
    - **options: Lexer-specific options

    Returns:
    Lexer instance for the given MIME type

    Raises:
    ClassNotFound: If no lexer handles that MIME type
    """
```

Usage example:

```python
from pygments.lexers import guess_lexer_for_filename, get_lexer_for_mimetype

# Guess using both filename and content (most accurate)
with open('script.js', 'r') as f:
    code = f.read()
lexer = guess_lexer_for_filename('script.js', code)

# Get lexer by MIME type
lexer = get_lexer_for_mimetype('text/x-python')
lexer = get_lexer_for_mimetype('application/javascript')
```
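
The two lookups can be chained. As a sketch, the hypothetical helper `lexer_for_response` below takes an HTTP-style `Content-Type` value, strips any parameters such as `charset`, tries the MIME type first, and falls back to filename plus content. The helper name and the fallback order are assumptions made for illustration, not part of Pygments.

```python
from pygments.lexers import get_lexer_for_mimetype, guess_lexer_for_filename
from pygments.util import ClassNotFound


def lexer_for_response(content_type, filename, text):
    """Hypothetical helper: prefer the MIME type, then filename plus content."""
    # "text/x-python; charset=utf-8" -> "text/x-python"
    mime = content_type.split(";", 1)[0].strip().lower()
    try:
        return get_lexer_for_mimetype(mime)
    except ClassNotFound:
        # Unknown MIME type: this call may itself raise ClassNotFound.
        return guess_lexer_for_filename(filename, text)


by_mime = lexer_for_response("text/x-python; charset=utf-8", "download", "print('hi')\n")
by_name = lexer_for_response("application/x-unknown-example", "main.py", "print('hi')\n")

print(by_mime.name, by_name.name)
```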

### Lexer Enumeration

List all available lexers with their metadata.

```python { .api }
def get_all_lexers(plugins: bool = True) -> Iterator[tuple[str, list[str], list[str], list[str]]]:
    """
    Return generator of all available lexers.

    Parameters:
    - plugins: Include plugin lexers from entry points (default True)

    Yields:
    Tuples of (name, aliases, filenames, mimetypes) for each lexer
    """
```

Usage example:

```python
from pygments.lexers import get_all_lexers

# List all lexers
for name, aliases, filenames, mimetypes in get_all_lexers():
    print(f"{name}: {', '.join(aliases)}")
    if filenames:
        print(f"  Files: {', '.join(filenames)}")
    if mimetypes:
        print(f"  MIME: {', '.join(mimetypes)}")

# Find Python-related lexers
python_lexers = [
    (name, aliases) for name, aliases, _, _ in get_all_lexers()
    if any('python' in alias.lower() for alias in aliases)
]
```
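
The enumeration data can also show why content-based disambiguation matters. The sketch below builds a reverse index from filename pattern to lexer names and collects the patterns claimed by more than one lexer; the variable names are illustrative.

```python
from collections import defaultdict

from pygments.lexers import get_all_lexers

# Map each filename pattern (e.g. "*.py") to the names of the lexers that claim it.
lexers_by_pattern = defaultdict(list)
for name, aliases, filenames, mimetypes in get_all_lexers():
    for pattern in filenames:
        lexers_by_pattern[pattern].append(name)

# Patterns shared by several lexers are the ones where passing `code` helps.
ambiguous = {p: names for p, names in lexers_by_pattern.items() if len(names) > 1}

print(lexers_by_pattern["*.py"])
print(ambiguous.get("*.h"))
```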

### Class-Only Discovery

Find lexer classes without instantiating them.

```python { .api }
def find_lexer_class(name: str):
    """
    Find lexer class by name without instantiation.

    Parameters:
    - name: Full lexer name (not alias)

    Returns:
    Lexer class or None if not found
    """
```

```python { .api }
def find_lexer_class_by_name(_alias: str):
    """
    Find lexer class by alias without instantiation.

    Parameters:
    - _alias: Lexer alias

    Returns:
    Lexer class

    Raises:
    ClassNotFound: If no lexer with that alias is found
    """
```

```python { .api }
def find_lexer_class_for_filename(_fn: str, code=None):
    """
    Find lexer class for filename without instantiation.

    Parameters:
    - _fn: File name for lexer detection
    - code: Optional code for analysis

    Returns:
    Lexer class or None if not found
    """
```
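
A short sketch of how these three functions differ in practice: lookup by alias raises on failure, while lookup by full name or by filename returns `None`. It assumes `'json'` is a registered alias; the made-up lexer name and file extension are placeholders chosen so that nothing matches.

```python
from pygments.lexers import (
    find_lexer_class,
    find_lexer_class_by_name,
    find_lexer_class_for_filename,
)

# Look up the class by alias; no lexer instance is created yet.
json_cls = find_lexer_class_by_name("json")
print(json_cls.name, json_cls.aliases, json_cls.filenames)

# find_lexer_class() expects the full name stored on the class, not an alias.
same_cls = find_lexer_class(json_cls.name)
missing_cls = find_lexer_class("No Such Lexer")

# Filename lookup returns None instead of raising when nothing matches.
unknown_cls = find_lexer_class_for_filename("data.zzz-not-a-real-ext")

# Instantiate only when a lexer is actually needed.
lexer = json_cls(stripall=True)
```

Working with classes is useful when only metadata such as `name`, `aliases`, or `filenames` is needed, since no lexer has to be constructed.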

### Custom Lexer Loading

Load custom lexers from files.

```python { .api }
def load_lexer_from_file(filename: str, lexername: str = "CustomLexer", **options):
    """
    Load custom lexer from file.

    Parameters:
    - filename: Path to Python file containing lexer class
    - lexername: Name of lexer class in file (default "CustomLexer")
    - **options: Options passed to lexer constructor

    Returns:
    Custom lexer instance

    Raises:
    ClassNotFound: If file cannot be loaded or class not found
    """
```

Usage example:

```python
from pygments.lexers import load_lexer_from_file

# Load custom lexer
custom_lexer = load_lexer_from_file('my_lexer.py', 'MyLanguageLexer')
```
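
For a self-contained view of what such a file might contain, the sketch below writes a deliberately tiny, hypothetical lexer to a temporary file and loads it. The lexer only separates `#` comments from other text and is an assumption for illustration, not a recommended grammar. Since the file is executed as Python code when loaded, only files from trusted sources should be passed to this function.

```python
import os
import tempfile

from pygments.lexers import load_lexer_from_file
from pygments.token import Comment

# Source of a minimal, hypothetical lexer: "#" comments versus everything else.
LEXER_SOURCE = '''
from pygments.lexer import RegexLexer
from pygments.token import Comment, Text

class CustomLexer(RegexLexer):
    name = "Demo"
    aliases = ["demo"]
    tokens = {
        "root": [
            (r"#.*?$", Comment.Single),
            (r"[^#]+", Text),
        ],
    }
'''

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo_lexer.py")
    with open(path, "w", encoding="utf-8") as f:
        f.write(LEXER_SOURCE)
    # lexername is omitted, so the default class name "CustomLexer" is used.
    lexer = load_lexer_from_file(path)

tokens = list(lexer.get_tokens("value # note\n"))
for token_type, value in tokens:
    print(token_type, repr(value))
```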

## Common Lexer Options

All lexers accept these standard options:

- `stripnl` (bool): Strip leading/trailing newlines (default True)
- `stripall` (bool): Strip all leading/trailing whitespace (default False)
- `ensurenl` (bool): Ensure input ends with newline (default True)
- `tabsize` (int): Tab size for expansion (default 0, no expansion)
- `encoding` (str): Input encoding for byte strings (default `'guess'`)
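
The effect of these options can be seen by joining the token values a lexer emits, which reproduces the text after preprocessing. The following sketch uses a hypothetical helper, `processed_text`, and a sample string chosen to exercise `stripnl`, `tabsize`, and `stripall`.

```python
from pygments.lexers import get_lexer_by_name


def processed_text(lexer, text):
    """Hypothetical helper: rebuild the text a lexer actually tokenized."""
    return "".join(value for _, value in lexer.get_tokens(text))


source = "\n\tx = 1  \n\n"

# Defaults: surrounding newlines are stripped, then one trailing newline is ensured.
default_text = processed_text(get_lexer_by_name("python"), source)

# tabsize=4 additionally expands the leading tab to spaces.
tabbed_text = processed_text(get_lexer_by_name("python", tabsize=4), source)

# stripall=True removes all surrounding whitespace, including the tab.
stripped_text = processed_text(get_lexer_by_name("python", stripall=True), source)

print(repr(default_text))
print(repr(tabbed_text))
print(repr(stripped_text))
```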

## Error Handling

Both exception classes are defined in `pygments.util`:

- **ClassNotFound**: No lexer found for the given criteria. `load_lexer_from_file` also raises it when the file cannot be loaded or the named class is missing
- **OptionError**: Invalid lexer options provided
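
A minimal sketch of telling lookup failures apart from option failures follows. The helper `try_lexer` and its return shape are hypothetical; the example assumes that a non-boolean string such as `'maybe'` is rejected for the boolean option `stripnl`.

```python
from pygments.lexers import get_lexer_by_name
from pygments.util import ClassNotFound, OptionError


def try_lexer(alias, **options):
    """Hypothetical helper: return (lexer, None) or (None, error message)."""
    try:
        return get_lexer_by_name(alias, **options), None
    except ClassNotFound as err:
        return None, f"unknown lexer: {err}"
    except OptionError as err:
        return None, f"bad option: {err}"


ok_lexer, ok_error = try_lexer("python")
_, alias_error = try_lexer("nope-xyz")
_, option_error = try_lexer("python", stripnl="maybe")

print(ok_error)
print(alias_error)
print(option_error)
```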