A syntax highlighting package that supports over 500 programming languages and text formats with extensive output format options
npx @tessl/cli install tessl/pypi-pygments@2.19.00
# Pygments

A comprehensive syntax highlighting package that supports over 500 programming languages and text formats. Pygments is a generic syntax highlighter designed for use in code hosting platforms, forums, wikis, and other applications requiring source code prettification.

## Package Information

- **Package Name**: Pygments
- **Language**: Python
- **Installation**: `pip install Pygments`
- **Python Version**: >=3.8
## Core Imports

```python
import pygments
```

The most common imports for highlighting code:

```python
from pygments import highlight
from pygments.lexers import get_lexer_by_name
from pygments.formatters import HtmlFormatter
```

Or use the high-level API functions directly:

```python
from pygments import lex, format, highlight
```
## Basic Usage

```python
from pygments import highlight
from pygments.lexers import PythonLexer
from pygments.formatters import HtmlFormatter

# Highlight Python code to HTML
code = '''
def hello_world():
    print("Hello, World!")
    return True
'''

# Basic highlighting
result = highlight(code, PythonLexer(), HtmlFormatter())
print(result)

# Using lexer and formatter lookup by name
from pygments.lexers import get_lexer_by_name
from pygments.formatters import get_formatter_by_name

lexer = get_lexer_by_name('python')
formatter = get_formatter_by_name('html')
result = highlight(code, lexer, formatter)
```
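By default `HtmlFormatter` emits CSS classes rather than inline styles, so the generated markup needs a stylesheet. `get_style_defs()` produces matching CSS for a given selector:

```python
from pygments.formatters import HtmlFormatter

formatter = HtmlFormatter(style='default')

# CSS rules for the token classes, scoped to the default
# `<div class="highlight">` wrapper emitted by the formatter
css = formatter.get_style_defs('.highlight')
with open('pygments.css', 'w') as f:
    f.write(css)
```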
## Architecture

Pygments follows a pipeline architecture built from four kinds of components:

- **Lexers**: Tokenize source code into semantic tokens (keywords, strings, comments, etc.)
- **Formatters**: Convert token streams into various output formats (HTML, LaTeX, RTF, SVG, terminal, etc.)
- **Styles**: Define color schemes and formatting for different token types
- **Filters**: Post-process token streams for special effects (highlighting names, visible whitespace, etc.)

The modular design allows mixing any lexer with any formatter and style, providing extensive customization while maintaining clean separation of concerns.
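For example, the same lexer (with a filter attached) can feed different formatter and style combinations; a minimal sketch:

```python
from pygments import highlight
from pygments.lexers import get_lexer_by_name
from pygments.formatters import HtmlFormatter, TerminalFormatter

lexer = get_lexer_by_name('python')
# Filters attach to the lexer and post-process its token stream;
# 'keywordcase' rewrites keywords to a fixed case.
lexer.add_filter('keywordcase', case='upper')

code = "def add(a, b):\n    return a + b\n"

# The same token stream can feed any formatter/style combination.
print(highlight(code, lexer, TerminalFormatter()))
print(highlight(code, lexer, HtmlFormatter(style='monokai')))
```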
## Capabilities

### High-Level API

Core highlighting functions that provide the most convenient interface for syntax highlighting tasks.

```python { .api }
def lex(code: str, lexer) -> Iterator[tuple[TokenType, str]]: ...
def format(tokens: Iterator[tuple[TokenType, str]], formatter, outfile=None) -> str: ...
def highlight(code: str, lexer, formatter, outfile=None) -> str: ...
```

[High-Level API](./high-level-api.md)
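`highlight()` is simply `format(lex(code, lexer), formatter)`. Running the two stages separately lets you inspect the token stream and stream output to a file-like object:

```python
import sys
from pygments import lex, format
from pygments.lexers import PythonLexer
from pygments.formatters import TerminalFormatter

code = "print('hello')\n"

# Stage 1: lex into (token_type, value) pairs
tokens = list(lex(code, PythonLexer()))
for ttype, value in tokens:
    print(ttype, repr(value))

# Stage 2: format the token stream; passing an outfile streams
# the result instead of returning a string
format(tokens, TerminalFormatter(), sys.stdout)
```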
### Lexer Management

Functions for discovering, loading, and working with syntax lexers for different programming languages and text formats.

```python { .api }
def get_lexer_by_name(_alias: str, **options): ...
def get_lexer_for_filename(_fn: str, code=None, **options): ...
def guess_lexer(_text: str, **options): ...
def get_all_lexers(plugins: bool = True) -> Iterator[tuple[str, list[str], list[str], list[str]]]: ...
```

[Lexer Management](./lexer-management.md)
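The typical lookup paths: by alias, by filename pattern, and by content sniffing. All lookup functions raise `ClassNotFound` on failure:

```python
from pygments.lexers import (
    get_lexer_by_name,
    get_lexer_for_filename,
    guess_lexer,
)
from pygments.util import ClassNotFound

lexer = get_lexer_by_name('python')             # by alias
lexer = get_lexer_for_filename('setup.py')      # by filename pattern
lexer = guess_lexer("#!/usr/bin/env python\n")  # by content analysis

try:
    get_lexer_by_name('no-such-language')
except ClassNotFound:
    pass  # no lexer registered under that alias
```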
### Formatter Management

Functions for working with output formatters that convert highlighted tokens into various formats.

```python { .api }
def get_formatter_by_name(_alias: str, **options): ...
def get_formatter_for_filename(fn: str, **options): ...
def get_all_formatters() -> Iterator[type]: ...
```

[Formatter Management](./formatter-management.md)
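Formatter options are passed as keyword arguments to the lookup functions; a short sketch:

```python
from pygments import highlight
from pygments.lexers import PythonLexer
from pygments.formatters import get_formatter_by_name, get_formatter_for_filename

# By alias, with formatter options as keyword arguments
html = get_formatter_by_name('html', linenos=True)

# By output filename extension (here: the LaTeX formatter)
latex = get_formatter_for_filename('listing.tex')

print(highlight("x = 1\n", PythonLexer(), html))
```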
### Style Management

Functions for working with color schemes and visual styles for highlighted code.

```python { .api }
def get_style_by_name(name: str): ...
def get_all_styles() -> Iterator[str]: ...
def find_plugin_styles() -> Iterator[tuple[str, type]]: ...
```

[Style Management](./style-management.md)
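Styles can be inspected directly, but are usually consumed through a formatter's `style` option:

```python
from pygments import highlight
from pygments.lexers import PythonLexer
from pygments.formatters import HtmlFormatter
from pygments.styles import get_style_by_name, get_all_styles

print(list(get_all_styles())[:5])  # registered style names

style = get_style_by_name('monokai')
print(style.background_color)      # styles expose colors per token type

# Styles are usually applied via a formatter
html = highlight("x = 1\n", PythonLexer(), HtmlFormatter(style='monokai'))
```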
### Filter System

Token stream filters for post-processing highlighted code with special effects and transformations.

```python { .api }
def get_filter_by_name(filtername: str, **options): ...
def get_all_filters() -> Iterator[str]: ...
```

[Filter System](./filter-system.md)
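Filters are attached to a lexer by name, with filter options passed as keyword arguments:

```python
from pygments import highlight
from pygments.lexers import PythonLexer
from pygments.formatters import TerminalFormatter
from pygments.filters import get_all_filters

print(list(get_all_filters()))  # names of the built-in filters

lexer = PythonLexer()
# Attach filters by name; options are forwarded to the filter
lexer.add_filter('whitespace', spaces=True)   # make spaces visible
lexer.add_filter('keywordcase', case='upper')
print(highlight("def f():\n    return 1\n", lexer, TerminalFormatter()))
```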
### Custom Lexers and Formatters

Base classes and utilities for creating custom lexers and formatters.

```python { .api }
class Lexer: ...
class RegexLexer(Lexer): ...
class Formatter: ...
class Style: ...
```

[Custom Components](./custom-components.md)
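A minimal custom lexer sketch subclassing `RegexLexer`; the language (`MyConf`) and its rules are hypothetical, but the structure (a `tokens` state machine of `(regex, token)` rules) is how `RegexLexer` subclasses are written:

```python
from pygments.lexer import RegexLexer, bygroups
from pygments.token import Comment, Keyword, Name, Operator, String, Whitespace

class MyConfLexer(RegexLexer):
    """Hypothetical lexer for a simple key = value config format."""
    name = 'MyConf'
    aliases = ['myconf']
    filenames = ['*.myconf']

    # State machine: each state maps to a list of (regex, token) rules
    tokens = {
        'root': [
            (r'\s+', Whitespace),
            (r'#.*$', Comment.Single),
            (r'\[[^\]]*\]', Keyword.Namespace),  # [section] headers
            (r'([A-Za-z_][\w.-]*)(\s*)(=)(\s*)(.*)$',
             bygroups(Name.Attribute, Whitespace, Operator, Whitespace, String)),
        ],
    }
```

An instance can then be passed anywhere a built-in lexer is accepted, e.g. `highlight(text, MyConfLexer(), HtmlFormatter())`.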
### Command Line Interface

The `pygmentize` command-line tool for syntax highlighting from the terminal.

```bash { .api }
pygmentize [options] [file]
pygmentize -l <lexer> -f <formatter> [options] [file]
pygmentize -g [options] [file] # guess lexer
```

[Command Line Interface](./command-line.md)
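A few common invocations:

```bash
# Highlight a file to a self-contained HTML page
pygmentize -l python -f html -O full,style=monokai -o hello.html hello.py

# Guess the lexer from content and colorize for the terminal
pygmentize -g script

# List available lexers; emit a stylesheet for a style
pygmentize -L lexers
pygmentize -S monokai -f html > pygments.css
```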
### Plugin System

Plugin loading and discovery system for external lexers, formatters, and styles.

```python { .api }
def find_plugin_lexers() -> Iterator[type]: ...
def find_plugin_formatters() -> Iterator[tuple[str, type]]: ...
def find_plugin_styles() -> Iterator[tuple[str, type]]: ...
def find_plugin_filters() -> Iterator[tuple[str, type]]: ...
```
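Plugins register through entry points; a sketch of enumerating whatever is installed, assuming the `(name, class)` pairs from the signatures above:

```python
from pygments.plugin import find_plugin_lexers, find_plugin_formatters

# Plugin lexers are yielded as classes
for lexer_cls in find_plugin_lexers():
    print('lexer:', lexer_cls.name)

# Plugin formatters are yielded as (entry point name, class) pairs
for name, formatter_cls in find_plugin_formatters():
    print('formatter:', name, formatter_cls)
```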
### Modeline Parsing

Editor modeline parsing for automatic lexer detection.

```python { .api }
def get_filetype_from_buffer(buf: str, max_chars: int = 1000) -> str | None: ...
```
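A short sketch, assuming the scanned text carries a Vim-style modeline; the reported filetype can then be resolved to a lexer:

```python
from pygments.modeline import get_filetype_from_buffer
from pygments.lexers import get_lexer_by_name

buf = "# vim: set ft=python :\nprint('hi')\n"
ft = get_filetype_from_buffer(buf)  # 'python', or None if no modeline found
if ft:
    lexer = get_lexer_by_name(ft)
```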
## Token Types

Core token type system used throughout Pygments for semantic categorization of code elements.

```python { .api }
class _TokenType(tuple):
    def split(self) -> list[_TokenType]: ...
    def __contains__(self, val) -> bool: ...

# Root token type
Token: _TokenType

# Common token types
Text: _TokenType
Whitespace: _TokenType
Error: _TokenType
Other: _TokenType
Keyword: _TokenType
Name: _TokenType
Literal: _TokenType
String: _TokenType
Number: _TokenType
Punctuation: _TokenType
Operator: _TokenType
Comment: _TokenType
Generic: _TokenType
```
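Token types form a hierarchy: subtypes are created by attribute access, and containment tests walk the parent chain:

```python
from pygments.token import Token, Name, String, is_token_subtype

# Attribute access creates/returns subtypes
func = Name.Function

# Aliases like String are shorthand for deeper paths
assert String is Token.Literal.String

# `in` tests subtype relationships
assert String.Double in String
assert is_token_subtype(String.Double, Token.Literal)

print(func.split())  # [Token, Token.Name, Token.Name.Function]
```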
## Utility Functions

Core utility functions for text processing, option handling, and encoding detection.

```python { .api }
def string_to_tokentype(s: str) -> _TokenType: ...
def is_token_subtype(ttype: _TokenType, other: _TokenType) -> bool: ...
def get_bool_opt(options: dict, optname: str, default=None) -> bool: ...
def get_int_opt(options: dict, optname: str, default=None) -> int: ...
def get_list_opt(options: dict, optname: str, default=None) -> list: ...
def docstring_headline(obj) -> str: ...
def make_analysator(f): ...
def shebang_matches(text: str, regex) -> bool: ...
def doctype_matches(text: str, regex) -> bool: ...
def html_doctype_matches(text: str) -> bool: ...
def looks_like_xml(text: str) -> bool: ...
def surrogatepair(c: int) -> tuple[int, int]: ...
def format_lines(var_name: str, seq, raw: bool = False, indent_level: int = 0) -> str: ...
def duplicates_removed(it, already_seen=()) -> Iterator: ...
```
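The option helpers normalize user-supplied values (e.g. strings from the command line) into Python types; a quick sketch:

```python
from pygments.util import get_bool_opt, get_int_opt, get_list_opt, OptionError

options = {'linenos': 'true', 'tabsize': '4', 'filters': 'whitespace keywordcase'}

get_bool_opt(options, 'linenos', False)  # True ('1'/'yes'/'true'/'on' accepted)
get_int_opt(options, 'tabsize', 8)       # 4
get_list_opt(options, 'filters', [])     # ['whitespace', 'keywordcase'] (split on whitespace)

try:
    get_bool_opt({'linenos': 'maybe'}, 'linenos')
except OptionError:
    pass  # unrecognized values raise OptionError
```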
## Exception Classes

```python { .api }
class ClassNotFound(ValueError):
    """Raised when lookup functions can't find a matching class."""

class OptionError(Exception):
    """Raised by option processing functions for invalid options."""
```
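A typical pattern is to catch `ClassNotFound` during lookup and fall back to the plain-text lexer:

```python
from pygments.lexers import get_lexer_by_name, TextLexer
from pygments.util import ClassNotFound

try:
    lexer = get_lexer_by_name('definitely-not-a-language')
except ClassNotFound:
    lexer = TextLexer()  # fall back to the plain-text "null" lexer
```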