
tessl/pypi-pygments

A syntax highlighting package that supports over 500 programming languages and text formats with extensive output format options

Workspace: tessl
Visibility: Public
Describes: pypipkg:pypi/pygments@2.19.x

To install, run:

npx @tessl/cli install tessl/pypi-pygments@2.19.0

# Pygments

A comprehensive syntax highlighting package that supports over 500 programming languages and text formats. Pygments is a generic syntax highlighter designed for use in code hosting platforms, forums, wikis, and other applications requiring source code prettification.

## Package Information

- **Package Name**: Pygments
- **Language**: Python
- **Installation**: `pip install Pygments`
- **Python Version**: >=3.8

## Core Imports

```python
import pygments
```

Common imports for highlighting code:

```python
from pygments import highlight
from pygments.lexers import get_lexer_by_name
from pygments.formatters import HtmlFormatter
```

Or, using the high-level API:

```python
from pygments import lex, format, highlight
```

## Basic Usage

```python
from pygments import highlight
from pygments.lexers import PythonLexer
from pygments.formatters import HtmlFormatter

# Highlight Python code to HTML
code = '''
def hello_world():
    print("Hello, World!")
    return True
'''

# Basic highlighting
result = highlight(code, PythonLexer(), HtmlFormatter())
print(result)

# Using lexer and formatter lookup by name
from pygments.lexers import get_lexer_by_name
from pygments.formatters import get_formatter_by_name

lexer = get_lexer_by_name('python')
formatter = get_formatter_by_name('html')
result = highlight(code, lexer, formatter)
```

## Architecture

Pygments follows a modular pipeline architecture built around four component types:

- **Lexers**: Tokenize source code into semantic tokens (keywords, strings, comments, etc.)
- **Formatters**: Convert token streams into various output formats (HTML, LaTeX, RTF, SVG, terminal, etc.)
- **Styles**: Define color schemes and formatting for different token types
- **Filters**: Post-process token streams for special effects (highlighting names, visible whitespace, etc.)

The modular design allows mixing any lexer with any formatter and style, providing extensive customization while maintaining clean separation of concerns.
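
For illustration, the token stream that flows from a lexer to a formatter can be inspected directly; a minimal sketch using the high-level `lex` function:

```python
from pygments import lex
from pygments.lexers import PythonLexer

# The lexer stage emits (token type, text) pairs; a formatter
# would consume this stream and render it using a style.
for ttype, value in lex('x = 1', PythonLexer()):
    print(ttype, repr(value))
# Prints pairs such as Token.Name 'x', Token.Operator '=',
# and Token.Literal.Number.Integer '1'
```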

## Capabilities

### High-Level API

Core highlighting functions that provide the most convenient interface for syntax highlighting tasks.

```python { .api }
def lex(code: str, lexer) -> Iterator[tuple[TokenType, str]]: ...
def format(tokens: Iterator[tuple[TokenType, str]], formatter, outfile=None) -> str: ...
def highlight(code: str, lexer, formatter, outfile=None) -> str: ...
```
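
`highlight` combines `lex` and `format` in one call; a minimal sketch showing that the two forms produce the same output:

```python
from pygments import lex, format, highlight
from pygments.lexers import PythonLexer
from pygments.formatters import TerminalFormatter

code = 'print("hi")'

# Two-step form: tokenize, then render the token stream
tokens = lex(code, PythonLexer())
rendered = format(tokens, TerminalFormatter())

# One-step form: highlight() chains both stages
assert rendered == highlight(code, PythonLexer(), TerminalFormatter())
```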

[High-Level API](./high-level-api.md)

### Lexer Management

Functions for discovering, loading, and working with syntax lexers for different programming languages and text formats.

```python { .api }
def get_lexer_by_name(_alias: str, **options): ...
def get_lexer_for_filename(_fn: str, code=None, **options): ...
def guess_lexer(_text: str, **options): ...
def get_all_lexers(plugins: bool = True) -> Iterator[tuple[str, list[str], list[str], list[str]]]: ...
```
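
A short sketch of the three lookup strategies (the file name and code snippet are illustrative):

```python
from pygments.lexers import (
    get_all_lexers,
    get_lexer_by_name,
    get_lexer_for_filename,
    guess_lexer,
)

lexer = get_lexer_by_name('python', stripall=True)  # by alias, with lexer options
lexer = get_lexer_for_filename('example.rb')        # by filename pattern -> Ruby
lexer = guess_lexer('#!/usr/bin/env perl\nprint "hi";\n')  # by content analysis
print(lexer.name)

# Enumerate every registered lexer
for name, aliases, filenames, mimetypes in get_all_lexers():
    print(name, aliases)
```

Each lookup raises `pygments.util.ClassNotFound` when nothing matches.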

[Lexer Management](./lexer-management.md)

### Formatter Management

Functions for working with output formatters that convert highlighted tokens into various formats.

```python { .api }
def get_formatter_by_name(_alias: str, **options): ...
def get_formatter_for_filename(fn: str, **options): ...
def get_all_formatters() -> Iterator[type]: ...
```
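
A minimal sketch of formatter lookup (the file names are illustrative):

```python
from pygments.formatters import (
    get_all_formatters,
    get_formatter_by_name,
    get_formatter_for_filename,
)

formatter = get_formatter_by_name('html', linenos=True)  # by alias, with options
formatter = get_formatter_for_filename('out.tex')        # LaTeX, from the extension

# Enumerate every available formatter class
for cls in get_all_formatters():
    print(cls.name, cls.filenames)
```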

[Formatter Management](./formatter-management.md)

### Style Management

Functions for working with color schemes and visual styles for highlighted code.

```python { .api }
def get_style_by_name(name: str): ...
def get_all_styles() -> Iterator[str]: ...
def find_plugin_styles() -> Iterator[tuple[str, type]]: ...
```
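
A minimal sketch of style lookup, here combined with `HtmlFormatter` to produce CSS (the `monokai` style ships with Pygments):

```python
from pygments.styles import get_all_styles, get_style_by_name
from pygments.formatters import HtmlFormatter

print(sorted(get_all_styles())[:5])  # first few registered style names

# A style can be passed by name or as the class itself
style = get_style_by_name('monokai')
formatter = HtmlFormatter(style=style)
print(formatter.get_style_defs('.highlight'))  # CSS rules for that style
```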

[Style Management](./style-management.md)

### Filter System

Token stream filters for post-processing highlighted code with special effects and transformations.

```python { .api }
def get_filter_by_name(filtername: str, **options): ...
def get_all_filters() -> Iterator[str]: ...
```
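
A minimal sketch: filters are attached to a lexer, which then yields the transformed token stream (the filter names and options used are built-in):

```python
from pygments import highlight
from pygments.lexers import PythonLexer
from pygments.formatters import TerminalFormatter
from pygments.filters import get_all_filters

print(list(get_all_filters()))  # names of the built-in filters

lexer = PythonLexer()
lexer.add_filter('keywordcase', case='upper')  # uppercase every keyword token
lexer.add_filter('whitespace', spaces=True)    # make spaces visible
print(highlight('def f(): pass', lexer, TerminalFormatter()))
```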

[Filter System](./filter-system.md)

### Custom Lexers and Formatters

Base classes and utilities for creating custom lexers and formatters.

```python { .api }
class Lexer: ...
class RegexLexer(Lexer): ...
class Formatter: ...
class Style: ...
```
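
A minimal sketch of a custom `RegexLexer` for a hypothetical INI-like format (the format, class name, and aliases are made up for illustration):

```python
from pygments.lexer import RegexLexer, bygroups
from pygments.token import Comment, Keyword, Name, Operator, String, Whitespace

class IniLikeLexer(RegexLexer):
    """Hypothetical lexer for a simple INI-style format."""
    name = 'IniLike'
    aliases = ['inilike']
    filenames = ['*.inilike']

    # A RegexLexer is a state machine: each state maps regexes to token types
    tokens = {
        'root': [
            (r'\s+', Whitespace),
            (r';.*', Comment.Single),
            (r'\[[^\]]*\]', Keyword),
            (r'([^=\n]+)(=)(.*)', bygroups(Name.Attribute, Operator, String)),
        ],
    }
```

An instance works anywhere a built-in lexer does, e.g. `highlight(text, IniLikeLexer(), HtmlFormatter())`.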

[Custom Components](./custom-components.md)

### Command Line Interface

The `pygmentize` command-line tool for syntax highlighting from the terminal.

```bash { .api }
pygmentize [options] [file]
pygmentize -l <lexer> -f <formatter> [options] [file]
pygmentize -g [options] [file]  # guess lexer
```

[Command Line Interface](./command-line.md)

### Plugin System

Plugin loading and discovery system for external lexers, formatters, and styles.

```python { .api }
def find_plugin_lexers() -> Iterator[type]: ...
def find_plugin_formatters() -> Iterator[tuple[str, type]]: ...
def find_plugin_styles() -> Iterator[tuple[str, type]]: ...
def find_plugin_filters() -> Iterator[tuple[str, type]]: ...
```
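
A minimal sketch of plugin discovery (note the differing yield shapes, per the signatures above):

```python
from pygments.plugin import find_plugin_lexers, find_plugin_formatters

# Lexer plugins yield the class itself...
for cls in find_plugin_lexers():
    print(cls.name, cls.aliases)

# ...while formatter, style, and filter plugins yield (name, class) pairs
for name, cls in find_plugin_formatters():
    print(name, cls)
```

With no plugin packages installed, both loops simply produce nothing.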


### Modeline Parsing

Editor modeline parsing for automatic lexer detection.

```python { .api }
def get_filetype_from_buffer(buf: str, max_lines: int = 5) -> str | None: ...
```
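
A minimal sketch: the function scans the first and last few lines of a buffer for a Vim-style modeline:

```python
from pygments.modeline import get_filetype_from_buffer

source = 'print("hello")\n# vim: set ft=python:\n'
print(get_filetype_from_buffer(source))  # 'python'
```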


## Token Types

Core token type system used throughout Pygments for semantic categorization of code elements.

```python { .api }
class _TokenType(tuple):
    def split(self) -> list[_TokenType]: ...
    def __contains__(self, val) -> bool: ...

# Root token type
Token: _TokenType

# Common token types
Text: _TokenType
Whitespace: _TokenType
Error: _TokenType
Other: _TokenType
Keyword: _TokenType
Name: _TokenType
Literal: _TokenType
String: _TokenType
Number: _TokenType
Punctuation: _TokenType
Operator: _TokenType
Comment: _TokenType
Generic: _TokenType
```
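
Token types form a hierarchy; a minimal sketch of subtype checks:

```python
from pygments.token import Keyword, Literal, String, is_token_subtype

# String is an alias for Token.Literal.String, so it is a subtype of Literal
print(String in Literal)                  # True
print(Keyword.Constant in Keyword)        # True
print(is_token_subtype(String, Keyword))  # False
```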


## Utility Functions

Core utility functions for text processing, option handling, and encoding detection.

```python { .api }
def string_to_tokentype(s: str) -> _TokenType: ...
def is_token_subtype(ttype: _TokenType, other: _TokenType) -> bool: ...
def get_bool_opt(options: dict, optname: str, default=None) -> bool: ...
def get_int_opt(options: dict, optname: str, default=None) -> int: ...
def get_list_opt(options: dict, optname: str, default=None) -> list: ...
def docstring_headline(obj) -> str: ...
def make_analysator(f): ...
def shebang_matches(text: str, regex) -> bool: ...
def doctype_matches(text: str, regex) -> bool: ...
def html_doctype_matches(text: str) -> bool: ...
def looks_like_xml(text: str) -> bool: ...
def surrogatepair(c: int) -> tuple[int, int]: ...
def format_lines(var_name: str, seq, raw: bool = False, indent_level: int = 0) -> str: ...
def duplicates_removed(it, already_seen=()) -> Iterator: ...
```
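
A minimal sketch of the option helpers, which coerce string-valued options (e.g. from the command line) to the expected type:

```python
from pygments.util import get_bool_opt, get_int_opt, get_list_opt

options = {'linenos': 'true', 'tabsize': '8', 'extras': 'foo bar'}
print(get_bool_opt(options, 'linenos', False))  # True
print(get_int_opt(options, 'tabsize', 4))       # 8
print(get_list_opt(options, 'extras', []))      # ['foo', 'bar']
```

Invalid values raise `OptionError` (see Exception Classes below).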


## Exception Classes

```python { .api }
class ClassNotFound(ValueError):
    """Raised when lookup functions can't find a matching class."""

class OptionError(Exception):
    """Raised by option processing functions for invalid options."""
```
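
A minimal sketch of handling a failed lookup (the fallback helper is hypothetical):

```python
from pygments.lexers import TextLexer, get_lexer_by_name
from pygments.util import ClassNotFound

def lexer_or_plain(alias: str):
    """Return the requested lexer, falling back to plain text."""
    try:
        return get_lexer_by_name(alias)
    except ClassNotFound:
        return TextLexer()

print(lexer_or_plain('no-such-language').name)  # 'Text only'
```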