# Core Transformation Engine

Core functionality for applying syntax transformations through plugin and token systems. The engine operates in two phases: plugin-based AST transformations followed by token-level fixes.

## Capabilities

### Plugin-Based Transformations

Apply all registered plugin transformations to source code through AST analysis.

```python { .api }
def _fix_plugins(contents_text: str, settings: Settings) -> str:
    """
    Apply all plugin-based AST transformations to source code.

    Args:
        contents_text: Python source code to transform
        settings: Configuration settings for transformations

    Returns:
        Transformed source code with plugin fixes applied

    Notes:
        - Returns original code if syntax errors occur
        - Applies token fixup for DEDENT/UNIMPORTANT_WS ordering
        - Processes callbacks in reverse token order for correct offsets
    """
```
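
The reverse-order note can be illustrated without pyupgrade itself. The sketch below is a simplified stand-in, not the library's code: tokens are plain strings rather than tokenize-rt tokens, and `apply_callbacks`, `drop_token`, and `wrap_in_parens` are hypothetical names. It shows why visiting positions from last to first keeps the remaining indices valid when a callback grows or shrinks the list.

```python
from typing import Callable

# Hypothetical stand-in: tokens are plain strings, and callbacks are keyed
# by the index of the token they want to rewrite.
Callback = Callable[[int, list[str]], None]


def drop_token(i: int, tokens: list[str]) -> None:
    """Delete the token at index i (shrinks the list)."""
    del tokens[i]


def wrap_in_parens(i: int, tokens: list[str]) -> None:
    """Surround the token at index i with parentheses (grows the list)."""
    tokens[i:i + 1] = ['(', tokens[i], ')']


def apply_callbacks(
        tokens: list[str],
        callbacks: dict[int, Callback],
) -> list[str]:
    """Run callbacks from the last index to the first.

    An edit at index i only shifts positions after i, so the indices that
    are still waiting to be visited stay valid.
    """
    tokens = list(tokens)
    for i in reversed(range(len(tokens))):
        callback = callbacks.get(i)
        if callback is not None:
            callback(i, tokens)
    return tokens


result = apply_callbacks(
    ['a', '+', 'b', '+', 'c'],
    {0: wrap_in_parens, 2: drop_token},
)
print(result)
```

Running the same callbacks front to back would apply `wrap_in_parens` first and shift `'b'` away from index 2, so the second callback would hit the wrong token.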

### Token-Level Transformations

Apply token-level transformations for string literals, parentheses, and format strings.

```python { .api }
def _fix_tokens(contents_text: str) -> str:
    """
    Apply token-level transformations to source code.

    Args:
        contents_text: Python source code to transform

    Returns:
        Transformed source code with token fixes applied

    Transformations:
        - Fix escape sequences in string literals
        - Remove 'u' prefix from Unicode strings
        - Remove extraneous parentheses
        - Simplify format string literals
        - Convert string.encode() to binary literals
        - Remove encoding cookies from file headers
    """
```
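
Most of the listed transformations have their own section; encoding-cookie removal does not, so here is a rough sketch of the idea. It is an assumption-laden simplification: it works on lines rather than tokens, `strip_encoding_cookie` is a hypothetical name, and the regex is the encoding-declaration pattern given in PEP 263 rather than anything taken from this package.

```python
import re

# PEP 263 pattern for an encoding declaration comment.
COOKIE_RE = re.compile(r'^[ \t\f]*#.*?coding[:=][ \t]*([-\w.]+)')


def strip_encoding_cookie(source: str) -> str:
    """Drop a PEP 263 encoding comment found on line 1 or 2."""
    lines = source.splitlines(keepends=True)
    for index, line in enumerate(lines[:2]):
        if COOKIE_RE.match(line):
            del lines[index]
            break
    return ''.join(lines)


before = '#!/usr/bin/env python\n# -*- coding: utf-8 -*-\nprint("hi")\n'
after = strip_encoding_cookie(before)
print(after)
```

Only the first two lines are inspected because PEP 263 does not recognise a declaration anywhere else in the file.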

### Utility Functions

Core utility functions used throughout the transformation process.

```python { .api }
def inty(s: str) -> bool:
    """
    Check if string represents an integer.

    Args:
        s: String to check

    Returns:
        True if string can be converted to int, False otherwise

    Notes:
        Uses try/except to handle ValueError and TypeError gracefully
    """
```
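
A minimal sketch consistent with that docstring might look like the following. It is written from the description alone and is not necessarily the library's exact source.

```python
def inty(s: str) -> bool:
    """Return True when int(s) succeeds, False otherwise."""
    try:
        int(s)
    except (ValueError, TypeError):
        # ValueError: not an integer literal; TypeError: not a string at all
        return False
    return True


print(inty('12'), inty('1.5'), inty(''))
```

Catching `TypeError` as well as `ValueError` means a `None` argument is reported as "not an integer" instead of raising.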

### Configuration Settings

Configuration object controlling transformation behavior.

```python { .api }
class Settings(NamedTuple):
    """
    Configuration settings for pyupgrade transformations.

    Attributes:
        min_version: Minimum Python version tuple (e.g., (3, 10))
        keep_percent_format: Preserve %-style format strings
        keep_mock: Preserve mock imports instead of unittest.mock
        keep_runtime_typing: Preserve typing imports at runtime
    """
    min_version: Version = (3,)
    keep_percent_format: bool = False
    keep_mock: bool = False
    keep_runtime_typing: bool = False
```
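
To show how a `NamedTuple` of this shape behaves, here is a self-contained stand-in. `SketchSettings` is a hypothetical copy of the fields above, and the `Version` alias is assumed to be a tuple of integers; neither is imported from pyupgrade.

```python
from typing import NamedTuple

Version = tuple[int, ...]  # assumed alias for version tuples


class SketchSettings(NamedTuple):
    """Hypothetical stand-in mirroring the documented fields and defaults."""
    min_version: Version = (3,)
    keep_percent_format: bool = False
    keep_mock: bool = False
    keep_runtime_typing: bool = False


defaults = SketchSettings()
# NamedTuples are immutable; _replace returns a modified copy.
py310 = defaults._replace(min_version=(3, 10))

# Tuples compare element by element, so (3,) < (3, 8) < (3, 10).
print(defaults.min_version >= (3, 8))
print(py310.min_version >= (3, 8))
```

Because version tuples compare lexicographically, a check such as `min_version >= (3, 8)` is a natural way to gate a transformation on the target Python version.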

### Token Ordering Fix

Fix misordered DEDENT and UNIMPORTANT_WS tokens from tokenize-rt.

```python { .api }
def _fixup_dedent_tokens(tokens: list[Token]) -> None:
    """
    Fix misordered DEDENT/UNIMPORTANT_WS tokens.

    Args:
        tokens: Token list to fix in-place

    Notes:
        Addresses tokenize-rt issue where DEDENT and UNIMPORTANT_WS
        tokens appear in wrong order in certain indentation patterns.
    """
```
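
The in-place fix can be sketched as a simple swap. This assumes the misordering takes the form of an `UNIMPORTANT_WS` token appearing immediately before a `DEDENT`; `Tok` is a hypothetical two-field stand-in for a tokenize-rt token, and `fixup_dedent_sketch` is an illustrative name.

```python
from typing import NamedTuple


class Tok(NamedTuple):
    """Hypothetical stand-in for a tokenize-rt token (name and source only)."""
    name: str
    src: str


def fixup_dedent_sketch(tokens: list[Tok]) -> None:
    """Swap each UNIMPORTANT_WS token that is directly followed by a DEDENT."""
    for i in range(len(tokens) - 1):
        if (
                tokens[i].name == 'UNIMPORTANT_WS' and
                tokens[i + 1].name == 'DEDENT'
        ):
            tokens[i], tokens[i + 1] = tokens[i + 1], tokens[i]


tokens = [
    Tok('NAME', 'pass'),
    Tok('NEWLINE', '\n'),
    Tok('UNIMPORTANT_WS', '    '),
    Tok('DEDENT', ''),
    Tok('NAME', 'else'),
]
fixup_dedent_sketch(tokens)
print([t.name for t in tokens])
```

After the swap the whitespace sits directly in front of the code it indents, which is the order later token edits would expect.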

## String Literal Processing

### Escape Sequence Constants

Constants used for validating and processing escape sequences in string literals.

```python { .api }
ESCAPE_STARTS: frozenset[str]
"""
Valid escape sequence starting characters.

Contains:
    - Newline characters: '\n', '\r'
    - Backslash and quote characters: '\\', "'", '"'
    - Named escapes: 'a', 'b', 'f', 'n', 'r', 't', 'v'
    - Octal digits: '0'-'7'
    - Hex escape: 'x'
"""

ESCAPE_RE: re.Pattern[str]
"""Regex pattern for matching escape sequences (r'\\.' with re.DOTALL)."""

NAMED_ESCAPE_NAME: re.Pattern[str]
"""Regex pattern for matching named Unicode escapes ('{[^}]+}')."""
```
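
To make the role of these constants concrete, the sketch below rebuilds them from the descriptions above and runs them over two literal bodies. The values are assumptions derived from this page, not imports from the package, and the braces in the named-escape pattern are backslash-escaped here only for readability.

```python
import re

# Values assumed from the descriptions above.
ESCAPE_STARTS = frozenset((
    '\n', '\r', '\\', "'", '"', 'a', 'b', 'f', 'n', 'r', 't', 'v',
    '0', '1', '2', '3', '4', '5', '6', '7',  # octal escapes
    'x',  # hex escapes
))
ESCAPE_RE = re.compile(r'\\.', re.DOTALL)
NAMED_ESCAPE_NAME = re.compile(r'\{[^}]+\}')

# Body of a literal as it appears in source (backslashes intact).
body = r'\d+\n'
for match in ESCAPE_RE.finditer(body):
    char = match.group()[1]
    print(match.group(), 'valid' if char in ESCAPE_STARTS else 'invalid')

# After a \N escape, the braced name is matched starting at the match end.
named = r'\N{BULLET}'
escape = ESCAPE_RE.search(named)
print(bool(NAMED_ESCAPE_NAME.match(named, escape.end())))
```

`re.DOTALL` matters because a backslash followed by a newline is itself a valid escape (a line continuation inside the literal), and `.` would not match the newline otherwise.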

### Escape Sequence Fixes

Fix invalid escape sequences in string literals.

```python { .api }
def _fix_escape_sequences(token: Token) -> Token:
    """
    Fix invalid escape sequences in string token.

    Args:
        token: String token to process

    Returns:
        Token with fixed escape sequences

    Logic:
        - Skips raw strings and strings without backslashes
        - Validates escape sequences against Python standards
        - Adds raw prefix if only invalid escapes found
        - Escapes invalid sequences if valid ones also present
    """
```
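
The four logic steps can be sketched on a `(prefix, rest)` pair instead of a token. This is a simplified illustration under stated assumptions: `fix_escapes_sketch` is a hypothetical name, the constants are rebuilt from the descriptions on this page, and `\u`, `\U`, and `\N` are treated as valid for any non-bytes literal without checking what follows them.

```python
import re

ESCAPE_STARTS = frozenset('\n\r\\\'"abfnrtv01234567x')
ESCAPE_RE = re.compile(r'\\.', re.DOTALL)


def fix_escapes_sketch(prefix: str, rest: str) -> str:
    """Return rewritten literal source; `rest` includes the quotes."""
    lowered = prefix.lower()
    # Step 1: raw strings and strings without backslashes are left alone.
    if 'r' in lowered or '\\' not in rest:
        return prefix + rest

    # Step 2: decide which escape characters count as valid.
    if 'b' in lowered:
        valid_starts = ESCAPE_STARTS
    else:
        valid_starts = ESCAPE_STARTS | set('uUN')  # simplification

    def is_valid(match: re.Match[str]) -> bool:
        return match.group()[1] in valid_starts

    matches = list(ESCAPE_RE.finditer(rest))
    has_valid = any(is_valid(m) for m in matches)
    has_invalid = any(not is_valid(m) for m in matches)

    if not has_invalid:
        return prefix + rest
    if has_valid:
        # Step 4: mixed case, double the backslash of invalid escapes only.
        return prefix + ESCAPE_RE.sub(
            lambda m: m.group() if is_valid(m) else '\\' + m.group(),
            rest,
        )
    # Step 3: only invalid escapes, so a raw prefix preserves the text.
    return prefix + 'r' + rest


print(fix_escapes_sketch('', r"'\d+'"))
print(fix_escapes_sketch('', r"'\d\n'"))
```

A raw prefix is only safe when no valid escapes are present, since it would also turn a sequence like `\n` into a literal backslash and `n`.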

### Unicode Prefix Removal

Remove unnecessary 'u' prefixes from Unicode string literals.

```python { .api }
def _remove_u_prefix(token: Token) -> Token:
    """
    Remove 'u' prefix from Unicode string literals.

    Args:
        token: String token to process

    Returns:
        Token with 'u'/'U' prefixes removed
    """
```
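
As an illustration, the same idea applied to the source text of a literal rather than to a token could look like this. Both helper names are hypothetical, and the input is assumed to be a well-formed string literal.

```python
def split_prefix(literal: str) -> tuple[str, str]:
    """Split a string literal's source into (prefix, quoted rest)."""
    quote_index = min(
        index
        for index in (literal.find("'"), literal.find('"'))
        if index != -1
    )
    return literal[:quote_index], literal[quote_index:]


def remove_u_prefix_sketch(literal: str) -> str:
    """Strip 'u'/'U' from the prefix only, never from the string body."""
    prefix, rest = split_prefix(literal)
    if 'u' not in prefix.lower():
        return literal
    return prefix.replace('u', '').replace('U', '') + rest


print(remove_u_prefix_sketch("u'text'"))
print(remove_u_prefix_sketch("'unit'"))
```

Splitting at the first quote character is what keeps a `u` inside the string body from being touched.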

## Parentheses and Format Processing

### Extraneous Parentheses Removal

Remove unnecessary parentheses around expressions.

```python { .api }
def _fix_extraneous_parens(tokens: list[Token], i: int) -> None:
    """
    Remove extraneous parentheses around expressions.

    Args:
        tokens: Token list to modify in-place
        i: Index of opening parenthesis token

    Notes:
        - Preserves tuple syntax (checks for commas)
        - Preserves yield expressions (checks for yield)
        - Only removes truly redundant parentheses
    """
```
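
A rough sketch of the checks in those notes follows, under several assumptions: tokens are plain strings with no whitespace or comment tokens, brackets are balanced, `i` points at the outer `(` of a doubled pair, and `fix_extraneous_parens_sketch` is a hypothetical name.

```python
OPENING, CLOSING = frozenset('([{'), frozenset(')]}')


def fix_extraneous_parens_sketch(tokens: list[str], i: int) -> None:
    """Remove a doubled pair of parentheses, given the outer '(' index."""
    start = i + 1
    if start >= len(tokens) or tokens[start] != '(':
        return  # the outer paren is not immediately followed by another

    depth = 1
    j = start
    while depth:
        j += 1
        if depth == 1 and tokens[j] in {',', 'yield'}:
            return  # tuple or yield expression: the parens are meaningful
        if tokens[j] in OPENING:
            depth += 1
        elif tokens[j] in CLOSING:
            depth -= 1
    end = j

    if end == start + 1:
        return  # `(())` holds an empty tuple

    # Redundant only when the inner ')' is directly followed by the outer ')'.
    if end + 1 < len(tokens) and tokens[end + 1] == ')':
        del tokens[end]  # delete the later index first
        del tokens[start]


call = ['print', '(', '(', '"hi"', ')', ')']
fix_extraneous_parens_sketch(call, 1)
print(call)
```

Deleting the closing parenthesis before the opening one avoids having to adjust `end` after the list shrinks.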

### Format String Simplification

Simplify format string literals by removing redundant format keys.

```python { .api }
def _fix_format_literal(tokens: list[Token], end: int) -> None:
    """
    Simplify format string literals.

    Args:
        tokens: Token list to modify in-place
        end: Index of the last string token before the .format call

    Logic:
        - Removes positional format keys (0, 1, 2, ...)
        - Only processes sequential numeric keys
        - Skips f-strings and malformed format strings
    """
```
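
The key-removal rule can be illustrated on a template string using the standard library's `string.Formatter`. This is a sketch of the rule only, not of the token handling: `simplify_format_sketch` is a hypothetical name, and it operates on the template's value rather than on source tokens.

```python
from string import Formatter


def simplify_format_sketch(template: str) -> str:
    """Drop positional keys when they count up from 0 in order.

    Returns the template unchanged if any field is not the next
    expected integer, or if the template cannot be parsed.
    """
    try:
        parsed = list(Formatter().parse(template))
    except ValueError:
        return template  # malformed template, leave it alone

    pieces = []
    expected = 0
    for literal, field, spec, conversion in parsed:
        # parse() unescapes doubled braces, so escape them again.
        pieces.append(literal.replace('{', '{{').replace('}', '}}'))
        if field is None:
            continue  # literal text with no replacement field
        if not field.isdecimal() or int(field) != expected or '{' in spec:
            return template
        expected += 1
        suffix = f'!{conversion}' if conversion else ''
        if spec:
            suffix += f':{spec}'
        pieces.append('{' + suffix + '}')
    return ''.join(pieces)


print(simplify_format_sketch('{0} and {1:>5}'))
print(simplify_format_sketch('{1} then {0}'))
```

Requiring the keys to appear as 0, 1, 2, ... in order is what makes the rewrite safe: automatic numbering assigns arguments in exactly that sequence.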

### String Encoding to Binary

Convert string.encode() calls to binary string literals.

```python { .api }
def _fix_encode_to_binary(tokens: list[Token], i: int) -> None:
    """
    Convert string.encode() to binary literals.

    Args:
        tokens: Token list to modify in-place
        i: Index of 'encode' token

    Supported encodings:
        - ASCII, UTF-8: Full conversion
        - ISO-8859-1: Latin-1 compatible conversion
        - Skips non-ASCII or complex escape sequences
    """
```
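
The eligibility rules can be sketched on the source text of a single literal plus an encoding name. This is an assumption-based illustration: `is_codec` and `encode_to_bytes_sketch` are hypothetical names, "complex escape sequences" is interpreted here as `\u`, `\U`, `\N`, and (outside Latin-1) `\x`, and the real function works on tokens and also removes the `.encode(...)` call itself.

```python
import codecs
import re

ESCAPE_RE = re.compile(r'\\.', re.DOTALL)


def is_codec(name: str, target: str) -> bool:
    """True when `name` is an alias of the `target` codec."""
    try:
        return codecs.lookup(name).name == target
    except LookupError:
        return False


def encode_to_bytes_sketch(literal: str, encoding: str = 'utf-8') -> str:
    """Return bytes-literal source for `literal`, or `literal` unchanged."""
    if is_codec(encoding, 'ascii') or is_codec(encoding, 'utf-8'):
        latin1_ok = False
    elif is_codec(encoding, 'iso8859-1'):
        latin1_ok = True  # \xNN maps to the same byte under Latin-1
    else:
        return literal

    quote = min(i for i in (literal.find("'"), literal.find('"')) if i != -1)
    prefix, rest = literal[:quote], literal[quote:]
    escapes = set(ESCAPE_RE.findall(rest))
    if (
            not rest.isascii() or
            'f' in prefix.lower() or
            escapes & {'\\u', '\\U', '\\N'} or
            ('\\x' in escapes and not latin1_ok)
    ):
        return literal
    return 'b' + prefix.replace('u', '').replace('U', '') + rest


print(encode_to_bytes_sketch("'abc'"))
print(encode_to_bytes_sketch(r"'\xe9'", 'latin-1'))
```

The `\x` case shows why the encoding matters: `'\xe9'.encode('latin-1')` is the single byte `b'\xe9'`, while the UTF-8 encoding of the same character is two bytes, so only the Latin-1 call can be rewritten as a literal.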

## Usage Examples

### Basic Transformation

```python
from pyupgrade._main import _fix_plugins, _fix_tokens
from pyupgrade._data import Settings

# Apply both transformation phases
source = "set([1, 2, 3])"
settings = Settings(min_version=(3, 8))

# Phase 1: Plugin transformations
transformed = _fix_plugins(source, settings)
# Result: "{1, 2, 3}"

# Phase 2: Token transformations
final = _fix_tokens(transformed)
```

### Custom Settings

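```python
from pyupgrade._main import _fix_plugins
from pyupgrade._data import Settings

# Configure for Python 3.10+ with format preservation
settings = Settings(
    min_version=(3, 10),
    keep_percent_format=True,
    keep_mock=True,
    keep_runtime_typing=False,
)

# Placeholder input; substitute the source text you want to transform
source_code = "print('%s' % (name,))\n"
transformed = _fix_plugins(source_code, settings)
```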