tessl/pypi-ply

Python implementation of lex and yacc parsing tools with LALR(1) algorithm and zero dependencies

Workspace: tessl
Visibility: Public
Describes: pypipkg:pypi/ply@2022.10.x

To install, run

npx @tessl/cli install tessl/pypi-ply@2022.10.0

# PLY (Python Lex-Yacc)

PLY is a pure Python implementation of the popular Unix parsing tools lex and yacc. It provides a complete framework for building lexical analyzers and parsers using the LALR(1) parsing algorithm, designed for creating compilers, interpreters, protocol decoders, and other language processing tools.

## Package Information

- **Package Name**: ply
- **Language**: Python
- **Installation**: Copy directly from GitHub (no longer distributed via PyPI)
- **Repository**: https://github.com/dabeaz/ply
- **Version**: 2022.10.27

## Core Imports

```python
import ply.lex as lex
import ply.yacc as yacc
```

Alternative import patterns:

```python
from ply import lex
from ply import yacc
```

## Basic Usage

```python
import ply.lex as lex
import ply.yacc as yacc

# Define tokens for lexical analysis
tokens = (
    'NAME',
    'NUMBER',
    'PLUS',
    'MINUS',
    'TIMES',
    'DIVIDE',
    'LPAREN',
    'RPAREN',
)

# Token rules
t_PLUS = r'\+'
t_MINUS = r'-'
t_TIMES = r'\*'
t_DIVIDE = r'/'
t_LPAREN = r'\('
t_RPAREN = r'\)'
t_ignore = ' \t'

def t_NAME(t):
    r'[a-zA-Z_][a-zA-Z_0-9]*'
    return t

def t_NUMBER(t):
    r'\d+'
    t.value = int(t.value)
    return t

def t_newline(t):
    r'\n+'
    t.lexer.lineno += len(t.value)

def t_error(t):
    print(f"Illegal character '{t.value[0]}'")
    t.lexer.skip(1)

# Build the lexer
lexer = lex.lex()

# Define grammar rules for parsing
def p_expression_binop(p):
    '''expression : expression PLUS term
                  | expression MINUS term'''
    if p[2] == '+':
        p[0] = p[1] + p[3]
    elif p[2] == '-':
        p[0] = p[1] - p[3]

def p_expression_term(p):
    '''expression : term'''
    p[0] = p[1]

def p_term_binop(p):
    '''term : term TIMES factor
            | term DIVIDE factor'''
    if p[2] == '*':
        p[0] = p[1] * p[3]
    elif p[2] == '/':
        p[0] = p[1] / p[3]

def p_term_factor(p):
    '''term : factor'''
    p[0] = p[1]

def p_factor_num(p):
    '''factor : NUMBER'''
    p[0] = p[1]

def p_factor_expr(p):
    '''factor : LPAREN expression RPAREN'''
    p[0] = p[2]

def p_error(p):
    if p:
        print(f"Syntax error at token {p.type}")
    else:
        print("Syntax error at EOF")

# Build the parser
parser = yacc.yacc()

# Parse input
result = parser.parse("3 + 4 * 2", lexer=lexer)
print(f"Result: {result}")  # Output: Result: 11
```

## Architecture

PLY follows the traditional Unix lex/yacc design with two separate but coordinated phases:

- **Lexical Analysis (`lex`)**: Converts raw text into tokens using regular expressions and state machines
- **Syntax Analysis (`yacc`)**: Parses token streams into structured data using LALR(1) grammar rules
- **Convention-based API**: Uses function/variable naming patterns for automatic rule discovery
- **Error Recovery**: Comprehensive error handling and recovery mechanisms for both phases

The design emphasizes simplicity and educational value while providing production-ready parsing capabilities.
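Because the two phases meet only at the token stream, the parser will consume tokens from any object exposing a `token()` method, not just a `lex`-built lexer. A minimal sketch of that decoupling (the toy grammar, `FakeLexer` class, and `make_token` helper are hypothetical, invented for illustration):

```python
import ply.lex as lex
import ply.yacc as yacc

# Toy grammar: left-recursive sums of numbers.
tokens = ('NUMBER', 'PLUS')

def p_expr_plus(p):
    '''expr : expr PLUS NUMBER'''
    p[0] = p[1] + p[3]

def p_expr_num(p):
    '''expr : NUMBER'''
    p[0] = p[1]

def p_error(p):
    print("Syntax error")

parser = yacc.yacc(debug=False)

# A hand-rolled token source: the parser only ever calls token() on it,
# so the lexing phase can be replaced wholesale.
class FakeLexer:
    def __init__(self, toks):
        self._toks = iter(toks)
    def token(self):
        return next(self._toks, None)  # None signals end of input

def make_token(type_, value):
    t = lex.LexToken()
    t.type, t.value, t.lineno, t.lexpos = type_, value, 0, 0
    return t

stream = [make_token('NUMBER', 1), make_token('PLUS', '+'), make_token('NUMBER', 2)]
result = parser.parse(lexer=FakeLexer(stream))
print(result)  # 3
```

This is the same token-stream contract the Basic Usage example satisfies implicitly when it passes a real lexer to `parser.parse`.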

## Capabilities

### Lexical Analysis

Tokenizes input text using regular expressions and finite state machines. Supports multiple lexer states, line tracking, error handling, and flexible token rules defined through naming conventions.

```python { .api }
def lex(*, module=None, object=None, debug=False, reflags=int(re.VERBOSE), debuglog=None, errorlog=None): ...
def TOKEN(r): ...
def runmain(lexer=None, data=None): ...

class Lexer:
    def input(self, s): ...
    def token(self): ...
    def clone(self, object=None): ...
    def begin(self, state): ...
    def push_state(self, state): ...
    def pop_state(self): ...
    def current_state(self): ...
    def skip(self, n): ...
    def __iter__(self): ...
    def __next__(self): ...
    lineno: int
    lexpos: int

class LexToken:
    type: str
    value: any
    lineno: int
    lexpos: int
```

[Lexical Analysis](./lexical-analysis.md)
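The `Lexer` and `LexToken` interfaces can be exercised standalone, without a parser. A minimal sketch of tokenization on its own (the three-token language here is hypothetical):

```python
import ply.lex as lex

tokens = ('WORD', 'NUMBER', 'EQUALS')

t_WORD = r'[a-zA-Z]+'
t_EQUALS = r'='
t_ignore = ' \t'  # characters silently skipped between tokens

def t_NUMBER(t):
    r'\d+'
    t.value = int(t.value)  # convert matched text in the rule itself
    return t

def t_error(t):
    t.lexer.skip(1)  # skip illegal characters one at a time

lexer = lex.lex()
lexer.input("answer = 42")

# The lexer is iterable; each item is a LexToken carrying
# type, value, lineno, and lexpos.
pairs = [(tok.type, tok.value) for tok in lexer]
print(pairs)  # [('WORD', 'answer'), ('EQUALS', '='), ('NUMBER', 42)]
```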

### Syntax Parsing

Parses token streams using the LALR(1) algorithm, with grammar rules defined in function docstrings. Supports precedence rules, error recovery, debugging, and ambiguity resolution.

```python { .api }
def yacc(*, debug=False, module=None, start=None, check_recursion=True, optimize=False, debugfile='parser.out', debuglog=None, errorlog=None): ...
def format_result(r): ...
def format_stack_entry(r): ...

class LRParser:
    def parse(self, input=None, lexer=None, debug=False, tracking=False): ...
    def errok(self): ...
    def restart(self): ...
    def set_defaulted_states(self): ...
    def disable_defaulted_states(self): ...

class YaccProduction:
    def lineno(self, n): ...
    def set_lineno(self, n, lineno): ...
    def linespan(self, n): ...
    def lexpos(self, n): ...
    def set_lexpos(self, n, lexpos): ...
    def lexspan(self, n): ...
    def error(self): ...
    def __getitem__(self, n): ...
    def __setitem__(self, n, v): ...
    def __len__(self): ...
    slice: list
    stack: list
    lexer: object
    parser: object
```

[Syntax Parsing](./syntax-parsing.md)
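Ambiguity resolution deserves a concrete sketch: a deliberately ambiguous grammar plus a `precedence` table lets yacc resolve the resulting shift/reduce conflicts instead of reporting them. The tiny calculator below is illustrative, not from the package itself:

```python
import ply.lex as lex
import ply.yacc as yacc

tokens = ('NUMBER', 'PLUS', 'TIMES')

t_PLUS = r'\+'
t_TIMES = r'\*'
t_ignore = ' '

def t_NUMBER(t):
    r'\d+'
    t.value = int(t.value)
    return t

def t_error(t):
    t.lexer.skip(1)

lexer = lex.lex()

# Ambiguous grammar below: 'expression OP expression' on both sides.
# Later rows bind tighter, so TIMES takes precedence over PLUS.
precedence = (
    ('left', 'PLUS'),
    ('left', 'TIMES'),
)

def p_expression_binop(p):
    '''expression : expression PLUS expression
                  | expression TIMES expression'''
    p[0] = p[1] + p[3] if p[2] == '+' else p[1] * p[3]

def p_expression_number(p):
    '''expression : NUMBER'''
    p[0] = p[1]

def p_error(p):
    print("Syntax error")

parser = yacc.yacc(debug=False)

result = parser.parse("2 + 3 * 4", lexer=lexer)
print(result)  # 14, i.e. 2 + (3 * 4)
```

Without the `precedence` table, this grammar would trigger shift/reduce conflict warnings; with it, yacc silently picks the intended parse.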

## Types

```python { .api }
class LexError(Exception):
    """Exception raised for lexical analysis errors"""
    text: str

class YaccError(Exception):
    """Base exception for parser errors"""

class GrammarError(YaccError):
    """Exception for grammar specification errors"""

class LALRError(YaccError):
    """Exception for LALR parsing algorithm errors"""

class PlyLogger:
    """Logging utility for PLY operations"""
    def critical(self, msg, *args, **kwargs): ...
    def warning(self, msg, *args, **kwargs): ...
    def error(self, msg, *args, **kwargs): ...
    def info(self, msg, *args, **kwargs): ...
    def debug(self, msg, *args, **kwargs): ...

class NullLogger:
    """Null logging implementation"""
    def debug(self, msg, *args, **kwargs): ...
    def warning(self, msg, *args, **kwargs): ...
    def error(self, msg, *args, **kwargs): ...
    def info(self, msg, *args, **kwargs): ...
    def critical(self, msg, *args, **kwargs): ...

class YaccSymbol:
    """Internal parser symbol representation"""
    def __str__(self): ...
    def __repr__(self): ...

# Configuration constants
yaccdebug: bool = False
debug_file: str = 'parser.out'
error_count: int = 3
resultlimit: int = 40
MAXINT: int
StringTypes: tuple = (str, bytes)

# Package version
__version__: str = '2022.10.27'
```
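As a sketch of how `LexError` surfaces in practice: when a lexer is built without a `t_error` rule, an illegal character makes `token()` raise `LexError` rather than recover (the one-token language below is hypothetical):

```python
import ply.lex as lex

tokens = ('NUMBER',)

def t_NUMBER(t):
    r'\d+'
    return t

# No t_error rule: lex warns at build time and raises LexError at scan time.
lexer = lex.lex()
lexer.input("12$34")

caught = False
try:
    for tok in lexer:
        pass
except lex.LexError as err:
    caught = True  # err.text holds the remaining unlexed input
print(caught)  # True
```

Defining a `t_error` handler, as in the Basic Usage example, turns this hard failure into per-character recovery.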