# cssselect

cssselect is a Python library that parses CSS3 selectors and translates them to XPath 1.0 expressions. This lets developers use CSS selector syntax to find matching elements in XML or HTML documents through an XPath engine such as lxml.

## Package Information

- **Package Name**: cssselect
- **Language**: Python
- **Installation**: `pip install cssselect`
- **Python Support**: >= 3.9

## Core Imports

```python
import cssselect
```

Common usage patterns:

```python
from cssselect import GenericTranslator, HTMLTranslator, parse
```

For accessing all public API components:

```python
from cssselect import (
    ExpressionError,
    FunctionalPseudoElement,
    GenericTranslator,
    HTMLTranslator,
    Selector,
    SelectorError,
    SelectorSyntaxError,
    parse,
)
```

## Basic Usage

```python
from cssselect import GenericTranslator, HTMLTranslator

# Basic CSS to XPath translation
translator = GenericTranslator()
xpath = translator.css_to_xpath('div.content > p')
print(xpath)  # "descendant-or-self::div[@class and contains(concat(' ', normalize-space(@class), ' '), ' content ')]/p"

# HTML-specific translation with pseudo-class support
html_translator = HTMLTranslator()
xpath = html_translator.css_to_xpath('input:checked')
print(xpath)  # XPath expression for checked input elements

# Parse selectors for inspection
from cssselect import parse
selectors = parse('div.content, #main')
for selector in selectors:
    print(f"Selector: {selector.canonical()}")
    print(f"Specificity: {selector.specificity()}")
```

## Capabilities

### CSS Selector Parsing

Parse CSS selector strings into structured Selector objects for analysis and manipulation.

```python { .api }
def parse(css: str) -> list[Selector]:
    """
    Parse a CSS group of selectors into Selector objects.

    Parameters:
    - css (str): A group of selectors as a string

    Returns:
    list[Selector]: List of parsed Selector objects

    Raises:
    SelectorSyntaxError: On invalid selectors
    """
```

### Generic XML Translation

Translate CSS selectors to XPath expressions for generic XML documents with case-sensitive matching.

```python { .api }
class GenericTranslator:
    """
    Translator for generic XML documents.

    Everything is case-sensitive, no assumption is made on the meaning
    of element names and attribute names.
    """

    def __init__(self):
        """Initialize a GenericTranslator instance."""

    def css_to_xpath(self, css: str, prefix: str = "descendant-or-self::") -> str:
        """
        Translate a group of selectors to XPath.

        Parameters:
        - css (str): A group of selectors as a string
        - prefix (str): Prepended to XPath expression (default: "descendant-or-self::")

        Returns:
        str: The equivalent XPath 1.0 expression

        Raises:
        SelectorSyntaxError: On invalid selectors
        ExpressionError: On unknown/unsupported selectors
        """

    def selector_to_xpath(
        self,
        selector: Selector,
        prefix: str = "descendant-or-self::",
        translate_pseudo_elements: bool = False
    ) -> str:
        """
        Translate a single parsed selector to XPath.

        Parameters:
        - selector (Selector): A parsed Selector object
        - prefix (str): Prepended to XPath expression (default: "descendant-or-self::")
        - translate_pseudo_elements (bool): Whether to handle pseudo-elements

        Returns:
        str: The equivalent XPath 1.0 expression

        Raises:
        ExpressionError: On unknown/unsupported selectors
        """

    def xpath_pseudo_element(self, xpath, pseudo_element):
        """
        Handle pseudo-element in XPath translation.

        Parameters:
        - xpath: XPath expression object
        - pseudo_element (PseudoElement): Pseudo-element to handle

        Returns:
        XPath expression with pseudo-element handling
        """

    @staticmethod
    def xpath_literal(s: str) -> str:
        """
        Create properly escaped XPath literal from string.

        Parameters:
        - s (str): String to escape

        Returns:
        str: XPath-escaped string literal
        """

    # Configuration attributes
    id_attribute: str = "id"  # Attribute used for ID selectors
    lang_attribute: str = "xml:lang"  # Attribute used for :lang() pseudo-class
    lower_case_element_names: bool = False  # Case sensitivity for element names
    lower_case_attribute_names: bool = False  # Case sensitivity for attribute names
    lower_case_attribute_values: bool = False  # Case sensitivity for attribute values
```

### HTML-Specific Translation

Translate CSS selectors to XPath expressions optimized for HTML documents with HTML-specific pseudo-class support.

```python { .api }
class HTMLTranslator(GenericTranslator):
    """
    Translator for HTML documents.

    Has useful implementations of HTML-specific pseudo-classes and
    handles HTML case-insensitivity rules.
    """

    def __init__(self, xhtml: bool = False):
        """
        Initialize HTML translator.

        Parameters:
        - xhtml (bool): If False (default), element and attribute names are case-insensitive
        """

    # Overridden configuration attributes
    lang_attribute: str = "lang"  # Uses 'lang' instead of 'xml:lang' for HTML
```

### Selector Objects

Work with parsed CSS selectors as structured objects for analysis and manipulation.

```python { .api }
class Selector:
    """
    Represents a parsed CSS selector.
    """

    def __init__(self, tree: Tree, pseudo_element: PseudoElement | None = None):
        """
        Create a Selector object.

        Parameters:
        - tree (Tree): The parsed selector tree
        - pseudo_element (PseudoElement | None): Pseudo-element if present
        """

    def canonical(self) -> str:
        """
        Return a CSS representation for this selector.

        Returns:
        str: CSS selector string
        """

    def specificity(self) -> tuple[int, int, int]:
        """
        Return the CSS specificity of this selector.

        Returns:
        tuple[int, int, int]: Specificity as (a, b, c) tuple per CSS specification
        """

    # Attributes
    parsed_tree: Tree  # The parsed selector tree
    pseudo_element: PseudoElement | None  # Pseudo-element if present
```

### Functional Pseudo-Elements

Handle functional pseudo-elements with arguments like `::name(arguments)`.

```python { .api }
class FunctionalPseudoElement:
    """
    Represents functional pseudo-elements like ::name(arguments).
    """

    def __init__(self, name: str, arguments: Sequence[Token]):
        """
        Create a functional pseudo-element.

        Parameters:
        - name (str): The pseudo-element name
        - arguments (Sequence[Token]): The argument tokens
        """

    def argument_types(self) -> list[str]:
        """
        Get the types of the pseudo-element arguments.

        Returns:
        list[str]: List of argument token types
        """

    def canonical(self) -> str:
        """
        Return CSS representation of the functional pseudo-element.

        Returns:
        str: CSS pseudo-element string
        """

    # Attributes
    name: str  # The pseudo-element name
    arguments: Sequence[Token]  # The argument tokens
```

## Exception Handling

### Exception Types

```python { .api }
class SelectorError(Exception):
    """
    Base exception for CSS selector related errors.

    Common parent for SelectorSyntaxError and ExpressionError.
    Use except SelectorError: to catch both exception types.
    """

class SelectorSyntaxError(SelectorError, SyntaxError):
    """
    Exception raised when parsing a selector that does not match the CSS grammar.
    """

class ExpressionError(SelectorError, RuntimeError):
    """
    Exception raised for unknown or unsupported selector features during XPath translation.
    """
```

### Error Handling Examples

**Basic error handling:**

```python
from cssselect import GenericTranslator, SelectorError

translator = GenericTranslator()

try:
    xpath = translator.css_to_xpath('div.content > p')
except SelectorError as e:
    print(f"Selector error: {e}")
```

**Specific error handling:**

```python
from cssselect import GenericTranslator, parse, SelectorSyntaxError, ExpressionError

translator = GenericTranslator()

try:
    selectors = parse('div.content > p')
    # Translation is the step that can raise ExpressionError
    xpaths = [translator.selector_to_xpath(s) for s in selectors]
except SelectorSyntaxError as e:
    print(f"Invalid CSS syntax: {e}")
except ExpressionError as e:
    print(f"Unsupported selector feature: {e}")
```

## Advanced Usage

### Selector Analysis

```python
from cssselect import parse

# Analyze selector specificity and structure
selectors = parse('div.content #main, body > nav a:hover')
for selector in selectors:
    print(f"Selector: {selector.canonical()}")
    print(f"Specificity: {selector.specificity()}")
    if selector.pseudo_element:
        print(f"Pseudo-element: {selector.pseudo_element}")
```

### Custom Translation

```python
from cssselect import ExpressionError, GenericTranslator, parse

# Use custom prefix for XPath expression
translator = GenericTranslator()
xpath = translator.css_to_xpath('div > p', prefix="./")
print(xpath)  # "./div/p"

# Translate single selector with pseudo-element handling.
# GenericTranslator does not translate pseudo-elements itself: unless a
# subclass overrides xpath_pseudo_element(), this call raises ExpressionError.
selectors = parse('div::before')
try:
    xpath = translator.selector_to_xpath(
        selectors[0],
        prefix="descendant::",
        translate_pseudo_elements=True
    )
except ExpressionError as e:
    print(f"Pseudo-element not supported: {e}")
```

### HTML vs Generic Translation

```python
from cssselect import GenericTranslator, HTMLTranslator

css = 'INPUT:checked'

# Generic (case-sensitive) translation
generic = GenericTranslator()
generic_xpath = generic.css_to_xpath(css)

# HTML (case-insensitive with HTML pseudo-classes) translation
html = HTMLTranslator()
html_xpath = html.css_to_xpath(css)

print(f"Generic: {generic_xpath}")
print(f"HTML: {html_xpath}")
```

## Parsed Selector Tree Components

Advanced users working with parsed selectors may encounter these tree node classes:

### Tree Node Classes

```python { .api }
class Element:
    """Represents element selectors (tag, *, namespace|tag)."""
    def canonical(self) -> str: ...
    def specificity(self) -> tuple[int, int, int]: ...

class Class:
    """Represents class selectors (.classname)."""
    def canonical(self) -> str: ...
    def specificity(self) -> tuple[int, int, int]: ...

class Hash:
    """Represents ID selectors (#id)."""
    def canonical(self) -> str: ...
    def specificity(self) -> tuple[int, int, int]: ...

class Attrib:
    """Represents attribute selectors ([attr], [attr=val], etc.)."""
    def canonical(self) -> str: ...
    def specificity(self) -> tuple[int, int, int]: ...

class Pseudo:
    """Represents pseudo-class selectors (:hover, :first-child)."""
    def canonical(self) -> str: ...
    def specificity(self) -> tuple[int, int, int]: ...

class Function:
    """Represents functional pseudo-classes (:nth-child(2n+1))."""
    def canonical(self) -> str: ...
    def specificity(self) -> tuple[int, int, int]: ...

class Negation:
    """Represents :not() pseudo-class."""
    def canonical(self) -> str: ...
    def specificity(self) -> tuple[int, int, int]: ...

class Relation:
    """Represents :has() relational pseudo-class."""
    def canonical(self) -> str: ...
    def specificity(self) -> tuple[int, int, int]: ...

class Matching:
    """Represents :is() pseudo-class."""
    def canonical(self) -> str: ...
    def specificity(self) -> tuple[int, int, int]: ...

class SpecificityAdjustment:
    """Represents :where() pseudo-class."""
    def canonical(self) -> str: ...
    def specificity(self) -> tuple[int, int, int]: ...

class CombinedSelector:
    """Represents combined selectors with combinators ('>', '+', '~', ' ')."""
    def canonical(self) -> str: ...
    def specificity(self) -> tuple[int, int, int]: ...
```

## Types

### Core Types

```python { .api }
# Type aliases for internal selector tree structure
Tree = Union[
    Element, Hash, Class, Function, Pseudo, Attrib,
    Negation, Relation, Matching, SpecificityAdjustment, CombinedSelector
]

PseudoElement = Union[FunctionalPseudoElement, str]
```

### Token Type

```python { .api }
class Token(tuple[str, Optional[str]]):
    """
    Represents a CSS token during parsing.

    Token types include: IDENT, HASH, STRING, S (whitespace), DELIM, NUMBER, EOF
    """

    def __new__(cls, type_: str, value: str | None, pos: int):
        """
        Create a new token.

        Parameters:
        - type_ (str): Token type (IDENT, HASH, STRING, S, DELIM, NUMBER, EOF)
        - value (str | None): Token value
        - pos (int): Position in source string
        """

    def is_delim(self, *values: str) -> bool:
        """
        Check if token is delimiter with specific value(s).

        Parameters:
        - *values (str): Values to check against

        Returns:
        bool: True if token is delimiter with one of the specified values
        """

    def css(self) -> str:
        """
        Return CSS representation of the token.

        Returns:
        str: CSS string representation
        """

    # Properties
    type: str  # Token type
    value: str | None  # Token value
    pos: int  # Position in source

class EOFToken(Token):
    """Special end-of-file token."""
```

## Utility Functions

Advanced parsing and string manipulation utilities:

```python { .api }
def parse_series(tokens) -> tuple[int, int]:
    """
    Parse :nth-child() style arguments like '2n+1'.

    Parameters:
    - tokens: Iterable of tokens representing the series expression

    Returns:
    tuple[int, int]: (a, b) values for an + b expression
    """

def ascii_lower(string: str) -> str:
    """
    ASCII-only lowercase conversion.

    Parameters:
    - string (str): String to convert

    Returns:
    str: Lowercase string using ASCII rules only
    """

def unescape_ident(value: str) -> str:
    """
    Unescape CSS identifier strings.

    Parameters:
    - value (str): CSS identifier with possible escape sequences

    Returns:
    str: Unescaped identifier string
    """
```

## Package Version

```python { .api }
VERSION = "1.3.0"
__version__ = "1.3.0"
```