# SelectorList Operations

Batch operations on multiple selectors with chainable methods for filtering, extracting, and transforming collections of selected elements. SelectorList extends Python's list class with selector-specific functionality.

## Capabilities

### SelectorList Class

A list subclass containing multiple Selector objects with chainable selection methods.

```python { .api }
class SelectorList(List["Selector"]):
    """
    List of Selector objects with additional selection methods.

    Supports all standard list operations plus selector-specific methods
    for batch processing of multiple elements.
    """

    def __getitem__(self, pos: Union[SupportsIndex, slice]) -> Union["Selector", "SelectorList[Selector]"]:
        """
        Get selector(s) by index or slice.

        Parameters:
        - pos: Index or slice object

        Returns:
        - Single Selector for index access
        - New SelectorList for slice access
        """
```

### Batch Selection Operations

Apply selection queries across all selectors in the list.

```python { .api }
def xpath(
    self,
    xpath: str,
    namespaces: Optional[Mapping[str, str]] = None,
    **kwargs: Any,
) -> "SelectorList[Selector]":
    """
    Call xpath() on each element and return flattened results.

    Parameters:
    - xpath (str): XPath expression to apply
    - namespaces (dict, optional): Namespace prefix mappings
    - **kwargs: XPath variable bindings

    Returns:
    SelectorList: Flattened results from all elements
    """

def css(self, query: str) -> "SelectorList[Selector]":
    """
    Call css() on each element and return flattened results.

    Parameters:
    - query (str): CSS selector to apply

    Returns:
    SelectorList: Flattened results from all elements
    """

def jmespath(self, query: str, **kwargs: Any) -> "SelectorList[Selector]":
    """
    Call jmespath() on each element and return flattened results.

    Parameters:
    - query (str): JMESPath expression to apply
    - **kwargs: Additional jmespath options

    Returns:
    SelectorList: Flattened results from all elements
    """
```

**Usage Example:**

```python
from parsel import Selector

html = """
<div class="product">
    <h2>Product 1</h2>
    <div class="details">
        <p class="price">$19.99</p>
        <p class="rating">4.5 stars</p>
    </div>
</div>
<div class="product">
    <h2>Product 2</h2>
    <div class="details">
        <p class="price">$29.99</p>
        <p class="rating">4.8 stars</p>
    </div>
</div>
"""

selector = Selector(text=html)

# Get all product containers
products = selector.css('.product')  # Returns SelectorList with 2 elements

# Chain selections - extract all prices from all products
all_prices = products.css('.price::text')  # SelectorList with price texts

# Chain XPath - get all headings from all products
all_headings = products.xpath('.//h2/text()')  # SelectorList with heading texts

# Further filter results
high_ratings = products.css('.rating:contains("4.8")')  # Rating elements whose text contains "4.8"
```

### Batch Content Extraction

Extract content from all selectors in the list.

```python { .api }
def get(self, default: Optional[str] = None) -> Optional[str]:
    """
    Return get() result for the first element in the list.

    Parameters:
    - default (str, optional): Value if list is empty

    Returns:
    str or None: Content of first element or default
    """

def getall(self) -> List[str]:
    """
    Call get() on each element and return all results.

    Returns:
    List[str]: Content from all elements in the list
    """

# Legacy aliases
extract_first = get
extract = getall
```

**Usage Example:**

```python
# Continuing from previous example
products = selector.css('.product')

# Get content from first product only
first_product_html = products.get()

# Get content from all products
all_product_html = products.getall()  # List of HTML strings

# Extract all price values
price_texts = products.css('.price::text').getall()
# Returns: ['$19.99', '$29.99']

# Get first price only
first_price = products.css('.price::text').get()
# Returns: '$19.99'

# Get first price with default
first_price_safe = products.css('.nonexistent::text').get(default='$0.00')
# Returns: '$0.00' since no elements match
```

### Batch Regular Expression Operations

Apply regular expressions across all selectors in the list.

```python { .api }
def re(
    self, regex: Union[str, Pattern[str]], replace_entities: bool = True
) -> List[str]:
    """
    Call re() on each element and return flattened results.

    Parameters:
    - regex (str or Pattern): Regular expression pattern
    - replace_entities (bool): Replace HTML entities

    Returns:
    List[str]: All regex matches from all elements
    """

def re_first(
    self,
    regex: Union[str, Pattern[str]],
    default: Optional[str] = None,
    replace_entities: bool = True,
) -> Optional[str]:
    """
    Call re() on elements until first match is found.

    Parameters:
    - regex (str or Pattern): Regular expression pattern
    - default (str, optional): Value if no matches found
    - replace_entities (bool): Replace HTML entities

    Returns:
    str or None: First match across all elements or default
    """
```

**Usage Example:**

```python
# Continuing with `products` from the earlier examples

# Extract all numeric values from all products
numbers = products.re(r'\d+\.\d+')
# Returns: ['19.99', '4.5', '29.99', '4.8']

# Get first numeric value found
first_number = products.re_first(r'\d+\.\d+')
# Returns: '19.99'

# Extract prices specifically
prices = products.css('.price').re(r'\$([\d.]+)')
# Returns: ['19.99', '29.99']

# Extract ratings
ratings = products.css('.rating').re(r'([\d.]+) stars')
# Returns: ['4.5', '4.8']
```

### Attribute Access

Access attributes from the first element in the list.

```python { .api }
@property
def attrib(self) -> Mapping[str, str]:
    """
    Return attributes dictionary for the first element.

    Returns:
    Mapping[str, str]: Attributes of first element, empty dict if list is empty
    """
```

**Usage Example:**

```python
from parsel import Selector

html = """
<div class="item" data-id="1">Item 1</div>
<div class="item" data-id="2">Item 2</div>
"""

selector = Selector(text=html)
items = selector.css('.item')

# Get attributes of first item
first_item_attrs = items.attrib
# Returns: {'class': 'item', 'data-id': '1'}

# Access specific attribute
first_item_id = items.attrib.get('data-id')
# Returns: '1'
```

### Element Modification

Remove all matched elements from the document tree in a single call.

```python { .api }
def drop(self) -> None:
    """
    Drop all matched nodes from their parents.

    Removes each element in the list from its parent in the DOM.
    """

def remove(self) -> None:
    """
    Remove all matched nodes from their parents.

    Deprecated: Use drop() instead.
    """
```

**Usage Example:**

```python
from parsel import Selector

html = """
<div>
    <p class="temp">Temporary content</p>
    <p class="keep">Important content</p>
    <p class="temp">Another temp</p>
</div>
"""

selector = Selector(text=html)

# Remove all temporary paragraphs
temp_elements = selector.css('.temp')
temp_elements.drop()  # Removes both .temp elements

# Check remaining content
remaining = selector.css('p').getall()
# Only the .keep paragraph remains
```

## List Operations and Indexing

SelectorList supports all standard Python list operations:

```python
# `selector` is the product-page Selector from the Batch Selection example
products = selector.css('.product')

# Length
count = len(products)  # Number of selected elements

# Indexing
first_product = products[0]  # First Selector
last_product = products[-1]  # Last Selector

# Slicing
first_two = products[:2]  # SelectorList with first 2 elements
even_products = products[::2]  # Every other product

# Iteration
for product in products:
    title = product.css('h2::text').get()
    print(title)

# List comprehension
titles = [p.css('h2::text').get() for p in products]
```

## Chaining Operations

The selection methods (`xpath()`, `css()`, `jmespath()`) return new SelectorList objects, so calls can be chained. Extraction methods such as `get()`, `getall()`, and `re()` return strings or lists of strings and therefore end the chain:

```python
# Complex chaining example (`selector` is the product-page Selector)
product_details = (
    selector
    .css('.product')  # Get all products -> SelectorList
    .css('.details')  # Get details from each -> SelectorList
    .xpath('.//p[contains(@class, "price")]')  # Get price paragraphs -> SelectorList
    .css('::text')  # Get text content -> SelectorList
    .re(r'\$([\d.]+)')  # Extract price numbers -> List[str]
)
```