# Tree Navigation

Navigate through the parse tree using parent-child relationships, sibling traversal, and document-order iteration. Beautiful Soup provides both direct property access and generator-based iteration for memory-efficient traversal of large documents.

## Capabilities

### Parent-Child Navigation

Navigate up and down the parse tree hierarchy using parent and children relationships.

```python { .api }
@property
def parent(self):
    """
    The parent element of this element, or None if this is the root.

    Returns:
    PageElement or None
    """

@property
def contents(self):
    """
    List of direct children of this element.

    Returns:
    list of PageElement instances
    """

@property
def children(self):
    """
    Generator yielding direct children of this element.

    Yields:
    PageElement instances
    """

@property
def descendants(self):
    """
    Generator yielding all descendant elements in document order.

    Yields:
    PageElement instances (tags and strings)
    """

@property
def parents(self):
    """
    Generator yielding all parent elements up to the document root.

    Yields:
    PageElement instances
    """
```

Usage Examples:

```python
from bs4 import BeautifulSoup

html = '''
<html>
  <body>
    <div class="container">
      <p>First paragraph</p>
      <p>Second paragraph</p>
    </div>
  </body>
</html>
'''

soup = BeautifulSoup(html, 'html.parser')
div = soup.find('div')
first_p = soup.find('p')

# Parent access
print(div.parent.name)  # 'body'
print(first_p.parent.name)  # 'div'

# Children access
print(len(div.contents))  # 5 (includes whitespace text nodes)
print([child.name for child in div.children if getattr(child, 'name', None)])  # ['p', 'p']

# Descendants - everything below the div (the div itself is not included)
for element in div.descendants:
    if hasattr(element, 'name') and element.name:
        print(element.name)  # p, p

# Parents - up to the BeautifulSoup object at the root
for parent in first_p.parents:
    if parent.name:
        print(parent.name)  # div, body, html, [document]
```
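Whitespace between tags shows up as string nodes in `contents` and `children`, so code that only cares about tags usually filters them out. The sketch below is one way to do that, using `isinstance` checks against `Tag`. The `ancestor_path` helper is hypothetical, not part of Beautiful Soup, and the markup is made up for illustration.

```python
from bs4 import BeautifulSoup, Tag

html = '<html><body><div class="container">\n<p>First</p>\n<p>Second</p>\n</div></body></html>'
soup = BeautifulSoup(html, 'html.parser')
div = soup.find('div')

# Keep only Tag children; the newlines between tags are string nodes.
tag_children = [child for child in div.children if isinstance(child, Tag)]
child_names = [child.name for child in tag_children]
print(child_names)


def ancestor_path(element):
    """Hypothetical helper: tag names from the outermost ancestor down to element."""
    # .parents walks upward and ends at the BeautifulSoup object, which is skipped here.
    names = [parent.name for parent in element.parents
             if not isinstance(parent, BeautifulSoup)]
    names.reverse()
    names.append(element.name)
    return '/'.join(names)


path = ancestor_path(soup.find('p'))
print(path)
```

Filtering by type rather than by `name` keeps the intent explicit and avoids relying on how string nodes respond to attribute access.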

### Sibling Navigation

Navigate horizontally through elements at the same level in the parse tree.

```python { .api }
@property
def next_sibling(self):
    """
    The next sibling element, or None if this is the last child.

    Returns:
    PageElement or None
    """

@property
def previous_sibling(self):
    """
    The previous sibling element, or None if this is the first child.

    Returns:
    PageElement or None
    """

@property
def next_siblings(self):
    """
    Generator yielding all following sibling elements.

    Yields:
    PageElement instances
    """

@property
def previous_siblings(self):
    """
    Generator yielding all preceding sibling elements in reverse order.

    Yields:
    PageElement instances
    """
```

Usage Examples:

```python
from bs4 import BeautifulSoup

html = '<div><p>One</p><p>Two</p><p>Three</p></div>'
soup = BeautifulSoup(html, 'html.parser')

first_p = soup.find('p')
second_p = first_p.next_sibling  # No whitespace between tags here, so this is the second <p>
third_p = soup.find_all('p')[2]

# Direct sibling access
print(second_p.previous_sibling.string)  # 'One'
print(second_p.next_sibling.string)  # 'Three'

# Iterate through siblings
for sibling in first_p.next_siblings:
    if hasattr(sibling, 'name') and sibling.name == 'p':
        print(sibling.string)  # 'Two', 'Three'

for sibling in third_p.previous_siblings:
    if hasattr(sibling, 'name') and sibling.name == 'p':
        print(sibling.string)  # 'Two', 'One'
```
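When the markup is pretty-printed, the immediate sibling of a tag is often a whitespace string rather than the next tag. The following sketch, on illustrative markup, contrasts `next_sibling` with the `find_next_sibling()` search method, which skips over string nodes when given a tag name.

```python
from bs4 import BeautifulSoup

# Each newline between tags is parsed as its own string node.
html = '<ul>\n<li>One</li>\n<li>Two</li>\n</ul>'
soup = BeautifulSoup(html, 'html.parser')
first_li = soup.find('li')

# The immediate sibling is the newline string, not the second <li>.
raw_next = first_li.next_sibling

# find_next_sibling('li') returns the first following sibling that is an <li> tag.
tag_next = first_li.find_next_sibling('li')

print(repr(raw_next))
print(tag_next.string)
```

Using the search method avoids chaining `.next_sibling.next_sibling`, which breaks as soon as the whitespace in the source changes.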

### Document-Order Navigation

Navigate through elements in the order they appear in the source document.

```python { .api }
@property
def next_element(self):
    """
    The next element in document order, or None if this is the last.

    Returns:
    PageElement or None
    """

@property
def previous_element(self):
    """
    The previous element in document order, or None if this is the first.

    Returns:
    PageElement or None
    """

@property
def next_elements(self):
    """
    Generator yielding all following elements in document order.

    Yields:
    PageElement instances (tags and strings)
    """

@property
def previous_elements(self):
    """
    Generator yielding all preceding elements in reverse document order.

    Yields:
    PageElement instances (tags and strings)
    """
```

Usage Examples:

```python
from bs4 import BeautifulSoup

html = '<div><p>Para <em>emphasis</em> text</p><span>After</span></div>'
soup = BeautifulSoup(html, 'html.parser')

p_tag = soup.find('p')
em_tag = soup.find('em')

# Document order navigation
current = p_tag
while current is not None:
    if hasattr(current, 'name') and current.name:
        print(f"Tag: {current.name}")
    elif isinstance(current, str) and current.strip():
        print(f"Text: {current.strip()}")
    current = current.next_element

# Output: Tag: p, Text: Para, Tag: em, Text: emphasis, Text: text, Tag: span, Text: After

# Collect text after the <em> start tag in document order, stopping at <span>.
# next_elements begins with the contents of <em> itself.
text_content = []
for element in em_tag.next_elements:
    if isinstance(element, str) and element.strip():
        text_content.append(element.strip())
    if hasattr(element, 'name') and element.name == 'span':
        break
print(text_content)  # ['emphasis', 'text']
```
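Document order and sibling order are easy to confuse. As a small sketch on the same kind of markup, the snippet below compares `next_sibling` and `next_element` for one tag: the first stays on the same level of the tree, while the second follows parse order and therefore descends into the tag's own contents first.

```python
from bs4 import BeautifulSoup

html = '<div><p>Para <em>emphasis</em> text</p><span>After</span></div>'
soup = BeautifulSoup(html, 'html.parser')
p_tag = soup.find('p')
em_tag = soup.find('em')

# Same level of the tree: the <span> that follows </p>.
sibling = p_tag.next_sibling

# Parse order: the first thing inside <p>, which is the string 'Para '.
element = p_tag.next_element

# For <em>, the next element is its own text; the next sibling is the text after </em>.
em_next_element = em_tag.next_element
em_next_sibling = em_tag.next_sibling

print(sibling.name, repr(element), repr(em_next_element), repr(em_next_sibling))
```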

### Navigation Utilities

Helper methods for common navigation patterns.

```python { .api }
def index(self, element):
    """
    Get the index of a child element.

    Parameters:
    - element: PageElement to find

    Returns:
    int, index of element in contents list

    Raises:
    ValueError if element is not a child
    """

@property
def is_empty_element(self):
    """
    True if this tag has no contents and the tree builder treats it as an
    empty-element (void) tag such as <br>, so it can be rendered as self-closing.

    Returns:
    bool
    """
```

Usage Examples:

```python
from bs4 import BeautifulSoup

html = '<ul><li>First</li><li>Second</li><li>Third</li></ul>'
soup = BeautifulSoup(html, 'html.parser')

ul = soup.find('ul')
second_li = soup.find_all('li')[1]

# Get child index
print(ul.index(second_li))  # 1

# Check if element is an empty-element (void) tag
empty_br = soup.new_tag('br')
print(empty_br.is_empty_element)  # True

# An empty <div> is not a void element under an HTML builder
empty_div = soup.new_tag('div')
print(empty_div.is_empty_element)  # False

div_with_content = soup.new_tag('div')
div_with_content.string = 'Content'
print(div_with_content.is_empty_element)  # False
```
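One detail worth illustrating: `Tag.index()` locates a child by identity, whereas calling `.index()` on the `contents` list compares by equality, and two tags with the same markup compare equal. The sketch below uses deliberately duplicated markup to show how the two lookups can differ.

```python
from bs4 import BeautifulSoup

# Two children with identical markup.
html = '<ul><li>Same</li><li>Same</li></ul>'
soup = BeautifulSoup(html, 'html.parser')
ul = soup.find('ul')
first_li, second_li = ul.find_all('li')

# Equal markup, but two distinct objects in the tree.
same_markup = first_li == second_li
same_object = first_li is second_li

# Tag.index() matches by identity, so it finds the second <li> itself.
by_identity = ul.index(second_li)

# list.index() matches by equality, so it stops at the first equal child.
by_equality = ul.contents.index(second_li)

print(same_markup, same_object, by_identity, by_equality)
```

For position lookups on documents with repeated items, such as table rows or list entries, `Tag.index()` is the safer choice.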

### Backward Compatibility

Legacy camelCase names from BeautifulSoup 3.x are aliased for compatibility. Prefer the snake_case names in new code.

```python { .api }
# BeautifulSoup 3.x compatibility aliases
@property
def nextSibling(self):  # Use next_sibling instead
    """Deprecated: use next_sibling"""

@property
def previousSibling(self):  # Use previous_sibling instead
    """Deprecated: use previous_sibling"""

# The find* aliases are methods, not properties; they take the same
# arguments as the snake_case methods they forward to.
def findNextSibling(self, *args, **kwargs):  # Use find_next_sibling instead
    """Deprecated: use find_next_sibling"""

def findPreviousSibling(self, *args, **kwargs):  # Use find_previous_sibling instead
    """Deprecated: use find_previous_sibling"""
```