# Browser Session Management

This document covers browser session creation, configuration, and control for browser automation over CDP (Chrome DevTools Protocol). The `BrowserSession` class manages the browser lifecycle and provides access to browser state. Browser actions such as navigation, clicking, and form interaction are handled through the Tools/Actions system (see [Browser Actions](./browser-actions.md)).

## Capabilities

### Browser Session Control

Core browser session management with CDP integration for direct browser control.

```python { .api }
class BrowserSession:
    async def get_browser_state_summary(
        self,
        cache_clickable_elements_hashes: bool = True,
        include_screenshot: bool = True,
        cached: bool = False,
        include_recent_events: bool = False
    ) -> BrowserStateSummary:
        """
        Get current browser state including page content, URL, and available elements.

        Parameters:
        - cache_clickable_elements_hashes: Cache element hashes for performance
        - include_screenshot: Include screenshot in the state summary
        - cached: Use cached state if available
        - include_recent_events: Include recent browser events in summary

        Returns:
            BrowserStateSummary: Current page state with DOM elements and metadata
        """

    async def get_tabs(self) -> list[TabInfo]:
        """
        Get list of all browser tabs.

        Returns:
            list[TabInfo]: List of all available browser tabs
        """

    async def get_element_by_index(self, index: int) -> EnhancedDOMTreeNode | None:
        """
        Get DOM element by its index from the element mapping.

        Parameters:
        - index: Element index from DOM serialization

        Returns:
            EnhancedDOMTreeNode | None: DOM element or None if not found
        """

    async def get_current_page_url(self) -> str:
        """
        Get URL of currently active page.

        Returns:
            str: Current page URL
        """

    async def get_current_page_title(self) -> str:
        """
        Get title of currently active page.

        Returns:
            str: Current page title
        """

    async def start(self) -> None:
        """Start the browser session."""

    async def kill(self) -> None:
        """Terminate browser session and cleanup resources."""

    async def stop(self) -> None:
        """Stop the browser session gracefully."""
```
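Because `start()` needs a matching `stop()` (or `kill()`), one way to avoid leaking a browser process is to wrap the pair in an async context manager. The sketch below is illustrative only: `managed_session` and `FakeSession` are hypothetical names that are not part of the library, and the fake stands in for a real `BrowserSession` so the example runs without a browser.

```python
import asyncio
from contextlib import asynccontextmanager


@asynccontextmanager
async def managed_session(session):
    """Start `session`, yield it, and always stop it afterwards.

    `session` is any object exposing async start() and stop() methods,
    such as the BrowserSession documented above.
    """
    await session.start()
    try:
        yield session
    finally:
        # Runs even if the body raises, so the browser is not left behind.
        await session.stop()


class FakeSession:
    """Hypothetical stand-in so the sketch runs without a browser."""

    def __init__(self):
        self.calls = []

    async def start(self):
        self.calls.append("start")

    async def stop(self):
        self.calls.append("stop")

    async def get_current_page_url(self):
        return "about:blank"


async def main():
    fake = FakeSession()
    async with managed_session(fake) as session:
        url = await session.get_current_page_url()
    return fake.calls, url


calls, url = asyncio.run(main())
print(calls, url)
```

With a real session the call site would presumably keep the same shape, `async with managed_session(BrowserSession()) as session:`. Swapping `stop()` for `kill()` in the `finally` block would force termination instead of a graceful stop.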

### Browser Profile Configuration

Browser configuration covering launch behavior, security settings, and automation parameters.

```python { .api }
class BrowserProfile:
    def __init__(
        self,
        headless: bool = False,
        user_data_dir: str | None = None,
        allowed_domains: list[str] | None = None,
        downloads_path: str | None = None,
        proxy: ProxySettings | None = None,
        keep_alive: bool = False,
        window_size: tuple[int, int] = (1920, 1080),
        viewport_size: tuple[int, int] | None = None,
        user_agent: str | None = None,
        disable_web_security: bool = False,
        disable_features: list[str] | None = None,
        enable_features: list[str] | None = None,
        extra_args: list[str] | None = None
    ):
        """
        Configure browser behavior and settings.

        Parameters:
        - headless: Run browser in headless mode
        - user_data_dir: Directory for browser user data
        - allowed_domains: List of allowed domains (domain restriction)
        - downloads_path: Directory for file downloads
        - proxy: Proxy server configuration
        - keep_alive: Keep browser alive after session ends
        - window_size: Browser window dimensions as (width, height)
        - viewport_size: Viewport dimensions as (width, height); defaults to window_size
        - user_agent: Custom user agent string
        - disable_web_security: Disable web security features
        - disable_features: Chrome features to disable
        - enable_features: Chrome features to enable
        - extra_args: Additional Chrome command line arguments
        """

    headless: bool
    user_data_dir: str | None
    allowed_domains: list[str] | None
    downloads_path: str | None
    proxy: ProxySettings | None
    keep_alive: bool
```
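The usage examples in this document pass glob-style patterns such as `*.example.com` to `allowed_domains`. The sketch below shows how such patterns could be checked against a URL's host with the standard library's `fnmatch`. It is an assumption about the matching semantics, not the library's actual implementation, and `is_url_allowed` is a hypothetical helper.

```python
from __future__ import annotations

from fnmatch import fnmatch
from urllib.parse import urlsplit


def is_url_allowed(url: str, allowed_domains: list[str] | None) -> bool:
    """Return True if the URL's host matches any pattern in allowed_domains.

    Assumptions: None means "no restriction", and patterns are
    shell-style globs compared case-insensitively against the host.
    """
    if allowed_domains is None:
        return True
    host = (urlsplit(url).hostname or "").lower()
    return any(fnmatch(host, pattern.lower()) for pattern in allowed_domains)


patterns = ["*.example.com", "*.trusted-site.org"]
print(is_url_allowed("https://shop.example.com/cart", patterns))
print(is_url_allowed("https://evil.test/", patterns))
```

Under this interpretation `*.example.com` does not match the bare host `example.com`, so both would need to be listed if both should be reachable.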

### Proxy Configuration

Network proxy settings for browser sessions.

```python { .api }
class ProxySettings:
    def __init__(
        self,
        server: str,
        username: str | None = None,
        password: str | None = None,
        bypass_list: list[str] | None = None
    ):
        """
        Configure proxy settings for browser session.

        Parameters:
        - server: Proxy server address (e.g., "proxy.example.com:8080")
        - username: Proxy authentication username
        - password: Proxy authentication password
        - bypass_list: List of domains that bypass the proxy
        """

    server: str
    username: str | None
    password: str | None
    bypass_list: list[str] | None
```
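Proxy details are often supplied as a single URL (for example from an environment variable) rather than as separate fields. A minimal sketch of splitting such a URL into the `server`, `username`, and `password` arguments described above follows. `proxy_kwargs_from_url` is a hypothetical helper, and the `host:port` format for `server` is taken from the example in the docstring above.

```python
from __future__ import annotations

from urllib.parse import unquote, urlsplit


def proxy_kwargs_from_url(proxy_url: str) -> dict[str, str | None]:
    """Split a proxy URL into keyword arguments for ProxySettings.

    Expects a URL with a scheme, e.g. "http://user:pass@host:8080".
    """
    parts = urlsplit(proxy_url)
    if parts.hostname is None:
        raise ValueError(f"not a proxy URL with scheme and host: {proxy_url!r}")

    # Rebuild "host:port" without the scheme or credentials.
    server = parts.hostname
    if parts.port is not None:
        server = f"{server}:{parts.port}"

    # Credentials may be percent-encoded inside a URL; decode them.
    username = unquote(parts.username) if parts.username else None
    password = unquote(parts.password) if parts.password else None
    return {"server": server, "username": username, "password": password}


kwargs = proxy_kwargs_from_url("http://user:pass@proxy.example.com:8080")
print(kwargs)
# With the library installed, this could be passed on as:
# proxy = ProxySettings(**kwargs, bypass_list=["*.local", "127.0.0.1"])
```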

### Browser State Information

Browser state representation for agent decision-making.

```python { .api }
class BrowserStateSummary:
    """
    Current browser state information.
    """
    url: str                        # Current page URL
    title: str                      # Page title
    tabs: list[TabInfo]             # Available browser tabs
    elements: list[ElementInfo]     # Clickable/interactable elements
    text_content: str               # Page text content
    screenshot_path: str            # Path to current screenshot
    viewport_size: tuple[int, int]  # Viewport dimensions

class TabInfo:
    """Browser tab information."""
    id: str
    title: str
    url: str
    active: bool

class ElementInfo:
    """DOM element information."""
    index: int
    tag: str
    text: str
    attributes: dict[str, str]
    bounding_box: dict[str, float]
```
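Element indexes come from DOM serialization and can differ between pages, so hardcoding them is fragile. A minimal sketch of looking an index up from the `elements` list by tag, text, and attributes follows. The `ElementInfo` dataclass here is a local stand-in that mirrors the fields documented above, and `find_element_index` is a hypothetical helper, not a library function.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ElementInfo:
    """Local stand-in mirroring the ElementInfo fields documented above."""
    index: int
    tag: str
    text: str
    attributes: dict[str, str] = field(default_factory=dict)
    bounding_box: dict[str, float] = field(default_factory=dict)


def find_element_index(
    elements: list[ElementInfo],
    tag: str,
    text_contains: str = "",
    attributes: dict[str, str] | None = None,
) -> int | None:
    """Return the index of the first element matching all criteria, else None."""
    wanted = attributes or {}
    needle = text_contains.lower()
    for element in elements:
        if element.tag.lower() != tag.lower():
            continue
        if needle not in element.text.lower():
            continue
        if all(element.attributes.get(k) == v for k, v in wanted.items()):
            return element.index
    return None


sample = [
    ElementInfo(3, "a", "Home"),
    ElementInfo(5, "input", "", {"type": "search"}),
    ElementInfo(8, "button", "Search"),
]
print(find_element_index(sample, "input", attributes={"type": "search"}))
print(find_element_index(sample, "button", text_contains="search"))
```

In practice the list would come from `state.elements` on a `BrowserStateSummary`, and a `None` result signals that the page does not contain the expected element.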

## Usage Examples

### Basic Browser Session

```python
import asyncio

from browser_use import BrowserSession


async def main():
    # Create and start a browser session with the default profile
    session = BrowserSession()
    await session.start()

    # navigate_to_url is not in the API listing above; navigation may
    # instead need to go through the Tools/Actions system
    await session.navigate_to_url("https://example.com")
    state = await session.get_browser_state_summary()
    print(f"Page title: {state.title}")

    # Cleanup
    await session.kill()


asyncio.run(main())
```

### Custom Browser Profile

```python
import asyncio

from browser_use import BrowserSession, BrowserProfile, ProxySettings


async def main():
    # Configure proxy
    proxy = ProxySettings(
        server="proxy.company.com:8080",
        username="user",
        password="pass",
        bypass_list=["*.local", "127.0.0.1"]
    )

    # Create custom profile
    profile = BrowserProfile(
        headless=True,
        user_data_dir="/tmp/browser-data",
        allowed_domains=["*.example.com", "*.trusted-site.org"],
        downloads_path="/tmp/downloads",
        proxy=proxy,
        window_size=(1440, 900),
        user_agent="CustomBot/1.0"
    )

    # Create and start a session with the profile
    session = BrowserSession(browser_profile=profile)
    await session.start()

    await session.navigate_to_url("https://example.com")

    # Cleanup
    await session.kill()


asyncio.run(main())
```

### Element Interaction

```python
import asyncio

from browser_use import BrowserSession


async def main():
    session = BrowserSession()
    await session.start()

    # input_text, click_element, and scroll (like navigate_to_url) are not in
    # the API listing above; these actions may instead need the Tools/Actions system
    await session.navigate_to_url("https://example.com/form")

    # Get current state to discover element indexes
    state = await session.get_browser_state_summary()

    # Type into the search input (assuming it's element index 5)
    search_input_index = 5
    await session.input_text(search_input_index, "search query")

    # Click the search button (assuming it's element index 8)
    search_button_index = 8
    await session.click_element(search_button_index)

    # Scroll down two pages to see more results
    await session.scroll(down=True, num_pages=2)

    await session.kill()


asyncio.run(main())
```

### Multi-Tab Management

```python
import asyncio

from browser_use import BrowserSession


async def main():
    session = BrowserSession()
    await session.start()

    # Navigate, then read the tab list from the state summary
    await session.navigate_to_url("https://example.com")
    state = await session.get_browser_state_summary()

    # Switch to the first inactive tab
    # (switch_tab and close_tab are not in the API listing above)
    for tab in state.tabs:
        print(f"Tab: {tab.title} - {tab.url}")
        if not tab.active:
            await session.switch_tab(tab.id)
            # Do something in this tab
            break

    # Close unnecessary tabs
    for tab in state.tabs:
        if "unwanted" in tab.title.lower():
            await session.close_tab(tab.id)

    await session.kill()


asyncio.run(main())
```

### Session Persistence

```python
import asyncio

from browser_use import BrowserSession, BrowserProfile


async def main():
    # Create a persistent browser session
    profile = BrowserProfile(
        keep_alive=True,
        user_data_dir="/persistent/browser/data"
    )

    session = BrowserSession(browser_profile=profile)
    await session.start()

    # Use the session for multiple tasks
    await session.navigate_to_url("https://site1.com")
    # ... perform tasks ...

    await session.navigate_to_url("https://site2.com")
    # ... perform more tasks ...

    # With keep_alive=True the browser stays alive after the session ends,
    # and data in user_data_dir persists for future use


asyncio.run(main())
```

## Browser Constants

```python { .api }
# Default browser configuration values
DEFAULT_BROWSER_PROFILE: BrowserProfile
CHROME_DEBUG_PORT: int = 9242
CHROME_DISABLED_COMPONENTS: list[str]
CHROME_HEADLESS_ARGS: list[str]
CHROME_DOCKER_ARGS: list[str]

# Screenshot and viewport limits
MAX_SCREENSHOT_HEIGHT: int = 2000
MAX_SCREENSHOT_WIDTH: int = 1920
```