# Core Browser Operations

Low-level browser functionality for HTTP requests with automatic BeautifulSoup parsing. The Browser class provides direct request/response handling with session management and is recommended for applications requiring fine-grained control over HTTP interactions.

## Capabilities

### Browser Creation and Configuration

Create a browser instance with optional session, parsing, and adapter configuration.

```python { .api }
class Browser:
    def __init__(self, session=None, soup_config={'features': 'lxml'},
                 requests_adapters=None, raise_on_404=False, user_agent=None):
        """
        Create a Browser instance.

        Parameters:
        - session: Optional requests.Session instance
        - soup_config: Dict of keyword arguments passed to BeautifulSoup
        - requests_adapters: Optional dict mapping URL prefixes to requests
          transport adapters, mounted on the session
        - raise_on_404: If True, raise LinkNotFoundError on 404 errors
        - user_agent: Custom User-Agent string
        """
```

**Usage Example:**

```python
import mechanicalsoup
import requests

# Basic browser
browser = mechanicalsoup.Browser()

# Browser with custom session
session = requests.Session()
browser = mechanicalsoup.Browser(session=session)

# Browser with custom BeautifulSoup parser
browser = mechanicalsoup.Browser(soup_config={'features': 'html.parser'})

# Browser that raises on 404 errors
browser = mechanicalsoup.Browser(raise_on_404=True)
```

### HTTP Request Methods

Standard HTTP methods with automatic BeautifulSoup parsing of HTML responses.

```python { .api }
def request(self, *args, **kwargs):
    """Low-level request method, forwards to session.request()"""

def get(self, *args, **kwargs):
    """HTTP GET request with soup parsing"""

def post(self, *args, **kwargs):
    """HTTP POST request with soup parsing"""

def put(self, *args, **kwargs):
    """HTTP PUT request with soup parsing"""
```

**Usage Example:**

```python
import mechanicalsoup

browser = mechanicalsoup.Browser()

# GET request for an HTML page; response.soup is the parsed document.
# For non-HTML responses (such as JSON), response.soup is None.
response = browser.get("https://httpbin.org/html")
print(response.soup.find("h1"))

# POST request with data
response = browser.post("https://httpbin.org/post",
                        data={"key": "value"})

# PUT request with data
response = browser.put("https://httpbin.org/put",
                       json={"updated_key": "updated_value"})

# Request with headers
response = browser.get("https://httpbin.org/headers",
                       headers={"Custom-Header": "value"})
```
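Since only HTML responses are parsed, code that reads `response.soup` benefits from a guard. The helper below is a hypothetical sketch (`page_title` is not part of MechanicalSoup); it uses stand-in objects instead of real responses so that it runs offline.

```python
import bs4
from types import SimpleNamespace


def page_title(response):
    """Return the page title, or None when there is no parsed HTML or no <title>."""
    soup = getattr(response, "soup", None)
    if soup is None or soup.title is None:
        return None
    return soup.title.get_text(strip=True)


# Stand-ins for real responses: one with parsed HTML, one without (e.g. JSON)
html_response = SimpleNamespace(
    soup=bs4.BeautifulSoup(
        "<html><head><title> Demo </title></head></html>", "html.parser"))
json_response = SimpleNamespace(soup=None)

print(page_title(html_response))
print(page_title(json_response))
```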

### Form Submission

Submit HTML forms with automatic data extraction and encoding.

```python { .api }
def submit(self, form, url=None, **kwargs):
    """
    Submit a form object.

    Parameters:
    - form: Form instance to submit
    - url: URL of the page the form is on; required when the form's
      action is a relative path
    - **kwargs: Additional request parameters

    Returns:
    requests.Response with soup attribute
    """
```

**Usage Example:**

```python
from mechanicalsoup import Browser, Form

browser = Browser()
response = browser.get("https://httpbin.org/forms/post")

# Create and fill form
form = Form(response.soup.find("form"))
form["custname"] = "John Doe"

# Submit form; pass the page URL so the relative form action can be resolved
result = browser.submit(form, url=response.url)

# httpbin replies with JSON, so result.soup is None; print the raw body
print(result.text)
```

### Session and Cookie Management

Manage cookies and session state for authenticated or persistent interactions.

```python { .api }
def set_cookiejar(self, cookiejar):
    """Replace the current cookiejar in the requests session"""

def get_cookiejar(self):
    """Get the current cookiejar from the requests session"""
```

**Usage Example:**

```python
import mechanicalsoup
from http.cookiejar import CookieJar

browser = mechanicalsoup.Browser()

# Get current cookies
cookies = browser.get_cookiejar()

# Set new cookie jar
new_jar = CookieJar()
browser.set_cookiejar(new_jar)
```

### User Agent Management

Set and manage the User-Agent header for requests.

```python { .api }
def set_user_agent(self, user_agent):
    """
    Set the User-Agent header for requests.

    Parameters:
    - user_agent: String to use as User-Agent, or None for default
    """
```

**Usage Example:**

```python
import mechanicalsoup

browser = mechanicalsoup.Browser()

# Set custom user agent
browser.set_user_agent("MyBot/1.0 (Contact: admin@example.com)")

# Reset to default
browser.set_user_agent(None)
```

### Debugging and Development

Open a parsed page in the system's default web browser to inspect what was actually received.

```python { .api }
def launch_browser(self, soup):
    """
    Launch external browser with page content for debugging.

    Parameters:
    - soup: BeautifulSoup object to display
    """
```

### Session Cleanup

Clean up browser resources and close connections.

```python { .api }
def close(self):
    """Close the session and clear cookies"""
```

**Usage Example:**

```python
import mechanicalsoup

browser = mechanicalsoup.Browser()
try:
    response = browser.get("https://example.com")
    # Use response...
finally:
    browser.close()
```

### Context Manager Support

`Browser` supports the context manager protocol, so the session is closed automatically when the `with` block exits.

```python { .api }
def __enter__(self):
    """Enter context manager, returns self"""

def __exit__(self, *args):
    """Exit context manager, calls close() automatically"""
```

**Usage Example:**

```python
import mechanicalsoup

# Recommended approach using context manager
with mechanicalsoup.Browser() as browser:
    response = browser.get("https://example.com")
    # Process response...
    response2 = browser.post("https://example.com/api", data={"key": "value"})
# Browser automatically closed when exiting with-block

# Placeholder inputs for the loop below; replace with your own
urls = ["https://example.com"]

def process_page(soup):
    print(soup.title)

# For long-running applications
with mechanicalsoup.Browser(user_agent="MyApp/1.0") as browser:
    for url in urls:
        try:
            response = browser.get(url)
            process_page(response.soup)
        except Exception as e:
            print(f"Error processing {url}: {e}")
```

### Static Utility Methods

Helper methods for response processing and form data extraction.

```python { .api }
@staticmethod
def add_soup(response, soup_config):
    """
    Attach a BeautifulSoup object to a requests response.

    Parameters:
    - response: requests.Response object
    - soup_config: BeautifulSoup configuration dict
    """

@staticmethod
def get_request_kwargs(form, url=None, **kwargs):
    """
    Extract form data for request submission.

    Parameters:
    - form: Form instance
    - url: Optional URL override
    - **kwargs: Additional parameters

    Returns:
    Dict with request parameters
    """
```

## Public Attributes

```python { .api }
# Browser instance attributes
session: requests.Session    # The underlying requests session
soup_config: Dict[str, Any]  # BeautifulSoup configuration
raise_on_404: bool           # Whether to raise on 404 errors
```

## Error Handling

The Browser class can raise LinkNotFoundError when `raise_on_404=True` and a 404 error occurs:

```python
import mechanicalsoup

browser = mechanicalsoup.Browser(raise_on_404=True)
try:
    response = browser.get("https://httpbin.org/status/404")
except mechanicalsoup.LinkNotFoundError:
    print("Page not found!")
```