# MechanicalSoup

A Python library for automating interaction with websites. MechanicalSoup provides a simple API for web scraping and form submission, built on Requests for HTTP sessions and BeautifulSoup for HTML parsing. It automatically stores and sends cookies, follows redirects, and can follow links and submit forms. It does not execute JavaScript.

## Package Information

- **Package Name**: mechanicalsoup
- **Language**: Python
- **Installation**: `pip install MechanicalSoup`

## Core Imports

```python
import mechanicalsoup
```

Common usage patterns:

```python
from mechanicalsoup import StatefulBrowser, Browser, Form
from mechanicalsoup import LinkNotFoundError, InvalidFormMethod
```

## Basic Usage

```python
import mechanicalsoup

# Create a browser instance
browser = mechanicalsoup.StatefulBrowser()

# Open a webpage
browser.open("https://httpbin.org/forms/post")

# Select and fill a form
browser.select_form('form[action="/post"]')
browser["custname"] = "John Doe"
browser["custtel"] = "555-1234"

# Submit the form
response = browser.submit_selected()
print(response.text)

# Navigate using links
browser.open("https://httpbin.org/")
links = browser.links()
if links:
    browser.follow_link(links[0])
    print(f"Now at: {browser.url}")
```

## Architecture

MechanicalSoup provides a layered architecture for web automation:

- **Browser**: Low-level HTTP browser with BeautifulSoup integration for basic request/response handling
- **StatefulBrowser**: High-level browser, built on `Browser`, that maintains navigation state, handles forms, and provides convenient web interaction methods
- **Form**: HTML form manipulation class for filling fields and preparing submissions
- **Utilities**: Exception classes and helper functions for error handling and form analysis

This layering supports both short scraping scripts and multi-step automation workflows.

## Capabilities

### Core Browser Operations

Low-level HTTP browser functionality providing direct request/response handling with automatic BeautifulSoup parsing. Handles sessions, cookies, and basic web interactions.

```python { .api }
class Browser:
    def __init__(self, session=None, soup_config={'features': 'lxml'},
                 requests_adapters=None,
                 raise_on_404=False, user_agent=None): ...
    def get(self, *args, **kwargs): ...
    def post(self, *args, **kwargs): ...
    def submit(self, form, url=None, **kwargs): ...
```

[Core Browser Operations](./browser.md)

### Stateful Web Navigation

High-level browser that maintains page state and provides convenient methods for navigation, link following, and multi-step web interactions. Recommended for most applications.

```python { .api }
class StatefulBrowser(Browser):
    @property
    def page(self): ...  # Current page BeautifulSoup object
    @property
    def url(self): ...  # Current page URL
    @property
    def form(self): ...  # Currently selected form

    def open(self, url, *args, **kwargs): ...
    def select_form(self, selector="form", nr=0): ...
    def follow_link(self, link=None, **kwargs): ...
```

[Stateful Web Navigation](./navigation.md)

### Form Handling

HTML form manipulation and field setting capabilities. Supports all standard form elements including inputs, checkboxes, radio buttons, selects, and textareas.

```python { .api }
class Form:
    def __init__(self, form): ...  # form is bs4.element.Tag
    def set(self, name, value, force=False): ...
    def __setitem__(self, name, value): ...
    def set_checkbox(self, data, uncheck_other_boxes=True): ...
    def set_select(self, data): ...
```

[Form Handling](./forms.md)

### Utilities and Error Handling

Exception classes and utility functions for error handling and form analysis.

```python { .api }
class LinkNotFoundError(Exception): ...
class InvalidFormMethod(LinkNotFoundError): ...

def is_multipart_file_upload(form, tag): ...
```

[Utilities and Error Handling](./utilities.md)

## Types

```python { .api }
# Session and configuration types
from typing import Optional, Dict, Any, List, Union
from requests import Session
from bs4 import BeautifulSoup, Tag

# Common parameter types
SoupConfig = Dict[str, Any]        # keyword arguments passed to the BeautifulSoup constructor
RequestsAdapters = Dict[str, Any]  # maps a URL prefix to a requests transport adapter
```