# Response Content Processing

Functions for processing HTTP response content, with support for streaming, buffered access, automatic encoding detection, and JSON parsing. Each function returns a Deferred, in keeping with Twisted's asynchronous response model.

## Capabilities

### Incremental Content Collection

Collects response body data incrementally as it arrives. This is useful for streaming large responses or processing data in chunks.

```python { .api }
def collect(response, collector):
    """
    Incrementally collect the body of the response.

    This function may only be called once for a given response.
    If the collector raises an exception, it will be set as the error
    value on the response Deferred and the HTTP transport will be closed.

    Parameters:
    - response: IResponse - The HTTP response to collect body from
    - collector: callable - Function called with each data chunk (bytes)

    Returns:
    Deferred that fires with None when entire body has been read
    """
```
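
As an illustration of the collector contract described above, here is a minimal sketch of a collector that hashes and counts the body without storing it. `HashingCollector` is a hypothetical helper, not part of treq; the only assumption is that `collect()` calls it with each `bytes` chunk in arrival order.

```python
import hashlib


class HashingCollector:
    """Hypothetical collector: hashes chunks as they arrive instead of storing them."""

    def __init__(self):
        self._hash = hashlib.sha256()
        self.bytes_seen = 0

    def __call__(self, chunk: bytes) -> None:
        # Called once per chunk; memory use stays constant regardless of body size.
        self._hash.update(chunk)
        self.bytes_seen += len(chunk)

    def hexdigest(self) -> str:
        return self._hash.hexdigest()


# Simulate the chunks a response body might arrive in.
collector = HashingCollector()
for chunk in (b"hello ", b"wor", b"ld"):
    collector(chunk)

print(collector.bytes_seen, collector.hexdigest())
```

With treq, the same object would be passed as `treq.collect(response, collector)`, and the digest read once the returned Deferred fires.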

### Complete Content Retrieval

Gets the complete response content as bytes. The result is cached, so the function can be called multiple times for the same response.

```python { .api }
def content(response):
    """
    Read the complete contents of an HTTP response.

    This function may be called multiple times for a response; it uses
    a WeakKeyDictionary to cache the contents of the response.

    Parameters:
    - response: IResponse - The HTTP response to get contents of

    Returns:
    Deferred that fires with complete content as bytes
    """
```
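
To illustrate the caching idea, the following is a simplified, synchronous sketch of a `WeakKeyDictionary` keyed by response object. `FakeResponse` and `read_body` are hypothetical stand-ins; treq's real implementation works with Deferreds and is not reproduced here.

```python
from weakref import WeakKeyDictionary


class FakeResponse:
    """Hypothetical stand-in for a response whose body should be read only once."""

    def __init__(self, body: bytes):
        self._body = body
        self.reads = 0

    def read(self) -> bytes:
        self.reads += 1
        return self._body


# Entries disappear automatically once the response object is garbage collected.
_cache = WeakKeyDictionary()


def read_body(response: FakeResponse) -> bytes:
    """Return the body, reading from the response only on the first call."""
    if response not in _cache:
        _cache[response] = response.read()
    return _cache[response]


resp = FakeResponse(b"payload")
first = read_body(resp)
second = read_body(resp)
print(first == second, resp.reads)
```

Keying the cache weakly means the cached body does not keep a response alive after the caller has dropped it.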

### Text Content Decoding

Decodes response content as text, using the charset declared in the Content-Type header or, if none is declared, a specified fallback encoding.

```python { .api }
def text_content(response, encoding="ISO-8859-1"):
    """
    Read and decode HTTP response contents as text.

    The charset is automatically detected from the Content-Type header.
    If no charset is specified, the provided encoding is used as fallback.

    Parameters:
    - response: IResponse - The HTTP response to decode
    - encoding: str - Fallback encoding if none detected (default: ISO-8859-1)

    Returns:
    Deferred that fires with decoded text as str
    """
```
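
The choice of fallback matters when a server omits the charset. As a small stdlib-only illustration (no treq calls are involved, and the sample body is made up), the same UTF-8 bytes decode differently under the default ISO-8859-1 fallback and under UTF-8:

```python
# UTF-8 encoded body ("cafe" with an acute accent on the e), as a server
# might send it without declaring a charset.
body = "caf\u00e9".encode("utf-8")

as_latin1 = body.decode("ISO-8859-1")  # what the default fallback would produce
as_utf8 = body.decode("utf-8")         # what a UTF-8 fallback would produce

print(repr(as_latin1))
print(repr(as_utf8))
```

If a server is known to send UTF-8 without a charset parameter, passing `encoding="utf-8"` to `text_content()` avoids this kind of mis-decoding.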

### JSON Content Parsing

Parses response content as JSON, decoding the body as UTF-8 when the response declares no other charset.

```python { .api }
def json_content(response, **kwargs):
    """
    Read and parse HTTP response contents as JSON.

    This function relies on text_content() and may be called multiple
    times for a given response. JSON content is automatically decoded
    as UTF-8 per RFC 7159.

    Parameters:
    - response: IResponse - The HTTP response to parse
    - **kwargs: Additional keyword arguments for json.loads()

    Returns:
    Deferred that fires with parsed JSON data
    """
```
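
Because the extra keyword arguments are forwarded to `json.loads()`, any of its standard hooks can be used. The sketch below calls `json.loads()` directly on a sample string (the payload is made up for illustration) to show the effect of `parse_float`:

```python
import json
from decimal import Decimal

# Made-up payload standing in for a decoded response body.
text = '{"item": "widget", "price": 19.99, "qty": 3}'

default = json.loads(text)
exact = json.loads(text, parse_float=Decimal)

print(type(default["price"]).__name__, type(exact["price"]).__name__)
```

With treq, the equivalent call would be `treq.json_content(response, parse_float=Decimal)`.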

## Usage Examples

### Basic Content Access

```python
import treq
from twisted.internet import defer

@defer.inlineCallbacks
def get_content():
    response = yield treq.get('https://httpbin.org/get')

    # Get raw bytes
    raw_data = yield treq.content(response)
    print(f"Raw data: {raw_data[:100]}...")

    # Get decoded text
    text_data = yield treq.text_content(response)
    print(f"Text data: {text_data[:100]}...")

    # Parse as JSON
    json_data = yield treq.json_content(response)
    print(f"JSON data: {json_data}")
```

### Streaming Large Responses

```python
import treq
from twisted.internet import defer

@defer.inlineCallbacks
def stream_large_file():
    response = yield treq.get('https://httpbin.org/bytes/10000')

    chunks = []
    def collector(data):
        # Called with each bytes chunk as it arrives
        chunks.append(data)
        print(f"Received chunk of {len(data)} bytes")

    yield treq.collect(response, collector)
    total_data = b''.join(chunks)
    print(f"Total received: {len(total_data)} bytes")
```

### Processing Different Content Types

```python
from decimal import Decimal

import treq
from twisted.internet import defer

@defer.inlineCallbacks
def handle_different_types():
    # JSON API response
    json_response = yield treq.get('https://httpbin.org/json')
    data = yield treq.json_content(json_response)

    # Plain text response
    text_response = yield treq.get('https://httpbin.org/robots.txt')
    text = yield treq.text_content(text_response)

    # Binary data (image, file, etc.)
    binary_response = yield treq.get('https://httpbin.org/bytes/1024')
    binary_data = yield treq.content(binary_response)

    # Custom JSON parsing with parameters
    json_response = yield treq.get('https://httpbin.org/json')
    # Keyword arguments are passed to json.loads(); here floats become Decimal
    data = yield treq.json_content(json_response, parse_float=Decimal)
```

### Error Handling

```python
import treq
from twisted.internet import defer

@defer.inlineCallbacks
def handle_content_errors():
    try:
        response = yield treq.get('https://httpbin.org/status/500')

        # Content functions work regardless of HTTP status
        content = yield treq.text_content(response)
        print(f"Error response content: {content}")

    except Exception as e:
        print(f"Request failed: {e}")

    try:
        response = yield treq.get('https://httpbin.org/html')

        # This will raise an exception if content is not valid JSON
        json_data = yield treq.json_content(response)

    except ValueError as e:
        print(f"JSON parsing failed: {e}")
        # Fall back to text content
        text_data = yield treq.text_content(response)
```

### Response Object Methods

The `_Response` object returned by treq also provides convenience methods for content access:

```python
import treq
from twisted.internet import defer

@defer.inlineCallbacks
def use_response_methods():
    response = yield treq.get('https://httpbin.org/json')

    # These are equivalent to the module-level functions
    content_bytes = yield response.content()
    text_data = yield response.text()
    json_data = yield response.json()

    # Incremental collection (chunks.append is called with each bytes chunk)
    chunks = []
    yield response.collect(chunks.append)
```

## Types

Content-related types:

```python { .api }
from typing import Callable

# Collector function type for incremental processing
CollectorFunction = Callable[[bytes], None]

# Encoding detection return type
# Optional[str]: charset name, or None if not detected

# Content function return types
# Deferred[bytes]  for content()
# Deferred[str]    for text_content()
# Deferred[Any]    for json_content()
# Deferred[None]   for collect()
```

## Encoding Detection

treq detects the character encoding from the HTTP response headers:

1. **Content-Type header parsing**: Extracts the charset parameter from Content-Type
2. **JSON default**: Uses UTF-8 for application/json responses, per RFC 7159
3. **Fallback encoding**: Uses the provided encoding parameter (default: ISO-8859-1)
4. **Charset validation**: Validates charset names against the RFC 2978 specification

The encoding detection handles edge cases such as:
- Multiple Content-Type headers (the last one is used)
- Quoted charset values (`charset="utf-8"`)
- Case-insensitive charset names
- Invalid characters in the charset name (falls back to the default)
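
The rules and edge cases above can be sketched with the standard library alone. This is an approximation for illustration, not treq's actual code: `detect_encoding` and `decode` are hypothetical helpers, the header values are passed in as a plain list of strings, and the allowed character set is an assumption based on the RFC 2978 charset grammar.

```python
import string
from email.message import Message
from typing import Optional, Sequence

# Characters permitted in a charset name (approximating the RFC 2978 grammar).
_CHARSET_CHARS = frozenset(string.ascii_letters + string.digits + "!#$%&'+-^_`{}~")


def detect_encoding(content_types: Sequence[str]) -> Optional[str]:
    """Hypothetical helper: pick a charset from raw Content-Type header values."""
    if not content_types:
        return None

    # When several Content-Type headers are present, only the last one counts.
    msg = Message()
    msg["Content-Type"] = content_types[-1]

    charset = msg.get_param("charset")  # surrounding quotes are removed
    if isinstance(charset, str) and charset:
        charset = charset.lower()
        # An invalid name is ignored, so the caller's fallback encoding applies.
        return charset if set(charset) <= _CHARSET_CHARS else None

    if msg.get_content_type() == "application/json":
        return "utf-8"
    return None


def decode(body: bytes, content_types: Sequence[str], fallback: str = "ISO-8859-1") -> str:
    """Decode with the detected charset, or the fallback if none was detected."""
    return body.decode(detect_encoding(content_types) or fallback)


print(detect_encoding(['text/html; charset="UTF-8"']))
print(detect_encoding(["application/json"]))
print(detect_encoding(["text/plain"]))
```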

## Performance Considerations

- **Buffering**: By default, treq buffers the complete response in memory
- **Unbuffered responses**: Pass `unbuffered=True` when making the request to stream large responses
- **Multiple access**: Content functions cache their results for repeated access to the same response
- **Streaming**: Use `collect()` to process large responses incrementally
- **Memory usage**: Consider streaming for responses that may not fit in available memory
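
When streaming with `collect()`, chunk boundaries rarely line up with record boundaries. One possible approach, sketched here with a hypothetical `LineCollector` helper (not part of treq), is to keep only the trailing partial line in memory and hand complete lines to a callback:

```python
from typing import Callable, List


class LineCollector:
    """Hypothetical collector that re-assembles newline-delimited records."""

    def __init__(self, on_line: Callable[[bytes], None]):
        self._on_line = on_line
        self._pending = b""  # only the incomplete tail is kept in memory

    def __call__(self, chunk: bytes) -> None:
        # Everything before the last newline is complete; the rest waits.
        *lines, self._pending = (self._pending + chunk).split(b"\n")
        for line in lines:
            self._on_line(line)

    def flush(self) -> None:
        # Call after the body is complete to emit a final unterminated line.
        if self._pending:
            self._on_line(self._pending)
            self._pending = b""


received: List[bytes] = []
collector = LineCollector(received.append)
for chunk in (b'{"a": 1}\n{"b"', b': 2}\n{"c": 3}'):
    collector(chunk)
collector.flush()

print(received)
```

In a treq program, `flush()` would be called from a callback on the Deferred returned by `collect()`.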