# File Compression

WhiteNoise can pre-compress static files in both gzip and Brotli formats. The compressor skips file types that are already compressed, keeps a compressed variant only when it is meaningfully smaller than the original, and the resulting variants are served automatically based on the client's `Accept-Encoding` header.

## Capabilities

### Compressor Class

The main compression utility. It handles both gzip and Brotli compression, decides per file whether a compressed variant is worth keeping, and supports concurrent processing.

```python { .api }
class Compressor:
    """
    File compression utility for static files.

    Supports gzip and Brotli compression with intelligent effectiveness
    testing and concurrent processing for performance.
    """

    def __init__(
        self,
        extensions=None,
        use_gzip=True,
        use_brotli=True,
        log=print,
        quiet=False
    ):
        """
        Initialize compressor.

        Parameters:
        - extensions: File extensions to skip (defaults to SKIP_COMPRESS_EXTENSIONS)
        - use_gzip: Enable gzip compression
        - use_brotli: Enable Brotli compression (requires brotli package)
        - log: Logging function (defaults to print)
        - quiet: Suppress logging output
        """

    def should_compress(self, filename):
        """
        Check if file should be compressed based on extension.

        Parameters:
        - filename: File name to check

        Returns:
        - bool: True if file should be compressed
        """

    def compress(self, path):
        """
        Compress file with available compression formats.

        Parameters:
        - path: Absolute path to file to compress

        Returns:
        - list[str]: List of created compressed file paths

        Creates compressed variants (.gz, .br) only if compression
        reduces file size by at least 5%.
        """

    @staticmethod
    def compress_gzip(data):
        """
        Compress data using gzip (static method).

        Parameters:
        - data: bytes to compress

        Returns:
        - bytes: Gzip compressed data

        Uses maximum compression level and removes timestamp for
        deterministic output.
        """

    @staticmethod
    def compress_brotli(data):
        """
        Compress data using Brotli (static method).

        Parameters:
        - data: bytes to compress

        Returns:
        - bytes: Brotli compressed data

        Requires brotli package to be installed.
        """

    def is_compressed_effectively(self, encoding_name, path, orig_size, data):
        """
        Check if compression is effective (reduces size by at least 5%).

        Parameters:
        - encoding_name: Name of compression format for logging
        - path: File path for logging
        - orig_size: Original file size in bytes
        - data: Compressed data

        Returns:
        - bool: True if compression is effective
        """

    def write_data(self, path, data, suffix, stat_result):
        """
        Write compressed data to file with preserved timestamps.

        Parameters:
        - path: Original file path
        - data: Compressed data bytes
        - suffix: File suffix (.gz or .br)
        - stat_result: os.stat result for timestamp preservation

        Returns:
        - str: Path to created compressed file
        """

    @staticmethod
    def get_extension_re(extensions):
        """
        Build regex pattern for file extension matching.

        Parameters:
        - extensions: Iterable of file extensions to match

        Returns:
        - re.Pattern: Compiled regex for extension matching
        """

    # Default extensions to skip compression
    SKIP_COMPRESS_EXTENSIONS: tuple[str, ...] = (
        # Images
        "jpg", "jpeg", "png", "gif", "webp",
        # Compressed files
        "zip", "gz", "tgz", "bz2", "tbz", "xz", "br",
        # Flash
        "swf", "flv",
        # Fonts
        "woff", "woff2",
        # Video
        "3gp", "3gpp", "asf", "avi", "m4v", "mov", "mp4", "mpeg", "mpg", "webm", "wmv"
    )
```
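
The timestamp preservation described for `write_data` can be illustrated with the standard library alone. This is a minimal sketch, not WhiteNoise's implementation: the helper name `write_variant` and the sample file are hypothetical, and it assumes the compressed variant should simply inherit the original file's access and modification times.

```python
import os
import tempfile


def write_variant(path: str, data: bytes, suffix: str, stat_result: os.stat_result) -> str:
    """Write data next to the original file and copy the original's timestamps."""
    variant_path = path + suffix
    with open(variant_path, "wb") as f:
        f.write(data)
    # Reuse the original's access/modify times so both files report the same mtime.
    os.utime(variant_path, ns=(stat_result.st_atime_ns, stat_result.st_mtime_ns))
    return variant_path


with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "style.css")
    with open(original, "wb") as f:
        f.write(b"body { margin: 0 }")
    # Give the original a fixed, old timestamp so the effect is visible.
    os.utime(original, ns=(10**18, 10**18))

    variant = write_variant(original, b"placeholder bytes", ".gz", os.stat(original))
    same_mtime = os.stat(variant).st_mtime_ns == os.stat(original).st_mtime_ns
    print(os.path.basename(variant), same_mtime)
```

Keeping the modification time aligned means tools that compare timestamps see the original and its variant as belonging together.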

### Command Line Interface

WhiteNoise provides a command-line tool for batch compression of static files.

```python { .api }
def main(argv=None):
    """
    Command-line interface for file compression.

    Parameters:
    - argv: Command line arguments (defaults to sys.argv[1:])

    Returns:
    - int: Exit code (0 for success)

    Usage:
    python -m whitenoise.compress [options] <root> [extensions...]

    Arguments:
    - root: Directory to search for files
    - extensions: File extensions to exclude (defaults to SKIP_COMPRESS_EXTENSIONS)

    Options:
    -q, --quiet: Don't produce log output
    --no-gzip: Don't produce gzip '.gz' files
    --no-brotli: Don't produce brotli '.br' files
    """
```
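
To make the argument layout concrete, here is a rough `argparse` sketch that accepts the options and positionals documented above. It is an illustration only and not WhiteNoise's own parser: `build_parser`, the `prog` name, and the `use_gzip`/`use_brotli` destination names are assumptions introduced for this example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options."""
    parser = argparse.ArgumentParser(prog="compress-sketch")
    parser.add_argument("-q", "--quiet", action="store_true",
                        help="Don't produce log output")
    # store_false flags default to True and flip to False when given.
    parser.add_argument("--no-gzip", dest="use_gzip", action="store_false",
                        help="Don't produce gzip '.gz' files")
    parser.add_argument("--no-brotli", dest="use_brotli", action="store_false",
                        help="Don't produce brotli '.br' files")
    parser.add_argument("root", help="Directory to search for files")
    parser.add_argument("extensions", nargs="*",
                        help="File extensions to exclude")
    return parser


args = build_parser().parse_args(
    ["--quiet", "--no-brotli", "/path/to/static", "pdf", "doc"]
)
print(args.quiet, args.use_gzip, args.use_brotli, args.root, args.extensions)
```

Passing an explicit list to `parse_args` is the same mechanism that lets `main(argv)` be driven from a script or a test instead of the shell.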

## Usage Examples

### Basic Compression

```python
from whitenoise.compress import Compressor

# Create compressor with default settings
compressor = Compressor()

# Compress a single file
compressed_files = compressor.compress('/path/to/style.css')
# Creates /path/to/style.css.gz and /path/to/style.css.br (if effective)

# Check if file should be compressed
if compressor.should_compress('script.js'):
    compressed_files = compressor.compress('/path/to/script.js')
```

### Custom Configuration

```python
from whitenoise.compress import Compressor

# Use a custom skip list (it replaces the default one) and disable Brotli
compressor = Compressor(
    extensions=['jpg', 'png', 'pdf', 'doc'],  # Custom skip list
    use_gzip=True,
    use_brotli=False,  # Disable Brotli
    quiet=True  # Suppress logging
)

# Custom logging function
def custom_logger(message):
    print(f"[COMPRESS] {message}")

compressor = Compressor(log=custom_logger)
```

### Batch Compression

```python
import os
from concurrent.futures import ThreadPoolExecutor, as_completed
from whitenoise.compress import Compressor

def compress_directory(root_path):
    """Compress all eligible files in directory."""
    compressor = Compressor()

    with ThreadPoolExecutor() as executor:
        futures = []

        for dirpath, dirs, files in os.walk(root_path):
            for filename in files:
                if compressor.should_compress(filename):
                    filepath = os.path.join(dirpath, filename)
                    future = executor.submit(compressor.compress, filepath)
                    futures.append(future)

        # Wait for completion and handle any errors
        for future in as_completed(futures):
            try:
                compressed_files = future.result()
                print(f"Compressed: {compressed_files}")
            except Exception as e:
                print(f"Compression error: {e}")
```

### Command Line Usage

```bash
# Compress all eligible files in the static directory
python -m whitenoise.compress /path/to/static

# Exclude only pdf and doc files (the listed extensions replace the default exclusion list)
python -m whitenoise.compress /path/to/static pdf doc

# Quiet mode, gzip only
python -m whitenoise.compress --quiet --no-brotli /path/to/static

# Help
python -m whitenoise.compress --help
```

### Integration with Build Process

```python
# build.py - Build script integration
import os
import subprocess

from whitenoise.compress import Compressor

def build_static_files():
    """Build and compress static files."""
    static_root = '/var/www/static'

    # Run your static file collection/build process.
    # check=True raises CalledProcessError if the build fails,
    # so stale output is never compressed.
    subprocess.run(['npm', 'run', 'build'], check=True)  # or equivalent

    # Compress generated files
    compressor = Compressor(quiet=False)

    for dirpath, dirs, files in os.walk(static_root):
        for filename in files:
            if compressor.should_compress(filename):
                filepath = os.path.join(dirpath, filename)
                compressed = compressor.compress(filepath)
                if compressed:
                    print(f"Compressed {filename} -> {compressed}")

if __name__ == '__main__':
    build_static_files()
```

## Compression Algorithm Details

### Gzip Compression

- Uses Python's built-in `gzip` module
- Maximum compression level (9) for best size reduction
- Timestamp set to 0 for deterministic output
- Compatible with all modern browsers
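
The gzip settings listed above can be reproduced with the standard library. The following is a minimal sketch under those stated settings (level 9, `mtime=0`); the function name `gzip_deterministic` and the sample payload are illustrative and not part of WhiteNoise's API.

```python
import gzip
import io


def gzip_deterministic(data: bytes) -> bytes:
    """Gzip data at level 9 with the header timestamp fixed to 0."""
    buffer = io.BytesIO()
    # mtime=0 writes "no timestamp" into the gzip header, so the output
    # depends only on the input bytes; filename="" omits the name field.
    with gzip.GzipFile(filename="", mode="wb", fileobj=buffer,
                       compresslevel=9, mtime=0) as gz:
        gz.write(data)
    return buffer.getvalue()


payload = b"body { margin: 0; padding: 0; }\n" * 100
first = gzip_deterministic(payload)
second = gzip_deterministic(payload)
print(len(payload), len(first), first == second)
```

Because identical input always yields identical bytes, rebuilding unchanged assets does not change the `.gz` files, which keeps content hashes and deployment diffs stable.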

### Brotli Compression

- Requires the optional `brotli` package
- Generally compresses better than gzip
- Supported by modern browsers (Chrome 50+, Firefox 44+, Safari 11+)
- Served in preference to gzip when the client supports both
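
The preference for Brotli over gzip can be sketched as a small negotiation function. This is a simplified illustration, not WhiteNoise's matching logic: `accepted_encodings` and `choose_variant` are hypothetical names, and the header parsing handles only plain tokens with an optional `q` value.

```python
def accepted_encodings(header: str) -> set[str]:
    """Return the encodings in an Accept-Encoding header with non-zero quality."""
    accepted = set()
    for item in header.split(","):
        name, _, params = item.partition(";")
        name = name.strip().lower()
        if not name:
            continue
        quality = 1.0
        params = params.strip().lower()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat malformed q-values as "not accepted"
        if quality > 0:
            accepted.add(name)
    return accepted


def choose_variant(header: str, available: set[str]) -> str:
    """Pick Brotli first, then gzip, else the uncompressed file."""
    accepted = accepted_encodings(header)
    for encoding in ("br", "gzip"):
        if encoding in accepted and encoding in available:
            return encoding
    return "identity"


print(choose_variant("gzip, deflate, br", {"br", "gzip"}))
print(choose_variant("gzip, deflate", {"br", "gzip"}))
```

A variant is only chosen when it both exists on disk and is accepted by the client, so files without a `.br` or `.gz` sibling fall back to the original.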
303
304
### Effectiveness Testing
305
306
Compression is only applied if it reduces file size by more than 5%. This prevents:
307
- Wasted CPU cycles on incompressible files
308
- Potential size increases for very small files
309
- Storage of ineffective compressed variants
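
The 5% rule amounts to a simple size comparison. The sketch below is one way to express it; the function name `is_effective` is hypothetical, and two details are assumptions rather than documented behavior: a saving of exactly 5% counts as effective, and empty files are never considered worth compressing.

```python
def is_effective(orig_size: int, compressed_size: int, min_saving_percent: int = 5) -> bool:
    """Return True if the compressed size saves at least min_saving_percent."""
    if orig_size == 0:
        # Nothing to save on an empty file (assumption for this sketch).
        return False
    # Integer arithmetic avoids floating-point edge cases at the boundary.
    return compressed_size * 100 <= orig_size * (100 - min_saving_percent)


for orig, comp in [(1000, 400), (1000, 950), (1000, 951), (20, 40)]:
    print(orig, comp, is_effective(orig, comp))
```

The last sample pair shows the small-file case: a 20-byte input that grows to 40 bytes after adding compression framing is rejected.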

### File Selection

By default, files are excluded from compression when their extension indicates a format that is already compressed or that does not compress well. The default list matches `SKIP_COMPRESS_EXTENSIONS`:

- **Images**: jpg, jpeg, png, gif, webp
- **Archives**: zip, gz, tgz, bz2, tbz, xz, br
- **Flash**: swf, flv
- **Fonts**: woff, woff2
- **Video**: 3gp, 3gpp, asf, avi, m4v, mov, mp4, mpeg, mpg, webm, wmv
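
Extension-based selection of this kind is typically done with a single compiled regular expression. The sketch below shows the idea; `build_skip_pattern` and this standalone `should_compress` are illustrative helpers, and case-insensitive matching is an assumption of the example rather than documented behavior.

```python
import re


def build_skip_pattern(extensions) -> re.Pattern:
    """Compile a pattern matching filenames that end in one of the extensions."""
    extensions = list(extensions)
    if not extensions:
        return re.compile(r"(?!)")  # never matches: nothing is skipped
    alternatives = "|".join(re.escape(ext) for ext in extensions)
    return re.compile(rf"\.({alternatives})$", re.IGNORECASE)


def should_compress(filename: str, skip_pattern: re.Pattern) -> bool:
    """A file is eligible when its name does not match the skip pattern."""
    return skip_pattern.search(filename) is None


pattern = build_skip_pattern(("jpg", "png", "woff2"))
for name in ("app.js", "photo.JPG", "font.woff2", "notes.jpg.txt"):
    print(name, should_compress(name, pattern))
```

Anchoring the pattern at the end of the name means only the final extension counts, so `notes.jpg.txt` is still treated as a text file.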

## Types

```python { .api }
from typing import Callable, Optional
import argparse

# Logging function type
LogFunction = Callable[[str], None]

# Command line arguments type
Args = argparse.Namespace

# Compression result type
CompressionResult = list[str]  # List of compressed file paths
```