# Merge System

The core machinery for merging assets. It provides data abstractions (hunks) and filter application tools for the asset processing pipeline, handles several kinds of content sources, and supports cached filter processing.

## Capabilities

### Base Hunk

Abstract base class representing a unit of content that can be processed and merged.

```python { .api }
class BaseHunk:
    def mtime(self):
        """Return modification time of the content source."""

    def id(self):
        """Return unique identifier based on content hash."""

    def data(self):
        """Return the content as a string."""

    def save(self, filename):
        """Save content to specified file."""

    def __eq__(self, other):
        """Compare hunks by content hash."""
```
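
To make the content-hash identity concrete, here is a minimal sketch using only the standard library. `SketchHunk` is a hypothetical stand-in, not the webassets class, and the choice of SHA-256 is an assumption made for illustration.

```python
import hashlib


class SketchHunk:
    """Hypothetical stand-in for a hunk; not the webassets implementation."""

    def __init__(self, text):
        self._text = text

    def data(self):
        return self._text

    def id(self):
        # Assumption: SHA-256 is used here purely for illustration.
        return hashlib.sha256(self.data().encode("utf-8")).hexdigest()

    def __eq__(self, other):
        if isinstance(other, SketchHunk):
            return self.id() == other.id()
        return NotImplemented

    def __hash__(self):
        # Defining __eq__ removes the default __hash__, so provide a
        # content-based one to keep instances usable as dict keys.
        return hash(self.id())


a = SketchHunk("body { margin: 0 }")
b = SketchHunk("body { margin: 0 }")
c = SketchHunk("body { margin: 1px }")
print(a == b)  # True
print(a == c)  # False
```

Because equality follows the content rather than the object, two hunks built from different sources compare equal whenever their data matches.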

### File Hunk

Represents content from a single file on the filesystem.

```python { .api }
class FileHunk(BaseHunk):
    def __init__(self, filename):
        """
        Initialize with path to source file.

        Args:
            filename: Path to the file to read content from
        """

    def mtime(self):
        """Return file modification time."""

    def data(self):
        """Read and return file contents as UTF-8 string."""
```

### URL Hunk

Represents content from a remote URL, with HTTP caching based on the `ETag` and `Last-Modified` headers.

```python { .api }
class UrlHunk(BaseHunk):
    def __init__(self, url, env=None):
        """
        Initialize with URL and optional environment for caching.

        Args:
            url: URL to fetch content from
            env: Environment instance for cache access (optional)
        """

    def data(self):
        """
        Fetch URL content with HTTP caching support.

        Uses environment cache for etag/last-modified headers
        and caches response content and headers.
        """
```

### Memory Hunk

Represents processed content in memory, typically the result of filtering or merging operations.

```python { .api }
class MemoryHunk(BaseHunk):
    def __init__(self, data, files=None):
        """
        Initialize with content data and optional source file list.

        Args:
            data: String content or file-like object
            files: List of source files this content originated from
        """

    def mtime(self):
        """Return None (memory content has no modification time)."""

    def data(self):
        """Return content as string, reading from file-like objects if needed."""

    def save(self, filename):
        """Save content to file with UTF-8 encoding."""
```
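
The "string or file-like object" handling described for `data` can be sketched as follows. The helper name `read_content` is hypothetical, and the `hasattr(..., "read")` check is one plausible way to tell the two cases apart, not a statement about the actual implementation.

```python
import io


def read_content(source):
    """Return text from either a plain string or a file-like object."""
    # Assumption: anything exposing .read() is treated as file-like.
    if hasattr(source, "read"):
        return source.read()
    return source


print(read_content("/* inline */"))
print(read_content(io.StringIO("/* from stream */")))
```

In this sketch a stream is consumed by the first call, so a second read of the same stream yields an empty string; callers who need the content more than once would want to keep the returned value.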

### Merge Function

```python { .api }
def merge(hunks, separator=None):
    """
    Merge multiple hunks into a single MemoryHunk.

    Args:
        hunks: List of hunk objects to merge
        separator: String to join hunks (default: newline)

    Returns:
        MemoryHunk: Combined content from all input hunks
    """
```
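
The joining behaviour documented above can be pictured with plain strings. `merge_text` is a hypothetical helper written for this illustration; it mirrors the documented default (newline) but operates on strings rather than hunks.

```python
def merge_text(parts, separator=None):
    """Join content strings, falling back to a newline separator."""
    if not separator:
        separator = "\n"
    return separator.join(parts)


merged = merge_text(["var a = 1; // trailing comment", "var b = 2;"])
print(merged)
```

A newline is a safe default for JavaScript in particular: if the first part ends in a `//` comment, joining without a line break would turn the start of the next part into comment text.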

### Filter Tool

Applies filters to hunks with caching support, managing the filter execution pipeline and cache operations.

```python { .api }
class FilterTool:
    VALID_TRANSFORMS = ('input', 'output')
    VALID_FUNCS = ('open', 'concat')

    def __init__(self, cache=None, no_cache_read=False, kwargs=None):
        """
        Initialize filter tool with cache and options.

        Args:
            cache: Cache instance for storing filter results
            no_cache_read: Skip cache reads (but still write results)
            kwargs: Default arguments to pass to all filters
        """

    def apply(self, hunk, filters, type, kwargs=None):
        """
        Apply filters to a hunk using stream transforms.

        Args:
            hunk: Input hunk to process
            filters: List of filter objects to apply
            type: Transform type ('input' or 'output')
            kwargs: Additional arguments for filters

        Returns:
            MemoryHunk: Processed content
        """

    def apply_func(self, filters, type, args, kwargs=None, cache_key=None):
        """
        Apply filter functions that don't use stream transforms.

        Args:
            filters: List of filter objects (only one can have the method)
            type: Function type ('open' or 'concat')
            args: Arguments to pass to filter function
            kwargs: Keyword arguments for filter
            cache_key: Additional cache key components

        Returns:
            MemoryHunk: Function result

        Raises:
            NoFilters: No filters implement the requested function
            MoreThanOneFilterError: Multiple filters implement the function
        """
```
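
The "only one filter may implement the function" rule of `apply_func` can be sketched in isolation. Everything below is a stand-in: the two exception classes are local copies that only borrow the documented names, and `pick_single_filter`, `OpenFilter`, and `PlainFilter` are hypothetical.

```python
class NoFilters(Exception):
    """Local stand-in: no filter implements the requested function."""


class MoreThanOneFilterError(Exception):
    """Local stand-in: several filters implement an exclusive function."""

    def __init__(self, message, filters):
        super().__init__(message)
        self.filters = filters


def pick_single_filter(filters, func_name):
    """Return the one filter that implements func_name, or raise."""
    candidates = [f for f in filters if callable(getattr(f, func_name, None))]
    if not candidates:
        raise NoFilters("no filter implements %r" % func_name)
    if len(candidates) > 1:
        raise MoreThanOneFilterError(
            "%d filters implement %r" % (len(candidates), func_name),
            candidates,
        )
    return candidates[0]


class OpenFilter:
    """Hypothetical filter that provides an open() function."""

    def open(self, out, source_path):
        out.write("/* opened %s */" % source_path)


class PlainFilter:
    """Hypothetical filter without open() or concat()."""


chosen = pick_single_filter([OpenFilter(), PlainFilter()], "open")
print(type(chosen).__name__)  # OpenFilter
```

Raising on ambiguity, instead of silently picking the first match, keeps the result independent of filter ordering.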

### Filter Management Functions

```python { .api }
def merge_filters(filters1, filters2):
    """
    Merge two filter lists, removing duplicates.

    Args:
        filters1: Primary filter list (takes precedence)
        filters2: Secondary filter list (duplicates removed)

    Returns:
        list: Combined filter list with filters1 + unique filters from filters2
    """

def select_filters(filters, level):
    """
    Select filters appropriate for given debug level.

    Args:
        filters: List of filter objects
        level: Debug level to check against

    Returns:
        list: Filters that should run at the specified debug level
    """
```
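
A rough sketch of both helpers, under stated assumptions: `SketchFilter`, `merge_filter_lists`, and `select_for_level` are hypothetical names, debug levels are modelled as integers, and the selection rule (a per-filter maximum level, with `None` meaning "always run") is an assumption made for illustration, since the rule is not spelled out above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SketchFilter:
    """Hypothetical filter record used only for this illustration."""

    name: str
    max_debug_level: Optional[int] = None  # assumed attribute


def merge_filter_lists(primary, secondary):
    """Keep primary as-is, then append entries of secondary not yet present."""
    result = list(primary)
    for f in secondary or []:
        if f not in result:
            result.append(f)
    return result


def select_for_level(filters, level):
    """Keep filters whose assumed maximum debug level permits `level`."""
    return [
        f for f in filters
        if f.max_debug_level is None or f.max_debug_level >= level
    ]


cssmin = SketchFilter("cssmin", max_debug_level=0)
rewrite = SketchFilter("cssrewrite")
sass = SketchFilter("sass", max_debug_level=2)

combined = merge_filter_lists([cssmin, rewrite], [rewrite, sass])
print([f.name for f in combined])  # ['cssmin', 'cssrewrite', 'sass']
print([f.name for f in select_for_level(combined, 1)])  # ['cssrewrite', 'sass']
```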

## Exception Classes

```python { .api }
class MoreThanOneFilterError(Exception):
    """
    Raised when multiple filters implement a function that can only be used by one.

    Attributes:
        filters: List of conflicting filter objects
    """

class NoFilters(Exception):
    """Raised when no filters implement a requested function."""
```

## Key Features

### Content Abstraction

The hunk system provides a unified interface for different content sources:

- **File-based content**: Direct filesystem access with modification tracking
- **URL-based content**: HTTP fetching with caching headers support
- **Memory content**: Processed data from filters or merge operations

### HTTP Caching

URL hunks cache remote content using standard HTTP validators:

- Stores `ETag` and `Last-Modified` headers in the environment cache
- Uses conditional requests (`If-None-Match`, `If-Modified-Since`)
- Reuses the cached content when the server answers 304 Not Modified
- Caches headers and content under separate entries
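
The conditional-request flow in the list above can be sketched without touching the network. This is an illustration under assumptions, not the library's code: `conditional_headers`, `fetch_cached`, and `fake_server` are hypothetical, the cache is a plain dict, and the transport is injected as a callable returning `(status, response_headers, body)`.

```python
def conditional_headers(validators):
    """Build request headers from cached (etag, last_modified) validators."""
    headers = {}
    if validators:
        etag, last_modified = validators
        if etag:
            headers["If-None-Match"] = etag
        if last_modified:
            headers["If-Modified-Since"] = last_modified
    return headers


def fetch_cached(url, cache, fetch):
    """Fetch url, reusing the cached body when the server answers 304."""
    headers = conditional_headers(cache.get(("headers", url)))
    status, response_headers, body = fetch(url, headers)
    if status == 304:
        return cache[("contents", url)]
    # Store validators and content under separate keys.
    cache[("headers", url)] = (
        response_headers.get("ETag"),
        response_headers.get("Last-Modified"),
    )
    cache[("contents", url)] = body
    return body


def fake_server(url, headers):
    """Stand-in for a remote server that honours If-None-Match."""
    if headers.get("If-None-Match") == '"v1"':
        return 304, {}, None
    return 200, {"ETag": '"v1"'}, "console.log('lib');"


cache = {}
first = fetch_cached("https://cdn.example.com/lib.js", cache, fake_server)
second = fetch_cached("https://cdn.example.com/lib.js", cache, fake_server)
print(first == second)  # True
```

The second call sends the stored `ETag`, receives a 304 from the stand-in server, and returns the cached body without transferring it again.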

### Filter Pipeline

The `FilterTool` class manages filter execution:

- Supports both stream transforms (input/output) and function calls (open/concat)
- Caches results under composite cache keys
- Handles filter-specific options and additional cache key components
- Prevents conflicts when multiple filters implement exclusive functions
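
One way to picture composite cache keys together with the `no_cache_read` option is the sketch below. The key layout and the helper names `make_cache_key` and `cached_apply` are assumptions chosen for illustration; the real key structure is not described here.

```python
import hashlib


def make_cache_key(transform, filter_ids, content, extra=()):
    """Combine everything that can change the result into one key."""
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    return ("filter", transform, tuple(filter_ids), digest, tuple(extra))


def cached_apply(cache, key, compute, no_cache_read=False):
    """Return a cached result unless reads are disabled; always store fresh results."""
    if not no_cache_read and key in cache:
        return cache[key]
    result = compute()
    cache[key] = result
    return result


cache = {}
calls = []


def minify():
    """Hypothetical expensive filter step; records each invocation."""
    calls.append(1)
    return "a{b:c}"


key = make_cache_key("output", ["cssmin"], "a { b: c }")
cached_apply(cache, key, minify)                      # miss: computes and stores
cached_apply(cache, key, minify)                      # hit: no recomputation
cached_apply(cache, key, minify, no_cache_read=True)  # read skipped, result stored again
print(len(calls))  # 2
```

Including the transform type, the filter identities, and a content digest in the key means a change to any of them produces a new entry instead of a stale hit.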

### Debug Logging

Detailed logging system for troubleshooting:

- Uses separate 'webassets.debug' logger controlled by WEBASSETS_DEBUG environment variable
- Logs cache hits/misses, filter execution, and hunk operations
- Includes content hashes in hunk representations for debugging
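
From the application side, the named logger can also be wired up with the standard `logging` module. The sketch below is an assumption-laden illustration: `configure_debug_logging` is a hypothetical helper, and treating any non-empty `WEBASSETS_DEBUG` value as "enabled" is a guess about the convention, not documented behaviour.

```python
import logging
import os


def configure_debug_logging(environ=None):
    """Attach a stream handler to 'webassets.debug' when the variable is set."""
    environ = os.environ if environ is None else environ
    log = logging.getLogger("webassets.debug")
    # Assumption: any non-empty value enables debug output.
    if not environ.get("WEBASSETS_DEBUG"):
        return False
    log.setLevel(logging.DEBUG)
    if not log.handlers:
        log.addHandler(logging.StreamHandler())
    return True


configure_debug_logging()
```

Running a build with `WEBASSETS_DEBUG=1` in the environment would then send the logger's messages to standard error.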

## Usage Examples

### Basic Hunk Operations

```python
from webassets.merge import FileHunk, MemoryHunk, merge

# Create hunks from different sources
file_hunk = FileHunk('style.css')
memory_hunk = MemoryHunk('/* Generated CSS */')

# Merge multiple hunks
combined = merge([file_hunk, memory_hunk], separator='\n\n')

# Save result
combined.save('output.css')
```

### Filter Application

```python
from webassets.merge import FileHunk, FilterTool
from webassets.filter import get_filter

# `env` is assumed to be an already configured webassets Environment
# Setup filter tool with cache
tool = FilterTool(cache=env.cache)

# Apply filters to content
file_hunk = FileHunk('style.css')
cssmin = get_filter('cssmin')
result = tool.apply(file_hunk, [cssmin], 'output')
```

### URL Content with Caching

```python
from webassets.merge import UrlHunk

# `env` is assumed to be an already configured webassets Environment
# Fetch remote content with caching
url_hunk = UrlHunk('https://cdn.example.com/lib.js', env=env)
content = url_hunk.data()  # Cached on subsequent calls
```