# Memory and Caching

Transparent disk-caching of function results using the memoize pattern. Provides automatic cache invalidation, configurable storage backends, and memory-mapped array support for handling large datasets efficiently in scientific computing and machine learning workflows.
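
To make the memoize pattern concrete, here is a minimal standard-library sketch of disk-backed memoization. It is not how joblib implements caching; `disk_memoize`, `square`, and `calls` are hypothetical names used only for illustration, and the key scheme (SHA-256 of the pickled arguments) is an assumption.

```python
import functools
import hashlib
import pickle
import tempfile
from pathlib import Path


def disk_memoize(cache_dir):
    """Hypothetical decorator: cache results as pickle files under cache_dir."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Build a key from the function name and its pickled arguments
            payload = pickle.dumps((func.__qualname__, args, sorted(kwargs.items())))
            key = hashlib.sha256(payload).hexdigest()
            path = cache_dir / f"{key}.pkl"
            if path.exists():
                # Cache hit: load the stored result instead of recomputing
                return pickle.loads(path.read_bytes())
            result = func(*args, **kwargs)
            path.write_bytes(pickle.dumps(result))
            return result

        return wrapper

    return decorator


calls = []  # records every real execution of the function


@disk_memoize(tempfile.mkdtemp())
def square(x):
    calls.append(x)
    return x * x


first = square(4)   # computed and written to disk
second = square(4)  # read back from disk, the function body is not run again
```

After both calls, `calls` contains a single entry, which shows that the second call was served from disk.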

## Capabilities

```python
from typing import Optional
```

### Memory Context Manager

Creates a caching context for functions with configurable storage location, compression, and backend options.

```python { .api }
class Memory(Logger):
    def __init__(self, location=None, backend="local", mmap_mode=None, compress=False, verbose=1, backend_options=None):
        """
        Create a Memory context for caching function results.

        Parameters:
        - location: str or pathlib.Path, cache directory (None for no caching)
        - backend: str, storage backend ("local" by default)
        - mmap_mode: {None, 'r+', 'r', 'w+', 'c'}, memory mapping mode for arrays
        - compress: bool or int, compression level (False, True, or 0-9)
        - verbose: int, verbosity level (0=silent, 1=normal, 2=verbose)
        - backend_options: dict, additional backend-specific options
        """

    def cache(self, func=None, ignore=None, verbose=None, mmap_mode=False, cache_validation_callback=None):
        """
        Decorator to cache function results to disk.

        Parameters:
        - func: callable, function to cache (or None for decorator usage)
        - ignore: list of str, parameter names to ignore in cache key
        - verbose: int, override Memory verbose level
        - mmap_mode: {None, 'r+', 'r', 'w+', 'c'} or False, memory mapping mode
          for this function (False, the default, uses the Memory object's mmap_mode)
        - cache_validation_callback: callable, receives the cached call's metadata
          and returns True if the cached result is still valid

        Returns:
        Decorated function or MemorizedFunc instance
        """

    def clear(self, warn=True):
        """
        Erase complete cache directory.

        Parameters:
        - warn: bool, warn before clearing cache
        """

    def reduce_size(self, bytes_limit=None, items_limit=None, age_limit=None):
        """
        Remove cache elements to fit within specified limits.

        Parameters:
        - bytes_limit: int, maximum cache size in bytes
        - items_limit: int, maximum number of cached items
        - age_limit: datetime.timedelta, maximum age of cached items
        """

    def eval(self, func, *args, **kwargs):
        """
        Evaluate function within Memory context.

        Parameters:
        - func: callable, function to evaluate
        - *args, **kwargs: arguments to pass to function

        Returns:
        Function result (cached if applicable)
        """

    # Properties
    location: Optional[str]  # Cache directory location
    backend: str  # Storage backend type
    verbose: int  # Verbosity level
```

**Usage Example:**

```python
from joblib import Memory
import numpy as np

# Create memory context
mem = Memory(location='./cache', verbose=1)

# Cache expensive computation
@mem.cache
def compute_features(data, n_components=10):
    """Expensive feature computation."""
    # Simulate expensive computation
    result = np.random.random((len(data), n_components))
    return result

# First call computes and caches
data = np.random.random(1000)
features = compute_features(data, n_components=20)

# Second call loads from cache
features = compute_features(data, n_components=20)  # Fast!

# Clear specific function cache
compute_features.clear()

# Clear entire cache
mem.clear()
```
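
`Memory.eval` offers the same caching without decorating the function. The sketch below assumes joblib is installed and uses a temporary directory as the cache location; `slow_add` and `calls` are illustrative names, not part of the library.

```python
import tempfile

from joblib import Memory

# Temporary cache directory so the example leaves no files in the project
mem = Memory(location=tempfile.mkdtemp(), verbose=0)

calls = []  # records every real execution of slow_add


def slow_add(a, b):
    calls.append((a, b))
    return a + b


first = mem.eval(slow_add, 1, 2)   # runs slow_add and stores the result
second = mem.eval(slow_add, 1, 2)  # same arguments, result is loaded from the cache
```

This form is convenient for one-off calls to functions you do not own or do not want to wrap permanently.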

### Cached Result Management

Manages individual cached computation results with access and cleanup capabilities.

```python { .api }
class MemorizedResult:
    def __init__(self, location, call_id, backend="local", mmap_mode=None, verbose=0, timestamp=None, metadata=None):
        """
        Represent a cached computation result.

        Parameters:
        - location: str, cache location path
        - call_id: tuple of (str, str), function identifier and arguments
          identifier that together identify the cached call
        - backend: str, storage backend type
        - mmap_mode: str, memory mapping mode
        - verbose: int, verbosity level
        - timestamp: float, cache creation timestamp
        - metadata: dict, additional cache metadata
        """

    def get(self):
        """
        Read cached value and return it.

        Returns:
        Cached result object
        """

    def clear(self):
        """
        Clear this cached value from storage.
        """

    # Properties
    location: str  # Cache location
    func: str  # Function name
    args_id: str  # Arguments identifier
```
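
A `MemorizedResult` is usually obtained from a cached function rather than constructed by hand. The sketch below assumes joblib is installed and uses the `call_and_shelve` method of a cached function, which returns a reference to the stored result instead of the value itself; `total` is an illustrative function.

```python
import tempfile

from joblib import Memory

mem = Memory(location=tempfile.mkdtemp(), verbose=0)


@mem.cache
def total(values):
    return sum(values)


# call_and_shelve computes (or reuses) the result and returns a reference to it
shelved = total.call_and_shelve([1, 2, 3])

value = shelved.get()  # read the cached value from storage
shelved.clear()        # remove only this entry from the cache
```

Holding the lightweight reference instead of the value can help when results are large and only needed later.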

### Cache Validation

Provides cache validation callbacks for time-based and custom invalidation logic.

```python { .api }
def expires_after(days=0, seconds=0, microseconds=0, milliseconds=0, minutes=0, hours=0, weeks=0):
    """
    Cache validation callback to force recomputation after duration.

    Parameters:
    - days, seconds, microseconds, milliseconds, minutes, hours, weeks: int,
      time duration components

    Returns:
    Validation callback function for use with Memory.cache()
    """
```

**Usage Example:**

```python
from joblib import Memory, expires_after

mem = Memory('./cache')

def expensive_api_call():
    """Placeholder for a slow call; replace with real logic."""
    return {"status": "ok"}

# Cache expires after 1 hour
@mem.cache(cache_validation_callback=expires_after(hours=1))
def fetch_data():
    # This will be recomputed after 1 hour
    return expensive_api_call()

# Custom validation callback: return True to reuse the cached result
def custom_validator(metadata):
    """Reuse cached results only for calls that took longer than 1 second."""
    # metadata describes the cached call (e.g. 'duration', 'time', 'input_args')
    return metadata['duration'] > 1

@mem.cache(cache_validation_callback=custom_validator)
def slow_computation():
    return expensive_api_call()
```
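
The contract of a validation callback can be illustrated without joblib. The sketch below is a hypothetical stand-in for a time-based callback, not the library's code: it assumes the metadata dictionary carries a `'time'` entry holding the Unix timestamp of the cached call, and `make_expiry_validator` is an invented name.

```python
import time
from datetime import timedelta


def make_expiry_validator(**duration):
    """Hypothetical stand-in for a time-based validation callback factory."""
    max_age = timedelta(**duration).total_seconds()

    def is_valid(metadata):
        # Assumption: metadata['time'] is the Unix timestamp of the cached call
        age = time.time() - metadata["time"]
        return age < max_age

    return is_valid


validator = make_expiry_validator(hours=1)
fresh = validator({"time": time.time() - 60})        # one minute old
stale = validator({"time": time.time() - 2 * 3600})  # two hours old
```

Here `fresh` is `True` (the cached value would be reused) and `stale` is `False` (the function would be recomputed).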
191
192
### Storage Backend Registration
193
194
Extends Memory with custom storage backends for cloud storage, databases, or other persistence layers.
195
196
```python { .api }
197
def register_store_backend(backend_name, backend):
198
"""
199
Register a new storage backend for Memory objects.
200
201
Parameters:
202
- backend_name: str, name identifying the backend
203
- backend: class, StoreBackendBase subclass implementation
204
205
Raises:
206
ValueError: If backend_name is not string or backend doesn't inherit StoreBackendBase
207
"""
208
```
209
210
**Usage Example:**
211
212
```python
213
from joblib import Memory, register_store_backend
214
from joblib._store_backends import StoreBackendBase
215
216
class S3StoreBackend(StoreBackendBase):
217
"""Example S3 storage backend."""
218
219
def __init__(self, bucket_name, **kwargs):
220
self.bucket_name = bucket_name
221
# S3 client initialization
222
223
def _open_item(self, f, mode):
224
# S3-specific file opening logic
225
pass
226
227
def _item_exists(self, location):
228
# S3-specific existence check
229
pass
230
231
# ... implement other required methods
232
233
# Register custom backend
234
register_store_backend('s3', S3StoreBackend)
235
236
# Use with Memory
237
mem = Memory(backend='s3', backend_options={'bucket_name': 'my-cache-bucket'})
238
```
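
The `ValueError` cases listed for `register_store_backend` can be checked directly. This small sketch assumes joblib is installed; `NotABackend` is a deliberately invalid, hypothetical class.

```python
from joblib import register_store_backend


class NotABackend:
    """Deliberately does not inherit from StoreBackendBase."""


errors = []
# First pair: name is not a string. Second pair: class is not a StoreBackendBase subclass.
for name, backend in [(123, NotABackend), ("bad", NotABackend)]:
    try:
        register_store_backend(name, backend)
    except ValueError as exc:
        errors.append(type(exc).__name__)
```

Both registrations are rejected, so neither name becomes available as a `backend` value for `Memory`.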

## Advanced Caching Patterns

### Memory Mapping for Large Arrays

```python
from joblib import Memory
import numpy as np

mem = Memory('./cache', mmap_mode='r')

@mem.cache
def create_large_array(size):
    return np.random.random(size)

# Array is memory-mapped when loaded from cache
large_array = create_large_array((10000, 10000))
```

### Ignoring Specific Parameters

```python
# `mem` is a Memory instance and `model` an estimator, both defined elsewhere
@mem.cache(ignore=['verbose', 'debug'])
def process_data(data, model_params, verbose=False, debug=False):
    # 'verbose' and 'debug' don't affect cache key
    return model.fit(data, **model_params)
```
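
The effect of `ignore` on the cache key can be sketched with the standard library alone. This is an illustration of the idea, not joblib's hashing scheme; `cache_key` and `process` are hypothetical names.

```python
import hashlib
import inspect
import pickle


def cache_key(func, ignore, *args, **kwargs):
    """Hypothetical helper: hash the call's arguments, skipping ignored names."""
    bound = inspect.signature(func).bind(*args, **kwargs)
    bound.apply_defaults()
    # Drop ignored parameters before building the key
    kept = {name: value for name, value in bound.arguments.items() if name not in ignore}
    payload = pickle.dumps((func.__qualname__, sorted(kept.items())))
    return hashlib.sha256(payload).hexdigest()


def process(data, scale, verbose=False):
    return [x * scale for x in data]


quiet = cache_key(process, ["verbose"], [1, 2], 3, verbose=False)
noisy = cache_key(process, ["verbose"], [1, 2], 3, verbose=True)
other = cache_key(process, ["verbose"], [1, 2], 4, verbose=True)
```

`quiet` and `noisy` are identical because `verbose` is excluded from the key, while `other` differs because `scale` changed.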

### Cache Size Management

```python
from datetime import timedelta

# `mem` is the Memory instance created earlier
# Limit cache to 1 GiB and 100 items
mem.reduce_size(bytes_limit=1024**3, items_limit=100)

# Remove items older than 7 days
mem.reduce_size(age_limit=timedelta(days=7))
```
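
One way to picture size-based cleanup is as an eviction policy over cached items. The sketch below assumes a least-recently-accessed-first policy, which may differ in detail from what `reduce_size` does internally; `CacheItem` and `select_for_removal` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class CacheItem:
    path: str
    size: int           # bytes on disk
    last_access: float  # Unix timestamp of the last read


def select_for_removal(items, bytes_limit):
    """Hypothetical policy: drop least recently accessed items until under the limit."""
    total = sum(item.size for item in items)
    to_remove = []
    for item in sorted(items, key=lambda it: it.last_access):
        if total <= bytes_limit:
            break
        to_remove.append(item)
        total -= item.size
    return to_remove


items = [
    CacheItem("a", size=400, last_access=30.0),
    CacheItem("b", size=300, last_access=10.0),
    CacheItem("c", size=500, last_access=20.0),
]
removed = [item.path for item in select_for_removal(items, bytes_limit=500)]
```

With a 500-byte limit, the two oldest entries (`b`, then `c`) are selected and only `a` remains.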