# Temporary Operations

dill.temp provides utilities for serializing objects to temporary files, capturing output streams, and managing in-memory IO buffers. These are useful for testing, development workflows, and handling transient data without juggling filenames by hand.

## File Operations

### Temporary File Serialization

```python { .api }
def dump(object, **kwds):
    """
    Dump object to a NamedTemporaryFile using dill.dump.

    Serializes an object to a temporary file and returns the open file
    handle, useful for quick testing and temporary storage without
    managing filenames. The file's path is available as the handle's
    .name attribute.

    Parameters:
    - object: object to serialize to temporary file
    - **kwds: optional keyword arguments, including:
      - suffix: str, file name suffix (default: no suffix)
      - prefix: str, file name prefix (default: system default)
      - other arguments passed to NamedTemporaryFile

    Returns:
    - file handle: NamedTemporaryFile handle containing the serialized object

    Raises:
    - PicklingError: when the object cannot be serialized
    - IOError: when temporary file operations fail
    """

def load(file, **kwds):
    """
    Load an object that was stored with dill.temp.dump.

    Deserializes an object from a file. Typically used with temporary
    files created by temp.dump, but it works with any dill-compatible
    file handle or path.

    Parameters:
    - file: file handle or str, handle of or path to the file containing
      the serialized object
    - **kwds: optional keyword arguments, including:
      - mode: str, file open mode ('r' or 'rb', default: 'rb')
      - other arguments passed to open() and dill.load

    Returns:
    - object: deserialized object

    Raises:
    - UnpicklingError: when the object cannot be deserialized
    - IOError: when file operations fail
    """
```
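A minimal round trip with these two functions (assuming dill is installed; `dump` hands back the open `NamedTemporaryFile`, so the path lives on its `.name` attribute):

```python
import dill.temp as temp

# Dump a list to a NamedTemporaryFile; the returned handle keeps the file alive.
handle = temp.dump([1, 2, 3, 4, 5], suffix='.pkl')
print(handle.name)  # path of the temporary file

# load() accepts the handle (or a path string) and returns the object.
restored = temp.load(handle)
assert restored == [1, 2, 3, 4, 5]
```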
### Source Code File Operations

```python { .api }
def load_source(file, **kwds):
    """
    Load an object from a source file written by dump_source.

    Executes the Python source code in the file and returns the stored
    object (the one named by the alias used at dump time).

    Parameters:
    - file: file handle or str, handle of or path to the Python source file
    - **kwds: optional keyword arguments, including:
      - alias: str, name of the stored object to return

    Returns:
    - object: the object reconstructed by executing the source

    Raises:
    - SyntaxError: when the source code is invalid
    - IOError: when the file cannot be read
    """

def dump_source(object, **kwds):
    """
    Dump an object's source code to a temporary file.

    Extracts source code from an object and writes it to a
    NamedTemporaryFile, useful for code analysis and temporary source
    storage. Load the result with import or with dill.temp.load_source.

    Parameters:
    - object: object to extract source code from
    - **kwds: optional keyword arguments, including:
      - alias: str, name to bind the object to in the generated source
      - other arguments passed to NamedTemporaryFile

    Returns:
    - file handle: NamedTemporaryFile handle containing the source code;
      its path is available as the handle's .name attribute

    Raises:
    - OSError: when source code cannot be extracted
    - IOError: when file operations fail
    """
```
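A quick sketch of the source round trip, following the pattern from dill's own docstrings (the `alias` keyword names the object inside the generated file; source extraction requires that the object's source be retrievable, e.g. defined in a real file):

```python
import dill.temp as temp

f = lambda x: x**2

# Write f's source to a temporary .py file; returns the file handle.
pyfile = temp.dump_source(f, alias='_f')
print(open(pyfile.name).read())  # the generated source, ending with the alias binding

# load_source() executes the file and returns the aliased object.
g = temp.load_source(pyfile)
assert g(4) == 16
```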
## IO Buffer Operations

### Buffer-based Serialization

```python { .api }
def loadIO(buffer, **kwds):
    """
    Load an object from an IO buffer.

    Deserializes an object from a bytes buffer or file-like object,
    providing memory-based deserialization for testing and processing.

    Parameters:
    - buffer: bytes or file-like object containing serialized data
    - **kwds: keyword arguments passed to dill.load

    Returns:
    - object: deserialized object

    Raises:
    - UnpicklingError: when deserialization fails
    - TypeError: when the buffer type is not supported
    """

def dumpIO(object, **kwds):
    """
    Dump an object to an IO buffer.

    Serializes an object to an in-memory buffer, returning the buffer
    for further processing or storage. Load it back with
    dill.temp.loadIO.

    Parameters:
    - object: object to serialize
    - **kwds: keyword arguments passed to dill.dump

    Returns:
    - io.BytesIO: buffer containing the serialized object; the raw
      bytes are available via the buffer's getvalue() method

    Raises:
    - PicklingError: when serialization fails
    """

def loadIO_source(buffer, **kwds):
    """
    Load an object from a source code buffer written by dumpIO_source.

    Executes the Python source code in the buffer and returns the
    stored object, useful for dynamic code execution and testing.

    Parameters:
    - buffer: buffer (or str or bytes) containing Python source code
    - **kwds: optional keyword arguments, including:
      - alias: str, name of the stored object to return

    Returns:
    - object: the object reconstructed by executing the source

    Raises:
    - SyntaxError: when the source code is invalid
    """

def dumpIO_source(object, **kwds):
    """
    Dump an object's source code to an IO buffer.

    Extracts source code from an object and returns it in an in-memory
    buffer, useful for in-memory source code processing. Load it back
    with dill.temp.loadIO_source.

    Parameters:
    - object: object to extract source code from
    - **kwds: optional keyword arguments, including:
      - alias: str, name to bind the object to in the generated source

    Returns:
    - io.BytesIO: buffer containing the source code

    Raises:
    - OSError: when source code cannot be extracted
    """
```
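A minimal in-memory round trip (note that `dumpIO` returns a `BytesIO` buffer, not raw bytes, so sizes come from `getvalue()`):

```python
import io

import dill.temp as temp

# Serialize to an in-memory buffer.
buf = temp.dumpIO({'a': 1, 'b': [2, 3]})
assert isinstance(buf, io.BytesIO)
print(len(buf.getvalue()))  # size of the pickled payload in bytes

# Deserialize straight from the buffer.
restored = temp.loadIO(buf)
assert restored == {'a': 1, 'b': [2, 3]}
```

`dumpIO_source` and `loadIO_source` follow the same buffer-in, buffer-out pattern for source code instead of pickles.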
## Stream Capture

### Output Capture Utilities

```python { .api }
def capture(stream='stdout'):
    """
    Capture the stdout or stderr stream.

    Context manager that temporarily replaces stdout or stderr and
    captures everything written to it, useful for testing, logging,
    and output analysis.

    Parameters:
    - stream: str, stream name ('stdout' or 'stderr')

    Returns:
    - context manager yielding the replacement stream; call its
      getvalue() method to read the captured text

    Usage:
    with dill.temp.capture('stdout') as output:
        print("Hello")
    captured_text = output.getvalue()
    """
```
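For example (the yielded object is the replacement stream, and it retains its contents after the block exits, so `getvalue()` can be called afterwards):

```python
import dill.temp as temp

with temp.capture('stdout') as out:
    print("captured line")

# The captured stream keeps its contents after the block exits.
text = out.getvalue()
assert text == "captured line\n"
```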
## Usage Examples

### Basic Temporary File Operations

```python
import dill.temp as temp

# Create some test data
def test_function(x, y):
    return x * y + 42

class TestClass:
    def __init__(self, value):
        self.value = value

    def get_doubled(self):
        return self.value * 2

test_data = {
    'function': test_function,
    'class': TestClass,
    'instance': TestClass(10),
    'list': [1, 2, 3, 4, 5]
}

# Dump to a temporary file; dump() returns the NamedTemporaryFile handle
temp_file = temp.dump(test_data)
print(f"Data saved to temporary file: {temp_file.name}")

# Load from the temporary file (the handle or its .name both work)
restored_data = temp.load(temp_file)
print(f"Restored data keys: {list(restored_data.keys())}")

# Test restored functionality
restored_function = restored_data['function']
result = restored_function(5, 3)
print(f"Function result: {result}")  # 5 * 3 + 42 = 57

restored_instance = restored_data['instance']
doubled = restored_instance.get_doubled()
print(f"Instance method result: {doubled}")  # 10 * 2 = 20
```
### Source Code Operations

```python
import dill.temp as temp

# Define a function to work with
def example_algorithm(data, threshold=0.5):
    """Example algorithm for demonstration."""
    processed = []
    for item in data:
        if isinstance(item, (int, float)) and item > threshold:
            processed.append(item * 2)
        else:
            processed.append(item)
    return processed

# Dump source code to a temporary file; returns the file handle
source_file = temp.dump_source(example_algorithm)
print(f"Source code saved to: {source_file.name}")

# Load and examine the source code
with open(source_file.name, 'r') as f:
    source_content = f.read()
print("Source code:")
print(source_content)

# Load the source back as an executable object
loaded_func = temp.load_source(source_file)
print(f"Loaded function: {loaded_func}")

# Test the loaded function
test_data = [0.2, 0.8, 1.5, 0.1, 2.0]
result = loaded_func(test_data, threshold=0.6)
print(f"Algorithm result: {result}")  # values above 0.6 are doubled
```
### IO Buffer Operations

```python
import dill.temp as temp

# Complex object for testing (TestClass is defined in the earlier example)
complex_data = {
    'functions': [lambda x: x**2, lambda x: x**3],
    'nested': {'level1': {'level2': [1, 2, 3]}},
    'instance': TestClass(42)
}

# Serialize to an in-memory buffer; dumpIO() returns an io.BytesIO
buffer_data = temp.dumpIO(complex_data)
print(f"Serialized to buffer: {len(buffer_data.getvalue())} bytes")

# Deserialize from the buffer
restored_from_buffer = temp.loadIO(buffer_data)
print("Restored from buffer successfully")

# Test functionality
square_func = restored_from_buffer['functions'][0]
print(f"Square function: {square_func(5)}")  # 5**2 = 25

# Source code buffer operations (example_algorithm from the earlier example)
source_buffer = temp.dumpIO_source(example_algorithm)
print(f"Source code buffer length: {len(source_buffer.getvalue())} bytes")

# Load the source back from the buffer
func_from_buffer = temp.loadIO_source(source_buffer)
result = func_from_buffer([1, 2, 3], threshold=1.5)
print(f"Function from source buffer: {result}")  # [1, 4, 6]
```
### Stream Capture

```python
import io
import sys
from contextlib import redirect_stdout, redirect_stderr

import dill.temp as temp

def noisy_function():
    """Function that produces output."""
    print("Starting processing...")
    print("Processing item 1")
    print("Processing item 2")
    print("Processing complete!")
    return "Result"

def error_function():
    """Function that writes to both streams."""
    print("This goes to stdout")
    print("This goes to stderr", file=sys.stderr)
    return "Done"

# Capture stdout
with temp.capture('stdout') as stdout_capture:
    result = noisy_function()

stdout_output = stdout_capture.getvalue()
print(f"Captured stdout ({len(stdout_output)} chars):")
print(repr(stdout_output))

# Capture stderr (stdout output still reaches the console)
with temp.capture('stderr') as stderr_capture:
    result = error_function()

stderr_output = stderr_capture.getvalue()
print(f"Captured stderr: {repr(stderr_output)}")

# Capture both streams at once with the standard library
def capture_both_streams(func):
    """Capture both stdout and stderr while calling func."""
    stdout_buffer = io.StringIO()
    stderr_buffer = io.StringIO()

    with redirect_stdout(stdout_buffer), redirect_stderr(stderr_buffer):
        result = func()

    return {
        'result': result,
        'stdout': stdout_buffer.getvalue(),
        'stderr': stderr_buffer.getvalue()
    }

# Usage
captured = capture_both_streams(error_function)
print(f"Function result: {captured['result']}")
print(f"Stdout: {repr(captured['stdout'])}")
print(f"Stderr: {repr(captured['stderr'])}")
```
## Advanced Use Cases

### Testing Framework Integration

```python
import os
import unittest

import dill.temp as temp

class TempFileTestCase(unittest.TestCase):
    """Test case with temporary file management."""

    def setUp(self):
        self.temp_files = []

    def tearDown(self):
        # Clean up temporary files (paths recorded in create_temp_object)
        for path in self.temp_files:
            try:
                if os.path.exists(path):
                    os.remove(path)
            except OSError:
                pass

    def create_temp_object(self, obj):
        """Create a temporary file for an object and track its path."""
        temp_file = temp.dump(obj)
        self.temp_files.append(temp_file.name)
        return temp_file

    def test_function_serialization(self):
        """Test function serialization via temporary files."""
        def test_func(x):
            return x * 2

        # Serialize to a temp file
        temp_file = self.create_temp_object(test_func)
        self.assertTrue(os.path.exists(temp_file.name))

        # Load and test
        loaded_func = temp.load(temp_file)
        self.assertEqual(loaded_func(5), 10)

    def test_complex_object_roundtrip(self):
        """Test complex object serialization roundtrip."""
        complex_obj = {
            'data': [1, 2, 3, 4, 5],
            'func': lambda x: sum(x),
            'nested': {'key': 'value'}
        }

        temp_file = self.create_temp_object(complex_obj)
        loaded_obj = temp.load(temp_file)

        # Test loaded functionality
        self.assertEqual(loaded_obj['data'], [1, 2, 3, 4, 5])
        self.assertEqual(loaded_obj['func']([1, 2, 3]), 6)
        self.assertEqual(loaded_obj['nested']['key'], 'value')

# Run tests
if __name__ == '__main__':
    unittest.main()
```
### Development Workflow Tools

```python
import datetime
import json

import dill.temp as temp

class DevelopmentSession:
    """Manage a development session with temporary serialization."""

    def __init__(self, session_name):
        self.session_name = session_name
        self.snapshots = []
        self.current_objects = {}

    def add_object(self, name, obj):
        """Add an object to the current session."""
        self.current_objects[name] = obj
        print(f"Added {name} to session")

    def take_snapshot(self, description=""):
        """Take a snapshot of the current session state."""
        timestamp = datetime.datetime.now().isoformat()

        # Serialize current objects to a temporary file; keeping the
        # handle in the snapshot also keeps the file alive
        temp_file = temp.dump(self.current_objects)

        snapshot = {
            'timestamp': timestamp,
            'description': description,
            'temp_file': temp_file,
            'object_count': len(self.current_objects)
        }

        self.snapshots.append(snapshot)
        print(f"Snapshot taken: {len(self.snapshots)} total snapshots")
        return len(self.snapshots) - 1

    def restore_snapshot(self, snapshot_index):
        """Restore the session from a snapshot."""
        if 0 <= snapshot_index < len(self.snapshots):
            snapshot = self.snapshots[snapshot_index]

            # Load objects from the temporary file
            self.current_objects = temp.load(snapshot['temp_file'])

            print(f"Restored snapshot from {snapshot['timestamp']}")
            print(f"Objects restored: {list(self.current_objects.keys())}")
        else:
            print(f"Invalid snapshot index: {snapshot_index}")

    def list_snapshots(self):
        """List all snapshots."""
        for i, snapshot in enumerate(self.snapshots):
            print(f"{i}: {snapshot['timestamp']} - {snapshot['description']} ({snapshot['object_count']} objects)")

    def get_object(self, name):
        """Get an object from the current session."""
        return self.current_objects.get(name)

    def export_session_info(self, filename):
        """Export session metadata to JSON."""
        session_info = {
            'session_name': self.session_name,
            'snapshots': [
                {
                    'timestamp': s['timestamp'],
                    'description': s['description'],
                    'object_count': s['object_count']
                }
                for s in self.snapshots
            ],
            'current_objects': list(self.current_objects.keys())
        }

        with open(filename, 'w') as f:
            json.dump(session_info, f, indent=2)

        print(f"Session info exported to {filename}")

# Usage example
session = DevelopmentSession("algorithm_development")

# Add some objects to work with
def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n-1) + fibonacci(n-2)

class DataProcessor:
    def __init__(self, multiplier=1):
        self.multiplier = multiplier

    def process(self, data):
        return [x * self.multiplier for x in data]

session.add_object('fib_func', fibonacci)
session.add_object('processor', DataProcessor(2))
session.add_object('test_data', [1, 2, 3, 4, 5])

# Take an initial snapshot
session.take_snapshot("Initial development state")

# Modify objects
session.add_object('processor', DataProcessor(3))  # change multiplier
session.add_object('results', session.get_object('processor').process(session.get_object('test_data')))

# Take another snapshot
session.take_snapshot("After modifications")

# Show snapshots
session.list_snapshots()

# Restore the previous state
session.restore_snapshot(0)

# Export session info
session.export_session_info('dev_session.json')
```
### Performance Benchmarking

```python
import os
import time

import dill.temp as temp

class SerializationBenchmark:
    """Benchmark serialization performance with temporary operations."""

    def __init__(self):
        self.results = {}

    def benchmark_temp_operations(self, test_objects, iterations=10):
        """Benchmark temporary file operations."""
        for name, obj in test_objects.items():
            print(f"Benchmarking {name}...")

            dump_times = []
            load_times = []
            file_sizes = []

            for i in range(iterations):
                # Time the dump operation
                start_time = time.time()
                temp_file = temp.dump(obj)
                dump_times.append(time.time() - start_time)

                # Get the file size (dump() flushes before returning)
                file_sizes.append(os.path.getsize(temp_file.name))

                # Time the load operation
                start_time = time.time()
                temp.load(temp_file)
                load_times.append(time.time() - start_time)

                # Clean up; closing a NamedTemporaryFile removes it by default
                temp_file.close()

            self.results[name] = {
                'avg_dump_time': sum(dump_times) / len(dump_times),
                'avg_load_time': sum(load_times) / len(load_times),
                'avg_file_size': sum(file_sizes) / len(file_sizes),
                'min_dump_time': min(dump_times),
                'max_dump_time': max(dump_times),
                'min_load_time': min(load_times),
                'max_load_time': max(load_times)
            }

    def benchmark_io_operations(self, test_objects, iterations=10):
        """Benchmark IO buffer operations."""
        for name, obj in test_objects.items():
            print(f"Benchmarking IO operations for {name}...")

            dump_times = []
            load_times = []
            buffer_sizes = []

            for i in range(iterations):
                # Time the dumpIO operation
                start_time = time.time()
                buffer_data = temp.dumpIO(obj)
                dump_times.append(time.time() - start_time)

                buffer_sizes.append(len(buffer_data.getvalue()))

                # Time the loadIO operation
                start_time = time.time()
                temp.loadIO(buffer_data)
                load_times.append(time.time() - start_time)

            io_results = {
                'avg_dump_time': sum(dump_times) / len(dump_times),
                'avg_load_time': sum(load_times) / len(load_times),
                'avg_buffer_size': sum(buffer_sizes) / len(buffer_sizes)
            }

            if name in self.results:
                self.results[name]['io'] = io_results
            else:
                self.results[name] = {'io': io_results}

    def print_results(self):
        """Print benchmark results."""
        print("\nSerialization Benchmark Results")
        print("=" * 60)

        for name, results in self.results.items():
            print(f"\n{name}:")
            print("-" * 30)

            if 'avg_dump_time' in results:  # file operations
                print("File Operations:")
                print(f"  Avg dump time: {results['avg_dump_time']:.4f}s")
                print(f"  Avg load time: {results['avg_load_time']:.4f}s")
                print(f"  Avg file size: {results['avg_file_size']:,.0f} bytes")

            if 'io' in results:  # IO operations
                io_stats = results['io']
                print("IO Buffer Operations:")
                print(f"  Avg dump time: {io_stats['avg_dump_time']:.4f}s")
                print(f"  Avg load time: {io_stats['avg_load_time']:.4f}s")
                print(f"  Avg buffer size: {io_stats['avg_buffer_size']:,.0f} bytes")

# Run the benchmark
benchmark = SerializationBenchmark()

# Create test objects of varying complexity
# (TestClass is defined in the earlier usage example)
test_objects = {
    'simple_list': list(range(1000)),
    'function': lambda x: x**2 + x + 1,
    'class_instance': TestClass(100),
    'nested_dict': {'level1': {'level2': {'level3': list(range(100))}}},
    'mixed_structure': {
        'data': list(range(500)),
        'funcs': [lambda x: x+1, lambda x: x*2],
        'objects': [TestClass(i) for i in range(10)]
    }
}

benchmark.benchmark_temp_operations(test_objects, iterations=5)
benchmark.benchmark_io_operations(test_objects, iterations=5)
benchmark.print_results()
```
## Best Practices

### Temporary File Management

1. **Cleanup**: Always clean up temporary files when done; closing the returned handle removes the file by default
2. **Path Management**: Store temp file paths (the handle's `.name`) for later cleanup
3. **Error Handling**: Handle file operation errors gracefully
4. **Security**: Be aware of temporary file permissions and location
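One way to make point 1 automatic is a small context manager; this is a sketch, and the `temp_dumped` helper name is ours, not part of dill:

```python
import os
from contextlib import contextmanager

import dill.temp as temp

@contextmanager
def temp_dumped(obj, **kwds):
    """Yield a NamedTemporaryFile holding the pickled obj; remove it on exit."""
    handle = temp.dump(obj, **kwds)
    try:
        yield handle
    finally:
        handle.close()  # NamedTemporaryFile deletes itself on close by default
        if os.path.exists(handle.name):  # covers delete=False being passed through
            os.remove(handle.name)

with temp_dumped([1, 2, 3]) as f:
    assert temp.load(f) == [1, 2, 3]
    path = f.name

assert not os.path.exists(path)  # cleaned up on exit
```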
### Performance Optimization

1. **Buffer vs File**: Use IO buffers for small objects and temporary files for large ones
2. **Memory Usage**: Monitor memory usage during large in-memory serializations
3. **Disk Space**: Monitor disk space usage when creating many temporary files
4. **Cleanup Frequency**: Clean up temporary files regularly in long-running processes