# Configuration

Parsl's configuration system provides comprehensive control over workflow execution, resource management, monitoring, and checkpointing through the `Config` class.

## Capabilities

### Configuration Class

The main configuration class that binds together executors, monitoring, checkpointing, and execution policies.

```python { .api }
class Config:
    def __init__(self, executors=None, app_cache=True, checkpoint_files=None,
                 checkpoint_mode=None, checkpoint_period=None,
                 dependency_resolver=None, exit_mode='cleanup',
                 garbage_collect=True, internal_tasks_max_threads=10,
                 retries=0, retry_handler=None, run_dir='runinfo',
                 std_autopath=None, strategy='simple', strategy_period=5,
                 max_idletime=120.0, monitoring=None, usage_tracking=0,
                 project_name=None, initialize_logging=True):
        """
        Parsl configuration specification.

        Parameters:
        - executors: List of ParslExecutor instances (default: [ThreadPoolExecutor()])
        - app_cache: Enable app result caching (default: True)
        - checkpoint_files: List of checkpoint file paths to load
        - checkpoint_mode: 'dfk_exit', 'task_exit', 'periodic', 'manual', or None
        - checkpoint_period: Time interval for periodic checkpointing (HH:MM:SS)
        - dependency_resolver: Custom dependency resolver plugin
        - exit_mode: Context manager exit behavior ('cleanup', 'skip', 'wait')
        - garbage_collect: Enable task garbage collection (default: True)
        - internal_tasks_max_threads: Max threads for internal operations (default: 10)
        - retries: Default retry count for failed tasks (default: 0)
        - retry_handler: Custom retry handler function
        - run_dir: Directory for Parsl runtime information (default: 'runinfo')
        - std_autopath: Standard output auto-path function
        - strategy: Block scaling strategy ('simple', 'htex_auto_scale', 'none')
        - strategy_period: Strategy evaluation period in seconds (default: 5)
        - max_idletime: Max executor idle time in seconds before the strategy
          may scale in unused blocks (default: 120.0)
        - monitoring: MonitoringHub instance for workflow monitoring
        - usage_tracking: Usage tracking level 0-3 (default: 0)
        - project_name: Project name for usage tracking identification
        - initialize_logging: Set up logging automatically (default: True)
        """

    @property
    def executors(self):
        """Read-only property returning tuple of configured executors."""

    def validate_usage_tracking(self, level):
        """Validate usage tracking level is between 0 and 3."""

    def get_usage_information(self):
        """Get configuration usage information for tracking."""
```

**Basic Configuration Example:**

```python
from parsl.config import Config
from parsl.executors import ThreadPoolExecutor, HighThroughputExecutor
from parsl.providers import LocalProvider

# Simple local configuration
config = Config(
    executors=[
        ThreadPoolExecutor(
            max_threads=4,
            label='local_threads'
        )
    ]
)

# More complex configuration with multiple executors
config = Config(
    executors=[
        ThreadPoolExecutor(max_threads=2, label='light_tasks'),
        HighThroughputExecutor(
            label='heavy_compute',
            max_workers=8,
            provider=LocalProvider(
                init_blocks=1,
                max_blocks=2,
            )
        )
    ],
    app_cache=True,
    retries=2,
    checkpoint_mode='task_exit'
)
```
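
Because every `Config` parameter is a plain keyword argument, deployment-specific values can be gathered in a dictionary before the object is built. The following is a minimal sketch of that idea using only the standard library. The helper `config_kwargs_from_env` and the `PARSL_*` variable names are hypothetical and are not part of Parsl.

```python
import os

# Checkpoint modes listed in the Config docstring above.
VALID_CHECKPOINT_MODES = {'dfk_exit', 'task_exit', 'periodic', 'manual'}


def config_kwargs_from_env(env=None):
    """Build keyword arguments for Config from environment variables.

    The PARSL_* variable names are illustrative assumptions, not a Parsl
    convention.
    """
    env = os.environ if env is None else env
    kwargs = {
        'retries': int(env.get('PARSL_RETRIES', '0')),
        'run_dir': env.get('PARSL_RUN_DIR', 'runinfo'),
    }
    mode = env.get('PARSL_CHECKPOINT_MODE')
    if mode is not None:
        if mode not in VALID_CHECKPOINT_MODES:
            raise ValueError(f"Unknown checkpoint mode: {mode!r}")
        kwargs['checkpoint_mode'] = mode
    return kwargs


# A plain dict stands in for os.environ so the example is reproducible.
kwargs = config_kwargs_from_env({'PARSL_RETRIES': '2',
                                 'PARSL_CHECKPOINT_MODE': 'task_exit'})
print(kwargs)
```

The resulting dictionary could then be unpacked into the constructor alongside an executor list, for example `Config(executors=[ThreadPoolExecutor()], **kwargs)`.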

### Checkpointing Configuration

Configure automatic checkpointing to enable workflow restart and recovery.

```python { .api }
# Checkpoint modes:
# - 'dfk_exit': Checkpoint when DataFlowKernel exits
# - 'task_exit': Checkpoint after each task completion
# - 'periodic': Checkpoint at regular intervals
# - 'manual': Only checkpoint when explicitly called
# - None: Disable checkpointing
```

**Checkpointing Example:**

```python
from parsl.config import Config
from parsl.executors import ThreadPoolExecutor
from parsl.utils import get_all_checkpoints

# Configuration with checkpointing
config = Config(
    executors=[ThreadPoolExecutor(max_threads=4)],
    checkpoint_mode='task_exit',  # Checkpoint after each task
    checkpoint_files=get_all_checkpoints('checkpoints/'),  # Load existing
    run_dir='workflow_run_001'
)

# Periodic checkpointing
config = Config(
    executors=[ThreadPoolExecutor(max_threads=4)],
    checkpoint_mode='periodic',
    checkpoint_period='00:10:00'  # Every 10 minutes
)
```
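
Since `checkpoint_period` is given as an `HH:MM:SS` string, it can be convenient to derive it from a `timedelta` rather than writing it by hand. The helper below is a small sketch; `format_checkpoint_period` is a hypothetical name and not a Parsl function.

```python
from datetime import timedelta


def format_checkpoint_period(period):
    """Format a positive timedelta as an HH:MM:SS string."""
    total_seconds = int(period.total_seconds())
    if total_seconds <= 0:
        raise ValueError("checkpoint period must be positive")
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


period = format_checkpoint_period(timedelta(minutes=10))
print(period)
```

The returned string can be passed directly as `checkpoint_period=period` together with `checkpoint_mode='periodic'`.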

### Monitoring Configuration

Configure workflow monitoring for performance tracking and resource usage analysis.

```python { .api }
from parsl.monitoring import MonitoringHub

monitoring_config = MonitoringHub(
    hub_address='localhost',
    hub_port=55055,
    monitoring_debug=False,
    resource_monitoring_interval=30,  # seconds
    logging_endpoint=None,
    logdir='monitoring_logs'
)
```

**Monitoring Example:**

```python
from parsl.config import Config
from parsl.executors import HighThroughputExecutor
from parsl.monitoring import MonitoringHub

config = Config(
    executors=[HighThroughputExecutor(max_workers=4)],
    monitoring=MonitoringHub(
        hub_address='localhost',
        hub_port=55055,
        resource_monitoring_interval=10,
        logdir='parsl_monitoring'
    )
)
```

### Usage Tracking Configuration

Control Parsl's optional anonymous usage tracking for development insights.

```python { .api }
from parsl.usage_tracking.levels import DISABLED, LEVEL_1, LEVEL_2, LEVEL_3

# Usage tracking levels:
# - DISABLED: No tracking (default, usage_tracking=0)
# - LEVEL_1: Basic usage statistics
# - LEVEL_2: Configuration and executor info
# - LEVEL_3: Detailed performance metrics
```
173
174
**Usage Tracking Example:**
175
176
```python
from parsl.config import Config
from parsl.executors import ThreadPoolExecutor
from parsl.usage_tracking.levels import DISABLED

config = Config(
    executors=[ThreadPoolExecutor(max_threads=4)],
    usage_tracking=DISABLED  # Disable usage tracking
)
```
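
When the tracking level comes from user input or a settings file, it can be checked before it reaches `Config`. The sketch below is written in the spirit of `validate_usage_tracking` and assumes the level constants map to the integers 0 through 3. `check_usage_level` is a hypothetical standalone helper, not Parsl's implementation.

```python
def check_usage_level(level):
    """Return level if it is an int from 0 to 3.

    Raises TypeError for non-integers and ValueError for out-of-range values.
    """
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(level, bool) or not isinstance(level, int):
        raise TypeError(
            f"usage tracking level must be an int, got {type(level).__name__}")
    if not 0 <= level <= 3:
        raise ValueError(
            f"usage tracking level must be between 0 and 3, got {level}")
    return level


level = check_usage_level(2)
print(level)
```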

### Advanced Configuration Options

Advanced options for specialized workflow requirements and performance tuning.

```python { .api }
from parsl.config import Config

# Retry and failure handling
def custom_retry_handler(exception, task_record):
    """Custom logic for determining retry behavior.

    The return value is the cost of this failure, counted against `retries`.
    """
    return 1  # each failure uses one retry

config = Config(
    executors=[...],
    retries=3,
    retry_handler=custom_retry_handler,
    max_idletime=300.0,  # 5 minutes idle before unused blocks may be scaled in
    garbage_collect=True,
    internal_tasks_max_threads=20
)
```
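
A retry handler can distinguish failures that are worth retrying from those that are not. The sketch below assumes the handler's return value is a cost accumulated per task and that a task is retried while the accumulated cost does not exceed `retries`. The exception groupings, the cost values, and the `attempts_allowed` simulation are illustrative assumptions rather than Parsl code.

```python
def cost_based_retry_handler(exception, task_record):
    """Return how much of the retry budget this failure should consume."""
    if isinstance(exception, (ValueError, TypeError)):
        # Likely a problem with the inputs; retrying will not help, so
        # charge a cost large enough to exhaust any reasonable budget.
        return 1000.0
    if isinstance(exception, (TimeoutError, ConnectionError)):
        # Transient failures are charged less, allowing more attempts.
        return 0.5
    return 1.0


def attempts_allowed(handler, exception, retries):
    """Simulate how many attempts a task failing the same way would get."""
    attempts, spent = 0, 0.0
    while True:
        attempts += 1
        spent += handler(exception, {})  # empty dict stands in for a task record
        if spent > retries:
            return attempts


print(attempts_allowed(cost_based_retry_handler, TimeoutError(), retries=3))
print(attempts_allowed(cost_based_retry_handler, ValueError(), retries=3))
```

Under these assumptions, a handler like this would be passed as `retry_handler=cost_based_retry_handler`, with `retries` acting as the total budget.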

### Exit Mode Configuration

Control behavior when using Parsl as a context manager with `with parsl.load(config):`.

```python { .api }
# Exit modes:
# - 'cleanup': Cleanup DFK on exit without waiting
# - 'skip': Skip all shutdown behavior
# - 'wait': Wait for tasks when exiting normally, exit immediately on exception
```

**Context Manager Example:**

```python
import parsl
from parsl import python_app
from parsl.config import Config
from parsl.executors import ThreadPoolExecutor

@python_app
def my_app(x):
    return x * 2  # placeholder app body for illustration

config = Config(
    executors=[ThreadPoolExecutor(max_threads=4)],
    exit_mode='wait'  # Wait for completion on normal exit
)

with parsl.load(config):
    # Submit tasks
    futures = [my_app(i) for i in range(10)]
    # Tasks will complete before exiting context
```
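
The three exit modes can be summarized as a small decision table. The function below is only a sketch of the behavior described in the comments above. The step names are hypothetical labels, and running a cleanup after waiting (or after an exception in `'wait'` mode) is an assumption, not something the descriptions state.

```python
def exit_actions(exit_mode, exception_raised):
    """Return the shutdown steps implied by the exit mode descriptions."""
    if exit_mode == 'cleanup':
        # Clean up immediately, without waiting for outstanding tasks.
        return ['cleanup']
    if exit_mode == 'skip':
        # Leave the DataFlowKernel running; the caller handles shutdown.
        return []
    if exit_mode == 'wait':
        # Wait only on a normal exit; on an exception, do not wait.
        if exception_raised:
            return ['cleanup']
        return ['wait_for_tasks', 'cleanup']
    raise ValueError(f"Unknown exit_mode: {exit_mode!r}")


for mode in ('cleanup', 'skip', 'wait'):
    print(mode, exit_actions(mode, exception_raised=False))
```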

## Configuration Loading

Load configuration and manage DataFlowKernel lifecycle:

```python { .api }
import parsl

# Load configuration
parsl.load(config)

# Check current configuration state
current_dfk = parsl.dfk()  # Get current DataFlowKernel

# Wait for all tasks to complete
parsl.wait_for_current_tasks()

# Shut down the DataFlowKernel, then clear it so another config can be loaded
parsl.dfk().cleanup()
parsl.clear()
```

## Pre-built Configuration Templates

Parsl provides pre-configured templates for common computing environments:

```python
import parsl

# Local configurations
from parsl.configs.htex_local import config as htex_local
from parsl.configs.local_threads import config as local_threads

# HPC system configurations
from parsl.configs.stampede2 import config as stampede2_config
from parsl.configs.frontera import config as frontera_config
from parsl.configs.summit import config as summit_config

# Cloud configurations
from parsl.configs.ec2 import config as ec2_config
from parsl.configs.kubernetes import config as k8s_config

# Use pre-built config
parsl.load(htex_local)
```

## Configuration Validation

Common configuration validation patterns and error handling:

```python
import parsl
from parsl.errors import ConfigurationError

try:
    parsl.load(config)
except ConfigurationError as e:
    print(f"Configuration error: {e}")
    # Handle configuration issues

# Validate executor labels are unique
executor_labels = [ex.label for ex in config.executors]
if len(executor_labels) != len(set(executor_labels)):
    raise ConfigurationError("Executor labels must be unique")
```
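
The uniqueness check above reports only that a clash exists. A variant that names the offending labels can make the error easier to act on. The sketch below uses `SimpleNamespace` objects as stand-ins for real executors, and `duplicate_labels` is a hypothetical helper rather than a Parsl function.

```python
from collections import Counter
from types import SimpleNamespace


def duplicate_labels(executors):
    """Return the sorted labels that appear on more than one executor."""
    counts = Counter(ex.label for ex in executors)
    return sorted(label for label, n in counts.items() if n > 1)


# SimpleNamespace objects stand in for real executors here.
executors = [SimpleNamespace(label='light_tasks'),
             SimpleNamespace(label='heavy_compute'),
             SimpleNamespace(label='light_tasks')]
print(duplicate_labels(executors))
```

An empty list means every label is unique. A non-empty list can be included in the error message before the configuration is loaded.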