# Debug Tools

Tools for loading debug export files and serving them from a webserver backed by an ephemeral instance. They support detailed troubleshooting, development workflows, and analysis of production issues in an isolated environment.

## Capabilities

### Debug Workspace Context

A workspace process context that wraps an ephemeral instance for debugging scenarios.

```python { .api }
class WebserverDebugWorkspaceProcessContext(IWorkspaceProcessContext):
    """
    IWorkspaceProcessContext that works with an ephemeral instance.
    Needed for dagster-webserver debug to work with preloaded debug data.
    """

    def __init__(self, instance: DagsterInstance):
        """
        Initialize debug workspace context.

        Args:
            instance: Ephemeral DagsterInstance with preloaded debug data
        """

    def create_request_context(self, source: Optional[Any] = None) -> BaseWorkspaceRequestContext:
        """
        Create request context for debug mode.

        Args:
            source: Optional source context

        Returns:
            BaseWorkspaceRequestContext: Context with ephemeral instance and empty workspace
        """

    def refresh_code_location(self, name: str) -> None:
        """Refresh code location (not implemented in debug mode)."""

    def reload_code_location(self, name: str) -> None:
        """Reload code location (not implemented in debug mode)."""

    def reload_workspace(self) -> None:
        """Reload workspace (no-op in debug mode)."""

    def refresh_workspace(self) -> None:
        """Refresh workspace (no-op in debug mode)."""

    @property
    def instance(self) -> DagsterInstance:
        """Get the ephemeral debug instance."""

    @property
    def version(self) -> str:
        """Get webserver version."""
```

**Usage Examples:**

```python
import gzip

import uvicorn
from dagster import DagsterInstance
from dagster._core.debug import DebugRunPayload
from dagster_shared.serdes import deserialize_value
from dagster_webserver.app import create_app_from_workspace_process_context
from dagster_webserver.debug import WebserverDebugWorkspaceProcessContext

# Paths to debug export files (example values)
debug_files = ["debug1.gz", "debug2.gz"]

# Load debug payloads from export files
debug_payloads = []
for debug_file in debug_files:
    with gzip.open(debug_file, "rb") as f:
        payload = deserialize_value(f.read().decode("utf-8"), DebugRunPayload)
    debug_payloads.append(payload)

# Create ephemeral instance with debug data
debug_instance = DagsterInstance.ephemeral(preload=debug_payloads)

# Create debug workspace context
with WebserverDebugWorkspaceProcessContext(debug_instance) as debug_context:
    # Create webserver app
    app = create_app_from_workspace_process_context(debug_context)

    # Run webserver
    uvicorn.run(app, host="127.0.0.1", port=3000)
```

### Debug CLI Command

Command-line interface for loading debug export files and starting a webserver with the debug data.

```python { .api }
@click.command(name="debug")
@click.argument("input_files", nargs=-1, type=click.Path(exists=True))
@click.option("--port", "-p", type=click.INT, default=3000, help="Port to run server on")
def webserver_debug_command(input_files, port):
    """
    Load webserver with ephemeral instance from dagster debug export files.

    Args:
        input_files: Paths to debug export files (gzipped)
        port: Port to run webserver on
    """

def main():
    """Entry point for dagster-webserver-debug CLI."""

def load_debug_files(file_paths: List[str]) -> List[DebugRunPayload]:
    """
    Load debug payloads from compressed export files.

    Args:
        file_paths: List of paths to .gz debug export files

    Returns:
        List[DebugRunPayload]: Loaded and deserialized debug payloads
    """
```

**CLI Usage Examples:**

```bash
# Load single debug export file
dagster-webserver-debug /path/to/debug_export.gz

# Load multiple debug files
dagster-webserver-debug debug1.gz debug2.gz debug3.gz

# Specify custom port
dagster-webserver-debug --port 8080 /path/to/debug_export.gz

# Load every export in the current directory
dagster-webserver-debug *.gz
```
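
The last example relies on the shell to expand `*.gz`. When the tool is launched from a script, the same file list can be built explicitly. The following is a minimal sketch, not part of the package: `build_debug_command` is a hypothetical helper that only assembles the argument list from the CLI name and `--port` option shown above, and the temporary directory with empty files exists purely for illustration.

```python
import tempfile
from pathlib import Path


def build_debug_command(export_dir, port=3000):
    """Return the argv list for dagster-webserver-debug over all .gz files in export_dir."""
    # Sort so the command is deterministic regardless of directory order.
    files = sorted(str(p) for p in Path(export_dir).glob("*.gz"))
    if not files:
        raise FileNotFoundError(f"no .gz export files found in {export_dir}")
    return ["dagster-webserver-debug", "--port", str(port), *files]


# Illustration with empty placeholder files (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    for name in ("b.gz", "a.gz", "notes.txt"):
        (Path(tmp) / name).touch()
    argv = build_debug_command(tmp, port=8080)
    export_names = [Path(p).name for p in argv[3:]]
```

The resulting list can be passed to `subprocess.run(argv, check=True)` to start the server.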

### Debug File Loading

Process for loading and deserializing debug export files:

```python
from gzip import GzipFile
from dagster._core.debug import DebugRunPayload
from dagster_shared.serdes import deserialize_value

def load_debug_files(file_paths):
    """
    Load debug payloads from export files.

    Args:
        file_paths: List of paths to debug export files

    Returns:
        list[DebugRunPayload]: Loaded debug payloads
    """
    debug_payloads = []

    for file_path in file_paths:
        print(f"Loading {file_path}...")

        with GzipFile(file_path, "rb") as file:
            blob = file.read().decode("utf-8")
            debug_payload = deserialize_value(blob, DebugRunPayload)

        print(f"  run_id: {debug_payload.dagster_run.run_id}")
        print(f"  dagster version: {debug_payload.version}")

        debug_payloads.append(debug_payload)

    return debug_payloads

# Usage
debug_payloads = load_debug_files(["debug1.gz", "debug2.gz"])
```
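
The loader above treats each export as gzip-compressed UTF-8 text. The sketch below illustrates only that container layer, using the standard library. The JSON blob is a placeholder standing in for a real serialized `DebugRunPayload`, and `write_text_gz`, `read_text_gz`, and `demo_round_trip` are hypothetical names.

```python
import gzip
import json
import tempfile
from pathlib import Path


def write_text_gz(path, text):
    """Write text to a gzip file as UTF-8 bytes."""
    with gzip.open(path, "wb") as f:
        f.write(text.encode("utf-8"))


def read_text_gz(path):
    """Read a gzip file and decode it as UTF-8, mirroring the loader above."""
    with gzip.GzipFile(path, "rb") as f:
        return f.read().decode("utf-8")


def demo_round_trip():
    """Write a placeholder blob to a temporary .gz file and read it back."""
    # Placeholder JSON, not the real DebugRunPayload serialization format.
    blob = json.dumps({"run_id": "example-run", "version": "0.0.0"})
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "debug_export.gz"
        write_text_gz(path, blob)
        return json.loads(read_text_gz(path))


result = demo_round_trip()
```

A file that fails at the `decode` step is likely not a text export at all, which is worth ruling out before suspecting the deserializer.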

### Ephemeral Instance Creation

Create a DagsterInstance with preloaded debug data:

```python
from dagster import DagsterInstance

def create_debug_instance(debug_payloads):
    """
    Create ephemeral instance with debug data.

    Args:
        debug_payloads: List of DebugRunPayload objects

    Returns:
        DagsterInstance: Ephemeral instance with preloaded data
    """
    return DagsterInstance.ephemeral(preload=debug_payloads)

# Usage
debug_instance = create_debug_instance(debug_payloads)

# Instance contains all debug data
runs = debug_instance.get_runs()
print(f"Loaded {len(runs)} runs from debug files")
```

## Debug Export File Format

Debug export files contain serialized Dagster run data:

### DebugRunPayload Structure
```python
@dataclass
class DebugRunPayload:
    """Payload for debug export containing run and related data."""
    version: str                    # Dagster version
    dagster_run: DagsterRun         # Run configuration and metadata
    event_list: List[DagsterEvent]  # All events for the run
    instance_settings: dict         # Instance configuration
    workspace_context: dict         # Workspace information
```

### Creating Debug Exports
```python
# Debug exports are typically created with the dagster CLI:
#   dagster debug export <run_id> <output_file>

# Programmatic export (advanced usage)
import gzip

from dagster._core.debug import DebugRunPayload
from dagster_shared.serdes import serialize_value

def create_debug_export(instance, run_id, output_path):
    """Create debug export file for a run."""
    run = instance.get_run_by_id(run_id)
    if run is None:
        raise ValueError(f"Run {run_id} not found")

    # Assumes the DebugRunPayload.build classmethod, which gathers the
    # run's events and instance settings into a payload.
    payload = DebugRunPayload.build(instance, run)

    serialized = serialize_value(payload)

    with gzip.open(output_path, "wb") as f:
        f.write(serialized.encode("utf-8"))
```

## Development Workflows

### Local Debugging
```python
import uvicorn
from dagster import DagsterInstance
from dagster_webserver.app import create_app_from_workspace_process_context
from dagster_webserver.debug import WebserverDebugWorkspaceProcessContext

# Load production debug data locally
# (load_debug_files is the helper from "Debug File Loading")
debug_files = ["prod_run_123.gz", "prod_run_124.gz"]
debug_payloads = load_debug_files(debug_files)

# Create local debug environment
debug_instance = DagsterInstance.ephemeral(preload=debug_payloads)

with WebserverDebugWorkspaceProcessContext(debug_instance) as context:
    app = create_app_from_workspace_process_context(context)

    # Access production data locally without connecting to prod
    uvicorn.run(app, host="127.0.0.1", port=3000)
```

### Issue Investigation
```bash
# 1. Export debug data from production
dagster debug export failing_run_id failing_run.gz

# 2. Load in local debug environment
dagster-webserver-debug failing_run.gz

# 3. Navigate to http://localhost:3000 to investigate
#    - View run logs and events
#    - Analyze execution timeline
#    - Examine asset lineage
#    - Debug configuration issues
```
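
### Testing and Development
```python
from dagster import DagsterInstance
from dagster_webserver.app import create_app_from_workspace_process_context
from dagster_webserver.debug import WebserverDebugWorkspaceProcessContext
from starlette.testclient import TestClient

# Create test debug data
# (create_test_debug_payloads is a placeholder for your own fixture)
test_payloads = create_test_debug_payloads()
test_instance = DagsterInstance.ephemeral(preload=test_payloads)

with WebserverDebugWorkspaceProcessContext(test_instance) as context:
    # Test webserver functionality
    app = create_app_from_workspace_process_context(context)

    # Run integration tests by sending a trivial GraphQL query
    test_client = TestClient(app)
    response = test_client.post("/graphql", json={"query": "{ __typename }"})
    assert response.status_code == 200
```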

### Performance Analysis
```python
import glob

from dagster import DagsterInstance
from dagster_webserver.app import create_app_from_workspace_process_context
from dagster_webserver.debug import WebserverDebugWorkspaceProcessContext

# Load multiple runs for performance analysis
large_debug_files = glob.glob("debug_exports/*.gz")
debug_payloads = load_debug_files(large_debug_files)

print(f"Loaded {len(debug_payloads)} runs for analysis")

# Create instance with all data
analysis_instance = DagsterInstance.ephemeral(preload=debug_payloads)

# Analyze patterns across multiple runs
with WebserverDebugWorkspaceProcessContext(analysis_instance) as context:
    app = create_app_from_workspace_process_context(context)
    # Use webserver UI to analyze trends and patterns
```

## Advanced Debug Scenarios

### Custom Debug Context
```python
class CustomDebugWorkspaceProcessContext(WebserverDebugWorkspaceProcessContext):
    """Extended debug context with custom functionality."""

    def __init__(self, instance, custom_config):
        super().__init__(instance)
        self.custom_config = custom_config

    def create_request_context(self, source=None):
        context = super().create_request_context(source)
        # Add custom debugging information
        context.debug_config = self.custom_config
        return context

# Usage with custom context
custom_context = CustomDebugWorkspaceProcessContext(
    debug_instance,
    {"debug_mode": True, "verbose_logging": True}
)
```

### Debug Data Filtering
```python
from dagster import DagsterInstance, DagsterRunStatus

def filter_debug_payloads(payloads, criteria):
    """Filter debug payloads based on criteria."""
    filtered = []

    for payload in payloads:
        if criteria.get("status") and payload.dagster_run.status != criteria["status"]:
            continue
        if criteria.get("job_name") and payload.dagster_run.job_name != criteria["job_name"]:
            continue
        filtered.append(payload)

    return filtered

# Load only failed runs (all_payloads is the list returned by load_debug_files)
failed_runs = filter_debug_payloads(
    all_payloads,
    {"status": DagsterRunStatus.FAILURE}
)

debug_instance = DagsterInstance.ephemeral(preload=failed_runs)
```
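
Before filtering, it can help to see how the loaded payloads are distributed across jobs and statuses. The following is a small sketch under the assumption that each payload exposes `dagster_run.job_name` and `dagster_run.status`, as the filter above does. `summarize_payloads` and `make_stub` are hypothetical names, and the stubs use plain strings where real payloads carry `DagsterRunStatus` values.

```python
from collections import Counter
from types import SimpleNamespace


def summarize_payloads(payloads):
    """Count payloads per (job_name, status) pair."""
    return Counter(
        (p.dagster_run.job_name, p.dagster_run.status) for p in payloads
    )


def make_stub(job_name, status):
    """Build a stand-in exposing only the attributes the summary reads."""
    return SimpleNamespace(
        dagster_run=SimpleNamespace(job_name=job_name, status=status)
    )


# Stand-in data for illustration only.
stubs = [
    make_stub("etl", "FAILURE"),
    make_stub("etl", "SUCCESS"),
    make_stub("etl", "FAILURE"),
]
summary = summarize_payloads(stubs)
```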

### Multi-Environment Debug
```python
import glob

from dagster import DagsterInstance

# Load debug data from multiple environments
prod_payloads = load_debug_files(glob.glob("prod_exports/*.gz"))
staging_payloads = load_debug_files(glob.glob("staging_exports/*.gz"))

# Create separate debug instances
prod_instance = DagsterInstance.ephemeral(preload=prod_payloads)
staging_instance = DagsterInstance.ephemeral(preload=staging_payloads)

# Compare environments
print(f"Prod runs: {len(prod_instance.get_runs())}")
print(f"Staging runs: {len(staging_instance.get_runs())}")
```

## Security Considerations

Debug exports can contain production run data, so run the debug webserver only in a trusted environment and keep it off public interfaces:

```python
import os

from dagster_webserver.cli import host_dagster_ui_with_workspace_process_context

# Refuse to start the debug webserver in production
if os.getenv("ENVIRONMENT") == "production":
    raise RuntimeError("Debug mode not allowed in production")

# Bind to localhost only
# (debug_context and port come from the setup shown in earlier sections)
host_dagster_ui_with_workspace_process_context(
    debug_context,
    host="127.0.0.1",  # Never use 0.0.0.0 for debug
    port=port,
    path_prefix="",
    log_level="debug",
)
```
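
The localhost-only rule can also be checked in code before the server starts. This is a sketch using the standard `ipaddress` module. `is_loopback_host` and `require_loopback` are hypothetical helpers, and rejecting every hostname other than `localhost` that is not an IP literal is a deliberately conservative assumption.

```python
import ipaddress


def is_loopback_host(host):
    """Return True for 'localhost' or a loopback IP literal such as 127.0.0.1 or ::1."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (for example a DNS name): treat as non-loopback.
        return False


def require_loopback(host):
    """Return host unchanged, or raise if it is not a loopback address."""
    if not is_loopback_host(host):
        raise ValueError(f"refusing to bind debug webserver to {host!r}")
    return host
```

Calling `require_loopback(host)` on the value passed as `host=` turns an accidental `0.0.0.0` into an immediate error instead of a publicly reachable server.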

## Limitations

Debug mode has several limitations:

- **No code locations**: Workspace contains no active code locations
- **No mutations**: Cannot execute runs or modify instance state
- **Ephemeral data**: All data is in-memory and lost when process ends
- **Limited workspace operations**: Most workspace operations are no-ops
- **Read-only access**: UI provides read-only view of historical data

These limitations ensure debug mode is safe for analyzing production data without affecting live systems.
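
Code that is shared between a live workspace and a debug session needs to tolerate these limitations. The sketch below uses a hypothetical stand-in class, `StubDebugContext`, that mirrors the behavior documented for the debug context. It assumes that "not implemented" surfaces as `NotImplementedError`, which the real class may or may not do.

```python
class StubDebugContext:
    """Hypothetical stand-in mirroring the documented debug-mode behavior."""

    def __init__(self, instance):
        self._instance = instance

    @property
    def instance(self):
        return self._instance

    def reload_workspace(self):
        # No-op: there is no workspace to reload in debug mode.
        return None

    def refresh_workspace(self):
        # No-op for the same reason.
        return None

    def reload_code_location(self, name):
        # Assumption: unsupported operations raise NotImplementedError.
        raise NotImplementedError("code locations are unavailable in debug mode")


def try_reload_location(context, name):
    """Attempt a reload and report whether the context supported it."""
    try:
        context.reload_code_location(name)
    except NotImplementedError:
        return False
    return True


stub = StubDebugContext(instance=object())
reloaded = try_reload_location(stub, "my_location")
```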