# IPython Integration

Kedro provides seamless integration with IPython and Jupyter environments through magic commands, automatic project loading, and interactive development support. This enables iterative development and debugging of data pipelines.

## Capabilities

### IPython Extension Loading

The main entry point for the IPython extension: it registers Kedro's magic commands and sets up project integration.

```python { .api }
def load_ipython_extension(ipython):
    """
    Load Kedro IPython extension.

    Args:
        ipython (InteractiveShell): IPython shell instance

    Side Effects:
        - Registers %reload_kedro magic command
        - Registers %load_node magic command
        - Automatically loads Kedro project if found in current directory
        - Provides logging and status information
    """
```
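
The extension's responsibilities can be sketched with a stand-in shell object. Everything below (`FakeShell`, `load_extension_sketch`, the placeholder magic bodies) is illustrative, not Kedro's actual implementation; only the `register_magic_function` signature mirrors IPython's real `InteractiveShell` API:

```python
# Illustrative sketch of the extension's registration step, using a
# stand-in shell object instead of a real InteractiveShell.
class FakeShell:
    def __init__(self):
        self.magics = {}

    def register_magic_function(self, func, magic_kind="line", magic_name=None):
        # Mirrors InteractiveShell.register_magic_function's signature.
        self.magics[magic_name or func.__name__] = func


def load_extension_sketch(shell):
    """Register placeholder Kedro line magics on the given shell (sketch only)."""
    def reload_kedro(line):
        return f"reloading with args: {line!r}"

    def load_node(line):
        return f"loading node: {line!r}"

    shell.register_magic_function(reload_kedro, "line", "reload_kedro")
    shell.register_magic_function(load_node, "line", "load_node")


shell = FakeShell()
load_extension_sketch(shell)
print(sorted(shell.magics))  # ['load_node', 'reload_kedro']
```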

### Project Reloading

Functions for loading and reloading Kedro projects in interactive environments.

```python { .api }
def reload_kedro(path=None, env=None, runtime_params=None, local_namespace=None, conf_source=None):
    """
    Load or reload Kedro project in IPython/Jupyter environment.

    Args:
        path (str, optional): Path to Kedro project root
        env (str, optional): Environment name for configuration
        runtime_params (dict, optional): Runtime parameters to pass
        local_namespace (dict, optional): Local namespace for variable injection
        conf_source (str, optional): Configuration source directory

    Side Effects:
        - Creates KedroSession and loads context
        - Injects 'context', 'catalog', 'session', 'pipelines' into namespace
        - Loads project-specific entry points and magic commands
        - Configures project logging and settings
    """
```
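
The namespace injection listed in the side effects amounts to updating the user's namespace with a fixed set of names. A minimal sketch, with strings standing in for the real session, context, catalog, and pipeline objects (`inject_kedro_variables` is a hypothetical helper, not part of Kedro's API):

```python
def inject_kedro_variables(namespace, session, context, catalog, pipelines):
    """Mimic reload_kedro's side effect of exposing project objects (sketch)."""
    namespace.update(
        {"session": session, "context": context, "catalog": catalog, "pipelines": pipelines}
    )


ns = {}
inject_kedro_variables(
    ns,
    session="<session>",
    context="<context>",
    catalog="<catalog>",
    pipelines={"__default__": "<pipeline>"},
)
print(sorted(ns))  # ['catalog', 'context', 'pipelines', 'session']
```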

## Magic Commands

### %reload_kedro Magic

IPython line magic for loading and reloading Kedro projects with configuration options.

```python { .api }
# %reload_kedro [path] [--env ENV] [--params PARAMS] [--conf-source CONF_SOURCE]

"""
Reload Kedro project with specified configuration.

Arguments:
    path (str, optional): Path to project root directory

Options:
    --env, -e (str): Environment name for configuration loading
    --params (str): Runtime parameters in key=value,key2=value2 format
    --conf-source (str): Path to configuration source directory

Examples:
    %reload_kedro
    %reload_kedro /path/to/project
    %reload_kedro --env production
    %reload_kedro --params model_type=xgboost,n_estimators=100
    %reload_kedro --conf-source custom_conf
    %reload_kedro /path/to/project --env staging --params debug=true

Variables Created:
    context: KedroContext instance for project management
    catalog: DataCatalog instance for dataset operations
    session: KedroSession instance for lifecycle management
    pipelines: Pipeline registry with project pipelines
"""
```
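
The `--params key=value,key2=value2` string has to be turned into a runtime-parameters dict before it reaches `reload_kedro`. A hedged sketch of such parsing (Kedro's CLI has its own parser with additional features such as type coercion, so treat `parse_runtime_params` as illustrative only):

```python
def parse_runtime_params(params: str) -> dict:
    """Parse 'key=value,key2=value2' into a dict of runtime parameters (sketch)."""
    result = {}
    for pair in params.split(","):
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {pair!r}")
        result[key.strip()] = value.strip()
    return result


print(parse_runtime_params("model_type=xgboost,n_estimators=100"))
# {'model_type': 'xgboost', 'n_estimators': '100'}
```

Note that all values arrive as strings here; any numeric conversion would be a separate step.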

### %load_node Magic

IPython line magic for loading and debugging individual pipeline nodes.

```python { .api }
# %load_node [node_name]

"""
Load node code for debugging and development.

Arguments:
    node_name (str): Name of the pipeline node to load

Features:
    - Generates executable code for node debugging
    - Loads node inputs from catalog
    - Provides import statements and function definitions
    - Creates executable function calls with proper parameters
    - Supports multiple output cells in Jupyter/VSCode

Supported Environments:
    - Jupyter Notebook (>7.0)
    - Jupyter Lab
    - IPython terminal
    - VSCode Notebook

Examples:
    %load_node preprocess_data_node
    %load_node train_model_node

Generated Code Includes:
    1. Catalog loading statements for node inputs
    2. Import statements from node's source file
    3. Function definition from node source
    4. Function call with proper parameter mapping
"""
```
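
The first and last of the generated cells can be approximated by simple string assembly from a node's function name and input dataset names. `generate_debug_cells` below is a hypothetical sketch, not Kedro's implementation; it only illustrates why dataset names like `params:preprocessing` need sanitising into valid Python identifiers:

```python
def generate_debug_cells(func_name, inputs):
    """Assemble the input-loading cell and the call cell for a node (sketch)."""
    # Dataset names may contain ':' or '.', which are invalid in identifiers.
    var_names = [name.replace(":", "__").replace(".", "__") for name in inputs]
    load_cell = "\n".join(
        f'{var} = catalog.load("{name}")' for var, name in zip(var_names, inputs)
    )
    call_cell = f'{func_name}({", ".join(var_names)})'
    return [load_cell, call_cell]


cells = generate_debug_cells("preprocess_data", ["raw_data", "params:preprocessing"])
print(cells[1])  # preprocess_data(raw_data, params__preprocessing)
```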

## Usage Examples

### Basic IPython Integration

```python
# In an IPython/Jupyter cell
%load_ext kedro.ipython

# This automatically:
# 1. Registers magic commands
# 2. Detects and loads a Kedro project from the current directory
# 3. Provides the context, catalog, session, pipelines variables

# Access project components
print(f"Available datasets: {catalog.list()}")
print(f"Available pipelines: {list(pipelines.keys())}")

# Load data for exploration
raw_data = catalog.load("raw_data")
print(f"Raw data shape: {raw_data.shape}")

# Save results back to the catalog
processed_data = raw_data.dropna()
catalog.save("processed_data", processed_data)
```

### Project Reloading

```python
# Reload project after code changes
%reload_kedro

# Reload with specific environment
%reload_kedro --env production

# Reload with runtime parameters
%reload_kedro --params model_type=random_forest,n_estimators=200

# Reload from different project path
%reload_kedro /path/to/other/project --env local
```

### Node Debugging Workflow

```python
# Load specific node for debugging
%load_node preprocess_data_node

# This generates multiple cells with:
# 1. Input loading
# 2. Imports
# 3. Function definition
# 4. Function call

# Example generated code:
# Cell 1 - Load inputs
"""
# Prepare necessary inputs for debugging
# All debugging inputs must be defined in your project catalog
raw_data = catalog.load("raw_data")
parameters = catalog.load("params:preprocessing")
"""

# Cell 2 - Imports
"""
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler
"""

# Cell 3 - Function definition
"""
def preprocess_data(raw_data, parameters):
    '''Clean and preprocess raw data.'''
    # Drop missing values
    clean_data = raw_data.dropna()

    # Apply scaling if requested
    if parameters.get("scale_features", False):
        scaler = StandardScaler()
        numeric_columns = clean_data.select_dtypes(include=[np.number]).columns
        clean_data[numeric_columns] = scaler.fit_transform(clean_data[numeric_columns])

    return clean_data
"""

# Cell 4 - Function call
"""
preprocess_data(raw_data, parameters)
"""
```

### Advanced Interactive Development

```python
# Load extension and project
%load_ext kedro.ipython
%reload_kedro --env development

# Explore pipeline structure
pipeline = pipelines["data_processing"]
print(f"Pipeline has {len(pipeline.nodes)} nodes")

# Visualize dependencies
for node in pipeline.nodes:
    print(f"{node.name}: {node.inputs} -> {node.outputs}")

# Run partial pipeline interactively
from kedro.runner import SequentialRunner
runner = SequentialRunner()

# Run just preprocessing nodes
preprocessing_pipeline = pipeline.filter(tags=["preprocessing"])
result = runner.run(preprocessing_pipeline, catalog)

# Inspect intermediate results
intermediate_data = catalog.load("cleaned_data")
print(f"Cleaned data statistics:\n{intermediate_data.describe()}")

# Test individual node modifications
def modified_preprocess_data(raw_data):
    # Test new preprocessing logic
    return raw_data.fillna(0)  # Different approach

# Test with current catalog data
test_input = catalog.load("raw_data")
test_output = modified_preprocess_data(test_input)
print(f"Modified preprocessing result: {test_output.shape}")
```

### Multi-Environment Development

```python
# Work with different environments
%reload_kedro --env local
local_catalog = catalog

%reload_kedro --env staging
staging_catalog = catalog

# Compare configurations
print("Local datasets:", local_catalog.list())
print("Staging datasets:", staging_catalog.list())

# Test pipeline with different data
%reload_kedro --env test
test_result = session.run("validation_pipeline")
print("Test validation results:", test_result)
```

### Custom Magic Command Development

```python
from IPython.core.magic import register_line_magic, needs_local_scope

@register_line_magic
@needs_local_scope
def kedro_status(line, local_ns=None):
    """Custom magic command to show Kedro project status."""
    if 'context' not in local_ns:
        print("No Kedro project loaded. Use %reload_kedro first.")
        return

    context = local_ns['context']
    catalog = local_ns['catalog']
    pipelines = local_ns['pipelines']

    print(f"Project Path: {context.project_path}")
    print(f"Environment: {context.env}")
    print(f"Datasets: {len(catalog.list())}")
    print(f"Pipelines: {len(pipelines)}")

    # Show pipeline node counts
    for name, pipeline in pipelines.items():
        print(f"  {name}: {len(pipeline.nodes)} nodes")

# Invoke the custom magic (registration happened at definition time)
%kedro_status
```

### Integration with Data Science Workflows

```python
# Load Kedro project
%load_ext kedro.ipython

# Use catalog data with pandas/matplotlib
import matplotlib.pyplot as plt
import seaborn as sns

# Load data for analysis
df = catalog.load("cleaned_data")

# Exploratory data analysis
plt.figure(figsize=(12, 8))
sns.heatmap(df.corr(), annot=True)
plt.title("Feature Correlations")
plt.show()

# Feature engineering experiments
def create_new_features(data):
    data = data.copy()
    data['feature_ratio'] = data['feature_a'] / data['feature_b']
    data['feature_interaction'] = data['feature_a'] * data['feature_b']
    return data

# Test new features
enhanced_data = create_new_features(df)

# Save back to catalog for pipeline use
catalog.save("enhanced_features", enhanced_data)

# Run modeling pipeline with new features
result = session.run("modeling_pipeline", from_inputs=["enhanced_features"])
```

### Debugging Failed Pipelines

```python
# Load project and examine the failed pipeline
%reload_kedro

# Load the specific node that failed
%load_node problematic_node

# Look the node up in the pipeline registry to inspect its inputs
node = next(
    n for p in pipelines.values() for n in p.nodes if n.name == "problematic_node"
)

# Debug with actual inputs
problematic_inputs = {
    input_name: catalog.load(input_name)
    for input_name in node.inputs
}

# Step through function logic
def debug_function(input_data, parameters):
    print(f"Input shape: {input_data.shape}")
    print(f"Parameters: {parameters}")

    # Add debugging prints
    step1_result = input_data.dropna()
    print(f"After dropna: {step1_result.shape}")

    step2_result = step1_result[step1_result['value'] > 0]
    print(f"After filtering: {step2_result.shape}")

    return step2_result

# Test with debugging
debug_result = debug_function(
    problematic_inputs['input_data'],
    problematic_inputs['params:config']
)
```

## Environment Detection and Adaptation

```python { .api }
def _guess_run_environment():
    """
    Detect the current IPython/Jupyter environment.

    Returns:
        str: Environment identifier - "vscode", "databricks", "jupyter", or "ipython"

    Detection Logic:
        - VSCode: Checks for VSCODE_PID or VSCODE_CWD environment variables
        - Databricks: Checks for DATABRICKS_RUNTIME_VERSION environment variable
        - Jupyter: Checks for kernel attribute on IPython instance
        - IPython: Default fallback for terminal IPython
    """
```
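
The detection order above can be sketched as a pure function. The real helper inspects `os.environ` and the live IPython instance; this sketch takes the environment mapping and a `has_kernel` flag explicitly so the logic stays testable:

```python
def guess_run_environment(environ, has_kernel=False):
    """Follow the documented detection order on an explicit environ mapping (sketch)."""
    if "VSCODE_PID" in environ or "VSCODE_CWD" in environ:
        return "vscode"
    if "DATABRICKS_RUNTIME_VERSION" in environ:
        return "databricks"
    if has_kernel:
        return "jupyter"
    return "ipython"


print(guess_run_environment({"VSCODE_PID": "1234"}))  # vscode
print(guess_run_environment({}, has_kernel=True))     # jupyter
```

The order matters: VSCode's notebook frontend also runs a Jupyter kernel, so the VSCode check must come before the kernel check.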

## Types

```python { .api }
from typing import Any, Dict, Optional

from IPython.core.interactiveshell import InteractiveShell

EnvironmentName = Optional[str]
RuntimeParams = Optional[Dict[str, Any]]
LocalNamespace = Optional[Dict[str, Any]]
ConfSource = Optional[str]
ProjectPath = Optional[str]
NodeName = str
MagicCommand = str
```