
# IPython Integration

Kedro integrates with IPython and Jupyter environments through magic commands, automatic project loading, and interactive development support. This enables iterative development and debugging of data pipelines.

## Capabilities

### IPython Extension Loading

The main entry point for the IPython extension, which enables Kedro magic commands and project integration.

```python { .api }
def load_ipython_extension(ipython):
    """
    Load the Kedro IPython extension.

    Args:
        ipython (InteractiveShell): IPython shell instance

    Side Effects:
        - Registers the %reload_kedro magic command
        - Registers the %load_node magic command
        - Automatically loads a Kedro project if found in the current directory
        - Provides logging and status information
    """
```

### Project Reloading

Functions for loading and reloading Kedro projects in interactive environments.

```python { .api }
def reload_kedro(path=None, env=None, runtime_params=None, local_namespace=None, conf_source=None):
    """
    Load or reload a Kedro project in an IPython/Jupyter environment.

    Args:
        path (str, optional): Path to the Kedro project root
        env (str, optional): Environment name for configuration
        runtime_params (dict, optional): Runtime parameters to pass
        local_namespace (dict, optional): Local namespace for variable injection
        conf_source (str, optional): Configuration source directory

    Side Effects:
        - Creates a KedroSession and loads the context
        - Injects 'context', 'catalog', 'session' and 'pipelines' into the namespace
        - Loads project-specific entry points and magic commands
        - Configures project logging and settings
    """
```
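
The variable-injection step performed by `reload_kedro` can be sketched in plain Python. The helper below is illustrative only, not Kedro's internal implementation; in a real IPython session the target dictionary would be `ipython.user_ns`.

```python
def inject_into_namespace(namespace, context, catalog, session, pipelines):
    """Mirror the four variables reload_kedro exposes interactively.

    Illustrative sketch: the names match the documented side effects,
    but Kedro's internals may differ.
    """
    namespace.update(
        context=context, catalog=catalog, session=session, pipelines=pipelines
    )
    return namespace

# Stand-in objects in place of real Kedro instances
ns = inject_into_namespace({}, context="ctx", catalog="cat",
                           session="sess", pipelines={})
print(sorted(ns))  # the four standard interactive variable names
```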

## Magic Commands

### %reload_kedro Magic

IPython line magic for loading and reloading Kedro projects with configuration options.

```python { .api }
# %reload_kedro [path] [--env ENV] [--params PARAMS] [--conf-source CONF_SOURCE]
"""
Reload the Kedro project with the specified configuration.

Arguments:
    path (str, optional): Path to the project root directory

Options:
    --env, -e (str): Environment name for configuration loading
    --params (str): Runtime parameters in key=value,key2=value2 format
    --conf-source (str): Path to the configuration source directory

Examples:
    %reload_kedro
    %reload_kedro /path/to/project
    %reload_kedro --env production
    %reload_kedro --params model_type=xgboost,n_estimators=100
    %reload_kedro --conf-source custom_conf
    %reload_kedro /path/to/project --env staging --params debug=true

Variables Created:
    context: KedroContext instance for project management
    catalog: DataCatalog instance for dataset operations
    session: KedroSession instance for lifecycle management
    pipelines: Pipeline registry with project pipelines
"""
```
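
The `key=value,key2=value2` convention used by `--params` can be parsed into a dictionary in a few lines. This is a minimal sketch of the string format only, not Kedro's actual CLI parser (which additionally handles type conversion and nested keys):

```python
def parse_runtime_params(params_str):
    """Parse a "key=value,key2=value2" string into a dict of strings.

    Illustrative sketch of the documented format; values are kept
    as strings and no nesting is supported.
    """
    params = {}
    for pair in params_str.split(","):
        key, _, value = pair.partition("=")
        params[key.strip()] = value.strip()
    return params

parse_runtime_params("model_type=xgboost,n_estimators=100")
# {'model_type': 'xgboost', 'n_estimators': '100'}
```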

### %load_node Magic

IPython line magic for loading and debugging individual pipeline nodes.

```python { .api }
# %load_node [node_name]
"""
Load node code for debugging and development.

Arguments:
    node_name (str): Name of the pipeline node to load

Features:
    - Generates executable code for node debugging
    - Loads node inputs from the catalog
    - Provides import statements and function definitions
    - Creates executable function calls with proper parameters
    - Supports multiple output cells in Jupyter/VSCode

Supported Environments:
    - Jupyter Notebook (>7.0)
    - Jupyter Lab
    - IPython terminal
    - VSCode Notebook

Examples:
    %load_node preprocess_data_node
    %load_node train_model_node

Generated Code Includes:
    1. Catalog loading statements for node inputs
    2. Import statements from the node's source file
    3. The function definition from the node source
    4. A function call with proper parameter mapping
"""
```

## Usage Examples

### Basic IPython Integration

```python
# In an IPython/Jupyter cell
%load_ext kedro.ipython

# This automatically:
# 1. Registers magic commands
# 2. Detects and loads the Kedro project from the current directory
# 3. Provides the context, catalog, session and pipelines variables

# Access project components
print(f"Available datasets: {catalog.keys()}")
print(f"Available pipelines: {list(pipelines.keys())}")

# Load data for exploration
raw_data = catalog.load("raw_data")
print(f"Raw data shape: {raw_data.shape}")

# Save results back to the catalog
processed_data = raw_data.dropna()
catalog.save("processed_data", processed_data)
```

### Project Reloading

```python
# Reload the project after code changes
%reload_kedro

# Reload with a specific environment
%reload_kedro --env production

# Reload with runtime parameters
%reload_kedro --params model_type=random_forest,n_estimators=200

# Reload from a different project path
%reload_kedro /path/to/other/project --env local
```

### Node Debugging Workflow

```python
# Load a specific node for debugging
%load_node preprocess_data_node

# This generates multiple cells with:
# 1. Input loading
# 2. Imports
# 3. Function definition
# 4. Function call

# Example generated code:

# Cell 1 - Load inputs
"""
# Prepare necessary inputs for debugging
# All debugging inputs must be defined in your project catalog
raw_data = catalog.load("raw_data")
parameters = catalog.load("parameters:preprocessing")
"""

# Cell 2 - Imports
"""
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler
"""

# Cell 3 - Function definition
"""
def preprocess_data(raw_data, parameters):
    '''Clean and preprocess raw data.'''
    # Drop missing values
    clean_data = raw_data.dropna()

    # Apply scaling if requested
    if parameters.get("scale_features", False):
        scaler = StandardScaler()
        numeric_columns = clean_data.select_dtypes(include=[np.number]).columns
        clean_data[numeric_columns] = scaler.fit_transform(clean_data[numeric_columns])

    return clean_data
"""

# Cell 4 - Function call
"""
preprocess_data(raw_data, parameters)
"""
```

### Advanced Interactive Development

```python
# Load the extension and project
%load_ext kedro.ipython
%reload_kedro --env development

# Explore pipeline structure
pipeline = pipelines["data_processing"]
print(f"Pipeline has {len(pipeline.nodes)} nodes")

# Visualize dependencies
for node in pipeline.nodes:
    print(f"{node.name}: {node.inputs} -> {node.outputs}")

# Run part of the pipeline interactively
from kedro.runner import SequentialRunner
runner = SequentialRunner()

# Run just the preprocessing nodes
preprocessing_pipeline = pipeline.filter(tags=["preprocessing"])
result = runner.run(preprocessing_pipeline, catalog)

# Inspect intermediate results
intermediate_data = catalog.load("cleaned_data")
print(f"Cleaned data statistics:\n{intermediate_data.describe()}")

# Test individual node modifications
def modified_preprocess_data(raw_data):
    # Test new preprocessing logic
    return raw_data.fillna(0)  # Different approach

# Test with current catalog data
test_input = catalog.load("raw_data")
test_output = modified_preprocess_data(test_input)
print(f"Modified preprocessing result: {test_output.shape}")
```

### Multi-Environment Development

```python
# Work with different environments
%reload_kedro --env local
local_catalog = catalog

%reload_kedro --env staging
staging_catalog = catalog

# Compare configurations
print("Local datasets:", local_catalog.keys())
print("Staging datasets:", staging_catalog.keys())

# Test the pipeline with different data
%reload_kedro --env test
test_result = session.run("validation_pipeline")
print("Test validation results:", test_result)
```

### Custom Magic Command Development

```python
from IPython.core.magic import register_line_magic, needs_local_scope

@register_line_magic
@needs_local_scope
def kedro_status(line, local_ns=None):
    """Custom magic command to show Kedro project status."""
    if 'context' not in local_ns:
        print("No Kedro project loaded. Use %reload_kedro first.")
        return

    context = local_ns['context']
    catalog = local_ns['catalog']
    pipelines = local_ns['pipelines']

    print(f"Project Path: {context.project_path}")
    print(f"Environment: {context._env}")
    print(f"Datasets: {len(catalog.keys())}")
    print(f"Pipelines: {len(pipelines)}")

    # Show pipeline node counts
    for name, pipeline in pipelines.items():
        print(f"  {name}: {len(pipeline.nodes)} nodes")

# Use the custom magic (registered by the decorators above)
%kedro_status
```

### Integration with Data Science Workflows

```python
# Load the Kedro project
%load_ext kedro.ipython

# Use catalog data with pandas/matplotlib
import matplotlib.pyplot as plt
import seaborn as sns

# Load data for analysis
df = catalog.load("cleaned_data")

# Exploratory data analysis
plt.figure(figsize=(12, 8))
sns.heatmap(df.corr(), annot=True)
plt.title("Feature Correlations")
plt.show()

# Feature engineering experiments
def create_new_features(data):
    data = data.copy()
    data['feature_ratio'] = data['feature_a'] / data['feature_b']
    data['feature_interaction'] = data['feature_a'] * data['feature_b']
    return data

# Test the new features
enhanced_data = create_new_features(df)

# Save back to the catalog for pipeline use
catalog.save("enhanced_features", enhanced_data)

# Run the modeling pipeline with the new features
result = session.run("modeling_pipeline", from_inputs=["enhanced_features"])
```

### Debugging Failed Pipelines

```python
# Load the project and examine the failed pipeline
%reload_kedro

# Load the specific node that failed
%load_node problematic_node

# Debug with actual inputs
# (assumes `node` refers to the failed node object, e.g. obtained
# from a pipeline's `.nodes` list)
problematic_inputs = {
    input_name: catalog.load(input_name)
    for input_name in node.inputs
}

# Step through the function logic
def debug_function(input_data, parameters):
    print(f"Input shape: {input_data.shape}")
    print(f"Parameters: {parameters}")

    # Add debugging prints
    step1_result = input_data.dropna()
    print(f"After dropna: {step1_result.shape}")

    step2_result = step1_result[step1_result['value'] > 0]
    print(f"After filtering: {step2_result.shape}")

    return step2_result

# Test with debugging
debug_result = debug_function(
    problematic_inputs['input_data'],
    problematic_inputs['parameters:config']
)
```

## Environment Detection and Adaptation

```python { .api }
def _guess_run_environment():
    """
    Detect the current IPython/Jupyter environment.

    Returns:
        str: Environment identifier - "vscode", "databricks", "jupyter", or "ipython"

    Detection Logic:
        - VSCode: checks for the VSCODE_PID or VSCODE_CWD environment variables
        - Databricks: checks for the DATABRICKS_RUNTIME_VERSION environment variable
        - Jupyter: checks for a kernel attribute on the IPython instance
        - IPython: default fallback for terminal IPython
    """
```
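
The detection order described above can be sketched as a plain function. This is an illustrative re-implementation of the documented logic, not the private `_guess_run_environment` itself, which may differ between Kedro versions:

```python
import os

def guess_run_environment(ipython=None):
    """Return one of "vscode", "databricks", "jupyter" or "ipython".

    Sketch of the documented detection order; checks are performed
    from most specific to least specific.
    """
    # VSCode sets VSCODE_PID / VSCODE_CWD for its integrated terminal
    if "VSCODE_PID" in os.environ or "VSCODE_CWD" in os.environ:
        return "vscode"
    # Databricks runtimes expose DATABRICKS_RUNTIME_VERSION
    if "DATABRICKS_RUNTIME_VERSION" in os.environ:
        return "databricks"
    # Jupyter shells carry a `kernel` attribute; terminal IPython does not
    if ipython is not None and hasattr(ipython, "kernel"):
        return "jupyter"
    return "ipython"
```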

## Types

```python { .api }
from typing import Dict, Any, Optional

from IPython.core.interactiveshell import InteractiveShell

EnvironmentName = Optional[str]
RuntimeParams = Optional[Dict[str, Any]]
LocalNamespace = Optional[Dict[str, Any]]
ConfSource = Optional[str]
ProjectPath = Optional[str]
NodeName = str
MagicCommand = str
```