```bash
tessl install tessl/pypi-kedro@1.1.0
```

Kedro helps you build production-ready data and analytics pipelines.
Agent success rate when using this tile: 98%, a 1.32x improvement over the 74% baseline success rate without it.
Best practices for managing multi-environment configurations in Kedro.
```text
conf/
├── base/                 # Shared configuration (committed to git)
│   ├── catalog.yml       # Data catalog
│   ├── parameters.yml    # Pipeline parameters
│   └── logging.yml       # Logging configuration
├── local/                # Local overrides (gitignored)
│   ├── catalog.yml       # Local data sources
│   └── credentials.yml   # Local credentials
└── prod/                 # Production environment
    ├── catalog.yml       # Production data sources
    └── parameters.yml    # Production parameters
```

```python
from kedro.config import OmegaConfigLoader

# Load configuration for a specific environment
loader = OmegaConfigLoader(
    conf_source="conf",
    env="prod",  # Loads base + prod configs
)

# Access configurations
catalog_config = loader["catalog"]
parameters = loader["parameters"]
```
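The same environment selection applies when running a pipeline. A minimal sketch, assuming a standard Kedro project and that the code runs from the project root (`kedro run --env prod` is the CLI equivalent):

```python
from pathlib import Path

from kedro.framework.session import KedroSession
from kedro.framework.startup import bootstrap_project

bootstrap_project(Path.cwd())  # register the project's settings and pipelines

# Run against base + prod configuration instead of the default local env
with KedroSession.create(env="prod") as session:
    session.run()
```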
```yaml
# conf/base/parameters.yml
model:
  learning_rate: 0.001
  epochs: 100
data:
  path: "/data/project"
```

```yaml
# conf/prod/parameters.yml
model:
  epochs: 200          # Override base value
data:
  path: "/prod/data"   # Override base value
```

With `env="prod"`, the merged configuration is:

```yaml
model:
  learning_rate: 0.001  # From base
  epochs: 200           # From prod (overridden)
data:
  path: "/prod/data"    # From prod (overridden)
```
A soft merge strategy recursively merges configurations:

```python
loader = OmegaConfigLoader(
    conf_source="conf",
    merge_strategy={"parameters": "soft"},  # Default
)
```

A destructive merge strategy completely replaces the base configuration:

```python
loader = OmegaConfigLoader(
    conf_source="conf",
    merge_strategy={"parameters": "destructive"},
)
```

```python
# Override parameters at runtime
loader = OmegaConfigLoader(
    conf_source="conf",
    env="prod",
    runtime_params={
        "model.learning_rate": 0.01,
        "data.batch_size": 128,
    },
)
```
```yaml
# parameters.yml
data_dir: /data/project
raw_data_path: ${data_dir}/raw
processed_data_path: ${data_dir}/processed

model:
  learning_rate: 0.001

training:
  lr: ${model.learning_rate}  # Reference other parameters
```
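Interpolations are resolved when the configuration is loaded, so nodes receive plain values. A quick check, assuming the file above lives in `conf/base/`:

```python
from kedro.config import OmegaConfigLoader

# Loading parameters returns a plain dictionary with interpolations resolved
params = OmegaConfigLoader(conf_source="conf")["parameters"]
print(params["raw_data_path"])   # /data/project/raw
print(params["training"]["lr"])  # 0.001
```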
```yaml
# conf/base/catalog.yml
database:
  type: pandas.SQLTableDataset
  credentials: db_credentials  # Reference credentials
  table_name: users
```

```yaml
# conf/local/credentials.yml (gitignored)
db_credentials:
  con: postgresql://user:password@localhost:5432/dbname
```

Credentials can differ per environment:

```yaml
# conf/local/credentials.yml
db_credentials:
  con: postgresql://user:password@localhost:5432/local_db
```

```yaml
# conf/prod/credentials.yml
db_credentials:
  con: postgresql://user:password@prod-server:5432/prod_db
```
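Which credentials file is used follows the same environment resolution as the rest of the configuration. A sketch of inspecting this directly, assuming the files above:

```python
from kedro.config import OmegaConfigLoader

# With env="prod", credentials come from conf/base + conf/prod
prod_creds = OmegaConfigLoader(conf_source="conf", env="prod")["credentials"]
print(prod_creds["db_credentials"]["con"])
# postgresql://user:password@prod-server:5432/prod_db
```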
```yaml
# parameters.yml
database_url: ${oc.env:DATABASE_URL}
api_key: ${oc.env:API_KEY}
```

```yaml
# conf/base/catalog.yml
model_input:
  type: pandas.CSVDataset
  filepath: ${base_location}/input.csv
```

```yaml
# conf/local/catalog.yml
base_location: /local/data
```

```yaml
# conf/prod/catalog.yml
base_location: s3://prod-bucket/data
```
```yaml
# parameters.yml
features:
  use_new_algorithm: false
  enable_caching: true
  debug_mode: false
```

```yaml
# conf/local/parameters.yml
features:
  debug_mode: true
```

```yaml
# conf/prod/parameters.yml
features:
  use_new_algorithm: true
  enable_caching: true
```

Kedro provides a powerful parameter management system through the `params:` prefix, enabling automatic parameter injection into pipeline nodes.
Use the `params:` prefix in node inputs to automatically access parameters from your configuration:

```python
from kedro.pipeline import node


def train_model(data, model_params):
    """Train model with specified hyperparameters."""
    model = Model(
        learning_rate=model_params["learning_rate"],
        epochs=model_params["epochs"],
    )
    return model.fit(data)


# Reference entire parameter group
node(
    train_model,
    inputs=["training_data", "params:model"],
    outputs="trained_model",
)
```

```yaml
# conf/base/parameters.yml
model:
  learning_rate: 0.001
  epochs: 100
  batch_size: 32
```

When the pipeline runs, `params:model` is automatically resolved to the `model` dictionary from `parameters.yml` and passed to the node as a single argument.
Parameters are loaded and managed by `KedroContext` through the following process:

1. `OmegaConfigLoader` loads `parameters.yml` files from the configuration directory.
2. Each parameter group is registered as a `MemoryDataset` in the catalog with the `params:` prefix.
3. When a node requests `params:something`, Kedro looks up the corresponding dataset in the catalog.

```python
from kedro.framework.session import KedroSession

with KedroSession.create() as session:
    context = session.load_context()

    # Parameters are accessible through context
    all_params = context.params
    print(all_params)  # {'model': {'learning_rate': 0.001, ...}}

    # Parameters are also available in catalog with params: prefix
    model_params = context.catalog.load("params:model")
    print(model_params)  # {'learning_rate': 0.001, ...}
```

Kedro supports nested parameter access using dot notation, creating automatic datasets for each level:
```yaml
# conf/base/parameters.yml
model:
  neural_network:
    learning_rate: 0.001
    layers:
      - 128
      - 64
      - 32
  optimizer:
    type: adam
    beta1: 0.9
    beta2: 0.999
```

With this configuration, all of the following are automatically available:
```python
# Access entire model config
node(func, "params:model", "output")
# Receives: {'neural_network': {...}, 'optimizer': {...}}

# Access nested neural_network config
node(func, "params:model.neural_network", "output")
# Receives: {'learning_rate': 0.001, 'layers': [128, 64, 32]}

# Access specific parameter
node(func, "params:model.neural_network.learning_rate", "output")
# Receives: 0.001

# Access optimizer config
node(func, "params:model.optimizer", "output")
# Receives: {'type': 'adam', 'beta1': 0.9, 'beta2': 0.999}

# Access specific optimizer parameter
node(func, "params:model.optimizer.type", "output")
# Receives: "adam"
```

Important: Kedro automatically creates parameter datasets for every possible path through the nested structure. You don't need to explicitly register them.
When parameters are loaded, they follow this resolution order (later entries override earlier ones):

1. `conf/base/parameters.yml`
2. `conf/{env}/parameters.yml` (e.g., `conf/prod/parameters.yml`)
3. Runtime parameters passed at execution time

The merged result is what the `params:*` datasets expose.

```python
from kedro.config import OmegaConfigLoader

# Example: runtime parameter overrides
loader = OmegaConfigLoader(
    conf_source="conf",
    env="prod",
    runtime_params={
        "model.learning_rate": 0.01,  # Override specific nested parameter
        "model.epochs": 200,
    },
)

# Resolution order:
# 1. conf/base/parameters.yml: learning_rate = 0.001
# 2. conf/prod/parameters.yml: learning_rate = 0.005 (if it exists)
# 3. runtime_params: learning_rate = 0.01 (final value)
```
"""Preprocess with configurable settings."""
threshold = params["threshold"]
method = params["method"]
return process(data, threshold=threshold, method=method)
node(
preprocess_data,
inputs=["raw_data", "params:preprocessing"],
outputs="processed_data"
)# conf/base/parameters.yml
preprocessing:
threshold: 0.5
method: "standardize"def train_with_hyperparams(data, learning_rate, batch_size):
"""Train with specific hyperparameters."""
return train(data, lr=learning_rate, batch_size=batch_size)
# Access individual nested parameters
node(
train_with_hyperparams,
inputs=[
"training_data",
"params:model.learning_rate",
"params:model.batch_size"
],
outputs="model"
)from kedro.framework.session import KedroSession
```python
from kedro.framework.session import KedroSession

# Override parameters for this session
with KedroSession.create(
    extra_params={
        "model.learning_rate": 0.01,
        "model.epochs": 50,
        "preprocessing.threshold": 0.7,
    }
) as session:
    session.run()
```

Command line override:

```bash
kedro run --params="model.learning_rate=0.01,model.epochs=50"
```

Parameters can also be referenced in catalog configuration using variable interpolation:
```yaml
# conf/base/catalog.yml
processed_data:
  type: pandas.CSVDataset
  filepath: data/processed/output.csv
  save_args:
    index: false
    sep: ${params:file_format.separator}  # Reference parameter
```

```yaml
# conf/base/parameters.yml
file_format:
  separator: ","
  encoding: "utf-8"
```
"""Process data with multiple parameter groups."""
cleaned = clean(data, **data_params)
processed = transform(cleaned, **model_params)
return format_output(processed, **output_params)
node(
complex_processing,
inputs=[
"raw_data",
"params:data_processing",
"params:model_config",
"params:output_format"
],
outputs="final_output"
)def train_model(data, model_type, hyperparams):
"""Train with named parameter inputs."""
return train(data, model_type=model_type, **hyperparams)
# Use dict syntax for named inputs
node(
train_model,
inputs={
"data": "training_data",
"model_type": "params:model.type",
"hyperparams": "params:model.hyperparameters"
},
outputs="trained_model"
)# .gitignore
conf/local/
conf/**/credentials*
**/*credentials*# ✅ Good: Use environment variables
```yaml
# ✅ Good: Use environment variables
api_key: ${oc.env:API_KEY}

# ❌ Bad: Hardcode secrets
api_key: "secret123"
```

```text
conf/
├── base/
│   ├── catalog.yml
│   ├── parameters.yml
│   ├── spark.yml
│   └── mlflow.yml
└── prod/
    ├── catalog.yml
    ├── parameters.yml
    └── spark.yml
```
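Extra configuration files such as `spark.yml` and `mlflow.yml` are not part of Kedro's default config patterns. One way to make them loadable, sketched for a standard project's `settings.py` (the pattern names are illustrative):

```python
# settings.py
from kedro.config import OmegaConfigLoader

CONFIG_LOADER_CLASS = OmegaConfigLoader
CONFIG_LOADER_ARGS = {
    "config_patterns": {
        # Make loader["spark"] and loader["mlflow"] resolve these files
        "spark": ["spark*", "spark*/**"],
        "mlflow": ["mlflow*", "mlflow*/**"],
    }
}
```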
```yaml
# parameters.yml

# Model hyperparameters
model:
  learning_rate: 0.001  # Learning rate for gradient descent
  epochs: 100           # Number of training epochs
  batch_size: 32        # Mini-batch size
```