tessl/pypi-azure-ai-ml

Microsoft Azure Machine Learning client library for Python, providing a comprehensive SDK for ML workflows including job execution, pipeline components, model deployment, and AutoML.

docs/hyperparameter-tuning.md

Hyperparameter Tuning

Hyperparameter optimization with configurable search spaces, sampling algorithms, and early termination policies for efficient model tuning.

Capabilities

Sweep Jobs

Hyperparameter sweep jobs for optimizing model performance across parameter spaces.

class SweepJob:
    def __init__(
        self,
        *,
        trial: CommandJob,
        search_space: dict,
        objective: Objective,
        sampling_algorithm: SamplingAlgorithm = None,
        early_termination: EarlyTerminationPolicy = None,
        limits: SweepJobLimits = None,
        compute: str = None,
        **kwargs
    ):
        """
        Hyperparameter sweep job for model optimization.
        
        Parameters:
        - trial: Template command job defining the training script
        - search_space: Dictionary defining parameter search spaces
        - objective: Optimization objective and metric
        - sampling_algorithm: Parameter sampling strategy
        - early_termination: Early stopping policy
        - limits: Sweep execution limits
        - compute: Compute target for sweep trials
        """

class SweepJobLimits:
    def __init__(
        self,
        *,
        max_total_trials: int = 1,
        max_concurrent_trials: int = 1,
        timeout_minutes: int = None,
        trial_timeout_minutes: int = None
    ):
        """
        Limits for sweep job execution.
        
        Parameters:
        - max_total_trials: Maximum number of trials to run
        - max_concurrent_trials: Maximum concurrent trials
        - timeout_minutes: Total sweep timeout in minutes
        - trial_timeout_minutes: Individual trial timeout in minutes
        """

Usage Example

from azure.ai.ml import command
from azure.ai.ml.entities import SweepJob, SweepJobLimits
from azure.ai.ml.sweep import Choice, Uniform, Objective, RandomSamplingAlgorithm, BanditPolicy

# Define the training command template
command_job = command(
    code="./src",
    command="python train.py --learning_rate ${{search_space.learning_rate}} --batch_size ${{search_space.batch_size}}",
    environment="AzureML-sklearn-1.0-ubuntu20.04-py38-cpu:1",
    compute="cpu-cluster"
)

# Define search space
search_space = {
    "learning_rate": Uniform(min_value=0.001, max_value=0.1),
    "batch_size": Choice(values=[16, 32, 64, 128])
}

# Create sweep job
sweep_job = SweepJob(
    trial=command_job,
    search_space=search_space,
    objective=Objective(goal="maximize", primary_metric="accuracy"),
    sampling_algorithm=RandomSamplingAlgorithm(),
    early_termination=BanditPolicy(slack_factor=0.1, evaluation_interval=2),
    limits=SweepJobLimits(
        max_total_trials=20,
        max_concurrent_trials=4,
        timeout_minutes=120
    )
)

# Submit sweep job (assumes ml_client is an authenticated MLClient)
submitted_sweep = ml_client.jobs.create_or_update(sweep_job)

Search Space Functions

Functions for defining parameter search spaces with different distributions.

class Choice:
    def __init__(self, values: list):
        """
        Discrete choice from a list of values.
        
        Parameters:
        - values: List of possible values to choose from
        """

class Uniform:
    def __init__(self, min_value: float, max_value: float):
        """
        Uniform distribution between min and max values.
        
        Parameters:
        - min_value: Minimum value
        - max_value: Maximum value
        """

class LogUniform:
    def __init__(self, min_value: float, max_value: float):
        """
        Log-uniform distribution for parameters that span several orders of magnitude.
        
        Parameters:
        - min_value: Minimum value (must be > 0)
        - max_value: Maximum value
        """

class Normal:
    def __init__(self, mu: float, sigma: float):
        """
        Normal (Gaussian) distribution.
        
        Parameters:
        - mu: Mean of the distribution
        - sigma: Standard deviation
        """

class LogNormal:
    def __init__(self, mu: float, sigma: float):
        """
        Log-normal distribution for positive parameters.
        
        Parameters:
        - mu: Mean of the underlying normal distribution
        - sigma: Standard deviation of the underlying normal distribution
        """

class QUniform:
    def __init__(self, min_value: float, max_value: float, q: float):
        """
        Quantized uniform distribution.
        
        Parameters:
        - min_value: Minimum value
        - max_value: Maximum value
        - q: Quantization step size
        """

class QLogUniform:
    def __init__(self, min_value: float, max_value: float, q: float):
        """
        Quantized log-uniform distribution.
        
        Parameters:
        - min_value: Minimum value (must be > 0)
        - max_value: Maximum value
        - q: Quantization step size
        """

class QNormal:
    def __init__(self, mu: float, sigma: float, q: float):
        """
        Quantized normal distribution.
        
        Parameters:
        - mu: Mean of the distribution
        - sigma: Standard deviation
        - q: Quantization step size
        """

class QLogNormal:
    def __init__(self, mu: float, sigma: float, q: float):
        """
        Quantized log-normal distribution.
        
        Parameters:
        - mu: Mean of the underlying normal distribution
        - sigma: Standard deviation of the underlying normal distribution
        - q: Quantization step size
        """

class Randint:
    def __init__(self, upper: int):
        """
        Random integer from 0 to upper-1.
        
        Parameters:
        - upper: Upper bound (exclusive)
        """

Usage Example

from azure.ai.ml.sweep import Choice, Uniform, LogUniform, Normal, Randint

# Different search space examples
search_space = {
    # Discrete choices
    "optimizer": Choice(values=["adam", "sgd", "rmsprop"]),
    "activation": Choice(values=["relu", "tanh", "sigmoid"]),
    
    # Continuous ranges
    "learning_rate": LogUniform(min_value=1e-5, max_value=1e-1),
    "dropout_rate": Uniform(min_value=0.1, max_value=0.5),
    "weight_decay": LogUniform(min_value=1e-6, max_value=1e-2),
    
    # Normal distributions
    "hidden_size": Normal(mu=128, sigma=32),
    
    # Integer ranges
    "batch_size": Choice(values=[16, 32, 64, 128, 256]),
    "num_layers": Randint(upper=5)  # 0, 1, 2, 3, or 4
}
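
The quantized variants (QUniform, QLogUniform, QNormal, QLogNormal) snap draws to multiples of q, which is useful for parameters that must land on a step size, such as layer widths in multiples of 16. A plain-Python sketch of the quantization effect (illustrative only, not the SDK's internal sampler):

```python
import random

def q_uniform(min_value, max_value, q, rng):
    # Draw uniformly, then snap to the nearest multiple of q
    return round(rng.uniform(min_value, max_value) / q) * q

rng = random.Random(0)
samples = [q_uniform(16, 256, 16, rng) for _ in range(1000)]
assert all(s % 16 == 0 for s in samples)           # every draw is a multiple of q
assert 16 <= min(samples) <= max(samples) <= 256   # draws stay within the range
```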

Sampling Algorithms

Different strategies for sampling parameters from the search space.

class SamplingAlgorithm:
    """Base class for sampling algorithms."""

class RandomSamplingAlgorithm(SamplingAlgorithm):
    def __init__(self, seed: int = None):
        """
        Random sampling from the search space.
        
        Parameters:
        - seed: Random seed for reproducibility
        """

class GridSamplingAlgorithm(SamplingAlgorithm):
    def __init__(self):
        """
        Grid search over all parameter combinations.
        Note: Only works with Choice parameters.
        """

class BayesianSamplingAlgorithm(SamplingAlgorithm):
    def __init__(self):
        """
        Bayesian optimization for intelligent parameter selection.
        Uses previous trial results to guide future parameter choices.
        """

Usage Example

from azure.ai.ml.sweep import RandomSamplingAlgorithm, BayesianSamplingAlgorithm, GridSamplingAlgorithm

# Random sampling (most common)
random_sampling = RandomSamplingAlgorithm(seed=42)

# Bayesian optimization (for expensive evaluations)
bayesian_sampling = BayesianSamplingAlgorithm()

# Grid search (for small, discrete spaces)
grid_sampling = GridSamplingAlgorithm()
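
Because GridSamplingAlgorithm enumerates every combination of Choice values, the trial count is the product of the list lengths. A quick back-of-the-envelope check (pure Python, hypothetical parameter names) helps decide whether grid search is feasible before submitting:

```python
import math

# Sizes of a hypothetical all-Choice search space
choice_sizes = {
    "optimizer": 3,   # e.g. ["adam", "sgd", "rmsprop"]
    "activation": 3,  # e.g. ["relu", "tanh", "sigmoid"]
    "batch_size": 5,  # e.g. [16, 32, 64, 128, 256]
}
total_trials = math.prod(choice_sizes.values())
print(total_trials)  # 45 -- feasible here, but grids grow multiplicatively
```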

Early Termination Policies

Policies for early stopping of underperforming trials to save computational resources.

class BanditPolicy:
    def __init__(
        self,
        *,
        slack_factor: float = None,
        slack_amount: float = None,
        evaluation_interval: int = 1,
        delay_evaluation: int = 0
    ):
        """
        Bandit early termination policy based on slack criteria.
        
        Parameters:
        - slack_factor: Slack factor as a ratio (e.g., 0.1 = 10% slack)
        - slack_amount: Slack amount as absolute value
        - evaluation_interval: Frequency of policy evaluation
        - delay_evaluation: Number of intervals to delay evaluation
        """

class MedianStoppingPolicy:
    def __init__(
        self,
        *,
        evaluation_interval: int = 1,
        delay_evaluation: int = 0
    ):
        """
        Median stopping policy terminates trials whose best primary metric is worse than the median of the running averages across all trials.
        
        Parameters:
        - evaluation_interval: Frequency of policy evaluation
        - delay_evaluation: Number of intervals to delay evaluation
        """

class TruncationSelectionPolicy:
    def __init__(
        self,
        *,
        truncation_percentage: int = 10,
        evaluation_interval: int = 1,
        delay_evaluation: int = 0,
        exclude_finished_jobs: bool = False
    ):
        """
        Truncation policy terminates the lowest-performing percentage of trials at each evaluation interval.
        
        Parameters:
        - truncation_percentage: Percentage of lowest-performing trials to terminate
        - evaluation_interval: Frequency of policy evaluation
        - delay_evaluation: Number of intervals to delay evaluation
        - exclude_finished_jobs: Whether to exclude finished jobs from evaluation
        """

Usage Example

from azure.ai.ml.sweep import BanditPolicy, MedianStoppingPolicy, TruncationSelectionPolicy

# Conservative bandit policy (10% slack)
bandit_policy = BanditPolicy(
    slack_factor=0.1,
    evaluation_interval=2,
    delay_evaluation=5
)

# Median stopping policy
median_policy = MedianStoppingPolicy(
    evaluation_interval=1,
    delay_evaluation=10
)

# Aggressive truncation policy (terminate bottom 20%)
truncation_policy = TruncationSelectionPolicy(
    truncation_percentage=20,
    evaluation_interval=1,
    delay_evaluation=5
)
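
To make the truncation behavior concrete, here is an illustrative pure-Python ranking step (a sketch of the described behavior, not the service's actual implementation): at each evaluation interval the policy ranks running trials by the primary metric and flags the worst truncation_percentage for termination.

```python
def trials_to_stop(metrics, truncation_percentage, goal="maximize"):
    """Flag the worst-performing fraction of running trials.

    metrics: mapping of trial name -> latest primary-metric value.
    """
    n_stop = int(len(metrics) * truncation_percentage / 100)
    # For "maximize", the worst trials have the lowest metric values
    ranked = sorted(metrics, key=metrics.get, reverse=(goal == "minimize"))
    return ranked[:n_stop]

running = {"trial_a": 0.91, "trial_b": 0.62, "trial_c": 0.88,
           "trial_d": 0.55, "trial_e": 0.79}
print(trials_to_stop(running, 40))  # ['trial_d', 'trial_b']
```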

Optimization Objectives

Definition of optimization goals and metrics for hyperparameter tuning.

class Objective:
    def __init__(
        self,
        *,
        goal: str,
        primary_metric: str
    ):
        """
        Optimization objective for hyperparameter tuning.
        
        Parameters:
        - goal: Optimization goal ("maximize" or "minimize")
        - primary_metric: Name of the metric to optimize
        """

Usage Example

from azure.ai.ml.sweep import Objective

# Maximize accuracy
accuracy_objective = Objective(
    goal="maximize",
    primary_metric="accuracy"
)

# Minimize loss
loss_objective = Objective(
    goal="minimize",
    primary_metric="loss"
)

# Maximize F1 score
f1_objective = Objective(
    goal="maximize",
    primary_metric="f1_score"
)

Complete Sweep Example

from azure.ai.ml import command
from azure.ai.ml.entities import SweepJob, SweepJobLimits, Environment
from azure.ai.ml.sweep import (
    Choice, Uniform, LogUniform,
    RandomSamplingAlgorithm, BayesianSamplingAlgorithm,
    BanditPolicy, Objective
)

# Define training command template
training_job = command(
    code="./src",
    command="python train.py --lr ${{search_space.learning_rate}} --batch_size ${{search_space.batch_size}} --optimizer ${{search_space.optimizer}}",
    environment=Environment(
        image="mcr.microsoft.com/azureml/sklearn-1.0-ubuntu20.04-py38-cpu-inference:latest"
    ),
    compute="cpu-cluster",
    outputs={
        "model": {"type": "uri_folder", "path": "azureml://datastores/workspaceblobstore/paths/models/"}
    }
)

# Define comprehensive search space
search_space = {
    "learning_rate": LogUniform(min_value=1e-4, max_value=1e-1),
    "batch_size": Choice(values=[32, 64, 128, 256]),
    "optimizer": Choice(values=["adam", "sgd", "adamw"]),
    "weight_decay": LogUniform(min_value=1e-6, max_value=1e-2),
    "num_epochs": Choice(values=[10, 20, 30, 50])
}

# Create sweep job with Bayesian optimization
sweep_job = SweepJob(
    trial=training_job,
    search_space=search_space,
    objective=Objective(goal="maximize", primary_metric="val_accuracy"),
    sampling_algorithm=BayesianSamplingAlgorithm(),
    early_termination=BanditPolicy(
        slack_factor=0.15,
        evaluation_interval=2,
        delay_evaluation=10
    ),
    limits=SweepJobLimits(
        max_total_trials=50,
        max_concurrent_trials=5,
        timeout_minutes=300,
        trial_timeout_minutes=30
    ),
    experiment_name="hyperparameter-sweep"
)

# Submit and monitor sweep
submitted_sweep = ml_client.jobs.create_or_update(sweep_job)
print(f"Sweep job submitted: {submitted_sweep.name}")

# Monitor sweep progress
print(f"Sweep job URL: {submitted_sweep.studio_url}")

Best Practices

Search Space Design

  • Use log scales for learning rates and regularization parameters
  • Start with broad ranges and narrow down based on results
  • Use Choice for categorical parameters and discrete values
  • Consider parameter interactions when designing spaces
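
To see why a log scale matters for learning rates, compare uniform and log-uniform sampling over [1e-5, 1e-1] in plain Python (illustrative, not the SDK's sampler). Uniform sampling almost never visits the small end of the range, while log-uniform spreads draws evenly across orders of magnitude:

```python
import math
import random

rng = random.Random(0)

def log_uniform(min_value, max_value, rng):
    # Sample uniformly in log space, then exponentiate
    return math.exp(rng.uniform(math.log(min_value), math.log(max_value)))

N = 10_000
uniform_draws = [rng.uniform(1e-5, 1e-1) for _ in range(N)]
log_draws = [log_uniform(1e-5, 1e-1, rng) for _ in range(N)]

# Fraction of draws below 1e-3 (the log-midpoint of the range)
frac_uniform = sum(x < 1e-3 for x in uniform_draws) / N
frac_log = sum(x < 1e-3 for x in log_draws) / N
print(frac_uniform)  # ~0.01
print(frac_log)      # ~0.5
```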

Sampling Strategy Selection

  • Random sampling: Good default choice, works well with early termination
  • Bayesian optimization: Better for expensive evaluations, fewer trials needed
  • Grid search: Only for small discrete spaces with few parameters

Early Termination Guidelines

  • BanditPolicy: Most flexible, good for most scenarios
  • MedianStoppingPolicy: Conservative, good for stable metrics
  • TruncationSelectionPolicy: Aggressive, good when resources are limited

Resource Management

  • Set appropriate max_concurrent_trials based on compute availability
  • Use trial_timeout_minutes to prevent stuck trials
  • Consider total cost when setting max_total_trials
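
When sizing limits, a rough lower bound on wall-clock time is the number of trial "waves" times the average trial duration; a quick sanity check with hypothetical numbers shows how to validate timeout_minutes against it:

```python
import math

max_total_trials = 50
max_concurrent_trials = 5
avg_trial_minutes = 20  # hypothetical average trial duration

# Trials run in waves of at most max_concurrent_trials
waves = math.ceil(max_total_trials / max_concurrent_trials)
min_wall_clock = waves * avg_trial_minutes
print(min_wall_clock)  # 200 -- a timeout_minutes below this guarantees unfinished trials
```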

Install with Tessl CLI

npx tessl i tessl/pypi-azure-ai-ml
