# Hyperparameter Tuning

Advanced hyperparameter optimization with configurable search spaces, sampling algorithms, and early termination policies for efficient model tuning.

## Capabilities

### Sweep Jobs

Hyperparameter sweep jobs for optimizing model performance across parameter spaces.

```python { .api }
class SweepJob:
    def __init__(
        self,
        *,
        trial: CommandJob,
        search_space: dict,
        objective: Objective,
        sampling_algorithm: SamplingAlgorithm = None,
        early_termination: EarlyTerminationPolicy = None,
        limits: SweepJobLimits = None,
        compute: str = None,
        **kwargs
    ):
        """
        Hyperparameter sweep job for model optimization.

        Parameters:
        - trial: Template command job defining the training script
        - search_space: Dictionary defining parameter search spaces
        - objective: Optimization objective and metric
        - sampling_algorithm: Parameter sampling strategy
        - early_termination: Early stopping policy
        - limits: Sweep execution limits
        - compute: Compute target for sweep trials
        """

class SweepJobLimits:
    def __init__(
        self,
        *,
        max_total_trials: int = 1,
        max_concurrent_trials: int = 1,
        timeout_minutes: int = None,
        trial_timeout_minutes: int = None
    ):
        """
        Limits for sweep job execution.

        Parameters:
        - max_total_trials: Maximum number of trials to run
        - max_concurrent_trials: Maximum number of trials to run concurrently
        - timeout_minutes: Total sweep timeout in minutes
        - trial_timeout_minutes: Individual trial timeout in minutes
        """
```
#### Usage Example

```python
from azure.ai.ml import command
from azure.ai.ml.entities import SweepJob, SweepJobLimits
from azure.ai.ml.sweep import Choice, Uniform, Objective, RandomSamplingAlgorithm, BanditPolicy

# Define the training command template
command_job = command(
    code="./src",
    command="python train.py --learning_rate ${{search_space.learning_rate}} --batch_size ${{search_space.batch_size}}",
    environment="AzureML-sklearn-1.0-ubuntu20.04-py38-cpu:1",
    compute="cpu-cluster"
)

# Define the search space
search_space = {
    "learning_rate": Uniform(min_value=0.001, max_value=0.1),
    "batch_size": Choice(values=[16, 32, 64, 128])
}

# Create the sweep job
sweep_job = SweepJob(
    trial=command_job,
    search_space=search_space,
    objective=Objective(goal="maximize", primary_metric="accuracy"),
    sampling_algorithm=RandomSamplingAlgorithm(),
    early_termination=BanditPolicy(slack_factor=0.1, evaluation_interval=2),
    limits=SweepJobLimits(
        max_total_trials=20,
        max_concurrent_trials=4,
        timeout_minutes=120
    )
)

# Submit the sweep job (assumes an authenticated MLClient instance `ml_client`)
submitted_sweep = ml_client.jobs.create_or_update(sweep_job)
```
### Search Space Functions

Functions for defining parameter search spaces with different distributions.

```python { .api }
class Choice:
    def __init__(self, values: list):
        """
        Discrete choice from a list of values.

        Parameters:
        - values: List of possible values to choose from
        """

class Uniform:
    def __init__(self, min_value: float, max_value: float):
        """
        Uniform distribution between min and max values.

        Parameters:
        - min_value: Minimum value
        - max_value: Maximum value
        """

class LogUniform:
    def __init__(self, min_value: float, max_value: float):
        """
        Log-uniform distribution for parameters that vary exponentially.

        Parameters:
        - min_value: Minimum value (must be > 0)
        - max_value: Maximum value
        """

class Normal:
    def __init__(self, mu: float, sigma: float):
        """
        Normal (Gaussian) distribution.

        Parameters:
        - mu: Mean of the distribution
        - sigma: Standard deviation
        """

class LogNormal:
    def __init__(self, mu: float, sigma: float):
        """
        Log-normal distribution for positive parameters.

        Parameters:
        - mu: Mean of the underlying normal distribution
        - sigma: Standard deviation of the underlying normal distribution
        """

class QUniform:
    def __init__(self, min_value: float, max_value: float, q: float):
        """
        Quantized uniform distribution.

        Parameters:
        - min_value: Minimum value
        - max_value: Maximum value
        - q: Quantization step size
        """

class QLogUniform:
    def __init__(self, min_value: float, max_value: float, q: float):
        """
        Quantized log-uniform distribution.

        Parameters:
        - min_value: Minimum value (must be > 0)
        - max_value: Maximum value
        - q: Quantization step size
        """

class QNormal:
    def __init__(self, mu: float, sigma: float, q: float):
        """
        Quantized normal distribution.

        Parameters:
        - mu: Mean of the distribution
        - sigma: Standard deviation
        - q: Quantization step size
        """

class QLogNormal:
    def __init__(self, mu: float, sigma: float, q: float):
        """
        Quantized log-normal distribution.

        Parameters:
        - mu: Mean of the underlying normal distribution
        - sigma: Standard deviation of the underlying normal distribution
        - q: Quantization step size
        """

class Randint:
    def __init__(self, upper: int):
        """
        Random integer from 0 to upper-1.

        Parameters:
        - upper: Upper bound (exclusive)
        """
```
#### Usage Example

```python
from azure.ai.ml.sweep import Choice, Uniform, LogUniform, Normal, Randint

# Examples of different search space definitions
search_space = {
    # Discrete choices
    "optimizer": Choice(values=["adam", "sgd", "rmsprop"]),
    "activation": Choice(values=["relu", "tanh", "sigmoid"]),

    # Continuous ranges
    "learning_rate": LogUniform(min_value=1e-5, max_value=1e-1),
    "dropout_rate": Uniform(min_value=0.1, max_value=0.5),
    "weight_decay": LogUniform(min_value=1e-6, max_value=1e-2),

    # Normal distributions
    "hidden_size": Normal(mu=128, sigma=32),

    # Integer ranges
    "batch_size": Choice(values=[16, 32, 64, 128, 256]),
    "num_layers": Randint(upper=5)  # 0, 1, 2, 3, or 4
}
```
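For intuition on the quantized variants (`QUniform`, `QLogUniform`, and friends): they snap each sampled value to a multiple of the step size `q`, which is useful for parameters that must land on a grid, such as layer widths. The sketch below illustrates the conventional hyperopt-style quantization rule, `round(sample / q) * q` — an assumption about the semantics, not the service's exact implementation:

```python
import math
import random

def quniform(min_value: float, max_value: float, q: float) -> float:
    """Sample uniformly, then snap to the nearest multiple of q."""
    sample = random.uniform(min_value, max_value)
    return round(sample / q) * q

def qloguniform(min_value: float, max_value: float, q: float) -> float:
    """Sample log-uniformly between two positive bounds, then quantize."""
    sample = math.exp(random.uniform(math.log(min_value), math.log(max_value)))
    return round(sample / q) * q

# Every draw lands on a multiple of q, e.g. candidate hidden sizes 0, 16, 32, ...
draws = [quniform(8, 512, 16) for _ in range(1000)]
assert all(d % 16 == 0 for d in draws)
```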
### Sampling Algorithms

Different strategies for sampling parameters from the search space.

```python { .api }
class SamplingAlgorithm:
    """Base class for sampling algorithms."""

class RandomSamplingAlgorithm(SamplingAlgorithm):
    def __init__(self, seed: int = None):
        """
        Random sampling from the search space.

        Parameters:
        - seed: Random seed for reproducibility
        """

class GridSamplingAlgorithm(SamplingAlgorithm):
    def __init__(self):
        """
        Grid search over all parameter combinations.
        Note: Only works with Choice parameters.
        """

class BayesianSamplingAlgorithm(SamplingAlgorithm):
    def __init__(self):
        """
        Bayesian optimization for intelligent parameter selection.
        Uses previous trial results to guide future parameter choices.
        """
```
#### Usage Example

```python
from azure.ai.ml.sweep import RandomSamplingAlgorithm, BayesianSamplingAlgorithm, GridSamplingAlgorithm

# Random sampling (most common)
random_sampling = RandomSamplingAlgorithm(seed=42)

# Bayesian optimization (for expensive evaluations)
bayesian_sampling = BayesianSamplingAlgorithm()

# Grid search (for small, discrete spaces)
grid_sampling = GridSamplingAlgorithm()
```
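Because grid sampling enumerates the full Cartesian product of the `Choice` values, the trial count grows multiplicatively with each parameter. A quick way to sanity-check the grid size before submitting (plain Python, independent of the SDK; the parameter values here are illustrative):

```python
from itertools import product

# Grid sampling requires every parameter to be a discrete Choice list
grid_space = {
    "optimizer": ["adam", "sgd", "rmsprop"],
    "batch_size": [32, 64, 128],
    "num_layers": [2, 3],
}

# Total trials a grid sweep would run: 3 * 3 * 2 = 18
combinations = list(product(*grid_space.values()))
print(len(combinations))  # 18
```

If the product exceeds your trial budget, switch to random or Bayesian sampling instead of pruning the grid by hand.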
275
276
### Early Termination Policies
277
278
Policies for early stopping of underperforming trials to save computational resources.
279
280
```python { .api }
281
class BanditPolicy:
282
def __init__(
283
self,
284
*,
285
slack_factor: float = None,
286
slack_amount: float = None,
287
evaluation_interval: int = 1,
288
delay_evaluation: int = 0
289
):
290
"""
291
Bandit early termination policy based on slack criteria.
292
293
Parameters:
294
- slack_factor: Slack factor as a ratio (e.g., 0.1 = 10% slack)
295
- slack_amount: Slack amount as absolute value
296
- evaluation_interval: Frequency of policy evaluation
297
- delay_evaluation: Number of intervals to delay evaluation
298
"""
299
300
class MedianStoppingPolicy:
301
def __init__(
302
self,
303
*,
304
evaluation_interval: int = 1,
305
delay_evaluation: int = 0
306
):
307
"""
308
Median stopping policy terminates trials performing worse than median.
309
310
Parameters:
311
- evaluation_interval: Frequency of policy evaluation
312
- delay_evaluation: Number of intervals to delay evaluation
313
"""
314
315
class TruncationSelectionPolicy:
316
def __init__(
317
self,
318
*,
319
truncation_percentage: int = 10,
320
evaluation_interval: int = 1,
321
delay_evaluation: int = 0,
322
exclude_finished_jobs: bool = False
323
):
324
"""
325
Truncation policy terminates a percentage of worst performing trials.
326
327
Parameters:
328
- truncation_percentage: Percentage of trials to terminate
329
- evaluation_interval: Frequency of policy evaluation
330
- delay_evaluation: Number of intervals to delay evaluation
331
- exclude_finished_jobs: Whether to exclude finished jobs from evaluation
332
"""
333
```
#### Usage Example

```python
from azure.ai.ml.sweep import BanditPolicy, MedianStoppingPolicy, TruncationSelectionPolicy

# Conservative bandit policy (10% slack)
bandit_policy = BanditPolicy(
    slack_factor=0.1,
    evaluation_interval=2,
    delay_evaluation=5
)

# Median stopping policy
median_policy = MedianStoppingPolicy(
    evaluation_interval=1,
    delay_evaluation=10
)

# Aggressive truncation policy (terminate the bottom 20%)
truncation_policy = TruncationSelectionPolicy(
    truncation_percentage=20,
    evaluation_interval=1,
    delay_evaluation=5
)
```
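To make the bandit cutoff concrete: with a maximize goal and `slack_factor`, a trial is terminated when its primary metric falls below `best_metric / (1 + slack_factor)` at an evaluation interval. The arithmetic below is an illustrative sketch of that documented rule, not the service's code:

```python
def bandit_cutoff(best_metric: float, slack_factor: float) -> float:
    """Threshold below which a trial is terminated (maximize goal)."""
    return best_metric / (1 + slack_factor)

# With slack_factor=0.1 and a best accuracy of 0.88 so far,
# trials reporting below 0.8 get stopped early.
cutoff = bandit_cutoff(0.88, 0.1)
print(round(cutoff, 3))  # 0.8
```

A larger `slack_factor` lowers the cutoff, so more trials survive; `delay_evaluation` postpones the first check so noisy early metrics don't kill promising trials.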
### Optimization Objectives

Definition of optimization goals and metrics for hyperparameter tuning.

```python { .api }
class Objective:
    def __init__(
        self,
        *,
        goal: str,
        primary_metric: str
    ):
        """
        Optimization objective for hyperparameter tuning.

        Parameters:
        - goal: Optimization goal ("maximize" or "minimize")
        - primary_metric: Name of the metric to optimize
        """
```
#### Usage Example

```python
from azure.ai.ml.sweep import Objective

# Maximize accuracy
accuracy_objective = Objective(
    goal="maximize",
    primary_metric="accuracy"
)

# Minimize loss
loss_objective = Objective(
    goal="minimize",
    primary_metric="loss"
)

# Maximize F1 score
f1_objective = Objective(
    goal="maximize",
    primary_metric="f1_score"
)
```
### Complete Sweep Example

```python
from azure.ai.ml import command
from azure.ai.ml.entities import SweepJob, SweepJobLimits, Environment
from azure.ai.ml.sweep import (
    Choice, Uniform, LogUniform,
    RandomSamplingAlgorithm, BayesianSamplingAlgorithm,
    BanditPolicy, Objective
)

# Define the training command template
training_job = command(
    code="./src",
    command="python train.py --lr ${{search_space.learning_rate}} --batch_size ${{search_space.batch_size}} --optimizer ${{search_space.optimizer}}",
    environment=Environment(
        image="mcr.microsoft.com/azureml/sklearn-1.0-ubuntu20.04-py38-cpu-inference:latest"
    ),
    compute="cpu-cluster",
    outputs={
        "model": {"type": "uri_folder", "path": "azureml://datastores/workspaceblobstore/paths/models/"}
    }
)

# Define a comprehensive search space
search_space = {
    "learning_rate": LogUniform(min_value=1e-4, max_value=1e-1),
    "batch_size": Choice(values=[32, 64, 128, 256]),
    "optimizer": Choice(values=["adam", "sgd", "adamw"]),
    "weight_decay": LogUniform(min_value=1e-6, max_value=1e-2),
    "num_epochs": Choice(values=[10, 20, 30, 50])
}

# Create the sweep job. Random sampling is used here because Bayesian
# sampling does not support early termination policies; drop the
# early_termination argument if you switch to BayesianSamplingAlgorithm.
sweep_job = SweepJob(
    trial=training_job,
    search_space=search_space,
    objective=Objective(goal="maximize", primary_metric="val_accuracy"),
    sampling_algorithm=RandomSamplingAlgorithm(seed=42),
    early_termination=BanditPolicy(
        slack_factor=0.15,
        evaluation_interval=2,
        delay_evaluation=10
    ),
    limits=SweepJobLimits(
        max_total_trials=50,
        max_concurrent_trials=5,
        timeout_minutes=300,
        trial_timeout_minutes=30
    ),
    experiment_name="hyperparameter-sweep"
)

# Submit and monitor the sweep (assumes an authenticated MLClient instance `ml_client`)
submitted_sweep = ml_client.jobs.create_or_update(sweep_job)
print(f"Sweep job submitted: {submitted_sweep.name}")
print(f"Sweep job URL: {submitted_sweep.studio_url}")
```
## Best Practices

### Search Space Design
- Use log scales for learning rates and regularization parameters
- Start with broad ranges and narrow down based on results
- Use Choice for categorical parameters and discrete values
- Consider parameter interactions when designing spaces
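To see why log scales matter for learning rates: sampling `Uniform(1e-5, 1e-1)` puts roughly 90% of draws above 1e-2, so the small learning rates are barely explored, while a log-uniform draw spreads samples evenly across orders of magnitude. A small illustration in plain Python, assuming the conventional log-uniform definition (exponentiate a uniform draw on the log scale):

```python
import math
import random

random.seed(7)

def loguniform(low: float, high: float) -> float:
    """Sample evenly across orders of magnitude between two positive bounds."""
    return math.exp(random.uniform(math.log(low), math.log(high)))

linear = [random.uniform(1e-5, 1e-1) for _ in range(10_000)]
logscale = [loguniform(1e-5, 1e-1) for _ in range(10_000)]

# Fraction of samples above 1e-2: ~0.90 for linear, ~0.25 for log-uniform
frac_linear = sum(x > 1e-2 for x in linear) / len(linear)
frac_log = sum(x > 1e-2 for x in logscale) / len(logscale)
print(frac_linear, frac_log)
```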
### Sampling Strategy Selection
- **Random sampling**: Good default choice, works well with early termination
- **Bayesian optimization**: Better for expensive evaluations, fewer trials needed; does not support early termination policies
- **Grid search**: Only for small discrete spaces with few parameters

### Early Termination Guidelines
- **BanditPolicy**: Most flexible, good for most scenarios
- **MedianStoppingPolicy**: Conservative, good for stable metrics
- **TruncationSelectionPolicy**: Aggressive, good when resources are limited

### Resource Management
- Set appropriate `max_concurrent_trials` based on compute availability
- Use `trial_timeout_minutes` to prevent stuck trials
- Consider total cost when setting `max_total_trials`
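A rough budgeting sketch to tie these limits together: if trials run in parallel batches, the worst-case wall-clock time is about `ceil(max_total_trials / max_concurrent_trials) * trial_timeout_minutes`. This is illustrative back-of-the-envelope arithmetic only; early termination and short trials usually finish a sweep well under this bound.

```python
import math

def sweep_wallclock_upper_bound(max_total_trials: int,
                                max_concurrent_trials: int,
                                trial_timeout_minutes: int) -> int:
    """Worst-case minutes if every trial runs to its per-trial timeout."""
    batches = math.ceil(max_total_trials / max_concurrent_trials)
    return batches * trial_timeout_minutes

# The limits from the complete example above:
# 50 trials, 5 concurrent, 30 min each -> 10 batches * 30 min = 300 min
upper_bound = sweep_wallclock_upper_bound(50, 5, 30)
print(upper_bound)  # 300
```

If this bound exceeds `timeout_minutes`, the sweep may hit the overall timeout before exhausting `max_total_trials`, so size the two limits together.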