
# AutoML

Automated machine learning capabilities for tabular data (classification, regression, forecasting), computer vision tasks (image classification, object detection, instance segmentation), and natural language processing (text classification, named entity recognition).

## Capabilities

### Tabular AutoML

Automated ML for structured data with classification, regression, and forecasting tasks.

```python { .api }
def classification(
    *,
    target_column_name: str,
    training_data: Data,
    validation_data: Data = None,
    test_data: Data = None,
    primary_metric: str = "accuracy",
    featurization: TabularFeaturizationSettings = None,
    limits: TabularLimitSettings = None,
    training_settings: TrainingSettings = None,
    **kwargs
) -> ClassificationJob:
    """
    Create an automated classification job for tabular data.

    Parameters:
    - target_column_name: Name of the target column
    - training_data: Training dataset
    - validation_data: Validation dataset (optional)
    - test_data: Test dataset (optional)
    - primary_metric: Primary metric for optimization
    - featurization: Feature engineering settings
    - limits: Training limits and constraints
    - training_settings: Training configuration

    Returns:
        ClassificationJob configured for automated ML
    """

def regression(
    *,
    target_column_name: str,
    training_data: Data,
    validation_data: Data = None,
    test_data: Data = None,
    primary_metric: str = "normalized_root_mean_squared_error",
    featurization: TabularFeaturizationSettings = None,
    limits: TabularLimitSettings = None,
    training_settings: TrainingSettings = None,
    **kwargs
) -> RegressionJob:
    """
    Create an automated regression job for tabular data.

    Parameters:
    - target_column_name: Name of the target column
    - training_data: Training dataset
    - validation_data: Validation dataset (optional)
    - test_data: Test dataset (optional)
    - primary_metric: Primary metric for optimization
    - featurization: Feature engineering settings
    - limits: Training limits and constraints
    - training_settings: Training configuration

    Returns:
        RegressionJob configured for automated ML
    """

def forecasting(
    *,
    target_column_name: str,
    training_data: Data,
    validation_data: Data = None,
    test_data: Data = None,
    primary_metric: str = "normalized_root_mean_squared_error",
    forecasting_settings: ForecastingSettings,
    featurization: TabularFeaturizationSettings = None,
    limits: TabularLimitSettings = None,
    training_settings: TrainingSettings = None,
    **kwargs
) -> ForecastingJob:
    """
    Create an automated forecasting job for time series data.

    Parameters:
    - target_column_name: Name of the target column
    - training_data: Training dataset
    - validation_data: Validation dataset (optional)
    - test_data: Test dataset (optional)
    - primary_metric: Primary metric for optimization
    - forecasting_settings: Time series specific settings
    - featurization: Feature engineering settings
    - limits: Training limits and constraints
    - training_settings: Training configuration

    Returns:
        ForecastingJob configured for automated ML
    """
```

#### Usage Example

```python
from azure.ai.ml import automl
from azure.ai.ml.entities import Data

# Create training data asset
training_data = Data(
    name="classification-data",
    path="./data/train.csv",
    type="mltable"
)

# Create classification job
classification_job = automl.classification(
    target_column_name="target",
    training_data=training_data,
    primary_metric="accuracy",
    compute="cpu-cluster",
    experiment_name="automl-classification"
)

# Submit the job
submitted_job = ml_client.jobs.create_or_update(classification_job)
```

### Computer Vision AutoML

Automated ML for image-based tasks including classification, object detection, and instance segmentation.

```python { .api }
def image_classification(
    *,
    target_column_name: str,
    training_data: Data,
    validation_data: Data = None,
    primary_metric: str = "accuracy",
    limits: ImageLimitSettings = None,
    sweep_settings: ImageSweepSettings = None,
    model_settings: ImageModelSettingsClassification = None,
    **kwargs
) -> ImageClassificationJob:
    """
    Create an automated image classification job.

    Parameters:
    - target_column_name: Name of the target column
    - training_data: Training dataset with images and labels
    - validation_data: Validation dataset (optional)
    - primary_metric: Primary metric for optimization
    - limits: Training limits and constraints
    - sweep_settings: Hyperparameter sweep settings
    - model_settings: Model-specific settings

    Returns:
        ImageClassificationJob configured for automated ML
    """

def image_classification_multilabel(
    *,
    target_column_name: str,
    training_data: Data,
    validation_data: Data = None,
    primary_metric: str = "iou",
    **kwargs
) -> ImageClassificationMultilabelJob:
    """
    Create an automated multi-label image classification job.

    Parameters:
    - target_column_name: Name of the target column
    - training_data: Training dataset
    - validation_data: Validation dataset (optional)
    - primary_metric: Primary metric for optimization

    Returns:
        ImageClassificationMultilabelJob configured for automated ML
    """

def image_object_detection(
    *,
    target_column_name: str,
    training_data: Data,
    validation_data: Data = None,
    primary_metric: str = "mean_average_precision",
    limits: ImageLimitSettings = None,
    sweep_settings: ImageSweepSettings = None,
    model_settings: ImageModelSettingsObjectDetection = None,
    **kwargs
) -> ImageObjectDetectionJob:
    """
    Create an automated object detection job.

    Parameters:
    - target_column_name: Name of the target column
    - training_data: Training dataset with images and bounding boxes
    - validation_data: Validation dataset (optional)
    - primary_metric: Primary metric for optimization
    - limits: Training limits and constraints
    - sweep_settings: Hyperparameter sweep settings
    - model_settings: Model-specific settings

    Returns:
        ImageObjectDetectionJob configured for automated ML
    """

def image_instance_segmentation(
    *,
    target_column_name: str,
    training_data: Data,
    validation_data: Data = None,
    primary_metric: str = "mean_average_precision",
    **kwargs
) -> ImageInstanceSegmentationJob:
    """
    Create an automated instance segmentation job.

    Parameters:
    - target_column_name: Name of the target column
    - training_data: Training dataset with images and segmentation masks
    - validation_data: Validation dataset (optional)
    - primary_metric: Primary metric for optimization

    Returns:
        ImageInstanceSegmentationJob configured for automated ML
    """
```

### Natural Language Processing AutoML

Automated ML for text-based tasks including classification and named entity recognition.

```python { .api }
def text_classification(
    *,
    target_column_name: str,
    training_data: Data,
    validation_data: Data = None,
    primary_metric: str = "accuracy",
    featurization: NlpFeaturizationSettings = None,
    limits: NlpLimitSettings = None,
    sweep_settings: NlpSweepSettings = None,
    **kwargs
) -> TextClassificationJob:
    """
    Create an automated text classification job.

    Parameters:
    - target_column_name: Name of the target column
    - training_data: Training dataset with text and labels
    - validation_data: Validation dataset (optional)
    - primary_metric: Primary metric for optimization
    - featurization: NLP feature engineering settings
    - limits: Training limits and constraints
    - sweep_settings: Hyperparameter sweep settings

    Returns:
        TextClassificationJob configured for automated ML
    """

def text_classification_multilabel(
    *,
    target_column_name: str,
    training_data: Data,
    validation_data: Data = None,
    primary_metric: str = "accuracy",
    **kwargs
) -> TextClassificationMultilabelJob:
    """
    Create an automated multi-label text classification job.

    Parameters:
    - target_column_name: Name of the target column
    - training_data: Training dataset
    - validation_data: Validation dataset (optional)
    - primary_metric: Primary metric for optimization

    Returns:
        TextClassificationMultilabelJob configured for automated ML
    """

def text_ner(
    *,
    target_column_name: str,
    training_data: Data,
    validation_data: Data = None,
    primary_metric: str = "accuracy",
    **kwargs
) -> TextNerJob:
    """
    Create an automated named entity recognition job.

    Parameters:
    - target_column_name: Name of the target column
    - training_data: Training dataset with text and entity labels
    - validation_data: Validation dataset (optional)
    - primary_metric: Primary metric for optimization

    Returns:
        TextNerJob configured for automated ML
    """
```

### AutoML Job Classes


```python { .api }
class ClassificationJob:
    def __init__(
        self,
        *,
        target_column_name: str,
        training_data: Data,
        primary_metric: str = "accuracy",
        **kwargs
    ):
        """Tabular classification AutoML job."""

class RegressionJob:
    def __init__(
        self,
        *,
        target_column_name: str,
        training_data: Data,
        primary_metric: str = "normalized_root_mean_squared_error",
        **kwargs
    ):
        """Tabular regression AutoML job."""

class ForecastingJob:
    def __init__(
        self,
        *,
        target_column_name: str,
        training_data: Data,
        forecasting_settings: ForecastingSettings,
        primary_metric: str = "normalized_root_mean_squared_error",
        **kwargs
    ):
        """Time series forecasting AutoML job."""

class ImageClassificationJob:
    def __init__(
        self,
        *,
        target_column_name: str,
        training_data: Data,
        primary_metric: str = "accuracy",
        **kwargs
    ):
        """Image classification AutoML job."""

class TextClassificationJob:
    def __init__(
        self,
        *,
        target_column_name: str,
        training_data: Data,
        primary_metric: str = "accuracy",
        **kwargs
    ):
        """Text classification AutoML job."""
```

### Configuration Classes


```python { .api }
class TrainingSettings:
    def __init__(
        self,
        *,
        enable_onnx_compatible_models: bool = False,
        enable_dnn_training: bool = False,
        enable_model_explainability: bool = True,
        enable_stack_ensemble: bool = True,
        enable_vote_ensemble: bool = True,
        stack_ensemble_settings: StackEnsembleSettings = None,
        blocked_training_algorithms: list = None,
        allowed_training_algorithms: list = None
    ):
        """
        Training configuration for AutoML jobs.

        Parameters:
        - enable_onnx_compatible_models: Enable ONNX model generation
        - enable_dnn_training: Enable deep neural network training
        - enable_model_explainability: Enable model explanations
        - enable_stack_ensemble: Enable stack ensemble models
        - enable_vote_ensemble: Enable vote ensemble models
        - stack_ensemble_settings: Stack ensemble configuration
        - blocked_training_algorithms: Algorithms to exclude
        - allowed_training_algorithms: Algorithms to include
        """

class TabularFeaturizationSettings:
    def __init__(
        self,
        *,
        mode: str = "auto",
        transformer_params: dict = None,
        column_name_and_types: dict = None,
        dataset_language: str = "eng",
        blocked_transformers: list = None
    ):
        """
        Feature engineering settings for tabular data.

        Parameters:
        - mode: Featurization mode ("auto", "custom", "off")
        - transformer_params: Custom transformer parameters
        - column_name_and_types: Column data types
        - dataset_language: Dataset language for text features
        - blocked_transformers: Transformers to exclude
        """

class TabularLimitSettings:
    def __init__(
        self,
        *,
        max_trials: int = 1000,
        max_concurrent_trials: int = None,
        max_cores_per_trial: int = None,
        trial_timeout_minutes: int = None,
        experiment_timeout_minutes: int = None,
        enable_early_termination: bool = True
    ):
        """
        Training limits for tabular AutoML.

        Parameters:
        - max_trials: Maximum number of trials
        - max_concurrent_trials: Maximum concurrent trials
        - max_cores_per_trial: Maximum cores per trial
        - trial_timeout_minutes: Timeout per trial in minutes
        - experiment_timeout_minutes: Total experiment timeout
        - enable_early_termination: Enable early stopping
        """

class ForecastingSettings:
    def __init__(
        self,
        *,
        time_column_name: str,
        forecast_horizon: int,
        time_series_id_column_names: list = None,
        frequency: str = None,
        target_lags: list = None,
        target_rolling_window_size: int = None,
        country_or_region_for_holidays: str = None,
        use_stl: str = None
    ):
        """
        Time series forecasting specific settings.

        Parameters:
        - time_column_name: Name of the time column
        - forecast_horizon: Number of periods to forecast
        - time_series_id_column_names: Columns identifying time series
        - frequency: Data frequency (D, H, M, etc.)
        - target_lags: Lag values for target variable
        - target_rolling_window_size: Rolling window size
        - country_or_region_for_holidays: Holiday calendar region
        - use_stl: STL decomposition usage
        """

class ImageLimitSettings:
    def __init__(
        self,
        *,
        max_trials: int = 1,
        max_concurrent_trials: int = 1,
        timeout_minutes: int = None
    ):
        """
        Training limits for image AutoML.

        Parameters:
        - max_trials: Maximum number of trials
        - max_concurrent_trials: Maximum concurrent trials
        - timeout_minutes: Total timeout in minutes
        """

class NlpLimitSettings:
    def __init__(
        self,
        *,
        max_trials: int = 1,
        max_concurrent_trials: int = 1,
        timeout_minutes: int = None
    ):
        """
        Training limits for NLP AutoML.

        Parameters:
        - max_trials: Maximum number of trials
        - max_concurrent_trials: Maximum concurrent trials
        - timeout_minutes: Total timeout in minutes
        """
```

#### Usage Example

```python
from azure.ai.ml import automl
from azure.ai.ml.entities import Data
from azure.ai.ml.automl import TabularLimitSettings, TrainingSettings

# Configure AutoML settings
limits = TabularLimitSettings(
    max_trials=10,
    max_concurrent_trials=2,
    trial_timeout_minutes=30,
    experiment_timeout_minutes=180
)

training_settings = TrainingSettings(
    enable_onnx_compatible_models=True,
    enable_model_explainability=True,
    enable_stack_ensemble=True
)

# Create regression job with settings
regression_job = automl.regression(
    target_column_name="price",
    training_data=training_data,
    primary_metric="r2_score",
    limits=limits,
    training_settings=training_settings,
    compute="cpu-cluster"
)

# Submit job
submitted_job = ml_client.jobs.create_or_update(regression_job)
print(f"AutoML job submitted: {submitted_job.name}")
```

## Primary Metrics

Available primary metrics for different AutoML task types:

### Classification Metrics

- `accuracy` - Overall accuracy
- `AUC_weighted` - Area under ROC curve (weighted)
- `average_precision_score_weighted` - Average precision score (weighted)
- `precision_score_weighted` - Precision score (weighted)
- `recall_score_weighted` - Recall score (weighted)

### Regression Metrics

- `normalized_root_mean_squared_error` - Normalized RMSE
- `r2_score` - R-squared score
- `mean_absolute_error` - Mean absolute error
- `normalized_mean_absolute_error` - Normalized MAE
- `spearman_correlation` - Spearman correlation

### Forecasting Metrics

- `normalized_root_mean_squared_error` - Normalized RMSE
- `r2_score` - R-squared score
- `mean_absolute_error` - Mean absolute error
- `normalized_mean_absolute_error` - Normalized MAE

### Computer Vision Metrics

- `accuracy` - Classification accuracy
- `mean_average_precision` - Object detection/segmentation mAP
- `iou` - Intersection over Union (multi-label image classification)