Tessl Tile for pypi/sagemaker@2.251.0

# SageMaker Python SDK

A comprehensive Python library for training and deploying machine learning models on Amazon SageMaker. It provides high-level abstractions and APIs for the complete machine learning workflow, including data preprocessing, model training, hyperparameter tuning, batch inference, and real-time endpoint deployment, across popular frameworks such as TensorFlow, PyTorch, Scikit-learn, XGBoost, and Hugging Face.

## Package Information

- **Package Name**: sagemaker
- **Language**: Python
- **Installation**: `pip install sagemaker`
- **Documentation**: https://sagemaker.readthedocs.io/

## Core Imports

```python
import sagemaker
```

Common session and role management:

```python
from sagemaker import Session, get_execution_role
```

Training and model deployment:

```python
from sagemaker.estimator import Estimator
from sagemaker.model import Model
from sagemaker.predictor import Predictor
from sagemaker.inputs import TrainingInput
```

## Basic Usage

```python
import sagemaker
from sagemaker import Session, get_execution_role
from sagemaker.sklearn import SKLearn

# Set up SageMaker session and IAM role.
# get_execution_role() resolves the role inside SageMaker notebooks and Studio;
# elsewhere, pass an IAM role ARN explicitly.
sagemaker_session = Session()
role = get_execution_role()

# Create a scikit-learn estimator
sklearn_estimator = SKLearn(
    entry_point="train.py",
    framework_version="1.2-1",
    instance_type="ml.m5.large",
    role=role,
    sagemaker_session=sagemaker_session
)

# Train the model
sklearn_estimator.fit({"training": "s3://my-bucket/train-data"})

# Deploy the model
predictor = sklearn_estimator.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.large"
)

# Make predictions
# test_data is a placeholder for your own feature rows (for example, a NumPy array)
predictions = predictor.predict(test_data)

# Clean up
predictor.delete_endpoint()
```
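
The `entry_point="train.py"` script referenced above is not shown. A minimal sketch of what such a script might look like follows, assuming SageMaker's script-mode conventions (hyperparameters arrive as command-line flags, and the `SM_MODEL_DIR` and `SM_CHANNEL_TRAINING` environment variables point at the model output directory and the `training` channel). The `--max-depth` hyperparameter and the JSON "model" artifact are hypothetical stand-ins for real training code.

```python
"""train.py: hypothetical entry-point script for the SKLearn estimator."""
import argparse
import json
import os


def parse_args(argv=None):
    """Read hyperparameters and SageMaker-provided paths."""
    parser = argparse.ArgumentParser()
    # Hypothetical hyperparameter; real scripts define their own flags.
    parser.add_argument("--max-depth", type=int, default=5)
    # Outside SageMaker the SM_* variables are unset, so fall back to local paths.
    parser.add_argument("--model-dir", default=os.environ.get("SM_MODEL_DIR", "model"))
    parser.add_argument("--train", default=os.environ.get("SM_CHANNEL_TRAINING", "data"))
    # parse_known_args tolerates extra flags the platform may pass.
    args, _unknown = parser.parse_known_args(argv)
    return args


def main(argv=None):
    args = parse_args(argv)
    os.makedirs(args.model_dir, exist_ok=True)
    # Placeholder for real training: record the settings as the "model" artifact.
    artifact = os.path.join(args.model_dir, "model.json")
    with open(artifact, "w") as f:
        json.dump({"max_depth": args.max_depth, "train_dir": args.train}, f)
    return artifact


if __name__ == "__main__":
    main()
```

Files written to the model directory are what SageMaker packages as the model artifact. For `deploy()` to work with the scikit-learn container, the script would also typically need to define a `model_fn(model_dir)` function that loads that artifact.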

## Architecture

The SageMaker Python SDK follows a layered architecture that hides much of the complexity of Amazon SageMaker:

- **Session Layer**: Manages AWS credentials, regions, and service configurations
- **Estimator Layer**: High-level training interfaces for different ML frameworks
- **Model Layer**: Model deployment and management abstractions
- **Predictor Layer**: Real-time and batch inference clients
- **Processing Layer**: Data preprocessing and feature engineering jobs
- **Pipeline Layer**: End-to-end ML workflow orchestration

This design lets developers focus on ML logic while the SDK handles AWS service integration, resource management, and deployment.

## Capabilities

### Core Training and Model Management

Fundamental classes for training models and managing deployments including estimators, models, predictors, and session management. These form the foundation of the SageMaker workflow.

```python { .api }
class Estimator:
    def __init__(self, image_uri: str, role: str = None, instance_count: int = None,
                 instance_type: str = None, keep_alive_period_in_seconds: int = None,
                 volume_size: int = 30, max_run: int = 24*60*60, input_mode: str = "File",
                 output_path: str = None, base_job_name: str = None,
                 sagemaker_session: Session = None, hyperparameters: dict = None,
                 tags: list = None, subnets: list = None, security_group_ids: list = None,
                 **kwargs): ...
    def fit(self, inputs, wait: bool = True, logs: str = "All", job_name: str = None,
            experiment_config: dict = None): ...
    def deploy(self, initial_instance_count: int, instance_type: str, **kwargs) -> Predictor: ...

class Model:
    def __init__(self, image_uri: str = None, model_data: str = None, role: str = None,
                 predictor_cls: callable = None, env: dict = None, name: str = None,
                 vpc_config: dict = None, sagemaker_session: Session = None,
                 enable_network_isolation: bool = None, model_kms_key: str = None,
                 image_config: dict = None, source_dir: str = None, code_location: str = None,
                 entry_point: str = None, container_log_level: int = logging.INFO,
                 dependencies: list = None, git_config: dict = None, **kwargs): ...
    def deploy(self, initial_instance_count: int, instance_type: str, **kwargs) -> Predictor: ...

class Predictor:
    def predict(self, data, **kwargs): ...
    def delete_endpoint(self): ...

class Session:
    def __init__(self, boto_session=None, sagemaker_client=None, sagemaker_runtime_client=None,
                 sagemaker_featurestore_runtime_client=None, default_bucket: str = None,
                 settings=None, sagemaker_metrics_client=None, sagemaker_config: dict = None,
                 default_bucket_prefix: str = None): ...
    def upload_data(self, path: str, bucket: str = None, key_prefix: str = "data") -> str: ...

def get_execution_role(sagemaker_session: Session = None, use_default: bool = False) -> str: ...
```

[Core Training and Models](./core-training.md)
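
As a rough sketch of how these classes fit together with a custom container: the snippet below uploads local data, trains with the generic `Estimator`, and deploys the result. The image URI, role, local directory, and the `demo/train` key prefix are placeholders supplied by the caller or chosen for illustration. The SDK is imported inside the function so the module can be loaded without `sagemaker` installed or AWS credentials configured.

```python
def training_channels(train_uri, validation_uri=None):
    """Map channel names to S3 URIs; fit() accepts a dict keyed by channel name."""
    channels = {"training": train_uri}
    if validation_uri:
        channels["validation"] = validation_uri
    return channels


def train_and_deploy(image_uri, role, local_data_dir):
    """Upload data, train a custom-container model, and deploy it (calls AWS)."""
    from sagemaker.estimator import Estimator
    from sagemaker.inputs import TrainingInput
    from sagemaker.session import Session

    session = Session()
    # upload_data returns the S3 URI of the uploaded data.
    train_uri = session.upload_data(
        path=local_data_dir,
        bucket=session.default_bucket(),
        key_prefix="demo/train",  # hypothetical prefix
    )
    estimator = Estimator(
        image_uri=image_uri,
        role=role,
        instance_count=1,
        instance_type="ml.m5.large",
        sagemaker_session=session,
    )
    inputs = {
        name: TrainingInput(s3_data=uri, content_type="text/csv")
        for name, uri in training_channels(train_uri).items()
    }
    estimator.fit(inputs)
    return estimator.deploy(initial_instance_count=1, instance_type="ml.m5.large")
```

Remember to call `delete_endpoint()` on the returned predictor when the endpoint is no longer needed.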

### Framework-Specific Training

Support for popular ML frameworks including PyTorch, TensorFlow, Scikit-learn, XGBoost, Hugging Face, and MXNet. Each framework provides optimized containers and training configurations.

```python { .api }
# Framework estimators derive from sagemaker.estimator.Framework,
# which shares fit() and deploy() with Estimator.

# PyTorch
class PyTorch(Framework):
    def __init__(self, entry_point: str, framework_version: str, py_version: str, **kwargs): ...

# TensorFlow
class TensorFlow(Framework):
    def __init__(self, entry_point: str, framework_version: str, py_version: str, **kwargs): ...

# Scikit-learn
class SKLearn(Framework):
    def __init__(self, entry_point: str, framework_version: str, **kwargs): ...

# XGBoost
class XGBoost(Framework):
    def __init__(self, entry_point: str, framework_version: str, **kwargs): ...

# Hugging Face
class HuggingFace(Framework):
    def __init__(self, entry_point: str, transformers_version: str, pytorch_version: str, **kwargs): ...
```

[Framework Training](./framework-training.md)
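
A hedged sketch of launching a PyTorch training job with these estimators. The script name, framework and Python versions, instance type, and the `epochs` hyperparameter are illustrative assumptions; the version pair must match a container that SageMaker actually publishes in your region.

```python
def pytorch_estimator_args(role, epochs=10):
    """Collect constructor arguments for a PyTorch estimator (illustrative values)."""
    return {
        "entry_point": "train.py",        # placeholder training script
        "role": role,
        "framework_version": "2.1",       # assumed version; check container availability
        "py_version": "py310",            # assumed to pair with the version above
        "instance_count": 1,
        "instance_type": "ml.g5.xlarge",  # illustrative GPU instance
        "hyperparameters": {"epochs": epochs},
    }


def launch_training(role, train_s3_uri):
    """Start a training job on the 'training' channel (calls AWS)."""
    # Imported lazily so this module loads without the SDK installed.
    from sagemaker.pytorch import PyTorch

    estimator = PyTorch(**pytorch_estimator_args(role))
    estimator.fit({"training": train_s3_uri})
    return estimator
```

Hyperparameters given here reach the entry-point script as command-line arguments, so `train.py` would read `--epochs` with `argparse` or similar.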

### Amazon Built-in Algorithms

Pre-built, optimized algorithms for common ML tasks including clustering, dimensionality reduction, classification, regression, and anomaly detection.

```python { .api }
# Built-in algorithm estimators derive from
# sagemaker.amazon.amazon_estimator.AmazonAlgorithmEstimatorBase.

# Clustering
class KMeans(AmazonAlgorithmEstimatorBase):
    def __init__(self, role: str, instance_count: int, instance_type: str, k: int, **kwargs): ...

# Dimensionality Reduction
class PCA(AmazonAlgorithmEstimatorBase):
    def __init__(self, role: str, instance_count: int, instance_type: str, num_components: int, **kwargs): ...

# Classification/Regression
class LinearLearner(AmazonAlgorithmEstimatorBase):
    def __init__(self, role: str, instance_count: int, instance_type: str, predictor_type: str, **kwargs): ...

# Anomaly Detection
class RandomCutForest(AmazonAlgorithmEstimatorBase):
    def __init__(self, role: str, instance_count: int, instance_type: str, **kwargs): ...
```

[Amazon Algorithms](./amazon-algorithms.md)
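
A minimal sketch of training the built-in k-means algorithm, assuming the training data is already in memory as a two-dimensional `float32` NumPy array. The instance type and the default `k` are illustrative choices.

```python
def kmeans_args(role, k):
    """Constructor arguments for KMeans; the SDK expects k to be greater than 1."""
    if k < 2:
        raise ValueError("k must be an integer greater than 1")
    return {
        "role": role,
        "instance_count": 1,
        "instance_type": "ml.m5.large",  # illustrative instance type
        "k": k,
    }


def cluster(role, train_array, k=10):
    """Train k-means on an in-memory array (calls AWS)."""
    # Imported lazily so this module loads without the SDK installed.
    from sagemaker.amazon.kmeans import KMeans

    kmeans = KMeans(**kmeans_args(role, k))
    # record_set uploads the array to S3 in the record format the algorithm reads.
    records = kmeans.record_set(train_array)
    kmeans.fit(records)
    return kmeans
```

Unlike the framework estimators, the built-in algorithms need no `entry_point` script: algorithm settings such as `k` are passed straight to the constructor.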

### AutoML

Automated machine learning capabilities for tabular data, image classification, text classification, and time series forecasting with minimal configuration required.

```python { .api }
# AutoML v1
class AutoML:
    def __init__(self, role: str = None, target_attribute_name: str = None,
                 output_kms_key: str = None, output_path: str = None,
                 base_job_name: str = None, compression_type: str = None,
                 sagemaker_session: Session = None, volume_kms_key: str = None,
                 encrypt_inter_container_traffic: bool = None,
                 vpc_config: dict = None, problem_type: str = None,
                 max_candidates: int = None, **kwargs): ...
    def fit(self, inputs, wait: bool = True, logs: bool = True,
            job_name: str = None): ...

class AutoMLInput:
    def __init__(self, inputs, target_attribute_name: str, compression: str = None,
                 channel_type: str = None, content_type: str = None,
                 s3_data_type: str = None, sample_weight_attribute_name: str = None): ...

# AutoML v2
class AutoMLV2:
    def __init__(self, role: str = None, output_kms_key: str = None,
                 output_path: str = None, base_job_name: str = None,
                 sagemaker_session: Session = None, volume_kms_key: str = None,
                 encrypt_inter_container_traffic: bool = None, **kwargs): ...
    def fit(self, inputs, wait: bool = True, logs: bool = True,
            job_name: str = None): ...

class AutoMLDataChannel:
    def __init__(self, s3_data_source: str, target_attribute_name: str = None,
                 channel_type: str = None, content_type: str = None,
                 compression_type: str = None, sample_weight_attribute_name: str = None): ...

# Configuration classes
class AutoMLTabularConfig:
    def __init__(self, target_attribute_name: str, problem_type: str = None,
                 job_objective: dict = None, **kwargs): ...

class AutoMLTimeSeriesForecastingConfig:
    def __init__(self, forecast_frequency: str, forecast_horizon: int,
                 forecast_quantiles: list = None, **kwargs): ...
```

[AutoML](./automl.md)
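
A small sketch of starting an AutoML (v1) job on a tabular dataset stored in S3. The candidate limit is an arbitrary illustrative value, and the S3 URI and target column name are supplied by the caller.

```python
def automl_args(role, target_column, max_candidates=5):
    """Constructor arguments for an AutoML job (illustrative values)."""
    return {
        "role": role,
        "target_attribute_name": target_column,
        "max_candidates": max_candidates,  # caps how many candidate models are tried
    }


def run_automl(role, train_s3_uri, target_column):
    """Start an AutoML job without blocking (calls AWS)."""
    # Imported lazily so this module loads without the SDK installed.
    from sagemaker.automl.automl import AutoML

    automl = AutoML(**automl_args(role, target_column))
    # logs must be off when wait is False; the job continues in the background.
    automl.fit(inputs=train_s3_uri, wait=False, logs=False)
    return automl
```

Leaving `problem_type` unset lets the service infer it from the target column.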

### Model Serving and Inference

Comprehensive model deployment options including real-time endpoints, batch transform, serverless inference, and multi-model endpoints with custom serialization support.

```python { .api }
# Model deployment
class ModelBuilder:
    def __init__(self, **kwargs): ...
    def build(self, mode: Mode, role: str, sagemaker_session: Session) -> Model: ...

# Inference specification
class InferenceSpec:
    def load(self, model_dir: str): ...
    def invoke(self, input_object, model): ...

# Serializers
class JSONSerializer(BaseSerializer):
    def serialize(self, data) -> bytes: ...

class CSVSerializer(BaseSerializer):
    def serialize(self, data) -> bytes: ...

# Deserializers
class JSONDeserializer(BaseDeserializer):
    def deserialize(self, stream, content_type: str): ...
```

[Model Serving](./model-serving.md)
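
To illustrate how serializers and deserializers plug into inference, here is a sketch that attaches a `Predictor` to an already-deployed endpoint and exchanges JSON with it. The endpoint name is supplied by the caller, and the `{"instances": [...]}` payload shape is a hypothetical example: the format a given container accepts depends on its inference code.

```python
def build_payload(rows):
    """Wrap feature rows in a JSON-friendly dict (hypothetical request format)."""
    return {"instances": [[float(value) for value in row] for row in rows]}


def connect(endpoint_name):
    """Create a JSON-speaking client for an existing endpoint (calls AWS on use)."""
    # Imported lazily so this module loads without the SDK installed.
    from sagemaker.deserializers import JSONDeserializer
    from sagemaker.predictor import Predictor
    from sagemaker.serializers import JSONSerializer

    return Predictor(
        endpoint_name=endpoint_name,
        serializer=JSONSerializer(),      # encodes the request body as JSON
        deserializer=JSONDeserializer(),  # decodes the JSON response
    )


def predict_rows(endpoint_name, rows):
    predictor = connect(endpoint_name)
    return predictor.predict(build_payload(rows))
```

Swapping in `CSVSerializer` would change only the wire format; the calling code stays the same.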

### Data Processing

Data preprocessing capabilities including built-in processing containers, custom processing jobs, and Spark integration for large-scale data transformation.

```python { .api }
class Processor:
    def __init__(self, role: str, image_uri: str, instance_count: int, instance_type: str, **kwargs): ...
    def run(self, inputs: List[ProcessingInput], outputs: List[ProcessingOutput], **kwargs): ...

class ScriptProcessor(Processor):
    def __init__(self, command: List[str], **kwargs): ...

# Framework processors
class PyTorchProcessor(Processor): ...
class SKLearnProcessor(Processor): ...
class PySparkProcessor(Processor): ...
```

[Data Processing](./data-processing.md)
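
A sketch of running a scikit-learn preprocessing script as a processing job. It assumes the conventional `/opt/ml/processing` mount point inside the container; `preprocess.py`, the framework version, and the instance type are placeholders.

```python
def container_paths(base="/opt/ml/processing"):
    """Container-side directories where inputs are mounted and outputs are collected."""
    return {"input": f"{base}/input", "output": f"{base}/output"}


def run_preprocessing(role, input_s3_uri, output_s3_uri):
    """Run preprocess.py against data in S3 (calls AWS)."""
    # Imported lazily so this module loads without the SDK installed.
    from sagemaker.processing import ProcessingInput, ProcessingOutput
    from sagemaker.sklearn.processing import SKLearnProcessor

    paths = container_paths()
    processor = SKLearnProcessor(
        framework_version="1.2-1",  # assumed to match an available container
        role=role,
        instance_type="ml.m5.large",
        instance_count=1,
    )
    processor.run(
        code="preprocess.py",  # placeholder script
        # S3 data is downloaded to the input directory before the script starts.
        inputs=[ProcessingInput(source=input_s3_uri, destination=paths["input"])],
        # Files the script writes to the output directory are uploaded afterwards.
        outputs=[ProcessingOutput(source=paths["output"], destination=output_s3_uri)],
    )
    return processor
```

Note the asymmetry: for inputs, `source` is an S3 URI and `destination` a container path; for outputs it is the reverse.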

### Model Monitoring

Comprehensive model monitoring including data quality, model quality, bias detection, and explainability analysis with scheduled monitoring jobs.

```python { .api }
class ModelMonitor:
    def __init__(self, role: str, **kwargs): ...
    def create_monitoring_schedule(self, **kwargs): ...

class DefaultModelMonitor(ModelMonitor): ...

class ModelBiasMonitor(ModelMonitor):
    def __init__(self, role: str, **kwargs): ...

class ModelExplainabilityMonitor(ModelMonitor):
    def __init__(self, role: str, **kwargs): ...

class DataCaptureConfig:
    def __init__(self, enable_capture: bool, sampling_percentage: int, **kwargs): ...
```

[Model Monitoring](./model-monitoring.md)
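
Monitoring starts from captured endpoint traffic. The sketch below enables data capture at deployment time; the bucket name, the `datacapture` prefix, and the 20% sampling rate are illustrative assumptions, and `model` stands for any deployable SDK model object.

```python
def capture_settings(bucket, prefix="datacapture", sampling_percentage=20):
    """Keyword arguments for DataCaptureConfig (illustrative defaults)."""
    if not 0 <= sampling_percentage <= 100:
        raise ValueError("sampling_percentage must be between 0 and 100")
    return {
        "enable_capture": True,
        "sampling_percentage": sampling_percentage,
        "destination_s3_uri": f"s3://{bucket}/{prefix}",
    }


def deploy_with_capture(model, bucket):
    """Deploy a model with request/response capture enabled (calls AWS)."""
    # Imported lazily so this module loads without the SDK installed.
    from sagemaker.model_monitor import DataCaptureConfig

    config = DataCaptureConfig(**capture_settings(bucket))
    return model.deploy(
        initial_instance_count=1,
        instance_type="ml.m5.large",
        data_capture_config=config,
    )
```

The captured records land under the destination prefix, where a monitor such as `DefaultModelMonitor` can compare them against a baseline on a schedule.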

### Hyperparameter Tuning

Automated hyperparameter optimization with support for multiple search strategies, early stopping, and warm starting from previous tuning jobs.

```python { .api }
class HyperparameterTuner:
    def __init__(self, estimator: Estimator, objective_metric_name: str,
                 hyperparameter_ranges: dict, **kwargs): ...
    def fit(self, inputs, **kwargs): ...
    def deploy(self, initial_instance_count: int, instance_type: str, **kwargs) -> Predictor: ...

class IntegerParameter:
    def __init__(self, min_value: int, max_value: int): ...

class ContinuousParameter:
    def __init__(self, min_value: float, max_value: float): ...

class CategoricalParameter:
    def __init__(self, values: List[str]): ...
```

[Hyperparameter Tuning](./hyperparameter-tuning.md)
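
A sketch of wiring an estimator into a tuning job. It assumes the training script prints its objective as `name=value` in the logs, which the tuner extracts with a regular expression; the hyperparameter names, ranges, and job counts are illustrative.

```python
import re


def metric_definition(name):
    """Describe how to pull a metric out of training logs shaped like 'name=0.25'."""
    return {"Name": name, "Regex": rf"{re.escape(name)}=([0-9\.]+)"}


def build_tuner(estimator, metric_name="validation:loss"):
    """Create a tuner that minimizes the given metric (job starts on fit())."""
    # Imported lazily so this module loads without the SDK installed.
    from sagemaker.tuner import (
        ContinuousParameter,
        HyperparameterTuner,
        IntegerParameter,
    )

    return HyperparameterTuner(
        estimator=estimator,
        objective_metric_name=metric_name,
        hyperparameter_ranges={
            "learning_rate": ContinuousParameter(1e-4, 1e-1),  # hypothetical names
            "max_depth": IntegerParameter(2, 10),
        },
        metric_definitions=[metric_definition(metric_name)],
        objective_type="Minimize",
        max_jobs=8,           # total training jobs to run
        max_parallel_jobs=2,  # jobs running at the same time
    )
```

Calling `build_tuner(estimator).fit({"training": "s3://..."})` would launch the search, and `deploy()` on the tuner serves the best training job's model.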

### Experiments and Tracking

Experiment management and tracking capabilities for organizing ML workflows, comparing runs, and tracking metrics across training jobs.

```python { .api }
class Experiment:
    def __init__(self, experiment_name: str, description: str = None, **kwargs): ...
    def create(self) -> dict: ...

class Run:
    def __init__(self, experiment_name: str, sagemaker_session: Session = None): ...
    def log_parameter(self, name: str, value): ...
    def log_metric(self, name: str, value: float, step: int = None): ...

def load_run(sagemaker_session: Session = None, **kwargs) -> Run: ...
def list_runs(experiment_name: str = None, **kwargs) -> List[dict]: ...
```

[Experiments](./experiments.md)
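
A brief sketch of logging parameters and metrics within a run. Because logged parameter values are expected to be strings or numbers, nested configuration is flattened first; the `flatten` helper and the metric name `loss` are illustrative additions rather than part of the SDK.

```python
def flatten(config, prefix=""):
    """Turn nested dicts into a single level with dotted keys."""
    flat = {}
    for key, value in config.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def track(experiment_name, run_name, config, losses):
    """Record a configuration and a loss curve in an experiment run (calls AWS)."""
    # Imported lazily so this module loads without the SDK installed.
    from sagemaker.experiments.run import Run

    # Run is a context manager; leaving the block closes the run.
    with Run(experiment_name=experiment_name, run_name=run_name) as run:
        for name, value in flatten(config).items():
            run.log_parameter(name, value)
        for step, loss in enumerate(losses):
            run.log_metric(name="loss", value=loss, step=step)
```

For example, `track("demo-exp", "run-1", {"optimizer": {"lr": 0.1}}, [0.9, 0.5])` would log one parameter named `optimizer.lr` and two `loss` points.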

### Debugging and Profiling

Comprehensive model debugging and performance profiling tools including tensor analysis, system metrics collection, and framework-specific profiling.

```python { .api }
class ProfilerConfig:
    def __init__(self, s3_output_path: str = None, system_monitor_interval_millis: int = None, **kwargs): ...

class Profiler:
    def __init__(self, **kwargs): ...

class DebuggerHookConfig:
    def __init__(self, s3_output_path: str, **kwargs): ...

class Rule:
    def __init__(self, name: str, image_uri: str, **kwargs): ...

class ProfilerRule(Rule):
    def __init__(self, name: str, **kwargs): ...
```

[Debugging and Profiling](./debugging-profiling.md)
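
A sketch of assembling debugger and profiler settings that can be passed to an estimator constructor. The list of permitted monitoring intervals reflects my understanding of the service limits and should be treated as an assumption, as should the choice of built-in rules.

```python
# Assumed set of system-monitoring intervals (milliseconds) the service accepts.
ALLOWED_INTERVALS_MS = (100, 200, 500, 1000, 5000, 60000)


def nearest_interval(requested_ms):
    """Snap a requested interval to the closest value in the assumed allowed set."""
    return min(ALLOWED_INTERVALS_MS, key=lambda allowed: abs(allowed - requested_ms))


def debug_kwargs(output_s3_uri, interval_ms=500):
    """Build debugger/profiler keyword arguments for an estimator."""
    # Imported lazily so this module loads without the SDK installed.
    from sagemaker.debugger import (
        DebuggerHookConfig,
        ProfilerConfig,
        ProfilerRule,
        Rule,
        rule_configs,
    )

    return {
        # Where captured tensors are written.
        "debugger_hook_config": DebuggerHookConfig(s3_output_path=output_s3_uri),
        # How often system metrics (CPU, GPU, memory) are sampled.
        "profiler_config": ProfilerConfig(
            system_monitor_interval_millis=nearest_interval(interval_ms)
        ),
        # Built-in rules evaluated while the training job runs.
        "rules": [
            Rule.sagemaker(rule_configs.loss_not_decreasing()),
            ProfilerRule.sagemaker(rule_configs.ProfilerReport()),
        ],
    }
```

The result can be splatted into any estimator, for example `PyTorch(..., **debug_kwargs("s3://my-bucket/debug"))`, where the bucket name is a placeholder.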

### Remote Functions

Execute Python functions remotely on SageMaker compute with automatic dependency management, data transfer, and result retrieval.

```python { .api }
# Decorator: apply as @remote(instance_type=..., ...) above a function definition
def remote(_func=None, *, instance_type: str = None, instance_count: int = 1,
           role: str = None, **kwargs): ...

class RemoteExecutor:
    def __init__(self, **kwargs): ...
    def submit(self, func, *args, **kwargs): ...
```

[Remote Functions](./remote-functions.md)
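
A minimal sketch of running an ordinary function on SageMaker compute. The `divide` function and the instance type are illustrative. The decorator is applied by calling it explicitly here, which is equivalent to writing `@remote(instance_type=...)` above the function definition and keeps the plain function available for local use.

```python
def divide(x, y):
    """Plain Python function; nothing SageMaker-specific in its body."""
    return x / y


def run_remotely(x, y):
    """Execute divide() as a SageMaker job and return its result (calls AWS)."""
    # Imported lazily so this module loads without the SDK installed.
    from sagemaker.remote_function import remote

    remote_divide = remote(instance_type="ml.m5.large")(divide)
    # The call blocks until the remote job finishes, then returns the value.
    return remote_divide(x, y)
```

Calling `divide(6, 3)` runs locally, while `run_remotely(6, 3)` ships the function and its arguments to a job and brings the return value back.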

## Types

```python { .api }
# Training input configuration
class TrainingInput:
    def __init__(self, s3_data: str, s3_data_type: str = "S3Prefix", **kwargs): ...

# Processing input/output
class ProcessingInput:
    def __init__(self, source: str, destination: str, **kwargs): ...

class ProcessingOutput:
    def __init__(self, source: str, destination: str, **kwargs): ...

# Model metrics
class ModelMetrics:
    def __init__(self, model_statistics: MetricsSource = None,
                 model_constraints: MetricsSource = None, **kwargs): ...

class MetricsSource:
    def __init__(self, content_type: str, s3_uri: str, content_digest: str = None): ...

# Network configuration
class NetworkConfig:
    def __init__(self, enable_network_isolation: bool = False,
                 security_group_ids: List[str] = None, **kwargs): ...

# Instance configuration
class InstanceConfig:
    def __init__(self, instance_type: str, instance_count: int = 1, **kwargs): ...
```
