tessl install tessl/pypi-scikit-learn@1.7.0
A comprehensive machine learning library providing supervised and unsupervised learning algorithms with consistent APIs and extensive tools for data preprocessing, model evaluation, and deployment.
Agent Success
Agent success rate when using this tile
87%
Improvement
Agent success rate improvement when using this tile compared to baseline
0.99x
Baseline
Agent success rate without this tile
88%
{
"context": "Evaluates whether the solution builds the requested unsupervised workflow using scikit-learn's preprocessing, decomposition, mixture, and manifold tools. Checks focus on correct use of StandardScaler, PCA-based variance retention, GaussianMixture model selection via BIC, and deterministic 2D manifold embedding driven by random_state.",
"type": "weighted_checklist",
"checklist": [
{
"name": "Scaling + PCA",
"description": "Fits sklearn.preprocessing.StandardScaler, then sklearn.decomposition.PCA with n_components chosen to retain at least 90% of the explained variance (e.g., n_components=0.9), on the training data and reuses both fitted transformers for predictions.",
"max_score": 25
},
{
"name": "Mixture selection",
"description": "Trains sklearn.mixture.GaussianMixture over the provided cluster_counts, compares the candidates by a Bayesian or Akaike information criterion score (e.g., bic) to pick the best count, stores the selected count, and seeds the model with random_state.",
"max_score": 30
},
{
"name": "Soft predictions",
"description": "predict() pipes data through the fitted scaler and PCA before calling GaussianMixture.predict and predict_proba, returns labels plus max responsibility per sample, and rejects calls before fit.",
"max_score": 15
},
{
"name": "Manifold embedding",
"description": "embedding_2d() runs a manifold method from sklearn.manifold (e.g., Isomap or LocallyLinearEmbedding) on the PCA-transformed training data with n_components=2, passes random_state when supported, caches/returns deterministic output, and errors if unfitted.",
"max_score": 20
},
{
"name": "Validation errors",
"description": "Raises ValueError during fit, before attempting to train any estimator, when non-finite entries are present or when min(cluster_counts) exceeds the number of available samples.",
"max_score": 10
}
]
}
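
The checklist above can be illustrated with a minimal sketch of one workflow that would plausibly satisfy it. This is not a reference solution: the class name `UnsupervisedWorkflow`, the default `cluster_counts`, the choice of `LocallyLinearEmbedding` with `n_neighbors=10`, the use of BIC rather than AIC, and raising `NotFittedError` for calls before `fit` are all assumptions made for illustration. Only `fit`, `predict`, `embedding_2d`, `cluster_counts`, and `random_state` are named by the rubric.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.exceptions import NotFittedError
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler


class UnsupervisedWorkflow:
    """Hypothetical wrapper; the class name and defaults are assumptions."""

    def __init__(self, cluster_counts=(2, 3, 4), random_state=0):
        self.cluster_counts = tuple(cluster_counts)
        self.random_state = random_state
        self.gmm_ = None  # stays None until fit() succeeds

    def fit(self, X):
        X = np.asarray(X, dtype=float)
        # Validation errors: checked before any estimator is trained.
        if not np.isfinite(X).all():
            raise ValueError("X contains non-finite entries")
        if min(self.cluster_counts) > X.shape[0]:
            raise ValueError("min(cluster_counts) exceeds the number of samples")

        # Scaling + PCA: a float n_components keeps just enough components
        # to reach that share of explained variance.
        self.scaler_ = StandardScaler().fit(X)
        scaled = self.scaler_.transform(X)
        self.pca_ = PCA(n_components=0.9, svd_solver="full").fit(scaled)
        self.train_features_ = self.pca_.transform(scaled)

        # Mixture selection: keep the candidate with the lowest BIC.
        best_bic = np.inf
        for k in self.cluster_counts:
            if k > X.shape[0]:
                continue  # GaussianMixture needs at least k samples
            gmm = GaussianMixture(n_components=k, random_state=self.random_state)
            gmm.fit(self.train_features_)
            bic = gmm.bic(self.train_features_)
            if bic < best_bic:
                best_bic, self.best_k_, self.gmm_ = bic, k, gmm
        self.embedding_ = None  # invalidate any cached embedding
        return self

    def _check_fitted(self):
        if self.gmm_ is None:
            raise NotFittedError("call fit() before using this method")

    def predict(self, X):
        # Soft predictions: reuse the fitted scaler and PCA, then return
        # hard labels plus the largest responsibility per sample.
        self._check_fitted()
        features = self.pca_.transform(self.scaler_.transform(np.asarray(X, dtype=float)))
        labels = self.gmm_.predict(features)
        confidence = self.gmm_.predict_proba(features).max(axis=1)
        return labels, confidence

    def embedding_2d(self):
        # Manifold embedding of the PCA-transformed training data, cached
        # after the first call. LocallyLinearEmbedding requires the PCA
        # output to have at least two components (assumed here).
        self._check_fitted()
        if self.embedding_ is None:
            lle = LocallyLinearEmbedding(
                n_components=2, n_neighbors=10, random_state=self.random_state
            )
            self.embedding_ = lle.fit_transform(self.train_features_)
        return self.embedding_
```

A caller would typically run `labels, confidence = UnsupervisedWorkflow(cluster_counts=(2, 3, 4)).fit(X_train).predict(X_new)`, where `X_train` and `X_new` are placeholder arrays. Isomap would be an equally valid manifold choice under the rubric; it has no `random_state` parameter, so the seed would simply not be passed in that case.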