UMAP dimensionality reduction. Fast nonlinear manifold learning for high-dimensional data: 2D/3D visualization, clustering preprocessing (HDBSCAN), and supervised/parametric UMAP.
87
Does it follow best practices? 83%
Impact: 89% (2.11x)
Average score across 6 eval scenarios
Passed: No known issues
UMAP clustering preprocessing
StandardScaler preprocessing: 100%, 100%
Clustering n_neighbors: 0%, 100%
Clustering min_dist=0.0: 0%, 100%
Clustering n_components 5-10: 0%, 100%
HDBSCAN algorithm used: 100%, 100%
Reproducibility seed: 100%, 100%
Separate 2D visualization UMAP: 0%, 100%
Cluster statistics reported: 100%, 100%
Clustering evaluation: 0%, 0%
Cosine metric for document embeddings
Cosine distance metric: 0%, 100%
StandardScaler preprocessing: 0%, 100%
Reproducibility seed: 50%, 100%
2D visualization output: 100%, 100%
Fit then transform for queries: 100%, 100%
Corpus and queries separated before fit: 100%, 100%
Embedding output saved: 100%, 100%
Visualization produced: 100%, 100%
min_dist for visualization: 0%, 100%
Query neighbor search implemented: 100%, 100%
Semi-supervised UMAP feature extraction
Unlabeled marked as -1: 0%, 100%
Labels passed to UMAP fit: 0%, 100%
Unlabeled samples in UMAP fit: 40%, 100%
StandardScaler preprocessing: 100%, 100%
n_components for feature engineering: 33%, 100%
Downstream classifier trained: 100%, 100%
transform() on test data: 100%, 100%
Labeled-only classifier training: 100%, 100%
Reproducibility seed: 0%, 100%
Supervised UMAP with sklearn Pipeline
sklearn Pipeline used: 100%, 100%
StandardScaler in pipeline: 100%, 100%
UMAP in pipeline: 0%, 100%
Labels passed to pipeline fit: 66%, 100%
target_weight set explicitly: 0%, 0%
n_components for feature engineering: 0%, 0%
Reproducibility seed: 0%, 100%
Train/test split: 100%, 100%
Pipeline predict on test set: 100%, 100%
Accuracy reported: 100%, 100%
target_metric set for classification: 0%, 100%
AlignedUMAP for temporal datasets
AlignedUMAP import: 0%, 100%
List of datasets passed to fit: 0%, 100%
embeddings_ attribute accessed: 0%, 100%
StandardScaler preprocessing: 100%, 100%
alignment_regularisation set: 0%, 0%
Reproducibility seed: 0%, 100%
Visualization per timepoint: 100%, 100%
Multiple timepoint datasets: 0%, 100%
Embeddings output saved: 100%, 100%
n_components=2 for visualization: 0%, 100%
DensMAP density preservation and inverse transform
densmap=True enabled: 100%, 100%
output_dens=True set: 0%, 100%
rad_orig_ accessed: 0%, 100%
rad_emb_ accessed: 0%, 100%
inverse_transform called: 0%, 0%
StandardScaler preprocessing: 100%, 100%
Reproducibility seed: 100%, 100%
Density comparison output: 50%, 100%
dens_lambda set explicitly: 0%, 0%
Reconstruction error reported: 62%, 50%
086de41