Use this skill for processing and analyzing large tabular datasets (billions of rows) that exceed available RAM. Vaex excels at out-of-core DataFrame operations, lazy evaluation, fast aggregations, efficient visualization of big data, and machine learning on large datasets. Apply when users need to work with large CSV/HDF5/Arrow/Parquet files, perform fast statistics on massive datasets, create visualizations of big data, or build ML pipelines that do not fit in memory.
- Score: 84
- Does it follow best practices? 86%
- Impact: 72%
- Average score across 3 eval scenarios: 1.22x
- Advisory: suggest reviewing before use
Batched aggregations and virtual columns

| Criterion | Score A | Score B |
|---|---|---|
| delay=True usage | 0% | 100% |
| vaex.execute() batching | 0% | 33% |
| Virtual column creation | 100% | 100% |
| Named selections used | 0% | 0% |
| Selection-based aggregation | 0% | 0% |
| Avoids to_pandas_df on main data | 100% | 100% |
| Avoids .values for computation | 100% | 100% |
| HDF5 format used | 50% | 0% |
| Multi-column statistics | 100% | 100% |
| Segment comparison output | 100% | 100% |
Large dataset visualization with density plots

| Criterion | Score A | Score B |
|---|---|---|
| df.plot() for 2D density | 25% | 0% |
| df.plot1d() for histograms | 20% | 0% |
| Percentile-based limits | 0% | 0% |
| Logarithmic color scale | 0% | 0% |
| Named selections for comparison | 0% | 100% |
| Figures saved to files | 100% | 100% |
| No full-dataset sampling for main plots | 100% | 100% |
| shape parameter used | 100% | 100% |
| Multiple plot types | 100% | 100% |
| Comparison visualization | 42% | 100% |
ML pipeline with vaex.ml and state management

| Criterion | Score A | Score B |
|---|---|---|
| vaex.ml import | 100% | 100% |
| StandardScaler or MinMaxScaler | 100% | 100% |
| Categorical encoder from vaex.ml | 100% | 100% |
| vaex.ml.sklearn.Predictor | 100% | 100% |
| fit on training data | 100% | 100% |
| state_write() called | 53% | 100% |
| state.json file present | 100% | 100% |
| state_load() on new data | 33% | 100% |
| Virtual column preservation | 100% | 100% |
| Predictions output | 100% | 100% |