Performs pandas DataFrame operations for data analysis, manipulation, and transformation. Use when working with pandas DataFrames, data cleaning, aggregation, merging, or time series analysis. Invoke for data manipulation tasks such as joining DataFrames on multiple keys, pivoting tables, resampling time series, handling NaN values with interpolation or forward-fill, groupby aggregations, type conversion, or performance optimization of large datasets.
89
100%
Does it follow best practices?
Impact
78%
1.11x
Average score across 6 eval scenarios
Passed
No known issues
Data cleaning pipeline
No silent dropna | 100% | 100%
Empty string to NaN | 100% | 100%
Uses .copy() on subset | 50% | 50%
Method chaining | 37% | 100%
Nullable int type | 100% | 100%
Categorical conversion | 0% | 0%
Vectorized string ops | 100% | 100%
No iterrows | 100% | 100%
Pre/post validation | 100% | 100%
No chained indexing | 100% | 100%
na_values on read | 100% | 100%
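Several of the data cleaning criteria can be illustrated together. The following is a minimal sketch, not the skill's own code: the CSV content, the column names (`id`, `name`, `age`, `city`), and the choice to drop rows without a name are all assumptions made for illustration.

```python
import io
import pandas as pd

# Hypothetical raw data; the columns and values are illustrative only.
raw = io.StringIO(
    "id,name,age,city\n"
    "1,  Alice ,34,Paris\n"
    "2,bob,N/A,paris\n"
    "3,  ,29,Lyon\n"
    "4,Dana,41,\n"
)

# na_values maps the sentinel string "N/A" to NaN at read time.
df = pd.read_csv(raw, na_values=["N/A"])
nulls_before = df.isna().sum()  # pre-cleaning validation snapshot

# Method chaining with vectorized string operations (no iterrows).
cleaned = df.assign(
    # Strip and normalize case, then turn leftover empty strings into missing values.
    name=lambda d: d["name"].str.strip().str.title().replace("", pd.NA),
    # Low-cardinality text column stored as a categorical.
    city=lambda d: d["city"].str.title().astype("category"),
    # Nullable integer dtype keeps missing ages without falling back to float.
    age=lambda d: d["age"].astype("Int64"),
)

# Rows are dropped explicitly and reported, rather than through a silent dropna().
missing_name = cleaned["name"].isna()
print(f"Dropping {int(missing_name.sum())} row(s) with a missing name")

# .copy() makes the subset independent of `cleaned` before further edits.
valid = cleaned.loc[~missing_name].copy()

# Post-cleaning validation.
assert valid["name"].notna().all()
print(nulls_before.to_dict(), valid.isna().sum().to_dict())
```

Whether a row with a missing name should be dropped, flagged, or imputed depends on the dataset; the point of the sketch is only that the decision is made visibly.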
GroupBy aggregation with memory optimization
Categorical dtype | 100% | 100%
observed=True | 100% | 100%
Named aggregation | 100% | 100%
Built-in aggregations | 100% | 100%
No iterrows | 100% | 100%
Memory usage check | 100% | 100%
Vectorized column ops | 100% | 100%
No chained indexing | 100% | 100%
Single groupby call | 100% | 100%
reset_index after groupby | 100% | 100%
Filter before compute | 100% | 100%
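As a rough illustration of how these groupby criteria fit together, the sketch below uses a hypothetical `sales` frame; the column names and values are assumptions, and on data this small the categorical conversion is not expected to save memory.

```python
import pandas as pd

# Hypothetical sales data; names and values are illustrative only.
sales = pd.DataFrame({
    "region": ["north", "south", "north", "east", "south", "north"],
    "units": [10, 3, 7, 0, 5, 2],
    "price": [2.5, 4.0, 2.5, 9.0, 4.0, 3.0],
})

# Memory usage check before and after converting the key to a categorical.
mem_before = int(sales.memory_usage(deep=True).sum())
sales["region"] = sales["region"].astype("category")
mem_after = int(sales.memory_usage(deep=True).sum())
print(f"memory: {mem_before} -> {mem_after} bytes")

summary = (
    sales.loc[sales["units"] > 0]                        # filter before computing
    .assign(revenue=lambda d: d["units"] * d["price"])   # vectorized column op
    .groupby("region", observed=True)                    # skip unobserved categories
    .agg(                                                # one call, named aggregations
        total_units=("units", "sum"),
        mean_price=("price", "mean"),
        total_revenue=("revenue", "sum"),
    )
    .reset_index()                                       # region back as a column
)
print(summary)
```

Because `east` only has a zero-unit row, it is filtered out, and `observed=True` keeps it from reappearing as an empty group.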
Multi-DataFrame merging and validation
Merge validate param | 0% | 100%
indicator=True usage | 0% | 100%
Unmatched rows reported | 50% | 100%
pd.concat not append | 100% | 80%
No chained indexing | 100% | 100%
Uses .loc for assignment | 100% | 75%
Downcast or categorical | 0% | 0%
reset_index after merge | 50% | 75%
No iterrows | 50% | 0%
Meaningful suffixes | 100% | 100%
Null check after join | 100% | 100%
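The merge criteria might look like the following in practice. This is a sketch under assumptions: the `orders` and `customers` frames, their columns, and the `"unknown"` placeholder are invented for illustration.

```python
import pandas as pd

# Hypothetical inputs; the frames and columns are illustrative only.
orders_a = pd.DataFrame({"order_id": [1, 2], "customer_id": [10, 11],
                         "status": ["paid", "paid"]})
orders_b = pd.DataFrame({"order_id": [3, 4], "customer_id": [10, 99],
                         "status": ["open", "paid"]})
customers = pd.DataFrame({"customer_id": [10, 11, 12],
                          "status": ["active", "active", "inactive"]})

# pd.concat combines frames (DataFrame.append was removed in pandas 2.0).
orders = pd.concat([orders_a, orders_b], ignore_index=True)

merged = orders.merge(
    customers,
    on="customer_id",
    how="left",
    validate="many_to_one",              # raises if customer_id repeats in customers
    suffixes=("_order", "_customer"),    # meaningful names for the clashing column
    indicator=True,                      # adds a _merge column describing each row
)

# Report unmatched rows instead of letting them pass unnoticed.
is_unmatched = merged["_merge"] == "left_only"
unmatched = merged.loc[is_unmatched, "customer_id"]
print(f"{len(unmatched)} order(s) without a customer record: {unmatched.tolist()}")

# .loc assignment avoids chained indexing.
merged.loc[is_unmatched, "status_customer"] = "unknown"

# Null check after the join.
assert merged["status_customer"].notna().all()
```

`validate="many_to_one"` is the cheap guard here: a duplicated key on the right side would otherwise multiply order rows without any error.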
Time series resampling and gap filling
Resample via set_index | 0% | 0%
fillna(0) after resample | 0% | 0%
ffill then interpolate | 0% | 25%
Mode fill for categoricals | 0% | 0%
Median fill for numerics | 0% | 0%
Pre-transform null check | 100% | 100%
Post-transform null check | 100% | 100%
No iterrows for time ops | 100% | 100%
Dtype check at start | 0% | 0%
Aggregation uses built-in methods | 62% | 75%
No chained indexing | 100% | 100%
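A few of the time series criteria can be sketched as below. The `readings` frame and its columns are hypothetical, and the fill strategy shown (zero for event counts, linear interpolation for temperature) is one assumption among several reasonable ones; forward-fill, median, or mode fills may suit other columns better.

```python
import pandas as pd

# Hypothetical sensor readings; names and values are illustrative only.
readings = pd.DataFrame({
    "timestamp": ["2024-01-01 00:00", "2024-01-01 00:20",
                  "2024-01-01 01:10", "2024-01-01 03:05"],
    "temperature": [20.0, 21.0, 22.0, 26.0],
    "events": [1, 2, 1, 3],
})

# Dtype check at the start: resampling needs a datetime index.
if not pd.api.types.is_datetime64_any_dtype(readings["timestamp"]):
    readings["timestamp"] = pd.to_datetime(readings["timestamp"])

# Pre-transform null check.
assert readings.notna().all().all()

indexed = readings.set_index("timestamp")

# Counts: with min_count=1 an empty hour is NaN, then explicitly filled as "no events".
events = (
    indexed["events"].resample("60min").sum(min_count=1).fillna(0).astype("int64")
)

# Continuous measurement: hourly mean, with the empty hour interpolated linearly.
temperature = indexed["temperature"].resample("60min").mean().interpolate()

hourly = pd.DataFrame({"temperature": temperature, "events": events})

# Post-transform null check.
assert hourly.notna().all().all()
print(hourly)
```

The 02:00 hour has no readings, so it ends up with zero events and a temperature halfway between its neighbours.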
Pivot table reporting with assertion-based validation
pivot_table with fill_value | 100% | 100%
margins=True | 100% | 100%
Assert row count | 0% | 0%
Assert no nulls | 0% | 100%
Assert column set | 0% | 0%
Initial data assessment | 100% | 100%
Categorical dtype for pivot dims | 0% | 0%
No iterrows | 100% | 100%
Vectorized derived columns | 100% | 100%
No chained indexing | 100% | 100%
Missing values handled explicitly | 100% | 100%
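The assertion-based checks named in this scenario could take roughly the following shape. The `sales` frame, its columns, and the `"Total"` margin label are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sales data; names and values are illustrative only.
sales = pd.DataFrame({
    "region": ["north", "north", "south", "south", "east"],
    "quarter": ["Q1", "Q2", "Q1", "Q2", "Q1"],
    "revenue": [100.0, 150.0, 80.0, 120.0, 60.0],
})

# Initial data assessment: dtypes and missing values before pivoting.
print(sales.dtypes.to_dict(), sales.isna().sum().to_dict())
assert sales["revenue"].notna().all()

report = sales.pivot_table(
    index="region",
    columns="quarter",
    values="revenue",
    aggfunc="sum",
    fill_value=0,          # region/quarter pairs with no sales show 0, not NaN
    margins=True,          # add row and column totals
    margins_name="Total",
)

# Assert row count: one row per region plus the totals row.
assert len(report) == sales["region"].nunique() + 1
# Assert column set: one column per quarter plus the totals column.
assert set(report.columns) == set(sales["quarter"].unique()) | {"Total"}
# Assert no nulls in the final report.
assert report.notna().all().all()
print(report)
```

The assertions encode what the report is expected to look like, so a missing quarter or an unexpected extra region fails loudly instead of producing a subtly wrong table.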
Chunked large dataset processing and memory optimization
Chunked CSV reading | 100% | 100%
dtype spec on read | 0% | 0%
pd.concat to combine chunks | 100% | 100%
Downcast integer columns | 100% | 100%
Downcast float columns | 37% | 100%
Categorical for low-cardinality | 100% | 100%
Memory usage reported | 100% | 100%
Parquet output | 100% | 100%
Filter before compute | 100% | 100%
No iterrows | 100% | 100%
No chained indexing | 100% | 100%
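A small sketch of the chunked-processing criteria follows. The in-memory CSV, the column names, the chunk size, and the `amount > 3` filter are all assumptions chosen so the example runs without external files.

```python
import io
import pandas as pd

# Hypothetical CSV built in memory; a real file path would be used in practice.
csv_text = "user_id,country,amount\n" + "\n".join(
    f"{i},{'US' if i % 2 else 'DE'},{i * 1.5}" for i in range(1, 11)
)

filtered_chunks = []
reader = pd.read_csv(
    io.StringIO(csv_text),
    chunksize=4,                                      # read a few rows at a time
    dtype={"user_id": "int32", "amount": "float32"},  # dtype spec on read
)
for chunk in reader:
    # Filter each chunk before accumulating it, to keep memory low.
    filtered_chunks.append(chunk.loc[chunk["amount"] > 3])

# pd.concat combines the filtered chunks into one frame.
df = pd.concat(filtered_chunks, ignore_index=True)

mem_before = int(df.memory_usage(deep=True).sum())
# Downcast integers to the smallest type that fits the observed values.
df["user_id"] = pd.to_numeric(df["user_id"], downcast="integer")
# Low-cardinality text column stored as a categorical.
df["country"] = df["country"].astype("category")
mem_after = int(df.memory_usage(deep=True).sum())

print(f"rows kept: {len(df)}, memory: {mem_before} -> {mem_after} bytes")
```

Writing the result with `df.to_parquet(...)` is left out of the sketch because it requires an optional engine such as pyarrow or fastparquet to be installed.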
5b76101