
polars

Fast in-memory DataFrame library for datasets that fit in RAM. Use when pandas is too slow but data still fits in memory. Lazy evaluation, parallel execution, Apache Arrow backend. Best for 1-100GB datasets, ETL pipelines, faster pandas replacement. For larger-than-RAM data use dask or vaex.

Overall score: 84 (1.01x)

Quality: 81% (Does it follow best practices?)

Impact: 86% (1.01x), average score across 3 eval scenarios

Security (by Snyk): Passed, no known issues


Quality

Discovery

89%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a strong description that clearly communicates when to use this skill (pandas too slow, data fits in RAM, 1-100GB) and provides good boundary conditions. Its main weakness is that it describes characteristics and architecture rather than listing specific concrete actions the skill enables. The trigger terms are excellent and would match natural user queries well.

Suggestions

Add specific concrete actions like 'filter, join, aggregate, pivot DataFrames' to improve specificity beyond architectural characteristics.

Dimension | Reasoning | Score

Specificity

Names the domain (DataFrame library, in-memory data processing) and mentions some capabilities (lazy evaluation, parallel execution, Apache Arrow backend, ETL pipelines), but doesn't list specific concrete actions like 'filter rows', 'join tables', 'aggregate columns'. The description is more about characteristics than actions.

2 / 3

Completeness

Clearly answers both 'what' (fast in-memory DataFrame library with lazy evaluation, parallel execution, Arrow backend) and 'when' ('Use when pandas is too slow but data still fits in memory', 'Best for 1-100GB datasets, ETL pipelines'). Also includes a helpful boundary condition ('For larger-than-RAM data use dask or vaex').

3 / 3

Trigger Term Quality

Includes strong natural trigger terms users would say: 'pandas is too slow', 'DataFrame', 'ETL pipelines', 'faster pandas replacement', 'in-memory', '1-100GB datasets', 'dask', 'vaex'. These cover common user phrasings when seeking a Polars-like solution.

3 / 3

Distinctiveness Conflict Risk

Clearly carves out a distinct niche: faster-than-pandas in-memory DataFrame processing for 1-100GB datasets. The explicit boundary conditions (not for larger-than-RAM data) and the specific technology markers (Apache Arrow, lazy evaluation) make it highly distinguishable from general data processing or big data skills.

3 / 3

Total: 11 / 12 (Passed)

Implementation

72%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a well-structured skill with excellent actionability and progressive disclosure. The main weaknesses are moderate verbosity — explaining concepts Claude already understands (lazy evaluation benefits, what expressions are) — and the lack of an explicit end-to-end ETL workflow with validation steps, which would be valuable given the skill's stated use case for ETL pipelines.

Suggestions

Trim conceptual explanations Claude already knows (e.g., remove 'Expressions are the fundamental building blocks...' paragraph, the 'Benefits of lazy evaluation' bullet list, and the 'Conceptual Differences' prose) to reduce token usage.

Add a concrete end-to-end ETL workflow example with validation steps (e.g., scan → transform → check schema/row counts → write parquet → verify output).

Dimension | Reasoning | Score

Conciseness

The skill contains some unnecessary explanations Claude already knows (e.g., 'Expressions are the fundamental building blocks...', 'Benefits of lazy evaluation' list, explaining what eager vs lazy means conceptually). The pandas migration section and core concepts could be tightened. However, the code examples themselves are lean and useful.

2 / 3

Actionability

The skill provides fully executable, copy-paste ready code examples throughout — DataFrame creation, filtering, grouping, joins, I/O operations, window functions, and performance patterns. Every major operation has concrete, runnable Python code.

3 / 3

Workflow Clarity

The skill covers individual operations well but lacks explicit multi-step workflow sequences with validation checkpoints. For ETL pipeline use cases (mentioned in the description), there's no clear end-to-end workflow showing read → transform → validate → write with error handling or verification steps.

2 / 3

Progressive Disclosure

Excellent progressive disclosure structure: the main file provides a clear overview with executable quick-start examples, then consistently points to one-level-deep reference files (core_concepts.md, operations.md, pandas_migration.md, io_guide.md, transformations.md, best_practices.md) with clear descriptions of what each contains.

3 / 3

Total: 10 / 12 (Passed)

Validation

90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 10 / 11 passed

Validation for skill structure

Criteria | Description | Result

metadata_version

'metadata.version' is missing

Warning

Total: 10 / 11 (Passed)

Repository: K-Dense-AI/claude-scientific-skills (Reviewed)
