Fast in-memory DataFrame library for datasets that fit in RAM. Use when pandas is too slow but data still fits in memory. Lazy evaluation, parallel execution, Apache Arrow backend. Best for 1-100GB datasets, ETL pipelines, faster pandas replacement. For larger-than-RAM data use dask or vaex.
Overall score: 82

Quality: 77% (Does it follow best practices?)
Impact: 86% (1.01x average score across 3 eval scenarios)
Status: Passed, no known issues

Optimize this skill with Tessl:

`npx tessl skill review --optimize ./scientific-skills/polars/SKILL.md`

Quality
Discovery
89%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong skill description that clearly identifies its niche as a Polars-like fast DataFrame library, with excellent trigger terms and explicit 'when to use' guidance including boundary conditions. Its main weakness is that it describes library characteristics rather than listing specific concrete actions a user might perform (e.g., filtering, joining, aggregating). The inclusion of when NOT to use it (larger-than-RAM → dask/vaex) is a notable strength for disambiguation.
Suggestions
Add specific concrete actions users would perform, such as 'filter, join, aggregate, pivot DataFrames' to improve specificity beyond library characteristics.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (DataFrame library, in-memory data processing) and mentions some capabilities (lazy evaluation, parallel execution, Apache Arrow backend, ETL pipelines), but doesn't list specific concrete actions like 'filter rows', 'join tables', 'aggregate columns'. It describes characteristics more than actions. | 2 / 3 |
| Completeness | Clearly answers both 'what' (fast in-memory DataFrame library with lazy evaluation, parallel execution, Arrow backend) and 'when' ('Use when pandas is too slow but data still fits in memory', 'Best for 1-100GB datasets, ETL pipelines'). Also includes a helpful boundary condition for when NOT to use it. | 3 / 3 |
| Trigger Term Quality | Includes strong natural trigger terms users would say: 'pandas is too slow', 'DataFrame', 'ETL pipelines', 'faster pandas replacement', 'in-memory', '1-100GB datasets', 'dask', 'vaex'. These cover common user phrasings when seeking a Polars-like solution. | 3 / 3 |
| Distinctiveness / Conflict Risk | Clearly carves out a distinct niche: faster-than-pandas in-memory DataFrame processing for 1-100GB datasets. The explicit boundaries (not for larger-than-RAM data, use dask/vaex instead) and the specific technology mentions (Apache Arrow, lazy evaluation) make it highly distinguishable from general data processing or big data skills. | 3 / 3 |
| Total | | 11 / 12 (Passed) |
Implementation
64%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a solid, actionable skill with excellent executable code examples covering Polars' key features. Its main weaknesses are verbosity (explaining concepts Claude already understands, like what expressions and lazy evaluation are), lack of explicit multi-step workflow patterns with validation for ETL pipelines, and an imbalance between inline content and referenced files — too much detail is kept in the main file while referencing non-existent supplementary documents.
Suggestions
Trim explanatory prose that Claude already knows (e.g., remove 'Expressions are the fundamental building blocks...' and 'Benefits of lazy evaluation' bullet list) to improve conciseness.
Add an explicit end-to-end ETL workflow with validation steps (e.g., scan → transform → schema check → collect → write → verify row counts) to improve workflow clarity for the stated use case.
Reduce inline content that duplicates what referenced files would cover (e.g., move the detailed pandas migration table and transformation examples entirely to their reference files, keeping only 2-3 key differences inline).
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill includes some unnecessary explanations Claude already knows (e.g., 'Expressions are the fundamental building blocks...', 'They describe transformations on data and can be composed, reused, and optimized', explaining what lazy evaluation benefits are). The pandas migration table and code examples are useful but the surrounding prose could be trimmed. Overall mostly efficient but with notable padding. | 2 / 3 |
| Actionability | The skill provides fully executable, copy-paste ready code examples throughout: DataFrame creation, filtering, grouping, joins, I/O operations, window functions, and migration patterns. Commands like `uv pip install polars` and concrete code snippets make this highly actionable. | 3 / 3 |
| Workflow Clarity | The skill covers individual operations well but lacks explicit multi-step workflow sequences with validation checkpoints. The 'Best Practices' section lists optimization tips as numbered items but doesn't provide a clear end-to-end pipeline workflow (e.g., read → transform → validate → write) with verification steps. For ETL pipeline use cases mentioned in the description, this is a gap. | 2 / 3 |
| Progressive Disclosure | The skill references six separate files in a references/ directory with clear descriptions, which is good structure. However, no bundle files were provided, meaning these references don't actually exist. Additionally, the SKILL.md itself is quite long (~300 lines) with substantial inline content that overlaps with what the referenced files would cover (e.g., detailed operations, pandas migration, transformations), suggesting the split between overview and reference material isn't well-calibrated. | 2 / 3 |
| Total | | 9 / 12 (Passed) |
Validation
90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| metadata_version | 'metadata.version' is missing | Warning |
| Total | | 10 / 11 (Passed) |