
preprocessing-data-with-automated-pipelines

Process automate data cleaning, transformation, and validation for ML tasks. Use when requesting "preprocess data", "clean data", "ETL pipeline", or "data transformation". Trigger with relevant phrases based on skill purpose.

45

Quality

33%

Does it follow best practices?

Impact

Pending

No eval scenarios have been run

Security (by Snyk)

Passed

No known issues

Optimize this skill with Tessl

npx tessl skill review --optimize ./plugins/ai-ml/data-preprocessing-pipeline/skills/preprocessing-data-with-automated-pipelines/SKILL.md

Quality

Discovery

67%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description has a reasonable structure with both 'what' and 'when' clauses, but suffers from moderate vagueness in its capability listing and includes a meaningless filler sentence ('Trigger with relevant phrases based on skill purpose') that adds no value. The trigger terms cover some common phrases but miss many natural variations users would employ when needing ML data preprocessing.

Suggestions

Replace the vague trailing sentence 'Trigger with relevant phrases based on skill purpose' with additional specific trigger terms like 'feature engineering', 'missing values', 'normalize data', 'data wrangling', 'encode categorical variables'.

Make capabilities more concrete by listing specific actions such as 'handle missing values, remove duplicates, normalize/scale features, encode categorical variables, split train/test sets' instead of broad categories.

Strengthen distinctiveness by emphasizing the ML-specific focus more clearly, e.g., mentioning specific file types (.csv, .parquet), frameworks (pandas, scikit-learn), or distinguishing from general ETL/database pipelines.
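Taken together, these suggestions could produce a frontmatter description along the following lines. This is an illustrative sketch only, not the maintainer's wording:

```yaml
---
name: preprocessing-data-with-automated-pipelines
description: >
  Clean and transform tabular data (.csv, .parquet) for ML with pandas and
  scikit-learn: handle missing values, remove duplicates, normalize/scale
  features, encode categorical variables, and split train/test sets. Use
  when requests mention "preprocess data", "clean data", "feature
  engineering", "missing values", "normalize data", "encode categorical
  variables", "impute", or "data wrangling".
---
```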

Dimension / Reasoning / Score

Specificity

Names the domain (ML data preprocessing) and some actions ('data cleaning, transformation, and validation'), but these are fairly broad categories rather than multiple specific concrete actions like 'remove duplicates, handle missing values, normalize columns, encode categorical features'.

2 / 3

Completeness

Explicitly answers both 'what' (data cleaning, transformation, and validation for ML tasks) and 'when' (with a 'Use when' clause listing specific trigger phrases). Despite the vague trailing sentence, the core structure covers both dimensions.

3 / 3

Trigger Term Quality

Includes some useful trigger terms like 'preprocess data', 'clean data', 'ETL pipeline', and 'data transformation', but misses common variations users might say such as 'feature engineering', 'missing values', 'normalize', 'encode', 'impute', or 'data wrangling'. The final sentence 'Trigger with relevant phrases based on skill purpose' is meaningless filler.

2 / 3

Distinctiveness Conflict Risk

The ML-specific focus helps narrow the scope somewhat, but terms like 'data cleaning' and 'data transformation' are broad enough to overlap with general data processing, database ETL, or analytics skills. The trailing generic sentence further dilutes distinctiveness.

2 / 3

Total: 9 / 12

Passed

Implementation

0%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill is almost entirely generic boilerplate with no actionable, skill-specific content. It describes what a data preprocessing pipeline should do in abstract terms but provides zero executable code, no concrete library recommendations, no actual pipeline patterns, and no validation steps. It reads like a template that was never filled in with real content.

Suggestions

Replace the abstract examples with actual executable Python code using specific libraries (e.g., pandas for cleaning, scikit-learn's preprocessing module for transformations) showing complete, copy-paste ready pipeline patterns.
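As a sketch of what "copy-paste ready" could look like here, the following pandas/scikit-learn pipeline imputes, scales, and one-hot encodes a toy frame. Column names, data, and imputation strategies are illustrative assumptions, not content from the skill itself:

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy frame standing in for user data (columns are hypothetical).
df = pd.DataFrame({
    "age": [25, None, 47, 33],
    "income": [40_000, 55_000, None, 62_000],
    "city": ["NY", "SF", "NY", np.nan],
})

numeric = ["age", "income"]
categorical = ["city"]

# Impute then scale numeric columns; impute then one-hot categoricals.
preprocess = ColumnTransformer([
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="mean")),
        ("scale", StandardScaler()),
    ]), numeric),
    ("cat", Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]), categorical),
])

X = preprocess.fit_transform(df)
print(X.shape)  # (4, 4): 2 scaled numeric + 2 one-hot city columns
```

A pattern like this is reusable as-is: swapping in real column lists is the only change a user needs to make.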

Remove all generic boilerplate sections (Integration, Prerequisites, Instructions, Output, Error Handling, Resources) and the 'How It Works' meta-description — these waste tokens describing Claude's own process rather than teaching it something new.

Add concrete validation checkpoints with code, e.g., schema validation with pandera, data quality checks after each transformation step, and explicit error recovery patterns.
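The suggestion names pandera for schema validation; a dependency-light sketch of the same idea, using only pandas and a hypothetical `check_quality` helper, might look like this:

```python
import pandas as pd

def check_quality(df: pd.DataFrame, required: list[str]) -> None:
    """Raise with a clear message instead of passing bad data downstream."""
    missing_cols = set(required) - set(df.columns)
    if missing_cols:
        raise ValueError(f"missing columns: {sorted(missing_cols)}")
    if df[required].isna().any().any():
        raise ValueError("NaNs remain after the imputation step")
    if df.duplicated().any():
        raise ValueError("duplicate rows survived deduplication")

clean = pd.DataFrame({"age": [25.0, 47.0], "income": [40_000.0, 62_000.0]})
check_quality(clean, ["age", "income"])  # passes silently

dirty = pd.DataFrame({"age": [25.0, None]})
try:
    check_quality(dirty, ["age", "income"])
except ValueError as e:
    print(e)  # missing columns: ['income']
```

Calling a checkpoint like this after each transformation step gives the explicit error-recovery points the review asks for.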

Include specific code patterns for common preprocessing tasks: handling missing values, encoding categoricals, scaling features, detecting outliers — with actual function implementations rather than descriptions.
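To make the last point concrete, here is one possible set of small implementations for those four tasks. Function names and thresholds (mean imputation, 1.5 × IQR) are assumptions chosen for illustration:

```python
import pandas as pd

def impute_numeric(df: pd.DataFrame, cols: list[str]) -> pd.DataFrame:
    """Fill missing numeric values with each column's mean."""
    out = df.copy()
    out[cols] = out[cols].fillna(out[cols].mean())
    return out

def encode_categoricals(df: pd.DataFrame, cols: list[str]) -> pd.DataFrame:
    """One-hot encode categorical columns with pandas."""
    return pd.get_dummies(df, columns=cols)

def scale_minmax(df: pd.DataFrame, cols: list[str]) -> pd.DataFrame:
    """Rescale the given columns to the [0, 1] range."""
    out = df.copy()
    for c in cols:
        lo, hi = out[c].min(), out[c].max()
        out[c] = (out[c] - lo) / (hi - lo)
    return out

def flag_outliers_iqr(s: pd.Series) -> pd.Series:
    """Boolean mask: True where a value falls outside 1.5 * IQR."""
    q1, q3 = s.quantile(0.25), s.quantile(0.75)
    iqr = q3 - q1
    return (s < q1 - 1.5 * iqr) | (s > q3 + 1.5 * iqr)

df = pd.DataFrame({
    "x": [1.0, 2.0, None, 4.0, 5.0, 6.0, 7.0, 100.0],
    "c": ["a", "b", "a", "b", "a", "b", "a", "b"],
})
df = impute_numeric(df, ["x"])
outliers = flag_outliers_iqr(df["x"])
df = encode_categoricals(df, ["c"])
df = scale_minmax(df, ["x"])
print(int(outliers.sum()))  # 1: only the 100.0 row is flagged
```

Shipping function bodies like these, rather than prose descriptions of them, is what would move the Actionability score.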

Dimension / Reasoning / Score

Conciseness

Extremely verbose and padded with content Claude already knows. The 'Overview' section restates the title, 'How It Works' describes Claude's own reasoning process, 'When to Use This Skill' repeats the frontmatter description, and sections like 'Best Practices', 'Integration', 'Instructions', 'Output', and 'Error Handling' are generic boilerplate with no skill-specific value.

1 / 3

Actionability

No executable code, no concrete commands, no specific library usage, no copy-paste ready examples. The examples describe what 'the skill will' do in abstract terms rather than providing actual Python code. Phrases like 'using appropriate techniques (e.g., mean imputation)' are vague descriptions, not actionable instructions.

1 / 3

Workflow Clarity

The 'How It Works' section lists abstract meta-steps (analyze, generate, execute, provide metrics) rather than concrete pipeline steps. No validation checkpoints, no feedback loops for error recovery, and no specific sequencing of actual data preprocessing operations. The examples describe outcomes without showing how to achieve them.

1 / 3

Progressive Disclosure

Monolithic wall of text with no references to external files and no bundle files to support it. Content is poorly organized with multiple generic sections (Integration, Prerequisites, Instructions, Output, Error Handling, Resources) that add no value and could be removed entirely. The 'Resources' section lists placeholder items with no actual links.

1 / 3

Total: 4 / 12

Passed

Validation

81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 9 / 11 Passed

Validation for skill structure

Criteria / Description / Result

allowed_tools_field

'allowed-tools' contains unusual tool name(s)

Warning

frontmatter_unknown_keys

Unknown frontmatter key(s) found; consider removing or moving to metadata

Warning

Total: 9 / 11

Passed

Repository: jeremylongshore/claude-code-plugins-plus-skills (Reviewed)

