splitting-datasets

tessl i github:jeremylongshore/claude-code-plugins-plus-skills --skill splitting-datasets

Process split datasets into training, validation, and testing sets for ML model development. Use when requesting "split dataset", "train-test split", or "data partitioning". Trigger with relevant phrases based on skill purpose.

Overall: 40%

Validation: 81%
| Criterion | Description | Result |
| --- | --- | --- |
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| metadata_version | 'metadata' field is not a dictionary | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |

Total: 13 / 16 (Passed)

Implementation: 7%

This skill is a template-style document that describes what a dataset splitting skill would do rather than providing actionable instructions. It lacks any executable code examples, explains basic ML concepts Claude already knows, and provides no concrete implementation guidance. The content is almost entirely filler with no practical value for actually performing dataset splits.

Suggestions

- Replace the abstract examples with actual executable Python code using sklearn.model_selection.train_test_split, including complete import statements and file I/O (see the sketch after this list).

- Remove sections that explain concepts Claude already knows (Overview, When to Use, Integration) and replace them with a concise quick-start code block.

- Add specific validation steps, such as checking that output file row counts match the input, verifying stratification worked correctly, and handling edge cases like small datasets.

- Include concrete parameters and their defaults (random_state, shuffle, stratify) rather than abstract 'Best Practices' descriptions.
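
To make the first suggestion concrete, here is a minimal sketch of the kind of quick-start block the skill could ship. It is illustrative only: the file paths, the 'label' column used for stratification, and the 60/20/20 ratios are assumptions for this review, not taken from the skill itself.

```python
# Minimal sketch, assuming a CSV input with a categorical "label" column.
# Paths, the column name, and the 60/20/20 ratios are placeholders.
import pandas as pd
from sklearn.model_selection import train_test_split

INPUT_PATH = "dataset.csv"   # hypothetical input file
LABEL_COL = "label"          # hypothetical stratification column
RANDOM_STATE = 42            # fixed seed for reproducibility

df = pd.read_csv(INPUT_PATH)

# Split off the test set first (80/20), then carve a validation set out of the
# remainder (25% of 80% = 20% overall), giving a 60/20/20 split.
train_val, test = train_test_split(
    df, test_size=0.20, shuffle=True, random_state=RANDOM_STATE,
    stratify=df[LABEL_COL],
)
train, val = train_test_split(
    train_val, test_size=0.25, shuffle=True, random_state=RANDOM_STATE,
    stratify=train_val[LABEL_COL],
)

# Validation: split sizes must add up to the original row count.
assert len(train) + len(val) + len(test) == len(df)

# Validation: stratification should keep class proportions close to the original.
overall = df[LABEL_COL].value_counts(normalize=True)
for name, part in [("train", train), ("val", val), ("test", test)]:
    drift = (part[LABEL_COL].value_counts(normalize=True) - overall).abs().max()
    print(f"{name}: {len(part)} rows, max class-proportion drift {drift:.3f}")

train.to_csv("train.csv", index=False)
val.to_csv("val.csv", index=False)
test.to_csv("test.csv", index=False)
```

Note that train_test_split raises a ValueError when the least populated class has too few members to stratify, which is one way a real implementation could surface the small-dataset edge case mentioned above.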

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | Extremely verbose with extensive padding explaining concepts Claude already knows (what dataset splitting is, why it's used). Sections like 'Overview', 'When to Use', 'Integration', and generic 'Instructions' add no actionable value and waste tokens. | 1 / 3 |
| Actionability | No executable code provided despite being a code-generation skill. Examples describe what 'the skill will do' abstractly rather than showing actual Python code for train_test_split or similar. Completely lacks copy-paste ready implementations. | 1 / 3 |
| Workflow Clarity | The 'How It Works' section describes abstract steps ('Analyze Request', 'Generate Code', 'Execute Splitting') without any concrete workflow. No validation checkpoints, no error recovery steps, and no actual sequence Claude can follow to perform the task. | 1 / 3 |
| Progressive Disclosure | Content is organized into sections with headers, but it's a monolithic document with no references to external files. The structure exists but contains too much filler content that should either be removed or split into separate reference files. | 2 / 3 |

Total: 5 / 12 (Passed)

Activation: 57%

The description establishes a clear ML data splitting purpose with some useful trigger terms, but suffers from a vague filler sentence ('Trigger with relevant phrases based on skill purpose') that adds no information. The capabilities could be more specific about what splitting options are available, and the trigger terms miss common user phrasings.

Suggestions

- Remove the meaningless filler sentence 'Trigger with relevant phrases based on skill purpose'; it provides no value and weakens the description.

- Add more specific capabilities like 'configure split ratios, stratified sampling, random seed control, handle CSV/Parquet formats'.

- Expand trigger terms to include common variations: 'holdout set', 'cross-validation splits', '80/20 split', 'test set creation' (a combined rewrite follows this list).
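
Putting these suggestions together, a revised description might read roughly as follows; this is an illustrative rewrite for the review, not text taken from the skill:

> Split datasets into training, validation, and test sets for ML model development, with configurable split ratios, stratified sampling, random seed control, and CSV/Parquet support. Use when the user asks to 'split dataset', or requests a 'train-test split', 'data partitioning', a 'holdout set', a 'validation split', 'test set creation', or a ratio such as an '80/20 split'.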

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Names the domain (ML datasets) and the core action (split into training/validation/testing sets), but lacks comprehensive detail about specific capabilities like stratification, ratio configuration, or handling different data formats. | 2 / 3 |
| Completeness | Has a 'Use when' clause with explicit triggers, but the final sentence 'Trigger with relevant phrases based on skill purpose' is vague filler that adds no value. The 'what' is present but shallow, and the 'when' guidance is partially undermined by the meaningless closing phrase. | 2 / 3 |
| Trigger Term Quality | Includes some relevant keywords ('split dataset', 'train-test split', 'data partitioning') but misses common variations users might say, like 'holdout set', 'cross-validation', 'test set', 'validation split', or percentage-based requests like '80/20 split'. | 2 / 3 |
| Distinctiveness / Conflict Risk | The focus on ML dataset splitting with specific terms like 'train-test split' and 'data partitioning' creates a clear niche that is unlikely to conflict with general data processing or other ML skills. | 3 / 3 |

Total: 9 / 12 (Passed)

