pyspark-transformer

Pyspark Transformer - Auto-activating skill for Data Pipelines. Triggers on: pyspark transformer, pyspark transformer Part of the Data Pipelines skill category.

Quality: 3% (Does it follow best practices?)

Impact: 100% (1.04x, average score across 3 eval scenarios)

Security by Snyk: Passed (no known issues)

Optimize this skill with Tessl

npx tessl skill review --optimize ./planned-skills/generated/11-data-pipelines/pyspark-transformer/SKILL.md

Quality

Discovery

7%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This description is extremely weak—it is essentially a title and category label with no substantive content. It fails to describe any concrete capabilities, provides no natural trigger terms beyond the skill name repeated, and lacks any explicit guidance on when Claude should select this skill. It would be nearly indistinguishable from other data-related skills in a large skill library.

Suggestions

Add concrete action descriptions such as 'Builds and transforms PySpark DataFrames, writes Spark SQL queries, creates ETL pipeline stages, and handles data transformations using PySpark.'

Add an explicit 'Use when...' clause, e.g., 'Use when the user asks about PySpark, Spark DataFrames, distributed data transformations, ETL pipelines, or Apache Spark data processing.'

Include natural trigger term variations users would say: 'pyspark', 'spark dataframe', 'ETL', 'data transformation', 'spark SQL', 'distributed processing', '.parquet', 'data pipeline'.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | The description provides no concrete actions whatsoever. It only names 'Pyspark Transformer' and 'Data Pipelines' without describing what the skill actually does—no verbs like 'transforms', 'processes', 'converts', etc. | 1 / 3 |
| Completeness | The description fails to answer both 'what does this do' and 'when should Claude use it'. There is no explanation of capabilities and no explicit 'Use when...' clause—only a category label and redundant trigger terms. | 1 / 3 |
| Trigger Term Quality | The trigger terms are just 'pyspark transformer' repeated twice. There are no natural user keywords like 'spark', 'dataframe', 'ETL', 'pipeline', 'transform data', '.py', or other variations a user might naturally use. | 1 / 3 |
| Distinctiveness / Conflict Risk | The mention of 'Pyspark Transformer' is somewhat specific to a particular technology, which provides some distinctiveness. However, the vague 'Data Pipelines' category and lack of concrete scope could cause overlap with other data processing or pipeline skills. | 2 / 3 |
| Total | | 5 / 12 |

Passed

Implementation

0%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill is an empty placeholder with no actual instructional content. It consists entirely of generic boilerplate that could apply to any topic—there is no PySpark-specific knowledge, no code examples, no workflows, and no actionable guidance whatsoever. It fails on every dimension of the rubric.

Suggestions

Add concrete, executable PySpark code examples showing common transformer patterns (e.g., DataFrame transformations, UDFs, window functions, joins) with copy-paste ready snippets.

Define a clear workflow for building PySpark transformers: e.g., 1) read source data, 2) apply transformations, 3) validate schema, 4) write output—with explicit validation checkpoints.

Remove all generic meta-content ('This skill provides automated assistance...', 'Example Triggers', 'When to Use') and replace with domain-specific guidance on PySpark best practices, common pitfalls, and performance optimization.

Add references to supporting files for advanced topics (e.g., OPTIMIZATION.md for partition tuning, TESTING.md for unit testing transformers, EXAMPLES.md for end-to-end pipeline examples).

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The content is entirely filler and boilerplate. It explains nothing Claude doesn't already know, provides no specific PySpark transformer knowledge, and wastes tokens on generic meta-descriptions like 'Provides step-by-step guidance' without actually providing any. | 1 / 3 |
| Actionability | There is zero concrete, executable guidance—no code snippets, no commands, no specific PySpark transformer patterns, no API usage, no configuration examples. The entire content describes rather than instructs. | 1 / 3 |
| Workflow Clarity | No workflow steps are defined at all. Despite claiming to provide 'step-by-step guidance,' there are no actual steps, no sequencing, and no validation checkpoints for any data pipeline operation. | 1 / 3 |
| Progressive Disclosure | The content is a monolithic block of generic text with no references to supporting files, no structured navigation, and no bundle files to support it. There is nothing to progressively disclose because there is no substantive content. | 1 / 3 |
| Total | | 4 / 12 |

Passed

Validation

81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 9 / 11 Passed

Validation for skill structure

| Criteria | Description | Result |
| --- | --- | --- |
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 9 / 11 |

Passed

Repository: jeremylongshore/claude-code-plugins-plus-skills (Reviewed)
