tessl i github:jeffallan/claude-skills --skill spark-engineer

Use when building Apache Spark applications, distributed data processing pipelines, or optimizing big data workloads. Invoke for DataFrame API, Spark SQL, RDD operations, performance tuning, streaming analytics.
Review Score: 67%
Validation Score: 12/16
Implementation Score: 42%
Activation Score: 90%
Validation
Total: 12/16 passed

| Criteria | Result |
|---|---|
| metadata_version | 'metadata' field is not a dictionary |
| license_field | 'license' field is missing |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata |
| body_examples | No examples detected (no code fences and no 'Example' wording) |
Implementation
Suggestions: 4
Score: 42%

Overall Assessment
This skill provides good structural organization and clear progressive disclosure to reference materials, but critically lacks actionable code examples. The content describes Spark best practices at a conceptual level without providing the executable code, specific commands, or concrete examples that would make it immediately useful. The constraints sections (MUST DO/MUST NOT DO) are valuable but would benefit from accompanying code snippets.
Suggestions
| Dimension | Score | Reasoning |
|---|---|---|
| Conciseness | 2/3 | The skill contains some unnecessary verbosity in the role definition section that repeats information Claude already knows. The constraints and workflow sections are reasonably efficient, but the 'Knowledge Reference' section is essentially a keyword list that adds little value. |
| Actionability | 1/3 | The skill lacks any concrete, executable code examples. It describes what to do at a high level (use DataFrame API, broadcast joins, etc.) but provides no actual Spark code, commands, or copy-paste ready examples. The guidance is abstract rather than instructional. |
| Workflow Clarity | 2/3 | The core workflow provides a clear 5-step sequence, but lacks validation checkpoints and feedback loops. For operations involving production data pipelines, there's no explicit 'validate before proceeding' step or error recovery guidance. |
| Progressive Disclosure | 3/3 | The skill effectively uses a reference table to point to detailed guidance in separate files, with clear topic categorization and 'Load When' conditions. This is well-organized one-level-deep progressive disclosure. |
Activation
Suggestions: 2
Score: 90%

Overall Assessment
This is a solid skill description with excellent trigger term coverage and clear 'when to use' guidance. The main weakness is that it lists technical concepts rather than concrete actions Claude can perform. Adding specific verbs like 'write', 'optimize', 'debug', or 'migrate' would strengthen the specificity dimension.
Suggestions
| Dimension | Score | Reasoning |
|---|---|---|
| Specificity | 2/3 | Names the domain (Apache Spark, distributed data processing) and lists some technical areas (DataFrame API, Spark SQL, RDD operations, performance tuning, streaming analytics), but doesn't describe concrete actions like 'write', 'optimize', 'debug', or 'configure'. |
| Completeness | 3/3 | Explicitly answers both what (building Spark applications, distributed data processing, big data workloads) and when ('Use when building...', 'Invoke for...') with clear trigger guidance at the start of the description. |
| Trigger Term Quality | 3/3 | Good coverage of natural terms users would say: 'Spark', 'DataFrame', 'Spark SQL', 'RDD', 'big data', 'streaming analytics', 'performance tuning', 'distributed data processing' - these are terms developers naturally use when working with Spark. |
| Distinctiveness / Conflict Risk | 3/3 | Highly distinctive with Spark-specific terminology (DataFrame API, RDD, Spark SQL) that clearly separates it from general data processing or other big data tools like Hadoop or Flink. |