Data engineering skill for building scalable data pipelines, ETL/ELT systems, and data infrastructure. Expertise in Python, SQL, Spark, Airflow, dbt, Kafka, and modern data stack. Includes data modeling, pipeline orchestration, data quality, and DataOps. Use when designing data architectures, building data pipelines, optimizing data workflows, implementing data governance, or troubleshooting data issues.
71

49% (Does it follow best practices?)

Impact: 83% (1.13x average score across 6 eval scenarios)

Passed: No known issues
Optimize this skill with Tessl:

`npx tessl skill review --optimize ./engineering-team/senior-data-engineer/SKILL.md`

Quality
Discovery
92%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong skill description that clearly articulates capabilities, names specific technologies, and includes an explicit 'Use when' clause with multiple trigger scenarios. Its main weakness is breadth—it covers so much of the data engineering domain that it could potentially conflict with more specialized skills (e.g., a dedicated Airflow skill, a SQL optimization skill, or a Kafka skill). The description uses proper third-person voice throughout.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific, concrete actions and domains: building data pipelines, ETL/ELT systems, data modeling, pipeline orchestration, data quality, and DataOps. Also names specific technologies (Spark, Airflow, dbt, Kafka). | 3 / 3 |
| Completeness | Clearly answers both 'what' (building data pipelines, ETL/ELT systems, data infrastructure, data modeling, orchestration, data quality, DataOps) and 'when' with an explicit 'Use when...' clause covering five distinct trigger scenarios. | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural terms users would say: 'data pipelines', 'ETL', 'ELT', 'Spark', 'Airflow', 'dbt', 'Kafka', 'data modeling', 'data quality', 'data governance', 'data architecture'. These are terms practitioners naturally use. | 3 / 3 |
| Distinctiveness / Conflict Risk | While data engineering is a recognizable niche, the description is quite broad and could overlap with general Python/SQL skills, database administration skills, or cloud infrastructure skills. Terms like 'troubleshooting data issues' and 'optimizing data workflows' are somewhat generic and could conflict with analytics or database skills. | 2 / 3 |
| Total | | 11 / 12 (Passed) |
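The conflict risk flagged in the last row could be reduced by narrowing the 'Use when' clause. A hypothetical rewrite of the skill's frontmatter, for illustration only; the field names and wording below are not the skill's actual metadata:

```yaml
# Illustrative sketch of a narrowed description (not the skill's real frontmatter)
name: senior-data-engineer
description: >
  Build and operate batch and streaming data pipelines with Spark, Airflow,
  dbt, and Kafka. Use when designing multi-stage pipeline architectures,
  orchestrating DAGs, or adding data-quality gates between pipeline stages;
  defer to dedicated Airflow, SQL, or Kafka skills for single-tool questions.
```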
Implementation
7%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill is mostly a high-level overview of data engineering concepts that Claude already knows, with no actionable, executable content. The Quick Start references non-existent scripts, the workflows section is empty, and the bulk of the content consists of comparison tables for well-known architectural patterns. The skill would need a fundamental rewrite to provide actual value—replacing conceptual overviews with concrete, executable code examples and real workflow steps.
Suggestions:

- Replace the fictional CLI scripts in Quick Start with real, executable code examples (e.g., a complete Airflow DAG, a dbt model, a Spark job, or a Kafka consumer).
- Remove the Trigger Phrases section entirely and remove or drastically condense the architecture comparison tables—Claude already knows these concepts.
- Add actual workflow content inline with concrete steps, validation checkpoints, and error recovery loops (e.g., a step-by-step ETL pipeline build with data quality validation between stages).
- Include at least one complete, copy-paste-ready example for a common task, such as building a dbt model with tests, writing an Airflow DAG, or setting up Great Expectations checks.
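The "validation checkpoints between stages" suggestion can be sketched in plain Python. This is a minimal, standard-library-only illustration of the shape the review asks for, not code from the skill; every function and field name here (`extract`, `validate`, `run_pipeline`, `amount`) is invented for the example.

```python
# Minimal ETL run with a data-quality checkpoint between extract and load.
# All names are hypothetical; a real pipeline would read from a source
# system and write to a warehouse instead of using in-memory lists.

def extract():
    # Stand-in for reading rows from a source system.
    return [
        {"order_id": 1, "amount": 19.99},
        {"order_id": 2, "amount": None},  # bad row: missing amount
    ]

def validate(rows):
    # Checkpoint: quarantine bad rows before they reach the warehouse.
    good, bad = [], []
    for row in rows:
        (good if row["amount"] is not None else bad).append(row)
    return good, bad

def load(rows):
    # Stand-in for writing to a warehouse table; returns rows written.
    return len(rows)

def run_pipeline():
    rows = extract()
    good, bad = validate(rows)
    return {"loaded": load(good), "quarantined": len(bad)}

print(run_pipeline())  # → {'loaded': 1, 'quarantined': 1}
```

In a real skill this checkpoint would sit between Airflow tasks or dbt models, but the structure (extract, gate, load) is the same.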
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is verbose with significant padding. The 'Trigger Phrases' section (~30 lines) is entirely unnecessary—Claude doesn't need to be told when to activate. The architecture decision tables explain concepts Claude already knows well (batch vs streaming, Lambda vs Kappa, warehouse vs lakehouse). The tech stack table is a simple enumeration of well-known tools adding no actionable value. | 1 / 3 |
| Actionability | The Quick Start references fictional scripts (pipeline_orchestrator.py, data_quality_validator.py, etl_performance_optimizer.py) that don't exist and aren't provided. The workflows section is entirely empty, pointing to a reference file. There is no executable code, no real commands, no concrete examples of actual pipeline code, SQL, Spark jobs, or dbt models. | 1 / 3 |
| Workflow Clarity | The Workflows section is completely empty—just a pointer to 'references/workflows.md'. There are no sequenced steps, no validation checkpoints, no feedback loops. For a skill covering ETL pipelines and data quality (which involve destructive/batch operations), the total absence of workflow steps is a critical gap. | 1 / 3 |
| Progressive Disclosure | The skill does attempt progressive disclosure with a table of contents and references to external files (references/data_pipeline_architecture.md, etc.) with clear topic summaries. However, the main file contains too much inline content that Claude already knows (architecture comparison tables) while the actually useful content (workflows, troubleshooting) is entirely delegated to external files with no inline summary. | 2 / 3 |
| Total | | 5 / 12 (Passed) |
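The "no feedback loops" gap called out under Workflow Clarity can also be illustrated briefly: even a small retry-with-backoff wrapper around a load step would qualify. A hedged, standard-library-only sketch; `retry`, `flaky_load`, and the parameters are invented for this example and are not part of the skill.

```python
import time

def retry(fn, attempts=3, base_delay=0.01):
    # Minimal error-recovery loop: retry a pipeline step with
    # exponential backoff, re-raising after the final attempt.
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)

calls = {"n": 0}

def flaky_load():
    # Hypothetical load step that fails on its first call,
    # simulating a transient warehouse connection error.
    calls["n"] += 1
    if calls["n"] < 2:
        raise RuntimeError("transient connection error")
    return "loaded"

print(retry(flaky_load))  # → loaded
```

In an orchestrated pipeline the same loop is usually expressed declaratively (e.g., task-level retry settings in Airflow) rather than hand-rolled.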
Validation
100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.