Data engineering skill for building scalable data pipelines, ETL/ELT systems, and data infrastructure. Expertise in Python, SQL, Spark, Airflow, dbt, Kafka, and modern data stack. Includes data modeling, pipeline orchestration, data quality, and DataOps. Use when designing data architectures, building data pipelines, optimizing data workflows, implementing data governance, or troubleshooting data issues.
Install with Tessl CLI:
npx tessl i github:alirezarezvani/claude-skills --skill senior-data-engineer

Overall score: 89%
Does it follow best practices?
If you maintain this skill, you can automatically optimize it using the tessl CLI to improve its score:
npx tessl skill review --optimize ./path/to/skill
Discovery — 92%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong skill description that clearly articulates specific capabilities, technologies, and use cases for data engineering work. It uses appropriate third-person voice and includes an explicit 'Use when...' clause with multiple trigger scenarios. The main weakness is potential overlap with related skills around Python, SQL, or general data work, though the data engineering context helps differentiate it.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific, concrete actions: 'building scalable data pipelines, ETL/ELT systems, data infrastructure, data modeling, pipeline orchestration, data quality, DataOps' along with specific technologies (Python, SQL, Spark, Airflow, dbt, Kafka). | 3 / 3 |
| Completeness | Clearly answers both what (building pipelines, ETL/ELT systems, data infrastructure with specific technologies) AND when, with an explicit 'Use when...' clause covering five distinct trigger scenarios. | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural terms users would say: 'data pipelines', 'ETL/ELT', 'Spark', 'Airflow', 'dbt', 'Kafka', 'data modeling', 'data quality', 'data governance', 'data issues' - these are terms practitioners naturally use. | 3 / 3 |
| Distinctiveness / Conflict Risk | While specific to data engineering, terms like 'Python', 'SQL', and 'data' could overlap with general programming or data analysis skills. The data engineering focus is clear, but 'troubleshooting data issues' is broad enough to potentially conflict with database or analytics skills. | 2 / 3 |
| Total | | 11 / 12 Passed |
Implementation — 85%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a strong, production-ready data engineering skill with excellent actionability and workflow clarity. The code examples are comprehensive and executable, covering the full spectrum from batch ETL to streaming to data quality. The main weakness is verbosity - the trigger phrases section and some explanatory content (architecture decision tables) add tokens without proportional value, as Claude already understands these concepts.
Suggestions:
- Remove or significantly condense the 'Trigger Phrases' section - Claude can infer when data engineering skills apply without explicit enumeration.
- Trim the architecture decision framework tables to just the decision trees/recommendations, removing explanatory content about what batch vs streaming means.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is comprehensive but includes some unnecessary verbosity, such as the extensive 'Trigger Phrases' section (Claude can infer when to use data engineering skills) and overly detailed architecture decision tables that explain concepts Claude already knows. The content could be tightened by 30-40%. | 2 / 3 |
| Actionability | Excellent executable code throughout - complete Python scripts, SQL queries, YAML configs, and bash commands that are copy-paste ready. Examples include full Airflow DAGs, Spark streaming jobs, dbt models, and Great Expectations configurations with proper imports and context. | 3 / 3 |
| Workflow Clarity | Multi-step workflows are clearly sequenced with explicit validation steps. Each workflow (ETL pipeline, streaming, data quality) has numbered steps with validation checkpoints like 'Step 6: Validate Pipeline' and includes error-handling patterns with dead letter queues and retry logic. | 3 / 3 |
| Progressive Disclosure | Well-structured with a clear table of contents, logical section organization, and appropriate references to external documentation (references/data_pipeline_architecture.md, etc.). Content is appropriately split between quick start, detailed workflows, and reference materials with one-level-deep navigation. | 3 / 3 |
| Total | | 11 / 12 Passed |
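The error-handling patterns credited above (retry logic plus dead letter queues) can be sketched with stdlib-only Python. This is an illustrative sketch, not the skill's actual code: the function names, record shape, and in-memory dead-letter list are all assumptions standing in for a real pipeline step and queue.

```python
import time


def process_record(record: dict) -> dict:
    """Toy transform standing in for a real pipeline step (illustrative only)."""
    if record.get("value") is None:
        raise ValueError(f"missing value in record {record['id']}")
    return {"id": record["id"], "value": record["value"] * 2}


def run_with_retry(record: dict, dead_letter: list,
                   max_retries: int = 3, base_delay: float = 0.01):
    """Retry with exponential backoff; route exhausted records to a dead letter queue."""
    for attempt in range(1, max_retries + 1):
        try:
            return process_record(record)
        except ValueError as exc:
            if attempt == max_retries:
                # Dead letter: keep the failed record and the reason, for later replay.
                dead_letter.append({"record": record, "error": str(exc)})
                return None
            # Exponential backoff before the next attempt.
            time.sleep(base_delay * 2 ** (attempt - 1))


dlq: list = []
records = [{"id": 1, "value": 5}, {"id": 2, "value": None}]
results = [run_with_retry(r, dlq) for r in records]
print(results)   # [{'id': 1, 'value': 10}, None]
print(len(dlq))  # 1
```

In a real pipeline the dead-letter sink would be a Kafka topic or object-store prefix rather than a list, and an orchestrator such as Airflow would supply the retry policy, but the control flow is the same.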
Validation — 81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 13 / 16 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (993 lines); consider splitting into references/ and linking | Warning |
| metadata_version | 'metadata' field is not a dictionary | Warning |
| license_field | 'license' field is missing | Warning |
| Total | | 13 / 16 Passed |
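The `metadata_version` and `license_field` warnings could be addressed in the SKILL.md frontmatter. The following is a hypothetical sketch: the field names beyond `metadata` and `license` are assumptions, not the validator's actual schema.

```yaml
---
name: senior-data-engineer
description: Data engineering skill for building scalable data pipelines...
metadata:        # must be a dictionary, not a scalar
  version: "1.0.0"
license: MIT     # previously missing
---
```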