Data engineering skill for building scalable data pipelines, ETL/ELT systems, and data infrastructure. Expertise in Python, SQL, Spark, Airflow, dbt, Kafka, and the modern data stack. Covers data modeling, pipeline orchestration, data quality, and DataOps. Use when designing data architectures, building data pipelines, optimizing data workflows, implementing data governance, or troubleshooting data issues.
Score: 71
Does it follow best practices? 49%
Impact: 83%
1.13x average score across 6 eval scenarios
Passed · No known issues
Optimize this skill with Tessl:

    npx tessl skill review --optimize ./engineering-team/senior-data-engineer/SKILL.md

dbt project structure and testing
| Check | Baseline | With skill |
|---|---|---|
| Staging layer exists | 100% | 100% |
| Intermediate layer exists | 100% | 100% |
| Marts layer exists | 100% | 100% |
| Staging materialized as view | 100% | 100% |
| Intermediate materialized as ephemeral | 100% | 100% |
| Incremental merge strategy | 100% | 100% |
| on_schema_change set | 100% | 100% |
| cluster_by configured | 0% | 100% |
| Watermark filter in incremental block | 100% | 100% |
| Column tests: unique and not_null | 100% | 100% |
| accepted_range test on amount | 100% | 100% |
| Recency test present | 0% | 100% |
| Source freshness config | 100% | 100% |
| Surrogate key macro used | 0% | 0% |
| Reusable macro defined | 100% | 100% |
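Several of the checks above (incremental merge strategy, `on_schema_change`, `cluster_by`, the watermark filter) map to a single dbt model config block. A minimal sketch, with hypothetical model and column names (`stg_orders`, `order_id`, `updated_at`, `order_date`):

```sql
-- models/marts/fct_orders.sql (hypothetical names)
{{
  config(
    materialized='incremental',
    incremental_strategy='merge',
    unique_key='order_id',
    on_schema_change='sync_all_columns',
    cluster_by=['order_date']
  )
}}

select
    order_id,
    customer_id,
    amount,
    order_date,
    updated_at
from {{ ref('stg_orders') }}

{% if is_incremental() %}
  -- watermark filter: only process rows newer than what is already loaded
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}
```

The test-related rows (unique/not_null, an accepted_range on amount, recency, source freshness) would live in the companion schema.yml rather than in the model file itself.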
Kafka Spark streaming pipeline with DLQ

| Check | Baseline | With skill |
|---|---|---|
| Kafka partitions=12 | 100% | 0% |
| Kafka replication-factor=3 | 100% | 100% |
| Kafka retention config | 100% | 100% |
| Kafka cleanup.policy=delete | 100% | 100% |
| Checkpoint location set | 100% | 100% |
| shuffle.partitions=12 | 100% | 0% |
| failOnDataLoss=false | 0% | 100% |
| Watermark applied | 100% | 100% |
| Delta Lake append output | 100% | 100% |
| processingTime trigger | 50% | 100% |
| foreachBatch pattern used | 100% | 100% |
| DLQ write with error columns | 100% | 100% |
| Prometheus Counter metric | 100% | 100% |
| Prometheus Gauge metric | 100% | 100% |
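The DLQ checks above come down to splitting each micro-batch into parseable records and failures, with the failures annotated with error columns before being written to a dead-letter topic or table. A pure-Python sketch of that routing logic (in a real job this would run inside a Spark `foreachBatch` function; the name `route_batch` and the `event_id` field are illustrative):

```python
import json
from datetime import datetime, timezone

def route_batch(raw_records):
    """Split raw Kafka payloads into good rows and DLQ rows.

    DLQ rows carry error columns (raw_value, error_message, failed_at),
    mirroring the 'DLQ write with error columns' check above.
    """
    good, dlq = [], []
    for raw in raw_records:
        try:
            row = json.loads(raw)
            if "event_id" not in row:
                raise ValueError("missing event_id")
            good.append(row)
        except ValueError as exc:  # json.JSONDecodeError subclasses ValueError
            dlq.append({
                "raw_value": raw,
                "error_message": str(exc),
                "failed_at": datetime.now(timezone.utc).isoformat(),
            })
    return good, dlq

good, dlq = route_batch(['{"event_id": 1}', "not-json"])
```

In the streaming job itself, the two outputs would be written as separate Delta appends inside the same `foreachBatch` call, with the checkpoint location providing restart safety.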
Airflow ETL pipeline and data contracts

| Check | Baseline | With skill |
|---|---|---|
| Uses pipeline_orchestrator.py | 50% | 100% |
| Orchestrator source/destination flags | 50% | 100% |
| Orchestrator incremental mode | 57% | 57% |
| Uses data_quality_validator.py | 50% | 100% |
| Validator checks specified | 0% | 0% |
| Airflow retries=2 | 0% | 0% |
| Airflow retry_delay=5min | 0% | 100% |
| Airflow email_on_failure=True | 100% | 100% |
| Airflow schedule `0 5 * * *` | 100% | 100% |
| Airflow catchup=False | 100% | 100% |
| Airflow tags present | 100% | 100% |
| Data contract schema section | 100% | 100% |
| Data contract SLA section | 71% | 100% |
| Data contract consumers section | 83% | 100% |
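The Airflow rows above correspond to a handful of DAG settings. A minimal sketch of the `default_args` and DAG-level options the checks look for, shown as plain values (tag names are hypothetical; the actual DAG wiring is outlined in comments):

```python
from datetime import timedelta

# default_args applied to every task in the DAG
default_args = {
    "retries": 2,                         # retry each failed task twice
    "retry_delay": timedelta(minutes=5),  # wait 5 minutes between retries
    "email_on_failure": True,             # alert when a task exhausts retries
}

# DAG-level options (normally passed straight to airflow.DAG)
SCHEDULE = "0 5 * * *"   # daily at 05:00
CATCHUP = False          # do not auto-backfill missed intervals
TAGS = ["etl", "daily"]  # hypothetical tags

# In a real DAG file:
#   with DAG("daily_etl", default_args=default_args,
#            schedule=SCHEDULE, catchup=CATCHUP, tags=TAGS) as dag:
#       ...
```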
Spark batch optimization and medallion architecture

| Check | Baseline | With skill |
|---|---|---|
| KryoSerializer configured | 0% | 100% |
| Adaptive query execution enabled | 100% | 100% |
| Executor memory 8g | 100% | 0% |
| Executor cores 4 | 0% | 100% |
| Driver memory 4g | 100% | 0% |
| Shuffle partitions 200 | 0% | 0% |
| Broadcast join for small table | 100% | 100% |
| MEMORY_AND_DISK persistence | 0% | 0% |
| Unpersist called | 0% | 0% |
| Bronze: mergeSchema=true | 0% | 100% |
| Bronze: metadata columns | 42% | 100% |
| Bronze: append mode | 0% | 100% |
| Silver: DeltaTable merge upsert | 100% | 100% |
| Gold: partitioned by date | 100% | 100% |
| All layers use Delta format | 100% | 100% |
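The tuning rows above translate directly into SparkSession configuration keys. A sketch of those settings as a plain dict, with the resource values taken from the table (broker of the session builder shown in comments; this is illustrative, not a sizing recommendation):

```python
# Spark batch-job settings corresponding to the checks above
spark_conf = {
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    "spark.sql.adaptive.enabled": "true",   # adaptive query execution (AQE)
    "spark.executor.memory": "8g",
    "spark.executor.cores": "4",
    "spark.driver.memory": "4g",
    "spark.sql.shuffle.partitions": "200",
}

# In a real job:
#   builder = SparkSession.builder.appName("batch_job")
#   for key, value in spark_conf.items():
#       builder = builder.config(key, value)
#   spark = builder.getOrCreate()
# Reused DataFrames would be cached with
# persist(StorageLevel.MEMORY_AND_DISK) and released with unpersist()
# once the downstream (silver/gold) writes complete.
```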
SCD Type 2 dimensional modeling in dbt

| Check | Baseline | With skill |
|---|---|---|
| Surrogate key via generate_surrogate_key | 50% | 70% |
| incremental materialization | 100% | 100% |
| strategy='check' with check_cols | 0% | 0% |
| effective_start_date column | 100% | 100% |
| effective_end_date = 9999-12-31 | 0% | 100% |
| is_current boolean | 100% | 100% |
| row_hash for change detection | 20% | 100% |
| is_incremental() filter block | 100% | 100% |
| Natural key retained | 100% | 100% |
| Staging materialized as view | 0% | 100% |
| Tests: unique and not_null on surrogate key | 100% | 100% |
| Tests: recency on updated_at or effective_start_date | 0% | 0% |
| Macro file created | 100% | 100% |
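Most of these rows describe a hand-rolled SCD2 incremental model (note that `strategy='check'` with `check_cols` is dbt's built-in snapshot configuration, an alternative to rolling your own). A sketch of the incremental variant, with hypothetical model and column names (`stg_customers`, `customer_id`, `updated_at`); closing out superseded rows (setting `effective_end_date` and `is_current = false` on the previous version) needs an additional merge or post-hook not shown here:

```sql
-- models/marts/dim_customers_scd2.sql (hypothetical names)
{{
  config(
    materialized='incremental',
    unique_key='customer_sk'
  )
}}

with source as (
    select * from {{ ref('stg_customers') }}
    {% if is_incremental() %}
      where updated_at > (select max(effective_start_date) from {{ this }})
    {% endif %}
)

select
    {{ dbt_utils.generate_surrogate_key(['customer_id', 'updated_at']) }}
        as customer_sk,
    customer_id,                                  -- natural key retained
    customer_name,
    email,
    {{ dbt_utils.generate_surrogate_key(['customer_name', 'email']) }}
        as row_hash,                              -- change detection
    updated_at as effective_start_date,
    cast('9999-12-31' as date) as effective_end_date,  -- open-ended current row
    true as is_current
from source
```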
Exactly-once Kafka producer and idempotent Airflow backfill

| Check | Baseline | With skill |
|---|---|---|
| acks='all' | 100% | 100% |
| enable_idempotence=True | 100% | 100% |
| max_in_flight=5 | 100% | 100% |
| retries=max int | 100% | 100% |
| transactional_id set | 100% | 100% |
| init_transactions called | 100% | 0% |
| begin/commit/abort pattern | 100% | 100% |
| send_offsets_to_transaction | 100% | 100% |
| catchup=True | 100% | 100% |
| max_active_runs set | 100% | 100% |
| Processes by execution_date not today | 100% | 100% |
| Idempotent load pattern | 100% | 100% |
| Loads to date partition | 100% | 100% |
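The producer rows above are the standard exactly-once settings. A sketch of the configuration using confluent-kafka-style property names (the broker address and `transactional.id` are hypothetical; the transactional call flow is outlined in comments):

```python
import sys

# Producer properties for exactly-once delivery (confluent-kafka style)
producer_conf = {
    "bootstrap.servers": "localhost:9092",       # assumed broker address
    "acks": "all",                               # wait for all in-sync replicas
    "enable.idempotence": True,                  # broker-side dedupe on retry
    "max.in.flight.requests.per.connection": 5,  # max allowed with idempotence
    "retries": sys.maxsize,                      # retry until delivered or timed out
    "transactional.id": "orders-producer-1",     # hypothetical stable id
}

# Transactional flow (Producer API calls, outlined):
#   producer.init_transactions()
#   producer.begin_transaction()
#   ... produce(...); then, for consume-transform-produce loops:
#   producer.send_offsets_to_transaction(offsets, consumer_group_metadata)
#   producer.commit_transaction()   # or abort_transaction() on failure
```

The Airflow half of this scenario pairs these settings with `catchup=True`, a bounded `max_active_runs`, and tasks that read/write the partition for their `execution_date` rather than "today", so re-running any interval overwrites the same date partition idempotently.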