
spark-optimization

Optimize Apache Spark jobs with partitioning, caching, shuffle optimization, and memory tuning. Use when improving Spark performance, debugging slow jobs, or scaling data processing pipelines.

Score: 77 (1.28x)
Quality: 71% (Does it follow best practices?)
Impact: 77%, 1.28x (average score across 3 eval scenarios)
Security by Snyk: Passed (no known issues)

Optimize this skill with Tessl

npx tessl skill review --optimize ./tests/ext_conformance/artifacts/agents-wshobson/data-engineering/skills/spark-optimization/SKILL.md

Evaluation results

Sales Analytics Pipeline Optimization

Join optimization and session configuration

Score with context: 100% (change versus without context: 24%)

| Criteria | Without context | With context |
| --- | --- | --- |
| AQE enabled | 100% | 100% |
| AQE coalesce partitions | 100% | 100% |
| AQE skew join | 100% | 100% |
| Kryo serializer | 100% | 100% |
| Shuffle partitions | 100% | 100% |
| Broadcast join hint | 100% | 100% |
| Broadcast threshold config | 100% | 100% |
| Shuffle compression | 0% | 100% |
| Compression codec lz4 | 0% | 100% |
| mergeSchema false | 100% | 100% |
| Column pruning | 0% | 100% |
| maxPartitionBytes config | 100% | 100% |
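
For orientation, the session-configuration criteria in this scenario could correspond to Spark properties along the following lines. This is a minimal sketch, not the skill's own configuration: the mapping from criterion names to property keys is an assumption, the values are illustrative (several are simply Spark's documented defaults), and `to_submit_args` is a hypothetical helper. The "mergeSchema false" criterion may instead refer to the per-read Parquet option rather than the session property shown here.

```python
# Assumed mapping from the criteria above to Spark properties.
# Values are illustrative only and should be tuned per workload.
SPARK_CONF = {
    "spark.sql.adaptive.enabled": "true",                     # AQE enabled
    "spark.sql.adaptive.coalescePartitions.enabled": "true",  # AQE coalesce partitions
    "spark.sql.adaptive.skewJoin.enabled": "true",            # AQE skew join
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",  # Kryo serializer
    "spark.sql.shuffle.partitions": "200",                    # Shuffle partitions
    "spark.sql.autoBroadcastJoinThreshold": "10485760",       # Broadcast threshold (10 MB)
    "spark.shuffle.compress": "true",                         # Shuffle compression
    "spark.io.compression.codec": "lz4",                      # Compression codec lz4
    "spark.sql.parquet.mergeSchema": "false",                 # mergeSchema false
    "spark.sql.files.maxPartitionBytes": "134217728",         # maxPartitionBytes (128 MB)
}


def to_submit_args(conf):
    """Render a property mapping as a flat list of spark-submit --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args.extend(["--conf", f"{key}={value}"])
    return args
```

The resulting list can be appended to a `spark-submit` command line; the same key/value pairs could equally be passed through `SparkSession.builder.config(key, value)`.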

Multi-Feature ML Feature Engineering Pipeline

Caching, persistence, and iterative pipeline patterns

Score with context: 77% (change versus without context: 2%)

| Criteria | Without context | With context |
| --- | --- | --- |
| MEMORY_AND_DISK cache | 100% | 100% |
| Cache before multiple actions | 100% | 100% |
| Unpersist after use | 100% | 50% |
| Checkpoint used | 0% | 0% |
| Checkpoint dir set | 0% | 0% |
| approx_count_distinct used | 100% | 100% |
| No Python UDFs | 100% | 100% |
| No large collect | 100% | 100% |
| No count for existence | 100% | 100% |
| AQE enabled | 100% | 100% |
| Kryo serializer | 0% | 100% |

E-Commerce Order Attribution Pipeline with Skewed Data

Data skew handling and partitioning strategy

Score with context: 54% (change versus without context: -17%)

| Criteria | Without context | With context |
| --- | --- | --- |
| Skew detection | 100% | 100% |
| Salt column added | 100% | 0% |
| Salted key column | 50% | 0% |
| Other side exploded | 100% | 0% |
| AQE skew factor | 0% | 100% |
| AQE skew threshold | 100% | 100% |
| Coalesce not repartition | 0% | 0% |
| Repartition with key | 100% | 0% |
| Write with partitionBy | 100% | 100% |
| Memory configuration | 100% | 100% |
| Partition count config | 100% | 100% |
| Parquet snappy | 0% | 100% |
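
The three salting criteria in this scenario (salt column added, salted key column, other side exploded) describe the usual salted-join pattern for skewed keys. The sketch below illustrates the idea in plain Python rather than Spark, so it shows the logic only and not the evaluated code. All names (`NUM_SALTS`, `salt_left`, `explode_right`, `hash_join`, the sample rows) are hypothetical, and the salt is assigned round-robin by row index to keep the example deterministic, whereas a Spark job would more typically use a random salt.

```python
from collections import defaultdict

NUM_SALTS = 4  # hypothetical number of salt buckets


def salt_left(rows, num_salts=NUM_SALTS):
    """Skewed side: add a salt column and a salted key to every row."""
    return [
        {**row, "salt": i % num_salts, "salted_key": f"{row['key']}_{i % num_salts}"}
        for i, row in enumerate(rows)
    ]


def explode_right(rows, num_salts=NUM_SALTS):
    """Other side: replicate every row once per possible salt value."""
    return [
        {**row, "salt": s, "salted_key": f"{row['key']}_{s}"}
        for row in rows
        for s in range(num_salts)
    ]


def hash_join(left, right, on):
    """Inner join of two lists of dicts on a single column."""
    index = defaultdict(list)
    for r in right:
        index[r[on]].append(r)
    return [{**r, **l} for l in left for r in index.get(l[on], [])]


# Six orders share the hot key "hot"; one order uses the key "cold".
orders = [{"key": "hot", "order_id": i} for i in range(6)]
orders.append({"key": "cold", "order_id": 6})
campaigns = [{"key": "hot", "campaign": "A"}, {"key": "cold", "campaign": "B"}]

plain = hash_join(orders, campaigns, on="key")
salted = hash_join(salt_left(orders), explode_right(campaigns), on="salted_key")
```

Both joins produce the same order-to-campaign pairs, but in the salted version the hot key is spread over several distinct join keys, which is what lets a distributed engine split the work across partitions. The cost is that the smaller side grows by a factor of `NUM_SALTS`.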

Repository: Dicklesworthstone/pi_agent_rust
Evaluated
Agent: Claude Code
Model: Claude Sonnet 4.6
