Optimize Apache Spark jobs with partitioning, caching, shuffle optimization, and memory tuning. Use when improving Spark performance, debugging slow jobs, or scaling data processing pipelines.
Install with Tessl CLI
npx tessl i github:Dicklesworthstone/pi_agent_rust --skill spark-optimization
83
Does it follow best practices?
If you maintain this skill, you can automatically optimize it using the tessl CLI to improve its score:
npx tessl skill review --optimize ./path/to/skill
Validation for skill structure
Join optimization and session configuration
AQE enabled · 100% · 100%
AQE coalesce partitions · 100% · 100%
AQE skew join · 100% · 100%
Kryo serializer · 100% · 100%
Shuffle partitions · 100% · 100%
Broadcast join hint · 100% · 100%
Broadcast threshold config · 100% · 100%
Shuffle compression · 0% · 100%
Compression codec lz4 · 0% · 100%
mergeSchema false · 100% · 100%
Column pruning · 0% · 100%
maxPartitionBytes config · 100% · 100%
Without context: $0.3165 · 1m 32s · 12 turns · 54 in / 5,112 out tokens
With context: $0.4619 · 2m 8s · 16 turns · 254 in / 6,455 out tokens
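The criteria in this scenario map onto standard Spark configuration keys. A minimal sketch, assuming PySpark, of how such settings might be collected and applied to a session builder; the values are illustrative placeholders, and `SESSION_CONF` and `apply_conf` are hypothetical names that do not come from the skill itself.

```python
# Illustrative values only; tune them for the actual cluster and data volume.
SESSION_CONF = {
    # Adaptive Query Execution (AQE), partition coalescing, skew-join handling
    "spark.sql.adaptive.enabled": "true",
    "spark.sql.adaptive.coalescePartitions.enabled": "true",
    "spark.sql.adaptive.skewJoin.enabled": "true",
    # Kryo serialization
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    # Shuffle partition count and shuffle compression
    "spark.sql.shuffle.partitions": "200",
    "spark.shuffle.compress": "true",
    "spark.io.compression.codec": "lz4",
    # Tables below this size (bytes) are broadcast automatically in joins
    "spark.sql.autoBroadcastJoinThreshold": str(10 * 1024 * 1024),
    # Input split size (bytes) and Parquet schema merging
    "spark.sql.files.maxPartitionBytes": str(128 * 1024 * 1024),
    "spark.sql.parquet.mergeSchema": "false",
}


def apply_conf(builder, conf=SESSION_CONF):
    """Apply each key/value pair to a SparkSession.builder-like object."""
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder
```

With PySpark installed, this could be used as `spark = apply_conf(SparkSession.builder.appName("job")).getOrCreate()`. The two criteria that are not configuration keys are expressed in query code instead: a broadcast join hint via `large_df.join(broadcast(small_df), "key")` (from `pyspark.sql.functions`), and column pruning by calling `select(...)` on only the needed columns before joining or aggregating.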
Caching, persistence, and iterative pipeline patterns
MEMORY_AND_DISK cache · 100% · 100%
Cache before multiple actions · 100% · 100%
Unpersist after use · 100% · 50%
Checkpoint used · 0% · 0%
Checkpoint dir set · 0% · 0%
approx_count_distinct used · 100% · 100%
No Python UDFs · 100% · 100%
No large collect · 100% · 100%
No count for existence · 100% · 100%
AQE enabled · 100% · 100%
Kryo serializer · 0% · 100%
Without context: $0.2389 · 1m 14s · 10 turns · 10 in / 3,789 out tokens
With context: $0.5522 · 2m 6s · 22 turns · 20 in / 6,428 out tokens
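Several of the criteria above (cache before multiple actions, unpersist after use, no count for existence) describe a usage pattern rather than a setting. A minimal sketch of that pattern, assuming a PySpark-style DataFrame that exposes `persist`, `unpersist`, and `take`; `run_cached` and `has_rows` are hypothetical helper names introduced here for illustration.

```python
def run_cached(df, actions, storage_level):
    """Persist df once, run every action against the cached copy, then release it."""
    cached = df.persist(storage_level)
    try:
        return [action(cached) for action in actions]
    finally:
        # Free executor memory and disk even if one of the actions raises.
        cached.unpersist()


def has_rows(df):
    """Existence check that fetches at most one row instead of counting all rows."""
    return len(df.take(1)) > 0
```

With PySpark, a call might look like `run_cached(df, [lambda d: d.count(), lambda d: d.agg(approx_count_distinct("user_id")).collect()], StorageLevel.MEMORY_AND_DISK)`, where `user_id` is a placeholder column name. For the checkpoint criteria, the usual sequence is to set a directory with `spark.sparkContext.setCheckpointDir(...)` and then call `df.checkpoint()` to truncate a long lineage in iterative pipelines.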
Data skew handling and partitioning strategy
Skew detection · 100% · 100%
Salt column added · 100% · 0%
Salted key column · 50% · 0%
Other side exploded · 100% · 0%
AQE skew factor · 0% · 100%
AQE skew threshold · 100% · 100%
Coalesce not repartition · 0% · 0%
Repartition with key · 100% · 0%
Write with partitionBy · 100% · 100%
Memory configuration · 100% · 100%
Partition count config · 100% · 100%
Parquet snappy · 0% · 100%
Without context: $0.5846 · 2m 56s · 20 turns · 21 in / 9,796 out tokens
With context: $0.8312 · 3m 49s · 25 turns · 24 in / 12,488 out tokens
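The salting criteria above (salt column added, salted key column, other side exploded) refer to a common skew-join technique. The sketch below models it in plain Python on lists of `(key, value)` pairs rather than through the Spark API, so that the logic can be checked in isolation; all function names are hypothetical and the salt count is an arbitrary assumption.

```python
import random


def salt_skewed_side(rows, num_salts, rng):
    """Attach a random salt to each key so one hot key spreads over num_salts buckets."""
    return [((key, rng.randrange(num_salts)), value) for key, value in rows]


def explode_other_side(rows, num_salts):
    """Replicate each row once per salt value so every salted key still finds its match."""
    return [((key, salt), value) for key, value in rows for salt in range(num_salts)]


def hash_join(left, right):
    """Plain inner join on exact key equality; returns (key, left_value, right_value)."""
    index = {}
    for key, value in right:
        index.setdefault(key, []).append(value)
    return [(key, lv, rv) for key, lv in left for rv in index.get(key, [])]


def salted_join(skewed, other, num_salts, rng):
    """Join on (key, salt), then drop the salt to recover the original key."""
    joined = hash_join(
        salt_skewed_side(skewed, num_salts, rng),
        explode_other_side(other, num_salts),
    )
    return [(key, lv, rv) for (key, _salt), lv, rv in joined]
```

The salted join returns the same rows as the plain join; only the distribution of work per join key changes. In PySpark the same steps are typically written with `rand()` and `floor()` for the salt column and `explode(array(...))` on the other side. The two AQE criteria correspond to the settings `spark.sql.adaptive.skewJoin.skewedPartitionFactor` and `spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes`, and Parquet snappy to `spark.sql.parquet.compression.codec`.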