why use many token when few do trick
A skill that makes AI coding agents respond in compressed caveman-style prose — cutting ~65% of output tokens while keeping full technical accuracy.
```
tessl install juliusbrussee/caveman
```

Works with Claude Code, Cursor, Codex, Gemini, Copilot, and more.
Normal Claude (69 tokens):
"The reason your React component is re-rendering is likely because you're creating a new object reference on each render cycle. When you pass an inline object as a prop, React's shallow comparison sees it as a different object every time, which triggers a re-render. I'd recommend using useMemo to memoize the object."
Caveman Claude (19 tokens):
"New object ref each render. Inline object prop = new ref = re-render. Wrap in
useMemo."
Same fix. 72% less word. Brain still big.
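For concreteness, here is a minimal sketch of the bug and fix both answers describe; the component and prop names are hypothetical, not taken from the skill:

```tsx
import React, { memo, useMemo } from "react";

// Hypothetical memoized child: re-renders only when props change under
// React's shallow comparison.
const Child = memo(function Child({ style }: { style: React.CSSProperties }) {
  return <div style={style}>hello</div>;
});

// Bug: `{{ color }}` builds a fresh object on every render, so the child's
// shallow prop comparison always sees a new reference and re-renders.
function Parent({ color }: { color: string }) {
  return <Child style={{ color }} />;
}

// Fix: memoize the object so its reference is stable until `color` changes.
function ParentFixed({ color }: { color: string }) {
  const style = useMemo(() => ({ color }), [color]);
  return <Child style={style} />;
}
```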
Switch anytime with /caveman lite, /caveman ultra, etc.
38 Tessl task eval scenarios test whether caveman degrades technical correctness: 35 coding problems across 10 languages (JS, TS, Python, Go, Rust, Java, CSS, SQL, HCL, YAML) plus 3 negative cases. Each is scored against a weighted technical checklist: zero style points, only facts.
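As a rough illustration of what a weighted technical checklist means here, a minimal scoring sketch; the check names, weights, and data shape are assumptions for illustration, not Tessl's actual eval schema:

```ts
// Illustrative only: not Tessl's real scoring code or schema.
interface Check {
  fact: string;     // the technical fact the answer must contain
  weight: number;   // relative importance of this fact
  passed: boolean;  // did the answer get it right?
}

// Score = earned weight / total weight; style never enters the number.
function score(checks: Check[]): number {
  const total = checks.reduce((sum, c) => sum + c.weight, 0);
  const earned = checks.reduce((sum, c) => sum + (c.passed ? c.weight : 0), 0);
  return earned / total;
}

// Example checklist for the React re-render scenario above.
const result = score([
  { fact: "identifies new object reference per render", weight: 3, passed: true },
  { fact: "links inline prop to shallow comparison", weight: 2, passed: true },
  { fact: "recommends useMemo or equivalent memoization", weight: 3, passed: false },
]);
console.log(result); // 0.625 (5 of 8 weight points earned)
```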
19 independent runs across 4 agents:
| Agent | Runs | Baseline | Caveman | Delta |
|---|---|---|---|---|
| Claude Sonnet 4.6 | 10 | 97.6% | 96.5% | -1.1 |
| Cursor Composer 2 | 3 | 97.7% | 96.7% | -1.0 |
| Codex GPT-5.4 | 3 | 97.0% | 96.7% | -0.3 |
| Claude Haiku 4.5 | 3 | 94.3% | 94.0% | -0.3 |
Delta never exceeds 1.1 percentage points. On some scenarios caveman scores higher than baseline: brevity forces the model to focus. Fewer word, same brain. This is consistent with research showing that brevity constraints improved accuracy by 26 percentage points on certain benchmarks.
Reproduce:
```
tessl eval run skills/caveman --agent claude:claude-sonnet-4-6 \
  --variant without-context --variant with-context
```

| Task | Normal (tokens) | Caveman (tokens) | Saved |
|---|---|---|---|
| Explain React re-render bug | 1180 | 159 | 87% |
| Fix auth middleware token expiry | 704 | 121 | 83% |
| Set up PostgreSQL connection pool | 2347 | 380 | 84% |
| Explain git rebase vs merge | 702 | 292 | 58% |
| Refactor callback to async/await | 387 | 301 | 22% |
| Architecture: microservices vs monolith | 446 | 310 | 30% |
| Review PR for security issues | 678 | 398 | 41% |
| Docker multi-stage build | 1042 | 290 | 72% |
| Debug PostgreSQL race condition | 1200 | 232 | 81% |
| Implement React error boundary | 3454 | 456 | 87% |
| Average | 1214 | 294 | 65% |
Range: 22%–87% savings across prompts. Caveman only affects output tokens — thinking/reasoning tokens are untouched.
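The "Saved" column and the 65% headline are plain arithmetic over the two token counts; a quick sketch recomputing them from the table above (values copied verbatim):

```ts
// [normal, caveman] output-token pairs, copied from the table above.
const tasks: [normal: number, caveman: number][] = [
  [1180, 159], [704, 121], [2347, 380], [702, 292], [387, 301],
  [446, 310], [678, 398], [1042, 290], [1200, 232], [3454, 456],
];

// Per-task savings, as in the "Saved" column: (1180 - 159) / 1180 ≈ 87%.
const savings = tasks.map(([n, c]) => (n - c) / n);

// The headline 65% is the unweighted mean of the per-task percentages...
const mean = savings.reduce((s, x) => s + x, 0) / savings.length; // ≈ 0.65

// ...while weighting by token volume gives an even larger saving.
const totalNormal = tasks.reduce((s, [n]) => s + n, 0);    // 12140
const totalCaveman = tasks.reduce((s, [, c]) => s + c, 0); // 2939
const weighted = (totalNormal - totalCaveman) / totalNormal; // ≈ 0.76
```

Note that the token-weighted saving (~76%) comes out higher than the 65% mean because the largest tasks compress the most.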
Full documentation, additional install options, and sub-skills (caveman-commit, caveman-review, caveman-compress) at github.com/JuliusBrussee/caveman.
Quality evals contributed by Baruch Sadogursky using Tessl eval infrastructure.