running-tests

running tests at various levels from smoke tests to full suite to randomized tests

71

1.75x
Quality: 55% (Does it follow best practices?)

Impact: 100%, 1.75x (Average score across 3 eval scenarios)

Security by Snyk: Passed (No known issues)

Evaluation results

100%

53%

Test Automation Script for LedgerTxn Changes

Ordered test runner script for module changes

| Criteria | Without context | With context |
|---|---|---|
| --ll fatal flag | 0% | 100% |
| -r simple flag | 0% | 100% |
| --disable-dots flag | 0% | 100% |
| --abort flag | 0% | 100% |
| Smoke tests first | 100% | 100% |
| LedgerTxn focused tests | 100% | 100% |
| Full suite with --all-versions | 0% | 100% |
| Failure gate between phases | 100% | 100% |
| Timing per phase | 100% | 100% |
| Ordered progression | 70% | 100% |
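
Three of the criteria above (ordered progression, a failure gate between phases, and timing per phase) describe a general runner pattern. The following is a minimal sketch of that pattern only, not the script that was evaluated. The `run_phases` helper, the phase names, and the `OK`/`FAIL` placeholder commands are all hypothetical; a real runner would substitute the project's actual test invocations.

```python
import subprocess
import sys
import time
from typing import List, Sequence, Tuple

# A phase is a (name, command) pair; both are illustrative placeholders.
Phase = Tuple[str, Sequence[str]]


def run_phases(phases: Sequence[Phase]) -> List[Tuple[str, int, float]]:
    """Run phases in order and stop at the first non-zero exit code.

    Returns one (name, exit_code, elapsed_seconds) tuple per phase that ran.
    """
    results: List[Tuple[str, int, float]] = []
    for name, cmd in phases:
        start = time.monotonic()
        code = subprocess.run(list(cmd)).returncode
        elapsed = time.monotonic() - start
        results.append((name, code, elapsed))
        print(f"[{name}] exit={code} elapsed={elapsed:.2f}s")
        if code != 0:
            # Failure gate: skip the later, more expensive phases.
            break
    return results


# Placeholder commands standing in for real test invocations (assumption).
OK = [sys.executable, "-c", "raise SystemExit(0)"]
FAIL = [sys.executable, "-c", "raise SystemExit(1)"]

if __name__ == "__main__":
    outcome = run_phases([("smoke", OK), ("focused", FAIL), ("full-suite", OK)])
    # Exit non-zero if the last phase that ran failed.
    sys.exit(outcome[-1][1])
```

Ordering the phases from cheapest to most expensive means a failure in a quick phase is reported before any time is spent on the full suite, and the per-phase timings show where the run spends its time.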

100%

20%

Transaction Processing Test Plan with Baseline Validation

Transaction change test plan with tx-meta baseline validation

| Criteria | Without context | With context |
|---|---|---|
| tx-meta baseline command present | 100% | 100% |
| Exact rng-seed for baseline | 100% | 100% |
| Correct baseline target | 0% | 100% |
| --all-versions in baseline check | 100% | 100% |
| [tx] tag filter in baseline | 100% | 100% |
| Baseline check after full suite | 100% | 100% |
| --ll fatal in baseline command | 100% | 100% |
| --abort in baseline command | 0% | 100% |
| -r simple in baseline command | 100% | 100% |
| Ordered test phases | 100% | 100% |
| Baseline mismatch guidance | 100% | 100% |

100%

54%

Memory Safety Validation for BucketList Refactor

Sanitizer validation script for memory-sensitive changes

| Criteria | Without context | With context |
|---|---|---|
| Standard tests before sanitizers | 100% | 100% |
| ASan configure flag | 20% | 100% |
| TSan configure flag | 20% | 100% |
| make clean between sanitizer builds | 80% | 100% |
| Parallel build with nproc | 100% | 100% |
| --disable-tests build verification | 33% | 100% |
| --enable-ccache in configure calls | 0% | 100% |
| --enable-sdfprefs in configure calls | 0% | 100% |
| Quiet output flags in test commands | 0% | 100% |
| Failure gate between phases | 100% | 100% |

Repository: stellar/stellar-core
Evaluated
Agent: Claude Code
Model: Claude Sonnet 4.6
