running-tests

Running tests at various levels, from smoke tests to the full suite to randomized tests.

Quality: 55% (Does it follow best practices?)

Impact: 100%, 1.75x (average score across 3 eval scenarios)

Security: Passed, no known issues

Fix and improve this skill with Tessl

tessl review fix ./.claude/skills/running-tests/SKILL.md

Evaluation results

100%

54%

Memory Safety Validation for BucketList Refactor

Sanitizer validation script for memory-sensitive changes

Criteria (Baseline / With context)

Standard tests before sanitizers: 100%
ASan configure flag: 20% / 100%
TSan configure flag: 20% / 100%
make clean between sanitizer builds: 80% / 100%
Parallel build with nproc: 100%
--disable-tests build verification: 33% / 100%
--enable-ccache in configure calls: 100%
--enable-sdfprefs in configure calls: 100%
Quiet output flags in test commands: 100%
Failure gate between phases: 100%
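
The criteria for this scenario describe a build-and-test sequence rather than a single command. The sketch below shows one possible way to lay that sequence out as an ordered plan. It is an illustration under stated assumptions, not the skill's actual script: the sanitizer flag names are passed in by the caller because this page does not state them, and the binary path `./src/stellar-core`, the `test` subcommand, and the helper names `build_steps` and `sanitizer_plan` are hypothetical. `os.cpu_count()` stands in for `nproc`.

```python
import os

# Configure flags named in the criteria above; added to every configure call.
COMMON_CONFIGURE = ["--enable-ccache", "--enable-sdfprefs"]

# Assumption: binary location and `test` subcommand. The quiet-output flags
# are the ones listed in the criteria for the other scenarios on this page.
TEST_CMD = ["./src/stellar-core", "test",
            "--ll", "fatal", "-r", "simple", "--disable-dots"]


def build_steps(label, extra_flag, jobs, run_tests=True):
    """Return (name, command) pairs for one clean, configure, build cycle."""
    steps = [
        (f"{label}: clean", ["make", "clean"]),
        (f"{label}: configure", ["./configure", *COMMON_CONFIGURE, extra_flag]),
        (f"{label}: build", ["make", f"-j{jobs}"]),
    ]
    if run_tests:
        steps.append((f"{label}: test", list(TEST_CMD)))
    return steps


def sanitizer_plan(asan_flag, tsan_flag, jobs=None):
    """Standard tests first, then ASan, TSan, and a --disable-tests build."""
    jobs = jobs or os.cpu_count() or 1
    # Assumes a regular (non-sanitizer) build already exists for this step.
    plan = [("standard: test", list(TEST_CMD))]
    plan += build_steps("asan", asan_flag, jobs)
    plan += build_steps("tsan", tsan_flag, jobs)
    # Build verification only: there are no tests to run in this configuration.
    plan += build_steps("no-tests", "--disable-tests", jobs, run_tests=False)
    return plan
```

A caller would execute the plan in order and stop at the first non-zero exit code, which is the failure gate the criteria refer to.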

53%

Test Automation Script for LedgerTxn Changes

Ordered test runner script for module changes

Criteria (Baseline / With context)

--ll fatal flag: 100%
-r simple flag: 100%
--disable-dots flag: 100%
--abort flag: 100%
Smoke tests first: 100%
LedgerTxn focused tests: 100%
Full suite with --all-versions: 100%
Failure gate between phases: 100%
Timing per phase: 100%
Ordered progression: 70% / 100%
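
Taken together, these criteria describe an ordered runner: phases execute in sequence, each is timed, and a failure stops the run. The following is a minimal sketch of that shape, not the skill's actual script. The flags come from the criteria above; the binary path `./src/stellar-core`, the `test` subcommand, the argument order, and the helper names `test_command` and `run_phases` are assumptions made for illustration.

```python
import subprocess
import time

# Flags named in the criteria above.
QUIET_FLAGS = ["--ll", "fatal", "-r", "simple", "--disable-dots", "--abort"]

# Assumption: location of the built binary.
BINARY = "./src/stellar-core"


def test_command(test_filter=None, all_versions=False):
    """Build the argument list for one test phase."""
    cmd = [BINARY, "test"]
    if test_filter:
        cmd.append(test_filter)
    cmd += QUIET_FLAGS
    if all_versions:
        cmd.append("--all-versions")
    return cmd


def run_phases(phases, runner=None):
    """Run (name, command) phases in order, timing each one.

    Stops at the first non-zero exit code (the failure gate) and returns
    a list of (name, exit_code, seconds) for the phases that ran.
    """
    if runner is None:
        runner = lambda cmd: subprocess.run(cmd).returncode
    results = []
    for name, cmd in phases:
        start = time.monotonic()
        code = runner(cmd)
        elapsed = time.monotonic() - start
        results.append((name, code, elapsed))
        print(f"{name}: exit code {code} after {elapsed:.1f}s")
        if code != 0:
            break  # failure gate: later phases are skipped
    return results
```

The `runner` parameter exists so the gate logic can be exercised without a built binary. In real use, the phase list would hold a smoke-test command, a LedgerTxn-focused command, and a full-suite command built with `all_versions=True`, in that order; the test filters for the first two are not given on this page.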

20%

Transaction Processing Test Plan with Baseline Validation

Transaction change test plan with tx-meta baseline validation

Criteria (Baseline / With context)

tx-meta baseline command present: 100%
Exact rng-seed for baseline: 100%
Correct baseline target: 100%
--all-versions in baseline check: 100%
[tx] tag filter in baseline: 100%
Baseline check after full suite: 100%
--ll fatal in baseline command: 100%
--abort in baseline command: 100%
-r simple in baseline command: 100%
Ordered test phases: 100%
Baseline mismatch guidance: 100%

Repository: stellar/stellar-core
Commit: 1b0eccd

Evaluated: about 2 months ago
Agent: Claude Code
Model: Claude Sonnet 4.6

