Converts a model checker counterexample trace into an executable test case in the source language, so the bug found in the model is reproducible (and regression-guarded) in the real code. Use when TLC/NuSMV/Spin finds a violation and you want a failing test before writing the fix.
Install with Tessl CLI
npx tessl i github:santosomar/general-secure-coding-agent-skills --skill counterexample-to-test-generator90
Quality
88%
Does it follow best practices?
Impact
Pending
No eval scenarios have been run
A model checker trace is a proof that the bug exists — in the model. A test is a proof it exists in the code. The translation is: map each model action back to a code operation, sequence them, and assert the invariant.
This is the model-checking analogue of → bug-reproduction-test-generator.
| Model trace element | Test element |
|---|---|
| Initial state | Test setup / fixture |
Action(p) firing | Call the code function that Action models, as process p |
| Interleaving of actions | Thread scheduling — controlled or forced |
| Final (violating) state | Assertion on the corresponding code state |
The hard part is interleaving. The model checker found a specific schedule that breaks things. Your test has to reproduce that schedule.
| Determinism level | Technique | Reliability |
|---|---|---|
| Full control | Single-threaded simulation: call each step explicitly in order | 100% — preferred |
| Partial control | Barriers/latches between steps to force ordering | High if used carefully |
| No control | Run threads concurrently many times, hope for the interleaving | Flaky — last resort |
Always try single-threaded first. If the model actions are Read(p1), Read(p2), Write(p1), Write(p2), you don't need real threads — just call p1.read(); p2.read(); p1.write(); p2.write(); in sequence. The interleaving that matters is the order of operations on shared state, not actual OS threads.
program-to-tlaplus-spec-generator (or equivalent extraction).\E p : Foo(p), use the p TLC chose.Inv == counter = N becomes assert counter == N.TLC trace (from the lost-update counter):
State 1: counter=0, tmp=[p1|->0, p2|->0], pc=[p1|->"start", p2|->"start"]
State 2: <Read(p1)> counter=0, tmp=[p1|->0, p2|->0], pc=[p1|->"write", p2|->"start"]
State 3: <Read(p2)> counter=0, tmp=[p1|->0, p2|->0], pc=[p1|->"write", p2|->"write"]
State 4: <Write(p1)> counter=1, ...
State 5: <Write(p2)> counter=1 ← Inv violated: 2 increments, counter should be 2Action sequence: Read(p1), Read(p2), Write(p1), Write(p2).
Code mapping:
Read(p) ↔ first half of increment(): tmp = counterWrite(p) ↔ second half of increment(): lock(); counter = tmp + 1; unlock()Problem: increment() is one function in code, two actions in the model. To reproduce the interleaving, split it:
Test (Go):
func TestLostUpdate_FromTLC(t *testing.T) {
counter := 0
var mu sync.Mutex
// Simulate the TLC trace: Read(p1), Read(p2), Write(p1), Write(p2)
// Split increment() into its two atomic halves.
tmp1 := counter // Read(p1) -- reads 0
tmp2 := counter // Read(p2) -- reads 0
mu.Lock(); counter = tmp1 + 1; mu.Unlock() // Write(p1) -- counter=1
mu.Lock(); counter = tmp2 + 1; mu.Unlock() // Write(p2) -- counter=1
// Invariant: two increments → counter == 2
if counter != 2 {
t.Fatalf("lost update: 2 increments, counter=%d (TLC trace reproduced)", counter)
}
}Test fails with counter=1. Bug reproduced, deterministically, no real threads.
The trace violated the model invariant but the test passes. Three reasons:
| Cause | How to tell | Fix |
|---|---|---|
| Model is more abstract than code | Code has a check the model doesn't | Fix the model — it was over-approximating |
| Wrong code mapping | You called the wrong function for an action | Re-check the extraction mapping |
| Atomicity mismatch | Model treats X as one step; code has finer interleaving | Refine the model's actions |
A test that doesn't reproduce is good news — it means the model found a bug in itself, not the code. Fix the model.
## Action sequence (from trace)
1. <Action(params)>
2. ...
## Code mapping
<Action> ↔ <function/code region>
## Schedule strategy
<single-threaded simulation | barriers | stress loop> — <why>
## Test
<code block — executable>
## Expected
FAIL before fix: <what assertion trips>
PASS after fix: <same assertion, now holds>47d56bb
If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.