
counterexample-to-test-generator

Automatically generates executable test cases from model checking counterexample traces. Translates abstract counterexample states and transitions into concrete test inputs, execution steps, and assertions that reproduce property violations. Use when working with model checker outputs (SPIN, CBMC, NuSMV, TLA+, Java PathFinder, etc.) to create regression tests, validate bug fixes, or reproduce verification failures in executable test suites.

Score: 90 (average across 3 eval scenarios)
Quality: 85% (does it follow best practices?)
Impact: 99% (1.16x)
Security (by Snyk): Passed, no known issues

Counterexample To Test Generator

Overview

This skill transforms counterexample traces from model checkers into executable test cases that reliably reproduce property violations. It bridges formal verification and testing by mapping abstract states to concrete values and generating runnable tests with clear traceability from counterexample steps to test logic.

Workflow

Step 1: Analyze Inputs

Gather and understand the counterexample trace and program:

  1. Identify the model checker format: Determine which tool produced the counterexample (SPIN, CBMC, NuSMV, TLA+, JPF, etc.). See references/model_checker_formats.md for format details.

  2. Extract key information:

    • Initial state values
    • Sequence of transitions/steps
    • Variable values at each step
    • Property violation point
    • Error condition or assertion failure
  3. Understand the program structure:

    • Entry points and function signatures
    • Input parameters and their types
    • State variables involved in the trace
    • Control flow relevant to the counterexample
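The extraction step above can be sketched as a small parser for a simplified, hypothetical trace format. Real model-checker outputs (SPIN trails, CBMC traces, NuSMV counterexamples) each have their own syntax; the `parse_trace` function and the `Step N: var = value` line format here are purely illustrative.

```python
import re

def parse_trace(text):
    """Parse a simplified, hypothetical counterexample trace into steps.

    Each step line looks like: 'Step 3: x = 5, y = -1'.
    Real model-checker formats differ; see
    references/model_checker_formats.md for the actual syntaxes.
    """
    steps = []
    for line in text.splitlines():
        m = re.match(r"Step (\d+): (.*)", line.strip())
        if not m:
            continue
        state = {}
        for assignment in m.group(2).split(","):
            name, value = assignment.split("=")
            state[name.strip()] = int(value)
        steps.append({"step": int(m.group(1)), "state": state})
    return steps

trace = """\
Step 1: x = 0, y = 0
Step 2: x = 1, y = 0
Step 3: x = 1, y = -1
"""
steps = parse_trace(trace)
print(steps[0]["state"])   # initial state
print(steps[-1]["state"])  # state at the violation point
```

From here, `steps[0]` gives the initial state and `steps[-1]` the state at the property violation, which feed directly into Steps 2-4 below.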

Step 2: Map Abstract States to Concrete Values

Translate the counterexample's abstract representation into concrete test inputs:

  1. Determine concrete values for abstract state variables:

    • Map symbolic values to concrete instances
    • Resolve non-deterministic choices to specific values
    • Handle ranges and constraints from the model
  2. Identify input sequences:

    • Extract the sequence of function calls or operations
    • Determine parameter values for each call
    • Identify timing or ordering constraints
  3. Handle environment assumptions:

    • External inputs or system calls
    • Concurrency or scheduling decisions
    • Resource states (files, network, memory)
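Concretization can be sketched as choosing a representative value that satisfies the model's constraints and recording why it was chosen. The `(low, high)` bound-pair encoding and the `concretize` helper below are assumptions for illustration, not a standard API; the point is that every resolved non-deterministic choice should carry a documented justification.

```python
def concretize(bounds, default=0):
    """Pick a representative concrete value for a symbolic variable.

    `bounds` is a hypothetical (low, high) open-interval pair from the
    model, with None meaning unbounded. Returns (value, note) so the
    generated test can document the assumption it bakes in.
    """
    low, high = bounds
    if low is None and high is None:
        return default, "unconstrained; using neutral default"
    if low is None:
        return high - 1, f"any value < {high}; chose {high - 1}"
    if high is None:
        return low + 1, f"any value > {low}; chose {low + 1}"
    mid = (low + high) // 2
    return mid, f"range ({low}, {high}); chose midpoint {mid}"

# Counterexample says: x > 10, y unconstrained, z in (0, 4)
for name, bounds in {"x": (10, None), "y": (None, None), "z": (0, 4)}.items():
    value, note = concretize(bounds)
    print(f"{name} = {value}  # {note}")
```

The generated notes can be emitted as comments next to each test input, satisfying the "document any assumptions" practice below.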

Step 3: Generate Test Structure

Create the test case framework in the target language:

  1. Choose test framework based on the program language:

    • C/C++: Google Test, CUnit, Check
    • Java: JUnit, TestNG
    • Python: pytest, unittest
    • C#: NUnit, xUnit
  2. Structure the test:

    • Setup phase: Initialize state to match counterexample start
    • Execution phase: Replay the counterexample sequence
    • Assertion phase: Verify the property violation occurs
  3. Add traceability comments: Map each test step to counterexample steps for debugging and maintenance.
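Under this structure, a minimal pytest-style test for a hypothetical bounded-counter system might look like the sketch below. The `Counter` class, the `LIMIT` property, and the CE step numbers are all invented for illustration; the setup/execution/assertion phases and traceability comments are the pattern to reproduce.

```python
# Hypothetical system under test: a bounded counter whose model-checked
# safety property "count <= LIMIT" was violated in the counterexample.
LIMIT = 2

class Counter:
    def __init__(self):
        self.count = 0

    def increment(self):
        self.count += 1  # bug: no bound check, so the property can fail

def test_reproduce_limit_violation():
    # --- Setup phase: match CE initial state (count = 0) ---
    c = Counter()

    # --- Execution phase: replay the CE transition sequence ---
    c.increment()  # CE step 1: count -> 1
    c.increment()  # CE step 2: count -> 2
    c.increment()  # CE step 3: count -> 3 (violation)

    # --- Assertion phase: verify the safety property is violated ---
    assert c.count > LIMIT, "expected the CE property violation to reproduce"

test_reproduce_limit_violation()  # pytest would also discover this test
print("violation reproduced")
```

Note that the assertion checks that the violation *does* occur: the test passes when the bug is present and fails once the fix restores the property, which is the desired behavior for a regression test derived from a counterexample.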

Step 4: Implement Test Logic

Write the executable test code:

  1. Setup code:

    // Initialize variables to counterexample initial state
    // Set up test fixtures or mocks
    // Configure environment (if needed)
  2. Execution sequence:

    // Step 1 (CE line X): [description]
    // Execute operation with concrete values
    
    // Step 2 (CE line Y): [description]
    // Execute next operation
  3. Assertions:

    // Verify property violation (CE line Z)
    // Assert expected failure condition
    // Check final state matches counterexample
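Filled in, that template could read as follows for a hypothetical CBMC-style counterexample ending in a division by zero. The `average` function and the CE line numbers are invented for illustration; the structure mirrors the comment skeleton above.

```python
def average(total, n):
    return total / n  # system under test; the checker flagged n == 0

def test_reproduce_division_by_zero():
    # Setup (CE line 1): initial state total = 0, n = 0
    total, n = 0, 0

    # Step 1 (CE line 4): the non-deterministic input sequence is
    # concretized to the empty sequence, so n stays 0

    # Verify property violation (CE line 7): the division must fail
    try:
        average(total, n)
    except ZeroDivisionError:
        return  # violation reproduced as expected
    raise AssertionError("expected ZeroDivisionError; CE not reproduced")

test_reproduce_division_by_zero()
print("division-by-zero violation reproduced")
```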

Step 5: Generate Output

Produce the complete test case with documentation:

  1. Test file: Complete, compilable/runnable test code
  2. Mapping document: Table linking counterexample steps to test lines
  3. Execution instructions: How to compile and run the test
  4. Expected behavior: What the test should demonstrate (failure reproduction)

Example Workflow

Input: SPIN counterexample showing a deadlock in a concurrent system

Output:

```c
// test_deadlock.c - Reproduces deadlock from SPIN counterexample trail
#include <pthread.h>
#include <unistd.h>

// Counterexample mapping:
// CE Step 1-2: Thread 1 acquires lock A
// CE Step 3-4: Thread 2 acquires lock B
// CE Step 5: Thread 1 waits for lock B (blocks)
// CE Step 6: Thread 2 waits for lock A (deadlock)

static pthread_mutex_t lock_a;
static pthread_mutex_t lock_b;

void* thread1_func(void* arg) {
    pthread_mutex_lock(&lock_a);   // CE Step 1
    sleep(1);                      // CE Step 2 (timing)
    pthread_mutex_lock(&lock_b);   // CE Step 5 (blocks)
    // ... rest of test
    return NULL;
}

void* thread2_func(void* arg) {
    pthread_mutex_lock(&lock_b);   // CE Step 3
    sleep(1);                      // CE Step 4 (timing)
    pthread_mutex_lock(&lock_a);   // CE Step 6 (deadlock)
    return NULL;
}

void test_deadlock_scenario() {
    pthread_t t1, t2;

    // Setup: Initialize locks (CE initial state)
    pthread_mutex_init(&lock_a, NULL);
    pthread_mutex_init(&lock_b, NULL);

    // Execute: Create threads in counterexample order
    pthread_create(&t1, NULL, thread1_func, NULL);
    pthread_create(&t2, NULL, thread2_func, NULL);

    // This test hangs, demonstrating the deadlock; in a real suite,
    // guard the join with a watchdog timeout and fail when it expires
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
}
```

Best Practices

  1. Minimize test complexity: Generate the simplest test that reproduces the violation
  2. Preserve causality: Maintain the exact sequence from the counterexample
  3. Make violations obvious: Use clear assertions and error messages
  4. Add context: Include comments explaining the property being violated
  5. Handle non-determinism: Document any assumptions made when concretizing values
  6. Test the test: Verify the generated test actually fails as expected

Common Challenges

Challenge: Counterexample uses symbolic values without concrete bounds
Solution: Use representative values from the domain, document the choice

Challenge: Trace involves complex timing or scheduling
Solution: Use synchronization primitives or explicit delays to enforce ordering

Challenge: Program state is partially specified in counterexample
Solution: Initialize unspecified variables to default/neutral values

Challenge: Counterexample is very long
Solution: Identify the minimal prefix that still triggers the violation
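For traces that depend on a specific interleaving, synchronization primitives enforce the ordering far more reliably than delays. The sketch below forces the exact schedule of a hypothetical lost-update counterexample using events; the account balance, thread bodies, and CE step numbers are all illustrative.

```python
import threading

# Shared state from the counterexample; the race is only visible under
# one specific interleaving, which events force deterministically.
balance = 100
t1_read = threading.Event()
t2_done = threading.Event()
result = {}

def withdraw_t1():
    global balance
    local = balance          # CE step 1: T1 reads balance = 100
    t1_read.set()            # hand control to T2 before T1 writes
    t2_done.wait()           # CE step 4: T1 resumes only after T2 writes
    balance = local - 30     # CE step 5: stale write (lost update)
    result["final"] = balance

def withdraw_t2():
    global balance
    t1_read.wait()           # CE step 2: T2 runs after T1's read
    balance = balance - 50   # CE step 3: T2 withdraws 50
    t2_done.set()

t1 = threading.Thread(target=withdraw_t1)
t2 = threading.Thread(target=withdraw_t2)
t1.start(); t2.start()
t1.join(); t2.join()
print(balance)  # 70: T2's withdrawal was lost (a correct run would end at 20)
```

Because each thread blocks until the other reaches the required point, the violating interleaving reproduces on every run, unlike `sleep`-based ordering, which can pass or fail depending on scheduler timing.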

References

Repository
ArabelaTso/Skills-4-SE