Generates unit tests for a function or class by analyzing branches, boundaries, and error paths — then emits test code in the project's existing framework and style. Covers happy path, edge cases, and failure modes with mocks for external dependencies. Use when writing tests for new code, when backfilling coverage on untested functions, when the user asks to generate tests, or when a coverage report shows specific gaps.
Install with Tessl CLI
npx tessl i github:santosomar/general-secure-coding-agent-skills --skill unit-test-generator97
Quality
96%
Does it follow best practices?
Impact
Pending
No eval scenarios have been run
Generate tests that fail when the code is wrong and pass when it's right — not tests that pass because they assert what the code currently does. The purpose of a test is to pin intended behavior, not observed behavior.
Before writing anything, find an existing test file in the project and match it:
| Check | Look for |
|---|---|
| Framework | pytest / unittest (Py), jest / vitest / mocha (JS/TS), JUnit 4/5 / TestNG (Java), go test (Go), xUnit / NUnit (.NET), RSpec / Minitest (Ruby) |
| Assertion style | assert x == y vs assertThat(x).isEqualTo(y) vs expect(x).toBe(y) vs x.should eq(y) |
| File location / naming | tests/test_foo.py, __tests__/foo.test.ts, src/foo_test.go, FooTest.java |
| Mocking library | unittest.mock, jest.fn(), Mockito, gomock, Moq, Sinon |
| Parameterization style | @pytest.mark.parametrize, test.each, @ParameterizedTest, table-driven |
| Fixture / setup idiom | conftest.py, beforeEach, @BeforeEach, TestMain, let(:x) |
If no tests exist at all, ask which framework — or default to the ecosystem standard (pytest, jest, JUnit 5, go test) and say so explicitly in the output.
Read the function and extract a coverage checklist. Every row becomes at least one test.
| Category | How to find them | What to test |
|---|---|---|
| Happy path | The straight-line execution with typical inputs | Exactly one test. This is your baseline. |
| Branches | Every if/elif/else, switch/match arm, ternary, short-circuit &&/||, ?. | One test per arm that takes that arm |
| Boundaries | Comparison operators: <, <=, >, >=, ==; loops with explicit bounds; length/size checks | The boundary value itself, one below, one above |
| Error paths | raise/throw, functions documented to throw, error-returning branches | One test per distinct error, asserting the type and message |
| Input edges | Per parameter type (see table below) | Representative of each class |
| Side effects | Writes to anything outside local scope: DB, file, HTTP, global, logger | Assert the effect happened — or was skipped when it should be |
Per-type input edges:
| Parameter type | Must cover |
|---|---|
| Collection | [], [x], [x, y, ...]; if order matters, also reversed |
| Optional / nullable | Present, absent/None/null/undefined |
| String | "", whitespace-only, typical, very long if there's a length check anywhere downstream |
| Number | 0, negative, positive, boundary of any internal comparison, float-vs-int if the code divides or compares |
| Dict / map | {}, key present, key absent |
| Enum / union | Every variant (no skipping "unlikely" ones) |
Not every cell applies to every function. Prune the ones that provably can't affect behavior — but do it consciously, not by default.
The oracle is how you know the expected value is correct. This is the difference between a test and a tautology.
Valid oracles, in order of preference:
reverse(reverse(xs)) == xs, decode(encode(x)) == x, sum([]) == 0.tax(100, 0.08), you did the arithmetic yourself and got 8.00.Invalid oracle: running the code under test and asserting whatever it returns. That's assert f(x) == f(x) with extra steps. If you catch yourself doing this — if you ran the function to find out what to assert — stop. You don't have a test, you have a change detector. Either find a real oracle or hand off to → test-oracle-generator.
A unit test tests one unit. Anything that crosses a process boundary — network, disk, DB, clock, randomness, environment — gets replaced.
| Dependency | Default replacement | Use the real thing when… |
|---|---|---|
| HTTP client | Mock returning a canned response | Never, in a unit test |
| Database | In-memory fake, or mock the repository method | It's an integration test — different file |
Clock / now() | Freeze to a fixed instant (freezegun, jest.useFakeTimers, Clock.fixed) | Never — nondeterministic tests are flaky tests |
| Random | Seed it, or mock to return a fixed value | Never |
| Filesystem | tmp_path / tmpdir fixture, or in-memory FS | The path logic itself is the unit under test |
| Env var / config | Patch for the test's duration | Never — leaks between tests |
| Another class you own | Don't mock by default — use the real one | It's expensive to construct, or you're testing the interaction specifically |
Mock at the boundary, not in the interior. If OrderService uses TaxCalculator uses TaxRateLookup uses HttpClient — mock the HTTP call, not TaxCalculator. Mocking your own internal code couples the test to implementation details; refactoring breaks the test without breaking the behavior.
Naming: test_<what>_<condition>_<expected>. test_parse_rejects_empty_string beats test_parse_2. A failed test's name should tell you what broke without opening the file.
Structure: Arrange–Act–Assert, one of each. If you need two Acts, that's two tests.
Parameterize only when the assertion logic is identical across cases and the cases differ only in data:
@pytest.mark.parametrize("raw,expected", [
("1,2,3", [1, 2, 3]),
("", []),
(" 7 ", [7]), # whitespace trimmed
("1,,3", [1, 3]), # empty segments dropped
])
def test_parse_csv_ints(raw, expected):
assert parse_csv_ints(raw) == expectedDon't parameterize when different cases need different assertions — you'll end up with if/else inside the test body, which defeats the point.
Code under test:
def apply_discount(price: Decimal, code: str | None, clock=time) -> Decimal:
if code is None:
return price
discount = DISCOUNTS.get(code)
if discount is None:
raise InvalidCodeError(code)
if discount.expires_at < clock.time():
raise ExpiredCodeError(code)
return (price * (1 - discount.rate)).quantize(Decimal("0.01"))Enumeration:
| Category | Case | Oracle |
|---|---|---|
| Happy | Valid unexpired code applied | Hand arithmetic |
| Branch | code is None → return unchanged | Identity |
| Branch | Code not in DISCOUNTS | Spec: raises InvalidCodeError |
| Branch | Code expired | Spec: raises ExpiredCodeError |
| Boundary | expires_at == clock.time() | Spec says <, so at-the-instant is still valid |
| Edge | price = 0 | 0 × anything = 0 |
| Edge | rate = 1.0 (100% off) | Result is 0.00 |
| Precision | price = 9.99, rate = 0.333 | Hand-compute the quantized result |
Output (pytest, matching a project that uses fixtures and pytest.raises):
from decimal import Decimal
import pytest
@pytest.fixture
def clock_at():
class _Fixed:
def __init__(self, t): self._t = t
def time(self): return self._t
return _Fixed
@pytest.fixture
def discounts(monkeypatch):
table = {}
monkeypatch.setattr("pricing.DISCOUNTS", table)
return table
def test_applies_valid_unexpired_discount(discounts, clock_at):
discounts["SUMMER10"] = Discount(rate=Decimal("0.10"), expires_at=2000)
result = apply_discount(Decimal("50.00"), "SUMMER10", clock=clock_at(1000))
assert result == Decimal("45.00") # 50 × 0.9, hand-computed
def test_returns_price_unchanged_when_code_is_none(clock_at):
assert apply_discount(Decimal("50.00"), None, clock=clock_at(0)) == Decimal("50.00")
def test_rejects_unknown_code(discounts, clock_at):
with pytest.raises(InvalidCodeError) as exc:
apply_discount(Decimal("50.00"), "NOPE", clock=clock_at(0))
assert "NOPE" in str(exc.value)
def test_rejects_expired_code(discounts, clock_at):
discounts["OLD"] = Discount(rate=Decimal("0.10"), expires_at=999)
with pytest.raises(ExpiredCodeError):
apply_discount(Decimal("50.00"), "OLD", clock=clock_at(1000))
def test_code_at_exact_expiry_instant_is_still_valid(discounts, clock_at):
# Spec: condition is `<`, not `<=` — expiry at t=1000 is valid at t=1000
discounts["EDGE"] = Discount(rate=Decimal("0.10"), expires_at=1000)
result = apply_discount(Decimal("50.00"), "EDGE", clock=clock_at(1000))
assert result == Decimal("45.00")
def test_zero_price_yields_zero(discounts, clock_at):
discounts["X"] = Discount(rate=Decimal("0.50"), expires_at=2000)
assert apply_discount(Decimal("0"), "X", clock=clock_at(0)) == Decimal("0.00")
def test_quantizes_to_two_decimal_places(discounts, clock_at):
discounts["THIRD"] = Discount(rate=Decimal("0.333"), expires_at=2000)
# 9.99 × 0.667 = 6.66333 → quantize → 6.66
assert apply_discount(Decimal("9.99"), "THIRD", clock=clock_at(0)) == Decimal("6.66")Seven tests. Every branch taken, boundary pinned, clock deterministic. No test calls time.time(). No test asserts a value obtained by running apply_discount first.
None / void: The thing to assert is the side effect. Mock the collaborator and assert it was called with the right arguments — or assert the state change directly if it's observable.**overrides, reuse it. Don't inline 30 lines of setup per test — nobody can tell what's significant.except Exception: log; pass): You can only test that it doesn't raise, which is nearly worthless. Note in output: "Error path is swallowed at line N — consider narrowing the catch or re-raising so failures are observable."metamorphic-test-generator or an integration harness.repr() / str() of objects unless the string representation is the contract. assert str(result) == "<Foo id=3 ...>" breaks when someone adds a field.mock.assert_called_once() passes when called with the wrong args. Use assert_called_once_with(expected).assert isinstance(response, Response) — the web framework guarantees that. Test your logic.def foo(x): return bar(x)), say so: "No test generated — foo is a passthrough to bar. Test bar directly."47d56bb
If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.