
evilissimo/software-design

Use before implementing or refactoring software. Contains two skills: (1) Modular Software Design — for designing module boundaries, APIs, layers, abstractions, services, repositories, adapters, or architecture, helping reduce total system complexity by creating deep modules, hiding implementation knowledge, avoiding leakage and pass-through APIs, comparing alternative designs, documenting interfaces before coding, and critiquing existing architecture; and (2) Software Testing — for writing unit tests, integration tests, or end-to-end tests, creating mocks/stubs/fakes, designing a testing strategy, doing TDD, reviewing test quality, fixing flaky tests, or refactoring test suites, generating risk-focused test plans, picking appropriate test levels, choosing between mocks/fakes/real dependencies, and applying Arrange-Act-Assert patterns with concrete examples.

93

1.12x
Quality: 94% (Does it follow best practices?)
Impact: 92%, 1.12x (average score across 5 eval scenarios)
Security (by Snyk): Passed, no known issues


evals/scenario-5/criteria.json

{
  "context": "Tests regression workflow, legacy strategy, suite review, and property/boundary thinking.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Highest symptom",
      "description": "Plans a highest-level failing test that captures the real dropped-final-row symptom.",
      "max_score": 12
    },
    {
      "name": "Focused lower test",
      "description": "Adds lower-level focused tests near parsing/import fault only if they help localize or protect boundary behaviour.",
      "max_score": 10
    },
    {
      "name": "Neighbor cases",
      "description": "Covers neighboring boundaries such as with/without trailing newline, empty file, single row, and multiple rows.",
      "max_score": 12
    },
    {
      "name": "Legacy safety net",
      "description": "Starts with coarser safety nets for tightly coupled legacy code before refactoring toward unit tests.",
      "max_score": 10
    },
    {
      "name": "Public seam",
      "description": "Extends tests through production-like public seams rather than private methods.",
      "max_score": 8
    },
    {
      "name": "Refactor path",
      "description": "Suggests separating parsing/business logic from filesystem/database orchestration for cheaper future tests.",
      "max_score": 10
    },
    {
      "name": "Flake review",
      "description": "Identifies slow/flaky/nondeterministic suite risks and recommends stabilizing shared state, timing, or external dependencies.",
      "max_score": 10
    },
    {
      "name": "Doubles choice",
      "description": "Chooses real dependency, fake, stub, spy, or mock with rationale for filesystem/database boundaries.",
      "max_score": 8
    },
    {
      "name": "Property invariant",
      "description": "Considers an invariant/property such as imported row count equals parsed records without reimplementing importer.",
      "max_score": 8
    },
    {
      "name": "Coverage use",
      "description": "Uses coverage diagnostically to find meaningful missed branches after regression tests, not to chase a percentage.",
      "max_score": 6
    },
    {
      "name": "Maintainable structure",
      "description": "Sketches tests with clear names, AAA structure, purposeful data, and precise assertions.",
      "max_score": 6
    }
  ]
}
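
To make a few of the checklist items concrete ("Highest symptom", "Neighbor cases", "Property invariant", "Maintainable structure"), here is a minimal sketch of what such tests might look like. Everything in it is an assumption for illustration: `parse_rows`, `import_rows`, the CSV-like input, and the in-memory list standing in for the database are hypothetical and do not come from the scenario itself.

```python
import csv
import io


def parse_rows(text: str) -> list[list[str]]:
    """Hypothetical parser: one record per non-blank CSV line."""
    return [row for row in csv.reader(io.StringIO(text)) if row]


def import_rows(text: str, store: list[list[str]]) -> int:
    """Hypothetical importer: appends parsed rows to a fake in-memory store."""
    rows = parse_rows(text)
    store.extend(rows)
    return len(rows)


def test_final_row_survives_missing_trailing_newline():
    # Arrange: the reported symptom is a file whose last line has no newline.
    store: list[list[str]] = []
    text = "1,Ada\n2,Grace"

    # Act: go through the public entry point, not the parser internals.
    imported = import_rows(text, store)

    # Assert: the final row is present and the count matches.
    assert imported == 2
    assert store[-1] == ["2", "Grace"]


def test_neighbor_boundaries():
    # Neighboring cases around the fault: empty file, single row,
    # multiple rows, each with and without a trailing newline.
    cases = {
        "": 0,
        "1,Ada": 1,
        "1,Ada\n": 1,
        "1,Ada\n2,Grace": 2,
        "1,Ada\n2,Grace\n": 2,
    }
    for text, expected in cases.items():
        store: list[list[str]] = []
        assert import_rows(text, store) == expected, repr(text)
        # Invariant: stored row count equals the number of non-blank input
        # lines (valid for this simple data, which has no quoted newlines).
        non_blank = [line for line in text.splitlines() if line.strip()]
        assert len(store) == len(non_blank), repr(text)
```

The functions follow the `test_` naming convention, so a runner such as pytest could collect them, but they are plain functions and can also be called directly. The invariant check counts non-blank lines independently rather than calling the parser again, which keeps the test from simply reimplementing the importer.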
