
jpc0/provably-correct-software

Build provably correct software using formal methods like Hoare Logic, Weakest Preconditions, and Design-by-Contract.

Quality: 100% (Does it follow best practices?)

Impact: 99%, 1.45x (average score across 5 eval scenarios)


evals/scenario-2/rubric.json

{
  "context": "Tests whether the agent can correctly identify and use loop invariants and variants to verify the correctness and termination of a non-trivial loop, including runtime assertions.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Loop Invariant defined",
      "description": "Explicitly defines a Loop Invariant (I) in the code or documentation.",
      "max_score": 15
    },
    {
      "name": "Loop Variant defined",
      "description": "Explicitly defines a Loop Variant (v) in the code or documentation.",
      "max_score": 15
    },
    {
      "name": "Initialization proof",
      "description": "Evidence (e.g. in comments) of showing Precondition => Invariant (Initialization).",
      "max_score": 10
    },
    {
      "name": "Preservation proof",
      "description": "Evidence (e.g. in comments) of showing {I AND B} Body {I} (Preservation).",
      "max_score": 10
    },
    {
      "name": "Termination proof",
      "description": "Evidence (e.g. in comments) of showing v strictly decreases and v >= 0 (Termination).",
      "max_score": 10
    },
    {
      "name": "Postcondition proof",
      "description": "Evidence (e.g. in comments) of showing (I AND NOT B) => Q (Postcondition).",
      "max_score": 10
    },
    {
      "name": "Native assertions",
      "description": "Uses native assert statements for preconditions/postconditions.",
      "max_score": 10
    },
    {
      "name": "Runtime Invariant Check",
      "description": "Includes runtime assert statements WITHIN the loop to check the invariant and/or variant.",
      "max_score": 20
    }
  ]
}
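
As an illustration of what a submission satisfying every checklist item might look like, here is a minimal Python sketch. The function `divide` and its repeated-subtraction algorithm are hypothetical choices made for this example and are not taken from the eval scenario; the comments carry the four proof obligations and the `assert` statements provide the runtime checks.

```python
def divide(n: int, d: int) -> tuple[int, int]:
    """Return (q, r) with n == q * d + r and 0 <= r < d.

    Hypothetical example: integer division by repeated subtraction.
    """
    # Precondition P: n >= 0 and d > 0
    assert n >= 0 and d > 0, "precondition violated"

    q, r = 0, n

    # Loop Invariant I: n == q * d + r and r >= 0
    # Loop Variant   v: r (an integer bounded below by 0)
    # Initialization (P => I): n == 0 * d + n, and n >= 0 by P.
    assert n == q * d + r and r >= 0, "invariant fails on entry"

    while r >= d:  # guard B
        v_before = r

        # Preservation ({I and B} body {I}):
        #   (q + 1) * d + (r - d) == q * d + r == n,
        #   and r - d >= 0 because B gives r >= d.
        q, r = q + 1, r - d

        # Runtime invariant check inside the loop.
        assert n == q * d + r and r >= 0, "invariant broken by loop body"
        # Termination: v strictly decreases (d > 0) and stays >= 0.
        assert 0 <= r < v_before, "variant did not decrease"

    # Postcondition Q ((I and not B) => Q): I gives n == q * d + r and r >= 0,
    # and not B gives r < d.
    assert n == q * d + r and 0 <= r < d, "postcondition violated"
    return q, r
```

Calling `divide(17, 5)` returns `(3, 2)`. The assertions inside the loop run on every iteration, so in this sketch a faulty loop body would fail at the step that breaks the invariant rather than only at the final postcondition.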

Install with Tessl CLI

npx tessl i jpc0/provably-correct-software
