`tessl install github:openai/skills --skill codex-readiness-unit-test`

Run the Codex Readiness unit test report. Use when you need deterministic checks plus in-session LLM evals for AGENTS.md/PLANS.md.
- Review Score: 79%
- Validation Score: 13/16
- Implementation Score: 77%
- Activation Score: 75%
This is an instruction-first, in-session readiness check for evaluating AGENTS/PLANS documentation quality without any external APIs or SDKs. All checks run against the current working directory (cwd), with no monorepo discovery. Each run writes to `.codex-readiness-unit-test/<timestamp>/` and updates `.codex-readiness-unit-test/latest.json`. Keep execution deterministic (filesystem scanning and local command execution only). All LLM evaluation happens in-session and must output strict JSON via the provided references.
1. Collect evidence: `python skills/codex-readiness-unit-test/bin/collect_evidence.py`
2. Apply deterministic rules: `python skills/codex-readiness-unit-test/bin/deterministic_rules.py`
3. Run the in-session LLM evaluations using the prompts under `references/` and store the results in `.codex-readiness-unit-test/<timestamp>/llm_results.json`.
4. (Execute mode) Run the plan: `python skills/codex-readiness-unit-test/bin/run_plan.py --plan .codex-readiness-unit-test/<timestamp>/plan.json`
5. Score and report: `python skills/codex-readiness-unit-test/bin/scoring.py --mode read-only|execute`

Outputs (per run, under `.codex-readiness-unit-test/<timestamp>/`):
- `report.json`
- `report.html`
- `summary.json`
- `logs/*` (execute mode)

This skill produces a deterministic evidence file plus an in-session LLM evaluation, then compiles a JSON report and HTML scorecard. It requires no OpenAI API key and makes no external HTTP calls.
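As a sketch only, the documented steps above can be driven from a single Python script; the `run` helper and its error handling here are illustrative and not part of the skill, and the example shows a read-only run (the in-session LLM evaluation happens between the deterministic steps and scoring):

```python
import subprocess
import sys

BIN = "skills/codex-readiness-unit-test/bin"

def run(cmd: list[str]) -> None:
    """Run one documented step and stop on the first failure."""
    print(f"$ {' '.join(cmd)}")
    result = subprocess.run(cmd)
    if result.returncode != 0:
        sys.exit(result.returncode)

# Read-only run: collect evidence, apply deterministic rules, then score.
# The in-session LLM evaluation writes llm_results.json into the run
# directory before scoring.py is invoked.
run([sys.executable, f"{BIN}/collect_evidence.py"])
run([sys.executable, f"{BIN}/deterministic_rules.py"])
run([sys.executable, f"{BIN}/scoring.py", "--mode", "read-only"])
```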
Inputs:

- `mode`: `read-only` or `execute` (required)
- `soft_timeout_seconds`: optional (default 600)

In read-only mode, execute-required checks are marked NOT_RUN, and no execution logs/summary are produced. In execute mode, `plan.json` is executed via `run_plan.py`; this enables check #6 and produces execution logs plus `execution_summary.json` for scoring. Always ask the user which mode to run (read-only vs. execute) before proceeding.
Skill references are discovered from AGENTS.md via $SkillName or .codex/skills/<name> patterns; their SKILL.md files are added to evidence for the LLM checks.
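As a rough illustration of that discovery step, the two patterns can be matched with simple regular expressions against the AGENTS.md text; the exact patterns used by collect_evidence.py may differ from this sketch:

```python
import re
from pathlib import Path

agents_text = Path("AGENTS.md").read_text(encoding="utf-8")

# $SkillName references, e.g. "$CodexReadinessUnitTest"
dollar_refs = re.findall(r"\$([A-Za-z][A-Za-z0-9_-]*)", agents_text)

# .codex/skills/<name> path references
path_refs = re.findall(r"\.codex/skills/([A-Za-z0-9_-]+)", agents_text)

# Each discovered skill's SKILL.md is added to the evidence set.
for name in set(dollar_refs + path_refs):
    skill_md = Path(".codex/skills") / name / "SKILL.md"
    if skill_md.exists():
        print(f"evidence: {skill_md}")
```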
All checks run relative to the current working directory and are defined in skills/codex-readiness-unit-test/references/checks/checks.json, weighted equally by default. Each run writes outputs to .codex-readiness-unit-test/<timestamp>/ and updates .codex-readiness-unit-test/latest.json.
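To illustrate equal weighting, an overall score can be taken as the weighted mean of per-check values. The status-to-value mapping below (PASS = 1.0, WARN = 0.5, FAIL = 0.0, NOT_RUN excluded) is an assumption for this sketch, not necessarily the mapping scoring.py uses:

```python
STATUS_VALUE = {"PASS": 1.0, "WARN": 0.5, "FAIL": 0.0}  # assumed mapping

def weighted_score(results: dict[str, dict], weights: dict[str, float]) -> float:
    """Weighted mean over checks that actually ran (NOT_RUN is excluded)."""
    total = 0.0
    weight_sum = 0.0
    for check_id, result in results.items():
        status = result["status"]
        if status == "NOT_RUN":
            continue
        w = weights.get(check_id, 1.0)  # equal weight by default
        total += w * STATUS_VALUE[status]
        weight_sum += w
    return total / weight_sum if weight_sum else 0.0
```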
The helper scripts read .codex-readiness-unit-test/latest.json by default to locate the latest run directory.
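For example, a helper can resolve the current run directory from that pointer file; the `run_dir` key name here is an assumption about the pointer's shape, used only for illustration:

```python
import json
from pathlib import Path

def latest_run_dir(root: Path = Path(".codex-readiness-unit-test")) -> Path:
    """Resolve the most recent run directory via latest.json."""
    pointer = json.loads((root / "latest.json").read_text(encoding="utf-8"))
    return Path(pointer["run_dir"])  # key name assumed for illustration

evidence_path = latest_run_dir() / "evidence.json"
```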
For each LLM/HYBRID check:
- Read the corresponding reference file under `skills/codex-readiness-unit-test/references/`.
- Evaluate the collected evidence and output strict JSON only.
- If the output is not valid JSON, repair it using `skills/codex-readiness-unit-test/references/json_fix.md` with the raw output.

The JSON schema is:

```json
{
"status": "PASS|WARN|FAIL|NOT_RUN",
"rationale": "string",
"evidence_quotes": [{"path":"...","quote":"..."}],
"recommendations": ["..."],
"confidence": 0.0
}
```

Combine the command summary and execute plan into one concise confirmation step. Present:
the command summary and the proposed `plan.json`. If declined, mark execute-required checks as NOT_RUN.

Files written per run:

- `.codex-readiness-unit-test/<timestamp>/evidence.json` (from `collect_evidence.py`)
- `.codex-readiness-unit-test/<timestamp>/deterministic_results.json` (from `deterministic_rules.py`)
- `.codex-readiness-unit-test/<timestamp>/llm_results.json` (from the in-session references)
- `.codex-readiness-unit-test/<timestamp>/execution_summary.json` (execute mode only)
- `.codex-readiness-unit-test/<timestamp>/report.json` and `.codex-readiness-unit-test/<timestamp>/report.html` (from `scoring.py`)
- `.codex-readiness-unit-test/<timestamp>/summary.json` (structured pass/fail summary from `scoring.py`)
- `.codex-readiness-unit-test/latest.json` (stable pointer to the latest run directory)

LLM check references:

- `project_context_specified` → `skills/codex-readiness-unit-test/references/project_context.md`
- `build_test_commands_exist` → `skills/codex-readiness-unit-test/references/commands.md`
- `dev_build_test_loops_documented` → `skills/codex-readiness-unit-test/references/loop_quality.md`
- `dev_build_test_loop_execution` → `skills/codex-readiness-unit-test/references/execution_explanation.md`

`plan.json` format:

```json
{
"project_dir": "relative/or/absolute/path (optional)",
"cwd": "optional/absolute/path (defaults to current directory)",
"commands": [
{"label": "setup", "cmd": "npm install"},
{"label": "build", "cmd": "npm run build"},
{"label": "test", "cmd": "npm test"}
],
"env": {
"EXAMPLE": "value"
}
}
```

Place `plan.json` inside the run directory (e.g., `.codex-readiness-unit-test/<timestamp>/plan.json`).
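Conceptually, executing the plan amounts to running each command with the merged environment and capturing its output under the run directory. The sketch below is illustrative only and does not reproduce run_plan.py; the `<timestamp>` path is a placeholder:

```python
import json
import os
import subprocess
from pathlib import Path

run_dir = Path(".codex-readiness-unit-test") / "<timestamp>"  # placeholder run dir
plan = json.loads((run_dir / "plan.json").read_text(encoding="utf-8"))

cwd = plan.get("cwd") or os.getcwd()
env = {**os.environ, **plan.get("env", {})}   # plan env overlays the process env
logs = run_dir / "logs"
logs.mkdir(exist_ok=True)

for step in plan["commands"]:
    # One log file per labeled command, e.g. logs/build.log
    log_path = logs / f"{step['label']}.log"
    with log_path.open("w", encoding="utf-8") as log:
        proc = subprocess.run(step["cmd"], shell=True, cwd=cwd, env=env,
                              stdout=log, stderr=subprocess.STDOUT)
    print(f"{step['label']}: exit {proc.returncode}")
```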
Example `llm_results.json`:

```json
{
"project_context_specified": {"status":"PASS","rationale":"...","evidence_quotes":[],"recommendations":[],"confidence":0.7},
"build_test_commands_exist": {"status":"PASS","rationale":"...","evidence_quotes":[],"recommendations":[],"confidence":0.7},
"dev_build_test_loops_documented": {"status":"WARN","rationale":"...","evidence_quotes":[],"recommendations":[],"confidence":0.6},
"dev_build_test_loop_execution": {"status":"PASS","rationale":"...","evidence_quotes":[],"recommendations":[],"confidence":0.6}
}
```

Execution logs (execute mode) are written to `.codex-readiness-unit-test/<timestamp>/logs/`.
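A small validation pass over `llm_results.json` can catch schema drift before scoring. This check mirrors the JSON schema shown above and is a sketch, not part of the skill:

```python
import json
from pathlib import Path

REQUIRED = {"status", "rationale", "evidence_quotes", "recommendations", "confidence"}
ALLOWED_STATUS = {"PASS", "WARN", "FAIL", "NOT_RUN"}

results = json.loads(Path("llm_results.json").read_text(encoding="utf-8"))
for check_id, result in results.items():
    missing = REQUIRED - result.keys()
    assert not missing, f"{check_id}: missing keys {missing}"
    assert result["status"] in ALLOWED_STATUS, f"{check_id}: bad status"
    assert 0.0 <= result["confidence"] <= 1.0, f"{check_id}: bad confidence"
```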