
sahildmk/pr-comment-resolver

Review PR comments, address code issues in source files (not generated files), regenerate derived artifacts, run lint/format, commit, push, and reply to the comment thread confirming resolution.

93

1.19x

Quality

89%

Does it follow best practices?

Impact

99%

1.19x

Average score across 5 eval scenarios


evals/scenario-5/rubric.json

{
  "context": "Tests whether the agent implements the full end-to-end PR comment resolution workflow in the correct order, including all key stages: comment fetching via review API, filtering, critical assessment, user confirmation, source-only editing, regeneration, verification, commit formatting, pushing, and reply posting.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Review comments API",
      "description": "resolver.py or config.py references the pulls/{number}/comments endpoint for fetching review comments (not issues comments)",
      "max_score": 8
    },
    {
      "name": "Comment filtering stage",
      "description": "resolver.py includes a filtering step that excludes already-replied comments and/or old comments from prior runs",
      "max_score": 8
    },
    {
      "name": "Assessment stage",
      "description": "resolver.py includes a distinct assessment/triage stage where each comment is evaluated with possible outcomes of address, defer, or disagree",
      "max_score": 10
    },
    {
      "name": "User confirmation stage",
      "description": "resolver.py includes a step that presents a plan and pauses for user confirmation before making any code changes",
      "max_score": 10
    },
    {
      "name": "Source-only editing",
      "description": "resolver.py, README.md, or workflow_diagram.md states that only source files are edited (not generated/derived files)",
      "max_score": 10
    },
    {
      "name": "Regeneration stage",
      "description": "The workflow includes a regeneration step for derived artifacts after editing source files",
      "max_score": 8
    },
    {
      "name": "Verification stage",
      "description": "The workflow includes running formatter/linter/tests after changes, before committing",
      "max_score": 8
    },
    {
      "name": "Correct workflow order",
      "description": "workflow_diagram.md shows the stages in order: fetch -> filter -> assess -> confirm -> fix -> regenerate -> verify -> commit -> push -> reply",
      "max_score": 10
    },
    {
      "name": "Reply via thread endpoint",
      "description": "resolver.py or config.py uses the pulls/comments/{id}/replies endpoint for posting replies (not issue comments as default)",
      "max_score": 8
    },
    {
      "name": "One commit per comment",
      "description": "resolver.py or README.md mentions creating one commit per comment (unless issues share a root cause) rather than a single commit for all changes",
      "max_score": 8
    },
    {
      "name": "Commit format",
      "description": "resolver.py or config.py defines commit messages using 'fix:' prefix with Co-Authored-By attribution",
      "max_score": 8
    },
    {
      "name": "Empty result handling",
      "description": "resolver.py handles the case where no unaddressed comments remain — reports this and stops rather than proceeding with empty work",
      "max_score": 4
    }
  ]
}
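The rubric's `weighted_checklist` type and its "Correct workflow order" item can be illustrated with a minimal Python sketch. The source does not show how the evaluator combines the items, so the aggregation here (awarded points summed and divided by the total of `max_score`) is an assumption, and `total_score`, `stages_in_order`, and `EXPECTED_STAGES` are hypothetical names used only for illustration. The checklist is abbreviated to the `name` and `max_score` fields.

```python
import json

# Abbreviated copy of the rubric above: names and weights only.
RUBRIC_JSON = """
{
  "type": "weighted_checklist",
  "checklist": [
    {"name": "Review comments API", "max_score": 8},
    {"name": "Comment filtering stage", "max_score": 8},
    {"name": "Assessment stage", "max_score": 10},
    {"name": "User confirmation stage", "max_score": 10},
    {"name": "Source-only editing", "max_score": 10},
    {"name": "Regeneration stage", "max_score": 8},
    {"name": "Verification stage", "max_score": 8},
    {"name": "Correct workflow order", "max_score": 10},
    {"name": "Reply via thread endpoint", "max_score": 8},
    {"name": "One commit per comment", "max_score": 8},
    {"name": "Commit format", "max_score": 8},
    {"name": "Empty result handling", "max_score": 4}
  ]
}
"""

# Stage order taken from the "Correct workflow order" checklist item.
EXPECTED_STAGES = [
    "fetch", "filter", "assess", "confirm", "fix",
    "regenerate", "verify", "commit", "push", "reply",
]


def total_score(rubric: dict, awarded: dict) -> float:
    """Return awarded points as a percentage of the maximum (assumed aggregation)."""
    earned = 0
    possible = 0
    for item in rubric["checklist"]:
        cap = item["max_score"]
        # Items with no award count as 0; awards are clamped to [0, cap].
        earned += max(0, min(awarded.get(item["name"], 0), cap))
        possible += cap
    return 100.0 * earned / possible


def stages_in_order(observed: list) -> bool:
    """True if the known stages appear exactly once each, in the expected order."""
    known = [stage for stage in observed if stage in EXPECTED_STAGES]
    return known == EXPECTED_STAGES


rubric = json.loads(RUBRIC_JSON)
full_marks = {item["name"]: item["max_score"] for item in rubric["checklist"]}
```

With these weights the `max_score` values sum to 100, so under the assumed aggregation a run that misses only "Empty result handling" would score 96.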

Install with Tessl CLI

npx tessl i sahildmk/pr-comment-resolver
