babysit-pr

Babysit a GitHub pull request after creation by continuously polling CI checks/workflow runs, new review comments, and mergeability state until the PR is ready to merge (or merged/closed). Diagnose failures, retry likely flaky failures up to 3 times, auto-fix/push branch-related issues when appropriate, and stop only when user help is required (for example CI infrastructure issues, exhausted flaky retries, or ambiguous/blocking situations). Use when the user asks Codex to monitor a PR, watch CI, handle review comments, or keep an eye on failures and feedback on an open PR.
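
The decision logic in this description can be pictured roughly as follows. This is a minimal sketch, not the skill's actual implementation: `Snapshot`, `Action`, `next_action`, and the `failure_kind` labels are hypothetical names introduced here for illustration, and review-comment handling is left out. Only the retry cap of 3 and the stop conditions come from the description itself.

```python
from dataclasses import dataclass
from enum import Enum

MAX_FLAKY_RETRIES = 3  # the description caps flaky retries at 3


class Action(Enum):
    DONE = "done"                  # PR merged, closed, or ready to merge
    WAIT = "wait"                  # nothing actionable yet; poll again
    RETRY = "retry"                # rerun a likely-flaky failure
    FIX_AND_PUSH = "fix_and_push"  # branch-related issue the agent may fix
    ASK_USER = "ask_user"          # stop: user help is required


@dataclass
class Snapshot:
    """Hypothetical summary of one poll of the PR (not a real API type)."""
    state: str              # "open", "merged", or "closed"
    checks: str             # "pending", "passing", or "failing"
    mergeable: bool
    failure_kind: str = ""  # assumed labels: "flaky", "branch", or "infra"


def next_action(snap: Snapshot, retries_used: int) -> Action:
    """Pick the next step for one iteration of the polling loop."""
    if snap.state in ("merged", "closed"):
        return Action.DONE
    if snap.checks == "pending":
        return Action.WAIT
    if snap.checks == "passing":
        # Assumption: green checks on an unmergeable PR count as a blocking
        # situation that needs the user.
        return Action.DONE if snap.mergeable else Action.ASK_USER
    # Checks are failing: classify before acting.
    if snap.failure_kind == "flaky":
        if retries_used < MAX_FLAKY_RETRIES:
            return Action.RETRY
        return Action.ASK_USER  # flaky retries exhausted
    if snap.failure_kind == "branch":
        return Action.FIX_AND_PUSH
    return Action.ASK_USER      # CI infrastructure or ambiguous failure
```

A caller would invoke `next_action` once per poll, incrementing `retries_used` after each `RETRY`, and exit the loop on `DONE` or `ASK_USER`.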

Install with Tessl CLI

npx tessl i github:openai/codex --skill babysit-pr
Quality: 92%
Does it follow best practices?

Impact: 85% (2.12x)
Average score across 3 eval scenarios

Discovery

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is an excellent skill description that thoroughly explains the specific capabilities (polling CI, diagnosing failures, retrying flaky tests, auto-fixing branch issues) and includes a comprehensive 'Use when...' clause with natural trigger terms. The description is appropriately detailed without being verbose, uses correct third-person voice, and carves out a distinct niche for PR babysitting that won't conflict with other GitHub-related skills.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Lists multiple specific concrete actions: 'continuously polling CI checks/workflow runs', 'diagnose failures', 'retry likely flaky failures up to 3 times', 'auto-fix/push branch-related issues'. Very detailed about what the skill does. | 3 / 3 |
| Completeness | Clearly answers both what (babysit PR, poll CI, diagnose failures, retry flaky tests, auto-fix issues) AND when with explicit 'Use when...' clause listing specific trigger scenarios like 'monitor a PR', 'watch CI', 'handle review comments'. | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural terms users would say: 'monitor a PR', 'watch CI', 'handle review comments', 'keep an eye on failures', 'GitHub pull request', 'mergeability', 'workflow runs'. These match how users naturally describe PR monitoring tasks. | 3 / 3 |
| Distinctiveness Conflict Risk | Very clear niche focused specifically on post-creation PR monitoring and CI babysitting. The combination of 'continuously polling', 'flaky retries', and 'mergeability state' creates a distinct profile unlikely to conflict with general GitHub or code review skills. | 3 / 3 |
| Total | | 12 / 12 (Passed) |

Implementation

85%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a high-quality skill with excellent actionability and workflow clarity. The multi-step monitoring process is well-documented with explicit stop conditions, validation checkpoints, and feedback loops. Minor verbosity exists with some repeated instructions across sections, but overall the skill effectively teaches a complex autonomous monitoring task.

Suggestions

Consolidate repeated instructions about restarting --watch after pushes into a single 'Post-Push Protocol' section to reduce redundancy

Consider moving the detailed 'Monitoring Loop Pattern' into the referenced heuristics.md since it largely restates the Core Workflow with additional detail

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The skill is comprehensive but contains some redundancy, particularly in the monitoring loop pattern which repeats concepts from the core workflow. Some sections could be tightened (e.g., review comment handling repeats restart instructions multiple times). | 2 / 3 |
| Actionability | Provides fully executable commands with clear syntax, specific commit message templates, concrete gh CLI commands for diagnosis, and explicit script paths. All guidance is copy-paste ready. | 3 / 3 |
| Workflow Clarity | Excellent multi-step workflow with numbered sequences, explicit validation checkpoints (classify before retry, check mergeability on every loop), clear decision points, and feedback loops (push -> resume watching -> repeat). Stop conditions are explicitly enumerated. | 3 / 3 |
| Progressive Disclosure | Well-structured with clear sections, appropriate inline content for the main workflow, and one-level-deep references to heuristics.md and github-api-notes.md for detailed supplementary information. Navigation is clear and signaled. | 3 / 3 |
| Total | | 11 / 12 (Passed) |

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 11 / 11 passed

Validation for skill structure

No warnings or errors.
