
checkpoint-mode

Pause for review every N tasks - selective autonomy pattern


Quality: 21% (Does it follow best practices?)

Impact: Pending (No eval scenarios have been run)

Security by Snyk: Passed (No known issues)

Optimize this skill with Tessl

npx tessl skill review --optimize ./agent-skills/checkpoint-mode/SKILL.md

Quality

Discovery (7%)

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This description is too terse and abstract to be effective for skill selection. It reads more like a design pattern name than a functional description, lacking concrete actions, natural user trigger terms, and any explicit guidance on when Claude should select this skill. It would be very difficult for Claude to reliably choose this skill from a pool of available options.

Suggestions

Expand the description to list concrete actions, e.g., 'Pauses autonomous execution after every N completed tasks to present a summary and request user approval before continuing.'

Add a 'Use when...' clause with natural trigger terms like 'check in with me periodically', 'pause between tasks', 'don't do everything at once', 'ask before continuing', 'batch review'.

Replace jargon like 'selective autonomy pattern' with plain language describing the behavior, such as 'Controls how many tasks are completed before stopping for human review.'

Specificity (1 / 3): The description is vague and abstract. 'Pause for review every N tasks' hints at a pattern but doesn't describe concrete actions like 'stops execution after N steps', 'prompts user for confirmation', or 'batches tasks into reviewable chunks'. 'Selective autonomy pattern' is jargon, not a concrete capability.

Completeness (1 / 3): The description partially implies 'what' (pausing for review) but is extremely thin on detail, and there is no 'when' clause or explicit trigger guidance whatsoever. Missing a 'Use when...' clause would cap this at 2, but the 'what' is also very weak, so it scores 1.

Trigger Term Quality (1 / 3): No natural user keywords are present. Users would not typically say 'selective autonomy pattern'. They might say 'check in with me', 'pause between tasks', 'ask before continuing', or 'batch review'. The description uses technical, design-pattern language rather than natural trigger terms.

Distinctiveness / Conflict Risk (2 / 3): The concept of pausing for review every N tasks is somewhat niche and wouldn't directly conflict with most other skills. However, the vague phrasing could overlap with general workflow management, task orchestration, or human-in-the-loop skills.

Total: 5 / 12 (Passed)

Implementation (35%)

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill reads more like a design document or blog post than an actionable skill for Claude. It spends significant tokens on philosophy, motivation, and comparisons while the core implementation relies on pseudocode with undefined functions. The workflow is understandable but lacks error handling and validation steps needed for a robust checkpoint system.

Suggestions

Remove the Philosophy and 'When to Use' sections entirely—Claude can infer when checkpoint mode is appropriate from the configuration and workflow alone.

Replace the illustrative Python pseudocode with either fully executable code or concrete step-by-step instructions Claude can directly follow (e.g., actual file I/O for signal creation and detection).
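As a hedged sketch of what "actual file I/O for signal creation" might look like: the directory name, file names, and JSON shape below are illustrative assumptions, not taken from the skill.

```python
import json
import time
from pathlib import Path

CHECKPOINT_DIR = Path(".checkpoints")  # hypothetical location for checkpoint artifacts

def write_checkpoint(batch_num, completed_tasks):
    """Write a human-readable summary and a machine-readable approval signal.

    Returns the path to the signal file a reviewer is expected to act on.
    """
    CHECKPOINT_DIR.mkdir(exist_ok=True)
    # Human-readable summary of what was completed in this batch.
    summary = CHECKPOINT_DIR / f"summary-{batch_num}.md"
    summary.write_text(
        f"# Checkpoint {batch_num}\n\n"
        + "\n".join(f"- {t}" for t in completed_tasks)
    )
    # Machine-readable signal the approval check can poll for.
    signal = CHECKPOINT_DIR / f"awaiting-approval-{batch_num}.json"
    signal.write_text(json.dumps(
        {"batch": batch_num, "status": "pending", "created": time.time()}
    ))
    return signal
```

With concrete I/O like this, the undefined helpers the review flags (e.g. `write_signal`) become steps Claude can follow directly rather than pseudocode to interpret.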

Add error recovery steps: what to do if a checkpoint summary fails to write, if the approval signal is never received (timeout), or if tasks fail between checkpoints.
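One possible shape for the timeout and malformed-signal cases; the signal format, polling interval, and return convention here are assumptions for illustration.

```python
import json
import time
from pathlib import Path

def wait_for_approval(signal: Path, timeout_s=3600, poll_s=5):
    """Poll a signal file until a reviewer sets its status to 'approved'.

    Returns True on approval, False on explicit rejection; raises
    TimeoutError if no decision arrives within timeout_s seconds.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            status = json.loads(signal.read_text()).get("status")
        except (OSError, json.JSONDecodeError):
            status = None  # missing or malformed signal: keep waiting, don't crash
        if status == "approved":
            return True
        if status == "rejected":
            return False
        time.sleep(poll_s)
    raise TimeoutError(f"no approval decision for {signal} within {timeout_s}s")
```

Treating a malformed signal as "no decision yet" rather than an error keeps a half-written file (the reviewer saving mid-edit) from aborting the run.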

Move the comparison table, metrics schema, and references into a separate file, keeping SKILL.md focused on the actionable workflow and configuration.

Conciseness (1 / 3): Significant verbosity: the Philosophy section explains problems Claude already understands, the 'When to Use' section is largely common sense, and the comparison table and metrics sections add bulk without actionable value. The quote attribution and motivational framing waste tokens.

Actionability (2 / 3): Provides some concrete artifacts (config variables, Python pseudocode, signal file paths, summary template), but the Python code is illustrative pseudocode referencing undefined functions (load_completed_tasks, generate_checkpoint_summary, write_signal, etc.) rather than executable code. The skill describes a pattern more than it instructs on implementation.

Workflow Clarity (2 / 3): The checkpoint workflow is sequenced (generate summary → create signal → wait for approval → resume), but there are no validation checkpoints or error recovery steps. What happens if the approval signal is malformed? What if tasks fail mid-checkpoint? No feedback loop for error cases.

Progressive Disclosure (2 / 3): References to external files exist (references/production-patterns.md, external URL), but the SKILL.md itself is monolithic, with sections like Philosophy, When to Use, Comparison, and Metrics that could be separated. The content that should be inline (actual implementation) vs. separated (rationale, comparisons) is inverted.

Total: 7 / 12 (Passed)
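The sequenced workflow described above (generate summary → create signal → wait for approval → resume) could be compressed into a single driver. A minimal self-contained sketch, with the timeout handling the review says is missing; file names, the signal schema, and the polling approach are all assumptions, not part of the skill.

```python
import json
import time
from pathlib import Path

def run_with_checkpoints(tasks, n=3, workdir=Path("."), timeout_s=3600, poll_s=5):
    """Run tasks in order, pausing after every n completions until approved."""
    results = []
    for i, task in enumerate(tasks, 1):
        results.append(task())  # each task is a zero-argument callable
        if i % n == 0 and i < len(tasks):
            # Create the signal file the reviewer is expected to edit.
            sig = workdir / f"checkpoint-{i // n}.json"
            sig.write_text(json.dumps({"completed": i, "status": "pending"}))
            # Block until the reviewer flips status to "approved", or time out.
            deadline = time.monotonic() + timeout_s
            while json.loads(sig.read_text()).get("status") != "approved":
                if time.monotonic() > deadline:
                    raise TimeoutError(f"checkpoint {sig} was never approved")
                time.sleep(poll_s)
    return results
```

Even a sketch at this level of concreteness answers the feedback-loop question: an unapproved checkpoint surfaces as a `TimeoutError` instead of an indefinite silent stall.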

Validation (90%)

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation for skill structure: 10 / 11 Passed

frontmatter_unknown_keys: Unknown frontmatter key(s) found; consider removing or moving to metadata (Warning)

Total: 10 / 11 (Passed)

Repository: asklokesh/loki-mode (Reviewed)

