Monitors context window health throughout a session and rides peak context quality for maximum output fidelity. Activates automatically after plan-interview and intent-framed-agent. Stays active through execution and hands off cleanly to simplify-and-harden and self-improvement when the wave completes naturally or exits via handoff. Use this skill whenever a multi-step agent task is underway and session continuity or context drift is a concern. Especially important for long-running tasks, complex refactors, or any work where degraded context would silently corrupt the output. Trigger even if the user doesn't say "context surfing" — if an agent task is running across multiple steps with intent and a plan already established, this skill is live.
Overall score: 65
Quality: 57% (Does it follow best practices?)
Impact: Pending (no eval scenarios have been run)
Validation: Passed (no known issues)
Optimize this skill with Tessl:

```
npx tessl skill review --optimize ./skills/context-surfing/SKILL.md
```

Quality
Discovery: 67%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description is thorough in explaining when to activate and includes explicit trigger guidance, which is a strength. However, it relies heavily on internal pipeline jargon and buzzwords ('rides peak context quality', 'maximum output fidelity', 'wave completes naturally') that obscure what the skill concretely does. The overly broad activation criteria risk making it fire too frequently and conflict with other skills.
Suggestions
- Replace abstract phrases like 'rides peak context quality for maximum output fidelity' with concrete actions — e.g., 'Tracks token usage, summarizes completed steps, prunes stale context, and signals when to checkpoint or hand off work.'
- Reduce internal pipeline jargon (plan-interview, intent-framed-agent, simplify-and-harden) or explain it briefly, and add more natural user-facing trigger terms like 'long session', 'running out of context', or 'session getting too long'.
- Narrow the activation criteria to reduce conflict risk — instead of 'any multi-step agent task', specify the conditions that distinguish this skill from general task-execution skills.
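To make the first suggestion concrete, here is one hypothetical rewrite of the description, assuming the skill keeps its metadata in YAML frontmatter at the top of SKILL.md; the listed behaviors are illustrative and should be checked against what the skill actually does:

```markdown
---
name: context-surfing
description: >
  Tracks token usage and context health during multi-step agent tasks:
  summarizes completed steps, prunes stale context, and signals when to
  checkpoint or hand off work. Trigger when a session is getting long,
  context is running out, or output quality is drifting during a
  long-running task or complex refactor.
---
```

A description in this shape keeps the 'what' and 'when' from the original while replacing the surfing metaphor with checkable actions.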
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description names a domain (context window health monitoring) and some actions (monitors context, rides peak context quality, hands off to other skills), but the actual concrete actions are vague — 'rides peak context quality' and 'maximum output fidelity' are abstract and buzzwordy rather than specific, actionable capabilities. | 2 / 3 |
| Completeness | The description clearly answers both 'what' (monitors context window health, manages context quality during execution) and 'when' (multi-step agent tasks, long-running tasks, complex refactors, when context drift is a concern), with explicit trigger guidance including the note about activating even without the user saying 'context surfing'. | 3 / 3 |
| Trigger Term Quality | Includes some relevant terms like 'multi-step agent task', 'context drift', 'long-running tasks', 'complex refactors', and 'session continuity', but many are internal/technical jargon (e.g., 'plan-interview', 'intent-framed-agent', 'simplify-and-harden') rather than natural user language. A user would more likely say 'my session is getting long' or 'context is getting stale' than these skill-pipeline terms. | 2 / 3 |
| Distinctiveness / Conflict Risk | The skill occupies a somewhat unique niche around context window management, but the broad triggers ('any multi-step agent task is underway') could cause it to activate for nearly every complex task, potentially conflicting with other execution-phase skills. The description's scope is quite wide. | 2 / 3 |
| Total | | 9 / 12 (Passed) |
Implementation: 47%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill has excellent workflow clarity with well-defined protocols for drift detection, recovery, and exit, including a thorough handoff template. However, it is severely over-long and verbose — the ocean wave metaphor is belabored, concepts are repeated across sections, and philosophical justifications (e.g., 'The Monitoring Paradox') explain things Claude already understands. The core actionable content could be delivered in roughly 1/3 the tokens.
Suggestions
- Cut the content by 60-70%: remove the extended wave-metaphor explanations, the 'Monitoring Paradox' philosophical section, the 'Not drift' section (Claude knows what normal iteration is), and the repeated reassurances that 'exit is not failure'. Keep the signal lists, protocols, and templates.
- Move the detailed handoff template, drift-signal lists, and interoperability matrix into separate referenced files (e.g., HANDOFF-TEMPLATE.md, DRIFT-SIGNALS.md) to improve progressive disclosure and reduce the main file to an actionable overview.
- Remove explanations of why things matter (e.g., 'A shorter session with high fidelity beats a long session with gradual corruption') — Claude doesn't need motivational framing, just the rules and procedures.
- Consolidate the repeated wave-anchor descriptions — the concept of 'read from artifacts, not memory' is stated at least four times across different sections.
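One possible layout for the suggested split, as a sketch (the file names come from the suggestion above; the exact structure is the maintainer's call):

```
skills/context-surfing/
├── SKILL.md             # trimmed overview: activation conditions, protocols in brief
├── DRIFT-SIGNALS.md     # strong vs. weak drift signals and their response paths
├── HANDOFF-TEMPLATE.md  # copy-paste handoff file template
└── INTEROP.md           # pipeline and interoperability matrix
```

SKILL.md would then reference the other files, so an agent loads the detailed material only when it actually needs it.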
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | This skill is extremely verbose at ~400+ lines. It over-explains metaphors (the ocean wave analogy), includes extensive philosophical justification ('The Monitoring Paradox'), explains concepts Claude already understands (what context degradation is, why clean exits matter), and repeats the same ideas multiple times across sections. The principles section restates what was already said. Much of this could be cut to 1/3 the length. | 1 / 3 |
| Actionability | The handoff file template is concrete and copy-paste ready, and the hook integration has executable config/commands. However, the core skill — drift detection and recovery — is largely behavioral guidance rather than executable steps. The 'entire status' and 'ls .context-surfing/' commands are concrete but minor. Most of the skill describes how to think rather than what to do. | 2 / 3 |
| Workflow Clarity | The multi-step processes are clearly sequenced with explicit validation checkpoints. The Recovery Protocol has clear steps (pause → re-read → reconcile → escalate or exit). The Exit Protocol has numbered steps with a detailed handoff template. Strong vs. weak signals are clearly differentiated with different response paths. Feedback loops are present (re-anchor → check → resume or exit). | 3 / 3 |
| Progressive Disclosure | The content is well-structured with clear headers and sections, but it is essentially a monolithic document. The pipeline table and interoperability section are good organizational elements, but the sheer volume of inline content (drift detection details, recovery protocol, monitoring-paradox discussion) could be split into referenced files. No external references are used despite the length warranting them. | 2 / 3 |
| Total | | 8 / 12 (Passed) |
Validation: 100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
All 11 validation checks for skill structure passed, with no warnings or errors.
d6c68fa