k8s

Operate the joelclaw Kubernetes cluster — Talos Linux on Colima (Mac Mini). Deploy services, check health, debug pods, recover from restarts, add ports, manage Helm releases, inspect logs, fix networking. Triggers on: 'kubectl', 'pods', 'deploy to k8s', 'cluster health', 'restart pod', 'helm install', 'talosctl', 'colima', 'nodeport', 'flannel', 'port mapping', 'k8s down', 'cluster not working', 'add a port', 'PVC', 'storage', any k8s/Talos/Colima infrastructure task. Also triggers on service-specific deploy: 'deploy redis', 'redeploy inngest', 'livekit helm', 'pds not responding'.

Quality

75%

Does it follow best practices?

Impact

—

No eval scenarios have been run

Security

Advisory

Suggest reviewing before use

Fix and improve this skill with Tessl

tessl review fix ./skills/k8s/SKILL.md

Quality

Content

50%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill is highly actionable with excellent concrete commands, paths, and executable examples throughout, making it genuinely useful for cluster operations. However, it suffers significantly from verbosity — it reads more like an evolving operations journal than a concise skill document, with incident dates, ADR references, and historical context that bloat the token cost without proportional value. The content would benefit greatly from aggressive splitting into reference files, keeping only the quick-reference commands and architecture overview in the main SKILL.md.
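As a rough illustration of how the incident dates and ADR references mentioned above could be located before trimming, here is a minimal sketch. The function name `find_history_markers` and both regular expressions are assumptions made for this example (ISO-style dates and `ADR-` followed by digits, matching the examples quoted in this review); they are not part of the skill or of Tessl.

```python
import re
from typing import List, Tuple

# Assumed patterns: ISO-style dates (e.g. 2026-03-17) and ADR identifiers (e.g. ADR-0244).
DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")
ADR_RE = re.compile(r"\bADR-\d+\b")


def find_history_markers(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, match) pairs for every date or ADR reference in text."""
    hits: List[Tuple[int, str]] = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for pattern in (DATE_RE, ADR_RE):
            for match in pattern.finditer(line):
                hits.append((lineno, match.group()))
    return hits


# Hypothetical sample text, not taken from the actual SKILL.md.
sample = (
    "Restart flannel (2026-03-17 incident).\n"
    "See ADR-0244 for context.\n"
    "kubectl get pods\n"
)
print(find_history_markers(sample))
# → [(1, '2026-03-17'), (2, 'ADR-0244')]
```

The returned line numbers make it easy to review each hit by hand and decide whether it belongs in a changelog or reference file instead.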

Suggestions

Move the 20-item Danger Zones section to a separate `references/danger-zones.md` file, keeping only the top 3-5 most critical warnings inline with links to the full list.

Extract the Agent Runner section into its own `references/agent-runner.md` — it's a self-contained subsystem that doesn't need to be in the main operations skill.

Remove incident-specific dates and ADR numbers from inline text (e.g., '2026-03-17 incident', 'ADR-0244') — these add no operational value for Claude and consume tokens. If needed for traceability, consolidate them in a changelog or reference file.

Add explicit verification steps between Colima crash-loop recovery steps (e.g., after step 1 'colima stop && colima start', add 'Verify: colima status && docker info') to improve workflow clarity for the most critical recovery path.
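The last suggestion, verifying each recovery step before moving on, can be sketched as a small step runner. This is only an illustration under stated assumptions: `run_command`, `run_steps`, and `RECOVERY_STEPS` are hypothetical names, and only step 1 and its verification commands (quoted in the suggestion above) are included, because the remaining recovery steps are not shown in this review.

```python
import subprocess
from typing import Callable, List, Sequence, Tuple

Command = Sequence[str]
# A step is (name, action commands, verification commands).
Step = Tuple[str, List[Command], List[Command]]
Runner = Callable[[Command], int]


def run_command(argv: Command) -> int:
    """Run one command and return its exit code (127 if the binary is missing)."""
    try:
        return subprocess.run(list(argv), check=False).returncode
    except FileNotFoundError:
        return 127


def run_steps(steps: Sequence[Step], runner: Runner = run_command) -> List[str]:
    """Run steps in order, stopping at the first failed action or verification.

    Returns the names of the steps whose actions and verifications all succeeded.
    """
    completed: List[str] = []
    for name, actions, checks in steps:
        if not all(runner(cmd) == 0 for cmd in actions):
            break
        if not all(runner(cmd) == 0 for cmd in checks):
            break
        completed.append(name)
    return completed


# Only step 1 is known from the review; later steps are intentionally omitted.
RECOVERY_STEPS: List[Step] = [
    (
        "restart colima",
        [["colima", "stop"], ["colima", "start"]],
        [["colima", "status"], ["docker", "info"]],
    ),
]
```

Calling `run_steps(RECOVERY_STEPS)` on the host would run the restart and then both checks; an empty result means the step did not verify and recovery should not continue. Passing a custom `runner` allows a dry run without touching the cluster.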

Dimension	Reasoning	Score
Conciseness	The skill is extremely verbose at more than 500 lines, containing extensive incident-specific details (dates, ADR numbers, historical context), danger zones that read like a post-mortem journal, and operational minutiae that could be split into reference files. Much of this is operational history rather than actionable instruction, and significant portions repeat information (e.g., 'Talos has no shell' is stated multiple times, and port mappings appear in several places).	1 / 3
Actionability	The skill provides highly concrete, executable commands throughout — health checks with expected outputs, exact kubectl/helm commands, specific file paths, copy-paste ready YAML manifests (Redis AOF fix pod), deploy scripts, and verification commands. Every section gives specific commands rather than abstract descriptions.	3 / 3
Workflow Clarity	Some workflows are well-sequenced with validation (Redis AOF recovery, deploy commands), but many critical multi-step processes lack explicit validation checkpoints. The Colima crash recovery is listed as steps but without clear verification between steps. The 20-item Danger Zones section is an unstructured list mixing recovery procedures with warnings, making it hard to follow during an incident. The durable recovery rule mentions verification but doesn't give the exact commands inline.	2 / 3
Progressive Disclosure	The skill references `references/operations.md` for detailed procedures (port mappings, cluster recreation, recovery), which is good progressive disclosure. However, the main file itself is monolithic: the Danger Zones section alone has 20 dense items that should be in a reference file, the Agent Runner section is extensive enough to warrant its own file, and the NAS NFS section could also be separated. The bundle includes no files, so the existence of that reference could not be verified.	2 / 3
	Total	8 / 12 Passed

Description

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is an excellent skill description that clearly defines a specific infrastructure domain (a named Kubernetes cluster with a particular tech stack), enumerates concrete actions, and provides extensive trigger terms covering both technical commands and natural language queries. The explicit 'Triggers on:' clause with service-specific examples makes it highly actionable for skill selection. Minor note: it's slightly verbose but the detail is justified given the breadth of the skill.

Dimension	Reasoning	Score
Specificity	Lists multiple specific concrete actions: deploy services, check health, debug pods, recover from restarts, add ports, manage Helm releases, inspect logs, fix networking. Very comprehensive enumeration of capabilities.	3 / 3
Completeness	Clearly answers both 'what' (operate the Kubernetes cluster, deploy services, debug pods, manage Helm releases, etc.) and 'when' (explicit 'Triggers on:' clause with extensive list of trigger terms and scenarios).	3 / 3
Trigger Term Quality	Excellent coverage of natural trigger terms users would actually say, including both technical commands ('kubectl', 'talosctl', 'helm install') and natural language phrases ('cluster not working', 'k8s down', 'add a port', 'deploy redis'). Covers common variations well.	3 / 3
Distinctiveness / Conflict Risk	Highly distinctive: names a specific cluster ('joelclaw'), specific infrastructure stack (Talos Linux on Colima on Mac Mini), and specific services (redis, inngest, livekit, pds). Very unlikely to conflict with other skills.	3 / 3
	Total	12 / 12 Passed

Validation

81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 9 / 11 Passed

Validation for skill structure

Criteria	Description	Result
skill_md_line_count	SKILL.md is long (535 lines); consider splitting into references/ and linking	Warning
frontmatter_unknown_keys	Unknown frontmatter key(s) found; consider removing or moving to metadata	Warning

	Total	9 / 11 Passed
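For illustration, a line-count check like `skill_md_line_count` could look roughly like the sketch below. The 500-line threshold is an assumption (the review only states that 535 lines is considered long), and `check_line_count` is a hypothetical helper, not the validator's actual implementation.

```python
# Assumed threshold: the review flags 535 lines as long but does not state the limit.
MAX_LINES = 500


def check_line_count(text: str, max_lines: int = MAX_LINES) -> str:
    """Return 'Passed', or a warning message when text exceeds max_lines lines."""
    line_count = len(text.splitlines())
    if line_count > max_lines:
        return (
            f"Warning: SKILL.md is long ({line_count} lines); "
            "consider splitting into references/ and linking"
        )
    return "Passed"


print(check_line_count("line\n" * 535))
# → Warning: SKILL.md is long (535 lines); consider splitting into references/ and linking
```

In practice the text would come from reading `./skills/k8s/SKILL.md`, and rerunning the check after moving sections into `references/` shows whether the split was sufficient.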

Repository: joelhooks/joelclaw
Commit: 2ca3686

Reviewed: 10 days ago
