CtrlK
BlogDocsLog inGet started
Tessl Logo

common-skill-creator

Standardizes the creation and evaluation of high-density Agent Skills (Claude, Cursor, Windsurf). Ensures skills achieve high Activation (specificity/completeness) and Implementation (conciseness/actionability) scores. Use when: writing or auditing SKILL.md, improving trigger accuracy, or refactoring skills to reduce redundancy and maximize token ROI.

65

Quality

80%

Does it follow best practices?

Impact

No eval scenarios have been run

SecuritybySnyk

Passed

No known issues

Optimize this skill with Tessl

npx tessl skill review --optimize ./.github/skills/common/common-skill-creator/SKILL.md
SKILL.md
Quality
Evals
Security

Agent Skill Creator Standard

Priority: P0 — Apply to ALL skills

Maximize Token ROI. Every line in SKILL.md must provide specific procedural value. Activation (how it triggers) and Implementation (how it helps) primary quality metrics.

Three-Level Loading System

  • Level 1 Frontmatter: name + description (Activation Anchor), ≤100 words.
  • Level 2 SKILL.md body: Core Rules + Workflows (Implementation Core), ≤100 lines.
  • Level 3 references/: Detailed examples, schemas, and "TESTS.md" (On-demand).

Workflow (New or Existing Skill)

New skill:

  1. Research — web-search domain best practices, checklists, and standards; extract key terms → triggers, workflows → guidelines, mistakes → anti-patterns. See Web Search Research.
  2. Capture intent — what it , when it trigger, expected output format?
  3. Write SKILL.md — draft using TEMPLATE.md
  4. Test — spawn parallel subagents: one with-skill, one without-skill (baseline)
  5. Evaluate — grade assertions, review benchmark (pass rate, tokens, time)
  6. Iterate — rewrite based on feedback, rerun into next iteration dir, repeat
  7. Optimize description — run trigger eval queries, target ≥80% accuracy
  8. Pressure-test — for discipline skills, capture agent excuses, red flags, and stop conditions Existing skill:
  9. Audit — run Quality Checklist below; identify violations
  10. Snapshotcp -r <skill-dir> <workspace>/skill-snapshot/ before any edits
  11. Improve SKILL.md — fix violations, compress, move oversized content to references/
  12. Test — spawn parallel subagents: one with-new-skill, one with-snapshot (baseline)
  13. Evaluate & iterate — same as steps 4–5 above
  14. Optimize description — re-run trigger eval if description changed
  15. Harden — add rationalization counters where agents still fail under pressure See Eval Workflow for full testing + iteration details.

Description Quality (Activation)

  • Third-Person Voice: Use Standardizes..., Audits..., Encrypts.... Avoid "I will" or "This skill helps to".
  • What + When Structure:
  • What: Define 5–8 specific capabilities (e.g., "Generates JWT tokens, rotates keys").
  • When: Explicitly define triggers (e.g., "Use when user says 'rotate keys'").
  • Specificity: Avoid vague verbs like "manage" or "handle". Use "Validate", "Inject", "Refactor", "Sanitize".
  • Trigger Hint: Include (triggers: *.ext, keyword) suffix for technical skills.

Content Quality (Implementation)

  • No Redundant Knowledge: NOT explain concepts AI already knows (e.g., HTTP status codes, standard library docs, basic SOLID principles). Focus strictly on project-specific rules.
  • Caveman Compression: Use "Caveman Mode" for rules to save tokens. Drop articles (, , ), remove filler words ("should", "will", "), and use telegraphic snippets.
  • Standard: "You should ensure that database connection closed after every query to prevent leaks." (15 tokens)
  • Caveman: "Close DB connection after query. Prevent leaks." (7 tokens)
  • Actionability: Examples must copy-paste ready and executable.
  • Workflow Clarity: Use sequential ordered lists for multi-step processes.
  • Progressive Disclosure: Move code blocks >10 lines to references/.
  • Pressure Hardening: Discipline skills must name red flags, common excuses, and exact stop/restart conditions.

Behavior Guardrails

  • Use pressure tests for discipline skills: TDD, debugging, verification, review, protocol, and workflow skills need baseline failure evidence.
  • Capture rationalizations: Save the exact excuses agents use when they skip the rule.
  • Add red flags: Short phrases that tell the agent to stop and restart the protocol.
  • Encode behavior in evals: Add pressure_scenarios, rationalizations, red_flags, and behavior_assertions when the skill is guardrail-oriented.
  • Keep it local: Put behavior details in evals or references/; keep SKILL.md compact.

Anti-Patterns

  • No "AI-splaining": not explain why pattern good unless it's unique project constraint.
  • No Vague Triggers: Never use src/** or **/*. surgical.
  • No Description Bloat: If description exceeds 100 words, some capabilities belong in body.
  • No long code blocks: >10 lines → extract to references/
  • No redundancy: don't repeat frontmatter content in body
  • No untested guardrails: Rules that were never pressure-tested are speculation.

Quality Checklist (Tessl-Aligned)

  • Activation ≥ 90%: Description covers both capabilities ("What") and triggers ("When").
  • Implementation ≥ 90%: No general-purpose explanations; all examples executable.
  • Structural Compliance: SKILL.md ≤ 100 lines; code blocks moved to references/.
  • Trigger rate ≥80% on should-trigger queries.
  • Guardrail skills include rationalizations, red flags, and behavior eval fields.

References

Repository
HoangNguyen0403/agent-skills-standard
Last updated
Created

Is this your skill?

If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.