
ultraqa

QA cycling workflow - test, verify, fix, repeat until goal met

54

Quality

Does it follow best practices?

Impact

No eval scenarios have been run

Security by Snyk

Passed

No known issues


Quality

Content

62%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

The skill body has a clear, well-sequenced QA-cycling workflow with explicit validation checkpoints and a strong feedback loop, which is its main strength. Its weaknesses are redundancy across the Rules/Cleanup sections, under-specified command discovery for the RUN QA step, and a dangling reference to a missing docs/REFERENCE.md file.

Suggestions

Specify how to discover the QA command for each goal type (e.g., read package.json scripts, detect test runner) instead of 'Run the project's test command', so the core step is executable rather than interpretive.

Remove the redundant 'Important Rules' and 'STATE CLEANUP ON COMPLETION' sections whose content is already covered by Cycle Workflow, Exit Conditions, and Cancellation.

Either create the referenced `docs/REFERENCE.md` or drop the dangling reference, and consider moving the multi-repo/session-id resolution detail into that reference file.
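
The first suggestion above (command discovery for the RUN QA step) could be sketched roughly as follows. This is a minimal illustration, not the skill's actual behavior: the function name `discover_qa_command`, the goal names, and the fallback table of marker files and commands are all assumptions made for this example.

```python
import json
from pathlib import Path
from typing import Optional


def discover_qa_command(project_root: str, goal: str = "test") -> Optional[str]:
    """Return a shell command for the given goal, or None if nothing is detected."""
    root = Path(project_root)

    # Node projects: prefer an npm script whose name matches the goal.
    package_json = root / "package.json"
    if package_json.is_file():
        scripts = json.loads(package_json.read_text(encoding="utf-8")).get("scripts", {})
        if goal in scripts:
            return f"npm run {goal}"

    # Hypothetical fallback table (an assumption): marker file -> command per goal.
    fallbacks = {
        "Cargo.toml": {"test": "cargo test", "build": "cargo build", "lint": "cargo clippy"},
        "go.mod": {"test": "go test ./...", "build": "go build ./...", "lint": "go vet ./..."},
        "pyproject.toml": {"test": "python -m pytest"},
    }
    for marker, commands in fallbacks.items():
        if (root / marker).is_file() and goal in commands:
            return commands[goal]

    # Nothing detected: the caller should ask the user rather than guess.
    return None
```

Returning `None` instead of a default command keeps the step explicit: when discovery fails, the agent can ask for the command rather than run something arbitrary.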

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The body is mostly efficient and avoids explaining concepts Claude already knows, but the 'Important Rules' and 'STATE CLEANUP ON COMPLETION' sections restate content already covered in Cycle Workflow, Exit Conditions, and Cancellation, so it could be tightened. | 2 / 3 |
| Actionability | Concrete Task() invocation templates, a state JSON schema, and an executable `rm -f` cleanup command are provided, but the core 'RUN QA' step ('Run the project's test command') leaves command discovery unspecified and prompts use bracketed placeholders, leaving key details incomplete. | 2 / 3 |
| Workflow Clarity | The cycle is an explicitly sequenced validate-diagnose-fix-retry loop (RUN QA -> CHECK RESULT -> ARCHITECT -> FIX -> REPEAT) with explicit exit checkpoints (Goal Met, Cycle 5, Same Failure 3x, Environment Error), matching the anchor for a clear sequence with explicit validation and feedback loops. | 3 / 3 |
| Progressive Disclosure | Sections are clearly organized with headers, but the body references a non-existent `docs/REFERENCE.md`, and dense inline detail (OMC_STATE_DIR, OMC_SESSION_ID, .omc-workspace resolution) that could live in a reference file keeps it at the 'some structure but could be better organized' anchor. | 2 / 3 |
| Total | | 9 / 12 |

Passed
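
The loop described under Workflow Clarity, with its four exit checkpoints, might look like the sketch below. The names `QAResult`, `run_qa_cycles`, `run_qa`, and `diagnose_and_fix` are hypothetical, and two details are assumptions rather than facts from the skill: that "Same Failure 3x" means three consecutive runs with an identical failure signature, and that "Cycle 5" means stopping after the fifth QA run.

```python
from dataclasses import dataclass
from typing import Callable, Optional

MAX_CYCLES = 5          # "Cycle 5" exit condition
MAX_SAME_FAILURE = 3    # "Same Failure 3x" exit condition


@dataclass
class QAResult:
    passed: bool
    failure: str = ""                # short failure signature, empty on success
    environment_error: bool = False  # e.g. missing toolchain; not fixable by code edits


def run_qa_cycles(run_qa: Callable[[], QAResult],
                  diagnose_and_fix: Callable[[str], None]) -> str:
    """Run the QA loop and return the name of the exit condition that fired."""
    last_failure: Optional[str] = None
    repeats = 0
    for cycle in range(1, MAX_CYCLES + 1):
        result = run_qa()                        # RUN QA
        if result.passed:                        # CHECK RESULT
            return "goal_met"
        if result.environment_error:
            return "environment_error"
        # Count consecutive occurrences of the same failure signature.
        repeats = repeats + 1 if result.failure == last_failure else 1
        last_failure = result.failure
        if repeats >= MAX_SAME_FAILURE:
            return "same_failure_3x"
        if cycle < MAX_CYCLES:
            diagnose_and_fix(result.failure)     # ARCHITECT -> FIX, then REPEAT
    return "max_cycles_reached"
```

Returning the exit condition as a string makes it straightforward for a caller to decide whether to clean up state, report success, or escalate to the user.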

Description

60%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description is specific and concrete about its actions, but lacks an explicit 'Use when...' trigger clause and broad natural-language trigger coverage, capping both completeness and trigger quality at the middle level. It occupies a reasonably clear QA-cycling niche but is not maximally distinctive.

Suggestions

Add an explicit 'Use when...' clause stating when to invoke this skill (e.g., 'Use when you need to repeatedly run tests/build/lint and fix failures until a quality gate passes').

Broaden trigger terms with natural user phrasings such as 'QA loop', 'fix until green', 'keep testing until it passes', 'retry failing builds'.

Strengthen distinctiveness by naming the distinguishing behavior (autonomous test-diagnose-fix cycling with a max-cycle cap) that separates it from one-shot testing skills.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | The description lists multiple concrete actions ("test, verify, fix, repeat") within the QA cycling domain, matching the anchor for listing several specific concrete actions rather than vague abstraction. | 3 / 3 |
| Completeness | The 'what' is clearly stated (test, verify, fix, repeat until goal met) but 'when' is only implied with no 'Use when...' clause, which per the judging guidelines caps completeness at 2. | 2 / 3 |
| Trigger Term Quality | Terms like "test", "verify", "fix", and "QA" are natural words a user might say, but coverage of common variations is narrow and there is no explicit trigger phrasing, so it matches the 'some relevant keywords but missing variations' anchor. | 2 / 3 |
| Distinctiveness Conflict Risk | "QA cycling workflow...repeat until goal met" names a recognizable niche, but the generic test/verify/fix verbs and lack of explicit distinct triggers mean it could overlap with other testing skills, matching the 'somewhat specific but could overlap' anchor. | 2 / 3 |
| Total | | 9 / 12 |

Passed

Validation

93%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 15 / 16 Passed

Validation for skill structure

| Criteria | Description | Result |
| --- | --- | --- |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 15 / 16 |

Passed
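
To locate the keys behind the `frontmatter_unknown_keys` warning, a check along these lines could be run against SKILL.md. It is only a sketch: the allow-list below is an assumption made for illustration (this review does not state which keys the validator accepts), and the parser handles only simple top-level `key: value` lines rather than full YAML.

```python
from typing import List, Set

# Assumed allow-list for illustration; the validator's real list is not shown in this review.
ALLOWED_KEYS: Set[str] = {"name", "description", "license", "allowed-tools", "metadata"}


def unknown_frontmatter_keys(skill_md: str, allowed: Set[str] = ALLOWED_KEYS) -> List[str]:
    """Return top-level frontmatter keys that are not in the allow-list."""
    lines = skill_md.splitlines()
    if not lines or lines[0].strip() != "---":
        return []                          # no frontmatter block
    unknown: List[str] = []
    for line in lines[1:]:
        if line.strip() == "---":          # end of frontmatter
            break
        # Top-level keys only: skip blanks, indented lines, comments, and list items.
        if not line or line[0].isspace() or line[0] in "#-" or ":" not in line:
            continue
        key = line.split(":", 1)[0].strip()
        if key not in allowed:
            unknown.append(key)
    return unknown
```

Any key this reports could then be removed or nested under `metadata`, as the warning suggests.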

Repository
Yeachan-Heo/oh-my-claudecode
Reviewed
