judgment-day

Trigger: judgment day, dual review, adversarial review, juzgar. Run blind dual review, fix confirmed issues, then re-judge.

Quality

67%

Does it follow best practices?

Security

Passed

No findings from the security scan

Fix and improve this skill with Tessl

tessl review fix ./internal/assets/skills/judgment-day/SKILL.md

Quality

Content

77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a well-structured orchestration skill with excellent workflow clarity and conciseness. The decision gate table and hard rules efficiently encode complex branching logic. The main weakness is that actionability depends heavily on the referenced prompts file (not provided in the bundle), and the skill itself contains no concrete examples of judge prompts, verdict tables, or output format, making it less self-contained than ideal.

Suggestions

Include at least a minimal example of the expected output format (verdict table, final judgment block) inline so the skill is actionable even without the reference file.

Add a brief inline example or skeleton of the judge prompt structure so Claude can act even if the reference file is missing or unreadable.

Dimension	Reasoning	Score
Conciseness	Every section is lean and purposeful. No unnecessary explanations of what code review is or how agents work. The decision gate table is an efficient encoding of branching logic, and the hard rules are terse but precise.	3 / 3
Actionability	The workflow is clearly described with specific decision gates and classification rules, but there are no concrete code/command examples, no sample judge prompts inline, and no example output format. The actual judge/fix prompts are deferred entirely to a reference file that wasn't provided in the bundle.	2 / 3
Workflow Clarity	The multi-step process is clearly sequenced (confirm target → resolve standards → dual judges → synthesize → ask → fix → re-judge → terminal state). Validation checkpoints are explicit (wait for both judges, re-judge after fixes, verify terminal state before finishing), and there's a clear feedback loop with an escape hatch after 2 iterations.	3 / 3
Progressive Disclosure	The skill references `references/prompts-and-formats.md` for detailed prompts and formats, which is good one-level-deep disclosure. However, the bundle file was not provided, making it impossible to verify the reference exists, and the output contract section could benefit from a concrete example or link to a template rather than a prose description.	2 / 3
	Total	10 / 12 Passed
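
The workflow the review describes (dual judges → synthesize → fix → re-judge, with an escape hatch after 2 iterations) can be sketched roughly as below. This is a minimal illustration, not the skill's actual implementation: the `Judge` and `Fixer` types, the `Outcome` record, the status names, and the rule that an issue counts as "confirmed" only when both judges report it are all assumptions, and the "ask" step before fixing is omitted for brevity.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Set

# Hypothetical types: a judge maps a target to a set of issue identifiers;
# a fixer applies fixes for the given issues and returns the updated target.
Judge = Callable[[str], Set[str]]
Fixer = Callable[[str, FrozenSet[str]], str]

MAX_ITERATIONS = 2  # the review mentions an escape hatch after 2 iterations


@dataclass
class Outcome:
    target: str
    status: str  # "approved" or "escalated" (names are assumptions)
    rounds: int  # number of fix rounds performed


def judgment_day(target: str, judge_a: Judge, judge_b: Judge, fix: Fixer) -> Outcome:
    rounds = 0
    while True:
        # Blind review: each judge sees only the target, not the other's findings.
        findings_a = judge_a(target)
        findings_b = judge_b(target)
        # Assumption: an issue is "confirmed" when both judges report it.
        confirmed = frozenset(findings_a & findings_b)
        if not confirmed:
            return Outcome(target, "approved", rounds)
        if rounds >= MAX_ITERATIONS:
            # Escape hatch: stop looping and hand the decision back to the user.
            return Outcome(target, "escalated", rounds)
        target = fix(target, confirmed)
        rounds += 1


# Toy judges and fixer, purely for illustration.
def judge_a(text: str) -> Set[str]:
    return {marker for marker in ("TODO", "FIXME") if marker in text}


def judge_b(text: str) -> Set[str]:
    return {marker for marker in ("TODO", "XXX") if marker in text}


def remove_markers(text: str, issues: FrozenSet[str]) -> str:
    for issue in issues:
        text = text.replace(issue, "")
    return text


result = judgment_day("TODO: refactor FIXME", judge_a, judge_b, remove_markers)
print(result.status, result.rounds)
```

In this toy run only `TODO` is reported by both judges, so it is the only issue fixed; `FIXME` is flagged by a single judge and left alone. A fixer that never resolves the confirmed issues would instead hit the iteration cap and return the escalated status.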

Description

57%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description establishes a distinctive niche with unique trigger terms and a recognizable workflow pattern, but it lacks specificity about what domain or content is being reviewed and why. The 'what' and 'when' are present but shallow—a user unfamiliar with this workflow would struggle to understand when to invoke it. Adding domain context and concrete examples of use cases would significantly improve clarity.

Suggestions

Specify the domain and concrete actions: what is being reviewed (code, documents, designs?) and what kinds of issues are detected and fixed.

Add a 'Use when...' clause describing user scenarios, e.g., 'Use when the user wants an adversarial quality review of code or text to catch issues through independent dual analysis.'

Dimension	Reasoning	Score
Specificity	It names some actions ('blind dual review', 'fix confirmed issues', 're-judge') which give a sense of the workflow, but the actual capabilities are not concretely described—what is being reviewed? What kind of issues are fixed? The domain is unclear.	2 / 3
Completeness	The 'Trigger:' clause serves as an explicit 'when' signal, and the second sentence describes the 'what' at a high level. However, the 'when' is expressed only as trigger keywords rather than describing the situations or user needs that should invoke this skill, and the 'what' lacks detail about the domain or purpose of the review.	2 / 3
Trigger Term Quality	Includes some useful trigger terms ('judgment day', 'dual review', 'adversarial review', 'juzgar') that a user might say, but these are somewhat niche and miss common natural language variations like 'code review', 'quality check', or 'peer review'. The Spanish term 'juzgar' adds multilingual coverage but is narrow.	2 / 3
Distinctiveness / Conflict Risk	The terms 'judgment day', 'dual review', 'adversarial review', and 'juzgar' are quite distinctive and unlikely to conflict with other common skills. The specific workflow pattern (blind dual review → fix → re-judge) further narrows the niche.	3 / 3
	Total	9 / 12 Passed

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository: sergiodvillegas-art/gentle-ai
Path: internal/assets/skills/judgment-day/SKILL.md
Commit: 3bfa934

Reviewed: about 11 hours ago
