validating-ai-ethics-and-fairness

Validate AI/ML models and datasets for bias, fairness, and ethical concerns. Use when auditing AI systems for ethical compliance, fairness assessment, or bias detection. Trigger with phrases like "evaluate model fairness", "check for bias", or "validate AI ethics".

Quality

70%

Does it follow best practices?

Impact

—

No eval scenarios have been run

Security

Passed

No known issues

Optimize this skill with Tessl

npx tessl skill review --optimize ./plugins/ai-ml/ai-ethics-validator/skills/validating-ai-ethics-and-fairness/SKILL.md

Quality

Discovery

89%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a solid description that clearly communicates when and why to use the skill, with explicit trigger guidance and good keyword coverage. Its main weakness is that the 'what' portion stays at a high level without enumerating specific concrete actions or outputs (e.g., computing specific fairness metrics, generating bias reports, analyzing protected attributes). Adding more granular capabilities would strengthen specificity.

Suggestions

Add more specific concrete actions, e.g., 'compute disparate impact ratios, analyze demographic parity, generate bias audit reports, flag underrepresented groups in training data' to improve specificity.
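As a rough illustration of two of the actions named in this suggestion, the sketch below computes a disparate impact ratio and a demographic parity difference from per-group selection rates. It is plain Python, not the skill's own code and not the Fairlearn or AIF360 API. The function names and toy data are hypothetical, and the 0.8 cutoff is the commonly cited four-fifths rule.

```python
from collections import defaultdict


def selection_rates(y_pred, groups):
    """Return the fraction of positive predictions (label 1) for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(y_pred, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {g: positives[g] / totals[g] for g in totals}


def disparate_impact_ratio(rates):
    """Lowest group selection rate divided by the highest."""
    highest = max(rates.values())
    if highest == 0:
        # No group receives positive predictions, so the rates are equal.
        return 1.0
    return min(rates.values()) / highest


def demographic_parity_difference(rates):
    """Gap between the highest and lowest group selection rates."""
    return max(rates.values()) - min(rates.values())


# Hypothetical toy data: five predictions for each of two groups.
y_pred = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
groups = ["a"] * 5 + ["b"] * 5

rates = selection_rates(y_pred, groups)
ratio = disparate_impact_ratio(rates)
gap = demographic_parity_difference(rates)
passes_four_fifths = ratio >= 0.8  # four-fifths rule cutoff

print(rates, ratio, gap, passes_four_fifths)
```

A description that names outputs like these (a ratio, a gap, a pass/fail flag per protected attribute) would give an agent something concrete to match against.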

Dimension	Reasoning	Score
Specificity	The description names the domain (AI/ML models and datasets) and some actions (validate, audit), but the actions remain fairly high-level. It doesn't list multiple concrete actions like 'compute disparate impact ratios, generate fairness reports, analyze demographic parity' — instead it stays at the level of 'validate for bias, fairness, and ethical concerns.'	2 / 3
Completeness	Clearly answers both 'what' (validate AI/ML models and datasets for bias, fairness, and ethical concerns) and 'when' (explicit 'Use when' clause with triggers, plus a 'Trigger with phrases like' clause providing concrete example phrases).	3 / 3
Trigger Term Quality	Good coverage of natural trigger terms: 'evaluate model fairness', 'check for bias', 'validate AI ethics', 'bias detection', 'fairness assessment', 'ethical compliance', 'auditing AI systems'. These are terms users would naturally use when seeking this kind of skill.	3 / 3
Distinctiveness Conflict Risk	The skill occupies a clear niche — AI/ML bias and fairness validation — that is unlikely to conflict with general coding, data analysis, or other skills. The trigger terms are specific to ethical AI auditing.	3 / 3
	Total	11 / 12 Passed

Implementation

50%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill provides a comprehensive and well-structured framework for AI ethics validation with clear severity classifications, error handling, and real-world scenarios. Its main weaknesses are the absence of executable code examples (relying entirely on narrative descriptions of what to compute) and the lack of integrated validation checkpoints within the workflow. The content is moderately concise but could be tightened by removing explanatory context Claude already knows and replacing narrative examples with actual code.

Suggestions

Add executable Python code examples showing how to compute fairness metrics using Fairlearn's MetricFrame API and AIF360, rather than describing the process narratively.

Integrate validation checkpoints into the workflow steps (e.g., 'Verify group sample sizes meet minimum threshold before computing metrics; if not, apply bootstrap CI and flag as unreliable').

Replace or supplement the narrative scenario examples with copy-paste-ready code blocks that demonstrate end-to-end metric computation for at least one scenario.

Trim prerequisites to essential items only—Claude doesn't need to be told what pandas and NumPy are for, and optional tools can be mentioned inline when relevant rather than upfront.
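The validation checkpoint suggested above (check group sample sizes first, fall back to a bootstrap confidence interval and flag the result as unreliable) could look roughly like the following. This is a minimal sketch under stated assumptions: the minimum group size of 30, the function names, and the result layout are all hypothetical and do not come from the skill.

```python
import random

MIN_GROUP_SIZE = 30  # hypothetical threshold; the skill does not state one here


def bootstrap_ci(values, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of 0/1 outcomes."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    means = sorted(
        sum(rng.choices(values, k=n)) / n for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


def checked_selection_rate(preds, min_size=MIN_GROUP_SIZE):
    """Compute a group's selection rate, flagging small groups as unreliable."""
    if not preds:
        raise ValueError("group has no samples")
    rate = sum(preds) / len(preds)
    reliable = len(preds) >= min_size
    # Checkpoint: small groups get a bootstrap interval instead of a bare rate.
    ci = None if reliable else bootstrap_ci(preds)
    return {"rate": rate, "n": len(preds), "reliable": reliable, "ci": ci}


small_group = [1, 0, 1, 1, 0, 1, 0, 1]  # 8 samples, below the threshold
large_group = [1, 0] * 20               # 40 samples, above the threshold

small_result = checked_selection_rate(small_group)
large_result = checked_selection_rate(large_group)
print(small_result)
print(large_result)
```

Placing a check like this inside the numbered workflow, before any metric is reported, is what would turn the separate error handling table into an integrated checkpoint.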

Dimension	Reasoning	Score
Conciseness	The skill includes some unnecessary verbosity—listing optional tools Claude already knows about, explaining what fairness metrics are conceptually, and over-specifying prerequisites. However, the structured workflow and output sections are reasonably efficient. The error handling table and examples add value but could be tighter.	2 / 3
Actionability	The skill provides a clear numbered workflow and specific metric names, thresholds (four-fifths rule, r > 0.3), and tool references (ExponentiatedGradient, MetricFrame). However, it lacks any executable code examples—no actual Python snippets showing how to compute these metrics with Fairlearn or AIF360. The examples section describes scenarios narratively rather than showing copy-paste-ready code.	2 / 3
Workflow Clarity	The 10-step workflow is clearly sequenced and logically ordered, with severity classification providing a form of validation. However, there are no explicit validation checkpoints or feedback loops—no 'verify before proceeding' steps, no 'if metric computation fails, do X' within the workflow itself. The error handling table is separate rather than integrated into the workflow.	2 / 3
Progressive Disclosure	The content is well-structured with clear sections (Overview, Prerequisites, Instructions, Output, Error Handling, Examples, Resources). However, with no bundle files, all content is inline in a single monolithic file. The detailed error handling table, extensive output specifications, and multiple examples could benefit from being split into referenced files for a skill of this complexity.	2 / 3
	Total	8 / 12 Passed
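The Actionability row cites an r > 0.3 threshold without showing how it is applied. One plausible reading, which is an assumption here and not confirmed by this review, is that features whose Pearson correlation with a protected attribute exceeds 0.3 in absolute value are flagged as potential proxies. A sketch of that reading, with hypothetical feature names and toy data:

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)


def flag_proxy_features(features, protected, threshold=0.3):
    """Return features whose |r| with the protected attribute exceeds threshold."""
    flagged = {}
    for name, values in features.items():
        r = pearson_r(values, protected)
        if abs(r) > threshold:
            flagged[name] = r
    return flagged


# Hypothetical toy data: protected attribute encoded as 0/1.
protected = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "neighborhood_score": [1, 2, 1, 2, 5, 6, 5, 6],  # tracks the attribute
    "tenure_years": [1, 2, 3, 4, 1, 2, 3, 4],        # unrelated to it
}

flagged = flag_proxy_features(features, protected)
print(flagged)
```

Pearson correlation only captures linear association, so a real audit would likely pair it with other checks; the point is that a short snippet makes the threshold unambiguous in a way the narrative does not.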

Validation

81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 9 / 11 Passed

Validation for skill structure

Criteria	Description	Result
allowed_tools_field	'allowed-tools' contains unusual tool name(s)	Warning
frontmatter_unknown_keys	Unknown frontmatter key(s) found; consider removing or moving to metadata	Warning

	Total	9 / 11 Passed

Repository: jeremylongshore/claude-code-plugins-plus-skills
Commit: 196527a

Reviewed: 2 days ago
