validating-ai-ethics-and-fairness

Validate AI/ML models and datasets for bias, fairness, and ethical concerns. Use when auditing AI systems for ethical compliance, fairness assessment, or bias detection. Trigger with phrases like "evaluate model fairness", "check for bias", or "validate AI ethics".

Overall score: 74

Quality: 70% (Does it follow best practices?)
Impact: Pending (No eval scenarios have been run)
Security (by Snyk): Passed (No known issues)

Optimize this skill with Tessl

npx tessl skill review --optimize ./plugins/ai-ml/ai-ethics-validator/skills/validating-ai-ethics-and-fairness/SKILL.md

Quality

Discovery: 89%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a solid description that clearly communicates when and why to use the skill, with explicit trigger phrases and a well-defined 'Use when' clause. Its main weakness is that the capability description stays at a high level ('validate for bias, fairness, and ethical concerns') without enumerating specific concrete actions or outputs the skill produces. The trigger term coverage and completeness are strong.

Suggestions

Add more specific concrete actions to improve specificity, e.g., 'compute disparate impact metrics, analyze demographic parity, generate fairness reports, flag underrepresented groups in training data.'
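As a rough illustration of what those concrete actions could look like inside the skill itself, here is a minimal sketch of a disparate impact check using AIF360's BinaryLabelDataset and BinaryLabelDatasetMetric. The DataFrame, the 'hired' label, and the 'sex' attribute are hypothetical placeholders, not taken from the reviewed skill.

```python
import pandas as pd
from aif360.datasets import BinaryLabelDataset
from aif360.metrics import BinaryLabelDatasetMetric

# Hypothetical data: binary outcome 'hired', protected attribute 'sex' (1 = privileged group)
df = pd.DataFrame({
    "hired": [1, 0, 1, 1, 0, 1, 0, 0],
    "sex":   [1, 1, 1, 1, 0, 0, 0, 0],
})

dataset = BinaryLabelDataset(
    df=df,
    label_names=["hired"],
    protected_attribute_names=["sex"],
    favorable_label=1,
    unfavorable_label=0,
)

metric = BinaryLabelDatasetMetric(
    dataset,
    privileged_groups=[{"sex": 1}],
    unprivileged_groups=[{"sex": 0}],
)

# Disparate impact = favorable-outcome rate of the unprivileged group / privileged group
di = metric.disparate_impact()
print(f"Disparate impact: {di:.2f}")
print("Four-fifths rule satisfied" if di >= 0.8 else "Potential adverse impact (ratio < 0.8)")
```

The 0.8 threshold reflects the four-fifths rule the skill already references in its workflow.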

Dimension scores

Specificity: 2 / 3
Names the domain (AI/ML models and datasets) and some actions (validate, audit), but the actions are fairly high-level. It doesn't list multiple concrete specific actions like 'compute disparate impact ratios, generate fairness reports, analyze demographic parity' — instead it stays at the level of 'validate for bias, fairness, and ethical concerns.'

Completeness: 3 / 3
Clearly answers both 'what' (validate AI/ML models and datasets for bias, fairness, and ethical concerns) and 'when' (explicit 'Use when' clause for auditing AI systems, plus a 'Trigger with phrases' clause providing concrete examples). Both components are explicitly stated.

Trigger Term Quality: 3 / 3
Includes good natural trigger terms: 'evaluate model fairness', 'check for bias', 'validate AI ethics', 'auditing AI systems', 'fairness assessment', 'bias detection'. These cover multiple natural phrasings a user might employ when seeking this capability.

Distinctiveness / Conflict Risk: 3 / 3
The focus on AI/ML model bias, fairness, and ethical compliance is a clear niche. The trigger terms are specific to AI ethics auditing and unlikely to conflict with general coding, data analysis, or other skills.

Total: 11 / 12 (Passed)

Implementation: 50%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill provides a comprehensive and well-organized framework for AI ethics validation with clear severity classifications, useful error handling, and illustrative scenarios. Its main weaknesses are the complete absence of executable code examples despite referencing specific Python libraries, and the lack of explicit validation checkpoints in the workflow. The content is moderately verbose and would benefit from concrete code snippets and better progressive disclosure through bundle files.

Suggestions

Add executable Python code examples using Fairlearn's MetricFrame API and AIF360's BinaryLabelDataset to make the skill copy-paste actionable (e.g., computing demographic parity ratio with actual code).

Insert explicit validation checkpoints in the workflow, such as 'Verify group sample sizes meet minimum threshold (≥30) before computing metrics; if not, apply bootstrap CI and flag as unreliable' between steps 3 and 4.

Extract the detailed error handling table, output specifications, and examples into separate bundle files (e.g., ERROR_HANDLING.md, EXAMPLES.md) and reference them from the main skill to improve progressive disclosure.

Remove or condense the Prerequisites section—Claude doesn't need to be told what pandas and NumPy are for; a simple 'pip install fairlearn aif360' line suffices.
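To make the first two suggestions concrete, here is a minimal sketch of per-group metrics with Fairlearn's MetricFrame, a demographic parity ratio, and the proposed group-size checkpoint. The y_true, y_pred, and sensitive series are hypothetical placeholders; a real implementation would also bootstrap confidence intervals for any group flagged as too small.

```python
import pandas as pd
from sklearn.metrics import accuracy_score
from fairlearn.metrics import MetricFrame, selection_rate, demographic_parity_ratio

# Hypothetical inputs: ground truth, model predictions, and one sensitive feature per row
y_true = pd.Series([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = pd.Series([1, 0, 1, 0, 0, 1, 0, 1])
sensitive = pd.Series(["a", "a", "a", "a", "b", "b", "b", "b"], name="group")

# Checkpoint suggested above: verify group sample sizes before trusting the metrics
MIN_GROUP_SIZE = 30
group_sizes = sensitive.value_counts()
small_groups = group_sizes[group_sizes < MIN_GROUP_SIZE]
if not small_groups.empty:
    print(f"Warning: groups below n={MIN_GROUP_SIZE}, treat metrics as unreliable: "
          f"{small_groups.to_dict()}")

# Per-group metrics via MetricFrame
frame = MetricFrame(
    metrics={"selection_rate": selection_rate, "accuracy": accuracy_score},
    y_true=y_true,
    y_pred=y_pred,
    sensitive_features=sensitive,
)
print(frame.by_group)

# Demographic parity ratio: min selection rate / max selection rate across groups
dpr = demographic_parity_ratio(y_true, y_pred, sensitive_features=sensitive)
print(f"Demographic parity ratio: {dpr:.2f}")
```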

Dimension scores

Conciseness: 2 / 3
The skill includes some unnecessary verbosity—listing optional tools Claude already knows about, explaining what fairness metrics are conceptually, and over-specifying prerequisites. However, the structured tables and examples are reasonably efficient. The Resources section adds bulk that could be trimmed.

Actionability: 2 / 3
The instructions provide a clear numbered workflow with specific metric names and thresholds (four-fifths rule, severity bands), but lack any executable code. For a skill involving Python libraries like Fairlearn and AIF360, the absence of concrete code snippets (e.g., MetricFrame usage, ExponentiatedGradient setup) is a significant gap—guidance remains at the descriptive level rather than copy-paste ready.

Workflow Clarity: 2 / 3
The 10-step workflow is clearly sequenced and includes severity classification and proxy detection steps. However, there are no explicit validation checkpoints or feedback loops—step 1 mentions 'verify schema' but doesn't specify what to do on failure inline, and there's no 'validate before proceeding' gate between critical steps like metric computation and report generation.

Progressive Disclosure: 2 / 3
The content is well-structured with clear sections (Overview, Instructions, Output, Error Handling, Examples, Resources), but it's monolithic—all content is inline in a single file with no bundle files to offload detailed reference material. The extensive error handling table, six output deliverables, and three full examples could benefit from being split into separate reference files.

Total: 8 / 12 (Passed)
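The Actionability note above mentions ExponentiatedGradient setup as an example of missing code. A minimal mitigation sketch, assuming a scikit-learn estimator that supports sample weights and hypothetical X, y, and A inputs:

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from fairlearn.reductions import ExponentiatedGradient, DemographicParity

# Hypothetical training data: features X, binary labels y, sensitive feature A
X = pd.DataFrame({
    "feature_1": [0.2, 0.4, 0.6, 0.8, 0.1, 0.9, 0.3, 0.7],
    "feature_2": [1, 0, 1, 1, 0, 1, 0, 0],
})
y = pd.Series([0, 0, 1, 1, 0, 1, 0, 1])
A = pd.Series(["a", "a", "b", "b", "a", "b", "a", "b"], name="group")

# The base estimator must accept sample_weight in fit(); LogisticRegression does
mitigator = ExponentiatedGradient(
    estimator=LogisticRegression(solver="liblinear"),
    constraints=DemographicParity(),
)
mitigator.fit(X, y, sensitive_features=A)

y_pred_mitigated = mitigator.predict(X)
print(y_pred_mitigated)
```

ExponentiatedGradient repeatedly reweights and refits the base estimator under the DemographicParity constraint; its predictions are randomized over the learned ensemble, so results can vary between calls.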

Validation: 81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation checks: 9 / 11 passed

Validation for skill structure

allowed_tools_field: Warning
'allowed-tools' contains unusual tool name(s)

frontmatter_unknown_keys: Warning
Unknown frontmatter key(s) found; consider removing or moving to metadata

Total: 9 / 11 (Passed)

Repository: jeremylongshore/claude-code-plugins-plus-skills (Reviewed)
