`tessl i github:jeremylongshore/claude-code-plugins-plus-skills --skill validating-ai-ethics-and-fairness`

Validate AI/ML models and datasets for bias, fairness, and ethical concerns. Use when auditing AI systems for ethical compliance, fairness assessment, or bias detection. Trigger with phrases like "evaluate model fairness", "check for bias", or "validate AI ethics".
Validation
Score: 81%

| Criteria | Description | Result |
|---|---|---|
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| metadata_version | 'metadata' field is not a dictionary | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 13 / 16 Passed | |
Implementation
Score: 13%

This skill is verbose and abstract, explaining concepts Claude already understands while failing to provide any executable code or concrete commands. The workflow structure exists but lacks validation checkpoints, and the content organization is poor, with redundant sections and empty placeholders. It reads more like a conceptual overview than actionable guidance.
Suggestions
- Add executable Python code examples using Fairlearn or AIF360 showing actual bias detection (e.g., `from fairlearn.metrics import demographic_parity_difference; dpd = demographic_parity_difference(y_true, y_pred, sensitive_features=gender)`); a hedged sketch follows this list.
- Remove explanatory content about what fairness metrics are and what bias means; Claude already knows this. Focus only on project-specific configurations or non-obvious implementation details.
- Add validation checkpoints to the workflow, such as "Verify sample sizes are sufficient before proceeding", with specific thresholds and commands to check them (see the sketch below).
- Move the Resources section to a separate RESOURCES.md file, and delete the empty Overview and Examples sections at the bottom.
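As a concrete illustration of the first and third suggestions, here is a minimal sketch of what an executable bias check with a validation checkpoint might look like. The input file name, column names, and the 50-sample threshold are hypothetical placeholders, not part of the skill; it assumes Fairlearn and pandas are installed.

```python
# Minimal sketch: demographic parity check with a sample-size checkpoint.
# Assumes a predictions file with y_true, y_pred, and gender columns
# (hypothetical names) and that fairlearn/pandas are installed.
import pandas as pd
from fairlearn.metrics import demographic_parity_difference

df = pd.read_csv("predictions.csv")  # hypothetical input file

# Validation checkpoint: every group must have enough samples before
# fairness metrics are computed on it.
MIN_GROUP_SIZE = 50  # illustrative threshold, tune per project
group_sizes = df["gender"].value_counts()
too_small = group_sizes[group_sizes < MIN_GROUP_SIZE]
if not too_small.empty:
    raise ValueError(f"Groups below {MIN_GROUP_SIZE} samples: {too_small.to_dict()}")

# Demographic parity difference: 0.0 means equal positive-prediction
# rates across groups; larger values indicate a bigger gap.
dpd = demographic_parity_difference(
    df["y_true"], df["y_pred"], sensitive_features=df["gender"]
)
print(f"Demographic parity difference: {dpd:.3f}")
```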
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose, with unnecessary explanations of concepts Claude already knows (what fairness metrics are, what demographic parity means). Contains redundant sections: the Overview repeats the intro, and the Examples section is an empty placeholder. Many bullet points explain concepts rather than provide actionable guidance. | 1 / 3 |
| Actionability | No executable code despite mentioning Python libraries like Fairlearn and AIF360. Instructions are vague ('Use the skill to examine', 'Load model predictions') without concrete commands or code examples. References tools but never shows how to use them. | 1 / 3 |
| Workflow Clarity | Steps are numbered and sequenced logically (Identify Scope → Analyze → Report → Mitigate) but lack validation checkpoints. No feedback loops for verifying bias detection results or confirming mitigation effectiveness. An error handling section exists but is disconnected from the workflow. | 2 / 3 |
| Progressive Disclosure | A monolithic wall of text with no references to external files. All content is inline, including detailed resources that could be split out. The Overview section at the bottom is misplaced and redundant, and the Examples section is an empty placeholder. | 1 / 3 |
| Total | | 5 / 12 |
Activation
Score: 90%

This is a well-structured description with strong completeness and trigger-term coverage. It explicitly addresses both what the skill does and when to use it, with natural-language triggers. The main weakness is its reliance on somewhat generic action verbs (validate, audit) rather than specific concrete capabilities.
Suggestions
- Add more specific concrete actions, such as 'analyze demographic parity', 'generate fairness metrics reports', 'test for disparate impact', or 'evaluate protected attribute correlations', to improve specificity.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (AI/ML models and datasets) and some actions (validate, audit), but lacks specific concrete actions like 'analyze demographic parity metrics', 'generate fairness reports', or 'test for disparate impact'. | 2 / 3 |
| Completeness | Clearly answers both what (validate AI/ML models and datasets for bias, fairness, and ethical concerns) and when (an explicit 'Use when' clause, plus 'Trigger with phrases' providing additional explicit guidance). | 3 / 3 |
| Trigger Term Quality | Good coverage of terms users would naturally say: 'evaluate model fairness', 'check for bias', 'validate AI ethics', 'fairness assessment', 'bias detection', 'ethical compliance'. | 3 / 3 |
| Distinctiveness / Conflict Risk | A clear niche focused specifically on AI ethics, bias, and fairness validation. Distinct triggers like 'model fairness', 'bias detection', and 'AI ethics' are unlikely to conflict with general ML or data-processing skills. | 3 / 3 |
| Total | | 11 / 12 |
Reviewed