
# catalan-adobe/ai-fluency-assessment

Assess your AI fluency using Anthropic's 4D framework (Dakan, Feller & Anthropic, 2025). Scans Claude Code sessions, runs LLM-based behavior classification on all messages, asks a self-assessment questionnaire for 6 unobservable behaviors, and generates a visual HTML report with scores and actionable feedback. Use when "assess fluency", "AI fluency", "fluency report", "fluency assessment", "4D framework", or "how AI fluent am I".

**Overall score: 94**

- **Quality: 94%** — Does it follow best practices?
- **Impact: Pending** — No eval scenarios have been run.
- **Security (by Snyk): Passed** — No known issues.


## Quality

### Discovery: 100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is an excellent skill description: it clearly articulates specific capabilities (scanning sessions, behavior classification, questionnaire, HTML report generation), names a specific framework, and provides explicit trigger terms via a "Use when..." clause. It is concise, uses third-person voice throughout, and occupies a highly distinctive niche that minimizes conflict risk with other skills.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Lists multiple specific, concrete actions: scans Claude Code sessions, runs LLM-based behavior classification on all messages, asks a self-assessment questionnaire for 6 unobservable behaviors, and generates a visual HTML report with scores and actionable feedback. | 3 / 3 |
| Completeness | Clearly answers both "what does this do" (assess AI fluency using the 4D framework, scan sessions, classify behaviors, run the questionnaire, generate an HTML report) and "when should Claude use it" with an explicit "Use when..." clause listing trigger phrases. | 3 / 3 |
| Trigger Term Quality | Includes a strong set of natural trigger terms users would say: "assess fluency", "AI fluency", "fluency report", "fluency assessment", "4D framework", and "how AI fluent am I". These cover both formal and conversational phrasings. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive, with a clear niche: AI fluency assessment using a specific named framework (Anthropic's 4D framework). The trigger terms are unique and unlikely to conflict with other skills. | 3 / 3 |
| **Total** | | **12 / 12** |

**Passed**

### Implementation: 72%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a well-structured, actionable skill with clear step-by-step instructions and good progressive disclosure via external reference files. Its main weaknesses are the lack of validation and error-handling checkpoints in the workflow and some minor verbosity in explanatory notes. The concrete commands, scoring rubric, and questionnaire questions make it highly executable.

**Suggestions**

- Add validation checkpoints after Steps 1 and 2 (e.g., verify that evidence.json exists and contains the expected keys before proceeding, and check for API credentials before running assess.py).
- Add error recovery guidance: what to do if assess.py fails (missing SDK, bad credentials, no sessions found), and how to verify that output files are well-formed before scoring.
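The first suggestion could be implemented as a small pre-flight helper run before assess.py. This is a minimal sketch, not part of the skill itself: the evidence.json and assess.py names come from the review above, while the `ANTHROPIC_API_KEY` variable name and the expectation that the file holds a JSON object are assumptions about the skill's internals.

```python
import json
import os
from pathlib import Path

def preflight(evidence_path="evidence.json", api_key_var="ANTHROPIC_API_KEY"):
    """Return a list of problems to fix before running assess.py; empty means go."""
    problems = []
    p = Path(evidence_path)
    if not p.is_file():
        problems.append(f"{evidence_path} not found -- run the session scan (Step 1) first")
    else:
        try:
            data = json.loads(p.read_text())
        except json.JSONDecodeError:
            problems.append(f"{evidence_path} is not valid JSON")
        else:
            # Assumed shape: a top-level JSON object holding the scanned evidence.
            if not isinstance(data, dict):
                problems.append(f"{evidence_path} should contain a JSON object")
    if not os.environ.get(api_key_var):
        problems.append(f"{api_key_var} is not set -- needed for LLM classification")
    return problems
```

A checkpoint like this turns a mid-workflow failure (Step 2 crashing on a missing file or credential) into an immediate, actionable message before any API calls are made.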

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | Generally efficient but includes some unnecessary explanation (e.g., explaining what LLM classification requires, the note about subagent files). The questionnaire section is well-structured, but the inline Python snippet for saving responses adds bulk that could live in the script itself. | 2 / 3 |
| Actionability | Provides fully executable bash commands with clear flags, concrete questionnaire questions with exact wording, a complete Python snippet for saving responses, and specific scoring criteria with defined levels. Claude can follow this step-by-step without ambiguity. | 3 / 3 |
| Workflow Clarity | The 5-step sequence is clear and well-ordered, but there are no validation checkpoints or error recovery steps. What if assess.py fails? What if the API key is missing? What if evidence.json doesn't exist when saving questionnaire responses? For a multi-step workflow involving external API calls and file I/O, the absence of feedback loops caps this at 2. | 2 / 3 |
| Progressive Disclosure | Excellent structure: the SKILL.md provides a clear overview and actionable steps, while appropriately deferring the full 24-behavior framework to references/FRAMEWORK.md and the report design to references/REPORT-SPEC.md. References are one level deep and clearly signaled with descriptive links. | 3 / 3 |
| **Total** | | **10 / 12** |

**Passed**

### Validation: 100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

**Validation: 11 / 11 passed.** Validation for skill structure reported no warnings or errors.
