Assess your AI fluency using Anthropic's 4D framework (Dakan, Feller & Anthropic, 2025). Scans Claude Code sessions, runs LLM-based behavior classification on all messages, asks a self-assessment questionnaire for 6 unobservable behaviors, and generates a visual HTML report with scores and actionable feedback. Use when "assess fluency", "AI fluency", "fluency report", "fluency assessment", "4D framework", or "how AI fluent am I".
Assess a user's AI fluency based on Anthropic's 4D AI Fluency Framework (Dakan, Feller & Anthropic, 2025). The framework has 4 Competencies, 12 Sub-competencies, and 24 Behaviors. The assess.py script classifies 18 behaviors from conversation data using an LLM binary classifier (matching Anthropic's own methodology). The remaining 6 behaviors require a self-assessment questionnaire.
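As a sanity check on that structure, the split between LLM-classified and questionnaire-only behaviors can be written down explicitly (IDs as listed in the scoring criteria later in this document):

```python
# Behavior IDs classified by the LLM from conversation data (B1-B2, B5-B17, B21-B23).
llm_classified = {1, 2, *range(5, 18), 21, 22, 23}

# Behavior IDs with no observable signal, covered by the self-assessment questionnaire.
questionnaire_only = {3, 4, 18, 19, 20, 24}

# Together they cover all 24 behaviors exactly once: 18 observable + 6 questionnaire.
assert llm_classified | questionnaire_only == set(range(1, 25))
assert not (llm_classified & questionnaire_only)
assert len(llm_classified) == 18 and len(questionnaire_only) == 6
```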
First, ask the user: "By default I'll scan your Claude Code sessions in ~/.claude/projects/ — is that correct, or should I scan a different directory?"

Then run assess.py to scan and classify all 18 observable behaviors. The script lives next to this SKILL.md — substitute the actual path when running.
# Default: scan ~/.claude/projects/, classify with Claude Haiku
python3 /path/to/skills/ai-fluency-assessment/scripts/assess.py \
--output-dir .ai-fluency --max-sessions 6000
# Custom sessions directory (e.g., exported sessions, another machine's data)
python3 /path/to/skills/ai-fluency-assessment/scripts/assess.py \
--sessions-dir <PATH> --output-dir .ai-fluency --max-sessions 6000
# Fast/free fallback (regex only, 11 behaviors, less accurate)
python3 /path/to/skills/ai-fluency-assessment/scripts/assess.py \
--output-dir .ai-fluency --max-sessions 6000 --regex-only

Use --max-sessions 6000 to capture all sessions. Many session files are subagent files with no user messages, so even with thousands of files the scan is fast.
LLM classification (default) requires the anthropic SDK and API credentials. The script auto-detects Anthropic Foundry credentials from environment variables (ANTHROPIC_FOUNDRY_API_KEY), or uses ANTHROPIC_API_KEY for direct API access. Cost: ~$2-5 per full assessment with Claude Haiku.
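The credential auto-detection described above can be sketched as follows. The environment variable names come from this document; the precedence order and the function itself are assumptions about how assess.py behaves internally, shown only to illustrate the fallback chain:

```python
import os

def detect_credentials(env=os.environ):
    """Pick an API credential source, preferring Foundry over direct API access.

    Returns a (provider, key) tuple, or None if no credential is set
    (in which case the script would need --regex-only mode).
    """
    foundry_key = env.get("ANTHROPIC_FOUNDRY_API_KEY")
    if foundry_key:
        return ("foundry", foundry_key)
    direct_key = env.get("ANTHROPIC_API_KEY")
    if direct_key:
        return ("api", direct_key)
    return None
```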
Output files:
.ai-fluency/evidence.json — all extracted user messages with project metadata
.ai-fluency/analysis.json — per-behavior match counts, sample messages, top projects

Ask the user the following 6 questions for behaviors that have no observable signal in conversation data. Rate each 1-5 (1=Never, 5=Always).
Present them in conversation — don't run an interactive CLI.
Delegation:
Discernment:
Diligence:
After collecting responses, save them:
python3 -c "
import json
from pathlib import Path
responses = {3: R3, 4: R4, 18: R18, 19: R19, 20: R20, 24: R24}
evidence_path = Path('.ai-fluency/evidence.json')
with open(evidence_path) as f:
    evidence = json.load(f)
evidence['questionnaire'] = {str(k): v for k, v in responses.items()}
with open(evidence_path, 'w') as f:
    json.dump(evidence, f, indent=2, default=str)
print('Questionnaire saved')
"

Replace R3, R4, etc. with the user's actual integer ratings.
Read .ai-fluency/analysis.json for the 18 LLM-classified behaviors and .ai-fluency/evidence.json for the questionnaire responses. Score each behavior 1-5:
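Loading both inputs can be sketched as below. The "behaviors"/"matches" field names are assumptions about assess.py's output schema, not confirmed; only the questionnaire key is established by the save snippet above:

```python
import json
from pathlib import Path

def load_inputs(out_dir=".ai-fluency"):
    """Load scoring inputs from the assessment output directory.

    The 'behaviors' and 'matches' keys in analysis.json are assumed
    field names, not confirmed against assess.py's actual schema.
    """
    out = Path(out_dir)
    analysis = json.loads((out / "analysis.json").read_text())
    evidence = json.loads((out / "evidence.json").read_text())
    # Questionnaire ratings were saved keyed by behavior ID as strings.
    ratings = {int(k): v for k, v in evidence.get("questionnaire", {}).items()}
    counts = {int(k): v.get("matches", 0) for k, v in analysis.get("behaviors", {}).items()}
    return counts, ratings
```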
Scoring criteria:
For LLM-classified behaviors (B1-B2, B5-B17, B21-B23): Base scores on LLM match counts, quality of sample messages, consistency across projects, and range/variety within the behavior.
Important: High frequency alone does NOT guarantee a high score. Assess quality and range, not just quantity.
For questionnaire-only behaviors (B3, B4, B18, B19, B20, B24): Use the self-assessment rating directly.
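The scoring guidance above is deliberately qualitative. Purely as an illustration (the thresholds below are invented, not the skill's prescribed formula), a baseline heuristic could combine the match count with a quality adjustment, reflecting the rule that frequency alone does not earn a high score:

```python
def baseline_score(matches, quality_adj=0):
    """Illustrative 1-5 baseline from an LLM match count.

    quality_adj (-1, 0, or +1) reflects a judgment on the quality and
    range of the sample messages. All thresholds here are invented for
    illustration, not taken from the skill.
    """
    if matches == 0:
        base = 1
    elif matches < 5:
        base = 2
    elif matches < 20:
        base = 3
    elif matches < 50:
        base = 4
    else:
        base = 5
    return max(1, min(5, base + quality_adj))
```

For example, a behavior with 100 matches but repetitive, low-variety samples would score 4 rather than 5 under this sketch.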
Generate a self-contained HTML report at .ai-fluency/fluency-report.html following the report spec for design system, layout, and formatting specifications.
open .ai-fluency/fluency-report.html

See the framework reference for the full 24-behavior reference table.