heuristic-evaluation-ai

Adapts Nielsen's usability heuristics and adds new AI-specific heuristics for evaluating AI interfaces.

Heuristic Evaluation for AI

Nielsen's 10 usability heuristics were designed for traditional software. AI products need adapted heuristics that address the unique challenges of probabilistic, generative, and conversational systems.

Classic Heuristics, Adapted for AI

1. Visibility of system status
AI adaptation: The user should always know what the AI is doing, what it's working with, and how confident it is. Show progress indicators during generation and be transparent about data sources.

2. Match between system and real world
AI adaptation: The AI should use language and concepts the user understands. Don't expose model internals. Frame capabilities in terms of user tasks, not technical features.

3. User control and freedom
AI adaptation: Users must be able to stop generation, undo AI actions, edit outputs, and override suggestions. AI autonomy should always have an exit.

4. Consistency and standards
AI adaptation: The AI should behave consistently across similar requests. The same input type should produce the same output format, and the persona should be stable.

5. Error prevention
AI adaptation: Design prompts and interfaces that guide users toward effective interactions. Suggest clarifications before producing low-quality output.

6. Recognition rather than recall
AI adaptation: Show users what the AI can do rather than requiring them to discover commands. Surface relevant capabilities contextually.

7. Flexibility and efficiency of use
AI adaptation: Support both novice (guided) and expert (shortcut) interaction modes. Power users should be able to customise AI behaviour.

8. Aesthetic and minimalist design
AI adaptation: AI outputs should be concise and well-structured. Don't pad responses with unnecessary caveats or filler.

9. Help users recognise, diagnose, and recover from errors
AI adaptation: When the AI fails, explain what went wrong in user terms, not technical terms. Offer clear recovery paths.

10. Help and documentation
AI adaptation: Provide contextual guidance on how to interact with the AI effectively. Teach prompting skills through the interface.

AI-Specific Heuristics

Beyond the classic 10, AI products need evaluation against:

Calibrated trust: Does the interface help users trust the AI appropriately — neither too much nor too little?
Graceful degradation: When the AI can't fully help, does it partially help rather than failing completely?
Feedback effectiveness: Can users correct the AI easily, and does the AI adapt?
Transparency of limitations: Are the AI's boundaries clear before the user hits them?
Appropriate autonomy: Does the AI take the right amount of initiative for the task and context?
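
One way to make the five AI-specific heuristics usable during a review is to hold them as a small checklist structure. The sketch below is illustrative only: the `Heuristic` class, the `AI_HEURISTICS` list, and the `unmet` helper are hypothetical names, and treating an unanswered question as unmet is an assumption, not something the skill prescribes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Heuristic:
    name: str
    question: str


# The five AI-specific heuristics, phrased as yes/no review questions.
AI_HEURISTICS = [
    Heuristic("Calibrated trust",
              "Does the interface help users trust the AI appropriately, "
              "neither too much nor too little?"),
    Heuristic("Graceful degradation",
              "When the AI can't fully help, does it partially help "
              "rather than failing completely?"),
    Heuristic("Feedback effectiveness",
              "Can users correct the AI easily, and does the AI adapt?"),
    Heuristic("Transparency of limitations",
              "Are the AI's boundaries clear before the user hits them?"),
    Heuristic("Appropriate autonomy",
              "Does the AI take the right amount of initiative "
              "for the task and context?"),
]


def unmet(answers: dict[str, bool]) -> list[str]:
    """Return heuristic names answered False or left unanswered.

    Assumption: a missing answer counts as unmet, so gaps in the
    review are surfaced rather than silently passed.
    """
    return [h.name for h in AI_HEURISTICS if not answers.get(h.name, False)]
```

Calling `unmet({"Calibrated trust": True})` would list the remaining four heuristics in checklist order, which gives an evaluator a quick view of what still needs attention.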

Running an AI Heuristic Evaluation

Select 3-5 evaluators with AI product experience
Define the scope (which features, which user tasks)
Each evaluator independently works through the heuristics
Capture issues with severity ratings
Consolidate findings and prioritise
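
The last two steps (capturing severity-rated issues and consolidating them) can be sketched in a few lines. This is a minimal illustration under stated assumptions: the `Finding` record, the `consolidate` function, the 0 to 4 severity scale, and the ranking rule (mean severity first, then number of evaluators who reported the issue) are all hypothetical choices, not part of the skill itself.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Finding:
    evaluator: str
    heuristic: str
    issue: str
    severity: int  # assumed scale: 0 (not a problem) to 4 (critical)


def consolidate(findings):
    """Merge independent findings and rank them for prioritisation."""
    # Group reports of the same issue under the same heuristic.
    grouped = defaultdict(list)
    for f in findings:
        grouped[(f.heuristic, f.issue)].append(f)

    rows = []
    for (heuristic, issue), group in grouped.items():
        rows.append({
            "heuristic": heuristic,
            "issue": issue,
            "evaluators": len({f.evaluator for f in group}),
            "mean_severity": mean(f.severity for f in group),
        })

    # Highest mean severity first; ties broken by how many evaluators agree.
    rows.sort(key=lambda r: (r["mean_severity"], r["evaluators"]), reverse=True)
    return rows


# Illustrative data only; these findings are invented for the example.
sample = [
    Finding("A", "Calibrated trust", "No confidence cue on answers", 3),
    Finding("B", "Calibrated trust", "No confidence cue on answers", 4),
    Finding("A", "User control and freedom", "Cannot stop generation", 2),
]
for row in consolidate(sample):
    print(row)
```

Grouping on the exact issue text only works if evaluators agree on wording, so in practice a facilitator would likely normalise issue descriptions before this step.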

Design Artefacts

AI heuristic checklist (adapted classics + AI-specific)
Evaluation protocol and scoring rubric
Issue severity classification guide
Heuristic evaluation report template
Prioritised findings matrix

Repository: Owl-Listener/ai-design-skills
Commit: f41b650

Last updated: 17 days ago
Created: 17 days ago
