Build AI agents that interact with computers like humans do - viewing screens, moving cursors, clicking buttons, and typing text. Covers Anthropic's Computer Use, OpenAI's Operator/CUA, and open-source alternatives.
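The perception-reasoning-action loop this description refers to can be sketched as follows. This is a minimal illustration with stubbed-out I/O; the function names are hypothetical and belong to none of the named platforms — a real agent would call a model API for reasoning and an OS or browser automation layer for perception and action.

```python
# Minimal perception-reasoning-action loop sketch (all helpers are stubs).

def take_screenshot():
    """Perception: capture the current screen state (stubbed)."""
    return {"screen": "login page", "step": 0}

def decide_action(observation, goal):
    """Reasoning: pick the next action toward the goal (stubbed)."""
    if observation["step"] >= 2:
        return {"type": "done"}
    return {"type": "click", "target": "submit button"}

def execute(action, observation):
    """Action: perform a click/type/scroll (stubbed)."""
    observation["step"] += 1
    return observation

def run_agent(goal, max_steps=10):
    observation = take_screenshot()
    history = []
    for _ in range(max_steps):
        action = decide_action(observation, goal)
        history.append(action["type"])
        if action["type"] == "done":
            break
        observation = execute(action, observation)
    return history

print(run_agent("log in"))  # ['click', 'click', 'done']
```

The `max_steps` cap matters in practice: real agents can loop indefinitely on an unresponsive UI, so every production loop needs a step budget.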
**Overall quality:** 55% (does it follow best practices?)
**Impact:** Pending (no eval scenarios have been run)
**Risk:** Risky; do not use without reviewing
Optimize this skill with Tessl: `npx tessl skill review --optimize ./skills/computer-use-agents/SKILL.md`

## Quality
### Discovery: 67%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description does a good job of specifying concrete actions and naming specific platforms, making it distinctive and specific. However, it lacks an explicit 'Use when...' clause, which limits its completeness for skill selection. Adding natural trigger terms that users might say (e.g., 'browser automation', 'GUI control', 'desktop agent') would improve discoverability.
**Suggestions**

- Add an explicit 'Use when...' clause, e.g., 'Use when the user wants to build agents that control a computer, automate GUI interactions, or work with screen-based automation tools.'
- Include additional natural trigger terms users might say, such as 'browser automation', 'GUI automation', 'desktop automation', 'RPA', 'screen scraping', or 'web agent'.
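A revised description incorporating both suggestions might look like this. The frontmatter below is illustrative only; the exact wording and key layout are assumptions, not the skill's actual metadata.

```yaml
---
name: computer-use-agents
description: >
  Build AI agents that interact with computers like humans do - viewing
  screens, moving cursors, clicking buttons, and typing text. Covers
  Anthropic's Computer Use, OpenAI's Operator/CUA, and open-source
  alternatives. Use when the user wants browser automation, GUI or
  desktop automation, RPA, or an agent that controls a screen.
---
```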
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: 'viewing screens, moving cursors, clicking buttons, and typing text' and names specific platforms (Anthropic's Computer Use, OpenAI's Operator/CUA, open-source alternatives). | 3 / 3 |
| Completeness | Clearly answers 'what' (build AI agents that interact with computers via screen/cursor/click/type, covering specific platforms), but lacks an explicit 'Use when...' clause or equivalent trigger guidance for when Claude should select this skill. | 2 / 3 |
| Trigger Term Quality | Includes some good terms like 'AI agents', 'Computer Use', 'Operator', 'CUA', 'clicking buttons', 'typing text', but misses common user phrases like 'browser automation', 'GUI automation', 'screen control', 'desktop automation', or 'RPA'. | 2 / 3 |
| Distinctiveness / Conflict Risk | The niche of computer-use AI agents with screen interaction is quite distinct and unlikely to conflict with other skills. The specific platform names (Anthropic Computer Use, OpenAI Operator/CUA) further narrow the domain. | 3 / 3 |
| **Total** | | **10 / 12 Passed** |
### Implementation: 42%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill demonstrates deep domain expertise, with excellent, executable code examples covering the full spectrum of computer-use agent development. However, it is severely over-long and monolithic, cramming more than 1,000 lines into a single file with no progressive disclosure or external references. The verbosity undermines its utility as a skill file: it would consume an enormous amount of context window, ironically contradicting its own advice about token efficiency.
**Suggestions**

- Extract the large code implementations (ComputerUseAgent, SandboxedAgent, AnthropicComputerUse, BrowserUseAgent, ConfirmationGate, ActionLogger) into separate referenced files, keeping only concise summaries and key patterns in the main SKILL.md.
- Remove explanatory text that Claude already knows - e.g., why sandboxing matters, what Gaussian distributions are, how Docker works, what prompt injection is. Focus on the specific configurations and code patterns unique to computer use agents.
- Add explicit validation checkpoints to workflows - e.g., verify sandbox isolation before running the agent, verify action success before proceeding to the next step, validate container security settings after setup.
- Add a quick-start overview section with clear navigation links to detailed pattern files, replacing the current flat structure with a progressive disclosure hierarchy.
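The validation-checkpoint suggestion can be illustrated with a small pre-flight gate. This is a sketch: the expected settings and the shape of the config dict are assumptions loosely modelled on `docker inspect` HostConfig output, not a specific API, and the required values would depend on the skill's own sandbox design.

```python
# Pre-flight sandbox check: refuse to start the agent unless the
# container config matches the expected isolation settings.
# (Hypothetical settings; adapt to the sandbox actually in use.)

REQUIRED = {
    "NetworkMode": "none",    # no outbound network from the sandbox
    "Privileged": False,      # never run the sandbox privileged
    "ReadonlyRootfs": True,   # immutable root filesystem
}

def verify_sandbox(host_config):
    """Return a list of violations; an empty list means the sandbox passes."""
    violations = []
    for key, expected in REQUIRED.items():
        actual = host_config.get(key)
        if actual != expected:
            violations.append(f"{key}: expected {expected!r}, got {actual!r}")
    return violations

# Usage: run the gate before entering the agent loop.
config = {"NetworkMode": "bridge", "Privileged": False, "ReadonlyRootfs": True}
problems = verify_sandbox(config)
assert problems == ["NetworkMode: expected 'none', got 'bridge'"]
```

The same pattern extends to per-step checkpoints: after each action, compare the new screenshot against the expected post-condition before letting the loop continue.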
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | This skill is extremely verbose at ~1000+ lines. It explains concepts Claude already knows (what a perception-reasoning-action loop is, why sandboxing matters, basic Docker concepts), includes massive code blocks that could be referenced externally, and repeats security advice across multiple sections. The anti-bot detection section explains Gaussian distributions and human clicking behavior unnecessarily. | 1 / 3 |
| Actionability | The skill provides fully executable, copy-paste ready code throughout - complete Python classes, Docker configurations, docker-compose files, and concrete implementation patterns. Code examples are real and runnable with proper imports, error handling, and usage examples. | 3 / 3 |
| Workflow Clarity | The perception-reasoning-action loop is clearly sequenced with numbered steps, and the sandbox pattern has clear isolation requirements. However, there are no explicit validation checkpoints in the main workflows - for example, the sandbox setup doesn't verify the container is properly isolated before running the agent, and the main agent loop lacks verification that actions succeeded before proceeding. | 2 / 3 |
| Progressive Disclosure | This is a monolithic wall of text with no references to external files. All content - multiple complete class implementations, Docker configs, sharp edges, validation checks - is inlined in a single massive document. The patterns (Perception-Reasoning-Action, Sandboxed Environment, Anthropic Implementation, Browser-Use, User Confirmation, Action Logging) each contain hundreds of lines of code that should be in separate reference files. | 1 / 3 |
| **Total** | | **7 / 12 Passed** |
### Validation: 81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
**Validation for skill structure**: 9 / 11 checks passed.
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (2167 lines); consider splitting into references/ and linking | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| **Total** | | **9 / 11 Passed** |