
video-understand

Implement specialized video understanding capabilities using the z-ai-web-dev-sdk. Use this skill when the user needs to analyze video content, understand motion and temporal sequences, extract information from video frames, describe video scenes, or perform video-based AI analysis. Optimized for MP4, AVI, MOV, and other common video formats.

57

Quality: 66% (Does it follow best practices?)

Impact: No eval scenarios have been run

Security by Snyk: Advisory (Suggest reviewing before use)

Optimize this skill with Tessl

npx tessl skill review --optimize ./skills/video-understand/SKILL.md

Quality

Discovery

89%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a solid skill description that clearly communicates its purpose and when to use it. The explicit 'Use this skill when...' clause with multiple trigger scenarios and the inclusion of specific file formats strengthen discoverability. The main weakness is that the capability descriptions are somewhat generic—terms like 'analyze video content' and 'understand motion' could be more concrete with specific operations.

Suggestions

Replace generic phrases like 'analyze video content' and 'understand motion and temporal sequences' with more concrete actions such as 'detect objects across frames', 'summarize video narratives', or 'generate frame-by-frame descriptions'.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | It names the domain (video understanding) and mentions some actions like 'analyze video content', 'extract information from video frames', 'describe video scenes', but these are somewhat generic and not as concrete as listing truly specific operations (e.g., 'detect objects in frames', 'generate transcripts', 'track motion between frames'). | 2 / 3 |
| Completeness | Clearly answers both 'what' (video understanding capabilities including analyzing content, extracting frame info, describing scenes) and 'when' (explicit 'Use this skill when...' clause with multiple trigger scenarios). The 'Use when' clause is explicit and detailed. | 3 / 3 |
| Trigger Term Quality | Good coverage of natural terms users would say: 'video', 'analyze video', 'video frames', 'video scenes', 'MP4', 'AVI', 'MOV', 'motion', 'temporal sequences'. These are terms users would naturally use when requesting video analysis tasks. | 3 / 3 |
| Distinctiveness Conflict Risk | The combination of video-specific triggers, the named SDK (z-ai-web-dev-sdk), specific video formats (MP4, AVI, MOV), and temporal/motion analysis creates a clear niche that is unlikely to conflict with image analysis, audio processing, or general file handling skills. | 3 / 3 |

Total: 11 / 12 (Passed)

Implementation

42%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

The skill is highly actionable with executable, complete code examples and clear CLI usage, but it is severely bloated with repetitive patterns—nearly every example is the same createVision call with a different prompt string. The content would benefit enormously from showing the pattern once and listing prompt variations concisely. The monolithic structure with no progressive disclosure compounds the token waste.

Suggestions

Consolidate the 10+ nearly-identical createVision examples into one canonical pattern, then provide a concise table or list of prompt templates for different use cases (sports, education, moderation, etc.)

Split advanced use cases (multi-turn conversation, batch processing, Express/Next.js integration) into separate referenced files to reduce the main SKILL.md to an overview with navigation

Remove the overview bullet list of capabilities and the 'Common Use Cases' numbered list—these describe what Claude already understands and add no actionable value

Add concrete validation steps for batch processing (e.g., verify response structure, implement retry logic for failed videos) to improve workflow clarity
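
The first and last suggestions above (one canonical call plus prompt templates, and validation with retries for batch processing) could be combined roughly as follows. This is a minimal sketch, not the skill's actual code: the `AnalyzeFn` type, the `PROMPTS` strings, `extractText`, and `analyzeBatch` are all hypothetical names, and the real SDK call is injected as a function so that nothing here assumes the signatures of z-ai-web-dev-sdk.

```typescript
// Hypothetical shape of one analysis call. The real SDK call is injected,
// so this sketch assumes nothing about the SDK's actual signatures.
type AnalyzeFn = (videoUrl: string, prompt: string) => Promise<unknown>;

// Illustrative prompt templates standing in for the near-identical examples.
const PROMPTS = {
  sports: "Describe the key plays and player movements in this video.",
  education: "Summarize the main concepts taught in this video.",
  moderation: "List any content in this video that may need moderation.",
} as const;

type UseCase = keyof typeof PROMPTS;

interface BatchResult {
  videoUrl: string;
  ok: boolean;
  text?: string;
  error?: string;
  attempts: number;
}

// Validation step: accept only a non-empty string result (an assumed shape).
function extractText(response: unknown): string {
  if (typeof response === "string" && response.trim().length > 0) {
    return response;
  }
  throw new Error("Unexpected response structure");
}

// Processes videos one at a time, retrying each up to maxAttempts times.
async function analyzeBatch(
  analyze: AnalyzeFn,
  videoUrls: string[],
  useCase: UseCase,
  maxAttempts = 3,
): Promise<BatchResult[]> {
  const results: BatchResult[] = [];
  for (const videoUrl of videoUrls) {
    let text: string | undefined;
    let lastError = "";
    let attempts = 0;
    while (attempts < maxAttempts && text === undefined) {
      attempts++;
      try {
        text = extractText(await analyze(videoUrl, PROMPTS[useCase]));
      } catch (err) {
        lastError = err instanceof Error ? err.message : String(err);
      }
    }
    results.push(
      text !== undefined
        ? { videoUrl, ok: true, text, attempts }
        : { videoUrl, ok: false, error: lastError, attempts },
    );
  }
  return results;
}
```

Failed videos stay in the result list with `ok: false` rather than aborting the batch, so a caller can report or re-queue them. A delay or backoff between attempts is omitted here for brevity and would likely be worth adding in practice.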

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | Extremely verbose at ~500+ lines. Massive repetition of the same API pattern (createVision call) across 10+ examples that differ only in the prompt string. The overview lists capabilities Claude already understands, and many use cases (sports analysis, educational summarization, content moderation) are just prompt variations wrapped in identical boilerplate. Could be reduced to ~20% of its size. | 1 / 3 |
| Actionability | All code examples are fully executable with proper imports, async/await patterns, error handling, and concrete usage examples. CLI commands are copy-paste ready with clear flag explanations. The Express.js and Next.js integration examples are complete and functional. | 3 / 3 |
| Workflow Clarity | The skill is mostly single-step (call the API), so complex workflows aren't strictly needed. However, the batch processing section lacks validation/verification steps (no check that results are valid, no retry logic for failures beyond catching errors). The recommended approach section mentions chunking long videos but doesn't provide a concrete workflow for it. | 2 / 3 |
| Progressive Disclosure | Monolithic wall of text with no references to external files despite mentioning a scripts directory. All content is inline; the advanced use cases, integration examples, and troubleshooting could easily be split into separate files. The reference to `{Skill Location}/scripts/video-understand.ts` is mentioned but no bundle files exist to support it. | 1 / 3 |

Total: 7 / 12 (Passed)
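
The Workflow Clarity reasoning notes that chunking long videos is mentioned without a concrete workflow. One possible first step, sketched under assumptions, is to plan the time ranges before any analysis call is made. The `planChunks` function, its parameters, and the overlap idea are hypothetical and not taken from the skill; how each range is then cut and sent for analysis depends on tooling the review does not describe.

```typescript
interface Chunk {
  index: number;
  startSec: number;
  endSec: number;
}

// Splits a video of durationSec seconds into windows of at most chunkSec
// seconds. Consecutive windows share overlapSec seconds so that an event
// on a boundary appears whole in at least one window.
function planChunks(durationSec: number, chunkSec: number, overlapSec = 0): Chunk[] {
  if (durationSec <= 0 || chunkSec <= 0) {
    throw new RangeError("durationSec and chunkSec must be positive");
  }
  if (overlapSec < 0 || overlapSec >= chunkSec) {
    throw new RangeError("overlapSec must be in [0, chunkSec)");
  }
  const chunks: Chunk[] = [];
  const step = chunkSec - overlapSec;
  for (let start = 0, index = 0; start < durationSec; start += step, index++) {
    const end = Math.min(start + chunkSec, durationSec);
    chunks.push({ index, startSec: start, endSec: end });
    if (end === durationSec) break; // the last window reached the end of the video
  }
  return chunks;
}

// Example: a 150-second video in 60-second windows with 10 seconds of overlap
// yields the ranges [0, 60], [50, 110], [100, 150].
const plan = planChunks(150, 60, 10);
```

Each planned range could then be analyzed separately and the per-chunk descriptions merged in order.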

Validation

90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 10 / 11 Passed

Validation for skill structure

| Criteria | Description | Result |
| --- | --- | --- |
| skill_md_line_count | SKILL.md is long (917 lines); consider splitting into references/ and linking | Warning |

Total: 10 / 11 (Passed)

Repository
jjyaoao/HelloAgents
Reviewed
