Implement vision-based AI chat capabilities using the z-ai-web-dev-sdk. Use this skill when the user needs to analyze images, describe visual content, or create applications that combine image understanding with conversational AI. Supports image URLs and base64 encoded images for multimodal interactions.
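As a rough illustration of the two input forms mentioned above (an image URL and a base64 encoded image), the sketch below builds a multimodal user message. The message shape and the names `VisionMessage`, `toDataUrl`, and `buildVisionMessage` are assumptions made for illustration; they are not confirmed against z-ai-web-dev-sdk, so check the SDK's own documentation for the format it expects. The sketch assumes a Node.js runtime for `Buffer`.

```typescript
// Hypothetical message shape for illustration only; not taken from the SDK.
type TextPart = { type: "text"; text: string };
type ImagePart = { type: "image_url"; image_url: { url: string } };
type VisionMessage = { role: "user"; content: Array<TextPart | ImagePart> };

// Encode raw image bytes as a base64 data URL (Node.js Buffer assumed).
function toDataUrl(bytes: Uint8Array, mimeType: string): string {
  const base64 = Buffer.from(bytes).toString("base64");
  return `data:${mimeType};base64,${base64}`;
}

// Pair a text prompt with an image reference, which may be either
// a remote URL or a data URL produced by toDataUrl.
function buildVisionMessage(prompt: string, imageRef: string): VisionMessage {
  return {
    role: "user",
    content: [
      { type: "text", text: prompt },
      { type: "image_url", image_url: { url: imageRef } },
    ],
  };
}

// Usage: the same builder serves both input forms.
const fromUrl = buildVisionMessage(
  "Describe this image.",
  "https://example.com/photo.png", // placeholder URL
);
const fromBytes = buildVisionMessage(
  "Describe this image.",
  toDataUrl(new Uint8Array([1, 2, 3]), "image/png"), // placeholder bytes
);
```

Keeping the URL and base64 cases behind one builder means the calling code does not need to branch on where the image came from.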
Security
1 medium-severity finding. This skill can be installed, but you should review this finding before use.
The skill exposes the agent to untrusted, user-generated content from public third-party sources, creating a risk of indirect prompt injection. This includes browsing arbitrary URLs, reading social media posts or forum comments, and analyzing content from unknown websites.
Third-party content exposure detected (high risk: 0.90). The skill explicitly accepts and fetches arbitrary external media URLs: see "Supported Content Types" and the examples in SKILL.md, plus scripts/vlm.ts and the Express API endpoint, which take imageUrl, file_url, or video_url. The agent therefore ingests untrusted third-party images and files, which it reads and interprets and which could contain embedded instructions or text that materially influence its behavior.
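One way to narrow the exposure described above is to screen media URLs before they are fetched. The following is a minimal sketch, assuming an HTTPS-only host allowlist; `ALLOWED_HOSTS`, `isAllowedMediaUrl`, and the host name are hypothetical and are not part of the skill. It limits where content comes from, but it does not detect instructions embedded in an image that comes from an allowed host.

```typescript
// Hypothetical allowlist; replace with hosts you actually trust.
const ALLOWED_HOSTS: ReadonlySet<string> = new Set(["images.example.com"]);

// Return true only for well-formed HTTPS URLs whose host is allowlisted.
function isAllowedMediaUrl(
  raw: string,
  allowedHosts: ReadonlySet<string> = ALLOWED_HOSTS,
): boolean {
  let url: URL;
  try {
    url = new URL(raw); // throws on malformed input
  } catch {
    return false;
  }
  return url.protocol === "https:" && allowedHosts.has(url.hostname);
}

// Usage: check imageUrl, file_url, or video_url values before fetching.
const candidates = [
  "https://images.example.com/cat.png",
  "http://images.example.com/cat.png",
  "https://unknown.example.net/cat.png",
  "not a url",
];
const accepted = candidates.filter((c) => isAllowedMediaUrl(c));
```

An allowlist is stricter than a blocklist here because the finding concerns arbitrary URLs: anything not explicitly trusted is rejected by default.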
07048a9