baoyu-danger-gemini-web

Generates images and text via a reverse-engineered Gemini Web API. Supports text generation, image generation from prompts, reference images for vision input, and multi-turn conversations. Use when other skills need an image generation backend, or when the user requests "generate image with Gemini" or "Gemini text generation", or needs vision-capable AI generation.

66

Quality

82%

Does it follow best practices?

Impact

No eval scenarios have been run

Security by Snyk

Advisory

Suggest reviewing before use

Quality

Content

64%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill provides strong actionability with concrete, copy-paste-ready CLI commands and comprehensive option/model tables. Its main weaknesses are moderate verbosity (generic agent instructions, redundant sections) and missing error recovery steps for authentication and runtime resolution workflows. The structure is reasonable but could benefit from clearer separation of reference material from core workflow.

Suggestions

Add explicit error handling/recovery steps for authentication failures (e.g., 'If browser doesn't open, check GEMINI_WEB_CHROME_PATH; if cookies fail, run --login') to improve workflow clarity.

Remove or significantly trim the 'User Input Tools' section — this is generic agent behavior that doesn't need 10+ lines of instruction in a Gemini API skill.

Remove the redundant 'Extension Support' section at the bottom since it just points back to the Preferences section already documented above.
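
The first suggestion could look roughly like the Python sketch below. Only `GEMINI_WEB_CHROME_PATH` and `--login` come from the suggestion itself. The failure labels (`browser_not_opened`, `cookies_failed`) and the function name `recovery_hint` are hypothetical, introduced for illustration, since the skill's real error types are not shown in this review.

```python
import os


def recovery_hint(failure, env=None):
    """Map a (hypothetical) failure label to a recovery step for the agent.

    `env` defaults to the process environment; pass a dict to test in isolation.
    """
    env = os.environ if env is None else env

    if failure == "browser_not_opened":
        chrome_path = env.get("GEMINI_WEB_CHROME_PATH")
        if not chrome_path:
            return "GEMINI_WEB_CHROME_PATH is not set; point it at a Chrome executable."
        if not os.path.isfile(chrome_path):
            return "GEMINI_WEB_CHROME_PATH points to a missing file: " + chrome_path
        return "GEMINI_WEB_CHROME_PATH looks valid; report the raw browser error to the user."

    if failure == "cookies_failed":
        # The review's suggestion: fall back to an explicit login run.
        return "Cookies failed; re-run the skill's CLI with --login."

    return "Unrecognized failure; report the raw error to the user."
```

Keeping the mapping in one function gives the skill a single place to document each recovery path, which is what the suggestion asks for.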

Dimension | Reasoning | Score

Conciseness

Generally efficient with good use of tables, but includes some unnecessary sections like the 'User Input Tools' block which is generic agent behavior, and the 'Extension Support' section at the end is redundant with the Preferences section. The authentication section has some verbose explanation about CDP session reuse that could be tightened.

2 / 3

Actionability

Provides fully executable CLI commands with concrete examples covering all major use cases (text generation, image generation, vision input, multi-turn conversations, JSON output). Options table is complete with clear descriptions, and environment variables are well-documented.

3 / 3

Workflow Clarity

The consent check flow is well-sequenced with clear steps. However, the authentication workflow lacks explicit validation/error recovery steps (what if browser auth fails? what if cookies expire mid-session?). The script directory resolution is a multi-step process but lacks validation checkpoints for whether bun/npx is actually available.

2 / 3

Progressive Disclosure

References EXTEND.md for configuration and mentions scripts/gemini-webapi/* for implementation details, which is good. However, without bundle files to verify, the inline content is somewhat heavy: the full options table, models table, env vars table, and sessions info could potentially move to a reference file. The EXTEND.md reference is mentioned, but its actual supported options are only vaguely described.

2 / 3

Total: 9 / 12 (Passed)
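
The Workflow Clarity row notes that script directory resolution has no checkpoint for whether bun/npx is actually available. A minimal sketch of such a checkpoint follows. It assumes the skill prefers `bun` and falls back to `npx`; that ordering is an assumption, not something stated in the review, and `pick_runtime` is a hypothetical helper name.

```python
import shutil


def pick_runtime(candidates=("bun", "npx"), which=shutil.which):
    """Return the first candidate executable found on PATH, or None.

    `which` is injectable so the lookup can be exercised without
    depending on what is installed on the current machine.
    """
    for name in candidates:
        if which(name):
            return name
    return None


if __name__ == "__main__":
    runtime = pick_runtime()
    if runtime is None:
        print("Neither bun nor npx was found on PATH; install one before running the skill.")
    else:
        print("Using runtime:", runtime)
```

Running the check before any script invocation lets the agent stop with a clear message instead of failing later on an unresolved command.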

Description

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a strong skill description that clearly communicates what the skill does (image/text generation via Gemini Web API), lists specific capabilities, and provides explicit trigger guidance with natural user terms. The description is concise, uses third-person voice, and is distinctive enough to avoid conflicts with other generation-related skills.

Dimension | Reasoning | Score

Specificity

Lists multiple specific concrete actions: text generation, image generation from prompts, reference images for vision input, and multi-turn conversations. Also specifies the mechanism (reverse-engineered Gemini Web API).

3 / 3

Completeness

Clearly answers both what (generates images and text via Gemini Web API, supports text generation, image generation, vision input, multi-turn conversations) and when (when other skills need image generation backend, or when user requests specific Gemini-related tasks) with explicit 'Use when' clause.

3 / 3

Trigger Term Quality

Includes natural trigger terms users would say: 'generate image with Gemini', 'Gemini text generation', 'image generation', 'vision-capable AI generation'. Good coverage of how users would phrase requests involving Gemini-based generation.

3 / 3

Distinctiveness Conflict Risk

Highly distinctive due to the specific mention of 'reverse-engineered Gemini Web API' and Gemini-specific trigger terms. Unlikely to conflict with other image generation or text generation skills because of the clear Gemini branding and API specificity.

3 / 3

Total: 12 / 12 (Passed)

Validation

72%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 8 / 11 Passed

Validation for skill structure

Criteria | Description | Result

metadata_version

'metadata.version' is missing

Warning

metadata_field

'metadata' should map string keys to string values

Warning

frontmatter_unknown_keys

Unknown frontmatter key(s) found; consider removing or moving to metadata

Warning

Total: 8 / 11 (Passed)
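
The two `metadata` warnings above can be illustrated with a small check over already-parsed frontmatter. This is a sketch under assumptions, not the validator's actual implementation: the function name `check_metadata` and the "missing or is not a mapping" message are hypothetical, and the frontmatter is assumed to have been loaded into a plain `dict`. The `frontmatter_unknown_keys` check is left out because the set of allowed keys is not listed in this review.

```python
def check_metadata(frontmatter):
    """Return warning strings for the two metadata checks named in the table."""
    metadata = frontmatter.get("metadata")
    if not isinstance(metadata, dict):
        # Hypothetical message; the review does not show this case.
        return ["'metadata' is missing or is not a mapping"]

    warnings = []
    if "version" not in metadata:
        warnings.append("'metadata.version' is missing")

    # Every key and every value must be a string, e.g. version "1.0.0", not 1.0.
    all_strings = all(
        isinstance(key, str) and isinstance(value, str)
        for key, value in metadata.items()
    )
    if not all_strings:
        warnings.append("'metadata' should map string keys to string values")

    return warnings
```

For example, `check_metadata({"metadata": {"version": "1.0.0"}})` returns an empty list, while a numeric version or a nested mapping under `metadata` triggers the string-values warning.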

Repository: jimliu/baoyu-skills (Reviewed)
