AI image generation with OpenAI, Azure OpenAI, Google, OpenRouter, DashScope, MiniMax, Jimeng, Seedream and Replicate APIs. Supports text-to-image, reference images, aspect ratios, and batch generation from saved prompt files. Sequential by default; use batch parallel generation when the user already has multiple prompts or wants stable multi-image throughput. Use when user asks to generate, create, or draw images.
⚠️ DEPRECATED: This skill has been migrated to `baoyu-imagine`. Print a bold red warning to the user in the user's language (use ANSI escape `\033[1;31m`) stating that `baoyu-image-gen` has been migrated to `baoyu-imagine` and that they should use `baoyu-imagine` instead. Then STOP — do NOT continue executing this skill.
Official API-based image generation. Supports OpenAI, Azure OpenAI, Google, OpenRouter, DashScope (阿里通义万象), MiniMax, Jimeng (即梦), Seedream (豆包) and Replicate providers.
Agent Execution:
- `{baseDir}` = this SKILL.md file's directory
- Entry point: `{baseDir}/scripts/main.ts`
- `${BUN_X}` runtime: if bun is installed → `bun`; if npx is available → `npx -y bun`; else suggest installing bun

CRITICAL: This step MUST complete BEFORE any image generation. Do NOT skip or defer.
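The runtime-selection rule can be sketched as a small shell function (illustrative only — availability is passed in as flags rather than probed, so the logic is easy to verify; the function name is hypothetical):

```shell
# Pick the ${BUN_X} runtime from availability flags (1 = installed, 0 = not).
# Mirrors the rule above: prefer bun, fall back to npx -y bun, else ask to install.
pick_runtime() {
  has_bun="$1"
  has_npx="$2"
  if [ "$has_bun" = "1" ]; then
    echo "bun"
  elif [ "$has_npx" = "1" ]; then
    echo "npx -y bun"
  else
    echo "Please install bun (https://bun.sh)" >&2
    return 1
  fi
}

pick_runtime 1 1   # bun wins when both are available
pick_runtime 0 1   # falls back to npx -y bun
```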
Check EXTEND.md existence (priority: project → XDG config → user home):
```bash
# macOS, Linux, WSL, Git Bash
test -f .baoyu-skills/baoyu-image-gen/EXTEND.md && echo "project"
test -f "${XDG_CONFIG_HOME:-$HOME/.config}/baoyu-skills/baoyu-image-gen/EXTEND.md" && echo "xdg"
test -f "$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md" && echo "user"
```

```powershell
# PowerShell (Windows)
if (Test-Path .baoyu-skills/baoyu-image-gen/EXTEND.md) { "project" }
$xdg = if ($env:XDG_CONFIG_HOME) { $env:XDG_CONFIG_HOME } else { "$HOME/.config" }
if (Test-Path "$xdg/baoyu-skills/baoyu-image-gen/EXTEND.md") { "xdg" }
if (Test-Path "$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md") { "user" }
```

| Result | Action |
|---|---|
| Found | Load, parse, apply settings. If default_model.[provider] is null → ask model only (Flow 2) |
| Not found | ⛔ Run first-time setup (references/config/first-time-setup.md) → Save EXTEND.md → Then continue |
CRITICAL: If not found, complete the full setup (provider + model + quality + save location) using AskUserQuestion BEFORE generating any images. Generation is BLOCKED until EXTEND.md is created.
| Path | Location |
|---|---|
| `.baoyu-skills/baoyu-image-gen/EXTEND.md` | Project directory |
| `${XDG_CONFIG_HOME:-$HOME/.config}/baoyu-skills/baoyu-image-gen/EXTEND.md` | XDG config |
| `$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md` | User home |
EXTEND.md Supports: Default provider | Default quality | Default aspect ratio | Default image size | Default models | Batch worker cap | Provider-specific batch limits
Schema: references/config/preferences-schema.md
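A hypothetical EXTEND.md sketch covering the setting categories listed above. The authoritative schema lives in `references/config/preferences-schema.md`; every key name here is an assumption except the `default_model.[provider]` form, which appears in the Model priority section below.

```yaml
# Hypothetical EXTEND.md preferences sketch.
# Check references/config/preferences-schema.md for the real key names.
default_provider: google          # assumed key name
default_quality: 2k               # assumed key name
default_aspect_ratio: "16:9"      # assumed key name
default_model:
  google: "gemini-3-pro-image-preview"
  dashscope: "qwen-image-2.0-pro"
max_workers: 6                    # assumed key for the batch worker cap
```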
```bash
# Basic
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image cat.png
# With aspect ratio
${BUN_X} {baseDir}/scripts/main.ts --prompt "A landscape" --image out.png --ar 16:9
# High quality
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --quality 2k
# From prompt files
${BUN_X} {baseDir}/scripts/main.ts --promptfiles system.md content.md --image out.png
# With reference images (Google, OpenAI, Azure OpenAI, OpenRouter, Replicate, MiniMax, or Seedream 4.0/4.5/5.0)
${BUN_X} {baseDir}/scripts/main.ts --prompt "Make blue" --image out.png --ref source.png
# With reference images (explicit provider/model)
${BUN_X} {baseDir}/scripts/main.ts --prompt "Make blue" --image out.png --provider google --model gemini-3-pro-image-preview --ref source.png
# Azure OpenAI (model means deployment name)
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider azure --model gpt-image-1.5
# OpenRouter (recommended default model)
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider openrouter
# OpenRouter with reference images
${BUN_X} {baseDir}/scripts/main.ts --prompt "Make blue" --image out.png --provider openrouter --model google/gemini-3.1-flash-image-preview --ref source.png
# Specific provider
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider openai
# DashScope (阿里通义万象)
${BUN_X} {baseDir}/scripts/main.ts --prompt "一只可爱的猫" --image out.png --provider dashscope
# DashScope Qwen-Image 2.0 Pro (recommended for custom sizes and text rendering)
${BUN_X} {baseDir}/scripts/main.ts --prompt "为咖啡品牌设计一张 21:9 横幅海报,包含清晰中文标题" --image out.png --provider dashscope --model qwen-image-2.0-pro --size 2048x872
# DashScope legacy Qwen fixed-size model
${BUN_X} {baseDir}/scripts/main.ts --prompt "一张电影感海报" --image out.png --provider dashscope --model qwen-image-max --size 1664x928
# MiniMax
${BUN_X} {baseDir}/scripts/main.ts --prompt "A fashion editorial portrait by a bright studio window" --image out.jpg --provider minimax
# MiniMax with subject reference (best for character/portrait consistency)
${BUN_X} {baseDir}/scripts/main.ts --prompt "A girl stands by the library window, cinematic lighting" --image out.jpg --provider minimax --model image-01 --ref portrait.png --ar 16:9
# MiniMax with custom size (documented for image-01)
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cinematic poster" --image out.jpg --provider minimax --model image-01 --size 1536x1024
# Replicate (google/nano-banana-pro)
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate
# Replicate with specific model
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-banana
# Batch mode with saved prompt files
${BUN_X} {baseDir}/scripts/main.ts --batchfile batch.json
# Batch mode with explicit worker count
${BUN_X} {baseDir}/scripts/main.ts --batchfile batch.json --jobs 4 --json
```

Example batch file:

```json
{
  "jobs": 4,
  "tasks": [
    {
      "id": "hero",
      "promptFiles": ["prompts/hero.md"],
      "image": "out/hero.png",
      "provider": "replicate",
      "model": "google/nano-banana-pro",
      "ar": "16:9",
      "quality": "2k"
    },
    {
      "id": "diagram",
      "promptFiles": ["prompts/diagram.md"],
      "image": "out/diagram.png",
      "ref": ["references/original.png"]
    }
  ]
}
```

Paths in `promptFiles`, `image`, and `ref` are resolved relative to the batch file's directory. `jobs` is optional (overridden by CLI `--jobs`). A top-level array format (without the `jobs` wrapper) is also accepted.
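Since the top-level array format is also accepted, the same two tasks can be written without the `jobs` wrapper (the worker count then comes from `--jobs` or config):

```json
[
  {
    "id": "hero",
    "promptFiles": ["prompts/hero.md"],
    "image": "out/hero.png",
    "ar": "16:9"
  },
  {
    "id": "diagram",
    "promptFiles": ["prompts/diagram.md"],
    "image": "out/diagram.png",
    "ref": ["references/original.png"]
  }
]
```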
| Option | Description |
|---|---|
| `--prompt <text>`, `-p` | Prompt text |
| `--promptfiles <files...>` | Read prompt from files (concatenated) |
| `--image <path>` | Output image path (required in single-image mode) |
| `--batchfile <path>` | JSON batch file for multi-image generation |
| `--jobs <count>` | Worker count for batch mode (default: auto; max from config, built-in default 10) |
| `--provider google\|openai\|azure\|openrouter\|dashscope\|minimax\|jimeng\|seedream\|replicate` | Force provider (default: auto-detect) |
| `--model <id>`, `-m` | Model ID (Google: gemini-3-pro-image-preview; OpenAI: gpt-image-1.5; Azure: deployment name such as gpt-image-1.5 or image-prod; OpenRouter: google/gemini-3.1-flash-image-preview; DashScope: qwen-image-2.0-pro; MiniMax: image-01) |
| `--ar <ratio>` | Aspect ratio (e.g., 16:9, 1:1, 4:3) |
| `--size <WxH>` | Size (e.g., 1024x1024) |
| `--quality normal\|2k` | Quality preset (default: 2k) |
| `--imageSize 1K\|2K\|4K` | Image size for Google/OpenRouter (default: derived from quality) |
| `--ref <files...>` | Reference images. Supported by Google multimodal, OpenAI GPT Image edits, Azure OpenAI edits (PNG/JPG only), OpenRouter multimodal models, Replicate, MiniMax subject reference, and Seedream 5.0/4.5/4.0. Not supported by Jimeng, Seedream 3.0, or the removed SeedEdit 3.0 |
| `--n <count>` | Number of images |
| `--json` | JSON output |
| Variable | Description |
|---|---|
| `OPENAI_API_KEY` | OpenAI API key |
| `AZURE_OPENAI_API_KEY` | Azure OpenAI API key |
| `OPENROUTER_API_KEY` | OpenRouter API key |
| `GOOGLE_API_KEY` | Google API key |
| `DASHSCOPE_API_KEY` | DashScope (Alibaba Cloud) API key |
| `MINIMAX_API_KEY` | MiniMax API key |
| `REPLICATE_API_TOKEN` | Replicate API token |
| `JIMENG_ACCESS_KEY_ID` | Jimeng (即梦) Volcengine access key |
| `JIMENG_SECRET_ACCESS_KEY` | Jimeng (即梦) Volcengine secret key |
| `ARK_API_KEY` | Seedream (豆包) Volcengine ARK API key |
| `OPENAI_IMAGE_MODEL` | OpenAI model override |
| `AZURE_OPENAI_DEPLOYMENT` | Azure default deployment name |
| `AZURE_OPENAI_IMAGE_MODEL` | Backward-compatible alias for the Azure default deployment/model name |
| `OPENROUTER_IMAGE_MODEL` | OpenRouter model override (default: google/gemini-3.1-flash-image-preview) |
| `GOOGLE_IMAGE_MODEL` | Google model override |
| `DASHSCOPE_IMAGE_MODEL` | DashScope model override (default: qwen-image-2.0-pro) |
| `MINIMAX_IMAGE_MODEL` | MiniMax model override (default: image-01) |
| `REPLICATE_IMAGE_MODEL` | Replicate model override (default: google/nano-banana-pro) |
| `JIMENG_IMAGE_MODEL` | Jimeng model override (default: jimeng_t2i_v40) |
| `SEEDREAM_IMAGE_MODEL` | Seedream model override (default: doubao-seedream-5-0-260128) |
| `OPENAI_BASE_URL` | Custom OpenAI endpoint |
| `AZURE_OPENAI_BASE_URL` | Azure resource endpoint or deployment endpoint |
| `AZURE_API_VERSION` | Azure image API version (default: 2025-04-01-preview) |
| `OPENROUTER_BASE_URL` | Custom OpenRouter endpoint (default: https://openrouter.ai/api/v1) |
| `OPENROUTER_HTTP_REFERER` | Optional app/site URL for OpenRouter attribution |
| `OPENROUTER_TITLE` | Optional app name for OpenRouter attribution |
| `GOOGLE_BASE_URL` | Custom Google endpoint |
| `DASHSCOPE_BASE_URL` | Custom DashScope endpoint |
| `MINIMAX_BASE_URL` | Custom MiniMax endpoint (default: https://api.minimax.io) |
| `REPLICATE_BASE_URL` | Custom Replicate endpoint |
| `JIMENG_BASE_URL` | Custom Jimeng endpoint (default: https://visual.volcengineapi.com) |
| `JIMENG_REGION` | Jimeng region (default: cn-north-1) |
| `SEEDREAM_BASE_URL` | Custom Seedream endpoint (default: https://ark.cn-beijing.volces.com/api/v3) |
| `BAOYU_IMAGE_GEN_MAX_WORKERS` | Override batch worker cap |
| `BAOYU_IMAGE_GEN_<PROVIDER>_CONCURRENCY` | Override provider concurrency, e.g. `BAOYU_IMAGE_GEN_REPLICATE_CONCURRENCY` |
| `BAOYU_IMAGE_GEN_<PROVIDER>_START_INTERVAL_MS` | Override provider start gap, e.g. `BAOYU_IMAGE_GEN_REPLICATE_START_INTERVAL_MS` |
Load Priority: CLI args > EXTEND.md > env vars > <cwd>/.baoyu-skills/.env > ~/.baoyu-skills/.env
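Since `<cwd>/.baoyu-skills/.env` and `~/.baoyu-skills/.env` sit at the bottom of the load priority, they are a reasonable place for keys and caps. A sketch using variable names from the table above (the values are placeholders):

```shell
# .baoyu-skills/.env -- lowest-priority defaults; CLI args, EXTEND.md,
# and real environment variables all override these.
OPENAI_API_KEY=sk-placeholder
GOOGLE_API_KEY=placeholder
BAOYU_IMAGE_GEN_MAX_WORKERS=6
BAOYU_IMAGE_GEN_REPLICATE_CONCURRENCY=2
```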
Model priority (highest → lowest), applies to all providers:
1. `--model <id>`
2. `default_model.[provider]` (EXTEND.md)
3. `<PROVIDER>_IMAGE_MODEL` (e.g., `GOOGLE_IMAGE_MODEL`)

For Azure, `--model` / `default_model.azure` should be the Azure deployment name. `AZURE_OPENAI_DEPLOYMENT` is the preferred env var, and `AZURE_OPENAI_IMAGE_MODEL` remains as a backward-compatible alias.
EXTEND.md overrides env vars. If both EXTEND.md `default_model.google: "gemini-3-pro-image-preview"` and the env var `GOOGLE_IMAGE_MODEL=gemini-3.1-flash-image-preview` are set, EXTEND.md wins.
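The precedence rule can be sketched as a tiny shell helper (illustrative only; the real resolution happens inside `scripts/main.ts`): the first non-empty value among CLI flag, EXTEND.md setting, and env var wins.

```shell
# Return the first non-empty argument: cli_model, extend_md_model, env_model.
resolve_model() {
  for candidate in "$1" "$2" "$3"; do
    if [ -n "$candidate" ]; then
      echo "$candidate"
      return 0
    fi
  done
  return 1
}

# EXTEND.md beats the env var when no --model flag is given:
resolve_model "" "gemini-3-pro-image-preview" "gemini-3.1-flash-image-preview"
```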
Agent MUST display model info before each generation:
```text
Using [provider] / [model]
Switch model: --model <id> | EXTEND.md default_model.[provider] | env <PROVIDER>_IMAGE_MODEL
```

Use `--model qwen-image-2.0-pro` or set `default_model.dashscope` / `DASHSCOPE_IMAGE_MODEL` when the user wants official Qwen-Image behavior.
Official DashScope model families:

- `qwen-image-2.0-pro`, `qwen-image-2.0-pro-2026-03-03`, `qwen-image-2.0`, `qwen-image-2.0-2026-03-03`
  - `size` in 宽*高 (width*height) format, between `512*512` and `2048*2048`; default `1024*1024`
  - Strong at 21:9 and text-heavy Chinese/English layouts
- Legacy fixed-size models: `qwen-image-max`, `qwen-image-max-2025-12-30`, `qwen-image-plus`, `qwen-image-plus-2026-01-09`, `qwen-image`
  - Fixed sizes: `1664*928`, `1472*1104`, `1328*1328`, `1104*1472`, `928*1664`; default `1664*928`
  - `qwen-image` currently has the same capability as `qwen-image-plus`
- Other models: `z-image-turbo`, `z-image-ultra`, `wanx-v1`
When translating CLI args into DashScope behavior:
- `--size` wins over `--ar`
- For `qwen-image-2.0*`, prefer explicit `--size`; otherwise infer from `--ar` and use the official recommended resolutions below
- For `qwen-image-max`/`plus`/`image`, only use the five official fixed sizes; if the requested ratio is not covered, switch to `qwen-image-2.0-pro`
- `--quality` is a baoyu-image-gen compatibility preset, not a native DashScope API field. Mapping `normal` / `2k` onto the `qwen-image-2.0*` table below is an implementation inference, not an official API guarantee

Recommended `qwen-image-2.0*` sizes for common aspect ratios:
| Ratio | normal | 2k |
|---|---|---|
| 1:1 | 1024*1024 | 1536*1536 |
| 2:3 | 768*1152 | 1024*1536 |
| 3:2 | 1152*768 | 1536*1024 |
| 3:4 | 960*1280 | 1080*1440 |
| 4:3 | 1280*960 | 1440*1080 |
| 9:16 | 720*1280 | 1080*1920 |
| 16:9 | 1280*720 | 1920*1080 |
| 21:9 | 1344*576 | 2048*872 |
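The table can be mirrored in a small lookup helper (illustrative only; the real mapping lives in the script, and only a subset of rows is shown here):

```shell
# Map aspect ratio + quality preset to the recommended qwen-image-2.0* size.
# Mirrors a subset of the table above; returns 1 for unlisted combinations.
qwen2_size() {
  case "$1/$2" in
    "1:1/normal")  echo "1024*1024" ;;
    "1:1/2k")      echo "1536*1536" ;;
    "16:9/normal") echo "1280*720"  ;;
    "16:9/2k")     echo "1920*1080" ;;
    "9:16/2k")     echo "1080*1920" ;;
    "21:9/2k")     echo "2048*872"  ;;
    *) return 1 ;;
  esac
}

qwen2_size "16:9" "2k"   # the 2k size for 16:9
```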
DashScope official APIs also expose negative_prompt, prompt_extend, and watermark, but baoyu-image-gen does not expose them as dedicated CLI flags today.
Official references:
Use --model image-01 or set default_model.minimax / MINIMAX_IMAGE_MODEL when the user wants MiniMax image generation.
Official MiniMax image model options currently documented in the API reference:
- `image-01` (recommended default)
  - `aspect_ratio` values: 1:1, 16:9, 4:3, 3:2, 2:3, 3:4, 9:16, 21:9
  - Custom `width` / `height` output sizes when using `--size <WxH>`
  - `width` and `height` must both be between 512 and 2048, and both must be divisible by 8
- `image-01-live`
  - Use `--ar` for sizing; MiniMax documents custom `width` / `height` as only effective for `image-01`

MiniMax subject reference notes:

- `--ref` files are sent as MiniMax `subject_reference`
- `subject_reference[].type` is sent as `character`
- `image_file` supports public URLs or Base64 Data URLs; baoyu-image-gen sends local refs as Data URLs

Official references:
Use full OpenRouter model IDs, e.g.:
- `google/gemini-3.1-flash-image-preview` (recommended, supports image output and reference-image workflows)
- `google/gemini-2.5-flash-image-preview`
- `black-forest-labs/flux.2-pro`

Notes:

- Image generation goes through `/chat/completions`, not the OpenAI `/images` endpoints
- When `--ref` is used, choose a multimodal model that supports image input and image output
- `--imageSize` maps to OpenRouter `imageGenerationOptions.size`; `--size <WxH>` is converted to the nearest OpenRouter size and inferred aspect ratio when possible

Supported model formats:

- `owner/name` (recommended for official models), e.g. `google/nano-banana-pro`
- `owner/name:version` (community models by version), e.g. `stability-ai/sdxl:<version>`

Examples:

```bash
# Use Replicate default model
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate
# Override model explicitly
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-banana
```

Provider auto-selection:

- `--ref` provided + no `--provider` → auto-select Google first, then OpenAI, then Azure, then OpenRouter, then Replicate, then Seedream, then MiniMax (MiniMax subject reference is more specialized toward character/portrait consistency)
- `--provider` specified → use it (if `--ref`, must be google, openai, azure, openrouter, replicate, seedream, or minimax)

| Preset | Google imageSize | OpenAI Size | OpenRouter size | Replicate resolution | Use Case |
|---|---|---|---|---|---|
| normal | 1K | 1024px | 1K | 1K | Quick previews |
| 2k (default) | 2K | 2048px | 2K | 2K | Covers, illustrations, infographics |
Google/OpenRouter imageSize: Can be overridden with --imageSize 1K|2K|4K
Supported: 1:1, 16:9, 9:16, 4:3, 3:4, 2.35:1
- Google: `imageConfig.aspectRatio`
- OpenRouter: `imageGenerationOptions.aspect_ratio`; if only `--size <WxH>` is given, the aspect ratio is inferred automatically
- Replicate: passes `aspect_ratio` to the model; when `--ref` is provided without `--ar`, defaults to `match_input_image`
- MiniMax: sends `aspect_ratio` values directly; if `--size <WxH>` is given without `--ar`, `width` / `height` are sent for `image-01`

Default: Sequential generation.
Batch Parallel Generation: When --batchfile contains 2 or more pending tasks, the script automatically enables parallel generation.
| Mode | When to Use |
|---|---|
| Sequential (default) | Normal usage, single images, small batches |
| Parallel batch | Batch mode with 2+ tasks |
Execution choice:
| Situation | Preferred approach | Why |
|---|---|---|
| One image, or 1-2 simple images | Sequential | Lower coordination overhead and easier debugging |
| Multiple images already have saved prompt files | Batch (--batchfile) | Reuses finalized prompts, applies shared throttling/retries, and gives predictable throughput |
| Each image still needs separate reasoning, prompt writing, or style exploration | Subagents | The work is still exploratory, so each image may need independent analysis before generation |
| Output comes from baoyu-article-illustrator with outline.md + prompts/ | Batch (build-batch.ts -> --batchfile) | That workflow already produces prompt files, so direct batch execution is the intended path |
Rule of thumb:
Parallel behavior:
- `--jobs <count>` overrides the worker count

Custom configurations via EXTEND.md. See the Preferences section for paths and supported options.