Agent skill
image-generation
Gemini image generation and editing skill for text-to-image, image-to-image edits, multi-reference composition, and Google Search grounding. Use when creating or modifying images via Gemini (default model gemini-3-pro-image-preview) with the Python SDK.
Install this agent skill to your Project
npx add-skill https://github.com/Xiangyu-CAS/Vision-Skills/tree/main/skills/image-generation
SKILL.md
Image Generation with Gemini
Use this skill when the user asks to generate or edit images with Gemini using the Python SDK. Default to gemini-3-pro-image-preview, and mention gemini-2.5-flash-image only as an optional faster/cheaper alternative.
Workflow
- Identify task type (text-to-image, edit, or multi-reference).
- Ensure GEMINI_API_KEY is available (set in the environment or stored in .env), then use the Python SDK. This will make network requests to the Gemini API.
- Choose model + output (response_modalities=["IMAGE"] if image-only) and run. Generation can take ~30 seconds; allow 30–60 seconds before retrying.
- Save returned images with part.as_image(); if none, report a clear error.
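The first two workflow steps might be sketched as follows. This is a minimal illustration, not the skill's own reference code: the helper names load_api_key and generate are hypothetical, the .env parsing is deliberately naive (plain KEY=value lines only), and the SDK calls assume a recent google-genai package.

```python
import os
from pathlib import Path

DEFAULT_MODEL = "gemini-3-pro-image-preview"  # default named in this skill
FAST_MODEL = "gemini-2.5-flash-image"         # optional faster/cheaper alternative


def load_api_key(dotenv_path=".env", environ=None):
    """Return GEMINI_API_KEY from the environment, falling back to a .env file."""
    environ = os.environ if environ is None else environ
    key = environ.get("GEMINI_API_KEY")
    if key:
        return key
    path = Path(dotenv_path)
    if path.is_file():
        # Naive parser (assumption): handles plain KEY=value lines only.
        for line in path.read_text().splitlines():
            name, sep, value = line.strip().partition("=")
            if sep and name.strip() == "GEMINI_API_KEY":
                return value.strip().strip("'\"")
    raise RuntimeError("GEMINI_API_KEY not found in environment or .env")


def generate(prompt, model=DEFAULT_MODEL, image_only=True):
    """Send one text-to-image request (makes a network call to the Gemini API)."""
    # Imported lazily so load_api_key works even without the SDK installed.
    from google import genai
    from google.genai import types

    client = genai.Client(api_key=load_api_key())
    config = types.GenerateContentConfig(
        response_modalities=["IMAGE"] if image_only else ["TEXT", "IMAGE"]
    )
    return client.models.generate_content(
        model=model, contents=[prompt], config=config
    )
```

Passing model=FAST_MODEL switches to the optional alternative; everything else stays the same.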
Use these references
references/python.md for Python SDK usage
Response handling (Python SDK)
Use part.as_image() to access image outputs and save them. If no image parts are returned, surface a clear error and suggest checking the API key, model name, and response modalities.
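A small helper along these lines could cover the save-or-fail behavior described above. It is a sketch under assumptions: save_image_parts is a hypothetical name, the .png extension and the outputs directory are illustrative choices, and the object returned by part.as_image() is assumed to expose a save(path) method.

```python
from pathlib import Path


def save_image_parts(parts, out_dir="outputs", stem="image"):
    """Save every image part and return the written paths; raise if none exist."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for part in parts or []:
        # Text-only parts carry no inline_data, so skip them.
        if getattr(part, "inline_data", None) is None:
            continue
        image = part.as_image()
        if image is None:
            continue
        path = out / f"{stem}_{len(saved)}.png"  # extension is an assumption
        image.save(str(path))
        saved.append(path)
    if not saved:
        raise RuntimeError(
            "No image parts returned. Check GEMINI_API_KEY, the model name, "
            "and response_modalities."
        )
    return saved
```

In practice the parts would come from the SDK response, for example response.parts, or response.candidates[0].content.parts on SDK versions that lack that shortcut.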
Timing note
Image generation may take around 30 seconds. When running commands via the shell tool, set a longer timeout (e.g., 60–120 seconds) to avoid premature timeouts.
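When the generation script is launched from Python rather than from the agent's shell tool, the same timeout guidance might look like the sketch below. The name run_generation_script is hypothetical, and the 120-second default simply mirrors the upper end of the range suggested above.

```python
import subprocess
import sys


def run_generation_script(args, timeout_s=120):
    """Run a command, allowing up to timeout_s seconds before giving up."""
    try:
        done = subprocess.run(
            args, capture_output=True, text=True, timeout=timeout_s, check=True
        )
    except subprocess.TimeoutExpired as exc:
        raise RuntimeError(
            f"Timed out after {timeout_s}s; image generation may need a longer timeout."
        ) from exc
    return done.stdout
```

A call such as run_generation_script([sys.executable, "generate.py"]) would then wait up to two minutes, where generate.py stands in for whatever script performs the request.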