Agent skill
nano-banana
This skill generates images using Google's Gemini image generation models (Nano Banana). It should be used when the user needs to create, generate, or produce images from text prompts -- for presentations, articles, concepts, illustrations, or any visual content. Supports fast generation (Gemini 2.5 Flash Image) and high-quality generation (Gemini 3 Pro Image).
Install this agent skill to your Project
npx add-skill https://github.com/glebis/claude-skills/tree/main/nano-banana
SKILL.md
Nano Banana - Gemini Image Generation
Generate images from text prompts via Google's Gemini image generation API.
When to Use
- User requests image generation, creation, or production from a text description
- Creating illustrations or visuals for presentations, articles, or documents
- Generating concept art, mockups, diagrams, or placeholder images
- Editing existing images with text instructions
Requirements
- GEMINI_API_KEY environment variable (get a key from https://ai.google.dev/)
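A preflight check can confirm the key is present before any generation is attempted. The sketch below is illustrative only: `require_api_key` is a hypothetical helper, not something the bundled script is known to provide.

```shell
# Hypothetical helper: fail early when GEMINI_API_KEY is missing or empty.
require_api_key() {
  if [ -z "${GEMINI_API_KEY:-}" ]; then
    echo "GEMINI_API_KEY is not set; get a key from https://ai.google.dev/" >&2
    return 1
  fi
}

# Example use at the top of a wrapper script:
#   require_api_key || exit 1
```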
Quick Start
Generate an image using the bundled script:
scripts/generate_image.sh "a minimalist flat illustration of a rocket" ./output.png
The script accepts three arguments:
- Prompt (required) -- text description of the image
- Output path (optional, default: ./generated_image.png)
- Model (optional, default: gemini-2.5-flash-image)
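The defaults listed above could be resolved with ordinary shell parameter expansion. This is a minimal sketch of that idea, not the bundled script's actual code; `resolve_args` is a hypothetical name.

```shell
# Hypothetical sketch of the argument handling described above;
# the real scripts/generate_image.sh may be implemented differently.
resolve_args() (
  prompt="${1:?prompt is required}"            # required: abort if missing
  output="${2:-./generated_image.png}"         # optional: default output path
  model="${3:-gemini-2.5-flash-image}"         # optional: default model
  printf '%s\n%s\n%s\n' "$prompt" "$output" "$model"
)
```

The function body runs in a subshell, so a missing prompt makes only the function fail (with a non-zero status) rather than terminating the calling shell.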
Models
| Model | Use When |
|---|---|
| gemini-2.5-flash-image (default) | Fast iteration, bulk generation, most tasks |
| gemini-3-pro-image-preview | Text in images, final polished assets, complex scenes |
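The choice in the table can be encoded as a small lookup so callers name the kind of task rather than the model ID. `pick_model` and its task keywords (`text`, `final`, `complex`) are assumptions made for illustration.

```shell
# Hypothetical helper: map a task type to one of the two models in the table.
pick_model() {
  case "${1:-draft}" in
    text|final|complex) echo "gemini-3-pro-image-preview" ;;  # quality matters
    *)                  echo "gemini-2.5-flash-image" ;;      # fast default
  esac
}

# Example:
#   scripts/generate_image.sh "poster with the text \"Launch Day\"" ./poster.png "$(pick_model text)"
```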
Workflow
Basic Generation
- Craft a descriptive prompt (style + subject + composition + colors)
- Run: scripts/generate_image.sh "prompt" ./path/to/output.png
- Open the result: open ./path/to/output.png (macOS; on Linux, use xdg-open)
High-Quality Generation
For important or text-heavy images, specify the Pro model:
scripts/generate_image.sh "diagram showing..." ./diagram.png gemini-3-pro-image-preview
Image Editing
To edit an existing image, call the API directly and include the input image in the request. See references/api_reference.md for the request format, which uses inlineData to carry the base64-encoded source image.
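As a rough sketch of what such a request body might look like, the function below combines a text instruction with a base64-encoded image. The field names (`contents`, `parts`, `inlineData`, `mimeType`, `data`) are assumptions based on the inlineData format mentioned above, and `build_edit_payload` is a hypothetical helper; references/api_reference.md remains the authority on the exact shape.

```shell
# Sketch only: builds a JSON request body with a text instruction plus a
# base64-encoded source image. Verify field names against
# references/api_reference.md before relying on this.
build_edit_payload() (
  instruction="$1"
  image_path="$2"
  mime_type="${3:-image/png}"   # assumed default MIME type

  # Encode the image and strip line wraps (works with GNU and BSD base64).
  image_b64=$(base64 < "$image_path" | tr -d '\n')

  # Escape backslashes and double quotes so the instruction is valid inside a
  # JSON string. Newlines and other control characters are NOT handled here.
  escaped=$(printf '%s' "$instruction" | sed 's/\\/\\\\/g; s/"/\\"/g')

  printf '{"contents":[{"parts":[{"text":"%s"},{"inlineData":{"mimeType":"%s","data":"%s"}}]}]}\n' \
    "$escaped" "$mime_type" "$image_b64"
)
```

The output could then be piped to an HTTP client as the POST body for the endpoint documented in references/api_reference.md. For instructions containing newlines or unusual characters, a real JSON tool would be a safer way to build the body.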
Batch Generation
To generate multiple images, run the script in a loop:
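# A counter keeps names unique even when two images finish in the same second
i=0
for prompt in "prompt one" "prompt two" "prompt three"; do
  i=$((i + 1))
  scripts/generate_image.sh "$prompt" "./output_$(date +%s)_${i}.png"
done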
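Descriptive filenames can make batch output easier to match back to prompts. One possible approach, shown here as a sketch, is to derive a slug from each prompt; `slugify` is a hypothetical helper and not part of the skill.

```shell
# Hypothetical helper: turn a prompt into a filesystem-friendly slug by
# lowercasing it and collapsing every run of non-alphanumerics into one hyphen.
slugify() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed 's/[^a-z0-9]\{1,\}/-/g; s/^-//; s/-$//'
}

# Example use in a batch loop:
#   for prompt in "prompt one" "prompt two"; do
#     scripts/generate_image.sh "$prompt" "./$(slugify "$prompt").png"
#   done
```

Two prompts that differ only in punctuation or case would map to the same slug, so this suits batches of clearly distinct prompts.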
Prompt Tips
- Specify visual style: "photograph", "flat illustration", "watercolor", "3D render"
- Include composition: "centered", "wide shot", "white background"
- Name colors: "blue and white color scheme", "warm earth tones"
- For text rendering, use the Pro model and quote the exact text: 'with the text "Hello"'
See references/api_reference.md for comprehensive prompt engineering guidance.
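The tips above amount to assembling a prompt from a few named parts (style, subject, composition, colors). A minimal sketch of that assembly follows; `build_prompt` is a hypothetical helper, and joining the parts with commas is just one reasonable convention.

```shell
# Hypothetical helper: join the non-empty parts into one comma-separated prompt.
build_prompt() (
  result=""
  for part in "$@"; do
    [ -n "$part" ] || continue          # skip parts left empty
    if [ -z "$result" ]; then
      result="$part"
    else
      result="$result, $part"
    fi
  done
  printf '%s\n' "$result"
)

# Example:
#   scripts/generate_image.sh \
#     "$(build_prompt "flat illustration" "a rocket" "centered, white background" "blue and white color scheme")" \
#     ./rocket.png
```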
Resources
- scripts/generate_image.sh -- Main generation script. Handles API call, error reporting, base64 decoding, and file output.
- references/api_reference.md -- Full API documentation: endpoints, request/response formats, models, prompt tips, error codes.