Agent skill

nano-banana

This skill generates images using Google's Gemini image generation models (Nano Banana). It should be used when the user needs to create, generate, or produce images from text prompts -- for presentations, articles, concepts, illustrations, or any visual content. Supports fast generation (Gemini 2.5 Flash Image) and high-quality generation (Gemini 3 Pro Image).

Install this agent skill to your Project

npx add-skill https://github.com/glebis/claude-skills/tree/main/nano-banana

SKILL.md

Nano Banana - Gemini Image Generation

Generate images from text prompts via Google's Gemini image generation API.

When to Use

  • User requests image generation, creation, or production from a text description
  • Creating illustrations or visuals for presentations, articles, or documents
  • Generating concept art, mockups, diagrams, or placeholder images
  • Editing existing images with text instructions

Requirements

Quick Start

Generate an image using the bundled script:

bash
scripts/generate_image.sh "a minimalist flat illustration of a rocket" ./output.png

The script accepts three arguments:

  1. Prompt (required) -- text description of the image
  2. Output path (optional, default: ./generated_image.png)
  3. Model (optional, default: gemini-2.5-flash-image)
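
The defaults listed above could be resolved with POSIX parameter expansion. This is a minimal sketch only, not the bundled script's actual code; `resolve_args` is a hypothetical helper name used for illustration.

```shell
# Hypothetical sketch of the three positional arguments and their defaults.
# The function body runs in a subshell, so a missing prompt aborts only the call.
resolve_args() (
  # Argument 1 is required: fail with a message if it is missing or empty.
  prompt="${1:?prompt is required}"
  # Arguments 2 and 3 fall back to the documented defaults.
  output="${2:-./generated_image.png}"
  model="${3:-gemini-2.5-flash-image}"
  printf 'prompt=%s\noutput=%s\nmodel=%s\n' "$prompt" "$output" "$model"
)
```

Calling `resolve_args "a rocket"` prints the prompt followed by the default output path and default model; passing a second or third argument overrides the corresponding default.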

Models

Model                             Use When
gemini-2.5-flash-image (default)  Fast iteration, bulk generation, most tasks
gemini-3-pro-image-preview        Text in images, final polished assets, complex scenes

Workflow

Basic Generation

  1. Craft a descriptive prompt (style + subject + composition + colors)
  2. Run: scripts/generate_image.sh "prompt" ./path/to/output.png
  3. Open result: open ./path/to/output.png (macOS; on Linux, use xdg-open instead)
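
Step 1 can be made repeatable by assembling the prompt from its four components. The sketch below assumes a hypothetical `build_prompt` helper; the joining format ("style of subject, composition, colors") is one reasonable choice, not something the skill prescribes.

```shell
# Hypothetical helper: join style, subject, composition, and colors into one prompt.
build_prompt() (
  style="$1"; subject="$2"; composition="$3"; colors="$4"
  printf '%s of %s, %s, %s\n' "$style" "$subject" "$composition" "$colors"
)

# Example use with the bundled script (commented out so nothing is generated on load):
# scripts/generate_image.sh "$(build_prompt "flat illustration" "a rocket" \
#   "centered on a white background" "blue and white color scheme")" ./rocket.png
```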

High-Quality Generation

For important or text-heavy images, specify the Pro model:

bash
scripts/generate_image.sh "diagram showing..." ./diagram.png gemini-3-pro-image-preview

Image Editing

To edit an existing image, call the API directly and include the image in the request. See references/api_reference.md for the request format, which uses an inlineData part carrying the base64-encoded source image.
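
As a rough illustration of such a request body, the sketch below builds a JSON payload with a text instruction and an inlineData part. The helper names `json_escape` and `build_edit_payload` are hypothetical, the PNG MIME type is an assumption, and the field layout should be checked against references/api_reference.md before use.

```shell
# Escape backslashes and double quotes so the text is safe inside a JSON string.
# Assumes single-line text without control characters.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Hypothetical helper: print a request body containing the edit instruction ($1)
# and the base64-encoded contents of the source image file ($2).
build_edit_payload() (
  instruction="$(json_escape "$1")"
  # Strip line wraps that some base64 implementations insert.
  image_b64="$(base64 < "$2" | tr -d '\n')"
  # Assumption: the source image is a PNG.
  printf '{"contents":[{"parts":[{"text":"%s"},{"inlineData":{"mimeType":"image/png","data":"%s"}}]}]}\n' \
    "$instruction" "$image_b64"
)
```

The output of `build_edit_payload "make the sky purple" ./input.png` can be saved to a file and sent as the request body to the endpoint described in the API reference.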

Batch Generation

To generate multiple images, run the script in a loop:

bash
# Number the outputs; timestamp-based names can collide within the same second.
i=0
for prompt in "prompt one" "prompt two" "prompt three"; do
  i=$((i + 1))
  scripts/generate_image.sh "$prompt" "./output_${i}.png"
done
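
Descriptive filenames can make batch output easier to match back to its prompts. A minimal sketch, assuming a hypothetical `slugify` helper that is not part of the skill:

```shell
# Hypothetical helper: turn a prompt into a lowercase, dash-separated filename stem.
slugify() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-//' -e 's/-$//'
}

# Example loop (commented out so that loading this file makes no API calls):
# for prompt in "prompt one" "prompt two"; do
#   scripts/generate_image.sh "$prompt" "./$(slugify "$prompt").png"
# done
```

Two prompts that slugify to the same stem would overwrite each other, so this approach suits batches of distinct prompts.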

Prompt Tips

  • Specify visual style: "photograph", "flat illustration", "watercolor", "3D render"
  • Include composition: "centered", "wide shot", "white background"
  • Name colors: "blue and white color scheme", "warm earth tones"
  • For text rendering, use Pro model and quote exact text: 'with the text "Hello"'

See references/api_reference.md for comprehensive prompt engineering guidance.
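
Quoting exact text inside a shell argument is easy to get wrong because the prompt itself contains double quotes. The sketch below assumes a hypothetical `with_text` helper that appends the quoted text to a base prompt.

```shell
# Hypothetical helper: append an exact, double-quoted text requirement to a prompt.
with_text() {
  printf '%s with the text "%s"\n' "$1" "$2"
}

# Example use with the Pro model (commented out so nothing is generated on load):
# scripts/generate_image.sh "$(with_text "a conference poster" "Hello")" \
#   ./poster.png gemini-3-pro-image-preview
```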

Resources

  • scripts/generate_image.sh -- Main generation script. Handles API call, error reporting, base64 decoding, and file output.
  • references/api_reference.md -- Full API documentation: endpoints, request/response formats, models, prompt tips, error codes.
