Gemini Image Generation

Generate and edit images using Google's Gemini API. Requires the GEMINI_API_KEY environment variable to be set.

Quick Reference

Setting	Default	Options
Model	`gemini-3-pro-image-preview`	Use this for all generation
Resolution	1K	1K, 2K, 4K
Aspect Ratio	1:1	1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9

CLI Scripts

Generate Image

bash

python scripts/generate_image.py "A cat in space" output.jpg
python scripts/generate_image.py "Epic landscape" landscape.jpg --aspect 16:9 --size 2K
python scripts/generate_image.py "Logo for Acme Corp" logo.jpg --aspect 1:1
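The source of scripts/generate_image.py is not shown here, so the following is only a sketch of how a script with this command line could parse its arguments. The positional names `prompt` and `output`, the `build_parser` helper, and the defaults are assumptions inferred from the invocations above and the Quick Reference table, not the script's actual code.

```python
import argparse

# Allowed values copied from the Quick Reference table.
ASPECT_RATIOS = ["1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9"]
SIZES = ["1K", "2K", "4K"]


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the generate_image.py invocations above."""
    parser = argparse.ArgumentParser(description="Generate an image from a text prompt.")
    parser.add_argument("prompt", help="Text prompt describing the image")
    parser.add_argument("output", help="Output path, e.g. output.jpg")
    parser.add_argument("--aspect", choices=ASPECT_RATIOS, default="1:1")
    parser.add_argument("--size", choices=SIZES, default="1K")
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(
    ["Epic landscape", "landscape.jpg", "--aspect", "16:9", "--size", "2K"]
)
print(args.prompt, args.output, args.aspect, args.size)
```

Using `choices` lets argparse reject an unsupported ratio or size before any API call is made.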

Edit Image

bash

python scripts/edit_image.py input.jpg "Add a rainbow" output.jpg
python scripts/edit_image.py photo.jpg "Make it look like Van Gogh" artistic.jpg

Core API Pattern

python

import os
from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=["Your prompt here"],
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
    ),
)

for part in response.parts:
    if part.text:
        print(part.text)
    elif part.inline_data:
        image = part.as_image()
        image.save("output.jpg")  # Always use .jpg!

Custom Resolution & Aspect Ratio

python

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=[prompt],
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
        image_config=types.ImageConfig(
            aspect_ratio="16:9",
            image_size="2K"
        ),
    )
)
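Because an unsupported ratio or size only fails once the request is sent, it can help to check the values locally first. The sketch below validates against the options listed in the Quick Reference table; `image_options` is a hypothetical helper, and normalising the size to uppercase assumes the API expects the form shown in that table.

```python
# Allowed values copied from the Quick Reference table.
ASPECT_RATIOS = {"1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9"}
IMAGE_SIZES = {"1K", "2K", "4K"}


def image_options(aspect_ratio: str = "1:1", image_size: str = "1K") -> dict:
    """Return validated keyword arguments for an image config (hypothetical helper)."""
    ratio = aspect_ratio.strip()
    size = image_size.strip().upper()  # normalise "2k" to "2K"
    if ratio not in ASPECT_RATIOS:
        raise ValueError(f"Unsupported aspect ratio: {aspect_ratio!r}")
    if size not in IMAGE_SIZES:
        raise ValueError(f"Unsupported image size: {image_size!r}")
    return {"aspect_ratio": ratio, "image_size": size}


print(image_options("16:9", "2k"))
```

The returned dictionary could then be unpacked into the config, for example `types.ImageConfig(**image_options("16:9", "2K"))`.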

Editing Images

python

from PIL import Image

img = Image.open("input.jpg")
response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=["Add a sunset to this scene", img],
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
    ),
)

Multi-Turn Refinement

python

chat = client.chats.create(
    model="gemini-3-pro-image-preview",
    config=types.GenerateContentConfig(response_modalities=['TEXT', 'IMAGE'])
)

response = chat.send_message("Create a logo for 'Acme Corp'")
# Save first image...

response = chat.send_message("Make the text bolder and add a blue gradient")
# Save refined image...
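The "Save ... image" steps are left open above. One possible way to fill them in is a small helper that writes every inline image of a turn to a numbered file. This is a sketch, not part of the SDK: `save_image_parts` is a hypothetical name, it assumes each image part exposes its raw bytes as `inline_data.data`, and it uses the .jpg extension on the assumption (stated in the File Format section) that the returned bytes are JPEG.

```python
from pathlib import Path


def save_image_parts(parts, stem: str, out_dir: str = ".") -> list:
    """Write each inline image in `parts` to <stem>_<n>.jpg and return the paths."""
    saved = []
    for part in parts:
        inline = getattr(part, "inline_data", None)
        if inline is None or not inline.data:
            continue  # text-only part, nothing to write
        path = Path(out_dir) / f"{stem}_{len(saved) + 1}.jpg"
        path.write_bytes(inline.data)  # raw bytes as returned, assumed JPEG
        saved.append(path)
    return saved
```

In the chat above this could be called as `save_image_parts(response.parts, "logo_v1")` after the first message and `save_image_parts(response.parts, "logo_v2")` after the second, so earlier versions are not overwritten.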

Prompting Best Practices

Style	Prompt Pattern
Photorealistic	Include camera: lens, lighting, angle, mood
Stylized Art	Specify style explicitly: "kawaii-style", "cel-shading"
Text in Images	Be explicit: font style, placement, colors
Product Mockups	Describe lighting setup and surface
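The patterns in this table all amount to "subject plus explicit details". As an illustration only, a prompt can be assembled from those pieces with a small helper; `build_prompt` is a hypothetical function, not part of any library.

```python
def build_prompt(subject: str, *details: str) -> str:
    """Join a subject and its descriptive details into one comma-separated prompt."""
    pieces = [subject.strip()] + [d.strip() for d in details if d.strip()]
    return ", ".join(pieces)


photo_prompt = build_prompt(
    "A photorealistic close-up portrait",
    "85mm lens",                # camera / lens
    "soft golden hour light",   # lighting
    "shallow depth of field",   # mood / look
)
print(photo_prompt)
```

Keeping the details as separate arguments makes it easy to vary one aspect (for example the lighting) while holding the rest of the prompt fixed.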

Examples

# Photorealistic
"A photorealistic close-up portrait, 85mm lens, soft golden hour light, shallow depth of field"

# Stylized
"A kawaii-style sticker of a happy red panda, bold outlines, cel-shading, white background"

# Logo with text
"Create a logo with text 'Daily Grind' in clean sans-serif, black and white, coffee bean motif"

# Product mockup
"Studio-lit product photo on polished concrete, three-point softbox setup, 45-degree angle"

Advanced Features

Google Search Grounding

python

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=["Visualize today's weather in Tokyo as an infographic"],
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
        tools=[{"google_search": {}}]
    )
)
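The Notes section states that image-only mode does not work with Google Search grounding. A local guard can catch that combination before a request is made. This is a sketch under assumptions: `check_modalities` is a hypothetical helper, and it only recognises tools written as dictionaries with a `google_search` key, as in the example above.

```python
def check_modalities(response_modalities, tools=None) -> list:
    """Reject image-only output when a google_search tool is configured."""
    uses_search = any(
        isinstance(tool, dict) and "google_search" in tool for tool in (tools or [])
    )
    requested = {m.upper() for m in response_modalities}
    if uses_search and requested == {"IMAGE"}:
        raise ValueError(
            "Image-only mode is not supported with Google Search grounding; "
            "include 'TEXT' in response_modalities."
        )
    return list(response_modalities)


print(check_modalities(["TEXT", "IMAGE"], tools=[{"google_search": {}}]))
```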

Multiple Reference Images (Up to 14)

python

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=[
        "Create a group photo of these people in an office",
        Image.open("person1.jpg"),
        Image.open("person2.jpg"),
        Image.open("person3.jpg"),
    ],
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
    ),
)
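Since the heading gives a limit of 14 reference images, the contents list can be checked before the request is built. The helper below is a hypothetical sketch; it only counts the images and places the prompt first, as in the example above.

```python
MAX_REFERENCE_IMAGES = 14  # limit stated in the section heading


def build_contents(prompt: str, images) -> list:
    """Return [prompt, image1, image2, ...], enforcing the reference-image limit."""
    images = list(images)
    if len(images) > MAX_REFERENCE_IMAGES:
        raise ValueError(
            f"Got {len(images)} reference images; at most {MAX_REFERENCE_IMAGES} are supported."
        )
    return [prompt, *images]
```

With Pillow this could be used as `build_contents("Create a group photo of these people in an office", [Image.open(p) for p in paths])`, where `paths` is your own list of file names.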

Critical: File Format

Gemini returns JPEG data by default, so always save with a .jpg extension unless you convert the image explicitly.

python

# CORRECT
image.save("output.jpg")

# WRONG - causes "Image does not match media type" errors
image.save("output.png")  # Creates JPEG with PNG extension!

If PNG is Required

python

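from io import BytesIO
from PIL import Image

for part in response.parts:
    if part.inline_data:
        # Decode the returned bytes with Pillow, then re-encode them as PNG
        img = Image.open(BytesIO(part.inline_data.data))
        img.save("output.png", format="PNG")  # Explicit conversion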

Verify Format

bash

file image.png
# If output shows "JPEG image data" - rename to .jpg!
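Where the `file` command is not available, the same check can be done by reading the file's leading bytes. The sketch below compares the standard JPEG and PNG signatures against the file extension; the function names are hypothetical.

```python
from pathlib import Path

JPEG_MAGIC = b"\xff\xd8\xff"        # start of every JPEG file
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"    # 8-byte PNG signature


def sniff_format(data: bytes) -> str:
    """Identify JPEG or PNG data from its leading bytes."""
    if data.startswith(JPEG_MAGIC):
        return "jpeg"
    if data.startswith(PNG_MAGIC):
        return "png"
    return "unknown"


def extension_matches_content(path: str) -> bool:
    """Return True if the file extension agrees with the actual image data."""
    expected = {".jpg": "jpeg", ".jpeg": "jpeg", ".png": "png"}.get(Path(path).suffix.lower())
    with open(path, "rb") as f:
        actual = sniff_format(f.read(8))
    return expected is not None and expected == actual
```

A `False` result for a .png file that contains JPEG data is the mismatch described above; renaming it to .jpg or converting it explicitly resolves it.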

Notes

All generated images include SynthID watermarks
Default to 1K for speed; use 2K/4K when quality is critical
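For editing, describe changes conversationally; the model understands semantic masking, so you can name the region to change instead of supplying a mask
Image-only mode (response_modalities=['IMAGE']) won't work with Google Search grounding; keep 'TEXT' in response_modalities when using the google_search tool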
