Agent skill
image-gen
Modular image generation - supports local SDXL Lightning, OpenAI DALL-E, Replicate, or custom providers
Install this agent skill to your Project
npx add-skill https://github.com/DNYoussef/context-cascade/tree/main/skills/specialists/image-gen
SKILL.md
Modular Image Generation Skill
LIBRARY-FIRST PROTOCOL (MANDATORY)
Before writing ANY code, you MUST check:
Step 1: Library Catalog
- Location:
.claude/library/catalog.json - If match >70%: REUSE or ADAPT
Step 2: Patterns Guide
- Location:
.claude/docs/inventories/LIBRARY-PATTERNS-GUIDE.md - If pattern exists: FOLLOW documented approach
Step 3: Existing Projects
- Location:
D:\Projects\* - If found: EXTRACT and adapt
Decision Matrix
| Match | Action |
|---|---|
| Library >90% | REUSE directly |
| Library 70-90% | ADAPT minimally |
| Pattern exists | FOLLOW pattern |
| In project | EXTRACT |
| No match | BUILD (add to library after) |
Purpose
Generate images using the best available provider - local models (free, private) or cloud APIs (fast, paid). Fully modular architecture allows plugging in any image generation backend.
Providers
| Provider | Type | Cost | Requirements | Quality |
|---|---|---|---|---|
| SDXL Lightning | Local | Free | 8GB VRAM, ~7GB disk | Excellent |
| OpenAI DALL-E 3 | API | ~$0.04/image | OPENAI_API_KEY | Excellent |
| Replicate | API | ~$0.01/image | REPLICATE_API_TOKEN | Good |
| Custom | Any | Varies | User-defined | Varies |
When to Use
Perfect For:
- Blog banners and social media images (LinkedIn: 1200x630)
- Documentation diagrams and illustrations
- UI mockups and wireframes
- Concept visualization
- Any image generation need
Provider Selection:
- Privacy required? -> Use local SDXL
- No GPU? -> Use OpenAI or Replicate API
- Batch generation? -> Use local (no API costs)
- Highest quality? -> DALL-E 3 or SDXL Lightning
Quick Start
1. Check Available Providers
python scripts/multi-model/image-gen/cli.py --list
2. Setup Local SDXL (Recommended)
# First-time setup (downloads ~7GB)
python scripts/multi-model/image-gen/cli.py --setup local
3. Generate Images
# Auto-selects best available provider
python scripts/multi-model/image-gen/cli.py "A sunset over mountains" output.png
# LinkedIn banner size
python scripts/multi-model/image-gen/cli.py "Tech concept" banner.png --width 1200 --height 630
# Specific provider
python scripts/multi-model/image-gen/cli.py "A cat" cat.png --provider openai
Integration with Visual Art Composition
For professional-quality images, combine with visual-art-composition:
Step 1: visual-art-composition (Structure the prompt)
|
+---> 13-dimension aesthetic framework
+---> Cross-cultural synthesis
+---> Productive tension resolution
|
v
Step 2: image-gen (Generate the image)
|
+---> Select best provider (local or API)
+---> Generate high-quality image
+---> Save to specified path
Example Pipeline
# 1. Get structured prompt from visual-art-composition
/visual-art-composition "tech dashboard for productivity app"
# 2. Generate with structured prompt
python scripts/multi-model/image-gen/cli.py \
"Dashboard UI with linear perspective depth, composed blues and warm golds,
focal hierarchy with clear primary metric, notan two-value contrast.
Modern professional aesthetic, clean geometric forms." \
docs/images/dashboard.png --width 1200 --height 630
Provider Setup
Local SDXL Lightning (Recommended)
Requirements:
- GPU with 8GB+ VRAM (or CPU with 16GB+ RAM, slower)
- ~7GB disk space on D: drive
- Python with diffusers, torch
Setup:
python scripts/multi-model/image-gen/cli.py --setup local
Environment Variables (optional):
export SDXL_MODEL_DIR="D:/AI-Models/sdxl-lightning"
OpenAI DALL-E 3
Requirements:
- OpenAI API key
- ~$0.04 per image
Setup:
export OPENAI_API_KEY="sk-..."
python scripts/multi-model/image-gen/cli.py --setup openai
Replicate
Requirements:
- Replicate API token
- ~$0.01 per image
Setup:
export REPLICATE_API_TOKEN="r8_..."
python scripts/multi-model/image-gen/cli.py --setup replicate
Adding Custom Providers
Create a new provider by implementing ImageGeneratorBase:
from base import ImageGeneratorBase, ImageProvider, ProviderRegistry
class MyCustomGenerator(ImageGeneratorBase):
provider = ImageProvider.CUSTOM
def is_available(self) -> bool:
# Check if provider is configured
return True
def setup(self) -> bool:
# Download models, verify API keys, etc.
return True
def generate(self, prompt, output_path, config=None):
# Generate image
# Return GeneratedImage
pass
# Register
ProviderRegistry.register(ImageProvider.CUSTOM, MyCustomGenerator)
Python API
from scripts.multi_model.image_gen.base import ProviderRegistry, ImageConfig
# Get best available provider
provider = ProviderRegistry.get_best_available()
# Configure
config = ImageConfig(
width=1200,
height=630,
num_inference_steps=4
)
# Generate
result = provider.generate(
prompt="A beautiful sunset",
output_path="output.png",
config=config
)
print(f"Generated: {result.path} in {result.generation_time_seconds}s")
Batch Generation
prompts = [
"Sunset over mountains",
"City skyline at night",
"Forest in autumn"
]
results = provider.generate_batch(
prompts=prompts,
output_dir="./images/",
config=config
)
Best Practices
Prompt Engineering
- Be specific about composition, colors, style
- Include negative prompts for local models
- Use visual-art-composition for professional quality
- Specify aspect ratio in prompt when needed
Performance
- Local models: First generation is slow (model loading), subsequent are fast
- API models: Consistent speed, watch for rate limits
- Batch generation: More efficient than individual calls
Quality
- SDXL Lightning: 4 steps is optimal (more steps = minimal improvement)
- DALL-E 3: No step control, always high quality
- Always validate output matches intent
Related Skills
visual-art-composition: 13-dimension aesthetic framework for structured promptsprompt-architect: General prompt optimizationpptx-generation: Uses images for presentation slides
Troubleshooting
"No provider available"
- Run
--listto see what's configured - Run
--setup localto download SDXL Lightning - Or set API keys for cloud providers
Out of VRAM
- Use CPU mode (slower): Set
SDXL_DEVICE=cpu - Use API provider instead
- Reduce image size
Slow First Generation
- Normal for local models (loading ~7GB model)
- Subsequent generations are fast (~2-5 seconds)
Poor Quality
- Use more descriptive prompts
- Apply visual-art-composition framework
- Try different provider
Files
- CLI:
scripts/multi-model/image-gen/cli.py - Base classes:
scripts/multi-model/image-gen/base.py - Local SDXL:
scripts/multi-model/image-gen/local_sdxl.py - API providers:
scripts/multi-model/image-gen/api_providers.py
Recommended Agent Skills
Expand your agent's capabilities with these related and highly-rated skills.
cognitive-mode
Comprehensive cognitive mode management skill for the VERILINGUA x VERIX x DSPy x GlobalMOO integration. Enables automatic mode selection, frame configuration, VERIX epistemic notation, and GlobalMOO optimization. Use this skill when configuring AI behavior for specific task types, optimizing prompt engineering, or ensuring epistemic consistency in responses.
bootstrap-loop
fix-bug
Fix bug command
clarity-linter
dependencies
when-mapping-dependencies-use-dependency-mapper
Didn't find tool you were looking for?