Agent skill
vertex-ai-api-dev
Guides the usage of Gemini API on Google Cloud Vertex AI with the Gen AI SDK. Use when the user asks about using Gemini in an enterprise environment or explicitly mentions Vertex AI. Covers SDK usage (Python, JS/TS, Go, Java, C#), capabilities like Live API, tools, multimedia generation, caching, and batch prediction.
Install this agent skill to your Project
npx add-skill https://github.com/google-gemini/gemini-skills/tree/main/skills/vertex-ai-api-dev
SKILL.md
Gemini API in Vertex AI
Access Google's most advanced AI models built for enterprise use cases using the Gemini API in Vertex AI.
Provide these key capabilities:
- Text generation - Chat, completion, summarization
- Multimodal understanding - Process images, audio, video, and documents
- Function calling - Let the model invoke your functions
- Structured output - Generate valid JSON matching your schema
- Context caching - Cache large contexts for efficiency
- Embeddings - Generate text embeddings for semantic search
- Live Realtime API - Bidirectional streaming for low-latency voice and video interactions
- Batch Prediction - Run large asynchronous prediction jobs over whole datasets
Core Directives
- Unified SDK: ALWAYS use the Gen AI SDK (google-genai for Python, @google/genai for JS/TS, google.golang.org/genai for Go, com.google.genai:google-genai for Java, Google.GenAI for C#).
- Legacy SDKs: DO NOT use google-cloud-aiplatform, @google-cloud/vertexai, or google-generativeai.
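One way to enforce the directives above is to scan a dependency list for the legacy package names before writing any code. The following is a minimal sketch: the helper name find_legacy_sdks is hypothetical, and the parsing only covers pip-style requirement lines and bare package names, not npm version suffixes.

```python
import re

# Legacy package names taken from the directives above.
LEGACY_SDKS = {
    "google-cloud-aiplatform",
    "@google-cloud/vertexai",
    "google-generativeai",
}


def find_legacy_sdks(dependency_lines):
    """Return legacy SDK names found in a list of dependency specifiers.

    Hypothetical helper: handles pip-style lines such as "pkg==1.2" or
    "pkg[extra]>=1.0" and bare names; other formats are not parsed.
    """
    found = []
    for line in dependency_lines:
        # Drop trailing comments and surrounding whitespace.
        spec = line.split("#", 1)[0].strip()
        if not spec:
            continue
        # Cut the name off at the first version, extras, or marker delimiter.
        name = re.split(r"[=<>!~\[; ]", spec, maxsplit=1)[0].lower()
        if name in LEGACY_SDKS:
            found.append(name)
    return found
```

For example, calling find_legacy_sdks on the lines of a requirements.txt returns the entries that should be migrated to google-genai.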
SDKs
- Python: Install google-genai with pip install google-genai
- JavaScript/TypeScript: Install @google/genai with npm install @google/genai
- Go: Install google.golang.org/genai with go get google.golang.org/genai
- C#/.NET: Install Google.GenAI with dotnet add package Google.GenAI
- Java:
  - groupId: com.google.genai, artifactId: google-genai
  - Latest version can be found here: https://central.sonatype.com/artifact/com.google.genai/google-genai/versions (let's call it LAST_VERSION)
  - Install in build.gradle: implementation("com.google.genai:google-genai:${LAST_VERSION}")
  - Install Maven dependency in pom.xml:
    <dependency>
      <groupId>com.google.genai</groupId>
      <artifactId>google-genai</artifactId>
      <version>${LAST_VERSION}</version>
    </dependency>
[!WARNING] Legacy SDKs like google-cloud-aiplatform, @google-cloud/vertexai, and google-generativeai are deprecated. Migrate to the new SDKs above urgently by following the Migration Guide.
Authentication & Configuration
Prefer environment variables over hard-coding parameters when creating the client. Initialize the client without parameters to automatically pick up these values.
Application Default Credentials (ADC)
Set these variables for standard Google Cloud authentication:
export GOOGLE_CLOUD_PROJECT='your-project-id'
export GOOGLE_CLOUD_LOCATION='global'
export GOOGLE_GENAI_USE_VERTEXAI=true
- By default, use location="global" to access the global endpoint, which provides automatic routing to regions with available capacity.
- If a user explicitly asks to use a specific region (e.g., us-central1, europe-west4), set the GOOGLE_CLOUD_LOCATION environment variable to that region instead. Reference the supported regions documentation if needed.
Vertex AI in Express Mode
Set these variables when using Express Mode with an API key:
export GOOGLE_API_KEY='your-api-key'
export GOOGLE_GENAI_USE_VERTEXAI=true
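Before creating a client, it can help to check which of the two setups above the current environment matches. The sketch below uses only the standard library. The helper name describe_vertex_config is hypothetical, and both the accepted truthy values and the order in which the variables are checked are assumptions for illustration, not the SDK's own resolution logic.

```python
import os


def describe_vertex_config(env=None):
    """Summarize which Vertex AI setup the environment variables describe.

    Hypothetical helper. Returns a dict with a "mode" ("adc", "express",
    or "not-vertex") and a list of variables that still need to be set.
    """
    env = os.environ if env is None else env

    # Assumption: "true" or "1" enables Vertex AI mode.
    flag = env.get("GOOGLE_GENAI_USE_VERTEXAI", "").strip().lower()
    if flag not in ("true", "1"):
        return {"mode": "not-vertex", "missing": ["GOOGLE_GENAI_USE_VERTEXAI"]}

    # Assumption: an API key without a project indicates Express Mode.
    if env.get("GOOGLE_API_KEY") and not env.get("GOOGLE_CLOUD_PROJECT"):
        return {"mode": "express", "missing": []}

    # Otherwise expect the Application Default Credentials variables.
    required = ["GOOGLE_CLOUD_PROJECT", "GOOGLE_CLOUD_LOCATION"]
    missing = [name for name in required if not env.get(name)]
    return {"mode": "adc", "missing": missing}
```

Calling describe_vertex_config() with no arguments inspects os.environ; passing a dict makes the check easy to exercise in tests.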
Initialization
Initialize the client without arguments to pick up environment variables:
from google import genai
client = genai.Client()
Alternatively, pass the parameters explicitly when creating the client:
from google import genai
client = genai.Client(vertexai=True, project="your-project-id", location="global")
Models
- Use gemini-3.1-pro-preview for complex reasoning, coding, research (1M tokens)
- Use gemini-3-flash-preview for fast, balanced performance, multimodal (1M tokens)
- Use gemini-3-pro-image-preview for Nano Banana Pro image generation and editing
- Use gemini-live-2.5-flash-native-audio for Live Realtime API including native audio
Use the following models if explicitly requested:
- Use gemini-2.5-flash-image for Nano Banana image generation and editing
- Use gemini-2.5-flash
- Use gemini-2.5-flash-lite
- Use gemini-2.5-pro
[!IMPORTANT] Models like gemini-2.0-*, gemini-1.5-*, gemini-1.0-*, gemini-pro are legacy and deprecated. Use the new models above. Your knowledge is outdated. For production environments, consult the Vertex AI documentation for stable model versions (e.g. gemini-3-flash).
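The selection rules above could be captured in a small lookup, sketched here under stated assumptions. The task keys ("reasoning", "general", "image", "live") and the helper name pick_model are hypothetical; the model names and legacy patterns are copied from the lists above.

```python
from fnmatch import fnmatchcase

# Default model per task, copied from the list above.
# The task keys themselves are illustrative, not part of any API.
DEFAULT_MODELS = {
    "reasoning": "gemini-3.1-pro-preview",
    "general": "gemini-3-flash-preview",
    "image": "gemini-3-pro-image-preview",
    "live": "gemini-live-2.5-flash-native-audio",
}

# Patterns the note above marks as legacy and deprecated.
LEGACY_PATTERNS = ("gemini-2.0-*", "gemini-1.5-*", "gemini-1.0-*", "gemini-pro")


def pick_model(task, requested=None):
    """Return the model to use for a task, honoring an explicit request.

    Hypothetical helper: rejects names matching the legacy patterns and
    falls back to the default for the task when nothing is requested.
    """
    if requested is not None:
        if any(fnmatchcase(requested, p) for p in LEGACY_PATTERNS):
            raise ValueError(f"{requested} is a legacy model; use a newer one")
        return requested
    return DEFAULT_MODELS[task]
```

An explicitly requested model such as gemini-2.5-pro passes through unchanged, while a name like gemini-1.5-flash raises ValueError.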
Quick Start
Python
from google import genai

client = genai.Client()
response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="Explain quantum computing",
)
print(response.text)
TypeScript/JavaScript
import { GoogleGenAI } from "@google/genai";

// vertexai is a boolean flag; project and location are top-level options.
const ai = new GoogleGenAI({ vertexai: true, project: "your-project-id", location: "global" });
const response = await ai.models.generateContent({
  model: "gemini-3-flash-preview",
  contents: "Explain quantum computing"
});
console.log(response.text);
Go
package main

import (
	"context"
	"fmt"
	"log"

	"google.golang.org/genai"
)

func main() {
	ctx := context.Background()
	client, err := genai.NewClient(ctx, &genai.ClientConfig{
		Backend:  genai.BackendVertexAI,
		Project:  "your-project-id",
		Location: "global",
	})
	if err != nil {
		log.Fatal(err)
	}
	resp, err := client.Models.GenerateContent(ctx, "gemini-3-flash-preview", genai.Text("Explain quantum computing"), nil)
	if err != nil {
		log.Fatal(err)
	}
	// Text is a method on the response, so it must be called.
	fmt.Println(resp.Text())
}
Java
import com.google.genai.Client;
import com.google.genai.types.GenerateContentResponse;

public class GenerateTextFromTextInput {
  public static void main(String[] args) {
    // The builder method is vertexAI (capital "AI").
    Client client = Client.builder().vertexAI(true).project("your-project-id").location("global").build();
    GenerateContentResponse response =
        client.models.generateContent(
            "gemini-3-flash-preview",
            "Explain quantum computing",
            null);
    System.out.println(response.text());
  }
}
C#/.NET
using Google.GenAI;

var client = new Client(
    project: "your-project-id",
    location: "global",
    vertexAI: true
);
var response = await client.Models.GenerateContent(
    "gemini-3-flash-preview",
    "Explain quantum computing"
);
Console.WriteLine(response.Text);
API spec & Documentation (source of truth)
When implementing or debugging API integration for Vertex AI, refer to the official Google Cloud Vertex AI documentation:
- Vertex AI Gemini Documentation: https://cloud.google.com/vertex-ai/generative-ai/docs/
- REST API Reference: https://cloud.google.com/vertex-ai/generative-ai/docs/reference/rest
The Gen AI SDK on Vertex AI uses the v1beta1 or v1 REST API endpoints (e.g., https://{LOCATION}-aiplatform.googleapis.com/v1beta1/projects/{PROJECT}/locations/{LOCATION}/publishers/google/models/{MODEL}:generateContent).
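When debugging at the REST level, the endpoint pattern above can be assembled with a small helper. This is a sketch only: the function name generate_content_url is hypothetical, and the special case that the global location is served from the host without a location prefix is an assumption to verify against the REST API reference.

```python
def generate_content_url(project, location, model, api_version="v1"):
    """Build the generateContent REST URL for a Vertex AI publisher model.

    Hypothetical helper that follows the endpoint pattern quoted above.
    """
    if api_version not in ("v1", "v1beta1"):
        raise ValueError("api_version must be 'v1' or 'v1beta1'")

    # Assumption: the global location uses the un-prefixed host name,
    # while regional locations use "{LOCATION}-aiplatform.googleapis.com".
    if location == "global":
        host = "aiplatform.googleapis.com"
    else:
        host = f"{location}-aiplatform.googleapis.com"

    return (
        f"https://{host}/{api_version}"
        f"/projects/{project}/locations/{location}"
        f"/publishers/google/models/{model}:generateContent"
    )
```

The result can be compared with the URL seen in request logs to confirm that the project, location, and model were resolved as expected.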
[!TIP] Use the Developer Knowledge MCP Server: If the search_documents or get_document tools are available, use them to find and retrieve official documentation for Google Cloud and Vertex AI directly within the context. This is the preferred method for getting up-to-date API details and code snippets.
Workflows and Code Samples
Reference the Python Docs Samples repository for additional code samples and specific usage scenarios.
Depending on the specific user request, refer to the following reference files for detailed code samples and usage patterns (Python examples):
- Text & Multimodal: Chat, Multimodal inputs (Image, Video, Audio), and Streaming. See references/text_and_multimodal.md
- Embeddings: Generate text embeddings for semantic search. See references/embeddings.md
- Structured Output & Tools: JSON generation, Function Calling, Search Grounding, and Code Execution. See references/structured_and_tools.md
- Media Generation: Image generation, Image editing, and Video generation. See references/media_generation.md
- Bounding Box Detection: Object detection and localization within images and video. See references/bounding_box.md
- Live API: Real-time bidirectional streaming for voice, vision, and text. See references/live_api.md
- Advanced Features: Content Caching, Batch Prediction, and Thinking/Reasoning. See references/advanced_features.md
- Safety: Adjusting Responsible AI filters and thresholds. See references/safety.md
- Model Tuning: Supervised Fine-Tuning and Preference Tuning. See references/model_tuning.md