Agent skills
azure-ai-vision-imageanalysis-...

Agent skill

azure-ai-vision-imageanalysis-py

Azure AI Vision Image Analysis SDK for captions, tags, objects, OCR, people detection, and smart cropping. Use for computer vision and image understanding tasks.

View SKILL.md on GitHub Repository

Stars 28,421

Forks 4,766

Install this agent skill to your Project

npx add-skill https://github.com/sickn33/antigravity-awesome-skills/tree/main/skills/azure-ai-vision-imageanalysis-py

SKILL.md

Azure AI Vision Image Analysis SDK for Python

Client library for Azure AI Vision 4.0 image analysis including captions, tags, objects, OCR, and more.

Installation

bash

pip install azure-ai-vision-imageanalysis

Environment Variables

bash

VISION_ENDPOINT=https://<resource>.cognitiveservices.azure.com
VISION_KEY=<your-api-key>  # If using API key

Authentication

API Key

python

import os
from azure.ai.vision.imageanalysis import ImageAnalysisClient
from azure.core.credentials import AzureKeyCredential

endpoint = os.environ["VISION_ENDPOINT"]
key = os.environ["VISION_KEY"]

client = ImageAnalysisClient(
    endpoint=endpoint,
    credential=AzureKeyCredential(key)
)

Entra ID (Recommended)

python

from azure.ai.vision.imageanalysis import ImageAnalysisClient
from azure.identity import DefaultAzureCredential

client = ImageAnalysisClient(
    endpoint=os.environ["VISION_ENDPOINT"],
    credential=DefaultAzureCredential()
)

Analyze Image from URL

python

from azure.ai.vision.imageanalysis.models import VisualFeatures

image_url = "https://example.com/image.jpg"

result = client.analyze_from_url(
    image_url=image_url,
    visual_features=[
        VisualFeatures.CAPTION,
        VisualFeatures.TAGS,
        VisualFeatures.OBJECTS,
        VisualFeatures.READ,
        VisualFeatures.PEOPLE,
        VisualFeatures.SMART_CROPS,
        VisualFeatures.DENSE_CAPTIONS
    ],
    gender_neutral_caption=True,
    language="en"
)

Analyze Image from File

python

with open("image.jpg", "rb") as f:
    image_data = f.read()

result = client.analyze(
    image_data=image_data,
    visual_features=[VisualFeatures.CAPTION, VisualFeatures.TAGS]
)

Image Caption

python

result = client.analyze_from_url(
    image_url=image_url,
    visual_features=[VisualFeatures.CAPTION],
    gender_neutral_caption=True
)

if result.caption:
    print(f"Caption: {result.caption.text}")
    print(f"Confidence: {result.caption.confidence:.2f}")

Dense Captions (Multiple Regions)

python

result = client.analyze_from_url(
    image_url=image_url,
    visual_features=[VisualFeatures.DENSE_CAPTIONS]
)

if result.dense_captions:
    for caption in result.dense_captions.list:
        print(f"Caption: {caption.text}")
        print(f"  Confidence: {caption.confidence:.2f}")
        print(f"  Bounding box: {caption.bounding_box}")

Object Detection

python

result = client.analyze_from_url(
    image_url=image_url,
    visual_features=[VisualFeatures.OBJECTS]
)

if result.objects:
    for obj in result.objects.list:
        print(f"Object: {obj.tags[0].name}")
        print(f"  Confidence: {obj.tags[0].confidence:.2f}")
        box = obj.bounding_box
        print(f"  Bounding box: x={box.x}, y={box.y}, w={box.width}, h={box.height}")

OCR (Text Extraction)

python

result = client.analyze_from_url(
    image_url=image_url,
    visual_features=[VisualFeatures.READ]
)

if result.read:
    for block in result.read.blocks:
        for line in block.lines:
            print(f"Line: {line.text}")
            print(f"  Bounding polygon: {line.bounding_polygon}")
            
            # Word-level details
            for word in line.words:
                print(f"  Word: {word.text} (confidence: {word.confidence:.2f})")

People Detection

python

result = client.analyze_from_url(
    image_url=image_url,
    visual_features=[VisualFeatures.PEOPLE]
)

if result.people:
    for person in result.people.list:
        print(f"Person detected:")
        print(f"  Confidence: {person.confidence:.2f}")
        box = person.bounding_box
        print(f"  Bounding box: x={box.x}, y={box.y}, w={box.width}, h={box.height}")

Smart Cropping

python

result = client.analyze_from_url(
    image_url=image_url,
    visual_features=[VisualFeatures.SMART_CROPS],
    smart_crops_aspect_ratios=[0.9, 1.33, 1.78]  # Portrait, 4:3, 16:9
)

if result.smart_crops:
    for crop in result.smart_crops.list:
        print(f"Aspect ratio: {crop.aspect_ratio}")
        box = crop.bounding_box
        print(f"  Crop region: x={box.x}, y={box.y}, w={box.width}, h={box.height}")

Async Client

python

from azure.ai.vision.imageanalysis.aio import ImageAnalysisClient
from azure.identity.aio import DefaultAzureCredential

async def analyze_image():
    async with ImageAnalysisClient(
        endpoint=endpoint,
        credential=DefaultAzureCredential()
    ) as client:
        result = await client.analyze_from_url(
            image_url=image_url,
            visual_features=[VisualFeatures.CAPTION]
        )
        print(result.caption.text)

Visual Features

Feature	Description
`CAPTION`	Single sentence describing the image
`DENSE_CAPTIONS`	Captions for multiple regions
`TAGS`	Content tags (objects, scenes, actions)
`OBJECTS`	Object detection with bounding boxes
`READ`	OCR text extraction
`PEOPLE`	People detection with bounding boxes
`SMART_CROPS`	Suggested crop regions for thumbnails

Error Handling

python

from azure.core.exceptions import HttpResponseError

try:
    result = client.analyze_from_url(
        image_url=image_url,
        visual_features=[VisualFeatures.CAPTION]
    )
except HttpResponseError as e:
    print(f"Status code: {e.status_code}")
    print(f"Reason: {e.reason}")
    print(f"Message: {e.error.message}")

Image Requirements

Formats: JPEG, PNG, GIF, BMP, WEBP, ICO, TIFF, MPO
Max size: 20 MB
Dimensions: 50x50 to 16000x16000 pixels

Best Practices

Select only needed features to optimize latency and cost
Use async client for high-throughput scenarios
Handle HttpResponseError for invalid images or auth issues
Enable gender_neutral_caption for inclusive descriptions
Specify language for localized captions
Use smart_crops_aspect_ratios matching your thumbnail requirements
Cache results when analyzing the same image multiple times

When to Use

This skill is applicable to execute the workflow or actions described in the overview.

Maintainer

sickn33 Core maintainer

Source details

Full Name: sickn33/antigravity-awesome-skills
Branch: main
Path in repo: skills/azure-ai-vision-imageanalysis-py
License: MIT License
Topics: claude-code agent-skills claude-code-skills mcp agentic-skills ai-agent-skills ai-agents ai-coding ai-workflows antigravity antigravity-skills codex-cli codex-skills cursor cursor-skills developer-tools gemini-cli gemini-skills kiro skill-library

Featured Tools

Join Our Newsletter

Stay updated with the latest AI tools, news, and offers by subscribing to our weekly newsletter.

Recommended Agent Skills

Expand your agent's capabilities with these related and highly-rated skills.

sickn33/antigravity-awesome-skills

obsidian-clipper-template-creator

Guide for creating templates for the Obsidian Web Clipper. Use when you want to create a new clipping template, understand available variables, or format clipped content.

28,421 4,766

Explore

sickn33/antigravity-awesome-skills

claude-code-expert

Especialista profundo em Claude Code - CLI da Anthropic. Maximiza produtividade com atalhos, hooks, MCPs, configuracoes avancadas, workflows, CLAUDE.md, memoria, sub-agentes, permissoes e integracao com ecossistemas.

28,421 4,766

Explore

sickn33/antigravity-awesome-skills

lex

Centralized 'Truth Engine' for cross-jurisdictional legal context (US, EU, CA) and contract scaffolding.

28,421 4,766

Explore

sickn33/antigravity-awesome-skills

odoo-inventory-optimizer

Expert guide for Odoo Inventory: stock valuation (FIFO/AVCO), reordering rules, putaway strategies, routes, and multi-warehouse configuration.

28,421 4,766

Explore

sickn33/antigravity-awesome-skills

android_ui_verification

Automated end-to-end UI testing and verification on an Android Emulator using ADB.

28,421 4,766

Explore

sickn33/antigravity-awesome-skills

seo-cannibalization-detector

Analyzes multiple provided pages to identify keyword overlap and potential cannibalization issues. Suggests differentiation strategies. Use PROACTIVELY when reviewing similar content.

28,421 4,766

Explore

Didn't find tool you were looking for?

Search AI Tools

azure-ai-vision-imageanalysis-py

Install this agent skill to your Project

SKILL.md

Azure AI Vision Image Analysis SDK for Python

Installation

Environment Variables

Authentication

API Key

Entra ID (Recommended)

Analyze Image from URL

Analyze Image from File

Image Caption

Dense Captions (Multiple Regions)

Tags

Object Detection

OCR (Text Extraction)

People Detection

Smart Cropping

Async Client

Visual Features

Error Handling

Image Requirements

Best Practices

When to Use

Recommended Agent Skills

obsidian-clipper-template-creator

claude-code-expert

lex

odoo-inventory-optimizer

android_ui_verification

seo-cannibalization-detector