Multi-Cloud AI Architect

Design and deploy AI workloads across AWS, Azure, GCP, and OCI with intelligent routing, cost optimization, and cross-cloud patterns


Install this agent skill to your Project

npx add-skill https://github.com/frankxai/ai-architect/tree/main/skills/multi-cloud-ai-architect

SKILL.md

Multi-Cloud AI Architect

You are an expert multi-cloud AI architect specializing in designing AI systems that span AWS, Azure, GCP, and OCI. You optimize workload placement, leverage cloud-specific AI services, and implement cross-cloud patterns for resilience and cost efficiency.

Cloud AI Services Comparison

LLM/Foundation Model Services

Feature	AWS Bedrock	Azure OpenAI	GCP Vertex AI	OCI GenAI
GPT-4/o models	❌	✅	❌	❌
Claude models	✅	❌	✅	❌
Llama models	✅	✅	✅	✅
Cohere models	✅	✅	❌	✅
Mistral models	✅	✅	✅	❌
Gemini	❌	❌	✅	❌
Private deployment	Limited	❌	❌	✅ DAC
Fine-tuning	Limited	✅	✅	✅
Dedicated capacity	✅ Provisioned Throughput	✅ PTU	✅ Provisioned Throughput	✅ DAC

Embedding & Vector Services

Service	AWS	Azure	GCP	OCI
Vector DB	OpenSearch Service	Azure AI Search (formerly Cognitive Search)	Vertex AI Vector Search	OCI Search with OpenSearch
Embeddings	Titan, Cohere	Ada, text-embedding-3, Cohere	Gecko	Cohere
Max embedding dimensions	1536	3072	768	1024

Pricing Comparison (Per 1M tokens, approx.)

Model (input / output)	AWS Bedrock	Azure OpenAI	GCP Vertex	OCI GenAI
GPT-4o	N/A	$5 / $15	N/A	N/A
Claude 3.5 Sonnet	$3 / $15	N/A	$3 / $15	N/A
Llama 3.1 70B	$2.65 / $3.50	$2.68 / $3.54	$2.65 / $3.50	~$3.00
Command R+	$3.00 / $15	N/A	N/A	Included in DAC
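
To see how per-million-token prices translate into a per-request cost, here is a minimal sketch. The prices are copied from the table above and are approximate; the `PRICES` layout, the model keys, and the `request_cost` helper are illustrative names, not part of any provider SDK.

```python
# Approximate USD prices per 1M tokens as (input, output), copied from the table above.
PRICES = {
    ("azure", "gpt-4o"): (5.00, 15.00),
    ("aws", "claude-3-5-sonnet"): (3.00, 15.00),
    ("gcp", "claude-3-5-sonnet"): (3.00, 15.00),
    ("aws", "llama-3.1-70b"): (2.65, 3.50),
    ("azure", "llama-3.1-70b"): (2.68, 3.54),
}


def request_cost(provider: str, model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single request."""
    price_in, price_out = PRICES[(provider, model)]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


# 2,000 prompt tokens and 500 completion tokens on Claude 3.5 Sonnet via Bedrock.
cost = request_cost("aws", "claude-3-5-sonnet", input_tokens=2_000, output_tokens=500)
print(f"${cost:.4f}")  # $0.0135
```

Output tokens dominate the bill for most chat workloads, so a comparison that only looks at input price can pick the wrong provider.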

Multi-Cloud Architecture Patterns

Pattern 1: Model-Specific Routing

Route requests to the best provider for each model type.

┌─────────────────────────────────────────────────────────────────┐
│                     AI GATEWAY (Multi-Cloud)                     │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│   User Request ──▶ [Model Router]                               │
│                         │                                        │
│         ┌───────────────┼───────────────┐                       │
│         ▼               ▼               ▼                       │
│   ┌──────────┐   ┌──────────┐   ┌──────────┐                   │
│   │  Azure   │   │   AWS    │   │   OCI    │                   │
│   │ OpenAI   │   │ Bedrock  │   │  GenAI   │                   │
│   ├──────────┤   ├──────────┤   ├──────────┤                   │
│   │ GPT-4o   │   │ Claude   │   │ DAC      │                   │
│   │ GPT-4    │   │ Llama    │   │ Cohere   │                   │
│   │ Ada emb  │   │ Titan    │   │ Private  │                   │
│   └──────────┘   └──────────┘   └──────────┘                   │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

Implementation:

python

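class MultiCloudRouter:
    """Pick a cloud provider for a requested model."""

    MODEL_ROUTING = {
        # OpenAI models → Azure
        "gpt-4o": "azure",
        "gpt-4-turbo": "azure",
        "gpt-3.5-turbo": "azure",

        # Claude models → AWS
        "claude-3-5-sonnet": "aws",
        "claude-3-opus": "aws",

        # Cohere private → OCI
        "command-r-plus-private": "oci",

        # Llama → lowest cost provider
        "llama-3-70b": "cost_optimize",
    }

    def route(self, model: str, request: dict) -> str:
        # Unmapped models return the literal "default" target, which the
        # gateway must resolve to a concrete provider.
        target = self.MODEL_ROUTING.get(model, "default")

        if target == "cost_optimize":
            return self.find_cheapest_provider(model, request)

        return target

    def find_cheapest_provider(self, model: str, request: dict) -> str:
        """Hook for price-based selection; supply your own pricing lookup."""
        raise NotImplementedError("no pricing lookup configured")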
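
The router calls `find_cheapest_provider` without showing it. One possible shape, as a sketch only: the function signature and the `LLAMA_70B_PRICES` table are assumptions, with prices taken from the approximate pricing table earlier in this document. OCI lists a single ~$3.00 figure there, which is assumed here to apply to both input and output.

```python
from typing import Dict, Tuple

# Approximate (input, output) USD prices per 1M tokens for Llama 70B.
# Assumption: OCI's single ~$3.00 figure applies in both directions.
LLAMA_70B_PRICES: Dict[str, Tuple[float, float]] = {
    "aws": (2.65, 3.50),
    "azure": (2.68, 3.54),
    "gcp": (2.65, 3.50),
    "oci": (3.00, 3.00),
}


def find_cheapest_provider(
    prices: Dict[str, Tuple[float, float]], input_tokens: int, output_tokens: int
) -> str:
    """Return the provider with the lowest estimated cost for this token mix."""

    def cost(provider: str) -> float:
        price_in, price_out = prices[provider]
        return input_tokens * price_in + output_tokens * price_out

    # sorted() makes ties deterministic: the alphabetically first provider wins.
    return min(sorted(prices), key=cost)


balanced = find_cheapest_provider(LLAMA_70B_PRICES, 1_000, 1_000)
input_heavy = find_cheapest_provider(LLAMA_70B_PRICES, 5_000, 100)
print(balanced, input_heavy)  # oci aws
```

The answer depends on the token mix, which is why the router passes the request along with the model name.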

Pattern 2: Failover and Redundancy

┌─────────────────────────────────────────────────────────────────┐
│                     FAILOVER ARCHITECTURE                        │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│   Request ──▶ [Primary: Azure OpenAI]                           │
│                    │                                             │
│                    ▼                                             │
│              ┌──────────┐                                        │
│              │  Health  │                                        │
│              │  Check   │                                        │
│              └──────────┘                                        │
│                    │                                             │
│         ┌─────────┴─────────┐                                   │
│         ▼                   ▼                                    │
│   [Healthy]            [Unhealthy/Throttled]                    │
│       │                      │                                   │
│       ▼                      ▼                                   │
│   Azure OpenAI         [Fallback: AWS Bedrock]                  │
│                              │                                   │
│                              ▼                                   │
│                        Claude 3.5 Sonnet                        │
│                        (Equivalent capability)                   │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

Implementation:

python

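import logging

logger = logging.getLogger(__name__)


# AzureOpenAIClient, BedrockClient, OCIGenAIClient, RateLimitError and
# ServiceUnavailable are assumed to come from your own provider wrappers.
class AllProvidersFailedError(Exception):
    """Raised when the primary and every fallback provider have failed."""


class FailoverClient:
    def __init__(self):
        self.providers = {
            "azure": AzureOpenAIClient(),
            "aws": BedrockClient(),
            "oci": OCIGenAIClient(),
        }
        self.fallback_map = {
            "azure": ["aws", "oci"],
            "aws": ["azure", "oci"],
            "oci": ["aws", "azure"],
        }

    async def call_with_failover(self, primary: str, request: dict):
        providers_to_try = [primary] + self.fallback_map[primary]
        last_error = None

        for provider in providers_to_try:
            try:
                return await self.providers[provider].call(request)
            except (RateLimitError, ServiceUnavailable) as e:
                logger.warning("%s failed: %s, trying next", provider, e)
                last_error = e

        # Chain the last failure so the root cause stays visible in tracebacks
        raise AllProvidersFailedError(f"tried {providers_to_try}") from last_error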
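
The failover loop can be exercised without any cloud credentials. The sketch below swaps the real clients for a hypothetical `FakeProvider` stand-in and a locally defined `RateLimitError`; none of these names come from a provider SDK.

```python
import asyncio


class RateLimitError(Exception):
    """Stand-in for a provider throttling error."""


class FakeProvider:
    """Hypothetical client that fails whenever `healthy` is False."""

    def __init__(self, name: str, healthy: bool):
        self.name = name
        self.healthy = healthy

    async def call(self, request: dict) -> dict:
        if not self.healthy:
            raise RateLimitError(f"{self.name} is throttled")
        return {"provider": self.name, "echo": request["prompt"]}


async def call_with_failover(providers: dict, order: list, request: dict) -> dict:
    """Try each provider in order and return the first successful response."""
    errors = []
    for name in order:
        try:
            return await providers[name].call(request)
        except RateLimitError as exc:
            errors.append(str(exc))
    raise RuntimeError(f"all providers failed: {errors}")


providers = {
    "azure": FakeProvider("azure", healthy=False),
    "aws": FakeProvider("aws", healthy=True),
}
result = asyncio.run(call_with_failover(providers, ["azure", "aws"], {"prompt": "hi"}))
print(result["provider"])  # aws
```

Prompts rarely transfer unchanged between model families, so a production fallback usually also needs a per-provider prompt or parameter mapping.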

Pattern 3: OCI-Azure Interconnect

Use Oracle Interconnect for Microsoft Azure, which pairs OCI FastConnect with Azure ExpressRoute, for <2ms latency between interconnected regions.

┌─────────────────────────────────────────────────────────────────┐
│                    OCI-AZURE INTERCONNECT                        │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  ┌─────────────────────┐        ┌─────────────────────┐         │
│  │      AZURE          │        │        OCI          │         │
│  │                     │        │                     │         │
│  │  ┌───────────────┐  │        │  ┌───────────────┐  │         │
│  │  │ Azure OpenAI  │  │        │  │  GenAI DAC    │  │         │
│  │  │ (GPT-4)       │  │        │  │  (Cohere/     │  │         │
│  │  └───────────────┘  │        │  │   Llama)      │  │         │
│  │         │           │        │  └───────────────┘  │         │
│  │         │           │        │         │           │         │
│  │  ┌───────────────┐  │        │  ┌───────────────┐  │         │
│  │  │ ExpressRoute  │◀─┼──────▶─┼─▶│ FastConnect   │  │         │
│  │  │ Gateway       │  │ <2ms   │  │ Gateway       │  │         │
│  │  └───────────────┘  │        │  └───────────────┘  │         │
│  │                     │        │                     │         │
│  │  ┌───────────────┐  │        │  ┌───────────────┐  │         │
│  │  │ Azure DB      │◀─┼──────▶─┼─▶│ Autonomous DB │  │         │
│  │  └───────────────┘  │ Data   │  └───────────────┘  │         │
│  │                     │ Sync   │                     │         │
│  └─────────────────────┘        └─────────────────────┘         │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

Use Cases:

Azure enterprise apps + OCI AI (compliance)
Burst to Azure OpenAI, baseline on OCI DAC
Data residency in one cloud, AI in another
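
The "baseline on OCI DAC, burst to Azure OpenAI" use case can be sketched as a simple admission check. This is an illustration under stated assumptions: `BurstRouter`, the provider labels, and the idea of measuring DAC capacity as a concurrent-request limit are all hypothetical.

```python
class BurstRouter:
    """Fill a fixed-capacity baseline first, then overflow to a burst provider."""

    def __init__(self, baseline_capacity: int):
        # Assumption: DAC capacity is expressed as max concurrent requests.
        self.baseline_capacity = baseline_capacity
        self.in_flight = 0

    def acquire(self) -> str:
        """Return the provider that should serve the next request."""
        if self.in_flight < self.baseline_capacity:
            self.in_flight += 1
            return "oci_dac"
        return "azure_openai"

    def release(self, provider: str) -> None:
        """Free a baseline slot once a DAC request completes."""
        if provider == "oci_dac":
            self.in_flight -= 1


router = BurstRouter(baseline_capacity=2)
placements = [router.acquire() for _ in range(3)]
print(placements)  # ['oci_dac', 'oci_dac', 'azure_openai']

router.release("oci_dac")
after_release = router.acquire()
print(after_release)  # oci_dac
```

Because the dedicated cluster is paid for whether it is used or not, filling it before spending on pay-per-token capacity keeps the marginal cost of baseline traffic at zero.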

Pattern 4: Cost-Optimized Hybrid

python

from typing import Optional


class CostOptimizedRouter:
    """Route based on cost with quality constraints"""

    COST_TIERS = {
        # Tier 1: High capability, high cost
        "premium": {
            "models": ["gpt-4o", "claude-3-opus"],
            "max_cost_per_1k": 0.05,
        },
        # Tier 2: Good capability, moderate cost
        "standard": {
            "models": ["gpt-4-turbo", "claude-3-5-sonnet", "command-r-plus"],
            "max_cost_per_1k": 0.02,
        },
        # Tier 3: Basic capability, low cost
        "economy": {
            "models": ["llama-3-70b", "command-r", "mixtral-8x22b"],
            "max_cost_per_1k": 0.005,
        },
    }

    def route(self, request: dict, budget_tier: str = "standard") -> Optional[dict]:
        tier = self.COST_TIERS[budget_tier]

        # Find the cheapest (provider, model) pair within the tier's cost ceiling
        best_option = None
        best_cost = float("inf")

        for model in tier["models"]:
            for provider in ["aws", "azure", "gcp", "oci"]:
                cost = self.get_cost(provider, model)
                # None means the provider does not offer this model
                if cost is None or cost > tier["max_cost_per_1k"]:
                    continue
                if cost < best_cost:
                    best_cost = cost
                    best_option = {"provider": provider, "model": model}

        return best_option  # None when nothing fits the tier

    def get_cost(self, provider: str, model: str) -> Optional[float]:
        """Return USD per 1K tokens, or None if unavailable; supply your own pricing source."""
        raise NotImplementedError

Workload Placement Decision Matrix

┌─────────────────────────────────────────────────────────────────┐
│                 WORKLOAD PLACEMENT GUIDE                         │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  REQUIREMENT          │ RECOMMENDED CLOUD                        │
│  ─────────────────────┼─────────────────────────────────────    │
│  Need GPT-4/GPT-4o    │ Azure OpenAI                            │
│  Need Claude          │ AWS Bedrock or GCP Vertex               │
│  Need Gemini          │ GCP Vertex AI                           │
│  Data sovereignty     │ OCI GenAI DAC (private GPUs)            │
│  Predictable costs    │ OCI DAC or Azure PTU                    │
│  Lowest latency       │ Regional deployment + edge              │
│  Fine-tuning needed   │ Azure OpenAI or OCI DAC                 │
│  Multi-model RAG      │ AWS Bedrock (most models)               │
│  Microsoft ecosystem  │ Azure                                   │
│  Oracle ecosystem     │ OCI                                     │
│  Google Workspace     │ GCP                                     │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘
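
The placement guide can also be expressed as a lookup that intersects recommendations when a workload has several requirements. This is a sketch: the requirement keys and the `recommend` helper are made-up names, and only a subset of the rows above is encoded.

```python
from typing import Dict, List

# A subset of the placement guide above, keyed by hypothetical requirement names.
PLACEMENT_GUIDE: Dict[str, List[str]] = {
    "gpt-4": ["Azure OpenAI"],
    "claude": ["AWS Bedrock", "GCP Vertex AI"],
    "gemini": ["GCP Vertex AI"],
    "data_sovereignty": ["OCI GenAI DAC"],
    "predictable_costs": ["OCI GenAI DAC", "Azure OpenAI PTU"],
    "fine_tuning": ["Azure OpenAI", "OCI GenAI DAC"],
}


def recommend(requirements: List[str]) -> List[str]:
    """Return the options that satisfy every requirement, in guide order."""
    candidates = None
    for requirement in requirements:
        options = PLACEMENT_GUIDE[requirement]
        if candidates is None:
            candidates = list(options)
        else:
            candidates = [c for c in candidates if c in options]
    return candidates or []


single_cloud = recommend(["data_sovereignty", "fine_tuning"])
conflict = recommend(["gpt-4", "data_sovereignty"])
print(single_cloud)  # ['OCI GenAI DAC']
print(conflict)      # []
```

An empty result is the useful signal: no single cloud meets both requirements, so the workload needs one of the multi-cloud patterns described earlier.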

Cross-Cloud Data Architecture

Federated Data Layer

python

import asyncio
from typing import Optional


class FederatedDataLayer:
    """Access data across clouds for RAG/AI workloads"""

    def __init__(self):
        # The client classes are assumed wrappers that expose an async
        # vector_search(query) method over each cloud's vector index.
        self.sources = {
            "aws_s3": S3Client(),
            "azure_blob": AzureBlobClient(),
            "gcp_gcs": GCSClient(),
            "oci_object": OCIObjectStorageClient(),
        }

    async def search_across_clouds(
        self,
        query: str,
        clouds: Optional[list] = None
    ) -> list:
        """Federated search across cloud storage"""
        clouds = clouds or list(self.sources.keys())

        tasks = [
            self.search_cloud(cloud, query)
            for cloud in clouds
        ]

        # gather() raises on the first failure, so one unreachable cloud
        # fails the whole search
        results = await asyncio.gather(*tasks)
        # merge_and_rank is left to the implementer
        return self.merge_and_rank(results)

    async def search_cloud(self, cloud: str, query: str) -> list:
        # Each cloud has its own vector index
        return await self.sources[cloud].vector_search(query)
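
`merge_and_rank` is called above but not shown. A minimal sketch follows, assuming each hit is a dict with `id` and `score` keys and that scores are comparable across clouds (same embedding model and distance metric everywhere); neither assumption is stated in the original.

```python
import heapq
from typing import Dict, List


def merge_and_rank(results: List[List[Dict]], top_k: int = 3) -> List[Dict]:
    """Flatten per-cloud hit lists, deduplicate by document ID, keep the top_k by score."""
    best: Dict[str, Dict] = {}
    for hits in results:
        for hit in hits:
            current = best.get(hit["id"])
            # Keep only the highest-scoring copy of a document replicated across clouds.
            if current is None or hit["score"] > current["score"]:
                best[hit["id"]] = hit
    return heapq.nlargest(top_k, best.values(), key=lambda h: h["score"])


aws_hits = [
    {"id": "a", "score": 0.91, "cloud": "aws_s3"},
    {"id": "b", "score": 0.40, "cloud": "aws_s3"},
]
azure_hits = [
    {"id": "a", "score": 0.85, "cloud": "azure_blob"},
    {"id": "c", "score": 0.77, "cloud": "azure_blob"},
]
merged = merge_and_rank([aws_hits, azure_hits], top_k=2)
print([(h["id"], h["cloud"]) for h in merged])  # [('a', 'aws_s3'), ('c', 'azure_blob')]
```

If the clouds use different embedding models, raw scores are not comparable and a rank-based fusion or a reranking step is a safer choice.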

Data Residency Patterns

yaml

# Configuration for data residency compliance
data_residency:
  eu_region:
    storage: azure_west_europe
    ai_inference: oci_frankfurt
    reason: "GDPR - data stays in EU"

  us_region:
    storage: aws_us_east_1
    ai_inference: aws_bedrock_us_east
    reason: "Low latency colocation"

  apac_region:
    storage: oci_tokyo
    ai_inference: oci_genai_osaka
    reason: "Japanese data residency laws"

cross_region_allowed:
  - Aggregated analytics (no PII)
  - Model training (anonymized)
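
A gateway can enforce this configuration before dispatching a request. The sketch below mirrors the YAML above as Python dictionaries; the `resolve_inference` function and the purpose labels are hypothetical.

```python
# Mirrors the data_residency configuration above.
RESIDENCY = {
    "eu_region": {"storage": "azure_west_europe", "ai_inference": "oci_frankfurt"},
    "us_region": {"storage": "aws_us_east_1", "ai_inference": "aws_bedrock_us_east"},
    "apac_region": {"storage": "oci_tokyo", "ai_inference": "oci_genai_osaka"},
}

# Hypothetical labels for the two cross_region_allowed entries.
CROSS_REGION_ALLOWED = {"aggregated_analytics", "anonymized_training"}


def resolve_inference(data_region: str, processing_region: str, purpose: str) -> str:
    """Return the inference endpoint, refusing cross-region use for disallowed purposes."""
    if processing_region != data_region and purpose not in CROSS_REGION_ALLOWED:
        raise PermissionError(
            f"{purpose!r} may not move data from {data_region} to {processing_region}"
        )
    return RESIDENCY[processing_region]["ai_inference"]


local = resolve_inference("eu_region", "eu_region", "chat")
analytics = resolve_inference("eu_region", "us_region", "aggregated_analytics")
print(local, analytics)  # oci_frankfurt aws_bedrock_us_east
```

Raising an error instead of silently rerouting makes residency violations visible in logs and tests.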

Terraform Multi-Cloud Module

hcl

# main.tf - Multi-Cloud AI Infrastructure

# AWS Bedrock
module "aws_ai" {
  source = "./modules/aws-bedrock"

  enabled_models = ["anthropic.claude-3-5-sonnet", "meta.llama3-70b-instruct"]
  vpc_id         = var.aws_vpc_id
}

# Azure OpenAI
module "azure_ai" {
  source = "./modules/azure-openai"

  resource_group = var.azure_rg
  deployments = {
    "gpt-4o" = {
      model   = "gpt-4o"
      version = "2024-05-13"
      sku     = "Standard"
    }
  }
}

# OCI GenAI
module "oci_ai" {
  source = "./modules/oci-genai"

  compartment_id    = var.oci_compartment
  dedicated_cluster = true
  cluster_units     = 10
}

# GCP Vertex AI
module "gcp_ai" {
  source = "./modules/gcp-vertex"

  project_id = var.gcp_project
  region     = "us-central1"
  endpoints  = ["gemini-pro", "claude-3-sonnet"]
}

# Unified API Gateway
module "ai_gateway" {
  source = "./modules/ai-gateway"

  # Named provider_endpoints because "providers" is a reserved
  # meta-argument inside module blocks.
  provider_endpoints = {
    aws   = module.aws_ai.endpoint
    azure = module.azure_ai.endpoint
    oci   = module.oci_ai.endpoint
    gcp   = module.gcp_ai.endpoint
  }

  routing_rules = {
    "gpt-*"     = "azure"
    "claude-*"  = "aws"
    "gemini-*"  = "gcp"
    "command-*" = "oci"
  }
}
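
How the gateway interprets `routing_rules` is not specified. One plausible reading, sketched in Python, treats each key as a shell-style glob and takes the first match in declaration order; both the matching semantics and the `resolve_provider` name are assumptions.

```python
from fnmatch import fnmatchcase
from typing import Dict, Optional

# Same patterns as the routing_rules map in the Terraform module above.
ROUTING_RULES: Dict[str, str] = {
    "gpt-*": "azure",
    "claude-*": "aws",
    "gemini-*": "gcp",
    "command-*": "oci",
}


def resolve_provider(
    model: str, rules: Dict[str, str] = ROUTING_RULES, default: Optional[str] = None
) -> Optional[str]:
    """Return the provider for the first glob pattern that matches the model name."""
    for pattern, provider in rules.items():
        if fnmatchcase(model, pattern):
            return provider
    return default


print(resolve_provider("gpt-4o"))             # azure
print(resolve_provider("claude-3-5-sonnet"))  # aws
print(resolve_provider("llama-3-70b"))        # None
```

Llama models match none of the patterns, so the gateway needs either a default provider or an explicit rule for them.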

Cost Optimization Strategies

Reserved Capacity Planning

Cloud	Commitment Type	Discount	Best For
Azure	PTU (Provisioned)	~30%	Predictable GPT-4 workloads
OCI	DAC Units	Flat rate	High-volume private inference
AWS	Savings Plans	~20%	General compute
GCP	CUDs	~20%	Vertex AI workloads
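
Whether a commitment pays off depends on volume. The sketch below computes the monthly token volume at which a flat commitment matches pay-as-you-go spend. The dollar figures are hypothetical placeholders, not quoted prices, and `break_even_tokens` is an illustrative helper.

```python
def break_even_tokens(monthly_commitment_usd: float, blended_price_per_1m: float) -> float:
    """Monthly tokens above which a flat commitment beats pay-as-you-go pricing."""
    if blended_price_per_1m <= 0:
        raise ValueError("blended_price_per_1m must be positive")
    return monthly_commitment_usd / blended_price_per_1m * 1_000_000


# Hypothetical inputs: a $10,000/month commitment against a blended
# pay-as-you-go price of $10 per 1M tokens.
tokens = break_even_tokens(monthly_commitment_usd=10_000, blended_price_per_1m=10.0)
print(f"{tokens:,.0f} tokens/month")  # 1,000,000,000 tokens/month
```

Below the break-even volume the commitment is idle capacity; above it, each extra token on the reserved capacity is effectively free until the throughput limit is reached.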

Egress Cost Reduction

python

class EgressOptimizer:
    """Minimize cross-cloud data transfer costs"""

    DEFAULT_COST_PER_GB = 0.10  # used when a route has no listed price

    EGRESS_COSTS_PER_GB = {
        "aws_to_azure": 0.09,
        "aws_to_gcp": 0.09,
        "azure_to_oci": 0.00,  # Interconnect!
        "oci_to_azure": 0.00,  # Interconnect!
        "gcp_to_aws": 0.12,
    }

    def _rate(self, source: str, dest: str) -> float:
        return self.EGRESS_COSTS_PER_GB.get(
            f"{source}_to_{dest}", self.DEFAULT_COST_PER_GB
        )

    def optimize_data_flow(self, source: str, dest: str, data_gb: float) -> dict:
        direct_cost = self._rate(source, dest) * data_gb
        best = {"route": [source, dest], "cost": direct_cost, "savings": 0.0}

        # Check if routing through another cloud is cheaper; keep the cheapest
        # option rather than returning the first one that beats the direct path
        for intermediate in ["azure", "oci"]:
            if intermediate in (source, dest):
                continue
            hop1 = self._rate(source, intermediate)
            hop2 = self._rate(intermediate, dest)
            indirect_cost = (hop1 + hop2) * data_gb

            if indirect_cost < best["cost"]:
                best = {
                    "route": [source, intermediate, dest],
                    "cost": indirect_cost,
                    "savings": direct_cost - indirect_cost,
                }

        return best

Monitoring Multi-Cloud AI

Unified Observability

yaml

# OpenTelemetry configuration for multi-cloud
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  batch:
    timeout: 10s

exporters:
  # Send to each cloud's native monitoring
  awsxray:
    region: us-east-1
  azuremonitor:
    connection_string: ${AZURE_CONNECTION_STRING}
  googlecloud:
    project: ${GCP_PROJECT}
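  # oci_apm does not appear to ship in the standard Collector distributions;
  # this entry assumes a custom build. OCI APM also accepts OTLP, so an
  # otlphttp exporter is an alternative.
  oci_apm:
    data_key: ${OCI_APM_KEY}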

  # Also send to central observability platform
  prometheus:
    endpoint: 0.0.0.0:8889

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [awsxray, azuremonitor, googlecloud, oci_apm]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [prometheus]

Key Multi-Cloud Metrics

python

MULTI_CLOUD_METRICS = {
    # Availability
    "provider_availability": "Uptime per cloud provider",
    "failover_count": "Times failover was triggered",

    # Latency
    "cross_cloud_latency_p99": "99th percentile cross-cloud latency",
    "model_response_time_by_provider": "Response time per provider",

    # Cost
    "cost_per_request_by_provider": "Cost breakdown by cloud",
    "egress_cost_total": "Data transfer costs",

    # Quality
    "model_quality_score_by_provider": "Output quality metrics",
    "error_rate_by_provider": "Error rates per cloud",
}
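
Two of these metrics, per-provider error rate and tail latency, can be derived from raw request records. The sketch below assumes a hypothetical record shape with `provider`, `latency_ms`, and `ok` fields and uses the nearest-rank method for the 99th percentile.

```python
import math
from collections import defaultdict
from typing import Dict, List


def summarize(records: List[dict]) -> Dict[str, dict]:
    """Compute error rate and nearest-rank p99 latency for each provider."""
    by_provider = defaultdict(list)
    for record in records:
        by_provider[record["provider"]].append(record)

    summary = {}
    for provider, rows in by_provider.items():
        latencies = sorted(r["latency_ms"] for r in rows)
        # Nearest-rank percentile: the smallest value with at least 99% of samples at or below it.
        rank = max(1, math.ceil(0.99 * len(latencies)))
        summary[provider] = {
            "error_rate": sum(not r["ok"] for r in rows) / len(rows),
            "latency_p99_ms": latencies[rank - 1],
        }
    return summary


records = [
    {"provider": "azure", "latency_ms": 100, "ok": True},
    {"provider": "azure", "latency_ms": 120, "ok": True},
    {"provider": "azure", "latency_ms": 900, "ok": False},
    {"provider": "aws", "latency_ms": 80, "ok": True},
]
summary = summarize(records)
print(summary["azure"]["latency_p99_ms"], summary["aws"]["error_rate"])  # 900 0.0
```

With small sample counts the p99 is simply the slowest request, so these figures only become meaningful over a reasonably large window.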

Security Across Clouds

Unified Identity

yaml

# Federated identity configuration
identity_federation:
  primary_idp: azure_ad
  federations:
    - aws:
        type: SAML
        role_mapping:
          AI_Engineer: arn:aws:iam::123:role/BedrockAccess
    - gcp:
        type: OIDC
        workload_identity_pool: ai-workloads
    - oci:
        type: SAML
        group_mapping:
          AI_Engineer: ocid1.group.oc1..xxx

Cross-Cloud Secrets Management

python

from typing import Optional


class SecretNotFound(KeyError):
    """Raised when no backend holds the requested secret."""


class MultiCloudSecrets:
    """Unified secrets access across clouds"""

    def __init__(self):
        # Backend classes are assumed thin wrappers whose get(name)
        # raises SecretNotFound when the secret is missing.
        self.backends = {
            "aws": AWSSecretsManager(),
            "azure": AzureKeyVault(),
            "gcp": GCPSecretManager(),
            "oci": OCIVault(),
        }

    def get_secret(self, name: str, cloud: Optional[str] = None) -> str:
        """Get secret from appropriate cloud"""
        if cloud:
            return self.backends[cloud].get(name)

        # Try each cloud (for migration scenarios)
        for backend in self.backends.values():
            try:
                return backend.get(name)
            except SecretNotFound:
                continue
        raise SecretNotFound(name)

Resources

Maintainer

frankxai Core maintainer

Source details

Full Name: frankxai/ai-architect
Branch: main
Path in repo: skills/multi-cloud-ai-architect
Topics: architecture ai-architect ai-patterns enterprise-ai oracle systems-design
