Agent skill

distributed-caching

Expert skill for distributed cache design, implementation, and optimization using Redis and Memcached. Design cache architectures, configure eviction policies, implement caching patterns (cache-aside, write-through, write-behind), monitor cache performance, and optimize memory usage.

Install this agent skill to your Project

npx add-skill https://github.com/a5c-ai/babysitter/tree/main/library/specializations/performance-optimization/skills/distributed-caching

Metadata

Additional technical details for this skill

author: babysitter-sdk
version: 1.0.0
category: caching
backlog id: SK-010

SKILL.md

distributed-caching

You are distributed-caching - a specialized skill for distributed cache architecture and optimization. This skill provides expert capabilities for designing, implementing, and maintaining high-performance caching layers using Redis, Memcached, and related technologies.

Overview

This skill enables AI-powered caching operations including:

Designing Redis data structures and access patterns
Configuring Redis Cluster and Sentinel for high availability
Implementing caching patterns (cache-aside, write-through, write-behind)
Configuring eviction policies (LRU, LFU, TTL-based)
Monitoring cache hit rates and memory usage
Debugging cache invalidation issues
Optimizing memory efficiency

Prerequisites

Redis 6.0+ (7.0+ recommended for advanced features)
Or Memcached 1.6+
redis-cli and memcached utilities
Optional: Redis Stack for JSON, Search, and Time Series
Optional: Redis Enterprise for production deployments

Capabilities

1. Redis Data Structure Design

Design optimal data structures for use cases:

redis

# String - Simple key-value caching
SET user:1001:profile '{"name":"John","email":"john@example.com"}' EX 3600
GET user:1001:profile

# Hash - Structured data with partial updates
HSET product:5001 name "Widget" price 29.99 stock 150
HGET product:5001 price
HINCRBY product:5001 stock -1

# Sorted Set - Leaderboards and ranking
ZADD leaderboard 1500 "player:1" 2200 "player:2" 1800 "player:3"
# Top 10, highest score first
ZREVRANGE leaderboard 0 9 WITHSCORES
# 0-based rank in ascending score order (use ZREVRANK for the leaderboard position)
ZRANK leaderboard "player:1"

# List - Message queues and activity feeds
LPUSH notifications:user:1001 '{"type":"order","id":"ord-123"}'
# Latest 20
LRANGE notifications:user:1001 0 19
# Keep only the newest 100
LTRIM notifications:user:1001 0 99

# Set - Tags, unique visitors, relationships
SADD product:5001:tags "electronics" "sale" "featured"
# Tags the user's interests have in common with the product
SINTER user:1001:interests product:5001:tags

# HyperLogLog - Cardinality estimation
PFADD daily:visitors:20260124 "user:1001" "user:1002" "guest:abc"
PFCOUNT daily:visitors:20260124

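# Stream - Event sourcing and message streaming
XADD orders * action "created" order_id "ord-123" total "99.99"
XREAD COUNT 10 STREAMS orders 0
# "$" starts the group at the end of the stream, so it only receives entries
# added after this point; use 0 instead to deliver the existing history as well
XGROUP CREATE orders order-processors $ MKSTREAM
XREADGROUP GROUP order-processors worker-1 COUNT 10 STREAMS orders >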

2. Caching Patterns Implementation

Implement common caching patterns:

python

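import json
import time
from functools import wraps

import redis

# `database` and `recommendation_service` below are placeholders for the
# application's own data-access layer; they are not defined in this snippet.
r = redis.Redis(host='localhost', port=6379, decode_responses=True)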

# Cache-Aside Pattern (Lazy Loading)
def get_user(user_id):
    cache_key = f"user:{user_id}"

    # Try cache first
    cached = r.get(cache_key)
    if cached:
        return json.loads(cached)

    # Cache miss - fetch from database
    user = database.get_user(user_id)

    # Populate cache with TTL
    r.setex(cache_key, 3600, json.dumps(user))
    return user

# Write-Through Pattern
def update_user(user_id, data):
    cache_key = f"user:{user_id}"

    # Update database first
    database.update_user(user_id, data)

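    # Update cache immediately. `data` must be the complete user record here;
    # for a partial update, delete the key instead so the next read reloads it.
    r.setex(cache_key, 3600, json.dumps(data))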
    return data

# Write-Behind (Write-Back) Pattern
def update_user_async(user_id, data):
    cache_key = f"user:{user_id}"

    # Update cache immediately
    r.setex(cache_key, 3600, json.dumps(data))

    # Queue database write
    r.lpush("write_queue", json.dumps({
        "operation": "update_user",
        "user_id": user_id,
        "data": data,
        "timestamp": time.time()
    }))

# Cache-aside as a reusable decorator
def cached(ttl=3600, prefix="cache"):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # Build the key from the function name and its arguments. The
            # built-in hash() is salted per process for strings, so it would
            # give each worker a different key; serialize the arguments instead.
            arg_part = json.dumps([args, kwargs], sort_keys=True, default=str)
            key = f"{prefix}:{func.__name__}:{arg_part}"

            cached_value = r.get(key)
            if cached_value is not None:
                return json.loads(cached_value)

            result = func(*args, **kwargs)
            r.setex(key, ttl, json.dumps(result))
            return result
        return wrapper
    return decorator

@cached(ttl=300, prefix="products")
def get_product_recommendations(user_id, category):
    return recommendation_service.get_recommendations(user_id, category)
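
The write-behind function above only enqueues the database write; something still has to drain `write_queue`. A minimal sketch of such a consumer follows. It assumes a client object that exposes `rpop` the way redis-py does, and a hypothetical `apply_write` callback standing in for the real database call. `FakeQueueClient` is not part of the skill; it exists only so the sketch runs without a server.

```python
import json
from collections import deque


class FakeQueueClient:
    """In-memory stand-in for the two Redis list commands used here."""

    def __init__(self):
        self._lists = {}

    def lpush(self, name, value):
        self._lists.setdefault(name, deque()).appendleft(value)

    def rpop(self, name):
        items = self._lists.get(name)
        return items.pop() if items else None


def drain_write_queue(client, apply_write, queue="write_queue", max_items=100):
    """Pop up to max_items queued writes (oldest first) and apply each one.

    Returns the number of writes applied. LPUSH plus RPOP gives FIFO order.
    """
    applied = 0
    while applied < max_items:
        raw = client.rpop(queue)
        if raw is None:  # queue is empty
            break
        apply_write(json.loads(raw))
        applied += 1
    return applied


if __name__ == "__main__":
    client = FakeQueueClient()
    for n in (1, 2, 3):
        client.lpush("write_queue", json.dumps({"operation": "update_user", "user_id": n}))

    seen = []
    count = drain_write_queue(client, seen.append)
    print(count, [w["user_id"] for w in seen])  # 3 [1, 2, 3]
```

A write removed with RPOP is lost if the worker crashes before applying it. Where that matters, a processing list or a Stream with a consumer group is the usual alternative.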

3. Cache Invalidation Strategies

Implement robust cache invalidation:

python

# Time-based invalidation (TTL)
r.setex("session:abc123", 1800, session_data)  # 30 minutes

# Event-driven invalidation
def on_user_updated(user_id):
    # Delete specific cache entries
    r.delete(f"user:{user_id}")
    r.delete(f"user:{user_id}:profile")

    # Delete pattern-matched keys (use with caution). SCAN walks the keyspace
    # incrementally; KEYS would block the server while it scans every key.
    for key in r.scan_iter(match=f"user:{user_id}:*", count=500):
        r.delete(key)

# Tag-based invalidation
def set_with_tags(key, value, ttl, tags):
    pipe = r.pipeline()
    pipe.setex(key, ttl, value)
    for tag in tags:
        pipe.sadd(f"tag:{tag}", key)
    pipe.execute()

def invalidate_by_tag(tag):
    keys = r.smembers(f"tag:{tag}")
    if keys:
        pipe = r.pipeline()
        pipe.delete(*keys)
        pipe.delete(f"tag:{tag}")
        pipe.execute()

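# Version-based invalidation
def get_with_version(key, version_key):
    # Treat a missing version key as version 0: the first INCR creates the key
    # with value 1, so defaulting to "1" would make the first invalidation a no-op.
    version = r.get(version_key) or "0"
    versioned_key = f"{key}:v{version}"
    return r.get(versioned_key)

def invalidate_version(version_key):
    # Bump the version; keys written under the old version are no longer read
    # and disappear once their TTL expires (so always write them with a TTL).
    r.incr(version_key)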
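
The version-based functions above cover the read and the version bump but not the matching write. A sketch of both sides together, under the assumption that a missing version key counts as version 0 and that values are written under `key:v<version>`. `DictClient`, `write_versioned`, and `read_versioned` are hypothetical names used only for illustration, and the stand-in client ignores TTLs.

```python
class DictClient:
    """Tiny in-memory stand-in for get/setex/incr (TTLs are ignored)."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def setex(self, key, ttl, value):
        self.data[key] = value

    def incr(self, key):
        self.data[key] = str(int(self.data.get(key, "0")) + 1)
        return int(self.data[key])


def versioned_key(client, key, version_key):
    # Missing version key == version 0, so the first INCR (which yields 1)
    # really does move readers to a new key.
    version = client.get(version_key) or "0"
    return f"{key}:v{version}"


def write_versioned(client, key, version_key, value, ttl=3600):
    client.setex(versioned_key(client, key, version_key), ttl, value)


def read_versioned(client, key, version_key):
    return client.get(versioned_key(client, key, version_key))


if __name__ == "__main__":
    c = DictClient()
    write_versioned(c, "catalog:page:1", "catalog:version", "old page")
    print(read_versioned(c, "catalog:page:1", "catalog:version"))  # old page
    c.incr("catalog:version")  # invalidates every catalog key at once
    print(read_versioned(c, "catalog:page:1", "catalog:version"))  # None
```

The old `catalog:page:1:v0` entry is never deleted explicitly, which is why the write always carries a TTL.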

4. Redis Cluster Configuration

Configure Redis Cluster for scalability:

conf

# redis-cluster.conf
port 7000
cluster-enabled yes
cluster-config-file nodes-7000.conf
cluster-node-timeout 5000
appendonly yes
appendfsync everysec

# Memory management
maxmemory 4gb
maxmemory-policy allkeys-lru

# Persistence
save 900 1
save 300 10
save 60 10000

# Replication
replica-read-only yes
min-replicas-to-write 1
min-replicas-max-lag 10

bash

# Create cluster
redis-cli --cluster create \
  127.0.0.1:7000 127.0.0.1:7001 127.0.0.1:7002 \
  127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005 \
  --cluster-replicas 1

# Check cluster status
redis-cli -c -p 7000 cluster info
redis-cli -c -p 7000 cluster nodes

# Rebalance slots
redis-cli --cluster rebalance 127.0.0.1:7000
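
When working with the cluster above it helps to know which of the 16384 slots a key lands in, for example to keep related keys on one node with a hash tag such as `{user:1001}`. The sketch below reproduces the slot calculation from the Redis Cluster specification in plain Python; `hash_tag` and `key_slot` are illustrative names, not part of any client library.

```python
import binascii

SLOT_COUNT = 16384


def hash_tag(key: str) -> str:
    """Return the part of the key that Redis Cluster hashes."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end > start + 1:  # non-empty text between the braces
            return key[start + 1:end]
    return key


def key_slot(key: str) -> int:
    # binascii.crc_hqx with an initial value of 0 is CRC-16/XMODEM,
    # the CRC16 variant named in the Redis Cluster specification.
    return binascii.crc_hqx(hash_tag(key).encode("utf-8"), 0) % SLOT_COUNT


if __name__ == "__main__":
    print(key_slot("foo"))  # 12182
    # Keys sharing a hash tag map to the same slot, so multi-key commands work.
    print(key_slot("{user:1001}:profile") == key_slot("{user:1001}:cart"))  # True
```

On a live cluster, `CLUSTER KEYSLOT <key>` returns the same number and is the authoritative check.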

5. Redis Sentinel for High Availability

Configure Sentinel for automatic failover:

conf

# sentinel.conf
sentinel monitor mymaster 127.0.0.1 6379 2
sentinel auth-pass mymaster <password>
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 60000
sentinel parallel-syncs mymaster 1

# Notification scripts
sentinel notification-script mymaster /opt/redis/notify.sh
sentinel client-reconfig-script mymaster /opt/redis/reconfig.sh

python

# Python client with Sentinel
from redis.sentinel import Sentinel

sentinel = Sentinel([
    ('sentinel1.example.com', 26379),
    ('sentinel2.example.com', 26379),
    ('sentinel3.example.com', 26379)
], socket_timeout=0.1)

# Get master
master = sentinel.master_for('mymaster', socket_timeout=0.1)
master.set('key', 'value')

# Get replica for reads
replica = sentinel.slave_for('mymaster', socket_timeout=0.1)
value = replica.get('key')

6. Eviction Policy Configuration

Configure optimal eviction policies:

conf

# The policies below are alternatives; set exactly one maxmemory-policy per instance.

# LRU - Least Recently Used (general purpose)
maxmemory-policy allkeys-lru

# LFU - Least Frequently Used (hot data scenarios)
maxmemory-policy allkeys-lfu
lfu-log-factor 10
lfu-decay-time 1

# Volatile - Only evict keys with TTL
maxmemory-policy volatile-lru
maxmemory-policy volatile-lfu
maxmemory-policy volatile-ttl

# No eviction - Return errors when full
maxmemory-policy noeviction
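
To make the difference between the LRU and LFU policies above concrete, here is a toy model that picks the eviction victim from an access log under each rule. It is an exact, simplified model written for illustration; Redis itself uses sampled approximations, and its LFU counter is probabilistic and decays over time.

```python
from collections import Counter


def lru_victim(accesses):
    """Key whose most recent access is the oldest."""
    last_seen = {key: i for i, key in enumerate(accesses)}
    return min(last_seen, key=last_seen.get)


def lfu_victim(accesses):
    """Key with the fewest accesses (ties broken by oldest last access)."""
    counts = Counter(accesses)
    last_seen = {key: i for i, key in enumerate(accesses)}
    return min(counts, key=lambda k: (counts[k], last_seen[k]))


if __name__ == "__main__":
    # "home" is hot but was not touched recently; "promo" was read once, just now.
    log = ["home", "home", "home", "home", "cart", "cart", "promo"]
    print(lru_victim(log))  # home
    print(lfu_victim(log))  # promo
```

LRU would drop the frequently used `home` entry because it has gone longest without a read, which is the case LFU is meant to handle better.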

7. Cache Performance Monitoring

Monitor cache health and performance:

bash

# Redis INFO command
redis-cli INFO stats
redis-cli INFO memory
redis-cli INFO replication
redis-cli INFO clients

# Key metrics to monitor
# - hit_rate: keyspace_hits / (keyspace_hits + keyspace_misses)
# - memory_usage: used_memory / maxmemory
# - evicted_keys: Number of keys evicted
# - connected_clients: Current client connections
# - blocked_clients: Clients waiting on blocking operations

python

# Calculate cache hit rate
info = r.info('stats')
hits = info['keyspace_hits']
misses = info['keyspace_misses']
hit_rate = hits / (hits + misses) * 100 if (hits + misses) > 0 else 0
print(f"Cache hit rate: {hit_rate:.2f}%")

# Memory analysis
memory_info = r.info('memory')
print(f"Used memory: {memory_info['used_memory_human']}")
print(f"Peak memory: {memory_info['used_memory_peak_human']}")
print(f"Fragmentation ratio: {memory_info['mem_fragmentation_ratio']}")

MCP Server Integration

This skill can leverage the following MCP servers:

Server	Description	Installation
mcp-redis (Official)	Redis data management	GitHub
Redis Cloud Admin API	Cloud Redis management	See Redis documentation

Best Practices

Cache Design

Key naming conventions - Use consistent, hierarchical naming (e.g., entity:id:attribute)
TTL strategy - Always set TTLs to prevent unbounded growth
Serialization - Use efficient formats (MessagePack, Protocol Buffers)
Hot key handling - Shard hot keys or use local caching
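
One way to read "shard hot keys" from the list above: split a single hot counter across several physical keys, write to a random shard, and sum the shards on read. The sketch uses a plain dict in place of Redis so it runs anywhere; the `:shard:<n>` suffix and all function names are assumptions made for illustration.

```python
import random


def shard_keys(base_key, shards):
    """All physical keys that together hold one logical hot key."""
    return [f"{base_key}:shard:{i}" for i in range(shards)]


def pick_shard(base_key, shards, rng=random):
    """Choose one shard at random so writes spread across keys."""
    return f"{base_key}:shard:{rng.randrange(shards)}"


def incr_sharded(counters, base_key, shards, amount=1, rng=random):
    # With a real client this would be an INCRBY on the chosen shard key.
    key = pick_shard(base_key, shards, rng)
    counters[key] = counters.get(key, 0) + amount


def read_sharded(counters, base_key, shards):
    # With a real client this would fetch every shard key and sum the values.
    return sum(counters.get(k, 0) for k in shard_keys(base_key, shards))


if __name__ == "__main__":
    counters = {}
    for _ in range(1000):
        incr_sharded(counters, "page:home:views", shards=8)
    print(read_sharded(counters, "page:home:views", shards=8))  # 1000
```

Reads become more expensive (one lookup per shard), so this trade only pays off when writes to the key dominate.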

Data Consistency

Cache-aside for reads - Safest pattern for most use cases
Write-through for consistency - When consistency is critical
Eventual consistency - Accept staleness for performance
Version tagging - Track data versions for invalidation

Performance

Pipeline commands - Batch multiple operations
Connection pooling - Reuse connections
Avoid large values - Keep values under 100KB
Use appropriate data structures - Hashes over JSON strings for partial updates

Process Integration

This skill integrates with the following processes:

caching-strategy-design.js - Cache architecture planning
Application-level cache optimization workflows
Performance tuning recommendations

Output Format

When executing operations, provide structured output:

json

{
  "operation": "analyze-cache",
  "status": "success",
  "metrics": {
    "hitRate": 94.5,
    "missRate": 5.5,
    "evictionRate": 0.02,
    "memoryUsage": {
      "used": "3.2GB",
      "peak": "3.8GB",
      "maxmemory": "4GB",
      "utilizationPercent": 80
    },
    "connections": {
      "current": 45,
      "blocked": 0,
      "maxClients": 10000
    }
  },
  "recommendations": [
    {
      "category": "memory",
      "issue": "High memory utilization",
      "action": "Consider increasing maxmemory or enabling LFU eviction",
      "priority": "medium"
    }
  ]
}
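
A sketch of how a report in the shape above could be assembled from the dictionaries that `INFO stats`, `INFO memory`, and `INFO clients` return. `build_report` is a hypothetical helper, and the 80% threshold is an assumption chosen to match the sample. It reports the raw `evicted_keys` counter rather than an eviction rate, since a rate needs two samples taken over time.

```python
import json


def build_report(stats, memory, clients):
    """Assemble a report from dicts shaped like INFO stats/memory/clients."""
    hits = stats.get("keyspace_hits", 0)
    misses = stats.get("keyspace_misses", 0)
    lookups = hits + misses
    hit_rate = round(100 * hits / lookups, 2) if lookups else 0.0

    used = memory.get("used_memory", 0)
    limit = memory.get("maxmemory", 0)  # 0 means "no limit" in Redis
    utilization = round(100 * used / limit, 1) if limit else None

    recommendations = []
    if utilization is not None and utilization >= 80:  # assumed threshold
        recommendations.append({
            "category": "memory",
            "issue": "High memory utilization",
            "action": "Consider increasing maxmemory or enabling LFU eviction",
            "priority": "medium",
        })

    return {
        "operation": "analyze-cache",
        "status": "success",
        "metrics": {
            "hitRate": hit_rate,
            "missRate": round(100 - hit_rate, 2) if lookups else 0.0,
            "evictedKeys": stats.get("evicted_keys", 0),
            "memoryUsage": {
                "used": used,
                "maxmemory": limit,
                "utilizationPercent": utilization,
            },
            "connections": {
                "current": clients.get("connected_clients", 0),
                "blocked": clients.get("blocked_clients", 0),
            },
        },
        "recommendations": recommendations,
    }


if __name__ == "__main__":
    report = build_report(
        {"keyspace_hits": 945, "keyspace_misses": 55, "evicted_keys": 2},
        {"used_memory": 3_200_000_000, "maxmemory": 4_000_000_000},
        {"connected_clients": 45, "blocked_clients": 0},
    )
    print(json.dumps(report, indent=2))
```

With a live connection, the three arguments would come from `r.info('stats')`, `r.info('memory')`, and `r.info('clients')`.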

Error Handling

Common Issues

Error	Cause	Resolution
`OOM command not allowed`	Memory limit reached	Increase maxmemory or enable eviction
`CLUSTERDOWN`	Cluster not available	Check cluster health; confirm a majority of masters is reachable and every slot is covered
`MOVED`	Key on different node	Use cluster-aware client
`BUSY`	Lua script running	Wait or kill script with SCRIPT KILL
`LOADING`	Redis loading from disk	Wait for load to complete
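
Several entries in the table (`LOADING`, `BUSY`, `CLUSTERDOWN`) resolve themselves if the caller waits. A minimal sketch of retrying those with exponential backoff while letting everything else fail fast. Matching on the raw error-reply text is an assumption made to keep the example free of dependencies; a client such as redis-py raises dedicated exception classes (for example `BusyLoadingError`), and real code would catch those instead.

```python
import time

TRANSIENT_PREFIXES = ("LOADING", "BUSY", "CLUSTERDOWN")


def is_transient(message: str) -> bool:
    """True when an error reply starts with one of the wait-and-retry codes."""
    return message.startswith(TRANSIENT_PREFIXES)


def call_with_retry(operation, attempts=4, base_delay=0.05, sleep=time.sleep):
    """Run operation(); retry transient failures with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except Exception as exc:  # a real client raises more specific types
            last_attempt = attempt == attempts - 1
            if last_attempt or not is_transient(str(exc)):
                raise
            sleep(base_delay * 2 ** attempt)


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky_get():
        state["calls"] += 1
        if state["calls"] < 3:
            raise RuntimeError("LOADING Redis is loading the dataset in memory")
        return "value"

    print(call_with_retry(flaky_get), state["calls"])  # value 3
```

`OOM` and `MOVED` are deliberately not retried here: the first needs a configuration change and the second a cluster-aware client.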

Constraints

Monitor memory usage to prevent OOM conditions
Use connection pooling in applications
Implement circuit breakers for cache unavailability
Test cache invalidation thoroughly
Consider cache stampede prevention
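
For the stampede item above, one common approach is to let a single caller rebuild a missing entry while concurrent callers wait for its result. The sketch below does this inside one process with threads; `SingleFlight` is an illustrative name, not something this skill or redis-py provides. Coordinating across processes or hosts would additionally need a shared lock, which is outside this sketch.

```python
import threading
import time


class SingleFlight:
    """Let one caller per key run the loader; concurrent callers reuse its result."""

    def __init__(self):
        self._lock = threading.Lock()
        self._inflight = {}  # key -> (Event, result holder)

    def do(self, key, loader):
        with self._lock:
            entry = self._inflight.get(key)
            leader = entry is None
            if leader:
                entry = (threading.Event(), {})
                self._inflight[key] = entry
        done, holder = entry

        if not leader:
            done.wait()  # block until the leader has finished
            if "error" in holder:
                raise holder["error"]
            return holder["value"]

        try:
            holder["value"] = loader()
            return holder["value"]
        except Exception as exc:
            holder["error"] = exc
            raise
        finally:
            with self._lock:
                del self._inflight[key]
            done.set()


if __name__ == "__main__":
    calls = []

    def slow_db_read():
        calls.append(1)
        time.sleep(0.2)  # simulate an expensive query
        return {"id": 1001}

    flight = SingleFlight()
    threads = [
        threading.Thread(target=flight.do, args=("user:1001", slow_db_read))
        for _ in range(8)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(len(calls))  # number of database reads actually performed
```

In a cache-aside read path, `loader` would be the function that queries the database and repopulates the cache.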

Maintainer

a5c-ai Core maintainer

Source details

Full Name: a5c-ai/babysitter
Branch: main
Path in repo: library/specializations/performance-optimization/skills/distributed-caching
License: MIT License
Topics: claude-code agent-skills claude-code-skills ai-agents claude-skills vibe-coding agentic-workflow agentic-ai ai-automation agent-orchestration babysitter trustworthy-ai
