langfuse-rate-limits

Implement Langfuse rate limiting, batching, and backoff patterns. Use when handling rate limit errors, optimizing trace ingestion, or managing high-volume LLM observability workloads. Trigger with phrases like "langfuse rate limit", "langfuse throttling", "langfuse 429", "langfuse batching", "langfuse high volume".

Install this agent skill to your Project

npx add-skill https://github.com/jeremylongshore/claude-code-plugins-plus-skills/tree/main/plugins/saas-packs/langfuse-pack/skills/langfuse-rate-limits

SKILL.md

Langfuse Rate Limits

Overview

Handle Langfuse API rate limits with optimized SDK batching, exponential backoff with jitter, concurrent request limiting, and configurable sampling for ultra-high-volume workloads.

Prerequisites

Langfuse SDK installed and configured
High-volume trace workload (1,000+ events/minute)

Instructions

Step 1: Optimize SDK Batching Configuration

The Langfuse SDK batches events internally before sending them. Tuning the batch settings is the first line of defense against rate limits.

typescript

// v3 Legacy: Direct configuration
import { Langfuse } from "langfuse";

const langfuse = new Langfuse({
  flushAt: 50,           // Events per batch (default: 15, max ~200)
  flushInterval: 10000,  // Milliseconds between flushes (default: 10000)
  requestTimeout: 30000, // Timeout per batch request
});

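// v4+: Configure via the OTel span processor (an alternative to the v3 setup above; use one or the other).
// Option names below follow the @langfuse/otel span processor parameters; verify them against your installed version.
import { LangfuseSpanProcessor } from "@langfuse/otel";
import { NodeSDK } from "@opentelemetry/sdk-node";

const processor = new LangfuseSpanProcessor({
  flushAt: 50,       // Spans per export batch
  flushInterval: 10, // Seconds between flushes (seconds here, not milliseconds)
});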

const sdk = new NodeSDK({ spanProcessors: [processor] });
sdk.start();
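
To see why a larger batch size helps, it can be useful to estimate how many ingestion requests a configuration produces. The sketch below is a rough model, not part of the Langfuse SDK: `estimateRequestsPerMinute` is a hypothetical helper, and it assumes events arrive evenly and that a batch is sent either when `flushAt` events have accumulated or when the flush interval elapses.

```typescript
// Hypothetical helper (not a Langfuse API): rough estimate of ingestion
// requests per minute for a given batching configuration.
function estimateRequestsPerMinute(
  eventsPerMinute: number,
  flushAt: number,
  flushIntervalMs: number
): number {
  // Size-triggered flushes: one request per `flushAt` accumulated events.
  const sizeTriggered = Math.ceil(eventsPerMinute / flushAt);
  // Timer-triggered flushes: at most one per interval, and never more
  // than the number of events available to send.
  const timerTriggered = Math.min(eventsPerMinute, 60000 / flushIntervalMs);
  // Whichever trigger fires more often dominates the request rate.
  return Math.max(sizeTriggered, timerTriggered);
}

console.log(estimateRequestsPerMinute(1000, 15, 10000)); // 67
console.log(estimateRequestsPerMinute(1000, 50, 10000)); // 20
```

Under these assumptions, raising `flushAt` from 15 to 50 at 1,000 events per minute cuts the request count by roughly a factor of three.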

Step 2: Implement Retry with Exponential Backoff

For custom API calls (scores, datasets, prompts) that hit rate limits:

typescript

async function withRetry<T>(
  fn: () => Promise<T>,
  options: { maxRetries?: number; baseDelayMs?: number; maxDelayMs?: number } = {}
): Promise<T> {
  const { maxRetries = 5, baseDelayMs = 1000, maxDelayMs = 30000 } = options;

  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (error: any) {
      const status: number | undefined = error?.status ?? error?.response?.status;

      // Retry only on rate limits (429), server errors (5xx), and errors
      // that carry no HTTP status (e.g. network failures)
      const retryable = status === undefined || status === 429 || status >= 500;
      if (attempt === maxRetries || !retryable) {
        throw error;
      }

      // Honor Retry-After (in seconds) if present and numeric
      const retryAfterRaw = error?.response?.headers?.["retry-after"];
      const retryAfterSec = retryAfterRaw ? Number(retryAfterRaw) : NaN;
      let delay: number;

      if (Number.isFinite(retryAfterSec) && retryAfterSec >= 0) {
        delay = retryAfterSec * 1000;
      } else {
        // Exponential backoff with jitter
        delay = Math.min(baseDelayMs * Math.pow(2, attempt), maxDelayMs);
        delay += Math.random() * 500; // Jitter
      }

      console.warn(`Request failed (status ${status ?? "unknown"}). Retry ${attempt + 1}/${maxRetries} in ${Math.round(delay)}ms`);
      await new Promise((r) => setTimeout(r, delay));
    }
  }
  throw new Error("Unreachable");
}

// Usage with Langfuse client operations
import { LangfuseClient } from "@langfuse/client";

const langfuse = new LangfuseClient();

await withRetry(() =>
  langfuse.score.create({
    traceId: "trace-123",
    name: "quality",
    value: 0.95,
    dataType: "NUMERIC",
  })
);
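
To make the backoff timing concrete, the sketch below lists the base delay before each retry, using the same formula and defaults as `withRetry` with the random jitter left out. `backoffSchedule` is a hypothetical helper added for illustration only.

```typescript
// Hypothetical helper: base delays (without jitter) that an exponential
// backoff of the form min(base * 2^attempt, max) waits before each retry.
function backoffSchedule(
  maxRetries: number = 5,
  baseDelayMs: number = 1000,
  maxDelayMs: number = 30000
): number[] {
  return Array.from({ length: maxRetries }, (_, attempt) =>
    Math.min(baseDelayMs * Math.pow(2, attempt), maxDelayMs)
  );
}

console.log(backoffSchedule());  // [1000, 2000, 4000, 8000, 16000]
console.log(backoffSchedule(7)); // [1000, 2000, 4000, 8000, 16000, 30000, 30000]
```

With the defaults, a call that keeps failing waits about 31 seconds in total (plus up to 0.5 seconds of jitter per retry) before the error is rethrown. The 30-second cap only takes effect from the sixth retry onward.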

Step 3: Queue-Based Concurrency Limiting

Use p-queue to cap concurrent Langfuse API calls:

typescript

import PQueue from "p-queue";
import { LangfuseClient } from "@langfuse/client";

const langfuse = new LangfuseClient();

// Max 10 concurrent API calls, 50 per second
const queue = new PQueue({
  concurrency: 10,
  interval: 1000,
  intervalCap: 50,
});

// Queue score submissions
async function queueScore(params: {
  traceId: string;
  name: string;
  value: number;
}) {
  return queue.add(() =>
    langfuse.score.create({
      ...params,
      dataType: "NUMERIC",
    })
  );
}

// Queue dataset item creation
async function queueDatasetItem(
  datasetName: string,
  item: { input: unknown; expectedOutput?: unknown }
) {
  return queue.add(() =>
    langfuse.api.datasetItems.create({
      datasetName,
      input: item.input,
      expectedOutput: item.expectedOutput,
    })
  );
}

// Monitor queue health: `pending` counts running tasks, `size` counts waiting tasks
setInterval(() => {
  console.log(`Queue: ${queue.pending} running, ${queue.size} waiting`);
}, 10000);
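
For readers who want to see what a concurrency cap does, here is a minimal dependency-free sketch. It is an illustrative stand-in, not p-queue's implementation, and it does not cover the `interval`/`intervalCap` rate window. `createLimiter` and `demoPeakConcurrency` are hypothetical names.

```typescript
// Illustrative stand-in for a `concurrency` cap (not p-queue's implementation).
function createLimiter(concurrency: number) {
  let running = 0;                      // Tasks currently holding a slot
  const waiting: Array<() => void> = []; // Resolvers for tasks awaiting a slot

  return async function limit<T>(task: () => Promise<T>): Promise<T> {
    if (running >= concurrency) {
      // All slots are taken: wait until a finishing task hands its slot over.
      await new Promise<void>((resolve) => waiting.push(resolve));
    } else {
      running++;
    }
    try {
      return await task();
    } finally {
      const next = waiting.shift();
      if (next) {
        next(); // Hand the slot directly to the next waiter; `running` is unchanged
      } else {
        running--;
      }
    }
  };
}

// Runs 6 short tasks through a limiter of 2 and reports the peak overlap.
async function demoPeakConcurrency(): Promise<number> {
  const limit = createLimiter(2);
  let active = 0;
  let peak = 0;
  const task = async (): Promise<void> => {
    active++;
    peak = Math.max(peak, active);
    await Promise.resolve(); // Stand-in for an API call
    active--;
  };
  await Promise.all(Array.from({ length: 6 }, () => limit(task)));
  return peak;
}
```

Handing the slot directly to the next waiter, instead of decrementing and letting the waiter increment again, avoids a window in which a newly arriving task could take the slot and push the count past the cap.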

Step 4: Configurable Sampling for Ultra-High Volume

When trace volume exceeds your rate limits, record a controlled sample of traces rather than letting the excess be rejected arbitrarily:

typescript

import { startActiveObservation, updateActiveObservation } from "@langfuse/tracing";

class TraceSampler {
  private rate: number;
  private sampledAt: number[] = []; // Timestamps of traces sampled within the current window
  private windowMs = 60000; // 1 minute window
  private maxPerWindow: number;

  constructor(sampleRate: number, maxPerMinute: number) {
    this.rate = sampleRate;
    this.maxPerWindow = maxPerMinute;
  }

  shouldSample(tags?: string[]): boolean {
    // Always sample errors (these bypass, and do not count toward, the window limit)
    if (tags?.includes("error") || tags?.includes("critical")) {
      return true;
    }

    // Check window limit, discarding timestamps older than the window first
    const now = Date.now();
    this.sampledAt = this.sampledAt.filter((t) => t > now - this.windowMs);
    if (this.sampledAt.length >= this.maxPerWindow) {
      return false;
    }

    // Probabilistic sampling; `>=` ensures a rate of 0 never samples
    if (Math.random() >= this.rate) {
      return false;
    }

    this.sampledAt.push(now);
    return true;
  }
}

// 10% sampling, max 1000 traces/minute
const sampler = new TraceSampler(0.1, 1000);

async function sampledOperation<T>(
  name: string,
  fn: () => Promise<T>,
  tags?: string[] // Pass "error" or "critical" to force sampling
): Promise<T> {
  if (!sampler.shouldSample(tags)) {
    return fn(); // Run without tracing
  }

  return startActiveObservation(name, async () => {
    updateActiveObservation({ metadata: { sampled: true } });
    return fn();
  });
}
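
The sampler above decides randomly on every call, so two operations belonging to the same request can receive different decisions. One common alternative, shown here as a sketch rather than as something the Langfuse SDK provides, is to derive the decision from a stable identifier so that it is repeatable. `fnv1a32` and `shouldSampleById` are hypothetical helpers, and the hash runs over UTF-16 code units for simplicity.

```typescript
// 32-bit FNV-1a hash over the string's UTF-16 code units.
function fnv1a32(input: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // Multiply by the FNV prime (32-bit wraparound)
  }
  return hash >>> 0; // Convert to an unsigned 32-bit integer
}

// Hypothetical helper: the same id always yields the same decision.
function shouldSampleById(id: string, sampleRate: number): boolean {
  // Map the hash to [0, 1) and compare against the sample rate.
  return fnv1a32(id) / 0x100000000 < sampleRate;
}

// Repeated calls with the same id agree with each other.
const first = shouldSampleById("trace-123", 0.1);
const second = shouldSampleById("trace-123", 0.1);
console.log(first === second); // true
```

A rate of 1 samples every id and a rate of 0 samples none. The per-minute cap from `TraceSampler` could still be layered on top if a hard ceiling is needed.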

Rate Limit Reference

Tier	Traces/min	Batch Size	Strategy
Hobby	~500	15	Default settings
Pro	~5,000	50	Increase `flushAt`
Team	~10,000	100	+ Queue-based limiting
Enterprise	Custom	Custom	+ Sampling

Error Handling

Error	Response	Action
`429 Too Many Requests`	`Retry-After: N`	Back off for N seconds
`503 Service Unavailable`	Server overloaded	Back off for 30 seconds or more
Flush timeout	Large batch	Reduce `flushAt`, increase `requestTimeout`
Memory growth	Queue backup	Apply backpressure: await `queue.onSizeLessThan(n)` before adding tasks to PQueue
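
The HTTP specification allows `Retry-After` to be either a number of seconds or an HTTP date, so a tolerant parser can be handy when acting on the first row of the table. The following is a minimal sketch. `parseRetryAfterMs` is a hypothetical helper, and whether Langfuse ever sends the date form is not something this document confirms.

```typescript
// Hypothetical helper: converts a Retry-After header value to milliseconds.
// Returns undefined when the header is missing or unparseable, so the
// caller can fall back to exponential backoff.
function parseRetryAfterMs(
  header: string | undefined,
  nowMs: number = Date.now()
): number | undefined {
  if (!header) return undefined;
  const value = header.trim();

  // Form 1: delay in whole seconds, e.g. "5"
  if (/^\d+$/.test(value)) return Number(value) * 1000;

  // Form 2: HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
  const dateMs = Date.parse(value);
  if (isNaN(dateMs)) return undefined;
  return Math.max(0, dateMs - nowMs); // Dates in the past mean "retry now"
}

console.log(parseRetryAfterMs("5")); // 5000
```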

Resources

Maintainer

jeremylongshore Core maintainer

Source details

Full Name: jeremylongshore/claude-code-plugins-plus-skills
Branch: main
Path in repo: plugins/saas-packs/langfuse-pack/skills/langfuse-rate-limits
License: Other
Topics: ai claude-code anthropic agent-skills automation mcp ai-agents developer-tools skills llm marketplace saas claude-code-plugins devops plugin-marketplace plugin-system
