Agent skill
sla-monitor-generator
Generate SLA/SLO/SLI monitoring configurations for reliability tracking and error budget management. Activates for SLO setup, reliability targets, and error budget configuration.
Stars
163
Forks
31
Install this agent skill to your Project
npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/development/sla-monitor-generator
SKILL.md
SLA Monitor Generator
Define and monitor Service Level Objectives (SLOs) and track error budgets.
SLO Definition Example
yaml
slos:
- name: api-availability
sli:
metric: http_requests_total
filter: status < 500
target: 99.9 # 99.9% availability
window: 30d
- name: api-latency
sli:
metric: http_request_duration_seconds
percentile: 99
target: 200 # 200ms at p99
window: 30d
- name: error-rate
sli:
metric: http_requests_total
filter: status >= 500
target: 0.1 # < 0.1% error rate
window: 30d
Prometheus AlertManager Rules
yaml
groups:
- name: slo-alerts
rules:
- alert: SLOBudgetBurnRate
expr: |
(
1 - (sum(rate(http_requests_total{status!~"5.."}[5m]))
/ sum(rate(http_requests_total[5m])))
) > 0.001 * 14.4
for: 2m
labels:
severity: critical
annotations:
summary: "Fast burn rate detected - 2% budget in 1 hour"
Best Practices
- ✅ Define SLIs based on user experience
- ✅ Set realistic SLO targets (99.9% not 100%)
- ✅ Track error budgets continuously
- ✅ Alert on burn rate, not just breaches
- ✅ Review and adjust SLOs quarterly
Didn't find tool you were looking for?