Agent skill
llm-council
Multi-model consensus using the Karpathy LLM Council pattern for critical decisions
Install this agent skill to your Project
npx add-skill https://github.com/DNYoussef/context-cascade/tree/main/skills/orchestration/llm-council
SKILL.md
LLM Council Skill
LIBRARY-FIRST PROTOCOL (MANDATORY)
Before writing ANY code, you MUST check:
Step 1: Library Catalog
- Location: .claude/library/catalog.json
- If match >70%: REUSE or ADAPT
Step 2: Patterns Guide
- Location: .claude/docs/inventories/LIBRARY-PATTERNS-GUIDE.md
- If pattern exists: FOLLOW documented approach
Step 3: Existing Projects
- Location: D:\Projects\*
- If found: EXTRACT and adapt
Decision Matrix
| Match | Action |
|---|---|
| Library >90% | REUSE directly |
| Library 70-90% | ADAPT minimally |
| Pattern exists | FOLLOW pattern |
| In project | EXTRACT |
| No match | BUILD (add to library after) |
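The matrix can be read as an ordered series of checks. A minimal sketch, assuming a hypothetical `chooseAction` helper and a match score expressed as a fraction between 0 and 1 (neither is part of the skill itself). A score of exactly 0.90 is treated as ADAPT here, which the table leaves unspecified:

```javascript
// Hypothetical helper: maps lookup results to the action in the decision matrix.
// `libraryMatch` is an assumed 0..1 similarity score for the best catalog entry.
function chooseAction({ libraryMatch = 0, patternExists = false, foundInProject = false }) {
  if (libraryMatch > 0.9) return "REUSE";   // Library >90%
  if (libraryMatch >= 0.7) return "ADAPT";  // Library 70-90%
  if (patternExists) return "FOLLOW";       // Pattern exists
  if (foundInProject) return "EXTRACT";     // In project
  return "BUILD";                           // No match: add to library afterwards
}

console.log(chooseAction({ libraryMatch: 0.95 }));
console.log(chooseAction({ libraryMatch: 0.4, patternExists: true }));
```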
Purpose
Run a three-stage multi-model consensus process for critical decisions where:
- Single-model hallucination risk is unacceptable
- Multiple perspectives improve decision quality
- High-stakes choices need validation
Architecture (Karpathy Pattern)
STAGE 1: COLLECT
        +---> Claude ---> Response A
        |
Query --+---> Gemini ---> Response B
        |
        +---> Codex ----> Response C

STAGE 2: RANK
        Each model reviews others (anonymized)
        Produces rankings with rationale

STAGE 3: SYNTHESIZE
        Chairman aggregates rankings
        Produces final answer with consensus score
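The three stages can be sketched end to end with stand-in models. This is an illustration under stated assumptions, not the skill's actual implementation: `runCouncilSketch` and `makeStubModel` are hypothetical names, the Borda-count aggregation is an assumed rule, and the consensus score is assumed to be the share of eligible reviewers who ranked the winning response first. The skill does not document how it computes either.

```javascript
// Stand-in model: returns a canned answer and ranks candidates by a fixed preference list.
function makeStubModel(name, preferences) {
  return {
    answer: async (prompt) => `${name} answer to: ${prompt}`,
    rank: async (_query, candidates) =>
      preferences.filter((label) => candidates.some((c) => c.label === label)),
  };
}

async function runCouncilSketch(query, models, chairmanName) {
  const names = Object.keys(models);

  // Stage 1: collect one response per model, in parallel.
  const texts = await Promise.all(names.map((n) => models[n].answer(query)));
  const labels = names.map((_, i) => String.fromCharCode(65 + i)); // A, B, C

  // Stage 2: each model ranks the other responses, seeing labels instead of model names.
  const rankings = await Promise.all(
    names.map(async (n, i) => {
      const candidates = labels
        .map((label, j) => ({ label, text: texts[j] }))
        .filter((_, j) => j !== i);
      return { reviewer: n, order: await models[n].rank(query, candidates) };
    })
  );

  // Stage 3a: aggregate rankings with a Borda count (assumed aggregation rule).
  const points = Object.fromEntries(labels.map((l) => [l, 0]));
  for (const { order } of rankings) {
    order.forEach((label, pos) => { points[label] += order.length - pos; });
  }
  const winner = labels.reduce((a, b) => (points[b] > points[a] ? b : a));

  // Assumed consensus score: share of eligible reviewers who ranked the winner first.
  const winnerIdx = labels.indexOf(winner);
  const eligible = rankings.filter((_, i) => i !== winnerIdx);
  const agree = eligible.filter((r) => r.order[0] === winner).length;
  const consensus_score = eligible.length ? agree / eligible.length : 0;

  // Stage 3b: the chairman writes the final answer (simplified to a single prompt here).
  const synthesis = await models[chairmanName].answer(
    `Synthesize a final answer to "${query}" favoring: ${texts[winnerIdx]}`
  );
  return { query, final_answer: { synthesis, chairman: chairmanName }, consensus_score, winner };
}

const models = {
  claude: makeStubModel("claude", ["B", "C"]),
  gemini: makeStubModel("gemini", ["A", "C"]),
  codex: makeStubModel("codex", ["B", "A"]),
};
runCouncilSketch("Microservices or monolith?", models, "claude")
  .then((result) => console.log(result.winner, result.consensus_score));
```

Anonymizing responses behind labels keeps a reviewer from favoring or penalizing a response because of which model wrote it.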
When to Use
Perfect For:
- Architecture decisions
- Technology selection
- Critical bug triage
- Security assessment
- High-risk deployments
- Contentious design choices
Don't Use When:
- Simple, low-risk decisions
- Time-critical responses
- Single correct answer exists
- Cost is a concern (at least 3x API usage, since every model answers the query and then reviews the others)
Usage
Basic Council
/llm-council "Should we use microservices or monolith for this system?"
With Threshold
/llm-council "Which auth approach is best?" --threshold 0.75
With Chairman Override
/llm-council "Architecture decision" --chairman gemini
Command Pattern
bash scripts/multi-model/llm-council.sh "<query>" "<threshold>" "<chairman>"
Configuration
| Parameter | Default | Description |
|---|---|---|
| threshold | 0.67 | Minimum consensus score (0 to 1) needed to accept the answer |
| chairman | claude | Model that synthesizes final answer |
| models | [claude, gemini, codex] | Participating models |
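A minimal sketch of how these defaults might be merged with caller overrides. `resolveConfig` is a hypothetical helper, and both validation rules (threshold within 0 to 1, chairman must be one of the participating models) are assumptions rather than documented behavior:

```javascript
// Defaults copied from the configuration table.
const DEFAULTS = Object.freeze({
  threshold: 0.67,
  chairman: "claude",
  models: ["claude", "gemini", "codex"],
});

// Hypothetical helper: merge overrides onto the defaults and sanity-check the result.
function resolveConfig(overrides = {}) {
  const config = { ...DEFAULTS, ...overrides };
  if (!Number.isFinite(config.threshold) || config.threshold < 0 || config.threshold > 1) {
    throw new RangeError("threshold must be a number between 0 and 1");
  }
  // Assumption: the chairman is expected to be one of the participating models.
  if (!config.models.includes(config.chairman)) {
    throw new Error(`chairman "${config.chairman}" is not one of the participating models`);
  }
  return config;
}

console.log(resolveConfig({ threshold: 0.75, chairman: "gemini" }));
```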
Consensus Scoring
- >0.80: Strong consensus - proceed with confidence
- 0.67-0.80: Moderate consensus - consider minority views
- <0.67: Weak consensus - escalate to human review
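The bands above translate directly into a small classifier. A sketch, assuming a hypothetical `classifyConsensus` helper; the boundaries follow the list (0.80 itself falls in the moderate band, 0.67 is the lowest moderate score):

```javascript
// Hypothetical helper: maps a consensus score to the band and suggested action above.
function classifyConsensus(score) {
  if (score > 0.8) return { level: "strong", action: "proceed with confidence" };
  if (score >= 0.67) return { level: "moderate", action: "consider minority views" };
  return { level: "weak", action: "escalate to human review" };
}

console.log(classifyConsensus(0.85).level);
console.log(classifyConsensus(0.7).action);
```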
Memory Integration
Results are stored in Memory-MCP:
- Key: multi-model/council/decisions/{query_id}
- Tags: WHO=llm-council, WHY=consensus-decision
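As an illustration of the key and tag layout, the record might be assembled like this before it is handed to the memory store. `buildMemoryRecord` and the record shape (`key`, `tags`, `value`) are assumptions; the actual Memory-MCP call is not shown because the skill does not document it:

```javascript
// Hypothetical helper: builds the record to store; it does not call Memory-MCP itself.
function buildMemoryRecord(queryId, decision) {
  return {
    key: `multi-model/council/decisions/${queryId}`,
    tags: { WHO: "llm-council", WHY: "consensus-decision" },
    value: decision,
  };
}

console.log(buildMemoryRecord("q-001", { consensus_score: 0.85 }).key);
```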
Output Format
{
  "query": "Original question",
  "final_answer": {
    "synthesis": "Combined answer...",
    "chairman": "claude"
  },
  "consensus_score": 0.85,
  "responses": {
    "claude": "...",
    "gemini": "...",
    "codex": "..."
  },
  "rankings": [
    {"model": "A", "rank": 1, "rationale": "..."}
  ]
}
Failure Modes
Deadlock (No Consensus)
- All models disagree
- Consensus < threshold
- Action: Store for human review
Model Unavailable
- One model times out
- Action: Continue with 2 models (2/3 quorum)
Chairman Failure
- Synthesis fails
- Action: Fall back to the highest-ranked response
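The unavailable-model and chairman-failure cases can be sketched with `Promise.allSettled` and a timeout wrapper. The helper names (`withTimeout`, `collectWithQuorum`, `synthesizeOrFallback`) and the 30-second default timeout are assumptions for illustration; models are represented as plain async functions:

```javascript
// Rejects if `promise` does not settle within `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error("timeout")), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Model unavailable: keep the responses that arrived in time and require a quorum.
async function collectWithQuorum(query, models, { timeoutMs = 30000, quorum = 2 } = {}) {
  const names = Object.keys(models);
  const settled = await Promise.allSettled(
    names.map((n) => withTimeout(Promise.resolve().then(() => models[n](query)), timeoutMs))
  );
  const responses = {};
  settled.forEach((result, i) => {
    if (result.status === "fulfilled") responses[names[i]] = result.value;
  });
  if (Object.keys(responses).length < quorum) {
    throw new Error("quorum not reached");
  }
  return responses;
}

// Chairman failure: fall back to the highest-ranked response (first in the list).
async function synthesizeOrFallback(synthesize, rankedResponses) {
  try {
    return { synthesis: await synthesize(rankedResponses), fallback: false };
  } catch {
    return { synthesis: rankedResponses[0], fallback: true };
  }
}
```

Using `Promise.allSettled` rather than `Promise.all` means one slow or failing model does not discard the responses that did arrive.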
Integration Examples
Architecture Decision
const decision = await runCouncil(
  "Microservices vs Monolith for our scale?",
  { threshold: 0.75 }
);
if (decision.consensus_score >= 0.75) {
  proceed(decision.final_answer);
} else {
  escalateToHuman(decision);
}
Security Assessment
// Use a higher threshold for security decisions
const assessment = await runCouncil(
  "Is this authentication approach secure?",
  { threshold: 0.80 }
);