Agent skill
scenario
Author and manage holdout scenarios for behavioral validation. Scenarios are stored in .agents/holdout/ where implementing agents cannot see them. Triggers: "$scenario", "holdout", "behavioral scenario", "create scenario", "list scenarios".
Install this agent skill to your Project
npx add-skill https://github.com/boshu2/agentops/tree/main/skills-codex/scenario
SKILL.md
Scenario Skill
Author and manage holdout scenarios for Stage 4 behavioral validation.
Scenarios are holdout — implementing agents cannot see them (enforced by hook).
Evaluator agents validate code against scenarios during STEP 1.8 in $validation.
Execution Steps
Step 1: Initialize
ao scenario init # Creates .agents/holdout/ with README
Step 2: Author Scenario
Write a scenario JSON to .agents/holdout/<id>.json following schemas/scenario.v1.schema.json:
{
  "id": "s-2026-04-05-001",
  "version": 1,
  "date": "2026-04-05",
  "goal": "User can authenticate with valid credentials",
  "narrative": "A user visits login, enters valid credentials, expects dashboard redirect.",
  "expected_outcome": "Dashboard loads, session cookie is HttpOnly and Secure.",
  "acceptance_vectors": [
    {"dimension": "correctness", "threshold": 0.9, "check": "grep -q 'HttpOnly' headers"},
    {"dimension": "performance", "threshold": 0.7}
  ],
  "satisfaction_threshold": 0.8,
  "scope": {
    "files": ["src/auth/middleware.go"],
    "functions": ["Authenticate"],
    "behaviors": ["login flow"]
  },
  "source": "human",
  "status": "active"
}
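For a quick sanity check while authoring, the fields in the example above can be inspected with a few lines of Python. This is only a sketch: schemas/scenario.v1.schema.json remains the authority, and the name check_scenario, the REQUIRED_FIELDS tuple, and the assumption that every field in the example is required are illustrative, not taken from the schema.

```python
# Hypothetical pre-check: the field list mirrors the example scenario,
# NOT the real schema, which may mark some of these fields optional.
REQUIRED_FIELDS = (
    "id", "version", "date", "goal", "narrative", "expected_outcome",
    "acceptance_vectors", "satisfaction_threshold", "scope", "source", "status",
)


def in_unit_range(value) -> bool:
    """True if value is a number between 0.0 and 1.0 inclusive."""
    return isinstance(value, (int, float)) and 0.0 <= value <= 1.0


def check_scenario(scenario: dict) -> list[str]:
    """Return a list of problems; an empty list means none were found."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS
                if name not in scenario]

    # Only range-check the overall threshold when it is present.
    if "satisfaction_threshold" in scenario and not in_unit_range(
            scenario["satisfaction_threshold"]):
        problems.append("satisfaction_threshold must be within 0.0 to 1.0")

    # Each acceptance vector needs a dimension and a threshold in range.
    for i, vector in enumerate(scenario.get("acceptance_vectors", [])):
        if "dimension" not in vector:
            problems.append(f"acceptance_vectors[{i}] has no dimension")
        if not in_unit_range(vector.get("threshold")):
            problems.append(f"acceptance_vectors[{i}] threshold must be within 0.0 to 1.0")
    return problems
```

An empty result only means these surface checks passed; ao scenario validate is still the step that checks a scenario against the schema.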
Step 3: Validate
ao scenario validate # Checks all scenarios against schema
Step 4: List
ao scenario list # All scenarios
ao scenario list --status active # Active only
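Because each scenario is a JSON file under .agents/holdout/ with a status field, the same filtering can be approximated in a script run by a human or an evaluator agent. The sketch below is an assumption about what listing amounts to, not the implementation of ao scenario list; the function name list_scenarios is hypothetical.

```python
import json
from pathlib import Path


def list_scenarios(holdout_dir, status=None):
    """Load every *.json scenario in holdout_dir, optionally filtered by status."""
    scenarios = []
    for path in sorted(Path(holdout_dir).glob("*.json")):
        data = json.loads(path.read_text(encoding="utf-8"))
        # status=None keeps everything; otherwise keep exact matches only.
        if status is None or data.get("status") == status:
            scenarios.append(data)
    return scenarios
```

For example, list_scenarios(".agents/holdout", status="active") would return only the scenarios whose status is "active".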
Key Rules
- Scenarios use satisfaction scoring (0.0-1.0), not boolean pass/fail
- Scenarios should be written by humans or evaluator agents, NEVER by the implementing agent
- source field tracks provenance: human, agent, prod-telemetry
- Agent-built specs (from $implement Step 5c) use the auto-* id prefix and live in .agents/specs/
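To make the satisfaction-scoring rule concrete, the sketch below compares graded per-dimension scores with each acceptance vector's threshold and then with satisfaction_threshold. How the real evaluator aggregates scores is not stated here, so the unweighted mean, the treatment of unscored dimensions as 0.0, and the function name evaluate are all assumptions.

```python
def evaluate(scenario: dict, scores: dict) -> dict:
    """Grade a scenario from per-dimension scores in the range 0.0 to 1.0."""
    per_dimension = {}
    for vector in scenario["acceptance_vectors"]:
        dimension = vector["dimension"]
        # Assumption: a dimension the evaluator did not score counts as 0.0.
        score = scores.get(dimension, 0.0)
        per_dimension[dimension] = {
            "score": score,
            "met": score >= vector["threshold"],
        }

    # Assumption: overall satisfaction is the unweighted mean of the scores.
    if per_dimension:
        overall = sum(d["score"] for d in per_dimension.values()) / len(per_dimension)
    else:
        overall = 0.0

    return {
        "per_dimension": per_dimension,
        "satisfaction_score": overall,
        "satisfied": overall >= scenario["satisfaction_threshold"],
    }
```

With the example scenario's thresholds (correctness 0.9, performance 0.7, overall 0.8), scores of 1.0 and 0.5 give a mean of 0.75: correctness is met, performance is not, and the scenario as a whole falls short, which a boolean pass/fail would not distinguish from a total failure.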
See Also
- $validation: STEP 1.8 consumes scenarios
- $vibe: Exposes satisfaction_score
- $implement: Step 5c generates agent-built specs