
verify

Rigorous self-assessment checklist before marking any task as complete. Use when about to claim task completion, before final commit, when user asks "is it done?", or when transitioning from implementation to reporting. Prevents premature completion claims by requiring evidence for every assertion.


Install this agent skill to your Project

npx add-skill https://github.com/Jamie-BitFlight/claude_skills/tree/main/.claude/skills/verify

SKILL.md

Verification Protocol

Workflow Reference: See Master Workflow for how this skill fits into the verification stage of the agentic workflow.

STOP. You are NOT done yet. Generate this checklist and provide EVIDENCE for every item.

1. Task Type & Strategy

Type: FIX / FEATURE / REFACTOR / DOCS / INVESTIGATION
Strategy: Executable verification vs. Static verification?

2. The "WORKS" Check

```mermaid
flowchart TD
    Start(["Begin WORKS Check -- Section 2"]) --> Q{"Task type?"}
    Q -->|"Executable code -- compiled, scripted, or CLI-run"| A1["Execution check<br>Terminal output showing successful run<br>(exit code 0 is NOT enough)"]
    Q -->|"Static asset -- docs, configs, analysis"| B1["Accuracy check<br>Verified against source code or schema?"]
    A1 --> A2["Real data check<br>Ran changed code path against real data<br>not just read the diff?"]
    A2 --> A3["Regression check<br>Evidence that existing tests still pass?"]
    A3 --> A4["Edge case check<br>Evidence of testing failure scenarios?"]
    A4 --> AEvidence["Record code evidence<br>execution output, real data test,<br>test results, edge case result"]
    B1 --> B2["Clarity check<br>Follows the established format?"]
    B2 --> B3["Validity check<br>Links and references resolve?"]
    B3 --> BEvidence["Record static evidence<br>accuracy check method,<br>format standard, link validation method"]
    AEvidence --> Done(["WORKS Check complete -- proceed to Section 3"])
    BEvidence --> Done
```

A. For Code (Executable)

Execution: Terminal output showing successful run? (Exit code 0 is NOT enough)
Real data: Ran the changed code path against real data, not just read the diff?
Regression: Evidence that existing tests still pass?
Edge Cases: Evidence of testing failure scenarios?

```text
EVIDENCE:
- Execution output: [paste actual output]
- Real data test: [command run, input used, output observed]
- Test results: [paste test output]
- Edge case tested: [describe scenario and result]
```
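The point that exit code 0 is not enough can be illustrated with a minimal Python sketch. The helper name `run_with_evidence` and the expected text `"3 rows written"` are hypothetical, chosen only for illustration; a real check would run the changed code path and look for output that proves the behavior.

```python
import subprocess
import sys


def run_with_evidence(cmd, expected_text):
    """Run cmd and return an evidence record.

    Passing requires exit code 0 AND the expected text in stdout.
    """
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return {
        "command": " ".join(cmd),
        "exit_code": proc.returncode,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
        "passed": proc.returncode == 0 and expected_text in proc.stdout,
    }


# Stand-in commands (assumptions): one exits 0 silently, one prints the result.
silent = run_with_evidence([sys.executable, "-c", "pass"], "3 rows written")
real = run_with_evidence(
    [sys.executable, "-c", "print('3 rows written')"], "3 rows written"
)
print(silent["exit_code"], silent["passed"])  # 0 False
print(real["exit_code"], real["passed"])      # 0 True
```

Both commands exit with code 0, but only the second one produces output that could be pasted as execution evidence.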

B. For Static Assets (Docs, Configs, Analysis)

Accuracy: Verified against source code/schema?
Clarity: Does it follow the established format?
Validity: Do links/references resolve?

```text
EVIDENCE:
- Accuracy check: [how verified]
- Format compliance: [standard followed]
- Links validated: [method used]
```
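One possible "method used" for the validity check is sketched below in Python, assuming the static asset is a markdown file with relative links. The function name and the link pattern are illustrative assumptions; external URLs and in-page anchors are skipped and would need a separate check.

```python
import re
from pathlib import Path

# Matches inline markdown links such as [label](target); an assumption about
# the document format, not a full markdown parser.
LINK_PATTERN = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def find_broken_relative_links(markdown_text, base_dir):
    """Return relative link targets that do not exist under base_dir."""
    broken = []
    for target in LINK_PATTERN.findall(markdown_text):
        if target.startswith(("http://", "https://", "mailto:", "#")):
            continue  # external links and in-page anchors are not checked here
        path_part = target.split("#", 1)[0]  # drop any "#section" suffix
        if not (Path(base_dir) / path_part).exists():
            broken.append(target)
    return broken
```

An empty result for every document touched by the task is the kind of output that can be recorded under "Links validated".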

3. The "FIXED" Check

For bug fixes specifically:

Reproduction: Did I observe the pre-fix state?
Resolution: Does the original problem NO LONGER occur?

```text
EVIDENCE:
- Pre-fix behavior: [what was observed]
- Post-fix behavior: [what is now observed]
- Regression test added: [yes/no, location]
```
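The reproduction and resolution questions can be read as a two-sided comparison. The sketch below is one way to encode that in Python; `fixed_check` and the idea of a textual `failure_marker` (for example an error message) are assumptions made for illustration.

```python
def fixed_check(pre_fix_output, post_fix_output, failure_marker):
    """Return (verdict, reason).

    PASS requires the failure to be observed before the fix and absent after.
    """
    if pre_fix_output is None:
        return ("FAIL", "pre-fix state was never observed")
    if failure_marker not in pre_fix_output:
        return ("FAIL", "pre-fix output does not show the reported problem")
    if failure_marker in post_fix_output:
        return ("FAIL", "problem still occurs after the fix")
    return ("PASS", "problem reproduced before the fix and absent after it")


print(fixed_check("KeyError: 'id'", "3 rows written", "KeyError"))
print(fixed_check(None, "3 rows written", "KeyError"))
```

The second call fails even though the post-fix output looks healthy, because without a recorded pre-fix state there is nothing to compare against.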

4. Quality Gates

Pre-commit hooks passed?
Linting passed? (Necessary, but not sufficient)
Type checking passed? (if applicable)

```text
EVIDENCE:
- Pre-commit: [output or "not configured"]
- Linting: [tool and result]
- Type check: [tool and result]
```
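A gate's evidence line can be produced by running the tool rather than recalling its result. The Python sketch below assumes each gate is a single command; treating a tool that is missing from `PATH` as "not configured" is an assumption, and the commands shown are stand-ins, not the project's real tooling.

```python
import shutil
import subprocess
import sys


def run_gate(name, cmd):
    """Run one gate command and return an evidence line for the checklist."""
    if shutil.which(cmd[0]) is None:
        # Assumption: a missing executable is reported as "not configured".
        return f"- {name}: not configured ({cmd[0]} not found on PATH)"
    proc = subprocess.run(cmd, capture_output=True, text=True)
    status = "PASS" if proc.returncode == 0 else "FAIL"
    output = (proc.stdout + proc.stderr).strip()
    return f"- {name}: {status} (exit {proc.returncode}) {output}"


# Stand-in commands for illustration only.
print(run_gate("Type check", [sys.executable, "-c", "print('ok')"]))
print(run_gate("Linting", ["hypothetical-linter-that-does-not-exist"]))
```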

5. Proportional Response Check

If the task has an issue-classification field in its metadata, verify the response matched the issue type. If no issue-classification is present, mark N/A and proceed.

```mermaid
flowchart TD
    Start(["Begin Proportional Response Check"]) --> Q1{"issue-classification<br>present in task metadata?"}
    Q1 -->|"absent"| Skip["SKIP -- existing WORKS/FIXED/Quality Gates apply"]
    Q1 -->|"present"| Q2{"Classification type?"}
    Q2 -->|"procedural"| P["Sweep completeness<br>Codebase search returns zero<br>remaining instances of the pattern"]
    Q2 -->|"defect"| D["Root cause addressed<br>Fix targets root cause from evidence chain<br>+ scenario in scenario-target succeeds"]
    Q2 -->|"recurring-pattern"| R["Guardrail added<br>New gate/check exists AND<br>covers the defect CLASS not just instance"]
    Q2 -->|"missing-guardrail"| M["Gate gap filled<br>Guardrail triggers in the<br>exposing scenario"]
    Q2 -->|"unbounded-design"| U["Design implemented<br>Matches chosen direction +<br>trade-offs documented"]
    P --> Evidence
    D --> Evidence
    R --> Evidence
    M --> Evidence
    U --> Evidence
    Skip --> Done(["Proportional Check complete"])
    Evidence["Record proportional evidence"] --> Done
```

```text
EVIDENCE:
- Issue Classification: [type or "not classified"]
- Scenario Target: [scenario -> improvement, or "not specified"]
- Proportional Check: [PASS/FAIL/N/A]
- Check detail: [what was verified and result]
```
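The branching in this section amounts to a lookup from classification to the check that must be evidenced. A minimal Python sketch follows; representing task metadata as a dict with an `"issue-classification"` key is an assumption, and the check descriptions paraphrase the flowchart.

```python
# Classification -> what must be verified, paraphrased from the flowchart.
REQUIRED_CHECK = {
    "procedural": "Codebase search returns zero remaining instances of the pattern",
    "defect": "Fix targets the root cause and the scenario in scenario-target succeeds",
    "recurring-pattern": "New gate/check exists and covers the defect class, not just the instance",
    "missing-guardrail": "Guardrail triggers in the exposing scenario",
    "unbounded-design": "Design matches the chosen direction and trade-offs are documented",
}


def required_check(metadata):
    """Return the check to evidence, or None when the task is unclassified (N/A)."""
    classification = metadata.get("issue-classification")
    if classification is None:
        return None  # mark N/A and proceed
    if classification not in REQUIRED_CHECK:
        raise ValueError(f"unknown issue-classification: {classification!r}")
    return REQUIRED_CHECK[classification]


print(required_check({}))
print(required_check({"issue-classification": "missing-guardrail"}))
```

Raising on an unknown classification is a design choice in this sketch: silently skipping it would let a mistyped value bypass the check.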

6. Agent Delegation Verification

When work was delegated to a sub-agent, the agent's success report is NOT evidence.

VCS diff reviewed: git diff shows the expected changes?
Changes verified: Read the modified files — content matches intent?
Tests run independently: Ran the verification command yourself, not trusting the agent's claim?

```text
EVIDENCE:
- Agent report: [what agent claimed]
- VCS diff: [files changed, scope matches expectation]
- Independent verification: [command run, output observed]
```

If no agents were used, mark N/A and proceed.
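The "scope matches expectation" part of the VCS diff evidence can be checked mechanically. The Python sketch below assumes a Git working tree and that the delegating agent wrote down the files it expected to change; both function names are hypothetical. Note that `git diff --name-only HEAD` lists tracked files only, so newly created untracked files need `git status` as well.

```python
import subprocess


def changed_files(repo_dir="."):
    """List tracked files that differ from HEAD, via `git diff --name-only HEAD`."""
    out = subprocess.run(
        ["git", "diff", "--name-only", "HEAD"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    ).stdout
    return {line for line in out.splitlines() if line}


def scope_report(changed, expected):
    """Compare what changed with what the delegated task was expected to touch."""
    changed, expected = set(changed), set(expected)
    return {
        "unexpected": sorted(changed - expected),
        "missing": sorted(expected - changed),
        "scope_matches": changed == expected,
    }


# Illustrative file names only.
report = scope_report(
    changed={"src/parser.py", "README.md"},
    expected={"src/parser.py", "tests/test_parser.py"},
)
print(report)
```

A mismatch in either direction is worth reading the diff for: an unexpected file suggests scope creep, and a missing one suggests the agent's report overstated what was done.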

7. Honesty Check

Did I verify the full scope?
Am I distinguishing between "should work" and "verified to work"?
Destination check: Did I read the target state after writing? (Tool output claiming success is not evidence — the state of the destination is.)
Can I answer YES to: "I have VALIDATED this output in its intended context"?

Rationalization Prevention

If any of these thoughts occur, STOP and run the verification command:

| Rationalization | Response |
| --- | --- |
| "Should work now" | Run the verification command |
| "I'm confident" | Confidence is not evidence |
| "Just this once" | No exceptions |
| "Linter passed so build passes" | Linter does not check compilation |
| "Agent said success" | Verify independently (Section 6) |
| "I'm tired" | Exhaustion is not an excuse |
| "Partial check is enough" | Partial check proves nothing about the whole |
| "Different words so rule doesn't apply" | Spirit over letter |

Red flags in your own output — if you catch yourself writing any of these, the gate has not been passed:

"should", "seems to", "looks correct"
Expressions of satisfaction before verification ("Done!", "Perfect!")
Preparing to commit, push, or open a PR without fresh command output in this message
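The phrase-based red flags lend themselves to a simple scan of a draft message before it is sent. The Python sketch below is an illustration under stated assumptions: the phrase list is taken from the examples above and is not exhaustive, and a phrase match cannot tell whether verification already happened, so a hit is a prompt to re-check rather than a verdict.

```python
import re

# Phrases taken from the red-flag examples; not an exhaustive list.
RED_FLAGS = ["should", "seems to", "looks correct", "Done!", "Perfect!"]


def find_red_flags(message):
    """Return the red-flag phrases present in message (case-insensitive)."""
    found = []
    for phrase in RED_FLAGS:
        # Lookarounds keep "should" from matching inside "shoulder".
        pattern = r"(?<!\w)" + re.escape(phrase) + r"(?!\w)"
        if re.search(pattern, message, flags=re.IGNORECASE):
            found.append(phrase)
    return found


print(find_red_flags("Done! The fix should work now."))
# ['should', 'Done!']
print(find_red_flags("Tests: 12 passed, 0 failed (output above)."))
# []
```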

The Golden Rule

If you cannot demonstrate it working in practice with evidence, it is NOT done.

| Claim | Required Evidence |
| --- | --- |
| "Code works" | Terminal output showing execution against real data |
| "Tests pass" | Actual test output, not assumption |
| "Bug fixed" | Before/after comparison |
| "Data synced" | Read the destination after writing, not the tool output |
| "Docs accurate" | Cross-reference with source |
| "Config valid" | Validation command output |
| "Root cause fixed" | Evidence chain from grooming + fix addresses root cause claim |
| "Guardrail added" | New gate/check exists and triggers in exposing scenario |
| "Agent completed" | VCS diff reviewed + independent verification command run |

Quick Reference

```text
VERIFICATION SUMMARY:
Task Type: [FIX/FEATURE/REFACTOR/DOCS/INVESTIGATION]
Works Check: [PASS/FAIL] - Evidence: ___
Fixed Check: [PASS/FAIL/N/A] - Evidence: ___
Proportional Check: [PASS/FAIL/N/A] - Evidence: ___
Quality Gates: [PASS/FAIL] - Evidence: ___
Agent Delegation: [PASS/FAIL/N/A] - Evidence: ___
Honesty Check: [PASS/FAIL]
```

```text
VERDICT: [COMPLETE / NOT COMPLETE - reason]
```
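How the summary lines roll up into a verdict can be sketched as follows. This is a hypothetical Python helper, not part of the skill: it assumes each check is recorded as a `(status, evidence)` pair and that every PASS must carry non-empty evidence, which is stricter than the template for the Honesty Check line.

```python
def verdict(checks):
    """Map {check name: (status, evidence)} to a VERDICT line.

    COMPLETE only when every check is N/A or PASS with evidence.
    """
    problems = []
    for name, (status, evidence) in checks.items():
        if status == "FAIL":
            problems.append(f"{name} failed")
        elif status == "PASS" and not evidence.strip():
            problems.append(f"{name} has no evidence")
        elif status not in ("PASS", "N/A"):
            problems.append(f"{name} has unknown status {status!r}")
    if problems:
        return "VERDICT: NOT COMPLETE - " + "; ".join(problems)
    return "VERDICT: COMPLETE"


# Illustrative values only.
print(verdict({
    "Works Check": ("PASS", "pytest output pasted above"),
    "Fixed Check": ("N/A", ""),
    "Quality Gates": ("PASS", ""),
}))
# VERDICT: NOT COMPLETE - Quality Gates has no evidence
```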

Maintainer

Jamie-BitFlight Core maintainer

Source details

Full Name: Jamie-BitFlight/claude_skills
Branch: main
Path in repo: .claude/skills/verify
