Agent skills
verification-protocol

Agent skill

verification-protocol

Trigger Pattern Always (used by all verifier agents) - Inject Into security-verifier agents (Phase 5)

View SKILL.md on GitHub Repository

Stars 215

Forks 33

Install this agent skill to your Project

npx add-skill https://github.com/PlamenTSV/plamen/tree/main/agents/skills/sui/verification-protocol

SKILL.md

Verification Protocol (Sui Move)

Trigger Pattern: Always (used by all verifier agents) Inject Into: security-verifier agents (Phase 5) Purpose: Prove hypotheses TRUE or FALSE using Sui Move test framework with test_scenario PoC code.

Evidence Source Tracking (MANDATORY)

CRITICAL: For EVERY piece of evidence used in verification, you MUST tag its source. Evidence from mocks or unverified external packages CANNOT support a REFUTED verdict.

Evidence Source Tags

Tag	Meaning	Valid for REFUTED?
[PROD-ONCHAIN]	Production Sui object data (via Sui Explorer or RPC)	YES
[PROD-SOURCE]	Verified source from Sui Explorer / published package	YES
[PROD-PUBLISHED]	Test against published package bytecode	YES
[CODE]	Audited codebase (in-scope source)	YES
[MOCK]	Mock/test modules or objects	NO
[EXT-UNV]	External, unverified package behavior	NO
[DOC]	Documentation/spec only	NO (needs verification)

Evidence Audit Table (REQUIRED in every verification output)

Before ANY verdict, fill this table:

markdown

### Evidence Audit
| Claim | Evidence Source | Tag | Valid for REFUTED? |
|-------|-----------------|-----|-------------------|
| "External package returns X" | Mock module | [MOCK] | NO |
| "Object ownership is Y" | sources/module.move:123 | [CODE] | YES |
| "Shared object state is Z" | Sui Explorer object view | [PROD-ONCHAIN] | YES |

Mock Rejection Rule

AUTOMATIC OVERRIDE: If ANY evidence supporting REFUTED has tag [MOCK] or [EXT-UNV]:

CANNOT return REFUTED
MUST return CONTESTED
Triggers production verification

Example:

markdown

## Verdict: REFUTED -> CONTESTED (mock evidence override)

### Evidence Audit
| Claim | Source | Tag | Valid? |
|-------|--------|-----|--------|
| "External module validates input" | test_helper.move:45 | [MOCK] | NO |

**Override reason**: REFUTED verdict relies on mock behavior at test_helper.move:45.
Production package behavior is UNVERIFIED. Must fetch published package source.

Pre-Verification Understanding

Before writing ANY test code, you MUST answer:

Question 1: What is the EXACT bug?

NOT: "Object ownership is wrong"
NOT: "Access control is missing"
NOT: "State is inconsistent"

YES: "Function [X] in module [Y] accepts shared object [Z] as `&mut` without
      verifying caller holds [CapabilityType], allowing any address to mutate
      field [W] at line [N]"

Question 2: What OBSERVABLE difference proves it?

NOT: "State changed"
NOT: "Object was modified"

YES: "Before exploit: pool.total_supply = 1000, attacker_balance = 0
      After exploit: pool.total_supply = 1000, attacker_balance = 500
      Expected: transaction should have aborted with ENotAuthorized"

Question 3: What is the EXACT assertion?

NOT: assert!(exploit_worked, 0)

YES: assert!(coin::value(&stolen_coin) > 0, ERR_EXPLOIT_FAILED)
 OR: // Transaction should abort -- if it succeeds, the bug exists
 OR: assert!(state_after.field != state_before.field, ERR_STATE_UNCHANGED)

If you cannot answer all three -> ASK FOR CLARIFICATION

Pre-PoC Feasibility Gates (MANDATORY)

Before writing test code, verify these two gates. If either FAILS, adjust the hypothesis.

Gate F1: Reachability

Trace a call path from a permissionless entry point to the vulnerable code.

Entry point identified (public/external/entry function)
Call path traced through intermediary functions
All access checks on the path are passable by the attacker profile

If NO entry point reaches the vulnerable code → UNREACHABLE → FALSE_POSITIVE. If reachable only through a restricted path → document the restriction, adjust likelihood.

Gate F2: Math Bounds

Substitute real-world value domains into the expression that triggers the bug.

Parameter domains identified (token decimals, max supply, TVL range, fee range, time bounds)
Expression evaluated at worst-case feasible inputs
Result crosses the bug threshold

If the bug requires values outside feasible domains → INFEASIBLE → FALSE_POSITIVE. If feasible only at extreme but realistic parameters → document the threshold, proceed with adjusted severity.

Both gates PASS → proceed to PoC. Either gate FAILS → document and stop.

Test File Templates

See templates.md in this directory for all Sui Move test templates (Templates 1-6: shared object mutation, capability theft, dynamic fields, object wrapping, PTB exploit, concurrent access).

Interpreting Results

Test PASSES -> Bug CONFIRMED

The assertion that "proves the bug" succeeded.

Test FAILS -> Check Why

Failure	Meaning	Action
Abort with error code	Function validation rejected the action	Check if rejection IS the bug or a fix
`test_scenario::take_from_sender` fails	Object not at expected address	Check transfer logic in setup
`test_scenario::take_shared` fails	Shared object not published	Check initialization creates shared objects
Type mismatch	Wrong object type taken from scenario	Fix type parameters
Arithmetic abort (overflow/underflow)	Math operation failed	Check if this IS the bug or setup error
Borrow checker error (compile)	Cannot borrow object mutably	Restructure test to respect Move borrow rules

Iteration Protocol

Attempt 1: Direct implementation of test strategy from hypothesis.

Attempt 2: Adjust parameters:

Different coin amounts (larger/smaller, edge values like 0, 1, u64::MAX)
Different transaction ordering (swap next_tx blocks)
Different actor addresses
Different object states (empty pool, full pool, single-user, multi-user)

Attempt 3: Re-examine assumptions:

Are shared objects properly published in setup?
Are capability objects at the right addresses?
Is the module's initialization complete (all shared objects created)?
Are type parameters correct (generic type instantiation)?
Does the function require a Clock or TxContext argument not provided?

After 5 attempts: If still fails -> FALSE_POSITIVE with documented reasoning.

Severity Determination

CRITICAL

Direct fund theft (Coin drain from shared pools)
Unauthorized admin capability acquisition
Arbitrary package upgrade (if upgrade cap compromised and no timelock)
No special prerequisites needed
Attacker profits significantly

HIGH

Fund loss with specific setup (object pre-creation, ordering dependency)
Broken core functionality (deposits, withdrawals, swaps, liquidations)
Shared object state corruption affecting all users
Significant TVL at risk

MEDIUM

Limited fund loss under specific conditions
Object state corruption (non-fund data)
Edge cases with real impact at design limits
Dynamic field pollution affecting protocol behavior
Moderate value at risk

LOW

Negligible direct impact
Extreme edge cases only
Admin-controlled risk (with multisig governance)
View function / event emission issues
Stranded non-value objects

Exchange Rate Finding Severity (MANDATORY)

CRITICAL: Before assigning severity to ANY finding affecting share/asset ratios or exchange rates, you MUST complete this quantitative analysis.

Required Quantitative Analysis

For findings affecting exchange rates, fill in this table:

Metric	Value	Source
Protocol TVL	[X SUI or USD]	Production or documented estimate
Attack cost	[Y]	Calculated from attack steps (gas, tokens, opportunity)
Attacker profit	[Z]	Calculated (extraction - cost)
Victim loss per user	[W]	Calculated per affected user
Affected user count	[N]	one / some / all
Profit ratio	[Z/Y]	Attacker profit / attack cost

Severity Calculation

Step 1: Calculate total impact = W * N (victim loss * affected users) Step 2: Calculate profitability = Z/Y (attacker profit / cost) Step 3: Apply severity matrix:

Total Impact	Profitability > 2x	Profitability 1-2x	Profitability < 1x
> $100,000	CRITICAL	HIGH	HIGH
$10,000 - $100,000	HIGH	HIGH	MEDIUM
$1,000 - $10,000	HIGH	MEDIUM	MEDIUM
< $1,000	MEDIUM	LOW	LOW

What NOT to Do

"This enables extraction" (qualitative, no numbers)
"Attacker can profit significantly" (undefined)
"Loss of funds possible" (unquantified)

What TO Do

"Attacker profits 500,000 SUI ($500,000) from 1,000 SUI ($1,000) investment"
"Each victim loses up to 2% of deposit value, affecting all pool users"
"Total extractable value: $500,000 with 500x profit ratio -> CRITICAL"

Design Flaw Severity Escalation

When a finding is classified as a "design flaw" rather than an exploit, apply this escalation check:

Criterion	YES/NO
Risk-free for the attacker (no capital at risk, or attacker profits even if partial)
Repeatable (can be executed on every occurrence of a triggering event)
Scales with protocol usage (impact grows with TVL, user count, or time)
No mitigation without code change (off-chain monitoring cannot prevent, only detect)

If ALL 4 criteria are YES: Severity floor = MEDIUM (cannot be rated LOW or Informational) If 3 of 4 criteria are YES: Recheck -- the remaining criterion may not actually block the attack at scale

Advanced Protocol Reference: See advanced.md for RAG queries, RAG confidence override, chain hypothesis protection, Sui-specific testing considerations, dual-perspective verification, realistic parameter validation, anti-downgrade guard, new observations, error trace output, and bidirectional role analysis.

Output Format

CONFIRMED

markdown

## Verdict: CONFIRMED

### Bug Mechanism Verified
{Explain what the test_scenario test proves in 2-3 sentences}

### Test Code
{Full Move test function}

### Test Output
{Relevant assertions and logged values from `sui move test`}

### Key Evidence
| Metric | Value |
|--------|-------|
| Before | {value} |
| After | {value} |
| Expected | {value} |
| Difference | {calculation} |

### Evidence Audit
| Claim | Evidence Source | Tag | Valid for REFUTED? |
|-------|-----------------|-----|-------------------|

### RAG Evidence
- **Attack Vectors Consulted**: [list]
- **Similar Exploits Found**: [count]
- **Historical Precedent**: [description]

### Severity: {LEVEL}
{Justification in 1-2 sentences}

FALSE_POSITIVE

markdown

## Verdict: FALSE_POSITIVE

### Attempts Made

**Attempt 1:**
- Approach: {description}
- Result: {what happened -- include abort codes}
- Learning: {insight}

**Attempt 2:**
- Approach: {description}
- Result: {what happened}
- Learning: {insight}

**Attempt 3:**
- Approach: {description}
- Result: {what happened}
- Learning: {insight}

### Evidence Audit
| Claim | Evidence Source | Tag | Valid for REFUTED? |
|-------|-----------------|-----|-------------------|

### Why It Is Not a Bug
{Explain the actual behavior and why hypothesis was wrong in 2-3 sentences}

### Error Trace
- **Failure Type**: {type}
- **Location**: {location}
- **Error Code**: {code}
- **State at Failure**: {state}
- **Investigation Question**: {question}

CONTESTED

markdown

## Verdict: CONTESTED

### Evidence Status
| Checkpoint | Status | Details |
|------------|--------|---------|
| External package behavior verified against PRODUCTION | YES/NO | {details} |
| All entry functions checked | YES/NO | {details} |
| Object ownership model verified | YES/NO | {details} |
| Shared object access control confirmed | YES/NO | {details} |

### Evidence Audit
| Claim | Evidence Source | Tag | Valid for REFUTED? |
|-------|-----------------|-----|-------------------|

### Why This Cannot Be REFUTED
{Explain what evidence is missing to definitively rule out the bug}

### Escalation Required
- [ ] Fetch published package source for {external dep}
- [ ] Dump production object state for {object}
- [ ] Check additional entry function paths: {list}

### Error Trace
- **Failure Type**: {type}
- **Location**: {location}
- **Error Code**: {code}
- **State at Failure**: {state}
- **Investigation Question**: {question}

Insufficient Evidence (HALT CONDITIONS)

Before marking REFUTED, check ALL boxes:

External package behavior verified against PRODUCTION (not mock)
Attack path checked on ALL public entry functions that access the same shared objects
Profit calculated with attacker HOLDING tokens (not just transferring in)
Missing precondition documented (type: STATE / ACCESS / TIMING / EXTERNAL / BALANCE)
Searched other findings for matching postconditions (chain analysis integration)
Object ownership verified in source (not assumed from naming)
Capability access control verified for ALL shared object mutation paths
Dynamic field access patterns verified (correct key types, no collisions)

Evidence That Does NOT Count

"Mock module shows X" -- mocks are not production behavior
"Standard Coin<T>" -- may be wrapped in custom module with hooks/restrictions
"Attacker loses by sending coins" -- may profit via position held in pool
"Function is public(package)" -- may be callable via CPI from another module in the same package
"Requires AdminCap" -- AdminCap may have store ability and be transferable
"Attacker cannot acquire X" -- another finding may CREATE this condition
"Object is owned by admin" -- ownership may be transferable if object has store

Maintainer

PlamenTSV Core maintainer

Source details

Full Name: PlamenTSV/plamen
Branch: main
Path in repo: agents/skills/sui/verification-protocol
License: MIT License
Topics: claude-code ai-agent security-audit solana ethereum solidity aptos smart-contract-audit sui web3-security

Featured Tools

Join Our Newsletter

215 33

Explore

PlamenTSV/plamen

dex-integration-security

215 33

Explore

Didn't find tool you were looking for?

Search AI Tools

Install this agent skill to your Project

SKILL.md

Verification Protocol (Sui Move)

Evidence Source Tracking (MANDATORY)

Evidence Source Tags

Evidence Audit Table (REQUIRED in every verification output)

Mock Rejection Rule

Pre-Verification Understanding

Question 1: What is the EXACT bug?

Question 2: What OBSERVABLE difference proves it?

Question 3: What is the EXACT assertion?

Pre-PoC Feasibility Gates (MANDATORY)

Gate F1: Reachability

Gate F2: Math Bounds

Test File Templates

Interpreting Results

Test PASSES -> Bug CONFIRMED

Test FAILS -> Check Why

Iteration Protocol

Severity Determination

CRITICAL

HIGH

MEDIUM

LOW

Exchange Rate Finding Severity (MANDATORY)

Required Quantitative Analysis

Severity Calculation

What NOT to Do

What TO Do

Design Flaw Severity Escalation

Output Format

CONFIRMED

FALSE_POSITIVE

CONTESTED

Insufficient Evidence (HALT CONDITIONS)

Evidence That Does NOT Count

Recommended Agent Skills

integration-hazard-research

outcome-determinism

governance-attack-vectors

vault-accounting

lending-protocol-security

dex-integration-security