Agent skill
methodology
Analyze captured HTTP traffic, design CLI architecture, and implement the Python CLI package. Covers Phase 2 of the pipeline: parse raw-traffic.json, identify protocol type, map endpoints, design Click command groups, implement with parallel subagents. TRIGGER when: "analyze traffic", "design CLI", "implement CLI", "build CLI from network traffic", "generate API wrapper", "reverse engineer web API", "start Phase 2", raw-traffic.json exists and capture is complete, or after the capture skill finishes. DO NOT trigger for: traffic recording (use capture), test writing (use testing), or quality checks (use standards).
Install this agent skill to your Project
npx add-skill https://github.com/ItamarZand88/CLI-Anything-WEB/tree/main/cli-anything-web-plugin/skills/methodology
SKILL.md
CLI-Anything-Web Methodology (Phase 2)
Analyze captured traffic, design the CLI command structure, and implement the complete Python CLI package. This skill owns the core transformation from raw HTTP traffic to a production-ready CLI.
Prerequisites (Hard Gate)
Do NOT start unless:
- raw-traffic.json exists (with WRITE operations, or read-only GET-only traffic)
- Auth state was captured during Phase 1 (if the site requires auth)
If raw-traffic.json is missing or has no WRITE operations, invoke the
capture skill first.
Exception for read-only sites: If the site is genuinely read-only (search engine,
dashboard, analytics viewer with no create/update/delete), the trace may contain only
GET requests. In this case, note "read-only site — no write operations" in <APP>.md
and proceed. The generated CLI will have read-only commands (list, get, search) but
no create/update/delete commands. This is valid.
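As a rough sketch of the gate described above, the following classifies a capture by the HTTP methods it contains. It assumes raw-traffic.json is a JSON array of request objects with a "method" field; that schema and the name classify_capture are illustrative assumptions, not the plugin's actual format.

```python
# Assumption: each captured request is a dict with a "method" key.
WRITE_METHODS = {"POST", "PUT", "PATCH", "DELETE"}


def classify_capture(entries):
    """Return 'empty', 'has-writes', or 'read-only' for a list of requests."""
    methods = {str(entry.get("method", "")).upper() for entry in entries}
    methods.discard("")  # ignore entries without a usable method
    if not methods:
        return "empty"
    if methods & WRITE_METHODS:
        return "has-writes"
    return "read-only"
```

A result of "read-only" would correspond to the exception above: confirm the site really has no write operations, note it in <APP>.md, and proceed. A result of "empty" means the capture skill should run first.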
No-auth sites: If the target site requires no authentication (public API,
no login needed), the "Auth state captured" prerequisite does not apply. Note
"no-auth site" in <APP>.md and proceed.
Step A: Analyze (API Discovery)
Goal: Map raw traffic to a structured API model.
Process:
- Read traffic-analysis.json first (if it exists alongside raw-traffic.json). This file is auto-generated by parse-trace.py or mitmproxy-capture.py → analyze-traffic.py and contains pre-detected protocol type, auth pattern, endpoint grouping, GraphQL operations, batchexecute RPC IDs, and suggested CLI commands. Use it as a starting point: verify its findings and fill in anything marked "unknown" by reading raw-traffic.json manually.

  Enhanced analysis (v1.3.0, when captured via mitmproxy-capture.py):
  - request_sequence: Timeline-ordered requests with auth flow detection (login → token → API calls)
  - session_lifecycle: Cookie inventory, auth cookie identification, session pattern (cookie_auth/token_refresh/no_session)
  - endpoint_sizes: Response body size classification per endpoint (small/medium/large) and total data transferred

  These fields are only present when mitmproxy-capture.py was used. If missing (has_timestamps: false), rely on manual analysis.

  If traffic-analysis.json doesn't exist, run the analyzer:

  ```bash
  python ${CLAUDE_PLUGIN_ROOT}/scripts/analyze-traffic.py \
      <app>/traffic-capture/raw-traffic.json --summary
  ```
- Parse raw-traffic.json (for details the analyzer couldn't extract)
- Group requests by base path (e.g., /api/v1/boards/, /api/v1/items/)
- For each endpoint group, identify:
  - HTTP method (GET/POST/PUT/DELETE/PATCH)
  - URL pattern (extract path parameters like :id)
  - Query parameters and their types
  - Request body schema (JSON fields, types, required/optional)
  - Response body schema
  - Authentication method (Bearer token, cookie, API key)
  - Rate limiting signals (429 responses, retry-after headers)
- Identify RPC protocol type -- classify the API transport:

  | Protocol | Detection Signal | Client Pattern |
  |---|---|---|
  | REST | Resource URLs (/api/v1/boards/:id), standard HTTP methods | client.py with method-per-endpoint |
  | GraphQL | Single /graphql endpoint, query/mutation in body | client.py with query templates |
  | gRPC-Web | application/grpc-web content type, binary payloads | Proto-based client |
  | Google batchexecute | batchexecute in URL, f.req= body, )]}'\n prefix | rpc/ subpackage (see references/google-batchexecute.md) |
  | Custom RPC | Single endpoint, method name in body, proprietary encoding | Custom codec module |
  | Public REST API | Documented /api/ endpoints, OpenAPI spec, JSON responses | Standard client.py with httpx |
  | Plain HTML (no framework) | No SPA root, no framework globals, data in <table>/<div> | client.py with httpx + BeautifulSoup4 |

  This determines client architecture in Step B -- REST uses simple client.py, non-REST protocols need a dedicated rpc/ subpackage with encoder/decoder/types.
- Detect data model:
  - Entity types (boards, items, users, projects...)
  - Relationships (board has many items, item belongs to board)
  - ID formats (UUID, numeric, slug)
- Detect auth pattern:
  - Cookie-based sessions
  - Bearer/JWT tokens
  - OAuth refresh flow
  - API key headers
  - Browser-delegated auth: tokens embedded in page JavaScript (e.g., WIZ_global_data), not in HTTP headers. Requires CDP for initial cookies, HTTP for token extraction. See references/auth-strategies.md "Browser-Delegated Auth" section.
  - No auth / public access: fully public API, no login required. CLI may optionally support API key auth for write operations (e.g., dev.to).
- Write <APP>.md -- software-specific SOP document
Output: <APP>.md with API map, data model, auth scheme.
References: traffic-patterns.md, google-batchexecute.md, ssr-patterns.md
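The grouping and path-parameter steps in Step A can be sketched roughly as follows. This is a minimal illustration, not the analyzer's actual logic: it assumes each captured request is a dict with "method" and "url" keys, treats numeric and UUID path segments as :id, and uses everything before the first :id as the group key. The function names are hypothetical.

```python
import re
from collections import defaultdict
from urllib.parse import urlsplit

# Heuristic: a canonical 8-4-4-4-12 hex string is treated as a UUID.
UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$", re.I
)


def normalize_path(path):
    """Replace numeric and UUID segments with the :id placeholder."""
    parts = []
    for segment in path.strip("/").split("/"):
        if segment.isdigit() or UUID_RE.match(segment):
            parts.append(":id")
        else:
            parts.append(segment)
    return "/" + "/".join(parts)


def group_endpoints(entries):
    """Map base path -> sorted list of (METHOD, url_pattern) tuples."""
    groups = defaultdict(set)
    for entry in entries:
        # urlsplit(...).path drops the query string and host.
        pattern = normalize_path(urlsplit(entry["url"]).path)
        base = pattern.split("/:id", 1)[0]  # group key stops at the first :id
        groups[base].add((entry["method"].upper(), pattern))
    return {base: sorted(ops) for base, ops in groups.items()}
```

Slug-style IDs are not detected by this heuristic, so patterns for slug-based sites would still need manual review against raw-traffic.json.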
Step B: Implement (Code Generation)
Study Existing CLIs First (Critical for Accuracy)
Before implementing, read an existing CLI that uses the same protocol as your target. These are battle-tested implementations that solved the same problems you'll face.
| Protocol | Reference CLI | Key files to read |
|---|---|---|
| Google batchexecute | notebooklm/agent-harness/cli_web/notebooklm/ | core/rpc/encoder.py, core/rpc/decoder.py, core/client.py, core/auth.py |
| GraphQL + WAF | booking/agent-harness/cli_web/booking/ | core/client.py (curl_cffi + GraphQL), core/auth.py (WAF tokens) |
| HTML scraping | futbin/agent-harness/cli_web/futbin/ | core/client.py (httpx + BS4), commands/players.py |
| HTML + Cloudflare | producthunt/agent-harness/cli_web/producthunt/ | core/client.py (curl_cffi impersonate) |
| REST API | unsplash/agent-harness/cli_web/unsplash/ | core/client.py, commands/photos.py |
| Simple HTML | gh-trending/agent-harness/cli_web/gh_trending/ | Minimal structure example |
How to use reference CLIs:
- Read the reference CLI's core/client.py -- understand the request/response pattern
- Read core/auth.py -- copy the login_browser() pattern exactly for Google apps
- Read core/rpc/ (for batchexecute) -- understand encoder/decoder, DO NOT reinvent
- Read commands/ -- see how Click commands are structured, how --json works
- Read utils/helpers.py -- see handle_errors(), _resolve_cli(), repl patterns
For batchexecute apps specifically, the notebooklm CLI is your bible:
- Copy the encoder/decoder architecture (don't reinvent the batchexecute wire format)
- Copy the auth token extraction pattern (CSRF, session ID, build label)
- Copy the cookie domain priority logic (critical for Israeli/international users)
- Adapt the RPC method IDs and param structures to your target app
The agent implementing the CLI MUST read these files before writing code. Use the
Agent tool to dispatch a research agent that reads
the reference implementation while you design the command structure.
Design Before You Code
Before writing any code, note the command structure in <APP>.md (10 minutes max):
- Map each API endpoint group to a Click command group:
  - /api/v1/boards/* → boards command group
  - /api/v1/items/* → items command group
- Map CRUD operations to subcommands (GET list → list, GET single → get, POST → create, PUT/PATCH → update, DELETE → delete)
- Note auth design: auth login, auth status, auth refresh; credentials at ~/.config/cli-web-<app>/auth.json
- Note REPL design: bare command enters REPL, branded banner via repl_skin.py
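The endpoint-to-command mapping above can be sketched as a small helper. This is only an illustration of the naming convention: the function name suggest_command and the ":id" placeholder convention for URL patterns are assumptions, and real APIs with nested or irregular routes will need manual adjustment.

```python
# CRUD mapping for non-GET methods, taken from the design rules above.
CRUD_SUBCOMMANDS = {
    "POST": "create",
    "PUT": "update",
    "PATCH": "update",
    "DELETE": "delete",
}


def suggest_command(method, pattern):
    """Return (command_group, subcommand) for an HTTP method and URL pattern."""
    segments = [s for s in pattern.strip("/").split("/") if s]
    # A trailing placeholder such as ":id" means the request targets one item.
    targets_single = bool(segments) and segments[-1].startswith(":")
    resources = [s for s in segments if not s.startswith(":")]
    group = resources[-1] if resources else "root"
    method = method.upper()
    if method == "GET":
        subcommand = "get" if targets_single else "list"
    else:
        # Unknown methods fall back to their lowercase name.
        subcommand = CRUD_SUBCOMMANDS.get(method, method.lower())
    return group, subcommand
```

For example, suggest_command("GET", "/api/v1/boards/:id") yields the boards group with a get subcommand, which is the shape to record in <APP>.md before writing any Click code.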
Goal: Generate the complete Python CLI package.
Package Structure
See HARNESS.md "Generated CLI Structure" for the complete package template.
Key points: cli_web/ namespace (NO __init__.py), <app>/ sub-package (HAS __init__.py),
core/, commands/, utils/, tests/ directories.
Step B.0: Scaffold Core Modules
Run the scaffold generator script to create all boilerplate files:
python ${CLAUDE_PLUGIN_ROOT}/scripts/scaffold-cli.py <app>/agent-harness \
--app-name <app> \
--protocol <rest|graphql|html-scraping|batchexecute> \
--http-client <httpx|curl_cffi> \
--auth-type <none|cookie|api-key|google-sso> \
--resources <comma-separated-resources> \
[--has-polling] [--has-context] [--has-partial-ids]
This generates exceptions.py, client.py skeleton, helpers.py, config.py, output.py, the CLI entry point with REPL, setup.py, conftest.py, repl_skin.py, and (for batchexecute) the rpc/ subpackage.
Fallback: If the script is unavailable, read ${CLAUDE_PLUGIN_ROOT}/skills/boilerplate/SKILL.md and follow its instructions to scaffold manually.
After scaffolding, review the generated files and customize client.py with actual
endpoint methods from <APP>.md.
Implementation Rules
- exceptions.py -- implement first. Required types: AppError (base), AuthError(recoverable), RateLimitError(retry_after), NetworkError, ServerError(status_code), NotFoundError. See references/exception-hierarchy-example.py for the complete template.
- client.py -- HTTP client with exception mapping and auth retry:
  - HTTP library choice:
    - httpx (default) -- for most sites (REST, GraphQL, batchexecute)
    - curl_cffi -- for Cloudflare-protected sites. Uses Chrome TLS fingerprint impersonation to bypass bot detection without cookies or auth:

      ```python
      from curl_cffi import requests as curl_requests
      resp = curl_requests.get(url, impersonate="chrome")
      ```

      Use curl_cffi when Phase 1 detects Cloudflare (cf-ray header, challenge page). Add curl_cffi, beautifulsoup4 to setup.py instead of httpx.
  - Centralized auth header/cookie injection
  - Automatic JSON parsing with response body verification
  - Status code → exception mapping: 401/403 → AuthError, 404 → NotFoundError, 429 → RateLimitError, 5xx → ServerError
  - Auth retry: On AuthError(recoverable=True), refresh tokens and retry once
  - Exponential backoff for rate limits (see references/polling-backoff-example.py)
  - For apps with 3+ resource types: split into namespaced sub-clients (client.notebooks.list(), client.sources.add())
  - See references/client-architecture-example.py for the full pattern
- auth.py -- handles token storage, refresh, expiry. Implementation depends on auth type:

  For no-auth sites: DO NOT create auth.py, session.py, or auth command groups. These files are dead code for public APIs and confuse users. The CLI should have NO auth-related files or commands. The only exception is if the site has optional auth (e.g., API key for write operations); in that case, implement a minimal auth module.

  For browser-delegated auth (Google, Microsoft, etc.): Full playwright-cli login flow with cookie domain priority for international users.

  See references/auth-strategies.md for all patterns (browser login, cookie priority, API key, env var, context commands). Store cookies at ~/.config/cli-web-<app>/auth.json with chmod 600.
- Anti-bot resilient client construction (when detected in Phase 2):
  - Extract session tokens via CDP first (cookies), then HTTP GET + HTML parsing (CSRF, session IDs)
  - Never hardcode build labels (bl), session IDs (f.sid), or CSRF tokens -- extract dynamically at runtime
  - Replicate same-origin headers captured during Phase 1 traffic (e.g., x-same-domain: 1 for Google apps)
  - Implement auto-retry on 401/403: re-fetch homepage -> re-extract tokens -> retry once
  - See references/google-batchexecute.md for the complete Google pattern
- RPC codec subpackage (for non-REST protocols like batchexecute): When the API uses a non-REST protocol, add core/rpc/ with:
  - types.py -- method ID enum, URL constants
  - encoder.py -- request encoding (protocol-specific format)
  - decoder.py -- response decoding (strip prefix, parse chunks, extract results)

  The client.py still exists but delegates encoding/decoding to rpc/.
- Progress feedback -- Use rich>=13.0 spinners for operations >2s (suppress in --json mode). See references/rich-output-example.py.
- JSON error output -- --json mode errors are JSON too, not plain text. Standard codes: AUTH_EXPIRED, RATE_LIMITED, NOT_FOUND, SERVER_ERROR, NETWORK_ERROR. Implement via utils/output.py json_error().
- All commands use handle_errors(json_mode) context manager -- centralizes error handling, exit codes (1=user, 2=system, 130=interrupt), and JSON errors. See references/helpers-module-example.py.
- Generation commands support --wait, --retry N, --output path -- for agent-scriptable end-to-end workflows. See references/polling-backoff-example.py.
- Windows UTF-8 fix -- Add at the top of <app>_cli.py before any imports that print:

  ```python
  import sys
  if sys.stdout.encoding and sys.stdout.encoding.lower() not in ("utf-8", "utf8"):
      try:
          sys.stdout.reconfigure(encoding="utf-8", errors="replace")
      except AttributeError:
          pass
  ```
- HTML table parsers MUST extract ALL visible columns -- not just name/price, because missing fields in --json output make the CLI useless for filtering and analysis. If the site shows version, club, nation, stats, skills, weak foot, parse all of them. Empty fields in --json output = incomplete parser.
- Entry point: cli-web-<app> via setup.py console_scripts
- Namespace: cli_web.*
- Copy repl_skin.py from plugin for consistent REPL experience
- utils/helpers.py -- shared CLI helpers (generate for every CLI):
  - resolve_partial_id(partial, items) -- prefix-match UUIDs for get/rename/delete
  - handle_errors(json_mode) -- context manager replacing try/except in all commands
  - require_notebook(notebook_arg) -- gets notebook ID from arg or persistent context
  - sanitize_filename(name) -- safe filenames from artifact titles
  - poll_until_complete(check_fn) -- exponential backoff polling
  - get_context_value(key) / set_context_value(key, value) -- persistent context.json

  See references/helpers-module-example.py for the complete module.

Not all helpers apply to every CLI. Include only what the CLI uses:
- handle_errors and print_json are always needed.
- resolve_partial_id only for UUID-based apps.
- require_notebook/context helpers only for apps with persistent context.
- poll_until_complete only for generation/async operations.
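The exception types, status-code mapping, and handle_errors rule above fit together roughly as in the sketch below. It is not the content of the referenced example files: the JSON error shape, the raise_for_status helper name, and the choice of which errors count as "system" (exit code 2) are assumptions made for illustration.

```python
import json
import sys
from contextlib import contextmanager


class AppError(Exception):
    code = "APP_ERROR"  # assumed fallback code, not in the standard list


class AuthError(AppError):
    code = "AUTH_EXPIRED"

    def __init__(self, message, recoverable=True):
        super().__init__(message)
        self.recoverable = recoverable


class RateLimitError(AppError):
    code = "RATE_LIMITED"

    def __init__(self, message, retry_after=None):
        super().__init__(message)
        self.retry_after = retry_after


class NetworkError(AppError):
    code = "NETWORK_ERROR"


class ServerError(AppError):
    code = "SERVER_ERROR"

    def __init__(self, message, status_code=500):
        super().__init__(message)
        self.status_code = status_code


class NotFoundError(AppError):
    code = "NOT_FOUND"


def raise_for_status(status_code, message="", retry_after=None):
    """Map an HTTP status code to the matching exception; no-op otherwise."""
    if status_code in (401, 403):
        raise AuthError(message or "authentication failed")
    if status_code == 404:
        raise NotFoundError(message or "not found")
    if status_code == 429:
        raise RateLimitError(message or "rate limited", retry_after=retry_after)
    if 500 <= status_code < 600:
        raise ServerError(message or "server error", status_code=status_code)


@contextmanager
def handle_errors(json_mode=False):
    """Convert AppError into printed output plus an exit code."""
    try:
        yield
    except KeyboardInterrupt:
        sys.exit(130)
    except AppError as exc:
        if json_mode:
            # Assumed JSON shape; errors stay machine-readable in --json mode.
            print(json.dumps({"error": True, "code": exc.code, "message": str(exc)}))
        else:
            print(f"Error: {exc}", file=sys.stderr)
        # Assumption: network/server failures are "system" errors (exit 2).
        sys.exit(2 if isinstance(exc, (NetworkError, ServerError)) else 1)
```

A command body would then wrap its client calls in `with handle_errors(json_mode):` so that no command needs its own try/except.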
REPL Implementation Rules (Critical)
These four bugs appear in almost every generated REPL. Get them right the first time:
1. Use shlex.split(), never line.split()
# ✓ Correct — handles quoted args: players search "messi" -> ['players', 'search', 'messi']
import shlex
args = shlex.split(line)
# ✗ Wrong — produces: ['players', 'search', '"messi"'] — quotes become part of the value
args = line.split()
2. Never pass **ctx.params to cli.main() in REPL dispatch
# ✓ Correct — preserve --json flag by prepending to args
repl_args = ["--json"] + args if ctx.obj.get("json") else args
cli.main(args=repl_args, standalone_mode=False)
# ✗ Wrong — ctx.params = {"json_mode": False} gets passed to Context.__init__()
# which doesn't accept it → TypeError: Context.__init__() got an unexpected
# keyword argument 'json_mode'
cli.main(args=args, standalone_mode=False, **ctx.params)
3. Keep _print_repl_help() in sync with the actual command surface
The _print_repl_help() function in <app>_cli.py is the user's first discovery surface — it's what they see when they type help in the REPL. It must mirror the real commands, including all key options. A REPL that shows outdated or incomplete help is confusing and makes the CLI feel broken.
# ✓ Correct — help lists actual options users can pass
def _print_repl_help():
_skin.info("Available commands:")
print(" players list [OPTIONS]")
print(" --position <GK|ST|CM|...> Filter by position")
print(" --rating-min N --rating-max N Rating range")
print(" --cheapest Sort cheapest first")
# ✗ Wrong — stale help doesn't mention new --position, --rating-min, etc.
def _print_repl_help():
print(" players list [--min-price N] List players with filters")
Rule: every time you add options to a command, update _print_repl_help() in the same commit.
4. Use @click.argument for positional REPL params, not @click.option("--x", required=True)
REPL commands show players search <query> in help. If query is a --query option,
users typing players search messi get "Error: Missing option '--query'".
Use positional arguments for natural command-line style:
# ✓ Correct — users type: players search messi OR players get 21610
@players.command()
@click.argument("query")
def search(query): ...
@players.command()
@click.argument("player_id", type=int)
def get(player_id): ...
# ✗ Wrong — users get an error unless they type: players search --query messi
@players.command()
@click.option("--query", required=True)
def search(query): ...
Rule of thumb: if a command takes a single required value that would be a positional arg
in a shell command (git checkout main, grep pattern), use @click.argument.
Use @click.option only for optional or named parameters (--rating-min, --platform).
Parallel Implementation (dispatch independent modules as subagents)
When the CLI has 3+ command groups (e.g., notebooks, sources, chat, artifacts), dispatch parallel subagents -- one per command module. Each agent gets:
- The <APP>.md API spec for its resource
- The client.py and auth.py interfaces it depends on
- Clear scope: "Implement commands/notebooks.py with list, get, create, delete"
Parallelization opportunities:
| Independent from each other | Dispatch in parallel |
|---|---|
| commands/notebooks.py, commands/sources.py, commands/chat.py | Yes -- each command file only depends on client.py |
| rpc/encoder.py and rpc/decoder.py | Yes -- encoder doesn't depend on decoder |
| auth.py and models.py | Yes -- no shared logic |
| client.py and commands/* | No -- commands depend on client |
| <app>_cli.py (entry point) | Last -- imports all commands, write after they're done |
Implementation order (with maximum parallelism):
Phase A (sequential): Write core foundation
exceptions.py → client.py → auth.py (if needed) → models.py
Phase B (parallel): Dispatch ALL independent work simultaneously
┌─ Agent 1: commands/notebooks.py
├─ Agent 2: commands/sources.py
├─ Agent 3: commands/chat.py
├─ Agent 4: commands/artifacts.py
├─ Agent 5: rpc/encoder.py + rpc/decoder.py (if non-REST)
└─ Agent 6 (background): test_core.py (unit tests for core modules)
All run concurrently — each only depends on Phase A modules
Phase C (sequential): Wire everything together
utils/helpers.py → <app>_cli.py → __main__.py → setup.py → copy repl_skin.py
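One way to reason about the phases above is as a dependency graph whose "ready" sets are the groups that can be dispatched together. The sketch below uses the standard-library graphlib module; the edges are a simplified reading of the table and phases (commands depend on client.py, the entry point depends on all commands), and the file name app_cli.py stands in for <app>_cli.py. It illustrates the ordering only and is not part of the plugin.

```python
from graphlib import TopologicalSorter


def build_waves(deps):
    """Group modules into waves; every module in a wave can be built in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # all modules whose deps are done
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Assumed dependency edges: module -> set of modules it depends on.
EXAMPLE_DEPS = {
    "client.py": {"exceptions.py"},
    "commands/notebooks.py": {"client.py"},
    "commands/sources.py": {"client.py"},
    "commands/chat.py": {"client.py"},
    "app_cli.py": {
        "commands/notebooks.py",
        "commands/sources.py",
        "commands/chat.py",
    },
}

if __name__ == "__main__":
    for index, wave in enumerate(build_waves(EXAMPLE_DEPS), start=1):
        print(f"wave {index}: {', '.join(wave)}")
```

With these edges the command modules land in a single wave, matching Phase B, and the entry point lands in the final wave, matching Phase C.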
Key parallelism rules:
- Dispatch independent command modules as parallel subagents (one per commands/*.py file)
- Start unit test writing as a background agent during command implementation
- Entry point (<app>_cli.py, setup.py) must come last (depends on all commands)
Mandatory Smoke Check (Before Testing Phase)
Before invoking testing, install (pip install -e .) and verify:
- cli-web-<app> --help loads
- cli-web-<app> auth status --json shows valid (if auth-required)
- cli-web-<app> <resource> list --json returns real data
- One WRITE command works (if applicable)
Red flags — fix before testing:
- wrb.fr, af.httprm in output → decoder broken
- [] or null where data expected → wrong params or client-side operation
- Wrong field values (e.g., "3" instead of prompt text) → parser index mismatch
- Null write response → may be client-side, see references/google-batchexecute.md "Client-Side Operations"
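Some of these red flags can be checked mechanically on the --json output of a list command. The sketch below is a hypothetical helper (find_red_flags is not part of the plugin) that covers the raw-marker and empty-result cases; treating non-JSON output as a flag is an added assumption, and wrong field values still need a human or agent to compare against the real site.

```python
import json

# Raw batchexecute envelope markers that should never reach CLI output.
RAW_MARKERS = ("wrb.fr", "af.httprm")


def find_red_flags(stdout):
    """Return a list of human-readable problems found in --json output."""
    flags = []
    for marker in RAW_MARKERS:
        if marker in stdout:
            flags.append(f"raw marker {marker!r} in output: decoder likely broken")
    try:
        data = json.loads(stdout)
    except json.JSONDecodeError:
        # Assumption: --json output must always parse as JSON.
        flags.append("output is not valid JSON")
        return flags
    if data is None or data == []:
        flags.append("empty result: wrong params or client-side operation")
    return flags
```

An empty list from find_red_flags does not prove the CLI is correct; it only rules out the failures that are detectable without knowing the expected data.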
Update phase state:
python ${CLAUDE_PLUGIN_ROOT}/scripts/phase-state.py complete <app> \
--phase methodology --output <app>/agent-harness/
Next Step
When implementation is complete and the smoke check passes, invoke the testing
skill to plan and write tests.
Do NOT skip testing -- every CLI must have comprehensive tests before publishing.
Companion Skills
| Skill | When it activates |
|---|---|
| capture | Phase 1 -- traffic recording (prerequisite for this skill) |
| testing | Phase 3 -- test writing, documentation |
| standards | Phase 4 -- publish, verify, smoke test |
Integration
| Relationship | Skill |
|---|---|
| Preceded by | capture (Phase 1) |
| Followed by | testing (Phase 3) |
| References | traffic-patterns.md, auth-strategies.md, google-batchexecute.md, ssr-patterns.md, exception-hierarchy-example.py, client-architecture-example.py, polling-backoff-example.py, rich-output-example.py |
Reference Files
- references/traffic-patterns.md -- Common API patterns (REST, GraphQL, RPC)
- references/auth-strategies.md -- Auth implementation strategies
- references/google-batchexecute.md -- Google batchexecute RPC protocol spec
- references/ssr-patterns.md -- SSR framework patterns and data extraction strategies
- references/exception-hierarchy-example.py -- Complete exception hierarchy with HTTP status mapping
- references/client-architecture-example.py -- Namespaced sub-client pattern with auth retry
- references/polling-backoff-example.py -- Exponential backoff polling and rate-limit retry
- references/rich-output-example.py -- Rich progress bars, JSON error responses, table formatting