Agent skill
capture
Capture HTTP traffic from web apps using playwright-cli. Includes site fingerprinting (framework detection, protection checks, iframe detection, auth detection, API discovery) and full traffic recording with tracing and optional HAR output. TRIGGER when: "record traffic from", "capture API calls from", "start Phase 1 for", "analyze traffic from URL", "assess site", "site fingerprint", "start capture for", "open browser for", or any URL is given as the first step of CLI generation. DO NOT trigger for: Phase 2 implementation, test writing, or quality validation.
Install this agent skill to your Project
npx add-skill https://github.com/ItamarZand88/CLI-Anything-WEB/tree/main/cli-anything-web-plugin/skills/capture
SKILL.md
Traffic Capture (Phase 1)
Assess the site, then capture comprehensive HTTP traffic. This skill combines site assessment with full traffic recording in a single browser session.
CRITICAL EXECUTION RULES
NEVER use
run_in_background: truefor ANY playwright-cli command. All playwright-cli commands must run in the foreground with appropriate timeouts. Background execution causes task ID tracking failures — the command completes before you can read the output. Seereferences/playwright-cli-commands.mdfor the timeout table.
NEVER use
evalfor complex expressions.evalfails silently on ternaries, comma operators, and multi-branch logic with "not well-serializable" errors. Userun-codeinstead. Seereferences/framework-detection.mdfor details.
ESM context — no
require().run-codeuses ESM. Useawait import('fs')instead ofrequire('fs'). Seereferences/playwright-cli-commands.md.
Prerequisites (Hard Gate)
Do NOT start unless:
- playwright-cli is available (
npx @playwright/cli@latest --version) - Target URL is known
Default capture method: playwright-cli tracing (standard workflow below).
Optional --mitmproxy mode: If the user passed --mitmproxy flag to /cli-anything-web, use mitmproxy-capture.py instead — it provides no body truncation, real-time noise filtering, deduplication, and enhanced metadata (timestamps, cookies, body sizes). Requires pip install mitmproxy (Python 3.12+):
python ${CLAUDE_PLUGIN_ROOT}/scripts/mitmproxy-capture.py start-proxy --port 8080
npx @playwright/cli@latest open <url> --config=.playwright/cli.proxy.config.json --headed
# ... browse the site as normal (snapshot, click, fill, goto) ...
npx @playwright/cli@latest -s=<app> close
python ${CLAUDE_PLUGIN_ROOT}/scripts/mitmproxy-capture.py stop-proxy --port 8080 -o <app>/traffic-capture/raw-traffic.json
If playwright-cli fails, fall back to chrome-devtools-mcp (see HARNESS.md Tool Hierarchy).
Public API Shortcut
If the target site has a documented public REST/JSON API (e.g., Hacker News Firebase API, Dev.to API, Reddit API, Wikipedia API), browser capture is optional:
- Probe the API endpoints directly with
httpxorcurl - Save responses as
<app>/traffic-capture/raw-traffic.json - Skip to Phase 2 (methodology)
This applies when:
- API docs exist (OpenAPI/Swagger, developer docs page,
/api/prefix) - The API is publicly accessible without browser-specific auth
- Endpoints return JSON (not HTML)
If unsure whether a public API exists, proceed with browser capture as normal.
Resume from Checkpoint
Before starting, check if a previous capture session exists:
python ${CLAUDE_PLUGIN_ROOT}/scripts/capture-checkpoint.py restore <app>
If a checkpoint exists, read the guidance field and resume from the last
completed step instead of starting over. This prevents duplicate work when
sessions are interrupted.
Step 1: Setup
# Create output directory
mkdir -p <app>/traffic-capture
# Clear any stale sessions
npx @playwright/cli@latest kill-all 2>/dev/null || true
npx @playwright/cli@latest -s=<app> open <url> --headed --persistent
# Note: heavy SPAs (Next.js, React) may show "TimeoutError: page._snapshotForAI" on open.
# This is non-fatal — verify with: npx @playwright/cli@latest list
#
# IMPORTANT — "Browser opened with pid..." in command output means the daemon
# RE-ATTACHED to the existing browser, NOT that a new session was created.
# Do NOT re-navigate or restart when you see this. The session is still open.
If
--mitmproxymode: Replace theopencommand above with:bashpython ${CLAUDE_PLUGIN_ROOT}/scripts/mitmproxy-capture.py start-proxy --port 8080 npx @playwright/cli@latest -s=<app> open <url> --config=.playwright/cli.proxy.config.json --headedThis starts the proxy first, then opens the browser routed through it. All subsequent
snapshot,click,fill,gotocommands work exactly the same.
Do NOT ask the user to log in yet — Step 2 will determine if auth is needed.
Step 2: Site Fingerprint (Single Command)
Run the all-in-one site fingerprint command instead of individual eval calls. This is faster, more reliable, and detects framework + protection + iframes + auth requirements in one shot.
Use the script file — multi-line JS with arrow functions and optional chaining fails in playwright-cli's single-line command parser. The script file approach has been tested and works reliably:
npx @playwright/cli@latest -s=<app> run-code "$(grep -v '^\s*//' ${CLAUDE_PLUGIN_ROOT}/scripts/site-fingerprint.js | tr '\n' ' ')"
IMPORTANT: The
site-fingerprint.jsscript must be loaded via the command above. Do NOT copy-paste the JS inline — it will fail with SyntaxError. Thegrep -vstrips comments andtrjoins lines for single-line execution.
### Interpret fingerprint results
**Framework:**
- `googleBatch: true` → Google batchexecute RPC protocol. Generate `rpc/` subpackage.
- `nextPages: true` → Next.js Pages Router. Extract `__NEXT_DATA__` + trace `/_next/data/` fetches.
- `nextApp: true` → Next.js App Router. Trace client navigations for RSC payloads.
- `nuxt: true` → Nuxt. Extract `__NUXT__` + trace API calls.
- No framework flags → likely SSR HTML or custom SPA. Check for REST API in probe.
**Protection:**
- `cloudflare: true` → Use `curl_cffi` with `impersonate='chrome'` in generated CLI.
- `awsWaf: true` → Need WAF token cookie via browser. Use curl_cffi for API calls.
- `captcha: true` → Add pause-and-prompt to auth flow.
- `serviceWorker: true` → Site has an active Service Worker that may intercept requests
and hide them from traces. Note in assessment.md. Generated CLI's auth.py should use
`service_workers="block"` in browser context. See `references/protection-detection.md`.
**Iframes:**
- `iframeCount > 0` → App is iframe-embedded. **Re-run detection inside the iframe:**
```bash
npx @playwright/cli@latest -s=<app> run-code "async page => {
const frame = page.frames()[1];
if (!frame) return { error: 'no iframe found' };
return await frame.evaluate(() => ({
framework: {
nextPages: !!document.getElementById('__NEXT_DATA__'),
googleBatch: typeof WIZ_global_data !== 'undefined',
spaRoot: document.querySelector('#app, #root')?.id || null,
vite: !!document.querySelector('script[type=\"module\"][src*=\"/@vite\"]') || !!document.querySelector('script[type=\"module\"][src*=\"/src/\"]')
},
title: document.title,
bodyPreview: document.body?.textContent?.substring(0, 300) || ''
}));
}"
Common iframe pattern: Google Labs apps (Stitch, MusicFX, ImageFX) embed a
Vite/React SPA in an iframe. Parent has WIZ_global_data, iframe has the real app.
See references/playwright-cli-advanced.md for iframe interaction patterns.
Note: snapshot and click <ref> auto-resolve iframes. Only use run-code
for iframe interaction when built-in commands fail.
Auth detection (BEFORE exploration)
Check the fingerprint auth fields:
| Condition | Meaning | Action |
|---|---|---|
hasLoginButton && !hasUserMenu |
Login required, not logged in | Ask user to log in NOW |
hasUserMenu |
Already logged in | Proceed to capture |
!hasLoginButton && !hasUserMenu |
No auth needed (public site) | Skip auth, proceed |
If auth is needed:
- Tell the user: "This site requires login. Please log in in the browser window."
- Wait for user confirmation
- Save auth state:
npx @playwright/cli@latest -s=<app> state-save <app>/traffic-capture/<app>-auth.json
If NO auth is needed: Skip directly to Step 2b.
2b. Classify Site Profile
Based on fingerprint results AND what you see in the UI, classify the site:
| Profile | Auth? | Operations | Exploration Focus |
|---|---|---|---|
| Auth + CRUD | Yes | Create, Read, Update, Delete | Full CRUD per resource |
| Auth + Generation | Yes | Generate, Poll, Download | Generation lifecycle + projects |
| Auth + Read-only | Yes | Read, Search, Export | Read operations + auth flow |
| No-auth + CRUD | No/Optional | Full CRUD | Skip auth, full CRUD |
| No-auth + Read-only | No | Read, Search | Minimal capture |
2c. Quick API Probe (Force SPA Navigation Trick)
Start a SHORT trace, click 3-4 internal links, stop. This reveals hidden API endpoints that SSR hides on initial page load.
npx @playwright/cli@latest -s=<app> tracing-start
npx @playwright/cli@latest -s=<app> click <internal-link-1>
npx @playwright/cli@latest -s=<app> click <internal-link-2>
npx @playwright/cli@latest -s=<app> click <internal-link-3>
npx @playwright/cli@latest -s=<app> tracing-stop
# Quick parse to see what endpoints appeared
python ${CLAUDE_PLUGIN_ROOT}/scripts/parse-trace.py .playwright-cli/traces/ --latest --output /tmp/probe.json
Check the probe results — what API patterns did you find?
See references/api-discovery.md for the priority chain and decision tree.
2d. Write Assessment Summary
Create <app>/traffic-capture/assessment.md to consolidate all findings:
# Site Assessment: <app>
- **URL**: <url>
- **Framework**: <detected framework or "none/custom">
- **Protocol**: <REST / GraphQL / batchexecute / HTML scraping / hybrid>
- **Protection**: <none / cloudflare / captcha / aws-waf / etc.>
- **Auth required**: <yes (type: Google SSO / cookie / JWT / API key) / no>
- **Iframes**: <yes (N frames, app in frame N at <url>) / no>
- **Site profile**: <Auth+CRUD / Auth+Generation / Auth+Read-only / No-auth+CRUD / No-auth+Read-only>
- **Capture strategy**: <API-first / SSR+API hybrid / batchexecute / HTML scraping / protected-manual>
- **Key observations**: <any quirks, localized UI, rate limits, special patterns>
Step 3: Full Traffic Capture
Now do the comprehensive capture based on what Step 2 revealed.
# Optional: Start HAR recording alongside trace for standard-format capture
# HAR files enable mitmproxy2swagger (auto OpenAPI spec) and third-party analysis tools
npx @playwright/cli@latest -s=<app> run-code "async page => {
await page.context().routeFromHAR('<app>/traffic-capture/capture.har', {
update: true,
updateContent: 'embed',
updateMode: 'full'
});
return 'HAR recording started';
}"
# Start fresh trace for full capture (note the trace ID from output!)
npx @playwright/cli@latest -s=<app> tracing-start
# Output: "trace-<ID>" — record this ID
If
--mitmproxymode: Skiptracing-startand HAR recording above. mitmproxy is already capturing all traffic since Step 1 — just proceed to the exploration below. Every click, navigation, and form submission is automatically recorded by the proxy.
HAR recording is optional but recommended. It produces a standard HAR file alongside the trace. This enables
mitmproxy2swaggerto auto-generate an OpenAPI spec:pip install mitmproxy2swagger && mitmproxy2swagger -i capture.har -o api-spec.yaml -p <base-url>The HAR file is saved when the browser context is closed (Step 5).
Exploration by site profile
Use the profile-specific checklist from Step 2b:
Auth + CRUD:
For EACH resource visible in the UI:
- [ ] List/browse: navigate to list view
- [ ] Detail: open one item
- [ ] Create: fill form, submit (capture POST body!)
- [ ] Update: edit an item, save
- [ ] Delete: delete a test item
- [ ] Settings/profile: check app settings
- [ ] Export: if available, trigger export/download
Auth + Generation:
- [ ] Dashboard/projects: navigate to project list
- [ ] Open existing project: view editor/canvas
- [ ] Generate new content: type prompt, click generate, WAIT for completion
- [ ] Edit/iterate: modify generation, re-generate
- [ ] Export/download: trigger download of generated content
- [ ] Delete: delete a test project
- [ ] Settings: check model selection, preferences
Auth + Read-only:
- [ ] Main view: navigate to primary content
- [ ] Search/filter: use search functionality
- [ ] Detail pages: open 2-3 different items
- [ ] Pagination: go to page 2 if available
- [ ] Export: if available
No-auth + CRUD:
Same as Auth + CRUD, but skip auth-related captures.
No-auth + Read-only:
- [ ] Homepage: capture initial data
- [ ] Search: try 2-3 different queries
- [ ] Detail pages: open 2-3 items
- [ ] Filters: apply different filters
- [ ] Pagination: check next page
General interaction rules
- Click by ref (from snapshot) is most reliable:
snapshot→ note ref →click <ref> - Refs go stale — always take a fresh snapshot before clicking
- For localized UIs (Hebrew, Arabic, etc.) — use refs or data-testid, not text
- For iframe-embedded apps —
snapshot+click <ref>auto-resolves iframes - Wait after generation — if the app generates content async, wait:
bash
npx @playwright/cli@latest -s=<app> run-code "async page => { await page.waitForTimeout(15000); return 'waited'; }" - The trace MUST contain at least one WRITE operation (POST/PUT/DELETE) unless the site is genuinely read-only (see exception below)
Exception for read-only sites: If the site is genuinely read-only, the trace
may contain only GET requests. Note "read-only site" in assessment.md and proceed.
Step 4: Stop, Save, Parse
npx @playwright/cli@latest -s=<app> tracing-stop
If tracing-stop fails:
- Retry once with 15s timeout
- If it fails again — the trace is lost. Start a new trace (Step 3).
- NEVER retry more than twice. See
references/playwright-cli-tracing.mdfor recovery.
python ${CLAUDE_PLUGIN_ROOT}/scripts/parse-trace.py \
.playwright-cli/traces/ --latest \
--output <app>/traffic-capture/raw-traffic.json
# parse-trace.py now auto-runs analyze-traffic.py and produces:
# - <app>/traffic-capture/raw-traffic.json (raw request/response data)
# - <app>/traffic-capture/traffic-analysis.json (auto-detected protocol, auth, endpoints)
#
# The analysis output shows: protocol type, auth pattern, endpoint groups,
# GraphQL operations, batchexecute RPC IDs, and suggested CLI commands.
# Review the analysis — anything marked "unknown" needs manual investigation.
# You can also run the analyzer separately for more detail:
python ${CLAUDE_PLUGIN_ROOT}/scripts/analyze-traffic.py \
<app>/traffic-capture/raw-traffic.json --summary
If
--mitmproxymode: Replace everything above with:bash# Stop the proxy and save captured traffic (includes auto-analysis) python ${CLAUDE_PLUGIN_ROOT}/scripts/mitmproxy-capture.py stop-proxy \ --port 8080 -o <app>/traffic-capture/raw-traffic.json # The stop-proxy command writes raw-traffic.json directly. # Then run the analyzer for the full report: python ${CLAUDE_PLUGIN_ROOT}/scripts/analyze-traffic.py \ <app>/traffic-capture/raw-traffic.json --summaryNo
tracing-stoporparse-trace.pyneeded — mitmproxy already has the data. The analysis will include enhanced fields (request_sequence, session_lifecycle, endpoint_sizes) that are only available with mitmproxy capture.
Step 5: Close
npx @playwright/cli@latest -s=<app> close
# Mark capture complete
python ${CLAUDE_PLUGIN_ROOT}/scripts/capture-checkpoint.py update <app> --step complete
If an endpoint is missing — USE THE FEATURE
Don't grep JS bundles. Start a new trace → screenshot → click the button → fill → submit → stop → parse. The browser IS the API documentation.
Fallback
Fallback: If playwright-cli is not available, see HARNESS.md Tool Hierarchy for chrome-devtools-mcp fallback instructions.
Next Step
When capture is complete (raw-traffic.json has WRITE operations, or the site is
read-only with only GET requests), invoke methodology to analyze the traffic
and build the CLI.
References
See references/ for: command syntax (playwright-cli-commands.md), tracing (playwright-cli-tracing.md),
sessions (playwright-cli-sessions.md), advanced patterns (playwright-cli-advanced.md),
framework detection (framework-detection.md), protection (protection-detection.md),
API discovery (api-discovery.md).
Recommended Agent Skills
Expand your agent's capabilities with these related and highly-rated skills.
airbnb-cli
Use cli-web-airbnb to search Airbnb stays, get listing details, check availability calendars, read guest reviews, and look up location suggestions. Invoke this skill whenever the user asks about Airbnb accommodations, vacation rentals, listing prices, availability, guest reviews, or wants to search for places to stay. Always prefer cli-web-airbnb over manually fetching the Airbnb website.
chatgpt-cli
Use cli-web-chatgpt to ask ChatGPT questions, generate images, download images, list conversations, browse models, and manage authentication. Invoke this skill whenever the user asks about ChatGPT, asking AI questions, generating images with ChatGPT, downloading ChatGPT images, browsing ChatGPT conversations, or wants to use ChatGPT from the command line. Always prefer cli-web-chatgpt over manually browsing chatgpt.com.
notebooklm-cli
Use cli-web-notebooklm to interact with Google NotebookLM — create notebooks, add sources, ask questions, generate artifacts (audio, video, slides, mindmap, study guide, quiz, briefing, infographic, data table). Invoke this skill whenever the user asks about NotebookLM, wants to create notebooks, add sources to a notebook, ask a notebook questions, generate study materials, create presentations, podcasts, or manage NotebookLM content programmatically. Always prefer cli-web-notebooklm over manually browsing NotebookLM.
unsplash-cli
Use cli-web-unsplash to answer questions about Unsplash photos, search for free images by keyword, download photos, browse photo topics and collections, view photographer profiles, get photo details (EXIF, location, tags), and discover random photos. Invoke this skill whenever the user asks about Unsplash, free stock photos, searching for images, downloading images, photo topics, photographer profiles, photo collections, or wants to find or download images by keyword, orientation, or color. Always prefer cli-web-unsplash over manually fetching the Unsplash website.
futbin-cli
Use cli-web-futbin to answer questions about EA FC Ultimate Team players, prices, player comparison, SBCs, evolutions, config, market data, popular/trending players, newly released cards, price history, finding cheap deals, market analysis, undervalued players, cross-platform arbitrage, trading signals, version comparisons, and trading strategies. Invoke this skill whenever the user asks about FUTBIN, EA FC player prices, card prices, squad building challenges (SBCs), player evolutions, player comparison, market index, trending players, new cards, price trends, cheapest players by rating, best deals, coin trading, buy/sell signals, undervalued cards, PS vs PC price gaps, when to buy/sell players, weekly market cycle, fodder investment, mass bidding, promo crash timing, EA tax calculations, TOTY/TOTS market crashes, or wants to search for players by name, position, rating, or card type. Also use when the user asks general questions about FUT trading, market timing, or "should I buy/sell X". Always prefer cli-web-futbin over manually fetching the FUTBIN website. Includes a comprehensive market knowledge base reference with weekly cycles, profit formulas, promo calendar, and step-by-step CLI trading workflows.
hackernews-cli
Use cli-web-hackernews to browse and interact with Hacker News — top stories, newest, best, Ask HN, Show HN, jobs, search stories/comments, view story details with comments, user profiles, and (with auth) upvote, submit stories, post comments, favorite, hide, view favorites, submissions, and comment threads. Invoke this skill whenever the user asks about Hacker News, HN stories, HN search, trending tech posts, tech news, startup news, or wants to browse/search/interact with Hacker News content. Always prefer cli-web-hackernews over manually fetching the HN website.
Didn't find tool you were looking for?