Agent skill
agent-browser
Browser automation CLI for AI agents. Use when the user needs to interact with websites, including navigating pages, filling forms, clicking buttons, taking screenshots, extracting data, testing web apps, or automating any browser task. Triggers include requests to "open a website", "fill out a form", "click a button", "take a screenshot", "scrape data from a page", "test this web app", "login to a site", "automate browser actions", or any task requiring programmatic web interaction.
Install this agent skill to your Project
npx add-skill https://github.com/exceptionless/Exceptionless/tree/main/src/Exceptionless.Web/ClientApp/.agents/skills/agent-browser
SKILL.md
Browser Automation with agent-browser
Core Workflow
Every browser automation follows this pattern:
- Navigate:
agent-browser open <url> - Snapshot:
agent-browser snapshot -i(get element refs like@e1,@e2) - Interact: Use refs to click, fill, select
- Re-snapshot: After navigation or DOM changes, get fresh refs
agent-browser open https://example.com/form
agent-browser snapshot -i
# Output: @e1 [input type="email"], @e2 [input type="password"], @e3 [button] "Submit"
agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait --load networkidle
agent-browser snapshot -i # Check result
Essential Commands
# Navigation
agent-browser open <url> # Navigate (aliases: goto, navigate)
agent-browser close # Close browser
# Snapshot
agent-browser snapshot -i # Interactive elements with refs (recommended)
agent-browser snapshot -s "#selector" # Scope to CSS selector
# Interaction (use @refs from snapshot)
agent-browser click @e1 # Click element
agent-browser fill @e2 "text" # Clear and type text
agent-browser type @e2 "text" # Type without clearing
agent-browser select @e1 "option" # Select dropdown option
agent-browser check @e1 # Check checkbox
agent-browser press Enter # Press key
agent-browser scroll down 500 # Scroll page
# Get information
agent-browser get text @e1 # Get element text
agent-browser get url # Get current URL
agent-browser get title # Get page title
# Wait
agent-browser wait @e1 # Wait for element
agent-browser wait --load networkidle # Wait for network idle
agent-browser wait --url "**/page" # Wait for URL pattern
agent-browser wait 2000 # Wait milliseconds
# Capture
agent-browser screenshot # Screenshot to temp dir
agent-browser screenshot --full # Full page screenshot
agent-browser pdf output.pdf # Save as PDF
Common Patterns
Form Submission
agent-browser open https://example.com/signup
agent-browser snapshot -i
agent-browser fill @e1 "Jane Doe"
agent-browser fill @e2 "jane@example.com"
agent-browser select @e3 "California"
agent-browser check @e4
agent-browser click @e5
agent-browser wait --load networkidle
Authentication with State Persistence
# Login once and save state
agent-browser open https://app.example.com/login
agent-browser snapshot -i
agent-browser fill @e1 "$USERNAME"
agent-browser fill @e2 "$PASSWORD"
agent-browser click @e3
agent-browser wait --url "**/dashboard"
agent-browser state save auth.json
# Reuse in future sessions
agent-browser state load auth.json
agent-browser open https://app.example.com/dashboard
Data Extraction
agent-browser open https://example.com/products
agent-browser snapshot -i
agent-browser get text @e5 # Get specific element text
agent-browser get text body > page.txt # Get all page text
# JSON output for parsing
agent-browser snapshot -i --json
agent-browser get text @e1 --json
Parallel Sessions
agent-browser --session site1 open https://site-a.com
agent-browser --session site2 open https://site-b.com
agent-browser --session site1 snapshot -i
agent-browser --session site2 snapshot -i
agent-browser session list
Visual Browser (Debugging)
agent-browser --headed open https://example.com
agent-browser highlight @e1 # Highlight element
agent-browser record start demo.webm # Record session
Ref Lifecycle (Important)
Refs (@e1, @e2, etc.) are invalidated when the page changes. Always re-snapshot after:
- Clicking links or buttons that navigate
- Form submissions
- Dynamic content loading (dropdowns, modals)
agent-browser click @e5 # Navigates to new page
agent-browser snapshot -i # MUST re-snapshot
agent-browser click @e1 # Use new refs
Semantic Locators (Alternative to Refs)
When refs are unavailable or unreliable, use semantic locators:
agent-browser find text "Sign In" click
agent-browser find label "Email" fill "user@test.com"
agent-browser find role button click --name "Submit"
agent-browser find placeholder "Search" type "query"
agent-browser find testid "submit-btn" click
Deep-Dive Documentation
| Reference | When to Use |
|---|---|
| references/commands.md | Full command reference with all options |
| references/snapshot-refs.md | Ref lifecycle, invalidation rules, troubleshooting |
| references/session-management.md | Parallel sessions, state persistence, concurrent scraping |
| references/authentication.md | Login flows, OAuth, 2FA handling, state reuse |
| references/video-recording.md | Recording workflows for debugging and documentation |
| references/proxy-support.md | Proxy configuration, geo-testing, rotating proxies |
Ready-to-Use Templates
| Template | Description |
|---|---|
| templates/form-automation.sh | Form filling with validation |
| templates/authenticated-session.sh | Login once, reuse state |
| templates/capture-workflow.sh | Content extraction with screenshots |
./templates/form-automation.sh https://example.com/form
./templates/authenticated-session.sh https://app.example.com/login
./templates/capture-workflow.sh https://example.com ./output
Recommended Agent Skills
Expand your agent's capabilities with these related and highly-rated skills.
foundatio-repositories
releasenotes
Generate formatted changelogs from git history since the last release tag. Use when preparing release notes that categorize changes into breaking changes, features, fixes, and other sections.
e2e-testing
Use this skill when writing or running end-to-end browser tests with Playwright. Covers Page Object Model patterns, selector strategies (data-testid, getByRole, getByLabel), fixtures, and accessibility audits with axe-playwright. Apply when adding E2E test coverage, debugging flaky tests, or testing user flows through the browser.
tanstack-query
Use this skill when fetching data, managing server state, or handling API mutations in the Svelte frontend. Covers createQuery, createMutation, query keys, cache invalidation, optimistic updates, and WebSocket-driven refetching. Apply when adding API calls, managing loading/error states, or coordinating cache updates after mutations.
dogfood
Systematically explore and test a web application to find bugs, UX issues, and other problems. Use when asked to "dogfood", "QA", "exploratory test", "find issues", "bug hunt", "test this app/site/platform", or review the quality of a web application. Produces a structured report with full reproduction evidence -- step-by-step screenshots, repro videos, and detailed repro steps for every issue -- so findings can be handed directly to the responsible teams.
storybook
Use this skill when creating or updating Storybook stories for Svelte components. Covers Svelte CSF story format, defineMeta, argTypes, snippet-based customization, and autodocs. Apply when adding visual documentation for components, setting up story files, or running Storybook for development.
Didn't find tool you were looking for?