Agent skill

core-agent-browser

Stars 941
Forks 87

Install this agent skill to your Project

npx add-skill https://github.com/actionbook/rust-skills/tree/main/skills/core-agent-browser

SKILL.md

Browser Automation with agent-browser

Priority Note

For fetching Rust/crate information, use this priority order:

  1. rust-learner skill - Orchestrates actionbook + browser-fetcher
  2. actionbook MCP - Pre-computed selectors for known sites
  3. agent-browser CLI - Direct browser automation (last resort)

Use agent-browser directly only when:

  • actionbook has no pre-computed selectors for the target site
  • You need interactive browser testing/automation
  • You need screenshots or form filling

Quick start

bash
agent-browser open <url>        # Navigate to page
agent-browser snapshot -i       # Get interactive elements with refs
agent-browser click @e1         # Click element by ref
agent-browser fill @e2 "text"   # Fill input by ref
agent-browser close             # Close browser

Core workflow

  1. Navigate: agent-browser open <url>
  2. Snapshot: agent-browser snapshot -i (returns elements with refs like @e1, @e2)
  3. Interact using refs from the snapshot
  4. Re-snapshot after navigation or significant DOM changes

Commands

Navigation

bash
agent-browser open <url>      # Navigate to URL
agent-browser back            # Go back
agent-browser forward         # Go forward
agent-browser reload          # Reload page
agent-browser close           # Close browser

Snapshot (page analysis)

bash
agent-browser snapshot        # Full accessibility tree
agent-browser snapshot -i     # Interactive elements only (recommended)
agent-browser snapshot -c     # Compact output
agent-browser snapshot -d 3   # Limit depth to 3

Interactions (use @refs from snapshot)

bash
agent-browser click @e1           # Click
agent-browser dblclick @e1        # Double-click
agent-browser fill @e2 "text"     # Clear and type
agent-browser type @e2 "text"     # Type without clearing
agent-browser press Enter         # Press key
agent-browser press Control+a     # Key combination
agent-browser hover @e1           # Hover
agent-browser check @e1           # Check checkbox
agent-browser uncheck @e1         # Uncheck checkbox
agent-browser select @e1 "value"  # Select dropdown
agent-browser scroll down 500     # Scroll page
agent-browser scrollintoview @e1  # Scroll element into view

Get information

bash
agent-browser get text @e1        # Get element text
agent-browser get value @e1       # Get input value
agent-browser get title           # Get page title
agent-browser get url             # Get current URL

Screenshots

bash
agent-browser screenshot          # Screenshot to stdout
agent-browser screenshot path.png # Save to file
agent-browser screenshot --full   # Full page

Wait

bash
agent-browser wait @e1                     # Wait for element
agent-browser wait 2000                    # Wait milliseconds
agent-browser wait --text "Success"        # Wait for text
agent-browser wait --load networkidle      # Wait for network idle

Semantic locators (alternative to refs)

bash
agent-browser find role button click --name "Submit"
agent-browser find text "Sign In" click
agent-browser find label "Email" fill "user@test.com"

Example: Form submission

bash
agent-browser open https://example.com/form
agent-browser snapshot -i
# Output shows: textbox "Email" [ref=e1], textbox "Password" [ref=e2], button "Submit" [ref=e3]

agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait --load networkidle
agent-browser snapshot -i  # Check result

Expand your agent's capabilities with these related and highly-rated skills.

actionbook/rust-skills

meta-cognition-parallel

EXPERIMENTAL: Three-layer parallel meta-cognition analysis. Triggers on: /meta-parallel, 三层分析, parallel analysis, 并行元认知

941 87
Explore
actionbook/rust-skills

domain-cloud-native

Use when building cloud-native apps. Keywords: kubernetes, k8s, docker, container, grpc, tonic, microservice, service mesh, observability, tracing, metrics, health check, cloud, deployment, 云原生, 微服务, 容器

941 87
Explore
actionbook/rust-skills

m07-concurrency

CRITICAL: Use for concurrency/async. Triggers: E0277 Send Sync, cannot be sent between threads, thread, spawn, channel, mpsc, Mutex, RwLock, Atomic, async, await, Future, tokio, deadlock, race condition, 并发, 线程, 异步, 死锁

941 87
Explore
actionbook/rust-skills

unsafe-checker

CRITICAL: Use for unsafe Rust code review and FFI. Triggers on: unsafe, raw pointer, FFI, extern, transmute, *mut, *const, union, #[repr(C)], libc, std::ffi, MaybeUninit, NonNull, SAFETY comment, soundness, undefined behavior, UB, safe wrapper, memory layout, bindgen, cbindgen, CString, CStr, 安全抽象, 裸指针, 外部函数接口, 内存布局, 不安全代码, FFI 绑定, 未定义行为

941 87
Explore
actionbook/rust-skills

rust-refactor-helper

Safe Rust refactoring with LSP analysis. Triggers on: /refactor, rename symbol, move function, extract, 重构, 重命名, 提取函数, 安全重构

941 87
Explore
actionbook/rust-skills

rust-skill-creator

Use when creating skills for Rust crates or std library documentation. Keywords: create rust skill, create crate skill, create std skill, 创建 rust skill, 创建 crate skill, 创建 std skill, 动态 rust skill, 动态 crate skill, skill for tokio, skill for serde, skill for axum, generate rust skill, rust 技能, crate 技能, 从文档创建skill, from docs create skill

941 87
Explore

Didn't find tool you were looking for?

Be as detailed as possible for better results