Agent skill
agent-browser
Use this skill for browser automation, including web scraping, form filling, UI testing, and any other web interaction task.
Install this agent skill to your Project
npx add-skill https://github.com/KroMiose/nekro-agent/tree/main/nekro_agent/builtin_skills/agent-browser
SKILL.md
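agent-browser Browser Automation Guide
Environment note:
agent-browser and Playwright Chromium are preinstalled in nekro-cc-sandbox (PLAYWRIGHT_BROWSERS_PATH=/opt/playwright-browsers) and can be used directly. If you run into installation problems, read install.md in this skill's directory.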
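Basic workflow
agent-browser open <url> # open a page
agent-browser snapshot -i # snapshot the interactive elements (returns an element tree with refs)
agent-browser click @e1 # click an element by ref
agent-browser fill @e2 "text" # fill in a form field
agent-browser screenshot # take a screenshot
agent-browser close # close the browser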
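The workflow above can be sketched as a small script. This is a dry-run sketch rather than a definitive implementation: the `ab` wrapper and the `submit_form` function are hypothetical names introduced here, and the refs `@e1` and `@e2` are placeholders that a real run would read from the snapshot output.

```shell
#!/bin/sh
# Hypothetical dry-run stand-in for the real CLI: prints each command
# instead of running it. To drive a real browser, replace the body
# with: agent-browser "$@"
ab() {
  printf '%s\n' "agent-browser $*"
}

# Open a page, fill one field, click one button, and capture the result.
# @e1 and @e2 are placeholder refs; real refs come from the snapshot.
submit_form() {
  url=$1
  value=$2
  ab open "$url"
  ab snapshot -i          # list interactive elements and their refs
  ab fill @e2 "$value"
  ab click @e1
  ab snapshot -i          # the click may change the page, so take fresh refs
  ab screenshot
  ab close
}

plan=$(submit_form https://example.com "hello")
printf '%s\n' "$plan"
```

Printing the plan first makes it easy to check the order of the steps, in particular the second snapshot after the click, before pointing the wrapper at a real browser.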
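Snapshots (core feature)
agent-browser snapshot # full accessibility tree
agent-browser snapshot -i # interactive elements only (recommended, saves tokens)
agent-browser snapshot -c # compact mode (removes empty elements)
agent-browser snapshot -d 3 # limit depth to 3 levels
agent-browser snapshot -s "#main" # restrict to a CSS selector scope
Example snapshot output: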
@e1 [heading] "Example Domain" [level=1]
@e2 [button] "Submit"
@e3 [input type="email"] placeholder="Email"
@e4 [link] "Learn more"
Important: after a page navigation or any change to the elements, refs become invalid. Run snapshot again to get new refs.
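Because refs must be looked up again after every snapshot, it can help to pull a ref out of the snapshot text by role and name. The sketch below assumes the output has exactly the line shape shown in the example above (`@ref [role] "name"`); `find_ref` is a hypothetical helper, not part of agent-browser, and it only matches a plain `[role]` token, so a line such as `[input type="email"]` is not matched by the role `input`.

```shell
#!/bin/sh
# Hypothetical helper: read snapshot text on stdin and print the ref of the
# first line whose role is [ROLE] and whose quoted name is "NAME".
# Usage: find_ref ROLE NAME
find_ref() {
  awk -v role="$1" -v name="$2" '
    index($0, "[" role "]") && index($0, "\"" name "\"") { print $1; exit }
  '
}

# Sample text copied from the example output above.
snapshot='@e1 [heading] "Example Domain" [level=1]
@e2 [button] "Submit"
@e3 [input type="email"] placeholder="Email"
@e4 [link] "Learn more"'

ref=$(printf '%s\n' "$snapshot" | find_ref button Submit)
echo "$ref"   # prints @e2
```

In a live session the same helper would read from the CLI instead of a variable, for example `agent-browser snapshot -i | find_ref button Submit`, and the result would be passed to `agent-browser click`.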
Element interaction
agent-browser click @e1 # click (by ref)
agent-browser fill @e2 "text" # clear, then fill (recommended for forms)
agent-browser type @e3 "appended text" # type without clearing (appends)
agent-browser press Enter # press a key
agent-browser hover @e4 # hover
agent-browser select @e5 "option value" # select a dropdown option
agent-browser check @e6 # check a checkbox
agent-browser scroll down 300 # scroll (up/down/left/right, in px)
Semantic locators (fallback)
agent-browser find role button click --name "Submit"
agent-browser find label "Email" fill "user@example.com"
agent-browser find placeholder "Search..." fill "keyword"
agent-browser find text "Log in" click
Getting page information
agent-browser get text @e1 # get the text content
agent-browser get value @e2 # get the value of an input
agent-browser get attr @e3 href # get an attribute
agent-browser get title # get the page title
agent-browser get url # get the current URL
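Waiting and navigation
agent-browser wait --text "Welcome" # wait for text to appear
agent-browser wait --load # wait for the page to finish loading
agent-browser wait 2000 # wait 2 seconds (2000 ms)
agent-browser wait --url "**/dash" # wait for the URL to match
agent-browser back # go back
agent-browser reload # reload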
Screenshots and debugging
agent-browser screenshot page.png # take a screenshot
agent-browser screenshot full.png --full # capture the full page
agent-browser pdf report.pdf # save as PDF
agent-browser eval "document.title" # run JavaScript
agent-browser console # view console messages
Session management (multi-instance isolation)
# Sessions are fully isolated from each other (cookies, localStorage, login state)
agent-browser --session user1 open site-a.com
agent-browser --session user2 open site-b.com
# Persist login state
agent-browser --profile ~/.my-profile open myapp.com
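When several isolated sessions are needed at once, the `--session` commands above can be generated in a loop. The following is a dry-run sketch under stated assumptions: `ab` and `open_isolated` are hypothetical names, and the `name=url` argument format is a convention invented for this example, not something agent-browser defines.

```shell
#!/bin/sh
# Hypothetical dry-run stand-in: prints the command instead of running it.
# Replace the body with: agent-browser "$@" to open real sessions.
ab() {
  printf '%s\n' "agent-browser $*"
}

# Each "name=url" argument gets its own --session, so cookies,
# localStorage, and login state are not shared between them.
open_isolated() {
  for pair in "$@"; do
    name=${pair%%=*}   # text before the first "="
    url=${pair#*=}     # text after the first "="
    ab --session "$name" open "$url"
  done
}

out=$(open_isolated user1=site-a.com user2=site-b.com)
printf '%s\n' "$out"
```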
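Best practices
- Use the -i flag: fetch only interactive elements, which greatly reduces token usage
- Re-snapshot after navigation: once a click/fill triggers a page change, you must run snapshot again
- Use -s to limit scope: on complex pages, focus only on the target region
- Prefer fill over type: fill clears the field before typing, which suits forms better
- Use semantic locators as a fallback: find role/label is more stable than CSS selectors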
Common errors
| Error | Solution |
|---|---|
| "Executable doesn't exist" | Read install.md and install Playwright Chromium |
| Invalid ref | Run agent-browser snapshot again |
| Element not found | Switch to a find semantic locator |
| Page load timeout | Use agent-browser wait --load |