Agent skill
Web Browser
Browse the web, take screenshots, interact with pages, fill forms. Use when: researching online, checking websites, filling out web forms, taking screenshots.
Install this agent skill to your Project
npx add-skill https://github.com/outworked/outworked/tree/main/electron/skills/browser
SKILL.md
Web Browser Skill
You can browse the web using the browse:* tools. A managed browser window handles navigation and interaction.
Available Tools
- browse:navigate: Navigate to a URL. Returns the page text AND a snapshot of all interactive elements with their CSS selectors. Start here. Params: url (string).
- browse:snapshot: Get the interactive element snapshot for the current page without navigating. Use after a click to see what changed. No required params.
- browse:screenshot: Take a screenshot of the current page. Returns an actual image you can see. Use to visually verify state. No required params.
- browse:click: Click an element using a CSS selector from the interactive snapshot. Returns the updated snapshot after clicking. Params: selector (string).
- browse:fill: Fill a form field by setting its value. Works for standard <input> and <textarea> elements. Params: selector (string), value (string).
- browse:type: Type text using simulated keyboard input. Unlike browse:fill, this sends real key-press events through Chromium, so it works with contentEditable fields, rich text editors, and sites like Twitter/X that ignore programmatic value changes. Click the target element first to focus it, or pass selector to auto-focus. Params: text (string, required), selector (string, optional), clearFirst (boolean, optional).
- browse:evaluate: Execute JavaScript in the page context. Use as a last resort; prefer the other tools. Params: script (string).
- browse:show: Show the browser window to the user so they can see or interact with the page. Displays a "Done" banner. Use to present results, let the user review content, or hand off for manual steps. Params: url (string, optional), message (string, optional).
- browse:login: Show the browser window so the user can log in manually. Params: url (string, optional), message (string, optional).
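The parameter lists above can be summarized as a small lookup table. The following is a minimal sketch, not part of the skill itself: TOOL_PARAMS and check_call are hypothetical names, and treating the unmarked params of browse:navigate, browse:click, browse:fill, and browse:evaluate as required is an assumption (the list only marks required/optional explicitly for browse:type, browse:show, and browse:login).

```python
# Hypothetical table transcribed from the tool list: tool -> (required, optional).
# Assumption: params listed without an "optional" marker are treated as required.
TOOL_PARAMS = {
    "browse:navigate": ({"url"}, set()),
    "browse:snapshot": (set(), set()),
    "browse:screenshot": (set(), set()),
    "browse:click": ({"selector"}, set()),
    "browse:fill": ({"selector", "value"}, set()),
    "browse:type": ({"text"}, {"selector", "clearFirst"}),
    "browse:evaluate": ({"script"}, set()),
    "browse:show": (set(), {"url", "message"}),
    "browse:login": (set(), {"url", "message"}),
}


def check_call(tool, params):
    """Return a list of problems with a proposed call; an empty list means none found."""
    if tool not in TOOL_PARAMS:
        return [f"unknown tool: {tool}"]
    required, optional = TOOL_PARAMS[tool]
    given = set(params)
    # Report missing required params first, then anything the table does not list.
    problems = [f"missing param: {name}" for name in sorted(required - given)]
    problems += [f"unexpected param: {name}" for name in sorted(given - required - optional)]
    return problems
```

A check like this could run before a call is dispatched, for example check_call("browse:fill", {"selector": "#q"}) reports that value is missing.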
Workflow
- Navigate to the target URL with browse:navigate. The response includes all interactive elements with selectors
- Click or fill using the selectors from the snapshot. There is no need to probe the DOM
- After a click, the response includes the updated snapshot so you can see what changed
- Use browse:screenshot if you need to visually verify the page state
- Use browse:snapshot to refresh the interactive element list without navigating
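The workflow above can be sketched as a short call sequence. This is an illustration under stated assumptions: call_tool stands in for however the agent runtime dispatches browse:* tools, and the response shape (a dict with a "snapshot" list of selectors) is invented for the example; the skill does not document either.

```python
def run_click_flow(call_tool, url, pick_selector, verify=False):
    """navigate -> click -> optional screenshot, reusing the snapshot in each response.

    call_tool(tool, params) is a hypothetical dispatcher supplied by the caller.
    pick_selector(snapshot) chooses one selector from the navigate snapshot.
    """
    calls = []

    def call(tool, **params):
        calls.append(tool)  # record the order of tool calls
        return call_tool(tool, params)

    page = call("browse:navigate", url=url)
    # Use a selector taken from the snapshot rather than probing the DOM.
    selector = pick_selector(page["snapshot"])
    after = call("browse:click", selector=selector)
    if verify:
        call("browse:screenshot")
    return calls, after


def fake_call_tool(tool, params):
    """Stub dispatcher for illustration only; the response shapes are assumptions."""
    if tool == "browse:navigate":
        return {"text": "Example", "snapshot": ['[aria-label="Like"]', "#submit"]}
    if tool == "browse:click":
        return {"snapshot": ['[aria-label="Unlike"]']}
    return {}


calls, after = run_click_flow(
    fake_call_tool, "https://example.com", lambda snap: snap[0], verify=True
)
```

With the stub, the flow takes three calls, and the click response already carries the updated snapshot, so no separate browse:snapshot call is needed.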
Best Practices
- Never probe the DOM manually with browse:evaluate to find elements. The interactive snapshot gives you everything
- Use the CSS selectors exactly as shown in the snapshot (e.g. [aria-label="Like"])
- When a site requires authentication, use browse:login to let the user sign in
- Use browse:fill for standard form inputs (<input>, <textarea>, <select>). Use browse:type for rich text editors, contentEditable fields, or any site where browse:fill doesn't work (Twitter/X, Notion, Google Docs, etc.)
- A typical interaction should be 2-3 tool calls: navigate → click → done
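The fill-versus-type guidance can be expressed as a small decision helper. This is a sketch only: choose_input_tool and its arguments are hypothetical, and the tag and contentEditable information is assumed to come from whatever the snapshot reports about an element.

```python
# Element tags the guidance above lists as standard form inputs for browse:fill.
FILLABLE_TAGS = {"input", "textarea", "select"}


def choose_input_tool(tag, content_editable=False, fill_known_to_fail=False):
    """Pick browse:fill for standard form inputs, browse:type otherwise."""
    # Rich text editors and sites that ignore programmatic value changes need key events.
    if content_editable or fill_known_to_fail:
        return "browse:type"
    if tag.lower() in FILLABLE_TAGS:
        return "browse:fill"
    # Anything else (for example a div acting as an editor) falls back to typing.
    return "browse:type"
```

The fill_known_to_fail flag covers the sites named above, where a field may be a plain input yet still ignore a value set programmatically.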