dev-browser-mcp
Token-light browser automation via Playwright. CLI-first design for LLM agent workflows.
Uses ref-based interaction: get a compact accessibility snapshot, then click/fill by ref ID. Keeps context small.
Acknowledgments
This project is a Python/CLI rewrite inspired by SawyerHood/dev-browser. The ARIA snapshot extraction logic is vendored from that project. Thanks to Sawyer Hood for the original work and the ref-based interaction model.
Consider using Sawyer's original if you want:
- Native Claude Skill integration (install via `.claude-plugin`)
- TypeScript/Bun ecosystem
- Tighter Claude Desktop integration
Use this repo if you want CLI-first workflows, Nix packaging, or a Python/Playwright stack.
Comparison
| Feature | SawyerHood/dev-browser | dev-browser-mcp |
|---------|------------------------|-----------------|
| Language | TypeScript | Python |
| Runtime | Bun + browser extension | Playwright (Python) |
| Interface | Claude Skill plugin | CLI + daemon (+ MCP) |
| Install | .claude-plugin | pip/Nix |
| Best for | Claude Desktop users | CLI agents, Codex, Nix users |
| Snapshot engine | ARIA (JS) | Same (vendored) |
Both use the same ref-based interaction model. Pick based on your environment.
Why CLI over MCP?
MCP adds overhead: extra process, stdio piping, JSON-RPC framing, connection management. For browser automation, that's a lot of indirection when you can just call a CLI.
The CLI approach:
- Lower latency - direct subprocess, no protocol overhead
- Easier debugging - run commands yourself, see exactly what happens
- Simpler integration - any agent that can shell out works
- Persistent sessions - daemon keeps browser alive between calls
The MCP server exists if you need it, but the CLI + daemon is the recommended path.
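To make the framing overhead concrete, here is a rough sketch comparing one MCP `tools/call` request (JSON-RPC 2.0) with the equivalent CLI invocation. The `click_ref` tool name comes from the tool list in this README; the `ref` argument key and both helper functions are assumptions for illustration, not this project's actual schema or code.

```python
import json
import shlex


def mcp_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize one MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


def cli_call(command: str, *args: str) -> str:
    """Render the equivalent dev-browser invocation as a shell-safe string."""
    return shlex.join(["dev-browser", command, *args])


# The "ref" argument key is assumed; check the MCP server's real tool schema.
framed = mcp_tool_call(1, "click_ref", {"ref": "e3"})
direct = cli_call("click-ref", "e3")

print(len(framed), framed)
print(len(direct), direct)
```

The request is only part of the cost: an MCP client also has to perform the initialize handshake and match responses to request IDs, while the CLI call is a single process invocation.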
Install
Requires Python 3.11+ and Playwright browsers.
# Install playwright browsers (one-time)
playwright install chromium
# Run CLI directly
python cli.py goto https://example.com
python cli.py snapshot
python cli.py click-ref e3
Nix (flake)
No overlays required. The flake exposes the CLI, daemon, and MCP server:
nix run github:joshp123/dev-browser-mcp#dev-browser -- goto https://example.com
nix run github:joshp123/dev-browser-mcp#dev-browser -- snapshot
Install to your profile:
nix profile install github:joshp123/dev-browser-mcp#dev-browser
CLI Usage
dev-browser goto https://example.com
dev-browser snapshot # get refs
dev-browser click-ref e3 # click ref e3
dev-browser fill-ref e5 "search query" # fill input
dev-browser screenshot
dev-browser press Enter
The daemon starts automatically on first command and keeps the browser session alive.
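Because the daemon holds the session, an agent can drive the browser with independent subprocess calls. The following is a minimal sketch of that pattern, assuming `dev-browser` is on `PATH`, prints its result to stdout, and exits non-zero on failure. The `run_cli` and `browse` helpers are hypothetical names, not part of this project.

```python
import subprocess
from typing import Callable, Sequence

# A runner takes an argv list and returns the command's stdout.
Runner = Callable[[Sequence[str]], str]


def run_cli(argv: Sequence[str]) -> str:
    """Run one command and return its stdout (raises on a non-zero exit)."""
    result = subprocess.run(list(argv), capture_output=True, text=True, check=True)
    return result.stdout


def browse(url: str, runner: Runner = run_cli, binary: str = "dev-browser") -> str:
    """Navigate to url, then return the snapshot text from a second call."""
    # Two separate processes; the daemon is assumed to keep the page open between them.
    runner([binary, "goto", url])
    return runner([binary, "snapshot"])
```

Calling `browse("https://example.com")` would issue `goto` followed by `snapshot`. The `runner` parameter exists so the call sequence can be exercised with a stub when no browser is available.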
Integration with Claude Code
Add to your project's CLAUDE.md:
## Browser Automation
Use `dev-browser` CLI for browser tasks. Keeps context small via ref-based interaction.
Workflow:
1. `dev-browser goto <url>` - navigate
2. `dev-browser snapshot` - get interactive elements as refs (e1, e2, etc.)
3. `dev-browser click-ref <ref>` or `dev-browser fill-ref <ref> "text"` - interact
4. `dev-browser screenshot` - capture state if needed
Example:
\`\`\`bash
dev-browser goto https://github.com/login
dev-browser snapshot
# Output: e1: textbox "Username" | e2: textbox "Password" | e3: button "Sign in"
dev-browser fill-ref e1 "myuser"
dev-browser fill-ref e2 "mypass"
dev-browser click-ref e3
\`\`\`
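An agent script can also pick a ref by role and name instead of hard-coding it. This sketch assumes the snapshot output looks like the `# Output:` comment in the example above; the real format may differ, so treat the regular expression as a placeholder. `Element`, `parse_snapshot`, and `find_ref` are hypothetical helpers.

```python
import re
from typing import NamedTuple, Optional


class Element(NamedTuple):
    ref: str
    role: str
    name: str


# Matches entries such as: e3: button "Sign in" (assumed format).
ENTRY = re.compile(r'(e\d+):\s*(\w+)\s*"([^"]*)"')


def parse_snapshot(text: str) -> list[Element]:
    """Extract (ref, role, name) entries from snapshot text."""
    return [Element(*match.groups()) for match in ENTRY.finditer(text)]


def find_ref(elements: list[Element], role: str, name: str) -> Optional[str]:
    """Return the ref of the first element with this role and name, if any."""
    for element in elements:
        if element.role == role and element.name == name:
            return element.ref
    return None


sample = 'e1: textbox "Username" | e2: textbox "Password" | e3: button "Sign in"'
elements = parse_snapshot(sample)
print(find_ref(elements, "button", "Sign in"))  # prints: e3
```

Looking refs up this way avoids relying on ref IDs from an earlier snapshot after the page has changed.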
Integration with Codex
Codex can use the CLI directly via its shell access. Example prompt:
Use dev-browser to navigate to example.com and find all links on the page.
Available commands:
- dev-browser goto <url>
- dev-browser snapshot [--interactive-only / --no-interactive-only]
- dev-browser click-ref <ref>
- dev-browser fill-ref <ref> "text"
- dev-browser screenshot
- dev-browser press <key>
Tools
CLI commands (recommended):
- `goto <url>` - navigate
- `snapshot` - accessibility tree with refs
- `click-ref <ref>` - click element
- `fill-ref <ref> "text"` - fill input
- `press <key>` - keyboard input
- `screenshot` - save screenshot
- `save-html` - save page HTML
- `list-pages` - show open pages
- `status`/`start`/`stop` - daemon management
MCP tools (if you must):
- `page`/`list_pages`/`close_page`
- `goto`/`snapshot`/`click_ref`/`fill_ref`/`press`
- `screenshot`/`save_html`
- `actions` - batch calls
License
AGPL-3.0-or-later. See LICENSE.
Vendored code from SawyerHood/dev-browser is MIT licensed. See THIRD_PARTY_NOTICES.md.