MCPSafari: Native Safari MCP Server for AI Agents
Give Claude, Cursor, or any MCP-compatible AI client native control of Safari on macOS. Navigate tabs, click, type, and fill forms (including React-controlled inputs), read HTML and accessibility trees, execute JavaScript, capture screenshots, and inspect console and network activity, all through 24 tools. No Chrome required: it is optimized for Apple Silicon, token-authenticated, and built with the official Swift MCP SDK and a Manifest V3 Safari Web Extension.
Why MCPSafari?
- Smarter element targeting (UID + CSS + text + coords + interactive ranking)
- Handles complex sites, including React-controlled forms
- Local & private (runs on your Mac)
- Perfect drop-in for Mac-first agent workflows
macOS 14+ • Safari 17+ • Xcode 16+
Built with the official swift-sdk and a Manifest V3 Safari Web Extension.
Why Safari over Chrome?
- 40–60% less CPU/heat on Apple Silicon
- Keeps your existing Safari logins/cookies
- Native accessibility tree (better than Playwright for complex UIs)
How It Works
MCP Client (Claude, etc.)
│ stdio
┌───────▼──────────────┐
│ Swift MCP Server │
│ (MCPSafari binary) │
└───────┬──────────────┘
│ WebSocket (localhost:8089)
┌───────▼──────────────┐
│ Safari Extension │
│ (background.js) │
└───────┬──────────────┘
│ content scripts
┌───────▼──────────────┐
│ Safari Browser │
│ (macOS 14.0+) │
└──────────────────────┘
The MCP server communicates with clients over stdio and bridges tool calls to the Safari extension over a local WebSocket. The extension executes actions via browser APIs and content scripts injected into pages.
Requirements
- macOS 14.0 (Sonoma) or later
- Safari 17+
- Swift 6.1+ (for building from source)
- Xcode 16+ (for building the Safari extension)
Installation
From Release
Download the latest release from GitHub Releases:
| Asset | Description |
|-------|-------------|
| MCPSafari-arm64-apple-darwin | MCP server binary for Apple Silicon Macs (M1, M2, M3, M4) |
| MCPSafari-x86_64-apple-darwin | MCP server binary for Intel Macs |
| MCPSafari-universal-apple-darwin | MCP server binary — universal, runs on any Mac |
| MCPSafari-arm64.tar.gz | Safari extension app for Apple Silicon Macs (M1, M2, M3, M4) |
| MCPSafari-x86_64.tar.gz | Safari extension app for Intel Macs |
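# Example: Apple Silicon Mac
# Download and install the MCP server binary
curl -L -o MCPSafari https://github.com/Epistates/MCPSafari/releases/latest/download/MCPSafari-arm64-apple-darwin
chmod +x MCPSafari
# Create the target directory first in case it does not exist yet
mkdir -p ~/.local/bin
mv MCPSafari ~/.local/bin/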
# Download and install the Safari extension
curl -L -o MCPSafari-arm64.tar.gz https://github.com/Epistates/MCPSafari/releases/latest/download/MCPSafari-arm64.tar.gz
tar xzf MCPSafari-arm64.tar.gz
open MCPSafari.app
Then enable the extension in Safari > Settings > Extensions > MCPSafari Extension.
From Source
# Clone the repository
git clone https://github.com/Epistates/MCPSafari.git
cd MCPSafari
# Build the MCP server
cd MCPServer
swift build -c release
# The binary is at .build/release/MCPSafari
Install the Safari Extension
# From the repository root: build and open the host app (registers the extension with Safari)
cd MCPSafari
xcodebuild -project MCPSafari.xcodeproj -scheme MCPSafari -configuration Debug build
open ~/Library/Developer/Xcode/DerivedData/MCPSafari-*/Build/Products/Debug/MCPSafari.app
Then enable the extension in Safari > Settings > Extensions > MCPSafari Extension.
Configuration
Claude Code
Add to your MCP settings (.claude/settings.json or project-level):
{
"mcpServers": {
"mcp-safari": {
"command": "/path/to/MCPSafari",
"args": []
}
}
}
Claude Desktop
Add to ~/Library/Application Support/Claude/claude_desktop_config.json:
{
"mcpServers": {
"mcp-safari": {
"command": "/path/to/MCPSafari",
"args": []
}
}
}
Other MCP Clients
Any client that supports the MCP stdio transport can connect. Point it at the MCPSafari binary.
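For a client written from scratch, the stdio transport means writing one JSON-RPC message per line to the server's stdin and reading responses from its stdout. The sketch below only builds and frames the opening messages; the protocol version string and the client name are illustrative assumptions, not values taken from MCPSafari.

```python
import json


def jsonrpc_request(request_id, method, params=None):
    """Build a JSON-RPC 2.0 request object."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return message


def frame(message):
    """Serialize a message as a single newline-terminated line for stdio."""
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


# Opening handshake: an initialize request, then an initialized notification.
initialize = jsonrpc_request(1, "initialize", {
    "protocolVersion": "2024-11-05",  # assumed; the server may negotiate another
    "capabilities": {},
    "clientInfo": {"name": "example-client", "version": "0.1.0"},  # hypothetical
})
initialized = {"jsonrpc": "2.0", "method": "notifications/initialized"}

for message in (initialize, initialized):
    print(frame(message).decode("utf-8"), end="")
```

In a real client these bytes would be written to the stdin pipe of the spawned MCPSafari process rather than printed.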
CLI Options
| Flag | Description |
|------|-------------|
| --port <n> / -p <n> | WebSocket port (default: 8089) |
| --verbose | Debug-level logging to stderr |
Tools (24)
Tab Management
| Tool | Description |
|------|-------------|
| tabs_context | List all open tabs with IDs, URLs, and titles |
| tabs_create | Open a new tab, optionally with a URL |
| close_tab | Close a tab by ID |
| select_tab | Pin a tab as the default context for future calls |
Navigation
| Tool | Description |
|------|-------------|
| navigate | Go to a URL, or use back / forward / reload actions |
Page Reading
| Tool | Description |
|------|-------------|
| read_page | Get page content as text, html, or snapshot |
| get_page_text | Get visible text content |
| snapshot | Accessibility tree with element UIDs for interaction |
| find | Find elements by CSS selector, text, or ARIA role |
Interaction
| Tool | Description |
|------|-------------|
| click | Click by UID, CSS selector, text, or coordinates |
| type_text | Type into an element with optional clearFirst and submitKey |
| form_input | Batch fill form fields (CSS selector → value map) |
| select_option | Select a dropdown option by value or label |
| scroll | Scroll page or element in any direction |
| press_key | Press key combinations (e.g., Enter, Meta+a, Control+c) |
| hover | Hover to trigger tooltips, menus, or hover states |
| drag | Drag and drop between elements |
Dialogs
| Tool | Description |
|------|-------------|
| handle_dialog | Accept or dismiss alerts, confirms, and prompts |
Screenshots
| Tool | Description |
|------|-------------|
| screenshot | Capture the visible tab area as a PNG image |
JavaScript
| Tool | Description |
|------|-------------|
| javascript_tool | Execute arbitrary JS in the page context |
Debugging
| Tool | Description |
|------|-------------|
| read_console | Read console messages with level and regex filtering |
| read_network | Read captured XHR/fetch requests with type filtering |
Window
| Tool | Description |
|------|-------------|
| resize_window | Resize the browser window to specific dimensions |
Utility
| Tool | Description |
|------|-------------|
| wait | Wait for a duration, CSS selector, or text to appear |
Usage
Basic Workflow
- Start with context: call tabs_context to see what's open, or navigate to a URL.
- Take a snapshot: call snapshot to get the accessibility tree with element UIDs.
- Interact: use UIDs from the snapshot with click, type_text, hover, etc.
- Verify: pass includeSnapshot: true on interaction tools to see the updated state, or take a screenshot.
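At the protocol level, the workflow above is a sequence of MCP tools/call requests. The following sketch builds that sequence as plain JSON-RPC messages. The tool names, uid, and includeSnapshot come from this README; the url argument name for navigate and the UID value e42 are assumptions for illustration.

```python
import json
from itertools import count

_ids = count(1)  # simple incrementing JSON-RPC ids


def tool_call(name, arguments=None):
    """Build an MCP tools/call request for the named tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


workflow = [
    tool_call("tabs_context"),                              # 1. see what is open
    tool_call("navigate", {"url": "https://example.com"}),  # "url" is an assumed name
    tool_call("snapshot"),                                  # 2. get element UIDs
    tool_call("click", {"uid": "e42", "includeSnapshot": True}),  # 3 + 4
]

for message in workflow:
    print(json.dumps(message))
```

In practice the UID passed to click would be read from the snapshot response rather than hard-coded.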
Element Targeting
Tools that interact with elements accept multiple targeting strategies:
| Strategy | Example | When to Use |
|----------|---------|-------------|
| UID | uid: "e42" | Most precise — from a snapshot |
| CSS selector | selector: "#login-btn" | When you know the DOM structure |
| Text | text: "Sign In" | Interactive elements are ranked higher |
| Coordinates | x: 100, y: 200 | Last resort — click at exact position |
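A client can encode the preference order in the table as a small helper that picks the most precise strategy available. This is a hypothetical client-side convenience, not part of MCPSafari; the argument names follow the examples in the table.

```python
def build_target(uid=None, selector=None, text=None, x=None, y=None):
    """Return targeting arguments, preferring the most precise strategy given."""
    if uid is not None:
        return {"uid": uid}
    if selector is not None:
        return {"selector": selector}
    if text is not None:
        return {"text": text}
    if x is not None and y is not None:
        return {"x": x, "y": y}
    raise ValueError("no targeting strategy supplied")


# A UID wins over a selector when both are known.
print(build_target(uid="e42", selector="#login-btn"))
# Coordinates are used only when nothing else is available.
print(build_target(x=100, y=200))
```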
Form Filling
Use form_input to fill multiple fields at once:
{
"fields": {
"#name": "Jane Doe",
"#email": "jane@example.com",
"textarea[name=message]": "Hello!"
}
}
This uses React-compatible value setting (nativeInputValueSetter) so it works with controlled inputs in React, Next.js, and similar frameworks.
Smart Text Matching
When targeting by text, interactive elements (buttons, links, inputs) are ranked higher than generic containers. Clicking text: "Submit" will prefer a <button>Submit</button> over a <div>Submit</div>.
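The ranking idea can be sketched as a stable sort that puts interactive elements first. This is an illustration in Python, not the extension's actual content.js logic; the set of tags treated as interactive and the tie-breaker on exact matches are assumptions.

```python
# Tags treated as interactive in this sketch (assumed, not from the source).
INTERACTIVE_TAGS = {"button", "a", "input", "select", "textarea"}


def rank_text_matches(elements, query):
    """Return elements whose text contains the query, interactive ones first."""
    needle = query.strip().lower()
    matches = [el for el in elements if needle in el["text"].strip().lower()]

    def sort_key(el):
        interactive = el["tag"].lower() in INTERACTIVE_TAGS
        exact = el["text"].strip().lower() == needle
        # False sorts before True, so interactive and exact matches come first.
        return (not interactive, not exact)

    return sorted(matches, key=sort_key)


candidates = [
    {"tag": "div", "text": "Submit"},
    {"tag": "button", "text": "Submit"},
    {"tag": "p", "text": "Cancel"},
]
best = rank_text_matches(candidates, "Submit")[0]
print(best["tag"])
```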
Post-Action Snapshots
Most interaction tools support includeSnapshot: true, which returns the updated accessibility tree after the action — useful for verifying the result without a separate snapshot call.
Architecture
MCP Server (MCPServer/)
A Swift executable using the official modelcontextprotocol/swift-sdk. Communicates with MCP clients via stdio and with the Safari extension via a WebSocket bridge using Network.framework.
- main.swift: Entry point, parses CLI flags, starts the server
- SafariMCPServer.swift: Tool definitions and handlers (actor)
- WebSocketBridge.swift: WebSocket server with request/response correlation (actor)
- BridgeMessage.swift: Wire protocol types and AnyCodable serialization
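Request/response correlation means tagging each outgoing request with an id and matching incoming responses back to the waiting caller, even when they arrive out of order. The sketch below shows the pattern with asyncio futures; the real bridge is a Swift actor, and the message fields used here (id, action, params, result) are hypothetical.

```python
import asyncio
import itertools


class RequestCorrelator:
    """Match responses to pending requests by id."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._pending = {}

    def create_request(self, action, params):
        """Register a pending request and return (message, future)."""
        request_id = next(self._ids)
        future = asyncio.get_running_loop().create_future()
        self._pending[request_id] = future
        return {"id": request_id, "action": action, "params": params}, future

    def resolve(self, response):
        """Complete the matching future; return False for unknown ids."""
        future = self._pending.pop(response.get("id"), None)
        if future is None or future.done():
            return False
        future.set_result(response.get("result"))
        return True


async def demo():
    bridge = RequestCorrelator()
    click_msg, click_future = bridge.create_request("click", {"uid": "e42"})
    snap_msg, snap_future = bridge.create_request("snapshot", {})
    # Responses arrive in the opposite order from the requests.
    bridge.resolve({"id": snap_msg["id"], "result": "tree"})
    bridge.resolve({"id": click_msg["id"], "result": "clicked"})
    return await click_future, await snap_future


results = asyncio.run(demo())
print(results)
```

A production bridge would also need timeouts and cleanup of pending requests when the socket closes, which this sketch omits.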
Safari Extension (MCPSafari/)
A Manifest V3 Safari Web Extension with:
- background.js: WebSocket client, request router, tab/navigation/screenshot handlers
- content.js: DOM interaction, accessibility snapshots, element finding, click/type/scroll simulation
- dialog-interceptor.js: Patches window.alert/confirm/prompt before page scripts run
- console-interceptor.js: Captures console messages for read_console
- network-interceptor.js: Captures XHR/fetch requests for read_network
- popup.html/js/css: Extension popup showing connection status
macOS Host App
A minimal macOS app (AppDelegate.swift, ViewController.swift) that registers the Safari extension and provides native messaging for auth token exchange.
Security
WebSocket Authentication
The server generates a random UUID token at startup, writes it to ~/.config/mcp-safari/token (mode 0600), and expects it as the first WebSocket message. The extension reads the token via native messaging from the host app. Connections that do not present a valid token are still accepted, but in an unauthenticated mode intended for development convenience.
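The token scheme can be sketched as follows: generate a UUID, write it to a file readable only by the owner, and compare the first message against it in constant time. This is an illustration in Python, not the server's Swift code; the function names are hypothetical, and the demo writes to a temporary directory instead of ~/.config/mcp-safari/token.

```python
import hmac
import os
import stat
import tempfile
import uuid
from pathlib import Path


def write_token(path: Path) -> str:
    """Create a fresh UUID token file with owner-only permissions."""
    token = str(uuid.uuid4())
    path.parent.mkdir(parents=True, exist_ok=True)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(token)
    os.chmod(path, 0o600)  # enforce the mode even if the file already existed
    return token


def token_matches(first_message: str, expected: str) -> bool:
    """Constant-time comparison of the first message against the token."""
    return hmac.compare_digest(
        first_message.strip().encode("utf-8"), expected.encode("utf-8")
    )


with tempfile.TemporaryDirectory() as tmp:
    token_path = Path(tmp) / "mcp-safari" / "token"
    token = write_token(token_path)
    file_mode = stat.S_IMODE(token_path.stat().st_mode)
    stored = token_path.read_text(encoding="utf-8")

print(oct(file_mode), token_matches(stored, token))
```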
Input Validation
- URL schemes restricted to http, https, about, and file
- Navigation actions validated against an allowlist
- Regex patterns capped at 200 characters and validated before forwarding
- Wait durations capped at 300 seconds
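These checks can be sketched as small validators. The limits and allowed schemes come from the list above; everything else is an assumption, including the choice to clamp long waits rather than reject them and the use of Python's regex syntax, which differs from the syntax the server actually validates.

```python
import re
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https", "about", "file"}
NAVIGATION_ACTIONS = {"back", "forward", "reload"}
MAX_REGEX_LENGTH = 200
MAX_WAIT_SECONDS = 300


def validate_url(url):
    """Reject URLs whose scheme is not on the allowlist."""
    scheme = urlsplit(url).scheme.lower()
    if scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"scheme not allowed: {scheme or '(none)'}")
    return url


def validate_action(action):
    """Accept only known navigation actions."""
    if action not in NAVIGATION_ACTIONS:
        raise ValueError(f"unknown navigation action: {action}")
    return action


def validate_regex(pattern):
    """Enforce the length cap, then check that the pattern compiles."""
    if len(pattern) > MAX_REGEX_LENGTH:
        raise ValueError("regex pattern exceeds 200 characters")
    try:
        re.compile(pattern)
    except re.error as exc:
        raise ValueError(f"invalid regex: {exc}") from exc
    return pattern


def clamp_wait(seconds):
    """Clamp a wait duration into the range 0 to 300 seconds (assumed policy)."""
    return min(max(seconds, 0), MAX_WAIT_SECONDS)


print(validate_url("about:blank"), clamp_wait(1000))
```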
Permissions
The extension requests these permissions in manifest.json:
| Permission | Purpose |
|-----------|---------|
| tabs | List and manage tabs |
| activeTab | Access the active tab |
| scripting | Inject content scripts and execute JS |
| webNavigation | Navigate tabs (back/forward/reload) |
| nativeMessaging | Auth token exchange with host app |
| alarms | Service worker keepalive |
| storage | Persist selected tab across suspensions |
Troubleshooting
Extension shows "Disconnected"
- Make sure the MCP server is running (check your MCP client logs)
- Verify port 8089 is not in use: lsof -i :8089
- Click "Reconnect" in the extension popup
- Use the --verbose flag on the server for debug logs
"Could not establish connection" errors
The content scripts may not be injected yet. The extension auto-injects on first interaction, but you can also reload the page.
Safari permission prompts
Safari prompts for per-site permissions the first time the extension interacts with a domain. Click "Always Allow on Every Website" in Safari > Settings > Extensions > MCPSafari Extension to avoid repeated prompts.
Port already in use
Use --port to pick a different port:
{
"mcpServers": {
"mcp-safari": {
"command": "/path/to/MCPSafari",
"args": ["--port", "9090"]
}
}
}
Development
Build & Test
# Build the MCP server
cd MCPServer
swift build
# Build the Safari extension (the Xcode project lives in the sibling MCPSafari directory)
cd ../MCPSafari
xcodebuild -project MCPSafari.xcodeproj -scheme MCPSafari build
# Run the server with verbose logging
../MCPServer/.build/debug/MCPSafari --verbose
CI
The CI workflow runs on every push and PR to main:
- Builds the MCP server (swift build)
- Tests the MCP handshake (verifies the binary responds to initialize)
- Builds the Safari extension (xcodebuild)
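A handshake check of this kind can be approximated locally by spawning the binary, sending one initialize request over stdin, and confirming that a matching response comes back. The sketch below is not the project's CI script; the protocol version and client name are assumptions, and a tiny stub process stands in for the real server so the example runs anywhere.

```python
import json
import subprocess
import sys


def check_handshake(command, timeout=10.0):
    """Send initialize to a stdio MCP server and check for a matching result."""
    request = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",  # assumed version string
            "capabilities": {},
            "clientInfo": {"name": "handshake-check", "version": "0.1.0"},
        },
    }
    proc = subprocess.Popen(
        command, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
    )
    try:
        stdout, _ = proc.communicate(json.dumps(request) + "\n", timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()  # server kept running; collect what it printed so far
        stdout, _ = proc.communicate()
    for line in stdout.splitlines():
        if line.strip():
            response = json.loads(line)
            return response.get("id") == 1 and "result" in response
    return False


# Stub server used only for this demo; replace with ["/path/to/MCPSafari"].
STUB_SERVER = (
    "import json, sys\n"
    "req = json.loads(sys.stdin.readline())\n"
    "print(json.dumps({'jsonrpc': '2.0', 'id': req['id'], 'result': {}}))\n"
)
handshake_ok = check_handshake([sys.executable, "-c", STUB_SERVER])
print(handshake_ok)
```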
Project Structure
MCPSafari/
├── MCPServer/ # Swift MCP server
│ ├── Package.swift
│ └── Sources/mcp-safari/
│ ├── main.swift
│ ├── SafariMCPServer.swift
│ ├── WebSocketBridge.swift
│ └── BridgeMessage.swift
├── MCPSafari/ # Xcode project
│ ├── MCPSafari/ # macOS host app
│ ├── MCPSafari Extension/ # Safari web extension
│ │ ├── Resources/
│ │ │ ├── background.js
│ │ │ ├── content.js
│ │ │ ├── dialog-interceptor.js
│ │ │ ├── console-interceptor.js
│ │ │ ├── network-interceptor.js
│ │ │ ├── manifest.json
│ │ │ └── popup.html/js/css
│ │ └── SafariWebExtensionHandler.swift
│ └── MCPSafari.xcodeproj
├── .github/workflows/
│ ├── ci.yml
│ └── release.yml
└── CHANGELOG.md
License
MIT