Production-grade MCP server giving Windsurf IDE (or any MCP-compatible agent) total vision, mouse, and keyboard control of your machine — with two independent hard-interrupt kill switches (Escape key + mouse movement). 38 tools, hot-reloadable config, live screen recording.
Z-ComputerUse V1.0.0
A production-grade Model Context Protocol (MCP) server that hands Windsurf IDE (or any MCP-compatible agent) total vision, mouse, and keyboard control of your Windows, macOS, or Linux machine — with hard-interrupt safety you can trigger at any moment.
Built for the Director. Controlled by the Director. You interrupt, it stops.
Highlights
- 38 MCP tools covering screenshots, live recording, mouse, keyboard, zoom, session control, runtime interrupt toggles, and hot-reloadable config
- Anthropic computer-use schema compatible — tool names match so Cascade / Claude recognize them natively
- Two independent kill switches:
  - Press Escape anywhere on your system
  - Move your mouse more than N pixels (configurable threshold)
- Live screen recording with a circular memory buffer and auto-save on interrupt
- Multi-monitor fast capture via mss
- Massive config.json: 10 sections, 50+ knobs, hot-reloadable
- Human-like input: configurable jitter, easing, timing variance
- Safety guards — rate limits, blocked regions, dangerous-key confirmation, session duration watchdog
- DPI-aware on Windows out of the box
- stdio transport — zero port conflicts, spawns directly from Windsurf
Installation
Option A: pip (simplest)
git clone https://github.com/ZannyTornadoCoding/Z-ComputerUse-MCP-Server
cd Z-ComputerUse-MCP-Server
python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -e .
copy config.example.json config.json
With optional extras (OCR + template matching + interrupt sound):
pip install -e ".[full]"
Option B: uv (fast, recommended)
git clone https://github.com/ZannyTornadoCoding/Z-ComputerUse-MCP-Server
cd Z-ComputerUse-MCP-Server
uv venv
.\.venv\Scripts\Activate.ps1
uv pip install -e ".[full]"
copy config.example.json config.json
Option C: one-shot installer
.\scripts\install.ps1
Windsurf Integration
Add this entry to ~/.codeium/windsurf/mcp_config.json (or ~/.codeium/windsurf-next/mcp_config.json for Windsurf Next):
{
"mcpServers": {
"z-computeruse": {
"command": "python",
"args": ["-m", "z_computeruse"],
"env": {
"Z_COMPUTERUSE_CONFIG": "<ABSOLUTE_PATH_TO_REPO>/config.json"
}
}
}
}
Replace <ABSOLUTE_PATH_TO_REPO> with the full path to your cloned repository (e.g. C:/Users/you/code/Z-ComputerUse-MCP-Server on Windows, /home/you/code/Z-ComputerUse-MCP-Server on Linux).
Then press the refresh button in Windsurf's MCP panel. Cascade should pick up 38 tools under the z-computeruse server.
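If you would rather generate the entry than hand-edit the path, a small script can build it. This is only a sketch: the helper name build_server_entry is hypothetical and not part of the project, and it assumes you pass the absolute path of your clone.

```python
import json
from pathlib import Path


def build_server_entry(repo_dir: str) -> dict:
    """Build the mcpServers entry shown above for a given clone directory.

    Hypothetical helper, not shipped with Z-ComputerUse. Pass an absolute path.
    """
    # as_posix() emits forward slashes, which avoids backslash escaping in JSON.
    config_path = (Path(repo_dir) / "config.json").as_posix()
    return {
        "mcpServers": {
            "z-computeruse": {
                "command": "python",
                "args": ["-m", "z_computeruse"],
                "env": {"Z_COMPUTERUSE_CONFIG": config_path},
            }
        }
    }


if __name__ == "__main__":
    # Example path only; substitute the location of your own clone.
    entry = build_server_entry("/home/you/code/Z-ComputerUse-MCP-Server")
    print(json.dumps(entry, indent=2))
```

Merge the printed object into your existing mcp_config.json rather than overwriting the file, since other servers may already be registered there.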
Tools Exposed
Screen & Vision
| Tool | Purpose |
|------|---------|
| screenshot | Full-screen capture with auto-scaling |
| screenshot_region | Capture a rectangle |
| zoom_region | Full-resolution crop for fine inspection |
| get_screen_info | Monitor list, resolutions, active primary |
| find_text_on_screen | OCR-based text locator (requires vision extras) |
| find_image_on_screen | Template-matching locator (requires vision extras) |
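To make the auto-scaling in screenshot concrete, here is a minimal sketch. It assumes the scaling fits the longest edge to display.screenshot_scale_max_edge_px and never upscales; the function names are illustrative, and the server's actual resampling and rounding may differ.

```python
def scaled_size(width: int, height: int, max_edge_px: int = 1568) -> tuple[int, int, float]:
    """Return (new_width, new_height, scale) so the longest edge fits max_edge_px."""
    longest = max(width, height)
    if longest <= max_edge_px:
        return width, height, 1.0  # already small enough; never upscale
    scale = max_edge_px / longest
    return round(width * scale), round(height * scale), scale


def to_screen_coords(x: int, y: int, scale: float) -> tuple[int, int]:
    """Map a point picked on the scaled screenshot back to physical screen pixels."""
    return round(x / scale), round(y / scale)


if __name__ == "__main__":
    w, h, s = scaled_size(3840, 2160)
    print(w, h)                           # size of the image sent to the model
    print(to_screen_coords(784, 441, s))  # centre of the scaled image, in screen pixels
```

The second helper matters because an agent picks click targets on the downscaled image, while the mouse tools operate in real screen coordinates.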
Recording
| Tool | Purpose |
|------|---------|
| start_recording | Begin live frame capture to ring buffer |
| stop_recording | Stop + optionally save to MP4 |
| get_recent_frames | Retrieve N most recent frames |
| get_recording_status | Running state, frame count, duration |
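The ring buffer behind these tools can be pictured with a bounded deque. This is a sketch under the assumption that capacity is fps multiplied by buffer_seconds; the class name is hypothetical and the real recorder may store frames differently.

```python
from collections import deque


class FrameRingBuffer:
    """Keeps only the most recent fps * buffer_seconds frames in memory."""

    def __init__(self, fps: int, buffer_seconds: int) -> None:
        # deque(maxlen=...) silently discards the oldest item once full.
        self.frames = deque(maxlen=fps * buffer_seconds)

    def push(self, frame) -> None:
        self.frames.append(frame)

    def recent(self, n: int) -> list:
        """Return up to n most recent frames, oldest first."""
        if n <= 0:
            return []
        return list(self.frames)[-n:]


if __name__ == "__main__":
    buf = FrameRingBuffer(fps=5, buffer_seconds=2)  # holds 10 frames
    for i in range(25):
        buf.push(i)  # integers stand in for captured images
    print(len(buf.frames), buf.recent(3))
```

With this shape, memory use is bounded by fps and buffer length, which is why a low recording.fps keeps the footprint small.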
Mouse
| Tool | Purpose |
|------|---------|
| mouse_move | Move cursor to (x, y) |
| mouse_position | Get current cursor position |
| left_click / right_click / middle_click / double_click / triple_click | Click variants |
| left_click_drag | Click-drag between two points |
| scroll | Wheel scroll with direction + clicks |
| mouse_down / mouse_up | Fine-grained click control |
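For a sense of what human-like movement could look like, the sketch below generates an eased path with small random jitter. It is an assumption about the general technique (smoothstep easing plus uniform jitter), not the project's actual algorithm; all names and defaults here are illustrative.

```python
import random


def ease_in_out(t: float) -> float:
    """Smoothstep easing: slow start, faster middle, slow end (t in [0, 1])."""
    return t * t * (3.0 - 2.0 * t)


def eased_path(start, end, steps: int = 20, jitter_px: float = 1.5, rng=None):
    """Return a list of integer (x, y) points from start to end."""
    rng = rng or random.Random()
    (x0, y0), (x1, y1) = start, end
    points = []
    for i in range(1, steps + 1):
        t = ease_in_out(i / steps)
        x = x0 + (x1 - x0) * t
        y = y0 + (y1 - y0) * t
        if i < steps:
            # Jitter every intermediate point; skip the last so the cursor lands exactly.
            x += rng.uniform(-jitter_px, jitter_px)
            y += rng.uniform(-jitter_px, jitter_px)
        points.append((round(x), round(y)))
    return points


if __name__ == "__main__":
    for point in eased_path((0, 0), (100, 50), steps=10, rng=random.Random(0)):
        print(point)
```

Each point would then be handed to whatever performs the actual cursor move, with the total time spread across the steps.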
Keyboard
| Tool | Purpose |
|------|---------|
| type_text | Type a string with human-like variance |
| key | Press key or combo (ctrl+s, alt+tab) |
| hold_key | Hold a key for N seconds |
| key_down / key_up | Fine-grained key control |
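Combo strings such as ctrl+s can be split into modifiers and a final key before being sent. The parser below is a hypothetical sketch; the modifier set and error handling are assumptions, and the server may accept other spellings.

```python
# Assumed modifier names; the real server may recognise a different set.
MODIFIERS = {"ctrl", "alt", "shift", "win", "cmd"}


def parse_combo(combo: str) -> tuple[list[str], str]:
    """Split 'ctrl+shift+s' into (['ctrl', 'shift'], 's')."""
    parts = [p.strip().lower() for p in combo.split("+") if p.strip()]
    if not parts:
        raise ValueError("empty key combo")
    *mods, key = parts
    unknown = [m for m in mods if m not in MODIFIERS]
    if unknown:
        raise ValueError(f"unknown modifier(s): {unknown}")
    return mods, key


if __name__ == "__main__":
    print(parse_combo("ctrl+s"))
    print(parse_combo("Alt+Tab"))
```

Note that this simple split cannot express a literal plus key; a real implementation would need a named alias for it.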
Control & Session
| Tool | Purpose |
|------|---------|
| wait | Pause for N seconds (interruptible) |
| get_status | Interrupt state, session info, safety stats |
| reset_interrupt | Clear the interrupt flag so the agent can resume |
| pause_session / resume_session | Temporarily halt all actions |
| end_session | Finalize session, optionally save recording |
Config & Runtime Toggles
| Tool | Purpose |
|------|---------|
| get_config | Fetch current config (whole or by section) |
| reload_config | Hot-reload config.json without restart; propagates interrupt changes live |
| save_current_config | Persist in-memory config back to disk |
| set_mouse_interrupt | Toggle mouse-movement kill switch on/off at runtime |
| set_escape_interrupt | Toggle escape-key kill switch on/off at runtime |
| set_interrupts_enabled | Master switch for the entire interrupt manager |
The Interrupt System (Total Control)
Z-ComputerUse runs two independent global listeners from the moment it starts:
- Keyboard listener: watches for a single press of Esc (configurable) anywhere on the system.
- Mouse listener: watches for any movement more than mouse_movement_threshold_px pixels away from where the bot expected the cursor to be.
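The mouse check can be sketched as a distance test against the last position the bot set itself, skipped during a short grace window. This is an illustration only: the class and method names are hypothetical, the 150 ms grace default is an assumption, and only the 35 px threshold default comes from this README.

```python
import math
import time


class MouseDeviationDetector:
    """Decides whether an observed cursor position looks like user input."""

    def __init__(self, threshold_px: float = 35.0, grace_ms: float = 150.0,
                 clock=time.monotonic) -> None:
        self.threshold_px = threshold_px
        self.grace_ms = grace_ms      # assumed default, not taken from the project
        self._clock = clock           # injectable so the logic can be tested
        self._expected = None
        self._last_programmatic = float("-inf")

    def note_programmatic_move(self, x: float, y: float) -> None:
        """Call right after the bot moves the cursor itself."""
        self._expected = (x, y)
        self._last_programmatic = self._clock()

    def is_user_interrupt(self, x: float, y: float) -> bool:
        """Call from the mouse listener for each observed position."""
        if self._expected is None:
            return False  # the bot has not moved the cursor yet
        elapsed_ms = (self._clock() - self._last_programmatic) * 1000.0
        if elapsed_ms < self.grace_ms:
            return False  # still inside the grace window after a bot move
        ex, ey = self._expected
        return math.hypot(x - ex, y - ey) > self.threshold_px
```

In a real listener, is_user_interrupt would be called from the mouse-move callback, and a True result would set the shared interrupt flag.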
When either fires:
- A shared threading.Event gets set.
- The current tool call aborts (raises InterruptedError, caught at the tool boundary).
- All subsequent tool calls return {"interrupted": true, "reason": "escape_key" | "mouse_movement"} until you explicitly call reset_interrupt.
- If save_on_interrupt is on, the last buffer_seconds of screen recording is flushed to disk so you can review what it was doing.
- If sound_on_interrupt is on, a sound plays.
To resume: tell Cascade something like "I interrupted because X. The situation is now Y. Call reset_interrupt and continue with Z." Cascade then calls the tool, the flag clears, and it proceeds.
The bot can't self-reset without Cascade calling the tool, and Cascade can't call the tool unless you tell it to — so you always hold the kill switch.
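The flow described in this section (a shared flag, an abort at the tool boundary, sticky interrupted responses, and an explicit reset) can be sketched with the standard library alone. The InterruptGate class and the shape of the non-interrupted response are assumptions made for illustration; only threading.Event, InterruptedError, and the interrupted payload come from the description above.

```python
import threading


class InterruptGate:
    """Shared interrupt flag plus a wrapper applied at the tool boundary."""

    def __init__(self) -> None:
        self._event = threading.Event()
        self._lock = threading.Lock()
        self._reason = None

    def trigger(self, reason: str) -> None:
        """Called by a listener thread ('escape_key' or 'mouse_movement')."""
        with self._lock:
            if not self._event.is_set():
                self._reason = reason  # keep the first reason that fired
                self._event.set()

    def reset(self) -> None:
        """What a reset_interrupt tool would do."""
        with self._lock:
            self._reason = None
            self._event.clear()

    def check(self) -> None:
        """Called between steps inside a long-running tool body."""
        if self._event.is_set():
            raise InterruptedError(self._reason)

    def run_tool(self, tool, *args, **kwargs) -> dict:
        # Refuse immediately if an earlier interrupt has not been cleared.
        if self._event.is_set():
            return {"interrupted": True, "reason": self._reason}
        try:
            # The success payload shape is an assumption for this sketch.
            return {"interrupted": False, "result": tool(*args, **kwargs)}
        except InterruptedError:
            return {"interrupted": True, "reason": self._reason}
```

Because the flag is only cleared by reset, every call after an interrupt short-circuits until the reset tool is invoked, which matches the behaviour described above.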
Config Highlights
All options live in config.json. Sections: server, display, recording, mouse, keyboard, interrupt, safety, vision, session, advanced.
Key knobs to tune:
| Key | Effect |
|-----|--------|
| interrupt.mouse_movement_threshold_px | Lower = more sensitive (default 35 px) |
| interrupt.mouse_movement_grace_ms | Time window where programmatic moves don't trigger interrupt |
| mouse.move_duration_seconds | Slower = more human-looking |
| mouse.human_like_movement | Adds easing + jitter |
| display.screenshot_scale_max_edge_px | 1568 is LLM-safe; higher = sharper but more tokens |
| recording.fps | 5 is a good balance of quality vs. memory |
| safety.max_actions_per_minute | Rate limiter |
| safety.max_session_duration_minutes | Hard session cap |
| safety.blocked_regions | List of [x, y, w, h] rectangles the bot can't click |
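Two of the safety knobs above are easy to picture in code. The sketch below assumes blocked regions are [x, y, w, h] rectangles with an exclusive right and bottom edge, and that the rate limit is a sliding 60-second window; both are assumptions about semantics, and the names are hypothetical.

```python
import time
from collections import deque


def in_blocked_region(x: int, y: int, blocked_regions) -> bool:
    """True if (x, y) falls inside any [x, y, w, h] rectangle."""
    return any(rx <= x < rx + w and ry <= y < ry + h
               for rx, ry, w, h in blocked_regions)


class ActionRateLimiter:
    """Sliding-window limiter for safety.max_actions_per_minute."""

    def __init__(self, max_actions_per_minute: int, clock=time.monotonic) -> None:
        self.max_actions = max_actions_per_minute
        self._clock = clock       # injectable so the logic can be tested
        self._stamps = deque()    # timestamps of recently allowed actions

    def allow(self) -> bool:
        now = self._clock()
        # Forget actions that are at least 60 seconds old.
        while self._stamps and now - self._stamps[0] >= 60.0:
            self._stamps.popleft()
        if len(self._stamps) >= self.max_actions:
            return False
        self._stamps.append(now)
        return True
```

A click tool would typically call both checks before acting and refuse the action if either one fails.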
Environment variable Z_COMPUTERUSE_CONFIG overrides the default config path.
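The override can be pictured as a simple lookup with a fallback. In this sketch the fallback of config.json in the working directory is an assumption (the README does not state the default location), and the function names are hypothetical.

```python
import json
import os
from pathlib import Path

# Assumed fallback; the project's real default path is not documented here.
DEFAULT_CONFIG_PATH = Path("config.json")


def resolve_config_path(env=None) -> Path:
    """Prefer Z_COMPUTERUSE_CONFIG when set, otherwise use the fallback."""
    env = os.environ if env is None else env
    override = env.get("Z_COMPUTERUSE_CONFIG")
    return Path(override).expanduser() if override else DEFAULT_CONFIG_PATH


def load_section(path: Path, section: str) -> dict:
    """Read one top-level section (for example 'interrupt') from the config file."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh).get(section, {})
```

Calling resolve_config_path() with no argument reads the real process environment; passing a dict makes the lookup easy to test.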
Running Standalone
For testing outside Windsurf:
python -m z_computeruse
It listens on stdio. Pair it with MCP Inspector:
npx -y @modelcontextprotocol/inspector python -m z_computeruse
Requirements
- Python 3.11+
- Windows 10/11, macOS 12+, or Linux with X11
- mss, pyautogui, pynput (installed automatically)
- Optional: Tesseract OCR binary (for find_text_on_screen); ffmpeg is bundled automatically via imageio-ffmpeg
Safety & Disclaimer
This tool gives an AI direct, unchecked control of your mouse, keyboard, and screen. It is designed for power users who understand the risks. Always:
- Keep one hand near the mouse / Escape key
- Start with safety.max_actions_per_minute low until you trust the agent
- Review recordings after every session
- Never run on a machine with sensitive credentials exposed to the agent
License
MIT — see LICENSE.