Local image analysis MCP server for Apple Silicon using Qwen3.5-4B — run vision AI entirely on-device with Claude Code, Codex, and OpenCode
Local Vision MCP Server
Local image analysis on Apple Silicon using Qwen3.5-4B — a natively multimodal vision-language model running via mlx-vlm. No external ML runtimes (Ollama, LM Studio, llama.cpp) required.
Requirements
- macOS with Apple Silicon (M1/M2/M3/M4)
- Python 3.10+
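To confirm a machine meets these requirements before running setup, a small check along the following lines may help. This is a minimal sketch, not part of the project: `check_requirements` is a hypothetical helper, and it assumes that `platform.machine()` reporting `arm64` is an adequate test for Apple Silicon (a Python interpreter running under Rosetta would report `x86_64` instead).

```python
import platform
import sys


def check_requirements(system, machine, version):
    """Return a list of unmet requirements (empty when everything is fine)."""
    problems = []
    if system != "Darwin":
        problems.append("macOS is required")
    if machine != "arm64":
        problems.append("Apple Silicon (arm64) is required")
    if tuple(version[:2]) < (3, 10):
        problems.append("Python 3.10+ is required")
    return problems


if __name__ == "__main__":
    # Inspect the interpreter and machine this script is running on.
    issues = check_requirements(
        platform.system(), platform.machine(), sys.version_info
    )
    print("OK" if not issues else "\n".join(issues))
```

Passing the values in as arguments keeps the check easy to exercise on any machine, not only on a Mac.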
Setup
chmod +x setup.sh && ./setup.sh
This creates a .venv/ directory and installs all dependencies.
MCP Configuration
opencode
Add to opencode.json under the mcp key:
"local-vision": {
"type": "local",
"command": ["/path/to/.venv/bin/python", "/path/to/server.py"]
}
Replace /path/to/ with the absolute path to this project directory.
Claude Desktop
Add to ~/Library/Application Support/Claude/claude_desktop_config.json:
{
"mcpServers": {
"local-vision": {
"command": "/path/to/.venv/bin/python",
"args": ["/path/to/server.py"]
}
}
}
Replace /path/to/ with the absolute path to this project directory.
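Substituting the paths by hand is error-prone, so the two entries above can also be generated. The sketch below is illustrative only: `build_entries` is a hypothetical helper, and it deliberately uses an absolute path for server.py as well as for the interpreter, on the assumption that the MCP client may not launch the server from the project directory.

```python
import json
from pathlib import Path


def build_entries(project_dir):
    """Build the opencode and Claude Desktop config fragments for a project."""
    root = Path(project_dir)
    python = str(root / ".venv" / "bin" / "python")
    server = str(root / "server.py")
    # opencode expects the interpreter and script together in one command list.
    opencode = {"local-vision": {"type": "local", "command": [python, server]}}
    # Claude Desktop splits them into "command" and "args".
    claude = {"mcpServers": {"local-vision": {"command": python, "args": [server]}}}
    return opencode, claude


if __name__ == "__main__":
    # Run from the project directory so Path.cwd() is the absolute project path.
    opencode_entry, claude_entry = build_entries(Path.cwd())
    print(json.dumps(opencode_entry, indent=2))
    print(json.dumps(claude_entry, indent=2))
```

The printed fragments can be pasted into opencode.json (under the mcp key) and claude_desktop_config.json respectively.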
Available Tools
analyze_image(image_path: str, prompt: str)
Analyze an image file using the local vision model.
- image_path: Absolute path to the image file (JPEG, PNG, WebP)
- prompt: What to ask about the image (default: "Describe what you see in this image.")
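The parameter constraints above can be checked before a call is made. The following is a rough sketch rather than the server's actual validation: `build_analyze_image_args` and `SUPPORTED_SUFFIXES` are hypothetical names, and the mapping from the listed formats to file extensions (`.jpg`, `.jpeg`, `.png`, `.webp`) is an assumption.

```python
from pathlib import Path

# Assumed extensions for the documented formats (JPEG, PNG, WebP).
SUPPORTED_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}
DEFAULT_PROMPT = "Describe what you see in this image."


def build_analyze_image_args(image_path, prompt=DEFAULT_PROMPT):
    """Validate the inputs and return the argument dict for analyze_image."""
    path = Path(image_path)
    if not path.is_absolute():
        raise ValueError(f"image_path must be absolute: {image_path}")
    if path.suffix.lower() not in SUPPORTED_SUFFIXES:
        raise ValueError(f"unsupported image type: {path.suffix or '(none)'}")
    return {"image_path": str(path), "prompt": prompt}
```

For example, `build_analyze_image_args("/tmp/cat.png")` yields the path plus the default prompt, while a relative path such as `"cat.png"` raises `ValueError`.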
analyze_screenshot(prompt: str)
Capture the current screen and analyze it.
- prompt: What to ask about the screenshot (default: "Describe what you see on the screen.")
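For orientation, a screen capture on macOS can be taken with the built-in screencapture command, which the Troubleshooting section also refers to. The sketch below shows one plausible way to do that from Python. It is an assumption about the approach, not a copy of server.py, and `screencapture_command` and `capture_screen` are hypothetical names.

```python
import subprocess
import tempfile
from pathlib import Path


def screencapture_command(output_path):
    """Build the screencapture invocation for a silent full-screen capture."""
    # -x suppresses the shutter sound; the last argument is the output file.
    return ["screencapture", "-x", str(output_path)]


def capture_screen():
    """Capture the screen to a temporary PNG and return its path (macOS only)."""
    out = Path(tempfile.mkdtemp()) / "screen.png"
    # check=True raises CalledProcessError if screencapture exits non-zero.
    subprocess.run(screencapture_command(out), check=True)
    return out
```

The resulting file could then be handled like any other image, for example by passing its path to analyze_image together with the prompt.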
Model Info
- Model: mlx-community/Qwen3.5-4B-MLX-4bit
- Download size: ~2.9 GB (4-bit quantized)
- RAM usage: ~3 GB when loaded
- Auto-download: Model downloads from HuggingFace on first use
- Auto-unload: Model is unloaded after 15 minutes of inactivity to free RAM
- Cache location: ~/.cache/huggingface/
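The load-on-first-use and auto-unload behavior described above can be pictured as a holder that records when the model was last used. This is a simplified sketch under stated assumptions: `IdleModelHolder` is a hypothetical class, the real server may use a background timer instead of an explicit check, and actually releasing memory may involve more than dropping a reference.

```python
import time

IDLE_TIMEOUT_SECONDS = 15 * 60  # 15 minutes, as documented above


class IdleModelHolder:
    """Load a model lazily and drop it after a period of inactivity."""

    def __init__(self, loader, timeout=IDLE_TIMEOUT_SECONDS, clock=time.monotonic):
        self._loader = loader    # callable that returns the loaded model
        self._timeout = timeout
        self._clock = clock      # injectable so the logic can be tested
        self._model = None
        self._last_used = None

    @property
    def loaded(self):
        return self._model is not None

    def get(self):
        """Return the model, loading it on first use, and mark it as used."""
        if self._model is None:
            self._model = self._loader()
        self._last_used = self._clock()
        return self._model

    def unload_if_idle(self):
        """Drop the model if it has been idle for the full timeout."""
        if self._model is not None and self._clock() - self._last_used >= self._timeout:
            self._model = None
            return True
        return False
```

A server would call `get()` on each request and `unload_if_idle()` periodically. The next request after an unload pays the model load time again.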
Troubleshooting
| Issue | Solution |
|-------|----------|
| Import errors | Run ./setup.sh to install dependencies |
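| screencapture fails | Grant Screen Recording permission in System Settings > Privacy & Security > Screen Recording (System Preferences > Security & Privacy on macOS 12 and earlier) |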
| Model download fails | Check internet connection; model caches at ~/.cache/huggingface/ |
| Out of memory | Model uses ~3 GB RAM; ensure sufficient free memory |