Apple OCR MCP

中文 (Chinese): README_zh.md

MCP (Model Context Protocol) server that extracts text from images using Apple's native Vision framework. Works entirely offline on macOS — no API keys, no internet, no third-party services.

How It Works

MCP Client (Claude Code, etc.) → JSON-RPC → server.py → ocr binary → Apple Vision → text

server.py — Python MCP server. Receives tool calls via JSON-RPC, validates inputs, delegates to the OCR binary.
ocr — Swift CLI compiled with swiftc. Wraps Apple's VNRecognizeTextRequest API.
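The validate-then-delegate step can be sketched roughly as follows. This is an illustration and not the project's actual server.py: the function names, the `./ocr` default location, the timeout, and especially the argument order passed to the binary are assumptions. The extension list and the zh-Hans default come from the Tool Schema section.

```python
import os
import subprocess

# Extensions listed in the Tool Schema section of this README.
ALLOWED_EXTENSIONS = {".jpg", ".png", ".gif", ".webp", ".bmp", ".tiff"}


def validate_request(path, lang="zh-Hans"):
    """Return a list of problems with the arguments (empty if none found)."""
    problems = []
    if not os.path.isabs(path):
        problems.append("path must be absolute")
    ext = os.path.splitext(path)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append("unsupported extension: " + (ext or "(none)"))
    if not lang or not isinstance(lang, str):
        problems.append("lang must be a non-empty string")
    return problems


def run_ocr(path, lang="zh-Hans", ocr_binary="./ocr", timeout=30):
    """Validate the arguments, then delegate to the OCR binary."""
    problems = validate_request(path, lang)
    if problems:
        raise ValueError("; ".join(problems))
    if not os.path.isfile(path):
        raise FileNotFoundError(path)
    # Hypothetical CLI shape: image path first, language code second.
    # Check ocr.swift for the arguments the real binary expects.
    result = subprocess.run(
        [ocr_binary, path, lang],
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return result.stdout.strip()
```

Rejecting bad input before spawning the process keeps error messages readable for the MCP client and avoids launching the binary on paths it cannot handle.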

No cloud services, no API keys. Just macOS system frameworks.

Compatibility

| Component | Requirement |
|-----------|-------------|
| macOS | 10.15 (Catalina) or later |
| Arch | Apple Silicon (arm64) native. Intel (x86_64): recompile or use pre-built binary from Releases |
| Python | 3.8+ (stdlib only, no pip deps) |
| Swift | 5.0+ (build only, not runtime) |

Quick Start

1. Clone & Install

git clone https://github.com/kains2866/apple-ocr-mcp.git
cd apple-ocr-mcp
./install.sh --download

This downloads the pre-built ocr binary for your Mac's architecture. No Xcode or compiler needed.

Prefer to build from source? Just run ./install.sh (requires Xcode Command Line Tools for swiftc).

2. Use It

The project includes a .mcp.json file. Claude Code auto-discovers MCP servers from the project root — just open this directory and start asking:

"Read the text from /path/to/image.jpg"

No manual config needed for project-local use.

3. (Optional) Global Install

To make apple-ocr available in all projects, not just this one:

./install.sh --global

Then add to ~/.claude/settings.local.json:

{
  "enabledMcpjsonServers": ["apple-ocr"]
}

Tool Schema

| Tool | read_image_text |
|------|-------------------|
| path (required) | Absolute path to image (.jpg, .png, .gif, .webp, .bmp, .tiff) |
| lang (optional) | Language code: zh-Hans (default), en, ja, ko, etc. |
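For readers wiring up their own client, a call to this tool might look like the JSON-RPC message built below. It is a sketch that assumes the standard MCP `tools/call` method and a session that has already completed the MCP initialization handshake. The helper name `build_tool_call` and the request id are illustrative.

```python
import json


def build_tool_call(path, lang=None, request_id=1):
    """Build a JSON-RPC 2.0 request that invokes read_image_text."""
    arguments = {"path": path}
    if lang is not None:
        # lang is optional; the server falls back to zh-Hans when omitted.
        arguments["lang"] = lang
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "read_image_text", "arguments": arguments},
    }


# Serialize one request as a single line of JSON.
print(json.dumps(build_tool_call("/path/to/image.jpg", lang="en")))
```

Most users never write this by hand, because the MCP client generates it from a natural-language prompt.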

Configuration for Other MCP Clients

Claude Desktop

Add to ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "apple-ocr": {
      "command": "python3",
      "args": ["/path/to/apple-ocr-mcp/server.py"]
    }
  }
}

Continue (VS Code / JetBrains)

Add to ~/.continue/config.json:

{
  "experimental": {
    "mcpServers": {
      "apple-ocr": {
        "command": "python3",
        "args": ["/path/to/apple-ocr-mcp/server.py"]
      }
    }
  }
}

Cursor

Add to .cursor/mcp.json in your project:

{
  "mcpServers": {
    "apple-ocr": {
      "command": "python3",
      "args": ["/path/to/apple-ocr-mcp/server.py"]
    }
  }
}
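If a client's config file already lists other servers, the entry can be merged in rather than pasted over. The helper below is a minimal sketch, assuming the `mcpServers` shape shown for Claude Desktop and Cursor. `add_apple_ocr` is a hypothetical name, and Continue nests the same block under `experimental`, so it would need one more level of lookup.

```python
import json
import os


def add_apple_ocr(config, server_py):
    """Return a copy of config with an apple-ocr entry under mcpServers."""
    if not os.path.isabs(server_py):
        raise ValueError("server.py path must be absolute")
    merged = dict(config)
    # Copy the existing server map so other entries are preserved.
    servers = dict(merged.get("mcpServers", {}))
    servers["apple-ocr"] = {"command": "python3", "args": [server_py]}
    merged["mcpServers"] = servers
    return merged


existing = {"mcpServers": {"other": {"command": "node", "args": ["other.js"]}}}
updated = add_apple_ocr(existing, "/path/to/apple-ocr-mcp/server.py")
print(json.dumps(updated, indent=2))
```

The function returns a new dictionary and leaves the input untouched, so the caller decides when to write the result back to disk.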

Project Structure

apple-ocr-mcp/
├── README.md
├── README_zh.md
├── LICENSE
├── .mcp.json           # Claude Code auto-discovery config
├── install.sh          # One-command setup (--download or compile)
├── build.sh            # Build release binaries for both archs
├── server.py           # MCP server (Python, stdlib only)
└── ocr.swift           # OCR source (compile with swiftc)

Runtime (after ./install.sh):

├── ocr                 # Compiled binary (gitignored)
├── server.py
└── .mcp.json

Manual Build (without install.sh)

swiftc -o ocr ocr.swift
chmod +x ocr

Then use the .mcp.json already in the repo, or deploy globally:

mkdir -p ~/.claude/mcp-servers/apple-ocr
cp server.py ocr .mcp.json ~/.claude/mcp-servers/apple-ocr/

Creating a Release

For maintainers — build both architectures and publish:

./build.sh

This produces release/ocr-arm64 and release/ocr-x86_64. Create a GitHub Release, tag it (e.g. v1.0.0), and upload both binaries. Users can then run ./install.sh --download to get the right one automatically.
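The architecture selection can be pictured with the sketch below. It is not taken from install.sh. It only assumes that the release assets keep the names ocr-arm64 and ocr-x86_64 produced by build.sh, and `asset_for` is a hypothetical helper.

```python
import platform

# Asset names as produced by build.sh.
RELEASE_ASSETS = {"arm64": "ocr-arm64", "x86_64": "ocr-x86_64"}


def asset_for(machine=None):
    """Map a machine string (as reported by platform.machine()) to an asset."""
    machine = machine or platform.machine()
    try:
        return RELEASE_ASSETS[machine]
    except KeyError:
        raise RuntimeError("no pre-built binary for architecture: " + machine)


print(asset_for("arm64"))
```

A Python interpreter running under Rosetta 2 reports x86_64 on Apple Silicon hardware, so a check like this selects the Intel binary in that case.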

FAQ

Does this require an API key?

No. This project uses Apple Vision — a macOS system framework. It works entirely offline with zero API dependencies.

Will this work on Intel Macs?

Yes. Use ./install.sh --download for the x86_64 binary, or compile from source with swiftc.

What if OCR returns nothing?

Apple Vision needs reasonably clear text. Handwriting, stylized fonts, or low-contrast text may not be recognized. Try increasing image resolution or contrast.
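One way to try a higher-resolution copy without extra dependencies is the `sips` tool that ships with macOS. The sketch below builds and runs a `sips --resampleWidth` command. The helper names and the 3000-pixel default are illustrative choices, not values recommended by this project.

```python
import subprocess


def upscale_command(src, dst, width=3000):
    """Build a sips command that resamples src to the given width."""
    if width <= 0:
        raise ValueError("width must be positive")
    # --resampleWidth keeps the aspect ratio; --out writes to a new file
    # so the original image is left untouched.
    return ["sips", "--resampleWidth", str(width), src, "--out", dst]


def upscale(src, dst, width=3000):
    """Run sips (macOS only) and return the path of the resampled copy."""
    subprocess.run(upscale_command(src, dst, width), check=True, capture_output=True)
    return dst
```

After calling `upscale("/path/to/image.jpg", "/path/to/image-large.jpg")`, pass the new absolute path to read_image_text. Resampling does not recover detail that was never captured, so results will vary.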

Can I use this with other MCP clients?

Yes. This is a standard MCP server — any client supporting MCP tools can use it, provided the app has permission to execute the ocr binary. See configuration examples above.

Why Swift instead of a Python OCR library?

Apple Vision is faster, more accurate for CJK text, and requires no pip dependencies. The Swift binary is ~70KB and calls the OS directly.

License

MIT — see LICENSE
