MCP server that lets Claude Code delegate tasks to MiMo for dual-model collaboration
CC-mcp-mimo-bridge
An MCP server that bridges Claude Code and MiMo v2.5 Pro for dual-model collaboration: Claude orchestrates, MiMo implements.
Claude breaks a task into self-contained briefs and delegates them to MiMo through this server. MiMo executes — single-turn, or autonomously across up to 20 agent-loop turns with file/grep/write tools — and returns a structured result. Claude reviews, retries with feedback if needed, and writes the final output to disk.
This is the same pattern Anthropic-internal "subagent" tooling uses, applied to a different model on the other end.
Why this exists
Claude Code is good at decision-making, code review, and long-horizon planning. MiMo v2.5 Pro is cheap and fast at code generation. Pairing them through MCP gives:
- Token-budget offload: Claude no longer spends output tokens on bulk code generation. It writes the brief, MiMo writes the code.
- Two-model review: Claude reviews MiMo's output before any disk write. Bad output is rejected with feedback and retried.
- Auditable: Every agent-loop call dumps a full JSON trace (prompts, tool calls, results, token counts) to
.traces/for inspection.
The BRIEF.md and MIMO_HANDOFF.md files in this repo are the original specs handed to MiMo to implement this very server — kept here as a record of the brief-driven development flow.
Architecture
Claude Code
│
│ delegate_to_mimo(brief, …) single-turn call
│ agent_delegate_to_mimo(brief, …) multi-turn agent loop
│ write_files([{path, content}, …]) batch disk write
▼
MCP server (this repo)
│
│ Anthropic-compatible API
▼
MiMo v2.5 Pro
Inside agent_delegate_to_mimo, MiMo gets its own toolbelt:
| Internal tool | What it does |
|---------------|--------------|
| read_file | Read a UTF-8 file |
| list_dir | List directory entries |
| grep | Recursive regex search, capped at 100 matches |
| write_file | Write a file (auto-backup on first overwrite per loop) |
The loop runs up to max_turns (default 20). On overflow, MiMo is asked to summarize what it did and what's unfinished. Every loop writes a trace.json to <working_dir>/.traces/<timestamp>_<slug>/ plus per-file backups under backups/.
Repo layout
.
├── BRIEF.md # Original spec handed to MiMo (Chinese)
├── MIMO_HANDOFF.md # Final delivery instructions to MiMo (Chinese)
├── MCP-mimo/ # Implementation
│ ├── mcp_mimo_server.py # Server (3 tools)
│ ├── smoke_test.py # Standalone smoke test
│ ├── requirements.txt
│ └── .gitignore
├── .gitignore
└── README.md # this file
Install
git clone https://github.com/<your-account>/CC-mcp-mimo-bridge.git
cd CC-mcp-mimo-bridge/MCP-mimo
pip install -r requirements.txt
Requires Python 3.10+.
Configure
1. Environment variables
| Variable | Required | Default | Description |
|--------------------|----------|------------------|----------------------------------------------|
| MIMO_BASE_URL | yes | — | MiMo Anthropic-compatible endpoint URL |
| MIMO_API_KEY | yes | — | MiMo API key |
| MIMO_MODEL | no | mimo-v2.5-pro | Model ID |
| MIMO_TIMEOUT_SEC | no | 120 | Request timeout in seconds |
| MIMO_MAX_TOKENS | no | 8192 | Max output tokens per request |
2. Register in Claude Code
Add to your ~/.claude.json under mcpServers:
{
"mcpServers": {
"mimo-bridge": {
"command": "python",
"args": ["/absolute/path/to/CC-mcp-mimo-bridge/MCP-mimo/mcp_mimo_server.py"],
"env": {
"MIMO_BASE_URL": "https://your-mimo-endpoint/anthropic",
"MIMO_API_KEY": "<YOUR_API_KEY>",
"MIMO_MODEL": "mimo-v2.5-pro",
"MIMO_TIMEOUT_SEC": "120",
"MIMO_MAX_TOKENS": "8192"
}
}
}
}
Or via CLI:
claude mcp add mimo-bridge \
-- python /absolute/path/to/CC-mcp-mimo-bridge/MCP-mimo/mcp_mimo_server.py
(then add the env block manually).
3. Smoke test
cd MCP-mimo
export MIMO_BASE_URL="https://your-mimo-endpoint/anthropic"
export MIMO_API_KEY="your_key_here"
python smoke_test.py
Expected:
[1/5] Basic delegate_to_mimo call ... PASS
[2/5] With context_files ... PASS
[3/5] With retry_context ... PASS
[4/5] SQLite usage_log has entries ... PASS
[5/5] Structured error when config missing ... PASS
All 5 smoke tests passed.
Tool reference
delegate_to_mimo
Single-turn call. Conversation history is accumulated in-process and persists for the lifetime of the MCP server.
| Parameter | Type | Required | Description |
|-------------------|--------|----------|--------------------------------------------------------------------------|
| brief | string | yes | Self-contained task brief (MiMo can't see Claude's history) |
| expected_output | enum | yes | patch / full_file / code_snippet / structured_json |
| context_files | array | no | [{path, content?}] — server reads from disk if content is omitted |
| constraints | string | no | Hard constraints (e.g. "pure C99, no new deps") |
| retry_context | object | no | {previous_output, review_feedback} for retry loop |
Returns:
{
"status": "ok | failed | timeout",
"output": "...",
"format": "code_snippet",
"tokens_used": {"input": 1234, "output": 567},
"model": "mimo-v2.5-pro",
"error": ""
}
agent_delegate_to_mimo
Multi-turn agent loop. MiMo autonomously calls read_file / list_dir / grep / write_file until it produces a final answer or hits max_turns.
| Parameter | Type | Required | Default | Description |
|-------------------|---------|----------|---------|----------------------------------------------------------|
| brief | string | yes | — | Self-contained task brief |
| expected_output | string | yes | — | Output format |
| working_dir | string | yes | — | Working directory; trace and backups land here |
| context_files | array | no | — | Same shape as above |
| constraints | string | no | — | Hard constraints |
| max_turns | int | no | 20 | Hard cap on loop iterations |
| allow_write | bool | no | true | When false, MiMo gets read-only tools |
Returns:
{
"status": "ok | max_turns_exceeded | failed",
"final_output": "...",
"trace_path": "<working_dir>/.traces/<timestamp>_<slug>/trace.json",
"files_written": ["..."],
"files_backed_up": [{"path": "...", "backup": "..."}],
"turns_used": 7,
"tokens_used": {"input": 12345, "output": 6789},
"model": "mimo-v2.5-pro",
"error": ""
}
write_files
Batch file write helper. Independent of MiMo — useful when Claude already has the content and just wants to commit it through one tool call.
| Parameter | Type | Required | Description |
|-----------|-------|----------|------------------------------------------------------|
| files | array | yes | [{path, content}, …] — UTF-8, parents auto-created |
Returns: {"written": [...], "failed": [{"path": ..., "error": ...}, ...]}. A single failure does not abort the rest.
Token-usage tracking
Every call (single-turn and agent-loop) is appended to a local SQLite database:
- Linux/macOS:
~/.euterpe/mcp_mimo_usage.db - Windows:
%APPDATA%\Euterpe\mcp_mimo_usage.db
Schema:
usage_log(id, timestamp, brief_hash, input_tokens, output_tokens, model, status)
brief_hash is the first 16 hex chars of sha256(brief) — enough to deduplicate without leaking content.
Troubleshooting
| Symptom | Likely cause | Fix |
|-------------------------------------------|-------------------------------|-----------------------------------------------------------|
| MIMO_BASE_URL or MIMO_API_KEY not set | Env vars not exported | Set them in ~/.claude.json env block or shell |
| auth failed, check MIMO_API_KEY | Invalid or expired API key | Regenerate at the MiMo dashboard |
| connection timeout after Ns | Network/endpoint unreachable | Verify MIMO_BASE_URL, raise MIMO_TIMEOUT_SEC |
| rate limited, retry later | Too many requests | Wait; retry-after header is surfaced in the error |
| upstream error: 5xx | MiMo server issue | Retry; check MiMo status |
| prompt too large: ~Nk tokens | Brief + context > ~32k tokens | Trim or split; for big workloads use agent_delegate_to_mimo (it reads files lazily) |
All errors are returned as structured dicts — the server never raises across the MCP boundary.
Design boundaries
What this server does:
- Protocol-layer bridge between Claude Code and MiMo (Anthropic-compatible API).
- Multi-turn agent loop with file/grep/write tools.
- Local token-usage logging.
- Trace dump and auto-backup for every agent-loop call.
What it does not:
- No semantic judgment of MiMo's output (Claude reviews).
- No concurrent dispatch (one task at a time, by design — keeps tracing simple).
- No result caching (every call hits MiMo).
- No prompt-cache optimization.
- No
query_usagetool yet (next version).
License
MIT — see LICENSE if present, otherwise treat the MIT terms as the default.