MCP Servers

A comprehensive directory of Model Context Protocol servers, frameworks, SDKs, and templates.

Turbo Memory MCP
By @hodorii

TurboQuant-based compressed vector memory MCP server — ~10x compression with unbiased inner product estimation

Created 4/16/2026
Updated about 4 hours ago
Repository documentation and setup instructions

turbo-memory-mcp

Compressed vector memory MCP server based on TurboQuant (ICLR 2026).

Implements Google Research's TurboQuant algorithm to compress embedding vectors ~10x while preserving inner product estimation accuracy for unbiased similarity search.

한국어 (Korean)

Features

  • Two-stage pipeline: Random rotation + Lloyd-Max quantization (b-1 bits) → QJL residual correction (1 bit)
  • Unbiased inner product: QJL correction eliminates similarity search bias
  • Direct compressed search: Inner product estimated without dequantization
  • Local embeddings: all-MiniLM-L6-v2 (sentence-transformers), no API key required
  • Multi-client safe: SQLite WAL mode for concurrent access from multiple MCP clients
  • MCP standard: Works with Gemini CLI, Kiro, Antigravity, opencode and any MCP client
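The multi-client claim above rests on SQLite's write-ahead log, which lets readers proceed while another connection holds an open write transaction. The following is a minimal sketch of that behavior using only the standard library. The `memories` table and its columns are hypothetical and are not taken from `memory_store.py`.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_store(path):
    """Open a connection and switch the database to WAL journaling."""
    conn = sqlite3.connect(path, timeout=5.0)
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    if mode.lower() != "wal":
        raise RuntimeError(f"WAL not available, journal mode is {mode!r}")
    # Hypothetical schema, for illustration only.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS memories ("
        "id INTEGER PRIMARY KEY, text TEXT NOT NULL, code BLOB NOT NULL)"
    )
    conn.commit()
    return conn


def count(conn):
    return conn.execute("SELECT COUNT(*) FROM memories").fetchall()[0][0]


def demo():
    """Read from one connection while another has an uncommitted insert."""
    with tempfile.TemporaryDirectory() as tmp:
        db = str(Path(tmp) / "memory.db")
        writer, reader = open_store(db), open_store(db)
        writer.execute(
            "INSERT INTO memories (text, code) VALUES (?, ?)",
            ("hello", b"\x00\x01"),
        )
        before = count(reader)   # reader is not blocked by the open write
        writer.commit()
        after = count(reader)    # the committed row is now visible
        writer.close()
        reader.close()
        return before, after
```

In this sketch the reader sees the pre-insert snapshot until the writer commits. WAL still allows only one writer at a time, so concurrent writers rely on the connection `timeout` to wait for the lock.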

Compression

| bits | Ratio (vs FP32) | MSE distortion |
|------|-----------------|----------------|
| 2    | ~14x            | 0.117          |
| 3    | ~10x            | 0.030          |
| 4    | ~7x             | 0.009          |

At 3 bits, the paper reports a LongBench score of 50.06, matching FP16.
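The ratios in the table can be approximated with simple arithmetic. The sketch below assumes 384-dimensional vectors (the output size of all-MiniLM-L6-v2) and a hypothetical 64 bits of per-vector overhead, for example two float32 scalars such as a vector norm and a residual norm. The actual storage layout is not stated here, so treat `overhead_bits` as a guess.

```python
def compression_ratio(dim, bits, overhead_bits=64):
    """FP32 size divided by compressed size for one vector.

    overhead_bits is an assumed per-vector cost for stored scalars;
    it is not taken from the project's code.
    """
    fp32_bits = 32 * dim
    compressed_bits = bits * dim + overhead_bits
    return fp32_bits / compressed_bits


if __name__ == "__main__":
    for b in (2, 3, 4):
        print(b, round(compression_ratio(384, b), 2))
```

Under these assumptions the ratios come out at roughly 14.8, 10.1, and 7.7, in the same range as the rounded figures in the table. With zero overhead the ratio is simply 32 / bits.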

File Structure

turbo_quant.py    — TurboQuant core (quantization + compressed inner product)
memory_store.py   — Compressed vector store (SQLite WAL)
server.py         — MCP server (JSON-RPC over stdio or HTTP)
pyproject.toml    — Package definition for uvx

Installation

# No venv needed — uvx handles isolation automatically
uvx --from git+https://github.com/hodorii/turbo-memory-mcp turbo-memory-mcp

Or clone for local use:

git clone https://github.com/hodorii/turbo-memory-mcp
cd turbo-memory-mcp
make install
make register   # registers in all supported MCP clients

MCP Registration

stdio (default)

"memory": {
  "command": "uvx",
  "args": ["--from", "/path/to/turbo-memory-mcp", "turbo-memory-mcp"]
}

Supported config paths:

  • Kiro: ~/.kiro/settings/mcp.json
  • Gemini CLI: ~/.gemini/settings.json
  • Antigravity: ~/.gemini/antigravity/mcp_config.json
  • opencode: ~/.config/opencode/opencode.json
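If you prefer to edit a config file from a script instead of by hand, a small helper can merge the entry without touching other servers. This is a sketch, not what `make register` does. It assumes the file stores servers under a top-level `mcpServers` key, as in the Cursor example at the end of this page. Some clients use a different key or schema, so check the client's documentation before pointing it at a real file.

```python
import json
from pathlib import Path


def register_server(config_path, name, entry, servers_key="mcpServers"):
    """Add or replace one server entry in a JSON config file.

    servers_key is an assumption; pass the key your client actually uses.
    """
    path = Path(config_path)
    config = json.loads(path.read_text()) if path.exists() else {}
    config.setdefault(servers_key, {})[name] = entry
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# The stdio entry shown above, expressed as a Python dict.
STDIO_ENTRY = {
    "command": "uvx",
    "args": ["--from", "/path/to/turbo-memory-mcp", "turbo-memory-mcp"],
}
```

For example, `register_server(Path.home() / ".kiro/settings/mcp.json", "memory", STDIO_ENTRY)` would write the entry under the assumed key and leave any existing servers in place.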

HTTP (for multi-client / sub-agent parallel access)

make serve          # default port 8765
make serve PORT=9000

"memory": {
  "type": "http",
  "url": "http://127.0.0.1:8765"
}

MCP Tools

| Tool | Description |
|------|-------------|
| remember(text) | Embed and store a memory (compressed) |
| remember(texts=[...]) | Batch store (single encode pass, faster) |
| recall(query, top_k?) | Search similar memories |
| forget(id) | Delete a memory |
| memory_stats() | Show compression statistics |
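MCP clients invoke these tools with JSON-RPC 2.0 `tools/call` requests. The sketch below builds such requests by hand, which can be useful when debugging the server over stdio. The argument names (`text`, `texts`, `query`, `top_k`, `id`) are read off the table above; the exact input schema is an assumption, and a `tools/list` request to the running server is the way to confirm it.

```python
import json


def tool_call(request_id, name, arguments):
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


if __name__ == "__main__":
    requests = [
        tool_call(1, "remember", {"text": "The deploy script lives in ops/"}),
        tool_call(2, "recall", {"query": "where is the deploy script", "top_k": 3}),
        tool_call(3, "memory_stats", {}),
    ]
    for req in requests:
        # One JSON object per line, as a stdio client would typically send.
        print(json.dumps(req))
```

A real session also needs the MCP `initialize` handshake before any tool call, which this sketch leaves out.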

How TurboQuant Works

input vector x
    │
    ▼ Stage 1 (b-1 bits): random rotation Π·x → Beta distribution → Lloyd-Max scalar quantization
    │
    ▼ Stage 2 (1 bit): residual r = x - x̂  →  sign(S·r)  [QJL, unbiased]

Search uses compressed representation directly:

score ≈ centroids[idx] · (Π·query) · norm        # Stage 1
      + (√(π/2) / d) · r_norm · sign(S·query) · qjl  # Stage 2 QJL correction
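The two stages and the compressed score can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the code in `turbo_quant.py`: the codebook is fitted empirically with Lloyd iterations on coordinates of random unit vectors instead of using the closed-form Beta distribution, the residual is taken in the rotated space, indices are not bit-packed, and the correction uses the unquantized projection `S @ q_rot` as in the standard QJL estimator, whereas the formula above writes `sign(S·query)`. All function names are hypothetical.

```python
import numpy as np


def random_rotation(d, rng):
    """Random orthogonal matrix from the QR decomposition of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.standard_normal((d, d)))
    return q * np.sign(np.diag(r))


def lloyd_max(samples, levels, iters=50):
    """1-D Lloyd iterations: nearest-centroid assignment, then mean update."""
    centroids = np.quantile(samples, (np.arange(levels) + 0.5) / levels)
    for _ in range(iters):
        edges = (centroids[:-1] + centroids[1:]) / 2
        idx = np.searchsorted(edges, samples)
        for k in range(levels):
            members = samples[idx == k]
            if members.size:
                centroids[k] = members.mean()
    return centroids


def quantize(x, rot, S, centroids):
    """Stage 1: rotate and scalar-quantize. Stage 2: 1-bit sketch of the residual."""
    norm = np.linalg.norm(x)
    y = rot @ (x / norm)
    edges = (centroids[:-1] + centroids[1:]) / 2
    idx = np.searchsorted(edges, y)          # (b-1)-bit index per coordinate
    residual = y - centroids[idx]            # residual in the rotated space
    qjl = np.sign(S @ residual)              # 1 bit per projection
    return idx, qjl, norm, np.linalg.norm(residual)


def estimate_ip(code, query, rot, S, centroids):
    """Estimate <x, query> from the compressed code without reconstructing x."""
    idx, qjl, norm, r_norm = code
    q_rot = rot @ query
    stage1 = centroids[idx] @ q_rot
    stage2 = np.sqrt(np.pi / 2) / S.shape[0] * r_norm * ((S @ q_rot) @ qjl)
    return norm * (stage1 + stage2)


def demo(d=384, bits=3, seed=0):
    rng = np.random.default_rng(seed)
    rot = random_rotation(d, rng)
    S = rng.standard_normal((d, d))          # Gaussian projection for QJL
    unit = rng.standard_normal((200, d))
    unit /= np.linalg.norm(unit, axis=1, keepdims=True)
    centroids = lloyd_max(unit.ravel(), levels=2 ** (bits - 1))
    x = rng.standard_normal(d)
    query = x + 0.5 * rng.standard_normal(d)
    code = quantize(x, rot, S, centroids)
    return float(x @ query), float(estimate_ip(code, query, rot, S, centroids))
```

Calling `demo()` returns the exact inner product alongside the compressed estimate. Stage 1 alone is biased toward the codebook, and the QJL term is what removes that bias in expectation over `S`, at the cost of some added variance.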

Theoretical guarantee: MSE distortion ≤ (√3·π/2) · (1/4^b) — within 2.7x of information-theoretic lower bound.
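As a quick sanity check, the bound can be evaluated and compared with the MSE values from the Compression table. The sketch below only does that arithmetic. The reported values are copied from the table, and the function name is hypothetical.

```python
import math


def mse_upper_bound(bits):
    """Evaluate (sqrt(3) * pi / 2) * (1 / 4**bits)."""
    return math.sqrt(3) * math.pi / 2 / 4 ** bits


# MSE distortion values from the Compression table above.
REPORTED_MSE = {2: 0.117, 3: 0.030, 4: 0.009}

if __name__ == "__main__":
    for b, reported in REPORTED_MSE.items():
        bound = mse_upper_bound(b)
        print(f"b={b}: reported {reported:.3f} <= bound {bound:.4f}: {reported <= bound}")
```

The constant √3·π/2 is about 2.72, which matches the quoted 2.7x factor if the lower bound is taken to be 1/4^b. Each reported value sits below the bound for its bit width.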


Quick Setup
Installation guide for this server

Install the package (if needed)

uvx turbo-memory-mcp

Cursor configuration (mcp.json)

{
  "mcpServers": {
    "hodorii-turbo-memory-mcp": {
      "command": "uvx",
      "args": ["turbo-memory-mcp"]
    }
  }
}