Dynamic MCP - Semantic tool discovery for Model Context Protocol
DMCP - Dynamic Model Context Protocol
Semantic tool discovery for MCP - Solves the "too many tools" problem by making tool discovery query-driven with vector search.
🎬 Inspiration & Credits
This project was inspired by:
- 📺 MCP Tool Overload Problem - YouTube video explaining the challenge
- 📝 From Reasoning to Retrieval: Solving the MCP Tool Overload Problem - Redis blog post with the vector search solution
🔬 Research Foundation
Implementation based on "Retrieval Models Aren't Tool-Savvy: Benchmarking Tool Retrieval for Large Language Models":
- 📄 Paper: ACL 2025 Findings | DOI
- 🏠 Project: GitHub | Leaderboard
- 🤗 Model: mangopy/ToolRet-trained-e5-large-v2 (1024 dimensions)
- 🎯 Key Insight: General IR models perform poorly on tool retrieval; tool-specific training is essential
- 🏗️ Architecture: E5-large-v2 fine-tuned on 200k+ tool-query pairs with contrastive learning
Citation:
@inproceedings{shi-etal-2025-retrieval,
title={Retrieval Models Aren't Tool-Savvy: Benchmarking Tool Retrieval for Large Language Models},
author={Shi, Zhengliang and Wang, Yuhan and Yan, Lingyong and Ren, Pengjie and Wang, Shuaiqiang and Yin, Dawei and Ren, Zhaochun},
booktitle={Findings of the Association for Computational Linguistics: ACL 2025},
pages={24497--24524},
year={2025},
address={Vienna, Austria},
publisher={Association for Computational Linguistics},
url={https://aclanthology.org/2025.findings-acl.1258}
}
🎯 The Problem
When you aggregate 20+ MCP servers (300+ tools):
- Token explosion: 100,000+ tokens just listing tools
- LLM confusion: Too many choices = poor tool selection
- No filtering: Standard MCP returns ALL tools upfront
✨ The Solution
DMCP uses a two-process architecture with semantic search:
User: "Create a GitHub issue for this bug"
LLM calls: search_tools(query="create GitHub issue")
→ Returns top-30 relevant tools (via semantic vector search)
→ Tools become available for use
LLM calls: github_create_issue(...)
→ Issue created!
Key insight: The LLM discovers tools by asking, not by loading everything upfront.
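The flow above can be sketched as a small in-memory model. This is an illustration under assumptions, not DMCP's actual code: the `ToolSession` class, the `Tool` shape, and the keyword-overlap `score` function are hypothetical stand-ins, and the real server ranks by vector similarity rather than word overlap.

```typescript
// Hypothetical sketch of query-driven tool discovery; all names are illustrative.
interface Tool {
  name: string;
  description: string;
}

// Stand-in relevance score: fraction of query words found in the tool text.
// DMCP ranks by embedding similarity; this only keeps the sketch self-contained.
function score(query: string, tool: Tool): number {
  const words = query.toLowerCase().split(/\s+/).filter((w) => w.length > 0);
  const text = `${tool.name} ${tool.description}`.toLowerCase();
  const hits = words.filter((w) => text.indexOf(w) !== -1).length;
  return words.length === 0 ? 0 : hits / words.length;
}

class ToolSession {
  // Tools the client can currently see; starts with only the meta-tool.
  private visible = new Map<string, Tool>();

  constructor(private catalog: Tool[]) {
    this.visible.set("search_tools", {
      name: "search_tools",
      description: "Find tools relevant to a query",
    });
  }

  listTools(): string[] {
    return Array.from(this.visible.keys());
  }

  // Adds the top-k matching tools to the visible set and returns their names.
  searchTools(query: string, topK: number): string[] {
    const ranked = this.catalog
      .map((tool) => ({ tool, s: score(query, tool) }))
      .filter((r) => r.s > 0)
      .sort((a, b) => b.s - a.s)
      .slice(0, topK);
    for (const r of ranked) this.visible.set(r.tool.name, r.tool);
    return ranked.map((r) => r.tool.name);
  }
}

const session = new ToolSession([
  { name: "github_create_issue", description: "Create a GitHub issue" },
  { name: "k8s_get_pods", description: "List Kubernetes pods" },
]);
const found = session.searchTools("create GitHub issue", 30);
console.log(found, session.listTools());
```

Before the search, `listTools()` would return only `search_tools`; afterwards the matching tool is visible as well, which is the point at which a real server would notify the client that its tool list changed.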
🏗️ Architecture
┌─────────────────────────────────────────────────────────────────────────────┐
│ VS Code / GitHub Copilot │
│ │
│ User: "search for kubernetes tools" │
│ ─────────────────────────────► │
│ search_tools("kubernetes") │
│ │
│ ◄───────────────────────────── │
│ Returns: 15 k8s tools (get_pods, list_deployments, describe_service...) │
└─────────────────────────────────┬───────────────────────────────────────────┘
│ stdio
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ DMCP Server (server/) │
│ │
│ • Exposes 1 meta-tool: search_tools │
│ • Pure vector search (COSINE similarity, HNSW index) │
│ • Sends listChanged notifications when tools discovered │
│ • Forwards tool calls to backend MCP servers via SSE │
└────────────┬──────────────────────────────────────────────────┬─────────────┘
│ │
│ Query embeddings │ Tool calls (SSE)
▼ ▼
┌────────────────────────┐ ┌────────────────────────────┐
│ Redis Stack (VSS) │ │ Agent Gateway │
│ Port: 6380 │ │ Port: 15000 │
│ │ │ │
│ ┌──────────────────┐ │ │ ┌──────────────────────┐ │
│ │ Vector Index │ │ │ │ 20+ MCP Servers │ │
│ │ HNSW + COSINE │ │ │ │ (SSE endpoints) │ │
│ │ 400+ tools │ │ │ └──────────────────────┘ │
│ └──────────────────┘ │ │ │
│ │ │ • GitHub, Jira, Confluence│
└────────────────────────┘ │ • Google Workspace │
▲ │ • Kubernetes, AWS, Azure │
│ │ • Grafana, Datadog │
│ │ • PostgreSQL, and more... │
┌────────────────────────┐ └────────────────────────────┘
│ Infinity Embedding │ ▲
│ Port: 5000 │ │
│ │ │
│ • ToolRet e5-large-v2 │ ┌──────────────────────────────┘
│ • 1024 dimensions │ │
│ • OpenAI-compatible │ │ Fetch config + discover tools
└────────────────────────┘ │
▲ │
│ Generate │
│ embeddings │
│ │
┌─────────────────────────────────────────────────────────────────────────────┐
│ DMCP Indexer (indexer/) │
│ npm run index │
│ │
│ 1. Fetches MCP server config from Agent Gateway (/config_dump) │
│ 2. Connects to servers in parallel (10 concurrent) │
│ 3. Discovers tools from each server │
│ 4. Generates embeddings via Infinity service │
│ 5. Stores tools + vectors in Redis │
└─────────────────────────────────────────────────────────────────────────────┘
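The vector index and the indexer's storage step in the diagram map naturally onto the Redis `FT.CREATE` and `FT.SEARCH` commands. The sketch below only builds the argument lists, so it runs without a Redis connection. The index name, key prefix, and field names are assumptions for illustration and are not taken from DMCP's `redis-vss.ts`.

```typescript
// Hypothetical names: the index name, key prefix, and field names below are
// assumptions for illustration, not taken from DMCP's redis-vss.ts.
const INDEX = "idx:tools";
const PREFIX = "tool:";
const DIM = 1024; // embedding size stated in this README

// Arguments for FT.CREATE: an HNSW vector field using COSINE distance.
// The "6" counts the attribute tokens that follow HNSW.
function createIndexArgs(): string[] {
  return [
    "FT.CREATE", INDEX, "ON", "HASH", "PREFIX", "1", PREFIX,
    "SCHEMA",
    "name", "TEXT",
    "description", "TEXT",
    "embedding", "VECTOR", "HNSW", "6",
    "TYPE", "FLOAT32", "DIM", String(DIM), "DISTANCE_METRIC", "COSINE",
  ];
}

// Redis stores FLOAT32 vectors as raw bytes, so numbers are packed first.
function toFloat32Bytes(vector: number[]): Uint8Array {
  return new Uint8Array(new Float32Array(vector).buffer);
}

// Arguments for a KNN query; the packed query vector is bound to $vec.
function knnSearchArgs(topK: number, blob: Uint8Array): Array<string | Uint8Array> {
  return [
    "FT.SEARCH", INDEX,
    `*=>[KNN ${topK} @embedding $vec AS distance]`,
    "PARAMS", "2", "vec", blob,
    "SORTBY", "distance",
    "RETURN", "3", "name", "description", "distance",
    "DIALECT", "2",
  ];
}

// With COSINE, Redis reports a distance (1 - cosine similarity),
// so a similarity-style score can be recovered like this.
function distanceToSimilarity(distance: number): number {
  return 1 - distance;
}

console.log(createIndexArgs().join(" "));
```

Because Redis returns a distance rather than a similarity, a threshold such as `DMCP_MIN_SCORE` would presumably be compared against `distanceToSimilarity(distance)`; how DMCP actually applies it is not shown in this README.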
📁 Project Structure
dmcp/
├── docker-compose.yml # Infrastructure (Redis VSS + infinity-emb)
├── .env.example # Environment configuration template
│
├── server/ # DMCP Server (TypeScript)
│ └── src/
│ ├── dmcp-server.ts # Runtime server (stdio)
│ ├── redis-vss.ts # Redis vector search
│ └── custom-embedding-provider.ts # Embedding API client
│
├── indexer/ # Standalone Indexer (TypeScript)
│ └── src/
│ ├── index.ts # CLI indexer with parallel discovery
│ ├── redis-vss.ts # Redis vector search
│ └── custom-embedding-provider.ts # Embedding API client
│
├── gateway/ # Agent Gateway Configuration
│ ├── agentgateway # Binary (download from Agent Gateway)
│ ├── config.yaml # Generated config (gitignored)
│ ├── config.yaml.example # Example config structure
│ └── config_parts/ # ⚠️ YOUR PRIVATE CONFIGS (gitignored)
🚀 Quick Start
Prerequisites
- Docker & Docker Compose
- Node.js 18+
- Agent Gateway binary (for running MCP servers)
1. Clone and Setup
git clone https://github.com/yourusername/dmcp.git
cd dmcp
# Configure embedding model (optional - defaults to tool-optimized model)
cp .env.example .env
2. Start Infrastructure
```bash
# Start Redis VSS + Embedding Service
docker-compose up -d

# Verify services are healthy
curl http://localhost:5000/health
# → {"unix": 1703452800.0}

# Test embedding model
curl -X POST http://localhost:5000/embeddings \
  -H "Content-Type: application/json" \
  -d '{"input":"create a GitHub issue","model":"mangopy/ToolRet-trained-e5-large-v2","encoding_format":"float"}' \
  | jq '.data[0].embedding | length'
# → 1024

docker exec mcp-redis-vss redis-cli ping
# → PONG
```
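The embedding request in the curl command can also be issued from TypeScript. The following is a minimal sketch, not the project's `custom-embedding-provider.ts`: the function names are hypothetical, while the URL, path, and body fields are copied from the curl call. It assumes the global `fetch` available in Node.js 18+.

```typescript
// Sketch of an embedding client mirroring the curl call in this section.
// Function names are hypothetical; URL, path, and body fields come from the README.
interface EmbeddingResponse {
  data: { embedding: number[] }[];
}

function buildEmbeddingBody(input: string, model: string): string {
  return JSON.stringify({ input, model, encoding_format: "float" });
}

// Pulls the first vector out of an OpenAI-style response and checks its size.
function firstEmbedding(response: EmbeddingResponse, expectedDim: number): number[] {
  const first = response.data[0];
  if (!first) throw new Error("embedding response contained no data");
  if (first.embedding.length !== expectedDim) {
    throw new Error(`expected ${expectedDim} dimensions, got ${first.embedding.length}`);
  }
  return first.embedding;
}

// Sends the request with the global fetch (Node.js 18+); not called here
// because it needs the embedding service to be running.
async function embedText(input: string, baseUrl = "http://localhost:5000"): Promise<number[]> {
  const res = await fetch(`${baseUrl}/embeddings`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: buildEmbeddingBody(input, "mangopy/ToolRet-trained-e5-large-v2"),
  });
  if (!res.ok) throw new Error(`embedding service returned HTTP ${res.status}`);
  return firstEmbedding((await res.json()) as EmbeddingResponse, 1024);
}
```

With the services up, `embedText("create a GitHub issue")` should resolve to a 1024-element array, matching the `jq` check.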
3. Start Agent Gateway
cd gateway
./start.sh
# Gateway exposes MCP servers on ports 3101-3120
4. Index Tools
cd indexer
npm install
npm run index
# Output:
# ╔════════════════════════════════════════════════════════════════╗
# ║ DMCP Tool Indexer ║
# ╚════════════════════════════════════════════════════════════════╝
# ✔ Connected to Redis at localhost:6380
# ✔ Discovering tools from MCP servers... (parallel, 10 concurrent)
# ...
# ✔ Indexed 429 tools in 45.2s
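The "(parallel, 10 concurrent)" discovery step could be implemented as a bounded worker pool. The sketch below is one common way to do that and is not taken from the indexer source: `mapWithConcurrency` and the `discoverTools` stub are hypothetical, and the stub simulates a network call with a short timer.

```typescript
// Hypothetical helper: runs fn over items with at most `limit` calls in flight.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array<R>(items.length);
  let next = 0; // index of the next unclaimed item, shared by all workers

  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++; // claim an item; no race because JS runs one callback at a time
      results[i] = await fn(items[i]);
    }
  }

  const workerCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results; // same order as items, regardless of completion order
}

// Stand-in for connecting to one MCP server and listing its tools.
async function discoverTools(server: string): Promise<string[]> {
  await new Promise((resolve) => setTimeout(resolve, 5)); // simulated network delay
  return [`${server}_tool_a`, `${server}_tool_b`];
}

const servers = Array.from({ length: 25 }, (_, i) => `server${i}`);
mapWithConcurrency(servers, 10, discoverTools).then((perServer) => {
  const total = perServer.reduce((sum, tools) => sum + tools.length, 0);
  console.log(`Discovered ${total} tools from ${perServer.length} servers`);
});
```

A pool like this keeps slow servers from blocking the rest: each worker picks up the next server as soon as it finishes its current one.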
5. Configure VS Code
Add to your .vscode/mcp.json:
{
  "servers": {
    "dmcp": {
      "command": "node",
      "args": [
        "/path/to/dmcp/server/dist/dmcp-server.js"
      ],
      "env": {
        "REDIS_PORT": "6380",
        "DMCP_TOP_K": "30",
        "DMCP_MIN_SCORE": "0.25"
      }
    }
  }
}
🔍 How Search Works
DMCP uses pure vector search with the ToolRet embedding model:
- Model was trained specifically on tool-query pairs
- Encodes semantic intent directly (no keyword matching needed)
- Returns top-k tools by COSINE similarity
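The ranking step can be illustrated with an exact, in-memory version. This is a sketch only: DMCP delegates the search to a Redis HNSW index, which is an approximate nearest-neighbor structure, whereas the code below scores every tool. The `IndexedTool` type, the function names, and the toy 3-dimensional vectors are hypothetical.

```typescript
// Minimal in-memory sketch of top-k ranking by cosine similarity.
// Names are hypothetical; DMCP performs this search inside Redis.
interface IndexedTool {
  name: string;
  vector: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vectors must have equal length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // treat zero vectors as unrelated
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Scores every tool, drops those below minScore, and keeps the best topK.
function rankTools(query: number[], tools: IndexedTool[], topK: number, minScore: number) {
  return tools
    .map((t) => ({ name: t.name, score: cosineSimilarity(query, t.vector) }))
    .filter((r) => r.score >= minScore)
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// Toy 3-dimensional vectors stand in for real 1024-dimensional embeddings.
const tools: IndexedTool[] = [
  { name: "github_create_issue", vector: [1, 0, 0] },
  { name: "jira_create_ticket", vector: [0.6, 0.8, 0] },
  { name: "k8s_get_pods", vector: [0, 0, 1] },
];
const ranked = rankTools([1, 0, 0], tools, 30, 0.25);
console.log(ranked);
```

Here the Kubernetes tool is orthogonal to the query and falls below the 0.25 threshold, so only the two issue-tracking tools are returned, with the GitHub tool ranked first.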
Example queries and what they find:
| Query | Finds | Why |
|-------|-------|-----|
| "create GitHub issue" | GitHub tools | Semantic match |
| "ticket management" | Jira tools | Semantic similarity |
| "check pod logs" | Kubernetes tools | Semantic match |
| "search emails" | Google Workspace | Semantic match |
| "query AWS costs" | AWS Cost Explorer | Semantic match |
⚙️ Configuration
Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| MCP_GATEWAY_URL | http://127.0.0.1:15000/config_dump | Agent Gateway config endpoint (indexer) |
| REDIS_HOST | localhost | Redis server host |
| REDIS_PORT | 6380 | Redis server port |
| EMBEDDING_URL | http://localhost:5000 | Embedding service URL |
| EMBEDDING_MODEL | mangopy/ToolRet-trained-e5-large-v2 | ToolRet model (1024 dims) |
| DMCP_TOP_K | 30 | Max tools returned per search |
| DMCP_MIN_SCORE | 0.25 | Minimum similarity threshold |
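As an illustration of how these variables and defaults might be read, here is a hypothetical loader. The variable names and default values come from the table; the `DmcpConfig` shape, the helper names, and the choice to fall back to the default on invalid numbers are assumptions, not DMCP's actual behavior.

```typescript
// Hypothetical config loader using the variable names and defaults from the table.
interface DmcpConfig {
  gatewayUrl: string;
  redisHost: string;
  redisPort: number;
  embeddingUrl: string;
  embeddingModel: string;
  topK: number;
  minScore: number;
}

type Env = Record<string, string | undefined>;

function stringFromEnv(env: Env, key: string, fallback: string): string {
  const raw = env[key];
  return raw === undefined || raw.trim() === "" ? fallback : raw;
}

// Parses a numeric variable, falling back to the default when unset or invalid.
function numberFromEnv(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  return Number.isFinite(value) ? value : fallback;
}

function loadConfig(env: Env): DmcpConfig {
  return {
    gatewayUrl: stringFromEnv(env, "MCP_GATEWAY_URL", "http://127.0.0.1:15000/config_dump"),
    redisHost: stringFromEnv(env, "REDIS_HOST", "localhost"),
    redisPort: numberFromEnv(env, "REDIS_PORT", 6380),
    embeddingUrl: stringFromEnv(env, "EMBEDDING_URL", "http://localhost:5000"),
    embeddingModel: stringFromEnv(env, "EMBEDDING_MODEL", "mangopy/ToolRet-trained-e5-large-v2"),
    topK: numberFromEnv(env, "DMCP_TOP_K", 30),
    minScore: numberFromEnv(env, "DMCP_MIN_SCORE", 0.25),
  };
}

const config = loadConfig({ DMCP_TOP_K: "10" });
console.log(config.topK, config.minScore);
```

In a Node.js process the same function could be called as `loadConfig(process.env)`.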
Indexer CLI
cd indexer
npm run index # Index all tools from gateway
npm run index:force # Force re-index (clear existing)
npm run index -- -s name # Index only specific server
🖥️ Server Deployment
For deploying to your own server:
1. Copy your private configs to `gateway/config_parts/` on your server
2. Generate gateway config: `cat gateway/config_parts/*.yaml > gateway/config.yaml`
3. Start services: `docker-compose up -d`
4. Start gateway: `cd gateway && ./start.sh`
5. Index tools: `cd indexer && npm run index`
6. Build server: `cd server && npm run build`

For Apple Silicon (M1/M2/M3), uncomment the platform: linux/arm64 line in docker-compose.yml.
📊 Performance
| Metric | Value |
|--------|-------|
| Tools indexed | 429 |
| Index time | ~45 seconds |
| Search latency | ~50ms |
| Token reduction | 98% (from ~100k to ~2k) |
| Embedding model | ToolRet-e5-large-v2 (1024 dims) |
📐 MCP Spec Compliance
Implements MCP Tool Discovery:
- ✅ `listChanged: true` capability
- ✅ `notifications/tools/list_changed` notifications
- ✅ Dynamic tool availability based on search
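For reference, the two protocol pieces listed above have the following JSON shapes in the MCP specification. The TypeScript helper names are hypothetical, and how DMCP constructs these messages internally is not shown in this README.

```typescript
// JSON-RPC shapes for the tools capability and the list-changed notification,
// following the MCP specification. The helper names are hypothetical.
interface ToolsCapability {
  tools: { listChanged: boolean };
}

interface ListChangedNotification {
  jsonrpc: "2.0";
  method: "notifications/tools/list_changed";
}

// Declared by the server in its capabilities during initialization.
function serverCapabilities(): ToolsCapability {
  return { tools: { listChanged: true } };
}

// Sent without an id (notifications expect no response) once a search has made
// new tools available; the client can then request tools/list again.
function toolListChanged(): ListChangedNotification {
  return { jsonrpc: "2.0", method: "notifications/tools/list_changed" };
}

console.log(JSON.stringify(serverCapabilities()));
console.log(JSON.stringify(toolListChanged()));
```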
📄 License
MIT