MCP server exposing semantic search, knowledge graph queries, and keyword search over DocSmith document collections
DocSmith MCP Server
MCP (Model Context Protocol) server that exposes DocSmith's semantic search and knowledge graph as tools for AI agents.
Tools
semantic_search
Search documents by semantic similarity with multiple retrieval strategies.
- Strategies: vanilla (cosine similarity), hyde (hypothetical document embeddings), query_fusion (multi-query + RRF)
- Features: metadata filters, context expansion (section/parent/neighbor), graph expansion (breadcrumbs + related entities), cross-encoder reranking
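The query_fusion strategy merges the result lists of several query variants with reciprocal rank fusion (RRF). The sketch below shows the general technique only, not DocSmith's implementation: the function name, the chunk IDs, and the constant k=60 (the value commonly used in the RRF literature) are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs into one list, best first.

    Each ID scores 1 / (k + rank) for every list it appears in, with
    rank starting at 1. k=60 is an assumed default, not a value taken
    from DocSmith.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so output is stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists, one per rewritten query.
rankings = [
    ["chunk-a", "chunk-b", "chunk-c"],
    ["chunk-b", "chunk-a", "chunk-d"],
    ["chunk-b", "chunk-c", "chunk-a"],
]
fused = reciprocal_rank_fusion(rankings)
print(fused)
```

Because RRF uses only ranks, lists produced by different retrievers can be fused without normalizing their raw similarity scores.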
graph_query
Execute read-only Cypher queries against the Neo4j knowledge graph.
- Schema: 6 node types (Entity, Document, Section, Chunk, Cluster, Collection), 9 relationship types
- Patterns: entity relationships, cross-document connections, multi-hop traversal, document hierarchy
- Safety: write operations blocked, $collection_id auto-injected
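The two safety rules can be pictured as a guard that runs before a query reaches Neo4j. This is a minimal sketch under stated assumptions: the keyword list, the function name, and the idea that the guard is a keyword scan are all hypothetical, and the server's actual checks may differ.

```python
import re

# Assumed list of Cypher clauses that modify the graph; not DocSmith's
# actual blocklist.
_WRITE_CLAUSES = re.compile(
    r"\b(CREATE|MERGE|DELETE|SET|REMOVE|DROP|FOREACH)\b",
    re.IGNORECASE,
)


def prepare_read_only_query(cypher, params, collection_id):
    """Reject write clauses, then force the $collection_id parameter.

    Returns the unchanged query text and a new parameter dict in which
    collection_id always comes from the server side, never the caller.
    """
    if _WRITE_CLAUSES.search(cypher):
        raise ValueError("write operations are not allowed")
    merged = dict(params)
    merged["collection_id"] = collection_id  # overrides any caller value
    return cypher, merged


# The collection_id property on Entity is an assumption for illustration.
query, params = prepare_read_only_query(
    "MATCH (e:Entity) WHERE e.collection_id = $collection_id RETURN e LIMIT 5",
    {"collection_id": "spoofed"},
    "c-123",
)
print(params)
```

A keyword scan is deliberately conservative: it also rejects a read query whose string literal happens to contain a word such as SET, so it is best treated as a first line of defense alongside database-level read-only access.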
keyword_search
Exact text matching for error codes, identifiers, and literal strings.
Knowledge Graph Schema
(Document)-[:CONTAINS]->(Section)-[:CONTAINS]->(Chunk)
(Chunk)-[:MENTIONS]->(Entity)
(Entity)-[:EXTRACTED_FROM]->(Document)
(Entity)-[:RELATED_TO {predicate, confidence}]->(Entity)
(Document)-[:BELONGS_TO]->(Cluster)
Important: canonical_name is always lowercase, so apply toLower() to the search term when matching entities.
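As an illustration of the lowercase rule, the sketch below builds a graph_query request that finds entities related to a given name. The Cypher uses only labels and properties from the schema above; the helper name and the "query"/"params" argument keys are assumptions, since the tool's input schema is not documented here.

```python
# toLower() is applied to the search term, not to the stored property,
# because canonical_name is already lowercase in the graph.
RELATED_ENTITIES_QUERY = """
MATCH (e:Entity)-[r:RELATED_TO]->(other:Entity)
WHERE e.canonical_name = toLower($name)
RETURN other.canonical_name AS related,
       r.predicate AS predicate,
       r.confidence AS confidence
ORDER BY r.confidence DESC
LIMIT $limit
"""


def related_entities_args(name, limit=10):
    """Build hypothetical graph_query arguments for a related-entity lookup."""
    return {
        "query": RELATED_ENTITIES_QUERY.strip(),
        "params": {"name": name.strip(), "limit": limit},
    }


args = related_entities_args("  Neo4j ")
print(args["params"])
```

Passing the name as a parameter rather than formatting it into the query string keeps user input out of the Cypher text.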
Setup
# Install dependencies
uv sync
# Configure environment
cp .env.example .env
# Edit .env with your DocSmith API URL, API key, and collection ID
# Run the server
uv run uvicorn server:app --host 0.0.0.0 --port 8005
Testing with MCP Inspector
npx @modelcontextprotocol/inspector
# Connect to: http://localhost:8005/mcp (Streamable HTTP transport)
See TESTS-MCP.txt for validated test cases.
Configuration
| Variable | Description |
|----------|-------------|
| DOCSMITH_URL | DocSmith API base URL |
| DOCSMITH_API_KEY | Tenant API key |
| COLLECTION_ID | Collection UUID for this session |
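A startup check along the following lines can catch a misconfigured .env early. This is a sketch, not the server's actual loader: the function name and returned keys are hypothetical, and treating all three variables as required is an assumption.

```python
import os
import uuid

REQUIRED_VARS = ("DOCSMITH_URL", "DOCSMITH_API_KEY", "COLLECTION_ID")


def load_config(env=None):
    """Read the three documented variables and fail fast on problems."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {
        # Strip a trailing slash so paths can be appended consistently.
        "base_url": env["DOCSMITH_URL"].rstrip("/"),
        "api_key": env["DOCSMITH_API_KEY"],
        # uuid.UUID raises ValueError if the value is not a valid UUID.
        "collection_id": str(uuid.UUID(env["COLLECTION_ID"])),
    }


# Placeholder values for illustration only.
config = load_config({
    "DOCSMITH_URL": "https://docsmith.example.com/",
    "DOCSMITH_API_KEY": "placeholder-key",
    "COLLECTION_ID": "123E4567-E89B-12D3-A456-426614174000",
})
print(config["base_url"], config["collection_id"])
```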
Stack
- Python 3.12
- FastMCP (MCP SDK)
- Starlette (Streamable HTTP transport)
- httpx (async HTTP client)
License
MIT