MCP server for semantic search using a local Qdrant vector database with multiple embedding providers (Ollama, OpenAI, Cohere, Voyage AI)
Qdrant MCP Server
A Model Context Protocol (MCP) server providing semantic search capabilities using Qdrant vector database with multiple embedding providers.
Features
- Zero Setup: Works out of the box with Ollama - no API keys required
- Privacy-First: Local embeddings and vector storage - data never leaves your machine
- Code Vectorization: Intelligent codebase indexing with AST-aware chunking and semantic code search
- Multiple Providers: Ollama (default), OpenAI, Cohere, and Voyage AI
- Hybrid Search: Combine semantic and keyword search for better results
- Semantic Search: Natural language search with metadata filtering
- Incremental Indexing: Efficient updates - only re-index changed files
- Configurable Prompts: Create custom prompts for guided workflows without code changes
- Rate Limiting: Intelligent throttling with exponential backoff
- Full CRUD: Create, search, and manage collections and documents
- Flexible Deployment: Run locally (stdio) or as a remote HTTP server
Quick Start
Prerequisites
- Node.js 20+
- Docker and Docker Compose
Installation
# Clone and install
git clone https://github.com/mhalder/qdrant-mcp-server.git
cd qdrant-mcp-server
npm install
# Start services and pull model
docker compose up -d
docker exec ollama ollama pull nomic-embed-text
# Build
npm run build
Configuration
Local Setup (stdio transport)
Add to ~/.claude/claude_code_config.json:
{
"mcpServers": {
"qdrant": {
"command": "node",
"args": ["/path/to/qdrant-mcp-server/build/index.js"],
"env": {
"QDRANT_URL": "http://localhost:6333",
"EMBEDDING_BASE_URL": "http://localhost:11434"
}
}
}
}
Remote Setup (HTTP transport)
⚠️ Security Warning: When deploying the HTTP transport in production:
- Always run behind a reverse proxy (nginx, Caddy) with HTTPS
- Implement authentication/authorization at the proxy level
- Use firewalls to restrict access to trusted networks
- Never expose directly to the public internet without protection
- Consider implementing rate limiting at the proxy level
- Monitor server logs for suspicious activity
Start the server:
TRANSPORT_MODE=http HTTP_PORT=3000 node build/index.js
Configure client:
{
"mcpServers": {
"qdrant": {
"url": "http://your-server:3000/mcp"
}
}
}
Using a different provider:
"env": {
"EMBEDDING_PROVIDER": "openai", // or "cohere", "voyage"
"OPENAI_API_KEY": "sk-...", // provider-specific API key
"QDRANT_URL": "http://localhost:6333"
}
Restart your MCP client after making configuration changes.
See Advanced Configuration section below for all options.
Tools
Collection Management
| Tool | Description |
| --------------------- | -------------------------------------------------------------------- |
| create_collection | Create collection with specified distance metric (Cosine/Euclid/Dot) |
| list_collections | List all collections |
| get_collection_info | Get collection details and statistics |
| delete_collection | Delete collection and all documents |
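The table above maps to simple tool calls. A minimal sketch of creating a collection, written in the same call style as the Usage Examples further below (the parameter names are illustrative assumptions - consult the tool schema the server exposes for the exact fields):

create_collection({
  name: "articles",
  distance: "Cosine" // one of "Cosine", "Euclid", "Dot"
})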
Document Operations
| Tool | Description |
| ------------------ | ----------------------------------------------------------------------------- |
| add_documents | Add documents with automatic embedding (supports string/number IDs, metadata) |
| semantic_search | Natural language search with optional metadata filtering |
| hybrid_search | Hybrid search combining semantic and keyword (BM25) search with RRF |
| delete_documents | Delete specific documents by ID |
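A sketch of a typical add-then-search flow (parameter names and the filter shape are illustrative assumptions; the filter is expected in Qdrant's native format, so verify field names against the tool schemas and your own metadata):

add_documents({
  collection: "articles",
  documents: [
    { id: 1, text: "Qdrant is an open-source vector database.", metadata: { topic: "databases" } },
    { id: "doc-2", text: "MCP lets AI assistants call external tools.", metadata: { topic: "protocols" } }
  ]
})

semantic_search({
  collection: "articles",
  query: "vector similarity search engines",
  limit: 5,
  filter: { must: [{ key: "topic", match: { value: "databases" } }] } // Qdrant-style metadata filter
})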
Code Vectorization
| Tool | Description |
| ------------------ | ---------------------------------------------------------------------------------------------- |
| index_codebase | Index a codebase for semantic code search with AST-aware chunking |
| search_code | Search indexed codebase using natural language queries |
| reindex_changes | Incrementally re-index only changed files (detects added/modified/deleted) |
| get_index_status | Get indexing status and statistics for a codebase |
| clear_index | Delete all indexed data for a codebase |
Resources
- qdrant://collections - List all collections
- qdrant://collection/{name} - Collection details
Configurable Prompts
Create custom prompts tailored to your specific use cases without modifying code. Prompts provide guided workflows for common tasks.
Note: By default, the server looks for prompts.json in the project root directory. If the file exists, prompts are automatically loaded. You can specify a custom path using the PROMPTS_CONFIG_FILE environment variable.
Setup
1. Create a prompts configuration file (e.g., prompts.json in the project root). See prompts.example.json for example configurations you can copy and customize.
2. Configure the server (optional - only needed for a custom path). If you place prompts.json in the project root, no additional configuration is needed. To use a custom path:
{
"mcpServers": {
"qdrant": {
"command": "node",
"args": ["/path/to/qdrant-mcp-server/build/index.js"],
"env": {
"QDRANT_URL": "http://localhost:6333",
"PROMPTS_CONFIG_FILE": "/custom/path/to/prompts.json"
}
}
}
}
3. Use prompts in your AI assistant:
Claude Code:
/mcp__qdrant__find_similar_docs papers "neural networks" 10
VSCode:
/mcp.qdrant.find_similar_docs papers "neural networks" 10
Example Prompts
See prompts.example.json for ready-to-use prompts including:
- find_similar_docs - Semantic search with result explanation
- setup_rag_collection - Create RAG-optimized collections
- analyze_collection - Collection insights and recommendations
- bulk_add_documents - Guided bulk document insertion
- search_with_filter - Metadata filtering assistance
- compare_search_methods - Semantic vs hybrid search comparison
- collection_maintenance - Maintenance and cleanup workflows
- migrate_to_hybrid - Collection migration guide
Template Syntax
Templates use {{variable}} placeholders:
- Required arguments must be provided
- Optional arguments use defaults if not specified
- Unknown variables are left as-is in the output
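For orientation, a hypothetical prompts.json entry might look like the sketch below. prompts.example.json is the authoritative reference; the field names used here (name, description, arguments, template) are assumptions made for illustration:

{
  "prompts": [
    {
      "name": "find_similar_docs",
      "description": "Semantic search with result explanation",
      "arguments": [
        { "name": "collection", "required": true },
        { "name": "query", "required": true },
        { "name": "limit", "required": false, "default": "5" }
      ],
      "template": "Search the {{collection}} collection for: {{query}}. Return up to {{limit}} results and explain why each matches."
    }
  ]
}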
Code Vectorization
Intelligently index and search your codebase using semantic code search. Perfect for AI-assisted development, code exploration, and understanding large codebases.
Features
- AST-Aware Chunking: Intelligent code splitting at function/class boundaries using tree-sitter
- Multi-Language Support: 35+ file types including TypeScript, Python, Java, Go, Rust, C++, and more
- Incremental Updates: Only re-index changed files for fast updates
- Smart Ignore Patterns: Respects .gitignore, .dockerignore, and custom .contextignore files
- Semantic Search: Natural language queries to find relevant code
- Metadata Filtering: Filter by file type, path patterns, or language
- Local-First: All processing happens locally - your code never leaves your machine
Quick Start
1. Index your codebase:
# Via Claude Code MCP tool
/mcp__qdrant__index_codebase /path/to/your/project
2. Search your code:
# Natural language search
/mcp__qdrant__search_code /path/to/your/project "authentication middleware"
# Filter by file type
/mcp__qdrant__search_code /path/to/your/project "database schema" --fileTypes .ts,.js
# Filter by path pattern
/mcp__qdrant__search_code /path/to/your/project "API endpoints" --pathPattern src/api/**
3. Update after changes:
# Incrementally re-index only changed files
/mcp__qdrant__reindex_changes /path/to/your/project
Usage Examples
Index a TypeScript Project
// The MCP tool automatically:
// 1. Scans all .ts, .tsx, .js, .jsx files
// 2. Respects .gitignore patterns (skips node_modules, dist, etc.)
// 3. Chunks code at function/class boundaries
// 4. Generates embeddings using your configured provider
// 5. Stores in Qdrant with metadata (file path, line numbers, language)
index_codebase({
path: "/workspace/my-app",
forceReindex: false // Set to true to re-index from scratch
})
// Output:
// ✓ Indexed 247 files (1,823 chunks) in 45.2s
Search for Authentication Code
search_code({
path: "/workspace/my-app",
query: "how does user authentication work?",
limit: 5
})
// Results include file path, line numbers, and code snippets:
// [
// {
// filePath: "src/auth/middleware.ts",
// startLine: 15,
// endLine: 42,
// content: "export async function authenticateUser(req: Request) { ... }",
// score: 0.89,
// language: "typescript"
// },
// ...
// ]
Search with Filters
// Only search TypeScript files
search_code({
path: "/workspace/my-app",
query: "error handling patterns",
fileTypes: [".ts", ".tsx"],
limit: 10
})
// Only search in specific directories
search_code({
path: "/workspace/my-app",
query: "API route handlers",
pathPattern: "src/api/**",
limit: 10
})
Incremental Re-indexing
// After making changes to your codebase
reindex_changes({
path: "/workspace/my-app"
})
// Output:
// ✓ Updated: +3 files added, ~5 files modified, -1 file deleted
// ✓ Chunks: +47 added, -23 deleted in 8.3s
Check Indexing Status
get_index_status({
path: "/workspace/my-app"
})
// Output:
// {
// isIndexed: true,
// collectionName: "code_a3f8d2e1",
// chunksCount: 1823,
// filesCount: 247,
// lastUpdated: "2025-01-30T10:15:00Z",
// languages: ["typescript", "javascript", "json"]
// }
Supported Languages
Programming Languages (35+ file types):
- Web: TypeScript, JavaScript, Vue, Svelte
- Backend: Python, Java, Go, Rust, Ruby, PHP
- Systems: C, C++, C#
- Mobile: Swift, Kotlin, Dart
- Functional: Scala, Clojure, Haskell, OCaml
- Scripting: Bash, Shell, Fish
- Data: SQL, GraphQL, Protocol Buffers
- Config: JSON, YAML, TOML, XML, Markdown
See the Code Vectorization Configuration section below for the full list and customization options.
Custom Ignore Patterns
Create a .contextignore file in your project root to specify additional patterns to ignore:
# .contextignore
**/test/**
**/*.test.ts
**/*.spec.ts
**/fixtures/**
**/mocks/**
**/__tests__/**
Best Practices
- Index Once, Update Incrementally: Use index_codebase for initial indexing, then reindex_changes for updates
- Use Filters: Narrow search scope with fileTypes and pathPattern for better results
- Meaningful Queries: Use natural language that describes what you're looking for (e.g., "database connection pooling" instead of "db")
- Check Status First: Use get_index_status to verify a codebase is indexed before searching
- Local Embedding: Use Ollama (default) to keep everything local and private
Performance
Typical performance on a modern laptop (Apple M1/M2 or similar):
| Codebase Size      | Files | Indexing Time | Search Latency |
| ------------------ | ----- | ------------- | -------------- |
| Small (10k LOC)    | 50    | ~10s          | <100ms         |
| Medium (100k LOC)  | 500   | ~2min         | <200ms         |
| Large (500k LOC)   | 2,500 | ~10min        | <500ms         |
Note: Indexing time varies based on embedding provider. Ollama (local) is fastest for initial indexing.
Examples
See examples/ directory for detailed guides:
- Basic Usage - Create collections, add documents, search
- Knowledge Base - Structured documentation with metadata
- Advanced Filtering - Complex boolean filters
- Rate Limiting - Batch processing with cloud providers
- Code Search - Index codebases and semantic code search
Advanced Configuration
Environment Variables
Core Configuration
| Variable | Description | Default |
| ----------------------------------- | -------------------------------------- | --------------------- |
| TRANSPORT_MODE | "stdio" or "http" | stdio |
| HTTP_PORT | Port for HTTP transport | 3000 |
| EMBEDDING_PROVIDER | "ollama", "openai", "cohere", "voyage" | ollama |
| QDRANT_URL | Qdrant server URL | http://localhost:6333 |
| PROMPTS_CONFIG_FILE | Path to prompts configuration JSON | prompts.json |
Embedding Configuration
| Variable | Description | Default |
| ----------------------------------- | -------------------------------------- | --------------------- |
| EMBEDDING_MODEL | Model name | Provider-specific |
| EMBEDDING_BASE_URL | Custom API URL | Provider-specific |
| EMBEDDING_MAX_REQUESTS_PER_MINUTE | Rate limit | Provider-specific |
| EMBEDDING_RETRY_ATTEMPTS | Retry count | 3 |
| EMBEDDING_RETRY_DELAY | Initial retry delay (ms) | 1000 |
| OPENAI_API_KEY | OpenAI API key | - |
| COHERE_API_KEY | Cohere API key | - |
| VOYAGE_API_KEY | Voyage AI API key | - |
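For example, a cloud-provider setup with explicit rate limiting could use an env block like this (variable names are from the table above; the values are illustrative - match them to your provider tier):

"env": {
  "EMBEDDING_PROVIDER": "openai",
  "OPENAI_API_KEY": "sk-...",
  "EMBEDDING_MODEL": "text-embedding-3-small",
  "EMBEDDING_MAX_REQUESTS_PER_MINUTE": "500",
  "EMBEDDING_RETRY_ATTEMPTS": "5",
  "EMBEDDING_RETRY_DELAY": "2000",
  "QDRANT_URL": "http://localhost:6333"
}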
Code Vectorization Configuration
| Variable | Description | Default |
| ------------------------- | ------------------------------------------------ | ------- |
| CODE_CHUNK_SIZE | Maximum chunk size in characters | 2500 |
| CODE_CHUNK_OVERLAP | Overlap between chunks in characters | 300 |
| CODE_ENABLE_AST | Enable AST-aware chunking (tree-sitter) | true |
| CODE_BATCH_SIZE | Number of chunks to embed in one batch | 100 |
| CODE_CUSTOM_EXTENSIONS | Additional file extensions (comma-separated) | - |
| CODE_CUSTOM_IGNORE | Additional ignore patterns (comma-separated) | - |
| CODE_DEFAULT_LIMIT | Default search result limit | 5 |
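As an illustration, tuning indexing for a very large codebase or a memory-constrained machine might look like this (variable names from the table above; values are illustrative - adjust to your hardware):

"env": {
  "CODE_CHUNK_SIZE": "1500",
  "CODE_CHUNK_OVERLAP": "200",
  "CODE_BATCH_SIZE": "50",
  "CODE_CUSTOM_EXTENSIONS": ".tf,.hcl",
  "CODE_CUSTOM_IGNORE": "**/generated/**,**/vendor/**"
}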
Provider Comparison
| Provider | Models | Dimensions | Rate Limit | Notes |
| ---------- | --------------------------------------------------------------- | -------------- | ---------- | -------------------- |
| Ollama | nomic-embed-text (default), mxbai-embed-large, all-minilm | 768, 1024, 384 | None | Local, no API key |
| OpenAI | text-embedding-3-small (default), text-embedding-3-large | 1536, 3072 | 3500/min | Cloud API |
| Cohere | embed-english-v3.0 (default), embed-multilingual-v3.0 | 1024 | 100/min | Multilingual support |
| Voyage | voyage-2 (default), voyage-large-2, voyage-code-2 | 1024, 1536 | 300/min | Code-specialized |
Note: Ollama models require docker exec ollama ollama pull <model-name> before use.
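For example, switching the local setup to the larger mxbai-embed-large model (listed in the table above) would look like:

# Pull the alternative model into the Ollama container
docker exec ollama ollama pull mxbai-embed-large

Then set "EMBEDDING_MODEL": "mxbai-embed-large" in the server env block and restart your client.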
Troubleshooting
| Issue | Solution |
| ------------------------------- | ---------------------------------------------------------------------------- |
| Qdrant not running | docker compose up -d |
| Collection missing | Create collection first before adding documents |
| Ollama not running | Verify with curl http://localhost:11434, start with docker compose up -d |
| Model missing | docker exec ollama ollama pull nomic-embed-text |
| Rate limit errors | Adjust EMBEDDING_MAX_REQUESTS_PER_MINUTE to match your provider tier |
| API key errors | Verify correct API key in environment configuration |
| Filter errors | Ensure Qdrant filter format, check field names match metadata |
| Codebase not indexed | Run index_codebase before search_code |
| Slow indexing | Use Ollama (local) for faster indexing, or increase CODE_BATCH_SIZE |
| Files not found | Check .gitignore and .contextignore patterns |
| Search returns no results | Try broader queries, check if codebase is indexed with get_index_status |
| Out of memory during index | Reduce CODE_CHUNK_SIZE or CODE_BATCH_SIZE |
Development
npm run dev # Development with auto-reload
npm run build # Production build
npm run type-check # TypeScript validation
npm test # Run test suite
npm run test:coverage # Coverage report
Testing
422 tests (376 unit + 46 functional) with 98%+ coverage:
- Unit Tests: QdrantManager (21), Ollama (31), OpenAI (25), Cohere (29), Voyage (31), Factory (32), MCP Server (19)
- Functional Tests: Live API integration, end-to-end workflows (46)
CI/CD: GitHub Actions runs build, type-check, and tests on Node.js 20 & 22 for every push/PR.
Contributing
Contributions welcome! See CONTRIBUTING.md for:
- Development workflow
- Conventional commit format (feat:, fix:, BREAKING CHANGE:)
- Testing requirements (run npm test, npm run type-check, npm run build)
Automated releases: Semantic versioning via conventional commits - feat: → minor, fix: → patch, BREAKING CHANGE: → major.
Acknowledgments
The code vectorization feature is inspired by and builds upon concepts from the excellent claude-context project (MIT License, Copyright 2025 Zilliz).
License
MIT - see LICENSE file.