mcp-server-pdf
An MCP server that lets AI assistants read, search, and extract data from PDF files.
Works with Claude Desktop, Claude Code, Cursor, Windsurf, VS Code, and any MCP-compatible client.
Features
| Tool | Description |
|------|-------------|
| read_pdf | Extract text from all or specific pages |
| search_pdf | Find text with page numbers and line context |
| get_pdf_info | Get metadata (title, author, pages, dimensions) |
| extract_tables | Extract tables as Markdown |
| list_pdfs | List all PDFs in a directory with page counts |
Quick Start
Install
# Using uvx (recommended)
uvx mcp-server-pdf
# Using pip
pip install mcp-server-pdf
Claude Desktop
Add to claude_desktop_config.json:
{
  "mcpServers": {
    "pdf": {
      "command": "uvx",
      "args": ["mcp-server-pdf"]
    }
  }
}
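If claude_desktop_config.json already lists other servers, the entry above has to be merged in rather than pasted over the file. The following is a minimal sketch of that merge, not part of this project: the function name `add_pdf_server` is hypothetical, and the config path is supplied by the caller because its location depends on the operating system.

```python
import json
from pathlib import Path


def add_pdf_server(config_path: Path) -> dict:
    """Merge the "pdf" entry into an MCP client config, keeping other servers.

    Hypothetical helper for illustration; it is not shipped with mcp-server-pdf.
    """
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        if text.strip():
            config = json.loads(text)

    # Create the "mcpServers" object if it is missing, then add or replace "pdf".
    servers = config.setdefault("mcpServers", {})
    servers["pdf"] = {"command": "uvx", "args": ["mcp-server-pdf"]}

    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling `add_pdf_server(Path("/path/to/claude_desktop_config.json"))` leaves any existing entries under `mcpServers` untouched and only adds or replaces the `pdf` key.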
Claude Code (CLI)
claude mcp add pdf -- uvx mcp-server-pdf
Cursor / VS Code
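For VS Code, add the following to .vscode/mcp.json. For Cursor, add the same entry to .cursor/mcp.json, using mcpServers as the top-level key instead of servers.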
{
  "servers": {
    "pdf": {
      "command": "uvx",
      "args": ["mcp-server-pdf"]
    }
  }
}
Usage Examples
Once configured, you can ask your AI assistant things like:
"Read the contract at /path/to/contract.pdf"
"Search for 'payment terms' in /path/to/document.pdf"
"What's the metadata of this PDF?"
"Extract the table on page 3"
"List all PDFs in my Downloads folder"
Example: Read a PDF
You: Read pages 1-3 of /Users/me/report.pdf
AI: (calls read_pdf with file_path="/Users/me/report.pdf", start_page=1, end_page=3)
--- Page 1/15 ---
Annual Report 2025
Executive Summary
...
--- Page 2/15 ---
Financial Highlights
Revenue grew 23% year-over-year...
...
Example: Search within a PDF
You: Find all mentions of "revenue" in the report
AI: (calls search_pdf with file_path="/Users/me/report.pdf", query="revenue")
Found matches in 4 page(s):
--- Page 2 (2 matches) ---
L5: Revenue grew 23% year-over-year to $4.2B
L12: Recurring revenue now represents 68% of total
--- Page 7 (1 match) ---
L3: Revenue breakdown by segment:
...
Example: Extract Tables
You: Extract the table on page 5
AI: (calls extract_tables with file_path="/Users/me/report.pdf", page_number=5)
### Table 1
| Quarter | Revenue | Growth |
| --- | --- | --- |
| Q1 | $980M | 18% |
| Q2 | $1.05B | 21% |
| Q3 | $1.1B | 24% |
| Q4 | $1.07B | 19% |
Tools Reference
read_pdf
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| file_path | string | Yes | Absolute path to the PDF file |
| start_page | int | No | First page to read (1-based, default: 1) |
| end_page | int | No | Last page to read (inclusive, default: all) |
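The page parameters above are 1-based and inclusive, which is easy to get off by one when mapping to a 0-based page list. Here is a minimal sketch of that mapping. The helper name `resolve_page_range` is hypothetical, and clamping an oversized `end_page` to the last page is an assumption, not confirmed behavior of this server.

```python
def resolve_page_range(total_pages, start_page=1, end_page=None):
    """Return 0-based page indices for a 1-based, inclusive page range.

    Hypothetical helper; clamping end_page to total_pages is an assumption.
    """
    if total_pages < 1:
        raise ValueError("document has no pages")
    # end_page=None means "read through the last page".
    last = total_pages if end_page is None else min(end_page, total_pages)
    if start_page < 1 or start_page > last:
        raise ValueError(
            f"invalid page range {start_page}-{end_page} for {total_pages} page(s)"
        )
    # Page N (1-based) lives at index N - 1; range() excludes its upper bound.
    return list(range(start_page - 1, last))
```

For a 15-page document, `resolve_page_range(15, 1, 3)` gives `[0, 1, 2]`, matching the "pages 1-3" request in the example above.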
search_pdf
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| file_path | string | Yes | Absolute path to the PDF file |
| query | string | Yes | Text to search for |
| case_sensitive | bool | No | Case-sensitive search (default: false) |
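To make the `query` and `case_sensitive` semantics concrete, the sketch below runs a substring search over already-extracted page text and reports 1-based page and line numbers. It is an illustration under assumptions: `search_pages` is a hypothetical name, the input is a plain list of page strings, and the server's real matching rules may differ.

```python
def search_pages(pages, query, case_sensitive=False):
    """Map 1-based page numbers to [(line_number, line_text), ...] matches.

    Hypothetical helper: `pages` is a list with one extracted-text string per page.
    """
    if not query:
        raise ValueError("query must not be empty")
    needle = query if case_sensitive else query.casefold()
    results = {}
    for page_number, text in enumerate(pages, start=1):
        for line_number, line in enumerate(text.splitlines(), start=1):
            haystack = line if case_sensitive else line.casefold()
            if needle in haystack:
                results.setdefault(page_number, []).append(
                    (line_number, line.strip())
                )
    return results
```

Pages without a match are simply absent from the result, so `len(results)` is the "matches in N page(s)" count.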
get_pdf_info
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| file_path | string | Yes | Absolute path to the PDF file |
extract_tables
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| file_path | string | Yes | Absolute path to the PDF file |
| page_number | int | No | Page number to extract from (default: 1) |
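pdfplumber returns each extracted table as a list of rows, where every row is a list of cell strings and empty cells are `None`. A minimal sketch of turning such rows into the Markdown shown in the example above might look like this. The function name `rows_to_markdown` is hypothetical, and treating the first row as the header is an assumption.

```python
def rows_to_markdown(rows):
    """Render table rows (first row treated as the header) as a Markdown pipe table.

    Hypothetical helper; not necessarily how mcp-server-pdf formats its output.
    """
    if not rows:
        return ""
    width = max(len(row) for row in rows)

    def fmt(row):
        # Empty cells become "", newlines are flattened, and pipes are escaped.
        cells = [
            "" if cell is None else str(cell).replace("\n", " ").replace("|", "\\|")
            for cell in row
        ]
        cells += [""] * (width - len(cells))  # pad short rows to full width
        return "| " + " | ".join(cells) + " |"

    header, *body = rows
    lines = [fmt(header), "| " + " | ".join(["---"] * width) + " |"]
    lines += [fmt(row) for row in body]
    return "\n".join(lines)
```

Escaping `|` inside cells keeps a stray pipe in the PDF text from splitting a column.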
list_pdfs
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| directory | string | Yes | Absolute path to the directory |
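The directory scan behind a tool like this can be sketched with the standard library alone. The version below is an assumption-laden illustration: `find_pdfs` is a hypothetical name, the scan is non-recursive, and page counts are left out because they require opening each file with a PDF library.

```python
from pathlib import Path


def find_pdfs(directory):
    """Return the PDF files directly inside `directory`, sorted by path.

    Hypothetical helper; whether the real tool recurses into subfolders is not stated.
    """
    root = Path(directory)
    if not root.is_dir():
        raise NotADirectoryError(f"not a directory: {directory}")
    # Match the extension case-insensitively so "REPORT.PDF" is included.
    return sorted(
        p for p in root.iterdir() if p.is_file() and p.suffix.lower() == ".pdf"
    )
```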
Development
# Clone
git clone https://github.com/liz7788/mcp-server-pdf.git
cd mcp-server-pdf
# Install in dev mode
pip install -e .
# Run
mcp-server-pdf
How It Works
┌─────────────┐     MCP Protocol     ┌─────────────────┐     pdfplumber     ┌──────────┐
│  AI Client  │ ◄──────────────────► │ mcp-server-pdf  │ ◄────────────────► │ PDF File │
│  (Claude,   │    JSON-RPC stdio    │  (this server)  │  text extraction   │          │
│  Cursor...) │                      │                 │  table parsing     │          │
└─────────────┘                      └─────────────────┘                    └──────────┘
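On the left-hand arrow, a tool invocation travels as a JSON-RPC 2.0 request with the MCP method `tools/call`, written to the server's stdin as a single line. The sketch below builds such a message by hand to show its shape. The helper name `make_tool_call` is hypothetical, and in practice the MCP client constructs these requests for you.

```python
import json


def make_tool_call(request_id, tool_name, arguments):
    """Serialize an MCP "tools/call" request as one line of JSON.

    Hypothetical helper for illustration; real clients build this internally.
    """
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    # json.dumps emits no raw newlines by default, so the result fits
    # the newline-delimited framing used on stdio.
    return json.dumps(message)


request = make_tool_call(
    1,
    "read_pdf",
    {"file_path": "/Users/me/report.pdf", "start_page": 1, "end_page": 3},
)
```

The `arguments` object carries exactly the parameters listed in the Tools Reference tables.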
Contributing
PRs welcome! See CONTRIBUTING.md for guidelines.