MCP Doc Reader
By @aardpro

A Model Context Protocol (MCP) server that enables AI assistants to read and extract content from PDF, Excel, and Word documents.

Created on 12/27/2025

Features

  • PDF Reading: Extract text content from PDF files using pdfminer.six
  • Excel Reading: Read .xlsx and .xls files with formatted table output
  • Word Reading: Extract text and tables from .docx files
  • Cross-Platform: Works on Windows, Linux, and macOS
  • Unicode Support: Full support for non-ASCII characters (Chinese, Japanese, etc.)

Installation

Using uvx (Recommended)

uvx mcp-doc-reader

Using pip

pip install mcp-doc-reader

From Source

git clone https://github.com/yourusername/mcp-doc-reader.git
cd mcp-doc-reader
pip install -e .

Configuration

Add the following to your MCP client configuration (e.g., Claude Desktop, Cursor):

Option 1: Using uvx (Recommended)

{
  "mcpServers": {
    "DocReader": {
      "command": "uvx",
      "args": ["mcp-doc-reader"]
    }
  }
}

Option 2: Using pip-installed command

{
  "mcpServers": {
    "DocReader": {
      "command": "mcp-doc-reader"
    }
  }
}

Option 3: Windows with Unicode Support

For Windows systems with non-ASCII file paths (e.g., Chinese characters):

{
  "mcpServers": {
    "DocReader": {
      "command": "cmd",
      "args": [
        "/c",
        "chcp 65001 >nul && uvx mcp-doc-reader"
      ]
    }
  }
}

Option 4: Linux/macOS with Python module

{
  "mcpServers": {
    "DocReader": {
      "command": "python",
      "args": ["-m", "docreader"]
    }
  }
}
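
The four options differ only in the command and arguments used to launch the server. As a minimal sketch, the following Python script picks an entry by platform and prints JSON that can be pasted into a client configuration. The names `build_config` and `use_uvx` are illustrative and are not part of mcp-doc-reader.

```python
import json
import sys


def build_config(platform: str = sys.platform, use_uvx: bool = True) -> dict:
    """Return an mcpServers entry mirroring the options above.

    build_config and use_uvx are hypothetical names used for illustration.
    """
    launcher = ["uvx", "mcp-doc-reader"] if use_uvx else ["mcp-doc-reader"]
    if platform == "win32":
        # Option 3: switch the console code page to UTF-8 before launching.
        entry = {
            "command": "cmd",
            "args": ["/c", "chcp 65001 >nul && " + " ".join(launcher)],
        }
    elif use_uvx:
        # Option 1: let uvx fetch and run the package.
        entry = {"command": launcher[0], "args": launcher[1:]}
    else:
        # Option 2: call the command installed by pip.
        entry = {"command": launcher[0]}
    return {"mcpServers": {"DocReader": entry}}


if __name__ == "__main__":
    print(json.dumps(build_config(), indent=2))
```

Passing `platform` explicitly makes it possible to generate the Windows entry from another operating system.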

Available Tools

read_pdf

Read text content from a PDF file.

Parameters:

  • file_path (string, required): Absolute path to the PDF file

Example:

{
  "name": "read_pdf",
  "arguments": {
    "file_path": "/path/to/document.pdf"
  }
}

read_excel

Read content from an Excel file (.xlsx or .xls).

Parameters:

  • file_path (string, required): Absolute path to the Excel file

Example:

{
  "name": "read_excel",
  "arguments": {
    "file_path": "/path/to/spreadsheet.xlsx"
  }
}

read_word

Read text and tables from a Word file (.docx).

Parameters:

  • file_path (string, required): Absolute path to the Word file

Example:

{
  "name": "read_word",
  "arguments": {
    "file_path": "/path/to/document.docx"
  }
}
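
The example objects above correspond to the `params` of an MCP `tools/call` request. MCP clients normally build the surrounding JSON-RPC 2.0 envelope themselves, so the sketch below is only meant to show what goes over the wire. The helper name `make_tool_call` and the default request id are illustrative.

```python
import json


def make_tool_call(name: str, file_path: str, request_id: int = 1) -> dict:
    """Wrap a tool name and file path in a JSON-RPC 2.0 tools/call request.

    make_tool_call is a hypothetical helper, not part of mcp-doc-reader.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": name,
            "arguments": {"file_path": file_path},
        },
    }


if __name__ == "__main__":
    request = make_tool_call("read_pdf", "/path/to/document.pdf")
    print(json.dumps(request, indent=2))
```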

Usage Examples

Once configured, you can ask your AI assistant to:

  • "Read the contents of /path/to/report.pdf"
  • "Extract data from /path/to/data.xlsx"
  • "What does the document /path/to/memo.docx contain?"
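
Each tool handles one family of file types, and every tool expects an absolute path. A client-side helper could choose the tool from the file extension before calling the server. This is a sketch under those assumptions: `pick_tool` and `TOOL_BY_SUFFIX` are hypothetical names, and the server itself may validate input differently.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Extension-to-tool mapping taken from the tool descriptions above.
TOOL_BY_SUFFIX = {
    ".pdf": "read_pdf",
    ".xlsx": "read_excel",
    ".xls": "read_excel",
    ".docx": "read_word",
}


def pick_tool(file_path: str) -> str:
    """Return the tool name for file_path, or raise ValueError."""
    # Accept POSIX or Windows absolute paths regardless of the host OS.
    is_absolute = (
        PurePosixPath(file_path).is_absolute()
        or PureWindowsPath(file_path).is_absolute()
    )
    if not is_absolute:
        raise ValueError(f"file_path must be absolute: {file_path!r}")
    # PureWindowsPath splits on both "/" and "\\", so it works for either style.
    suffix = PureWindowsPath(file_path).suffix.lower()
    tool = TOOL_BY_SUFFIX.get(suffix)
    if tool is None:
        raise ValueError(f"unsupported file type: {suffix or '(none)'}")
    return tool


if __name__ == "__main__":
    for path in ("/path/to/report.pdf", "/path/to/data.xlsx", "/path/to/memo.docx"):
        print(path, "->", pick_tool(path))
```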

Development

Setup Development Environment

git clone https://github.com/yourusername/mcp-doc-reader.git
cd mcp-doc-reader
pip install -e ".[dev]"

Run Tests

pytest

Build Package

pip install build
python -m build

Publish to PyPI

pip install twine
twine upload dist/*

Project Structure

mcp-doc-reader/
├── src/
│   └── docreader/
│       ├── __init__.py
│       ├── __main__.py
│       ├── server.py
│       └── readers/
│           ├── __init__.py
│           ├── pdf_reader.py
│           ├── excel_reader.py
│           └── word_reader.py
├── examples/
│   ├── mcp_config_pip.json
│   ├── mcp_config_uvx.json
│   ├── mcp_config_windows.json
│   └── mcp_config_linux.json
├── pyproject.toml
├── README.md
└── LICENSE

Troubleshooting

Windows: Unicode/Chinese filename issues

If you encounter issues with non-ASCII characters in file paths on Windows, use the Windows-specific configuration that sets the code page to UTF-8:

{
  "mcpServers": {
    "DocReader": {
      "command": "cmd",
      "args": ["/c", "chcp 65001 >nul && mcp-doc-reader"]
    }
  }
}

.doc files not supported

The Word reader only supports .docx format. To read .doc files, please convert them to .docx first using Microsoft Word or LibreOffice.
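
For more than a handful of files, LibreOffice can perform the conversion from the command line in headless mode. The sketch below builds and runs that command from Python. It assumes the `soffice` executable is on `PATH` (the binary may be named or located differently on some systems); `build_convert_command` and `convert_doc_to_docx` are hypothetical helpers, not part of mcp-doc-reader.

```python
import subprocess
from pathlib import Path


def build_convert_command(doc_path: str, out_dir: str) -> list[str]:
    """Return the LibreOffice command that converts doc_path to .docx."""
    return [
        "soffice",          # assumed to be on PATH
        "--headless",       # run without opening a window
        "--convert-to", "docx",
        "--outdir", out_dir,
        doc_path,
    ]


def convert_doc_to_docx(doc_path: str) -> Path:
    """Convert a .doc file in place and return the expected .docx path."""
    src = Path(doc_path)
    subprocess.run(build_convert_command(str(src), str(src.parent)), check=True)
    return src.with_suffix(".docx")


if __name__ == "__main__":
    # Print the command only; call convert_doc_to_docx() to actually run it.
    print(" ".join(build_convert_command("/path/to/legacy.doc", "/path/to")))
```

The returned path can then be passed to `read_word` as `file_path`.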

License

MIT License - see LICENSE for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Quick Setup
Installation guide for this server

Install the package (if needed)

uvx mcp-doc-reader

Cursor configuration (mcp.json)

{
  "mcpServers": {
    "aardpro-mcp-doc-reader": {
      "command": "uvx",
      "args": ["mcp-doc-reader"]
    }
  }
}
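
If an `mcp.json` file already lists other servers, the new entry should be merged in rather than pasted over the whole file. A minimal sketch, assuming the file uses the `mcpServers` layout shown above: `merge_server` and `add_server_to_file` are hypothetical helpers, and the caller supplies the path because its location depends on the client.

```python
import json
from pathlib import Path


def merge_server(config: dict, name: str, entry: dict) -> dict:
    """Return a copy of config with entry added under mcpServers[name]."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


def add_server_to_file(config_path: str, name: str, entry: dict) -> None:
    """Merge entry into the JSON file at config_path, creating it if missing."""
    path = Path(config_path)
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    merged = merge_server(config, name, entry)
    path.write_text(json.dumps(merged, indent=2) + "\n", encoding="utf-8")


if __name__ == "__main__":
    existing = {"mcpServers": {"other": {"command": "other-server"}}}
    entry = {"command": "uvx", "args": ["mcp-doc-reader"]}
    print(json.dumps(merge_server(existing, "aardpro-mcp-doc-reader", entry), indent=2))
```

Existing entries with other names are preserved; an entry with the same name is replaced.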