MCP server for class 12 physics students.

Created 3/26/2026

PDF OCR MCP Server

A production-oriented Model Context Protocol (MCP) server for working with OCR text extracted from PDF page images. The repository includes an OCR pipeline for generating page text files and an HTTP-based MCP server built with Express and the official MCP TypeScript SDK.

This project is designed for local corpora that have already been split into page images. It exposes that OCR corpus through MCP tools, resources, and prompts so MCP clients can search, inspect, and summarize page content efficiently.

Features

  • HTTP-based MCP server using Streamable HTTP and Express
  • OCR pipeline for converting page images into .txt files with Tesseract
  • Safe file access patterns to prevent path traversal
  • In-memory LRU-style cache for hot text pages
  • Fast file-list caching for repeated MCP calls
  • Session-aware MCP transport with graceful shutdown
  • Health and readiness endpoints for local ops and deployment checks
  • MCP tools, resources, and prompts for page discovery and retrieval
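
The "safe file access" feature can be illustrated with a strict file-name allowlist. This is a minimal sketch, not the repository's actual code: `isSafeTextFileName` and `resolveTextPath` are hypothetical names, and the real `TextRepository` may validate paths differently (for example by resolving and comparing absolute paths).

```typescript
// Hypothetical helpers illustrating path traversal prevention by allowlist.
// Only a single path segment ending in ".txt" is accepted.
const SAFE_TEXT_FILE = /^[A-Za-z0-9][A-Za-z0-9._-]*\.txt$/;

function isSafeTextFileName(file: string): boolean {
  // Reject path separators and parent-directory sequences outright.
  if (file.includes("/") || file.includes("\\") || file.includes("..")) {
    return false;
  }
  return SAFE_TEXT_FILE.test(file);
}

function resolveTextPath(baseDir: string, file: string): string {
  if (!isSafeTextFileName(file)) {
    throw new Error(`Unsafe file name: ${file}`);
  }
  // Joining is safe here because the name is known to be one segment.
  return `${baseDir.replace(/\/+$/, "")}/${file}`;
}
```

With this approach, a request for `../package.json` is rejected before any file system call is made, while `page-1.txt` resolves to `texts/page-1.txt`.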

Repository Structure

.
|-- pages/                  # Source page images (.png)
|-- texts/                  # OCR output text files (.txt)
|-- src/
|   |-- scripts/
|   |   `-- index.ts        # OCR generation script
|   `-- tools/
|       |-- http-server.ts  # Express + HTTP MCP server entrypoint
|       |-- mcp-server.ts   # MCP tools/resources/prompts registration
|       |-- text-repository.ts
|       `-- tools.ts        # Compatibility entrypoint
|-- package.json
`-- tsconfig.json

Architecture

The codebase is split into three clean layers:

  • TextRepository: handles safe file resolution, page listing, cached reads, range reads, and text search.

  • createTextMcpServer: defines the MCP contract exposed to clients, including tools, resource templates, and prompts.

  • http-server: hosts the MCP server over Express using Streamable HTTP transport, manages sessions, and exposes operational endpoints.
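
The cached reads mentioned for TextRepository could be backed by a small LRU structure. The sketch below is an assumption about how such a cache might look, not the repository's implementation; `LruTextCache` and its methods are hypothetical names. It relies on the fact that a JavaScript `Map` iterates in insertion order.

```typescript
// Hypothetical LRU cache for page text, keyed by file name.
class LruTextCache {
  private readonly entries = new Map<string, string>();
  private readonly maxEntries: number;

  constructor(maxEntries: number) {
    this.maxEntries = maxEntries;
  }

  get(key: string): string | undefined {
    const value = this.entries.get(key);
    if (value === undefined) return undefined;
    // Re-insert so this key becomes the most recently used entry.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: string, value: string): void {
    if (this.entries.has(key)) this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }

  get size(): number {
    return this.entries.size;
  }
}
```

A repository layer would typically call `get` first and fall back to a disk read followed by `set` on a miss.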

Requirements

  • Node.js 18+
  • npm
  • Tesseract OCR installed and available on PATH

The OCR script currently invokes tesseract directly, so the binary must be accessible from your shell.

Installation

npm install

OCR Workflow

If your page images already exist in pages/, generate text files with:

npm run generate:textfiles

This will:

  • read .png files from pages/
  • sort them by page number
  • run Tesseract with English language data
  • write matching .txt files into texts/
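
The steps above can be sketched as two pure helpers: one that orders page images numerically and one that builds the Tesseract argument list. The function names and the `page-N.png` naming pattern are assumptions for illustration; the actual script in src/scripts/index.ts may differ. The argument order follows the standard Tesseract CLI form `tesseract <image> <outputbase> -l <lang>`, where Tesseract appends `.txt` to the output base.

```typescript
// Hypothetical helper: take the first run of digits as the page number,
// e.g. "page-12.png" -> 12. Files without digits sort last.
function pageNumber(file: string): number {
  const match = file.match(/\d+/);
  return match ? Number(match[0]) : Number.MAX_SAFE_INTEGER;
}

// Keep only .png files and sort numerically so page-10 follows page-2.
function sortPageImages(files: string[]): string[] {
  return files
    .filter((f) => f.toLowerCase().endsWith(".png"))
    .sort((a, b) => pageNumber(a) - pageNumber(b) || a.localeCompare(b));
}

// Build CLI arguments for one page. The output base has no extension
// because Tesseract adds ".txt" itself.
function tesseractArgs(pagesDir: string, textsDir: string, file: string): string[] {
  const base = file.replace(/\.png$/i, "");
  return [`${pagesDir}/${file}`, `${textsDir}/${base}`, "-l", "eng"];
}
```

The resulting array can be passed to a process launcher such as `execFile("tesseract", args)` from `node:child_process`, which avoids shell quoting issues.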

Running the MCP Server

Start the HTTP-based MCP server with:

npm run start

The server defaults to:

  • MCP endpoint: http://127.0.0.1:3000/mcp
  • health endpoint: http://127.0.0.1:3000/healthz
  • readiness endpoint: http://127.0.0.1:3000/readyz

Environment Variables

The server supports the following environment variables:

| Variable            | Default     | Description                                |
| ------------------- | ----------- | ------------------------------------------ |
| MCP_HOST            | 127.0.0.1   | Host interface to bind the server to       |
| MCP_PORT            | 3000        | Port used by the HTTP MCP server           |
| MCP_BODY_LIMIT      | 1mb         | Maximum JSON request body size             |
| MCP_ALLOWED_HOSTS   | empty       | Optional comma-separated host allowlist    |
| MCP_PRELOAD_CACHE   | true        | Preload text content into cache on startup |

Example:

MCP_HOST=127.0.0.1
MCP_PORT=3000
MCP_BODY_LIMIT=1mb
MCP_PRELOAD_CACHE=true
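
For illustration, the variables above could be read into a typed configuration object as follows. This is a sketch under assumptions: `ServerConfig` and `loadConfig` are hypothetical names, and the exact parsing rules (for example, treating only the string `false` as disabling preload) are not confirmed by the repository.

```typescript
// Hypothetical typed view of the documented environment variables.
interface ServerConfig {
  host: string;
  port: number;
  bodyLimit: string;
  allowedHosts: string[];
  preloadCache: boolean;
}

// Takes the environment as a plain object so it is easy to test;
// in a Node process you would pass process.env.
function loadConfig(env: Record<string, string | undefined>): ServerConfig {
  const rawPort = env["MCP_PORT"] ?? "3000";
  const port = Number(rawPort);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid MCP_PORT: ${rawPort}`);
  }
  return {
    host: env["MCP_HOST"] ?? "127.0.0.1",
    port,
    bodyLimit: env["MCP_BODY_LIMIT"] ?? "1mb",
    // An empty or missing value yields an empty allowlist.
    allowedHosts: (env["MCP_ALLOWED_HOSTS"] ?? "")
      .split(",")
      .map((h) => h.trim())
      .filter((h) => h.length > 0),
    // Assumption: any value other than "false" keeps preloading enabled.
    preloadCache: (env["MCP_PRELOAD_CACHE"] ?? "true").toLowerCase() !== "false",
  };
}
```

Validating the port at startup surfaces a misconfigured environment immediately instead of at the first bind attempt.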

MCP Capabilities

Tools

  • list_text_pages: lists available OCR text files with pagination support.

  • read_text_page: reads a single OCR page by file name and can optionally truncate the response.

  • read_text_range: reads a bounded character range from a page for more efficient retrieval.

  • search_text_pages: searches the corpus and returns contextual snippets for matching pages.

  • get_corpus_stats: returns repository and cache metrics for diagnostics and monitoring.
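
The core logic behind the listing, range, and search tools might look roughly like the following. All function names, parameter names, and defaults here are hypothetical; the real tool schemas are defined in src/tools/mcp-server.ts and may use different arguments. The search uses plain case-insensitive substring matching, consistent with the Limitations section.

```typescript
interface PageListing {
  files: string[];
  total: number;
  nextOffset: number | null; // null when there are no more pages
}

// Offset/limit pagination over the sorted file list.
function listPages(all: string[], offset = 0, limit = 50): PageListing {
  const files = all.slice(offset, offset + limit);
  const end = offset + files.length;
  return { files, total: all.length, nextOffset: end < all.length ? end : null };
}

// Return at most `length` characters starting at `start`, clamped to the text.
function readRange(text: string, start: number, length: number): string {
  const from = Math.max(0, Math.min(start, text.length));
  const to = Math.min(text.length, from + Math.max(0, length));
  return text.slice(from, to);
}

interface Page {
  file: string;
  text: string;
}

interface SearchHit {
  file: string;
  snippet: string;
}

// Case-insensitive substring search returning the first match per page,
// with `context` characters of surrounding text on each side.
function searchPages(pages: Page[], query: string, context = 20): SearchHit[] {
  const needle = query.toLowerCase();
  const hits: SearchHit[] = [];
  if (needle.length === 0) return hits;
  for (const page of pages) {
    const index = page.text.toLowerCase().indexOf(needle);
    if (index === -1) continue;
    const start = Math.max(0, index - context);
    const end = Math.min(page.text.length, index + needle.length + context);
    hits.push({ file: page.file, snippet: page.text.slice(start, end) });
  }
  return hits;
}
```

A client would usually call the search first, then use the range read to pull only the region around a hit rather than the full page.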

Resources

  • texts://page/{file}: exposes OCR text pages as MCP resources through a dynamic resource template.
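
As an illustration of how the `texts://page/{file}` template maps to file names, the sketch below builds and parses such URIs. The helper names are hypothetical, and whether the server percent-encodes the `{file}` segment is an assumption.

```typescript
const PAGE_URI_PREFIX = "texts://page/";

// Build a resource URI for a page file, percent-encoding the file name.
function pageUri(file: string): string {
  return PAGE_URI_PREFIX + encodeURIComponent(file);
}

// Extract the file name from a resource URI, or null if the URI does not
// match the template or cannot be decoded.
function fileFromPageUri(uri: string): string | null {
  if (!uri.startsWith(PAGE_URI_PREFIX)) return null;
  const encoded = uri.slice(PAGE_URI_PREFIX.length);
  if (encoded.length === 0 || encoded.includes("/")) return null;
  try {
    return decodeURIComponent(encoded);
  } catch {
    // decodeURIComponent throws URIError on malformed escape sequences.
    return null;
  }
}
```

The decoded name should still go through the same file-name validation as any other input, since an encoded separator such as `%2F` decodes to `/`.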

Prompts

  • summarize_text_page: generates a reusable prompt for summarizing a specific OCR page, with optional focus text.
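
A prompt of this kind resolves to a list of MCP prompt messages. The sketch below shows one plausible shape; the function name and the prompt wording are invented for illustration and are not taken from the repository. The message structure (a `role` plus a `content` object of type `text`) follows the MCP prompt message format.

```typescript
interface PromptMessage {
  role: "user";
  content: { type: "text"; text: string };
}

// Hypothetical builder: produces a single user message asking for a summary
// of one page, optionally narrowed by a focus string.
function summarizePagePrompt(file: string, focus?: string): PromptMessage[] {
  const trimmed = focus ? focus.trim() : "";
  const focusLine = trimmed.length > 0 ? ` Focus on: ${trimmed}.` : "";
  const text =
    `Summarize the OCR text of ${file}.${focusLine}` +
    " Point out any passages that look like OCR errors.";
  return [{ role: "user", content: { type: "text", text } }];
}
```

For example, `summarizePagePrompt("page-3.txt", "definitions")` yields one user message that names the page and the focus topic.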

Operational Notes

The HTTP server includes several production-friendly behaviors:

  • per-request IDs for logging
  • request duration logging
  • JSON parse error handling
  • Cache-Control: no-store on MCP endpoints
  • session tracking for stateful MCP transport
  • graceful cleanup on SIGINT and SIGTERM
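
Session tracking and graceful cleanup can be combined in a small registry that closes every open transport on shutdown. This is a sketch under assumptions: `SessionRegistry` and `Closable` are hypothetical names, and the actual http-server.ts may track sessions differently.

```typescript
// Anything with an async close(), such as an MCP transport.
interface Closable {
  close(): Promise<void>;
}

// Hypothetical registry mapping session IDs to their transports.
class SessionRegistry {
  private readonly sessions = new Map<string, Closable>();

  add(id: string, transport: Closable): void {
    this.sessions.set(id, transport);
  }

  remove(id: string): boolean {
    return this.sessions.delete(id);
  }

  get size(): number {
    return this.sessions.size;
  }

  // Close every session and return how many were closed. The map is cleared
  // first so late requests cannot reuse a transport that is shutting down.
  async closeAll(): Promise<number> {
    const pending = Array.from(this.sessions.values());
    this.sessions.clear();
    // A failure in one transport must not block the others.
    await Promise.all(pending.map((s) => s.close().catch(() => undefined)));
    return pending.length;
  }
}
```

In a Node process, a SIGINT or SIGTERM handler would typically await `closeAll()`, stop the HTTP listener, and then exit.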

Development

Type-check the project with:

npx tsc --noEmit

Suggested MCP Client Usage

This repository is a good fit for clients that need to:

  • search large OCR corpora before loading full pages
  • retrieve only relevant text ranges instead of full files
  • consume page text as MCP resources
  • build summarization or extraction workflows on top of OCR output

Limitations

  • OCR currently assumes English text via -l eng
  • OCR input is currently limited to .png page images in pages/
  • Search is simple substring matching, not full-text indexed search
  • Cache is in-memory and resets on process restart

Future Improvements

  • add structured metadata for page numbers, source PDFs, and sections
  • support multi-language OCR
  • add full-text indexing for faster corpus-wide search
  • add tests for repository and transport behavior
  • add Docker support and deployment examples

License

This repository is currently marked as ISC in package.json.

Quick Setup
Installation guide for this server

Install the package (if needed)

npx @modelcontextprotocol/server-pdf-ocr-mcpserver

Cursor configuration (mcp.json)

{
  "mcpServers": {
    "adeelahmedhashmi-pdf-ocr-mcpserver": {
      "command": "npx",
      "args": [
        "adeelahmedhashmi-pdf-ocr-mcpserver"
      ]
    }
  }
}