MCP-Driven RAG-Enhanced LLM for Oncology

An MCP-governed, RAG-enhanced multi-agent LLM system for prostate cancer care, developed in collaboration with UChicago Medicine. Built using LangGraph, LangChain, ChromaDB, FastAPI, OpenAI GPT-4, HuggingFace embeddings, and ML & survival models (XGBoost, Cox, Weibull, RSF) for clinical reasoning and treatment prediction.

Created on 6/9/2026
Updated about 6 hours ago
Repository documentation and setup instructions

MCP-Driven, Multi-Agent, RAG-Enhanced, LangGraph-Orchestrated LLM System for Prostate Cancer Decision Support

📍 Presented at SIIM-CAIMI 2025 (Society for Imaging Informatics in Medicine)
🏥 University of Chicago – MS in Applied Data Science Capstone


🔬 Project Overview

This project presents an auditable, multi-agent Retrieval-Augmented Generation (RAG) framework designed to assist oncology workflows through:

  • Longitudinal temporal summarization
  • Evidence-grounded literature integration
  • Treatment recommendation via supervised ML
  • Lifespan estimation via survival analysis
  • Hallucination detection and validation loop

The system processes 500 synthetic longitudinal prostate cancer records and produces:

  • Structured timeline summaries
  • Literature-verified clinical context
  • Ranked treatment recommendations (with probabilities)
  • Survival probabilities at 5-, 10-, and 15-year horizons
  • Expected lifespan estimates (in years)

🏗 System Architecture

The architecture follows an MCP-governed modular design with strict tool mediation and validation.

High-Level Flow

Patient ID
   ↓
MCP Server (secure context retrieval)
   ↓
LangGraph Orchestration
   ├── Retrieval Tool (PubMed phenotype-aware query builder)
   ├── Summarizer Agent (structured clinical report generation)
   ├── Validator Agent (hallucination + missing data detection)
   ↓
Structured Model APIs
   ├── XGBoost Treatment Model
   ├── Survival Ensemble (Cox + Weibull + RSF)
   ↓
Final Validated Clinical Report

🧠 Model Architecture Diagram

flowchart TD

A[Patient ID] --> B[MCP Server<br/>Longitudinal Context Retrieval]

B --> C[LangGraph Orchestrator]

C --> D[Phenotype-Aware RAG Tool<br/>PubMed Query + Scoring]
C --> E[Summarizer Agent<br/>Structured Clinical Report]
C --> F[Validator Agent<br/>Hallucination & Missing Data Check]

E --> F
F -->|Retry if Needed| E

F --> G[Treatment Prediction API<br/>XGBoost Classifier]
F --> H[Lifespan Estimation API<br/>Cox + Weibull + RSF]

G --> I[Ranked Therapy + Probabilities]
H --> J[5/10/15-Year Survival + Expected Years]

I --> K[Final MCP-Audited Clinical Report]
J --> K

🛠 Technical Stack

🧩 LLM & Orchestration

  • LangGraph (multi-agent workflow control)
  • OpenAI GPT-4 (summarizer + validator agents)
  • Prompt-constrained structured generation
  • Retry routing controller with bounded iterations
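
The bounded retry routing between the summarizer and the validator can be pictured with the minimal sketch below. It is plain Python rather than the project's LangGraph code, and `run_with_retries`, `ValidationResult`, and the `MAX_RETRIES` value are hypothetical names and settings chosen for illustration; the source does not state the actual iteration limit.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

MAX_RETRIES = 3  # assumed bound; the source does not state the actual limit


@dataclass
class ValidationResult:
    ok: bool
    issues: List[str] = field(default_factory=list)


def run_with_retries(
    context: dict,
    summarize: Callable[[dict, List[str]], str],
    validate: Callable[[dict, str], ValidationResult],
    max_retries: int = MAX_RETRIES,
) -> Tuple[str, ValidationResult, int]:
    """Summarize, validate, and retry with validator feedback until the
    draft passes or the attempt budget is exhausted."""
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")
    feedback: List[str] = []
    for attempt in range(1, max_retries + 1):
        draft = summarize(context, feedback)
        result = validate(context, draft)
        if result.ok:
            break
        # Route the validator's findings back into the next summarizer call.
        feedback = result.issues
    return draft, result, attempt
```

The caller receives the last draft together with its validation result, so a report that still fails after the final attempt can be held back instead of being finalized.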

🔐 Governance Layer

  • Model Context Protocol (MCP) Server

    • Versioned patient context retrieval
    • Tool mediation
    • Auditable endpoint calls
    • Metadata tracking (model name, version, timestamp)
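
One way to make endpoint calls auditable is to wrap each tool in a decorator that records the metadata listed above. The sketch below is an assumption about how such a layer might look, not the project's MCP server code; `audited`, `AUDIT_LOG`, and `get_patient_context` are hypothetical names, and the in-memory list stands in for whatever persistent store a real deployment would use.

```python
import functools
import json
from datetime import datetime, timezone
from typing import Any, Callable, Dict, List

AUDIT_LOG: List[Dict[str, Any]] = []  # in-memory stand-in for a persistent audit store


def audited(tool_name: str, model_name: str, model_version: str) -> Callable:
    """Decorator that appends one audit entry per tool call."""

    def decorator(fn: Callable) -> Callable:
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            entry = {
                "tool": tool_name,
                "model_name": model_name,
                "model_version": model_version,
                "timestamp": datetime.now(timezone.utc).isoformat(),
                "arguments": json.dumps({"args": args, "kwargs": kwargs}, default=str),
            }
            try:
                result = fn(*args, **kwargs)
                entry["status"] = "ok"
                return result
            except Exception as exc:
                entry["status"] = f"error: {type(exc).__name__}"
                raise
            finally:
                # Runs on both success and failure, so every call is logged.
                AUDIT_LOG.append(entry)

        return wrapper

    return decorator


@audited("get_patient_context", model_name="context-store", model_version="v1")
def get_patient_context(patient_id: str) -> dict:
    # Hypothetical lookup; a real server would read versioned patient records.
    return {"patient_id": patient_id, "context_version": 1}
```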

📚 Retrieval (RAG)

  • PubMed XML API

  • Phenotype-aware query builder

  • Signal extraction via regex parsers

  • Evidence scoring:

    • Clinical alignment
    • Recency filtering (≥2016)
    • Endpoint relevance
    • Novelty weighting
  • Deterministic citation embedding (verbatim insertion)
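
The query-building and scoring steps above can be illustrated with a small sketch. Everything here is an assumption made for illustration: the phenotype keys, the search terms, the weights, and the function names `build_query` and `score_article` do not come from the repository. The query string uses PubMed field tags (`[tiab]` for title/abstract, `[dp]` for publication date); novelty weighting is left out because the source does not describe how it is computed.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

MIN_YEAR = 2016  # recency cutoff stated in the list above


@dataclass
class Article:
    pmid: str
    year: int
    abstract: str


def build_query(phenotype: dict) -> str:
    """Assemble a PubMed search string from hypothetical phenotype flags."""
    terms = ['"prostate cancer"[tiab]']
    if phenotype.get("metastatic"):
        terms.append("metastatic[tiab]")
    if phenotype.get("castration_resistant"):
        terms.append('"castration-resistant"[tiab]')
    return " AND ".join(terms) + f" AND {MIN_YEAR}:3000[dp]"


def _fraction_present(terms: Iterable[str], text: str) -> float:
    terms = list(terms)
    if not terms:
        return 0.0
    return sum(term.lower() in text for term in terms) / len(terms)


def score_article(
    article: Article,
    phenotype_terms: Iterable[str],
    endpoint_terms: Iterable[str],
    weights: Tuple[float, float, float] = (0.5, 0.3, 0.2),  # assumed weights
) -> float:
    """Combine clinical alignment, endpoint relevance, and recency into one score."""
    if article.year < MIN_YEAR:
        return 0.0  # hard recency filter
    text = article.abstract.lower()
    alignment = _fraction_present(phenotype_terms, text)
    endpoint = _fraction_present(endpoint_terms, text)
    recency = min(1.0, (article.year - MIN_YEAR) / 10)
    w_align, w_endpoint, w_recency = weights
    return w_align * alignment + w_endpoint * endpoint + w_recency * recency
```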

📊 Treatment Recommendation Model

  • XGBoost classifier

  • Features:

    • TNM stage
    • Gleason grade
    • PSA trajectory
    • PSA velocity
    • Metastatic indicators
    • Treatment history
  • Output:

    • Top-N ranked therapies
    • Class probabilities
    • Feature-driven rationale
  • Patient-level train/test split

  • Accuracy on the synthetic dataset: 1.00 (an upper bound on synthetic data, not a clinical claim)
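
A patient-level split keeps all visits of one patient in the same partition, so the test set never contains a patient the model has already seen. The sketch below shows that idea together with top-N ranking of class probabilities. It uses only the standard library; `patient_level_split`, `top_n_therapies`, the `patient_id` key, and the 20% test fraction are illustrative assumptions, not the project's code.

```python
import random
from typing import Dict, List, Tuple


def patient_level_split(
    rows: List[dict], test_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[dict], List[dict]]:
    """Split visit rows so that every patient lands in exactly one partition."""
    patient_ids = sorted({row["patient_id"] for row in rows})
    random.Random(seed).shuffle(patient_ids)
    n_test = max(1, round(len(patient_ids) * test_fraction))
    test_ids = set(patient_ids[:n_test])
    train = [row for row in rows if row["patient_id"] not in test_ids]
    test = [row for row in rows if row["patient_id"] in test_ids]
    return train, test


def top_n_therapies(class_probs: Dict[str, float], n: int = 3) -> List[Tuple[str, float]]:
    """Rank therapy classes by predicted probability, highest first."""
    return sorted(class_probs.items(), key=lambda item: item[1], reverse=True)[:n]
```

With a classifier that exposes per-class probabilities, the dictionary passed to `top_n_therapies` would map each therapy label to its predicted probability for one patient.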

📈 Lifespan Estimation Model (Ensemble Survival Framework)

Three complementary models:

  1. Cox Proportional Hazards (interpretable hazard ratios)
  2. Weibull Regression (stage-stratified baseline survival curves)
  3. Random Survival Forest (nonlinear feature interactions)

Workflow:

  • TNM-based stratification (localized / N1 / M1)
  • Weibull baseline curve
  • Cox-based patient-specific risk shift
  • RSF nonlinear modulation
  • Ensemble averaging
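
The workflow above can be sketched numerically under some stated assumptions. The Weibull baseline is taken as S0(t) = exp(-(t/scale)^shape), the Cox shift uses the proportional-hazards identity S(t | x) = S0(t)^exp(lp) for a linear predictor lp, and the ensemble is an unweighted mean of the per-model survival estimates. How the project actually combines the three models, including the RSF modulation, is not specified in the source, so the function names and the averaging rule here are illustrative only.

```python
import math
from typing import Callable, Sequence


def weibull_survival(t: float, scale: float, shape: float) -> float:
    """Baseline survival S0(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))


def cox_shift(baseline_survival: float, linear_predictor: float) -> float:
    """Proportional hazards: S(t | x) = S0(t) ** exp(linear_predictor)."""
    return baseline_survival ** math.exp(linear_predictor)


def ensemble_survival(estimates: Sequence[float]) -> float:
    """Unweighted mean of per-model survival probabilities (assumed rule)."""
    return sum(estimates) / len(estimates)


def restricted_mean_years(
    survival_fn: Callable[[float], float], horizon: float = 30.0, step: float = 0.1
) -> float:
    """Expected survival time up to `horizon`, as the trapezoidal area under S(t)."""
    n_steps = int(round(horizon / step))
    total = 0.0
    for i in range(n_steps):
        left, right = i * step, (i + 1) * step
        total += 0.5 * (survival_fn(left) + survival_fn(right)) * step
    return total
```

A positive linear predictor lowers the survival curve and a negative one raises it, which is what a patient-specific risk shift on a stage-stratified baseline amounts to.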

Outputs:

  • 5-, 10-, 15-year survival probabilities
  • Expected survival time (years)
  • Monotonic survival validation checks

Internal QA:

  • Survival ordering check (M1 < N1 < localized)
  • Probability bounds enforcement
  • Curve monotonicity
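
The three QA checks are simple to express in code. The following is a minimal sketch, assuming survival probabilities are stored per horizon (in years) and per stage group; `check_curve` and `check_stage_ordering` are hypothetical helper names.

```python
from typing import Dict, List


def check_curve(probs_by_horizon: Dict[int, float]) -> List[str]:
    """Return a list of QA issues for one patient's survival probabilities."""
    issues = []
    horizons = sorted(probs_by_horizon)
    # Probability bounds: every value must lie in [0, 1].
    for horizon in horizons:
        p = probs_by_horizon[horizon]
        if not 0.0 <= p <= 1.0:
            issues.append(f"{horizon}-year probability {p} is outside [0, 1]")
    # Monotonicity: survival must not increase at a later horizon.
    for earlier, later in zip(horizons, horizons[1:]):
        if probs_by_horizon[later] > probs_by_horizon[earlier]:
            issues.append(f"survival rises from {earlier} to {later} years")
    return issues


def check_stage_ordering(survival_by_stage: Dict[str, float]) -> bool:
    """Survival ordering check at a fixed horizon: M1 < N1 < localized."""
    return (
        survival_by_stage["M1"]
        < survival_by_stage["N1"]
        < survival_by_stage["localized"]
    )
```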

🏥 Conference Presentation

This work was presented at:

SIIM-CAIMI 2025
Society for Imaging Informatics in Medicine – Conference on Artificial Intelligence in Medical Imaging


📁 Data

  • 500 synthetic longitudinal prostate cancer records

  • 5–7 time-stamped visits per patient

  • Variables include:

    • PSA (with kinetics)
    • Gleason grade
    • TNM stage
    • Bone lesion count
    • Visceral metastasis flag
    • ALP, LDH, albumin, hemoglobin
    • Treatment history
    • Weight trends

Validated for medical plausibility by a practicing radiologist.
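
PSA kinetics from time-stamped visits can be derived as a slope over time. The sketch below computes PSA velocity as the least-squares slope of PSA against time in years; this is one common definition and an assumption about the feature, since the source does not say how the project computes it. The function name and the (date, value) input format are hypothetical, and PSA is assumed to be in ng/mL.

```python
from datetime import date
from typing import List, Tuple


def psa_velocity(visits: List[Tuple[date, float]]) -> float:
    """Least-squares slope of PSA (assumed ng/mL) against time, in units per year."""
    if len(visits) < 2:
        raise ValueError("at least two visits are required")
    first_day = min(day for day, _ in visits)
    years = [(day - first_day).days / 365.25 for day, _ in visits]
    values = [value for _, value in visits]
    mean_t = sum(years) / len(years)
    mean_v = sum(values) / len(values)
    spread = sum((t - mean_t) ** 2 for t in years)
    if spread == 0:
        raise ValueError("visits must fall on more than one date")
    covariance = sum((t - mean_t) * (v - mean_v) for t, v in zip(years, values))
    return covariance / spread
```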


🔎 Evaluation & Validation

Structured Models

  • Held-out patient-level testing
  • Directional consistency validation
  • Internal statistical QA checks

LLM Components

  • Dedicated validator agent

  • Detection of:

    • Hallucinated content
    • Missing patient data
  • Iterative retry loop

  • Verbatim literature line insertion (no citation hallucination)

No summary is finalized while hallucinations remain unresolved.
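
Verbatim literature insertion and the validator's checks can be illustrated as follows. This is a sketch under assumptions: the `Literature:` block layout, the `[n]` numbering, and the helper names are invented for illustration and are not taken from the repository. The idea is that retrieved lines are appended by code rather than rewritten by the LLM, so any citation line that differs from the retrieved text can be flagged.

```python
from typing import List


def insert_citations(summary: str, evidence_lines: List[str]) -> str:
    """Append retrieved literature lines unchanged below the summary."""
    block = "\n".join(f"[{i}] {line}" for i, line in enumerate(evidence_lines, 1))
    return f"{summary}\n\nLiterature:\n{block}"


def unverified_citations(report: str, evidence_lines: List[str]) -> List[str]:
    """Return citation lines that do not match a retrieved line verbatim."""
    allowed = set(evidence_lines)
    flagged = []
    in_block = False
    for line in report.splitlines():
        if line == "Literature:":
            in_block = True
            continue
        if in_block and line.strip():
            # Drop the "[n] " prefix before comparing with the retrieved text.
            _, _, text = line.partition("] ")
            if text not in allowed:
                flagged.append(line)
    return flagged


def missing_fields(record: dict, required: List[str]) -> List[str]:
    """List required patient fields that are absent or empty in the record."""
    return [name for name in required if record.get(name) is None]
```

A validator built this way would send a draft back for another pass whenever `unverified_citations` or `missing_fields` returns a non-empty list.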


🎯 Key Contributions

  • MCP-governed agentic clinical AI framework
  • Hallucination-resistant RAG integration
  • Survival ensemble integrated into LLM workflow
  • Deterministic literature grounding
  • Modular API-based predictive model integration
  • Fully auditable report generation pipeline

📌 Research Context

This project addresses limitations in current oncology AI systems:

  • Lack of temporal reasoning
  • Hallucination in generative summaries
  • Non-auditable clinical AI outputs
  • Separation between ML survival models and narrative reasoning

The architecture demonstrates a reproducible pattern for safe LLM deployment in healthcare.


⚠ Disclaimer

This system was trained and evaluated on synthetic data and is intended for research demonstration only.

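Quick Setup
Installation guide for this server

Install command (package not published)

git clone https://github.com/hyunji0618/MCP-Driven-RAG-Enhanced-LLM-for-Oncology
Manual installation: see the README for detailed setup instructions and any additional required dependencies.

Cursor configuration (mcp.json)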

{
  "mcpServers": {
    "hyunji0618-mcp-driven-rag-enhanced-llm-for-oncology": {
      "command": "git",
      "args": [
        "clone",
        "https://github.com/hyunji0618/MCP-Driven-RAG-Enhanced-LLM-for-Oncology"
      ]
    }
  }
}