# MCP + LangGraph + RAG Research Agent Demo
This repo shows how to attach MCP tools to an OpenAI LLM in LangGraph, including a RAG MCP server, and orchestrate everything as a single research assistant agent.
The setup is intentionally opinionated and production-inspired:
- Multiple MCP servers:
  - `MathServer` (stdio)
  - `ResearchServer` (HTTP)
  - `RAGServer` (stdio, vector search over local docs)
- A LangGraph agent that:
  - Uses `openai:gpt-4.1` as the reasoning engine.
  - Discovers all MCP tools via `MultiServerMCPClient`.
  - Binds them to the LLM with `model.bind_tools(tools)`.
  - Executes tool calls via `ToolNode` + `tools_condition`.
- A CLI that lets you interrogate the system end-to-end.
## 1. Architecture Overview
At a high level:
```
                                +---------------------------+
                                | ResearchServer (HTTP)     |
                                |  - search_docs()          |
                                |  - get_paper_abstract()   |
                                +-------------+-------------+
                                              ^
+---------------------------+                 |
| MathServer (stdio)        |                 |
|  - add()                  |                 |
|  - multiply()             |                 |
|  - compound_interest()    |                 |
+-------------+-------------+                 |
              ^                               |
              |                               |
+-------------+-------------+                 |
| RAGServer (stdio)         |                 |
|  - index_corpus()         |                 |
|  - search_corpus()        |                 |
+-------------+-------------+                 |
              ^                               |
              |                               |
+-----+-------------------------+-------------+--------------+
| MultiServerMCPClient (langchain-mcp-adapters)              |
|  - Connects to all MCP servers                             |
|  - Exposes tools to LangGraph / LangChain                  |
+----------------------+-------------------------------------+
                       |
                       v
               +-------+--------+
               |  OpenAI LLM    |
               | openai:gpt-4.1 |
               +-------+--------+
                       |
            model.bind_tools(tools)
                       |
                       v
        +--------------+--------------------+
        | LangGraph StateGraph              |
        |  - Node: LLM                      |
        |  - Node: ToolNode(MCP tools)      |
        |  - Edge: tools_condition          |
        +-------------------+---------------+
                            |
                            v
                     CLI / Notebook
```
The agent behaves like a research co-pilot that can:
- Pull quick definitions and context via the `ResearchServer`.
- Run small calculations with `MathServer`.
- Ground answers in your local documentation via `RAGServer`.
## 2. Folder Structure
```
mcp-langgraph-rag-agent-demo/
├─ agent/
│  ├─ __init__.py
│  └─ graph_research_agent.py   # LangGraph + MCP wiring
│
├─ mcp_servers/
│  ├─ data/
│  │  ├─ 001_mcp_intro.md                 # Sample docs indexed by RAG
│  │  ├─ 002_security_considerations.md
│  │  └─ 003_rag_playbook.md
│  ├─ math_server.py       # MCP math tools (stdio)
│  ├─ research_server.py   # MCP research tools (HTTP)
│  └─ rag_server.py        # MCP RAG tools (stdio)
│
├─ scripts/
│  └─ run_agent.py          # CLI entrypoint
│
├─ .env.example             # Template for OpenAI key
├─ requirements.txt
└─ README.md
```
## 3. Quickstart
### 3.1 Prerequisites
- Python 3.10+
- An OpenAI API key for `gpt-4.1` (or update the model name in `graph_research_agent.py`).
- The ability to install Python dependencies (see `requirements.txt`).
### 3.2 Install dependencies
```bash
git clone <this-repo-url>
cd mcp-langgraph-rag-agent-demo

python -m venv .venv
source .venv/bin/activate   # On Windows: .venv\Scripts\activate

pip install -r requirements.txt
```
Create your `.env`:

```bash
cp .env.example .env
# Then edit .env and paste your OPENAI_API_KEY
```
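After editing, the `.env` file should contain at least your key (the variable name `OPENAI_API_KEY` is the one the repo references; the value below is a placeholder for you to fill in):

```bash
# .env
OPENAI_API_KEY=sk-...
```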
### 3.3 Start the HTTP Research MCP server
In terminal 1:
```bash
cd mcp_servers
python research_server.py
```
You should see logs indicating it is listening on `http://0.0.0.0:8000/mcp`.
The math and RAG servers are launched automatically via stdio by the MCP client, so you do not need to run them manually.
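For reference, the stdio servers follow the same `@mcp.tool()` pattern shown for the RAG server in section 5. A minimal sketch of what `mcp_servers/math_server.py` might look like (tool names are taken from the architecture diagram; the FastMCP scaffolding is assumed, so check the actual file):

```python
# Minimal sketch of a stdio math MCP server using the official MCP Python SDK.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("Math")


@mcp.tool()
def add(a: float, b: float) -> float:
    """Add two numbers."""
    return a + b


@mcp.tool()
def multiply(a: float, b: float) -> float:
    """Multiply two numbers."""
    return a * b


@mcp.tool()
def compound_interest(principal: float, rate: float, years: int) -> float:
    """Compound annually: principal * (1 + rate) ** years."""
    return principal * (1 + rate) ** years


if __name__ == "__main__":
    # stdio transport: the MCP client spawns this process and talks over pipes.
    mcp.run(transport="stdio")
```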
### 3.4 Run the LangGraph agent
In terminal 2 (at repo root):
```bash
python scripts/run_agent.py
```
You should see:
```text
✅ MCP + LangGraph + RAG agent is ready.
Type 'exit' or 'quit' to stop.
```
Now chat:
```text
You: What is MCP and why is it useful for connecting internal tools?
You: Search our docs for security considerations and summarise three key points.
You: Use RAG to find anything about the RAG Playbook and explain it simply.
You: If each query saves 3 minutes for 50 researchers per week, how many hours do we save in a year?
```
The agent will decide when to call tools such as:

- `search_docs` (HTTP MCP)
- `index_corpus` / `search_corpus` (RAG MCP)
- `compound_interest` or `add` / `multiply` (math MCP)
## 4. How MCP tools are attached to the OpenAI LLM
The key wiring lives in `agent/graph_research_agent.py`:
```python
from langchain_mcp_adapters.client import MultiServerMCPClient
from langgraph.prebuilt import ToolNode, tools_condition
from langgraph.graph import StateGraph, START, MessagesState
from langchain.chat_models import init_chat_model


async def build_mcp_tools():
    """Connect to all three MCP servers and collect their tools."""
    client = MultiServerMCPClient(
        {
            "math": {
                "command": "python",
                "args": ["mcp_servers/math_server.py"],
                "transport": "stdio",
            },
            "research": {
                "url": "http://localhost:8000/mcp",
                "transport": "streamable_http",
            },
            "rag": {
                "command": "python",
                "args": ["mcp_servers/rag_server.py"],
                "transport": "stdio",
            },
        }
    )
    tools = await client.get_tools()
    return tools


def make_llm_node(model, tools):
    """Wrap the LLM (with MCP tools bound) as a LangGraph node."""
    bound = model.bind_tools(tools)

    def llm_node(state):
        messages = state["messages"]
        response = bound.invoke(messages)
        return {**state, "messages": messages + [response]}

    return llm_node


async def build_graph():
    tools = await build_mcp_tools()
    model = init_chat_model("openai:gpt-4.1")

    builder = StateGraph(MessagesState)
    builder.add_node("llm", make_llm_node(model, tools))
    builder.add_node("tools", ToolNode(tools))

    builder.add_edge(START, "llm")
    # Route to the ToolNode if the LLM emitted tool calls, otherwise end.
    builder.add_conditional_edges("llm", tools_condition)
    builder.add_edge("tools", "llm")

    return builder.compile()
```
Interpretation:

- `MultiServerMCPClient` connects to three MCP servers and discovers their tools.
- `tools = await client.get_tools()` returns a flat list of LangChain-style tools.
- `model.bind_tools(tools)` tells `openai:gpt-4.1` that these tools are available.
- `ToolNode(tools)` knows how to execute whatever tool calls the LLM emits.
- `tools_condition` routes the graph:
  - If the LLM did not call any tools, the graph ends.
  - If it did, control goes to `ToolNode`, the tools run, and the results flow back to the LLM.
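To exercise the compiled graph outside the CLI, a minimal driver might look like this (hypothetical sketch; the repo's real entrypoint is `scripts/run_agent.py`, and it assumes `build_graph` is importable as shown above):

```python
import asyncio

from agent.graph_research_agent import build_graph


async def main():
    graph = await build_graph()
    # MessagesState coerces plain dicts into LangChain message objects.
    result = await graph.ainvoke(
        {"messages": [{"role": "user", "content": "What is MCP?"}]}
    )
    print(result["messages"][-1].content)


asyncio.run(main())
```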
## 5. The RAG MCP server in detail
The RAG server lives in `mcp_servers/rag_server.py` and exposes two tools:

- `index_corpus(reset: bool = False)`
- `search_corpus(query: str, top_k: int = 5)`
The corpus is the set of `.md` / `.txt` files under `mcp_servers/data/`.
The core pattern:
```python
rag_index = RAGIndex()


@mcp.tool()
def index_corpus(reset: bool = False) -> str:
    count = rag_index.build_index(reset=reset)
    return f"Indexed {count} documents from {DATA_DIR}."


@mcp.tool()
def search_corpus(query: str, top_k: int = 5) -> str:
    results = rag_index.search(query, top_k=top_k)
    # Format into a compact text snippet for the LLM
    # (exact formatting depends on what RAGIndex.search returns).
    return "\n\n".join(str(r) for r in results)
```
You can replace the implementation of `RAGIndex` with:

- FAISS, Elasticsearch, or your internal vector store.
- Your own embedding service instead of `sentence-transformers`.

The LangGraph wiring does not change, as the sketch below illustrates.
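For instance, a FAISS-backed index with the same `build_index` / `search` surface might look like this. This is illustrative only: the method names mirror the tool code above, but the class body is an assumption, not the repo's actual implementation.

```python
# Hypothetical drop-in replacement for RAGIndex, backed by FAISS and
# sentence-transformers.
from pathlib import Path

import faiss
from sentence_transformers import SentenceTransformer

DATA_DIR = Path(__file__).parent / "data"


class FaissRAGIndex:
    def __init__(self, model_name: str = "all-MiniLM-L6-v2"):
        self.model = SentenceTransformer(model_name)
        self.index = None
        self.chunks: list[str] = []

    def build_index(self, reset: bool = False) -> int:
        if self.index is not None and not reset:
            return len(self.chunks)
        paths = sorted(DATA_DIR.glob("*.md")) + sorted(DATA_DIR.glob("*.txt"))
        self.chunks = [p.read_text(encoding="utf-8") for p in paths]
        embeddings = self.model.encode(self.chunks, normalize_embeddings=True)
        # Normalized embeddings + inner product = cosine similarity.
        self.index = faiss.IndexFlatIP(embeddings.shape[1])
        self.index.add(embeddings)
        return len(self.chunks)

    def search(self, query: str, top_k: int = 5) -> list[str]:
        query_vec = self.model.encode([query], normalize_embeddings=True)
        _, ids = self.index.search(query_vec, top_k)
        return [self.chunks[i] for i in ids[0] if i != -1]
```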
## 6. Extending this repo
Some ideas to push this toward production:
- **Authentication and RBAC for MCP servers**
  - Attach auth headers / tokens for HTTP MCP servers (see the sketch after this list).
  - Enforce method-level permissions (e.g., who can call `index_corpus`).
- **Per-user conversation state**
  - Add a `user_id` to the state.
  - Persist conversation history in a database and reload it per session.
- **Structured tool outputs**
  - Return JSON from tools and parse it in the LLM prompt.
  - Attach citations and document IDs explicitly in answers.
- **Additional MCP servers**
  - Jira, GitHub, Databricks, internal REST APIs.
  - All accessed via the same `MultiServerMCPClient` pattern.
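As a starting point for the first item, HTTP connections in `langchain-mcp-adapters` can carry extra request headers. The `headers` field below is assumed from the adapter's HTTP connection options, so verify it against the version you have installed:

```python
# Sketch: passing a bearer token to the HTTP research server.
from langchain_mcp_adapters.client import MultiServerMCPClient

client = MultiServerMCPClient(
    {
        "research": {
            "url": "http://localhost:8000/mcp",
            "transport": "streamable_http",
            # Assumed field name; check your langchain-mcp-adapters version.
            "headers": {"Authorization": "Bearer <token>"},
        },
    }
)
```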
## 7. Troubleshooting
- **`ImportError: sentence-transformers`**
  - Install it: `pip install sentence-transformers`
  - Or swap out the embedding logic in `rag_server.py` for your own.
- **Cannot connect to the research server**
  - Make sure `python mcp_servers/research_server.py` is running.
  - Check that port `8000` is free.
- **OpenAI authentication errors**
  - Confirm `OPENAI_API_KEY` is present in your environment.
  - Try a trivial LangChain OpenAI call in a REPL to verify (see the snippet after this list).
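For example, a minimal sanity check using the same `init_chat_model` helper the agent uses:

```python
# Run in a REPL to verify the key works before debugging the agent itself.
from langchain.chat_models import init_chat_model

model = init_chat_model("openai:gpt-4.1")
print(model.invoke("Say hi").content)
```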
## 8. Summary
This repository is a template for:
- Building serious, multi-tool agents with LangGraph.
- Surfacing those tools via MCP (stdio + HTTP).
- Grounding answers in RAG over your own docs.
- Keeping the orchestration clean: `model.bind_tools` + `ToolNode` + `tools_condition`.
Use it as a starting point to plug in your own:
- Enterprise RAG stack
- Ticketing systems
- Databricks jobs
- Any other internal services that you want your agents to reason over.