MCP server by Sumithraju
ENA REST MCP - Model Context Protocol Server
Expose a carefully selected subset of European Nucleotide Archive (ENA) REST services through the Model Context Protocol (MCP), enabling AI agents to interact with ENA programmatically in a safe, structured, and reproducible way.
Overview
The European Nucleotide Archive (ENA) provides a rich set of REST APIs that allow users to query genomic metadata, sequence records, and submission information. This project bridges ENA services with modern AI agents through MCP.
Key Features
- Flexible Data Query Interface: Support for natural language, structured queries, and API endpoint filtering
- Multi-Format Response Support: CSV, Excel, JSON outputs with downloadable reports
- Downloadable Summary & Reports: Concise summaries with analytics and links to raw datasets
- Internal Authentication: Secure handling of authentication for protected ENA services
- Performance Optimized: Targets sub-2-second responses for typical queries (see Performance Targets)
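The multi-format output feature can be illustrated with a minimal sketch that serializes the same records as JSON or CSV using only the standard library. The function name `render_records` and the sample records are hypothetical and not taken from the project; Excel output would need a third-party library and is left out here.

```python
import csv
import io
import json


def render_records(records, fmt="json"):
    """Serialize a list of flat dicts as JSON or CSV text (hypothetical helper)."""
    if fmt == "json":
        return json.dumps(records, indent=2)
    if fmt == "csv":
        # Use the union of keys, in first-seen order, as the CSV header.
        fieldnames = []
        for record in records:
            for key in record:
                if key not in fieldnames:
                    fieldnames.append(key)
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
        writer.writeheader()
        writer.writerows(records)
        return buffer.getvalue()
    raise ValueError(f"unsupported format: {fmt}")


# Illustrative placeholder records only; these are not real ENA accessions.
sample_records = [
    {"acc": "EXAMPLE0001", "name": "placeholder sample A"},
    {"acc": "EXAMPLE0002", "name": "placeholder sample B"},
]
csv_text = render_records(sample_records, "csv")
json_text = render_records(sample_records, "json")
```

Keeping serialization in one function means each tool can return plain records and leave the choice of output format to the caller.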
Project Objectives
- Monolith MCP Server: Single MCP server exposing REST APIs from multiple ENA services
- Data Access: Users can request data in simple formats or plain text with clear summaries
- Full Stack Delivery: Backend MCP service with proper test cases and optional UI
Services Integrated
Core Services
- Webin-REST: ENA core service for metadata storage and access (requires authentication)
- Webin-Report: Service for reporting queries from ENA databases (requires authentication)
- EBI Search Service: BM25 text-based search service (https://www.ebi.ac.uk/ebisearch/ws/rest/sra-sample)
- ENA Browser: Public access to genomic metadata
Project Structure
ENA-MCP/
├── src/
│ ├── mcp_server/ # MCP server implementation
│ ├── ena_api/ # ENA API client and wrappers
│ ├── utils/ # Utility functions
│ └── __init__.py
├── tests/ # Test suite
├── docs/ # Documentation
├── config/ # Configuration files
├── requirements.txt # Python dependencies
├── setup.py # Package setup
├── .gitignore # Git ignore rules
├── .env.example # Environment variables template
└── README.md # This file
Installation
Prerequisites
- Python 3.8+
- pip or conda
Setup
- Clone the repository:
git clone https://github.com/Sumithraju/ena-mcp.git
cd ena-mcp
- Create a virtual environment:
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
- Configure environment variables:
cp .env.example .env
# Edit .env with your ENA credentials (if needed)
Usage
Start the MCP Server
python src/mcp_server/main.py
Example Queries
Natural Language Query:
"Show widely used Mpox samples"
"Give me latest Covid related submission details"
Structured Query:
{
"query_type": "sample_search",
"keywords": ["Mpox"],
"fields": ["acc", "description", "name"],
"format": "json",
"limit": 100
}
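As a rough sketch of how a structured query like the one above might be translated into an EBI Search request, the snippet below builds a URL without sending it. The base URL and the `sra-sample` domain come from the Services Integrated list; the `query_type`-to-domain mapping, the function name, and joining keywords with `AND` are assumptions for illustration, and the parameter names should be checked against the EBI Search documentation.

```python
from urllib.parse import urlencode

# Base URL from the Services Integrated list; the domain is its final path segment.
EBI_SEARCH_BASE = "https://www.ebi.ac.uk/ebisearch/ws/rest"

# Hypothetical mapping from this project's query_type values to EBI Search domains.
DOMAIN_BY_QUERY_TYPE = {"sample_search": "sra-sample"}


def build_search_url(structured_query):
    """Turn a structured query dict into a search URL (no network call is made)."""
    domain = DOMAIN_BY_QUERY_TYPE[structured_query["query_type"]]
    params = {
        # Assumption: multiple keywords are combined with AND.
        "query": " AND ".join(structured_query["keywords"]),
        "fields": ",".join(structured_query.get("fields", [])),
        "format": structured_query.get("format", "json"),
        "size": structured_query.get("limit", 100),
    }
    return f"{EBI_SEARCH_BASE}/{domain}?{urlencode(params)}"


example_url = build_search_url(
    {
        "query_type": "sample_search",
        "keywords": ["Mpox"],
        "fields": ["acc", "description", "name"],
        "format": "json",
        "limit": 100,
    }
)
```

Separating URL construction from the HTTP call makes the translation step easy to unit test without network access.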
API Endpoints
The MCP server exposes the following main tools:
- search_samples: Search ENA sample records
- search_studies: Query study metadata
- search_runs: Look up run/sequence records
- get_submission_details: Retrieve submission information
- download_data: Generate downloadable reports
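To show the idea of exposing named tools, here is a plain-Python sketch of name-based dispatch. It does not use the MCP Python SDK or FastMCP, and it is not the project's implementation: the `tool` decorator, `call_tool`, the parameter names, and the placeholder function bodies are all hypothetical.

```python
TOOLS = {}


def tool(func):
    """Register a function under its own name so it can be looked up later."""
    TOOLS[func.__name__] = func
    return func


@tool
def search_samples(keywords, limit=100):
    # Placeholder body: a real tool would call the ENA API client here.
    return {"tool": "search_samples", "keywords": keywords, "limit": limit}


@tool
def search_studies(keywords, limit=100):
    # Placeholder body, same shape as search_samples.
    return {"tool": "search_studies", "keywords": keywords, "limit": limit}


def call_tool(name, arguments):
    """Dispatch a request to the registered tool, rejecting unknown names."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**arguments)


result = call_tool("search_samples", {"keywords": ["Mpox"], "limit": 10})
```

An MCP framework performs this registration and dispatch itself; the sketch only illustrates why each tool needs a stable name and a well-defined argument set.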
Configuration
Configuration is managed through:
- .env file for credentials
- config/ directory for server settings
- Command-line arguments for runtime options
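One possible way to combine these sources is sketched below: built-in defaults are overridden by `.env` values, which are in turn overridden by command-line arguments. This precedence order, the setting names `ENA_USERNAME` and `LOG_LEVEL`, and the `--log-level` flag are assumptions for illustration; the project's actual keys and flags may differ.

```python
import argparse


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_settings(env_text, argv):
    """Merge defaults, .env content, and CLI arguments (later sources win)."""
    # Hypothetical defaults; the real setting names are not given in this README.
    settings = {"ENA_USERNAME": "", "LOG_LEVEL": "INFO"}
    settings.update(parse_env_text(env_text))

    parser = argparse.ArgumentParser()
    parser.add_argument("--log-level", dest="log_level")  # hypothetical flag
    args = parser.parse_args(argv)
    if args.log_level is not None:
        settings["LOG_LEVEL"] = args.log_level
    return settings


# Placeholder credentials for illustration only.
example_env = "# credentials\nENA_USERNAME=example-user\nLOG_LEVEL=DEBUG\n"
settings = load_settings(example_env, ["--log-level", "WARNING"])
```

Passing the `.env` text and argument list in explicitly, rather than reading files and `sys.argv` inside the function, keeps the merge logic testable.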
Testing
Run the test suite:
pytest tests/ -v
Generate coverage report:
pytest tests/ --cov=src --cov-report=html
Documentation
- API Reference: See docs/API.md
- Architecture: See docs/ARCHITECTURE.md
- ENA Programmatic Access: https://ena-docs.readthedocs.io/en/latest/retrieval/programmatic-access.html
Development
Contributing
- Create a feature branch:
git checkout -b feature/your-feature
- Make your changes and write tests
- Commit with clear messages:
git commit -m "Add feature: description"
- Push to your fork:
git push origin feature/your-feature
- Create a Pull Request
Code Style
This project follows PEP 8 style guidelines. Use black and flake8 for formatting and linting:
black src/
flake8 src/
Performance Targets
- Response time: < 2 seconds for typical queries
- Test coverage: ≥ 95%
- Support for large datasets with downloadable reports
Project Timeline (GSoC 2026)
- Phase 1: Research and setup
- Phase 2: Core MCP server implementation
- Phase 3: ENA API integration
- Phase 4: Testing and optimization
- Phase 5: Documentation and UI (optional)
References
- ENA Browser: https://www.ebi.ac.uk/ena/browser/home
- Webin-Reports Swagger: https://www.ebi.ac.uk/ena/submit/report/swagger-ui/index.html
- ENA Programmatic Access: https://ena-docs.readthedocs.io/en/latest/retrieval/programmatic-access.html
- FastMCP: https://gofastmcp.com/getting-started/welcome
- MCP Python SDK: https://github.com/modelcontextprotocol/python-sdk
- GSoC 2026 Timeline: https://developers.google.com/open-source/gsoc/timeline
License
This project is part of EMBL-EBI's Google Summer of Code 2026 initiative.
Author
Sumithra Raju - GSoC 2026 Contributor at EMBL-EBI
Support
For issues, questions, or contributions, please open an issue on GitHub or contact the development team.
Last Updated: March 16, 2026