Train gradient boosting models quickly and export portable artifacts for smooth deployment. Run fast predictions from newly trained models or existing artifacts. Upload datasets, browse available backends, and access a practical guide with best practices and troubleshooting.
⚡ WarpGBM MCP Service
GPU-accelerated gradient boosting as a cloud MCP service
Train on A10G GPUs • Get artifact_id for <100ms cached predictions • Download portable artifacts
🎯 What is This?
Outsource your GBDT workload to the world's fastest GPU implementation.
WarpGBM MCP is a stateless cloud service that gives AI agents instant access to GPU-accelerated gradient boosting. Built on WarpGBM (91+ ⭐), this service handles training on NVIDIA A10G GPUs while you receive portable model artifacts and benefit from smart 5-minute caching.
🏗️ How It Works (The Smart Cache Workflow)
graph LR
A[Train on GPU] --> B[Get artifact_id + model]
B --> C[5min Cache]
C --> D[<100ms Predictions]
B --> E[Download Artifact]
E --> F[Use Anywhere]
- Train: POST your data → Train on A10G GPU → Get artifact_id + portable artifact
- Fast Path: Use artifact_id → Sub-100ms cached predictions (5min TTL)
- Slow Path: Use model_artifact_joblib → Download and use anywhere
Architecture: 🔒 Stateless • 🚀 No model storage • 💾 You own your artifacts
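A client can combine the fast and slow paths by tracking how old its artifact_id is. The following is a minimal sketch, not the service's own client. It assumes the 5-minute TTL starts roughly when the training response arrives, and that /predict_from_artifact accepts the serialized model under a `model_artifact_joblib` request field (the endpoint table only says "artifact_id or model", so the field name is an assumption). `TrainedModel` and `build_predict_payload` are hypothetical helper names.

```python
import time
from dataclasses import dataclass, field

CACHE_TTL_SECONDS = 5 * 60  # the service caches artifacts for 5 minutes


@dataclass
class TrainedModel:
    """Holds the two fields returned by /train (hypothetical helper)."""
    artifact_id: str
    model_artifact_joblib: str
    # Assumption: the TTL clock starts roughly when the response is received.
    received_at: float = field(default_factory=time.monotonic)

    def cache_is_fresh(self, now=None):
        """True while the artifact_id should still be in the server cache."""
        now = time.monotonic() if now is None else now
        return (now - self.received_at) < CACHE_TTL_SECONDS


def build_predict_payload(model, X, now=None):
    """Use the fast path while the cache is fresh, else send the artifact."""
    if model.cache_is_fresh(now):
        return {"artifact_id": model.artifact_id, "X": X}
    # Assumed request field name for the slow path; check the API docs.
    return {"model_artifact_joblib": model.model_artifact_joblib, "X": X}
```

Keeping the portable artifact next to the artifact_id means an expired cache costs one slower request rather than a retrain.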
⚡ Quick Start
For AI Agents (MCP)
Add to your MCP settings (e.g., .cursor/mcp.json):
{
"mcpServers": {
"warpgbm": {
"url": "https://warpgbm.ai/mcp/sse"
}
}
}
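If the settings file already lists other MCP servers, the entry can be merged in rather than pasted over. A small sketch, assuming the file is plain JSON with a top-level `mcpServers` object as shown above; `add_warpgbm_server` is a hypothetical helper, not part of this repo.

```python
import json
from pathlib import Path

WARPGBM_ENTRY = {"url": "https://warpgbm.ai/mcp/sse"}


def add_warpgbm_server(config_path):
    """Add the warpgbm entry to an MCP settings file, keeping other servers."""
    path = Path(config_path)
    # Start from the existing config if there is one.
    config = json.loads(path.read_text()) if path.exists() else {}
    config.setdefault("mcpServers", {})["warpgbm"] = WARPGBM_ENTRY
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

For example, `add_warpgbm_server(".cursor/mcp.json")` run from a project root updates that project's Cursor settings.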
For Developers (REST API)
# 1. Train a model
curl -X POST https://warpgbm.ai/train \
-H "Content-Type: application/json" \
-d '{
"X": [[5.1,3.5,1.4,0.2], [6.7,3.1,4.4,1.4], ...],
"y": [0, 1, 2, ...],
"model_type": "warpgbm",
"objective": "multiclass"
}'
# Response includes artifact_id for fast predictions
# {"artifact_id": "abc-123", "model_artifact_joblib": "H4sIA..."}
# 2. Make fast predictions (cached, <100ms)
curl -X POST https://warpgbm.ai/predict_from_artifact \
-H "Content-Type: application/json" \
-d '{
"artifact_id": "abc-123",
"X": [[5.0,3.4,1.5,0.2]]
}'
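The same two calls can be made from Python with only the standard library. This is a sketch under stated assumptions: the endpoints, payload fields, and `artifact_id` response key come from the curl examples above, while `build_request`, `post_json`, and `train_then_predict` are hypothetical helper names, and the 60-second timeout is an arbitrary choice.

```python
import json
import urllib.request

BASE_URL = "https://warpgbm.ai"


def build_request(endpoint, payload):
    """Build a JSON POST request for a service endpoint such as "/train"."""
    return urllib.request.Request(
        BASE_URL + endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(endpoint, payload, timeout=60):
    """Send the request and decode the JSON response body."""
    request = build_request(endpoint, payload)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def train_then_predict(X_train, y_train, X_new):
    """Train once, then predict through the cached artifact_id."""
    trained = post_json("/train", {
        "X": X_train,
        "y": y_train,
        "model_type": "warpgbm",
        "objective": "multiclass",
    })
    return post_json("/predict_from_artifact", {
        "artifact_id": trained["artifact_id"],
        "X": X_new,
    })
```

Only `train_then_predict` touches the network; `build_request` can be inspected offline to check the payload before sending it.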
🚀 Key Features
| Feature | Description |
|---------|-------------|
| 🎯 Multi-Model | WarpGBM (GPU) + LightGBM (CPU) |
| ⚡ Smart Caching | artifact_id → 5min cache → <100ms inference |
| 📦 Portable Artifacts | Download joblib models, use anywhere |
| 🤖 MCP Native | Direct tool integration for AI agents |
| 💰 X402 Payments | Optional micropayments (Base network) |
| 🔒 Stateless | No data storage, you own your models |
| 🌐 Production Ready | Deployed on Modal with custom domain |
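To use a portable artifact locally, the `model_artifact_joblib` string has to be turned back into a model object. The sketch below assumes the string is base64-encoded, gzip-compressed joblib bytes. The `H4sI` prefix in the sample responses is what gzip data looks like in base64, but the exact encoding is an assumption here, so check the Agent Guide before relying on it. `artifact_to_bytes` and `load_model` are hypothetical helper names.

```python
import base64
import gzip
import io


def artifact_to_bytes(model_artifact_joblib):
    """Decode the base64 string and gunzip it into raw joblib bytes."""
    return gzip.decompress(base64.b64decode(model_artifact_joblib))


def load_model(model_artifact_joblib):
    """Rebuild the model object; needs joblib plus the model's own library."""
    import joblib  # third-party; imported lazily so decoding works without it

    return joblib.load(io.BytesIO(artifact_to_bytes(model_artifact_joblib)))
```

joblib is built on pickle, so only load artifacts that came from a training request you made yourself.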
🐍 Python Package vs MCP Service
This repo is the MCP service wrapper. For production ML workflows, consider using the WarpGBM Python package directly:
| Feature | MCP Service (This Repo) | Python Package |
|---------|------------------------|----------------|
| Installation | None needed | pip install git+https://... |
| GPU | Cloud (pay-per-use) | Your GPU (free) |
| Control | REST API parameters | Full Python API |
| Features | Train, predict, upload | + Cross-validation, callbacks, feature importance |
| Best For | Quick experiments, demos | Production pipelines, research |
| Cost | $0.01 per training | Free (your hardware) |
Use this MCP service for: Quick tests, prototyping, agents without local GPU
Use Python package for: Production ML, research, cost savings, full control
📡 Available Endpoints
Core Endpoints
| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | /models | List available model backends |
| POST | /train | Train model, get artifact_id + model |
| POST | /predict_from_artifact | Fast predictions (artifact_id or model) |
| POST | /predict_proba_from_artifact | Probability predictions |
| POST | /upload_data | Upload CSV/Parquet for training |
| POST | /feedback | Submit feedback to improve service |
| GET | /healthz | Health check with GPU status |
MCP Integration
| Method | Endpoint | Description |
|--------|----------|-------------|
| SSE | /mcp/sse | MCP Server-Sent Events endpoint |
| GET | /.well-known/mcp.json | MCP capability manifest |
| GET | /.well-known/x402 | X402 pricing manifest |
💡 Complete Example: Iris Dataset
# 1. Train WarpGBM on Iris (at least 60 samples recommended for proper binning)
curl -X POST https://warpgbm.ai/train \
-H "Content-Type: application/json" \
-d '{
"X": [[5.1,3.5,1.4,0.2], [4.9,3,1.4,0.2], [4.7,3.2,1.3,0.2], [4.6,3.1,1.5,0.2], [5,3.6,1.4,0.2],
[7,3.2,4.7,1.4], [6.4,3.2,4.5,1.5], [6.9,3.1,4.9,1.5], [5.5,2.3,4,1.3], [6.5,2.8,4.6,1.5],
[6.3,3.3,6,2.5], [5.8,2.7,5.1,1.9], [7.1,3,5.9,2.1], [6.3,2.9,5.6,1.8], [6.5,3,5.8,2.2],
[7.6,3,6.6,2.1], [4.9,2.5,4.5,1.7], [7.3,2.9,6.3,1.8], [6.7,2.5,5.8,1.8], [7.2,3.6,6.1,2.5],
[5.1,3.5,1.4,0.2], [4.9,3,1.4,0.2], [4.7,3.2,1.3,0.2], [4.6,3.1,1.5,0.2], [5,3.6,1.4,0.2],
[7,3.2,4.7,1.4], [6.4,3.2,4.5,1.5], [6.9,3.1,4.9,1.5], [5.5,2.3,4,1.3], [6.5,2.8,4.6,1.5],
[6.3,3.3,6,2.5], [5.8,2.7,5.1,1.9], [7.1,3,5.9,2.1], [6.3,2.9,5.6,1.8], [6.5,3,5.8,2.2],
[7.6,3,6.6,2.1], [4.9,2.5,4.5,1.7], [7.3,2.9,6.3,1.8], [6.7,2.5,5.8,1.8], [7.2,3.6,6.1,2.5],
[5.1,3.5,1.4,0.2], [4.9,3,1.4,0.2], [4.7,3.2,1.3,0.2], [4.6,3.1,1.5,0.2], [5,3.6,1.4,0.2],
[7,3.2,4.7,1.4], [6.4,3.2,4.5,1.5], [6.9,3.1,4.9,1.5], [5.5,2.3,4,1.3], [6.5,2.8,4.6,1.5],
[6.3,3.3,6,2.5], [5.8,2.7,5.1,1.9], [7.1,3,5.9,2.1], [6.3,2.9,5.6,1.8], [6.5,3,5.8,2.2],
[7.6,3,6.6,2.1], [4.9,2.5,4.5,1.7], [7.3,2.9,6.3,1.8], [6.7,2.5,5.8,1.8], [7.2,3.6,6.1,2.5]],
"y": [0,0,0,0,0, 1,1,1,1,1, 2,2,2,2,2,2,2,2,2,2,
0,0,0,0,0, 1,1,1,1,1, 2,2,2,2,2,2,2,2,2,2,
0,0,0,0,0, 1,1,1,1,1, 2,2,2,2,2,2,2,2,2,2],
"model_type": "warpgbm",
"objective": "multiclass",
"n_estimators": 100
}'
# Response:
{
"artifact_id": "abc123-def456-ghi789",
"model_artifact_joblib": "H4sIA...",
"training_time_seconds": 0.0
}
# 2. Fast inference with cached artifact_id (<100ms)
curl -X POST https://warpgbm.ai/predict_from_artifact \
-H "Content-Type: application/json" \
-d '{
"artifact_id": "abc123-def456-ghi789",
"X": [[5,3.4,1.5,0.2], [6.7,3.1,4.4,1.4], [7.7,3.8,6.7,2.2]]
}'
# Response: {"predictions": [0, 1, 2], "inference_time_seconds": 0.05}
# Perfect classification! ✨
⚠️ Important: WarpGBM uses quantile binning, which requires 60+ samples for proper training. With fewer samples, the model cannot learn useful decision boundaries.
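Checking the payload before sending it avoids paying for a training run that cannot work. A minimal sketch: the minimum sample counts are the ones listed under Available Models in this README, while `check_training_data` and its messages are hypothetical and not part of the service.

```python
# Minimum sample counts taken from the "Available Models" section.
MIN_SAMPLES = {"warpgbm": 60, "lightgbm": 20}


def check_training_data(X, y, model_type="warpgbm"):
    """Return a list of problems; an empty list means the payload looks OK."""
    problems = []
    if len(X) != len(y):
        problems.append(f"X has {len(X)} rows but y has {len(y)} labels")
    minimum = MIN_SAMPLES.get(model_type)
    if minimum is not None and len(X) < minimum:
        problems.append(f"{model_type} needs at least {minimum} samples, got {len(X)}")
    # Every row should have the same number of features.
    if len({len(row) for row in X}) > 1:
        problems.append("rows have inconsistent feature counts")
    return problems
```

The Iris payload above has 60 rows and 60 labels, so it passes this check with the default `model_type`.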
🏠 Self-Hosting
Local Development
# Clone repo
git clone https://github.com/jefferythewind/mcp-warpgbm.git
cd mcp-warpgbm
# Setup environment
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
# Run locally (GPU optional for dev)
uvicorn local_dev:app --host 0.0.0.0 --port 8000 --reload
# Test
curl http://localhost:8000/healthz
Deploy to Modal (Production)
# Install Modal
pip install modal
# Authenticate
modal token new
# Deploy
modal deploy modal_app.py
# Service will be live at your Modal URL
Deploy to Other Platforms
# Docker (requires GPU)
docker build -t warpgbm-mcp .
docker run --gpus all -p 8000:8000 warpgbm-mcp
# Fly.io, Railway, Render, etc.
# See their respective GPU deployment docs
🧪 Testing
# Install dev dependencies
pip install -r requirements-dev.txt
# Run all tests
./run_tests.sh
# Or use pytest directly
pytest tests/ -v
# Test specific functionality
pytest tests/test_train.py -v
pytest tests/test_integration.py -v
📦 Project Structure
mcp-warpgbm/
├── app/
│ ├── main.py # FastAPI app + routes
│ ├── mcp_sse.py # MCP Server-Sent Events
│ ├── model_registry.py # Model backend registry
│ ├── models.py # Pydantic schemas
│ ├── utils.py # Serialization, caching
│ ├── x402.py # Payment verification
│ └── feedback_storage.py # Feedback persistence
├── .well-known/
│ ├── mcp.json # MCP capability manifest
│ └── x402 # X402 pricing manifest
├── docs/
│ ├── AGENT_GUIDE.md # Comprehensive agent docs
│ ├── MODEL_SUPPORT.md # Model parameter reference
│ └── WARPGBM_PYTHON_GUIDE.md
├── tests/
│ ├── test_train.py
│ ├── test_predict.py
│ ├── test_integration.py
│ └── conftest.py
├── examples/
│ ├── simple_train.py
│ └── compare_models.py
├── modal_app.py # Modal deployment config
├── local_dev.py # Local dev server
├── requirements.txt
└── README.md
💰 Pricing (X402)
Optional micropayments on Base network:
| Endpoint | Price | Description |
|----------|-------|-------------|
| /train | $0.01 | Train model on GPU, get artifacts |
| /predict_from_artifact | $0.001 | Batch predictions |
| /predict_proba_from_artifact | $0.001 | Probability predictions |
| /feedback | Free | Help us improve! |
Note: Payment is optional for demo/testing. See /.well-known/x402 for details.
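The table makes a rough budget easy to work out. The sketch below assumes each price applies per request, which the table suggests but does not state outright. `estimate_cost` is a hypothetical helper, and `Decimal` is used so fractions of a cent add up exactly.

```python
from decimal import Decimal

# Prices in USD, copied from the table above (assumed to be per request).
PRICES_USD = {
    "/train": Decimal("0.01"),
    "/predict_from_artifact": Decimal("0.001"),
    "/predict_proba_from_artifact": Decimal("0.001"),
    "/feedback": Decimal("0"),
}


def estimate_cost(call_counts):
    """Sum price * number of calls for each endpoint in call_counts."""
    return sum(
        (PRICES_USD[endpoint] * count for endpoint, count in call_counts.items()),
        Decimal("0"),
    )
```

For example, one training run followed by 50 prediction requests is `estimate_cost({"/train": 1, "/predict_from_artifact": 50})`, which comes to $0.06.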
🔐 Security & Privacy
✅ Stateless: No training data or models persisted
✅ Sandboxed: Runs in temporary isolated directories
✅ Size Limited: Max 50 MB request payload
✅ No Code Execution: Only structured JSON parameters
✅ Rate Limited: Per-IP throttling to prevent abuse
✅ Read-Only FS: Modal deployment uses immutable filesystem
🌍 Available Models
🚀 WarpGBM (GPU)
- Acceleration: NVIDIA A10G GPUs
- Speed: 13× faster than LightGBM
- Best For: Time-series, financial modeling, temporal data
- Special: Era-aware splitting, invariant learning
- Min Samples: 60+ recommended
⚡ LightGBM (CPU)
- Acceleration: Highly optimized CPU
- Speed: 10-100× faster than sklearn
- Best For: General tabular data, large datasets
- Special: Categorical features, low memory
- Min Samples: 20+
🗺️ Roadmap
- [x] Core training + inference endpoints
- [x] Smart artifact caching (5min TTL)
- [x] MCP Server-Sent Events integration
- [x] X402 payment verification
- [x] Modal deployment with GPU
- [x] Custom domain (warpgbm.ai)
- [x] Smithery marketplace listing
- [ ] ONNX export support
- [ ] Async job queue for large datasets
- [ ] S3/IPFS dataset URL support
- [ ] Python client library (warpgbm-client)
- [ ] Additional model backends (XGBoost, CatBoost)
💬 Feedback & Support
Help us make this service better for AI agents!
Submit feedback about:
- Missing features that would unlock new use cases
- Confusing documentation or error messages
- Performance issues or timeout problems
- Additional model types you'd like to see
# Via API
curl -X POST https://warpgbm.ai/feedback \
-H "Content-Type: application/json" \
-d '{
"feedback_type": "feature_request",
"message": "Add support for XGBoost backend",
"severity": "medium"
}'
Or via:
- GitHub Issues: mcp-warpgbm/issues
- GitHub Discussions: warpgbm/discussions
- Email: support@warpgbm.ai
📚 Learn More
- 🐍 WarpGBM Python Package - The core library (91+ ⭐)
- 🤖 Agent Guide - Complete usage guide for AI agents
- 📖 API Docs - Interactive OpenAPI documentation
- 🔌 Model Context Protocol - MCP specification
- 💰 X402 Specification - Payment protocol for agents
- ☁️ Modal Docs - Serverless GPU platform
📄 License
GPL-3.0 (same as WarpGBM core)
This ensures improvements to the MCP wrapper benefit the community, while allowing commercial use through the cloud service.
🙏 Credits
Built with:
- WarpGBM - GPU-accelerated GBDT library
- Modal - Serverless GPU infrastructure
- FastAPI - Modern Python web framework
- LightGBM - Microsoft's GBDT library
Built with ❤️ for the open agent economy