ML Lab MCP
A comprehensive MCP (Model Context Protocol) server for ML model training, fine-tuning, and experimentation. Transform your AI assistant into a full ML engineering environment.
Features
Unified Credential Management
- Encrypted vault for API keys (Lambda Labs, RunPod, Mistral, OpenAI, Together AI, etc.)
- PBKDF2 key derivation with AES encryption
- Never stores credentials in plaintext
Dataset Management
- Register datasets from local files OR client-provided content (JSONL, CSV, Parquet)
- Upload datasets directly without server filesystem access
- Automatic schema inference and statistics
- Train/val/test splitting
- Template-based transformations
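The train/val/test splitting listed above can be pictured with a small sketch. The function name `split_dataset`, the 80/10/10 default ratios, and the fixed seed are assumptions made for illustration; they are not taken from the ML Lab source.

```python
import random


def split_dataset(records, train=0.8, val=0.1, seed=42):
    """Shuffle a copy of `records` and cut it into train/val/test lists.

    Hypothetical helper: the ratios and seed are illustrative defaults.
    Whatever is left after the train and val cuts becomes the test split.
    """
    if not 0 < train < 1 or not 0 <= val < 1 or train + val >= 1:
        raise ValueError("train and val must be fractions that leave room for a test split")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is reproducible
    n_train = int(len(shuffled) * train)
    n_val = int(len(shuffled) * val)
    return {
        "train": shuffled[:n_train],
        "val": shuffled[n_train:n_train + n_val],
        "test": shuffled[n_train + n_val:],
    }


splits = split_dataset(range(100))
print({name: len(rows) for name, rows in splits.items()})
```

Seeding the shuffle keeps the split stable across runs, which matters when experiments trained on the same dataset are compared later.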
Experiment Tracking
- SQLite-backed experiment storage
- Version control and comparison
- Fork experiments with config modifications
- Full metrics history
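To make "SQLite-backed" and "fork with config modifications" concrete, here is a minimal sketch using only the standard library. The table layout and the helper names (`create_experiment`, `get_config`, `fork_experiment`) are hypothetical and may differ from ML Lab's real schema.

```python
import json
import sqlite3

# In-memory database for the demo; a real store would use a file on disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE experiments (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        name TEXT NOT NULL,
        parent_id INTEGER REFERENCES experiments(id),
        config TEXT NOT NULL
    )
    """
)


def create_experiment(name, config, parent_id=None):
    """Insert an experiment and return its row id (hypothetical helper)."""
    cur = conn.execute(
        "INSERT INTO experiments (name, parent_id, config) VALUES (?, ?, ?)",
        (name, parent_id, json.dumps(config, sort_keys=True)),
    )
    conn.commit()
    return cur.lastrowid


def get_config(experiment_id):
    """Load the stored JSON config for one experiment."""
    row = conn.execute(
        "SELECT config FROM experiments WHERE id = ?", (experiment_id,)
    ).fetchone()
    if row is None:
        raise KeyError(experiment_id)
    return json.loads(row[0])


def fork_experiment(parent_id, name, overrides):
    """Copy the parent's config, apply overrides, and record the lineage."""
    config = {**get_config(parent_id), **overrides}
    return create_experiment(name, config, parent_id=parent_id)


base_id = create_experiment("baseline", {"model": "llama-3.1-8b", "lr": 2e-4, "epochs": 3})
fork_id = fork_experiment(base_id, "lower-lr", {"lr": 1e-4})
print(get_config(fork_id))
```

Storing `parent_id` alongside the merged config keeps the fork's ancestry queryable without changing the parent row.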
Multi-Backend Training
- Local: transformers + peft + trl for local GPU training
- Mistral API: Native fine-tuning for Mistral models
- Together AI: Hosted fine-tuning service
- OpenAI: GPT model fine-tuning
Cloud GPU Provisioning
- Lambda Labs: H100, A100 instances
- RunPod: Spot and on-demand GPUs
- Automatic price comparison across providers
- Smart routing based on cost and availability
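One simple reading of "routing based on cost and availability" is: drop offers with no capacity, then take the cheapest. The sketch below assumes exactly that rule. The `GpuOffer` type, the `cheapest_available` function, and every price are made up for illustration; they are not real quotes and not ML Lab's actual routing logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GpuOffer:
    """Hypothetical record for one provider/GPU combination."""
    provider: str
    gpu: str
    hourly_usd: float
    available: int


def cheapest_available(offers, gpu=None) -> Optional[GpuOffer]:
    """Return the lowest-priced offer that has capacity, or None if nothing fits."""
    candidates = [
        offer for offer in offers
        if offer.available > 0 and (gpu is None or offer.gpu == gpu)
    ]
    return min(candidates, key=lambda offer: offer.hourly_usd, default=None)


# Placeholder numbers, chosen only to exercise the function.
offers = [
    GpuOffer("lambda_labs", "H100", 2.75, available=3),
    GpuOffer("runpod", "H100", 2.40, available=0),
    GpuOffer("runpod", "A100", 1.60, available=2),
]

print(cheapest_available(offers, gpu="H100"))
print(cheapest_available(offers))
```

In this toy data the cheaper RunPod H100 is skipped because none are available, so the H100 request falls through to the Lambda Labs offer.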
Remote VPS Support
- Use any SSH-accessible machine (Hetzner, Hostinger, OVH, home server, university cluster)
- Automatic environment setup
- Dataset sync via rsync
- Training runs in tmux (persistent across disconnects)
- Amortized hourly cost calculation from monthly fees
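The amortized hourly cost is plain arithmetic: divide the monthly fee by the number of hours in a month. The sketch below assumes a 30-day (720-hour) month, which is a guess; it happens to reproduce the $0.28 figure for a $200/mo machine shown in the example workflow. The function name is hypothetical.

```python
def amortized_hourly_cost(monthly_fee_usd, hours_per_month=720.0):
    """Spread a flat monthly fee over every hour of the month.

    720 hours (30 days x 24 h) is an assumption, not a documented constant.
    """
    if monthly_fee_usd < 0 or hours_per_month <= 0:
        raise ValueError("fee must be non-negative and hours must be positive")
    return monthly_fee_usd / hours_per_month


hourly = amortized_hourly_cost(200)
print(f"hourly rate: ${hourly:.2f}")
print(f"4-hour run:  ${hourly * 4:.2f}")
```

Because the machine is paid for whether or not it is used, this figure is best read as an accounting rate for comparison against on-demand prices, not as a marginal cost.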
Cost Estimation
- Pre-training cost estimates across all providers
- Real-time pricing queries
- Time estimates based on model and dataset size
Ollama Integration
- Deploy fine-tuned GGUF models to Ollama
- Pull models from Ollama registry
- Chat/inference testing directly from MCP
- Model management (list, delete, copy)
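Deploying a GGUF file to Ollama generally means writing a Modelfile that points at the file and then running `ollama create`. The sketch below only builds the Modelfile text and the command line; it does not call Ollama. The helper names and the model name `support-bot` are hypothetical, and this is not necessarily how `ollama_deploy` is implemented.

```python
def build_modelfile(gguf_path, system_prompt=None, parameters=None):
    """Render a minimal Ollama Modelfile for a local GGUF file."""
    if system_prompt and '"""' in system_prompt:
        raise ValueError('system prompt must not contain """')
    lines = [f"FROM {gguf_path}"]
    for key, value in (parameters or {}).items():
        lines.append(f"PARAMETER {key} {value}")
    if system_prompt:
        lines.append(f'SYSTEM """{system_prompt}"""')
    return "\n".join(lines) + "\n"


def create_command(model_name, modelfile_path="Modelfile"):
    """Argument list for `ollama create`; pass it to subprocess.run to execute."""
    return ["ollama", "create", model_name, "-f", modelfile_path]


modelfile = build_modelfile(
    "./model.gguf",
    system_prompt="You are a support agent.",
    parameters={"temperature": 0.7},
)
print(modelfile)
print(create_command("support-bot"))
```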
Open WebUI Integration
- Create model presets with system prompts
- Knowledge base management (RAG)
- Chat through Open WebUI (applies configs + knowledge)
- Seamless Ollama ↔ Open WebUI workflow
Installation
pip install ml-lab-mcp
# With training dependencies (quotes keep shells such as zsh from expanding the brackets)
pip install "ml-lab-mcp[training]"
# With cloud provider support
pip install "ml-lab-mcp[cloud]"
# Everything
pip install "ml-lab-mcp[training,cloud,dev]"
Quick Start
1. Initialize and Create Vault
ml-lab init
ml-lab vault create
2. Add Provider Credentials
ml-lab vault unlock
ml-lab vault add --provider lambda_labs --api-key YOUR_KEY
ml-lab vault add --provider mistral --api-key YOUR_KEY
3. Configure with Claude Code / Claude Desktop
Add to your MCP configuration:
{
  "mcpServers": {
    "ml-lab": {
      "command": "ml-lab",
      "args": ["serve"]
    }
  }
}
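If the MCP configuration file already lists other servers, the `ml-lab` entry has to be merged in, not pasted over the file. A minimal sketch of that merge follows. The helper `add_mcp_server` is hypothetical, and the location of the configuration file depends on the client, so the demo writes to a temporary file.

```python
import json
import tempfile
from pathlib import Path


def add_mcp_server(config_path, name, command, args):
    """Add or replace one entry under "mcpServers", keeping the other entries."""
    path = Path(config_path)
    config = json.loads(path.read_text()) if path.exists() else {}
    config.setdefault("mcpServers", {})[name] = {"command": command, "args": list(args)}
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for a real client config; "other-server" is a placeholder.
    demo_path = Path(tmp) / "mcp_config.json"
    demo_path.write_text(
        json.dumps({"mcpServers": {"other": {"command": "other-server", "args": []}}})
    )
    merged = add_mcp_server(demo_path, "ml-lab", "ml-lab", ["serve"])
    print(json.dumps(merged, indent=2))
```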
MCP Tools
Credentials
| Tool | Description |
|------|-------------|
| creds_create_vault | Create encrypted credential vault |
| creds_unlock | Unlock vault with password |
| creds_add | Add provider credentials |
| creds_list | List configured providers |
| creds_test | Verify credentials work (Lambda Labs, GCP, OpenAI supported) |
Datasets
| Tool | Description |
|------|-------------|
| dataset_register | Register a dataset from a local file |
| dataset_register_content | Register a dataset from client-provided content (CSV, JSON, JSONL, Parquet) |
| dataset_list | List all datasets |
| dataset_inspect | View schema and statistics |
| dataset_preview | Preview samples |
| dataset_split | Create train/val/test splits |
| dataset_transform | Apply template transformations |
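To give a feel for what "schema and statistics" could mean for a JSONL dataset, the sketch below counts the value types seen for each field and how often a field is absent. It is an illustration only, with a hypothetical `infer_schema` function; the actual output of `dataset_inspect` may look different.

```python
import io
import json
from collections import Counter, defaultdict


def infer_schema(jsonl_lines):
    """Count observed value types per field across JSONL records.

    Assumes every non-blank line is a JSON object.
    """
    type_counts = defaultdict(Counter)
    rows = 0
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        rows += 1
        for key, value in record.items():
            type_counts[key][type(value).__name__] += 1
    fields = {
        key: {"types": dict(counts), "missing": rows - sum(counts.values())}
        for key, counts in type_counts.items()
    }
    return {"rows": rows, "fields": fields}


sample = io.StringIO(
    '{"prompt": "Hi", "response": "Hello", "rating": 5}\n'
    '{"prompt": "Bye", "response": "Goodbye"}\n'
    '{"prompt": "Thanks", "response": null, "rating": 4.5}\n'
)
schema = infer_schema(sample)
print(json.dumps(schema, indent=2))
```

Mixed types (here `rating` is both `int` and `float`) and nulls are the kind of issue worth catching before a training run is launched.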
Experiments
| Tool | Description |
|------|-------------|
| experiment_create | Create new experiment |
| experiment_list | List experiments |
| experiment_get | Get experiment details |
| experiment_compare | Compare multiple experiments |
| experiment_fork | Fork with modifications |
Training
| Tool | Description |
|------|-------------|
| train_estimate | Estimate cost/time across providers |
| train_launch | Start training run |
| train_status | Check run status |
| train_stop | Stop training |
Infrastructure
| Tool | Description |
|------|-------------|
| infra_list_gpus | List available GPUs with pricing |
| infra_provision | Provision cloud instance |
| infra_terminate | Terminate instance |
Remote VPS
| Tool | Description |
|------|-------------|
| vps_register | Register a VPS (host, user, key, GPU info, monthly cost) |
| vps_list | List all registered VPS machines |
| vps_status | Check VPS status (online, GPU, running jobs) |
| vps_unregister | Remove a VPS from registry |
| vps_setup | Install training dependencies on VPS |
| vps_sync | Sync dataset to VPS |
| vps_run | Run command on VPS |
| vps_logs | Get training logs from VPS |
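The feature list says datasets are synced with rsync and training runs inside tmux so it survives disconnects. The sketch below shows command lines that would do this over SSH. It only builds argument lists and prints them. The function names, host, user, and paths are placeholders, and the commands ML Lab actually issues may differ.

```python
import shlex


def rsync_command(local_path, user, host, remote_path, key_path):
    """Build an rsync-over-SSH invocation that copies a dataset to the VPS."""
    ssh = f"ssh -i {shlex.quote(key_path)}"
    return [
        "rsync", "-az", "--partial",  # archive mode, compression, resumable transfers
        "-e", ssh,
        local_path, f"{user}@{host}:{remote_path}",
    ]


def tmux_command(user, host, key_path, session, training_cmd):
    """Build an ssh invocation that starts training in a detached tmux session."""
    remote = f"tmux new-session -d -s {shlex.quote(session)} {shlex.quote(training_cmd)}"
    return ["ssh", "-i", key_path, f"{user}@{host}", remote]


# Placeholder connection details for the demo.
sync = rsync_command("./support_data.jsonl", "trainer", "gpu-box.example.com",
                     "~/datasets/", "~/.ssh/id_ed25519")
launch = tmux_command("trainer", "gpu-box.example.com", "~/.ssh/id_ed25519",
                      "run-abc123", "python train.py --config config.json")

print(shlex.join(sync))
print(shlex.join(launch))
```

Because the tmux session is detached (`-d`), the SSH connection can close immediately while the job keeps running; reattaching later with `tmux attach -t run-abc123` shows its output.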
Ollama
| Tool | Description |
|------|-------------|
| ollama_status | Check Ollama status (running, version, GPU) |
| ollama_list | List models in Ollama |
| ollama_pull | Pull model from registry |
| ollama_deploy | Deploy GGUF to Ollama |
| ollama_chat | Chat with a model |
| ollama_delete | Delete a model |
Open WebUI
| Tool | Description |
|------|-------------|
| owui_status | Check Open WebUI connection |
| owui_list_models | List model configurations |
| owui_create_model | Create model preset (system prompt, params) |
| owui_delete_model | Delete model configuration |
| owui_list_knowledge | List knowledge bases |
| owui_create_knowledge | Create knowledge base |
| owui_add_knowledge_file | Add file to knowledge base |
| owui_chat | Chat through Open WebUI |
Security
| Tool | Description |
|------|-------------|
| security_audit_log | View recent audit log entries |
| security_audit_summary | Get audit activity summary |
| security_tailscale_status | Check Tailscale VPN connection |
| security_ssh_key_rotate | Rotate SSH key for a VPS |
| creds_expiry_check | Check credential expiry status |
| creds_rotate | Rotate credentials for a provider |
Codex Integration (Executor LLM)
| Tool | Description |
|------|-------------|
| codex_status | Check if Codex CLI is available |
| codex_analyze_error | Have Codex analyze errors and suggest fixes |
| codex_generate_training_script | Generate training script from experiment config |
| codex_fix_code | Have Codex fix issues in training code |
| codex_optimize_config | Optimize training config for quality/speed/memory |
| codex_debug_training | Debug training issues from logs |
| codex_run | Run arbitrary task with Codex |
Architecture: intelligence is kept separate from execution:
- Planner LLM (Claude): Reasoning, architecture, tradeoffs
- Executor LLM (Codex): Precise code edits + CLI work
- World tools (ML Lab): Data, training, infra, deployment
Deep Thinking Analysis (Ollama Reasoning Models)
| Tool | Description |
|------|-------------|
| thinking_analyze | Run deep analysis with reasoning models (DeepSeek R1, QwQ) |
| thinking_schedule | Schedule automated analysis (after training, daily, weekly) |
| thinking_reports | List and retrieve analysis reports |
Analysis Types:
- training: Analyze loss curves, convergence, hyperparameters
- experiment: Compare experiments, suggest improvements
- activity: Review audit logs for patterns/anomalies
- cost: Cost efficiency analysis and recommendations
- dataset: Data quality and training suitability
Schedule Frequencies:
- after_training: Auto-analyze when training completes
- hourly / daily / weekly: Periodic reviews
Reports are stored in ~/.cache/ml-lab/reports/ for historical reference.
Example Workflow
User: Fine-tune Llama 3.1 8B on my customer support data
Claude: I'll help you fine-tune Llama 3.1 8B. Let me first check your data.
[Uses dataset_register to load ./support_data.jsonl]
Found 15,000 samples. Let me estimate the costs:
[Uses train_estimate]
| Provider | Time | Cost | Availability |
|---------------|--------|--------|--------------|
| vps:hetzner-01| ~4h | $0.28* | Idle |
| Local (4090) | ~12h | $0 | Ready |
| Lambda H100 | ~2h | $5.50 | 3 available |
| Together API | ~4h | $12 | Queued |
*hourly rate amortized from $200/mo; a ~4h run works out to roughly $1.11
Which would you prefer?
User: Use Lambda Labs
Claude: [Uses experiment_create, then train_launch with backend=lambda]
Training started! Run ID: abc123
I'll monitor progress and let you know when it completes.
Architecture
src/ml_lab/
├── server.py # MCP server entry point (61 tools)
├── credentials.py # Encrypted credential vault
├── cli.py # Command-line interface
├── backends/
│ ├── base.py # Training backend interface
│ ├── local.py # Local GPU training
│ ├── mistral_api.py # Mistral fine-tuning API
│ ├── together_api.py # Together AI API
│ ├── openai_api.py # OpenAI fine-tuning API
│ └── vertex_api.py # Google Vertex AI (Gemini)
├── cloud/
│ ├── base.py # Cloud provider interface
│ ├── lambda_labs.py # Lambda Labs integration
│ ├── runpod.py # RunPod integration
│ ├── modal_provider.py # Modal integration
│ └── remote_vps.py # Generic SSH VPS support (+ Tailscale)
├── storage/
│ ├── datasets.py # Dataset management
│ └── experiments.py # Experiment tracking
├── inference/
│ ├── ollama.py # Ollama integration
│ ├── openwebui.py # Open WebUI integration
│ └── thinking.py # Deep thinking analysis (DeepSeek R1, QwQ)
├── integrations/
│ └── codex.py # Codex CLI integration (executor LLM)
├── security/
│ └── audit.py # Audit logging
└── evals/
  └── benchmarks.py # Evaluation suite
Security
- Credentials encrypted with Fernet (AES-128-CBC)
- PBKDF2-SHA256 key derivation (480,000 iterations)
- Vault file permissions set to 600 (owner read/write only)
- API keys never logged or transmitted unencrypted
- Audit logging: All sensitive operations logged to ~/.cache/ml-lab/audit.log
- Credential expiry: Automatic tracking with rotation reminders
- Tailscale support: Optional VPN requirement for VPS connections
- SSH key rotation: Automated rotation with rollback on failure
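The key-derivation and file-permission points above can be sketched with the standard library alone. The example derives a 32-byte key with PBKDF2-SHA256 at 480,000 iterations and encodes it in the URL-safe base64 form that Fernet keys use, then creates a file with mode 600. The 16-byte salt, the file name `vault.enc`, and the function name are assumptions; the encryption step itself (for example `cryptography.fernet.Fernet(key).encrypt(...)`) is left out so the sketch has no third-party dependency.

```python
import base64
import hashlib
import os
import stat
import tempfile

ITERATIONS = 480_000  # iteration count stated in the Security section


def derive_fernet_key(password: str, salt: bytes) -> bytes:
    """Derive a Fernet-format key (32 bytes, URL-safe base64) from a password."""
    raw = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS, dklen=32)
    return base64.urlsafe_b64encode(raw)


salt = os.urandom(16)  # assumed salt size; the salt is not secret and is stored with the vault
key = derive_fernet_key("correct horse battery staple", salt)

with tempfile.TemporaryDirectory() as tmp:
    vault_path = os.path.join(tmp, "vault.enc")  # hypothetical file name
    # Create the file with owner-only permissions from the start (POSIX systems).
    fd = os.open(vault_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(salt)
    mode = stat.S_IMODE(os.stat(vault_path).st_mode)

print(len(key), oct(mode))
```

Setting the mode at creation time with `os.open` avoids the brief window in which a file created with default permissions and then `chmod`-ed would be readable by other users.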
Supported Providers
Compute Providers
- Lambda Labs (H100, A100, A10)
- RunPod (H100, A100, RTX 4090)
- Modal (serverless GPU functions)
Fine-Tuning APIs
- Mistral AI (Mistral, Mixtral, Codestral)
- Together AI (Llama, Mistral, Qwen)
- OpenAI (GPT-4o, GPT-3.5)
- Google Vertex AI (Gemini 1.5 Pro, Gemini 1.5 Flash)
Model Hubs
- Hugging Face Hub
- Replicate
- Ollama (local GGUF models)
Contributing
Contributions welcome! Please read CONTRIBUTING.md for guidelines.
License
PolyForm Noncommercial 1.0.0 - free for personal use, contact for commercial licensing.
See LICENSE for details.