Text-to-video animation via Manim + Gemini — CLI, agent mode, and MCP server
manim-mcp
Text-to-video animation powered by manimgl (3Blue1Brown's library) and a multi-agent LLM pipeline. Describe what you want to see, and get a rendered animation back.
Works as a CLI tool, an LLM-powered agent, or an MCP server for integration with AI assistants like Claude.
Examples
| Circle to Square Transform | 3D Rotating Cube |
|---------------------------|------------------|
| manim-mcp gen "Transform a blue circle into a red square" | manim-mcp gen "A 3D cube rotating. Use ThreeDScene." |
Watch 20+ example animations on YouTube
Features
- RAG-powered code generation - Uses 5,300+ indexed documents for high-quality code generation:
  - 3,140 3Blue1Brown scene examples
  - 1,652 manimgl API signatures with exact parameters
  - 101 animation pattern templates (Riemann sums, transforms, physics, etc.)
  - 470 library documentation files
  - 16+ error patterns for common mistakes
- Probe integration - Optional AST-aware semantic code search using Probe:
  - Tree-sitter based code parsing (understands Python structure)
  - Hybrid BM25 + TF-IDF ranking for better keyword matching
  - Complete code block extraction (no truncation)
- Multi-animation videos - Each video uses 2+ animation patterns for professional quality
- Multi-agent pipeline - Concept analysis, scene planning, code generation, and code review
- Self-learning - Stores error patterns and fixes for continuous improvement
- Multi-provider LLM - Supports Google Gemini, Anthropic Claude, and DeepSeek
- Audio narration - Parallel audio generation with automatic sync:
  - Video code generated first (no narration constraint)
  - TTS runs in parallel with video rendering
  - Audio automatically paced to match video duration
- Parameter validation - API signatures prevent invalid method calls
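
The parameter-validation idea can be illustrated with Python's standard `inspect` module. This is a minimal sketch, not the project's ParameterValidator: `make_circle` is a hypothetical stand-in for a constructor whose signature has been indexed, and `check_call` is an illustrative helper name.

```python
import inspect


def make_circle(radius: float = 1.0, color: str = "BLUE", stroke_width: float = 4.0):
    """Hypothetical stand-in for a library constructor with a known signature."""
    return {"radius": radius, "color": color, "stroke_width": stroke_width}


def check_call(func, *args, **kwargs):
    """Return None if the arguments fit func's signature, else the error text."""
    try:
        # bind() applies the same matching rules as a real call, without calling func.
        inspect.signature(func).bind(*args, **kwargs)
    except TypeError as exc:
        return str(exc)
    return None


print(check_call(make_circle, radius=2.0))   # valid keyword
print(check_call(make_circle, radious=2.0))  # misspelled keyword is reported
```

Checking a call against the signature before rendering catches misspelled or invented keyword arguments without having to run the generated scene.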
Quick Start
pip install -e ".[rag]"
Prerequisites
- Python 3.11+
- manimgl installed: pip install manimgl
- A Google Gemini API key set as MANIM_MCP_GEMINI_API_KEY
- Optional: ChromaDB (for RAG), ffmpeg (for audio mixing), LaTeX (for math text), S3/MinIO (for cloud storage)
- Optional: Probe for AST-aware code search (install via cargo install probe-search)
Environment Variables
Copy .env.example to .env and fill in your values:
# LLM Provider
MANIM_MCP_GEMINI_API_KEY=your-gemini-api-key
MANIM_MCP_GEMINI_MODEL=gemini-3-flash-preview # default
# Alternative: Claude
# MANIM_MCP_LLM_PROVIDER=claude
# MANIM_MCP_CLAUDE_API_KEY=your-claude-api-key
# MANIM_MCP_CLAUDE_MODEL=claude-sonnet-4-20250514
# Alternative: DeepSeek
# MANIM_MCP_LLM_PROVIDER=deepseek
# MANIM_MCP_DEEPSEEK_API_KEY=your-deepseek-api-key
# RAG (ChromaDB)
MANIM_MCP_RAG_ENABLED=true
MANIM_MCP_CHROMADB_HOST=localhost
MANIM_MCP_CHROMADB_PORT=8000
# S3 Storage (optional)
MANIM_MCP_S3_ENDPOINT=localhost:9000
MANIM_MCP_S3_ACCESS_KEY=minioadmin
MANIM_MCP_S3_SECRET_KEY=minioadmin
MANIM_MCP_S3_BUCKET=manim-renders
# Probe Search (optional - for AST-aware code search)
# Colon-separated paths to search for scene examples
MANIM_MCP_PROBE_PATHS=/path/to/3b1b-videos:/path/to/manim-examples
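
As a rough illustration of how these variables might be consumed, the sketch below reads a few of them into a dataclass. It is not the project's configuration code: the `Settings` class, the `load_settings` function, and the fallback defaults (copied from the example values above) are all assumptions made for the example.

```python
import os
from dataclasses import dataclass, field


@dataclass
class Settings:
    """Hypothetical container for a subset of the MANIM_MCP_* variables."""
    gemini_api_key: str
    rag_enabled: bool = True
    chromadb_host: str = "localhost"
    chromadb_port: int = 8000
    probe_paths: list[str] = field(default_factory=list)


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping (defaults to the process environment)."""
    env = os.environ if env is None else env
    raw_paths = env.get("MANIM_MCP_PROBE_PATHS", "")
    return Settings(
        gemini_api_key=env.get("MANIM_MCP_GEMINI_API_KEY", ""),
        # Fallback values below are assumptions taken from the example file.
        rag_enabled=env.get("MANIM_MCP_RAG_ENABLED", "true").lower() == "true",
        chromadb_host=env.get("MANIM_MCP_CHROMADB_HOST", "localhost"),
        chromadb_port=int(env.get("MANIM_MCP_CHROMADB_PORT", "8000")),
        # Colon-separated list; empty entries are dropped.
        probe_paths=[p for p in raw_paths.split(":") if p],
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests.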
Usage
Generate an animation
# Simple mode (default) - direct LLM generation
manim-mcp gen "Transform a blue circle into a red square"
# Advanced mode - multi-agent pipeline with RAG
manim-mcp gen "Visualize the central limit theorem" --mode advanced
# With quality and format options
manim-mcp gen "Animate eigenvectors" --quality high --format mp4
Generation modes:
- --mode simple (default): Direct LLM code generation, faster
- --mode advanced: Multi-agent pipeline (ConceptAnalyzer → ScenePlanner → CodeGenerator → CodeReviewer) with RAG retrieval
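
The advanced mode's stage ordering can be pictured as a chain of functions that pass a shared state along. The sketch below is only a structural illustration under stated assumptions: every stage is a stub with no LLM or RAG call, and `PipelineState`, `run_pipeline`, and the stage function names are hypothetical, chosen to mirror the agent names above.

```python
from dataclasses import dataclass, field


@dataclass
class PipelineState:
    prompt: str
    notes: dict = field(default_factory=dict)
    code: str = ""


def concept_analyzer(state: PipelineState) -> PipelineState:
    # Stub: a real stage would extract domain, complexity, and key concepts.
    state.notes["concepts"] = state.prompt.lower().split()
    return state


def scene_planner(state: PipelineState) -> PipelineState:
    # Stub: a fixed plan stands in for structure, timing, and transitions.
    state.notes["plan"] = ["introduce objects", "animate", "hold final frame"]
    return state


def code_generator(state: PipelineState) -> PipelineState:
    # Stub: returns a fixed manimgl scene instead of generated code.
    state.code = (
        "from manimlib import *\n\n\n"
        "class GeneratedScene(Scene):\n"
        "    def construct(self):\n"
        "        self.play(ShowCreation(Circle()))\n"
    )
    return state


def code_reviewer(state: PipelineState) -> PipelineState:
    # Syntax check only: compile() parses the source without importing or running it.
    compile(state.code, "<generated>", "exec")
    state.notes["reviewed"] = True
    return state


STAGES = [concept_analyzer, scene_planner, code_generator, code_reviewer]


def run_pipeline(prompt: str, stages=STAGES) -> PipelineState:
    state = PipelineState(prompt)
    for stage in stages:
        state = stage(state)
    return state
```

Keeping each stage as a function of the shared state makes it straightforward to skip, reorder, or test stages in isolation.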
Generate with audio narration
manim-mcp gen "Introduction to linear algebra" --audio
manim-mcp gen "Pythagorean theorem proof" --audio --voice Kore
Audio uses a parallel pipeline with automatic sync:
- Manim code is generated first (video-driven)
- Video rendering and TTS generation run in parallel
- Audio is automatically paced to match video duration
- Audio is mixed into the final video
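
The parallel step can be sketched with `asyncio.gather`. This is an illustration under stated assumptions, not the project's implementation: `render_video` and `synthesize_narration` are stubs that return fixed durations, and spreading the leftover time evenly between segments is just one possible pacing strategy.

```python
import asyncio


async def render_video(code: str) -> float:
    """Stub for rendering; returns the video duration in seconds."""
    await asyncio.sleep(0.01)
    return 12.0


async def synthesize_narration(prompt: str) -> list[float]:
    """Stub for script + TTS; returns the duration of each audio segment."""
    await asyncio.sleep(0.01)
    return [3.0, 4.0, 3.0]


def pace_segments(segment_durations: list[float], video_duration: float) -> list[float]:
    """Return start times that spread the segments across the video."""
    spare = max(video_duration - sum(segment_durations), 0.0)
    gap = spare / max(len(segment_durations), 1)
    starts, cursor = [], 0.0
    for duration in segment_durations:
        starts.append(round(cursor, 3))
        cursor += duration + gap
    return starts


async def build(prompt: str, code: str):
    # Rendering and narration do not depend on each other, so run them together.
    video_duration, segments = await asyncio.gather(
        render_video(code), synthesize_narration(prompt)
    )
    return video_duration, pace_segments(segments, video_duration)


if __name__ == "__main__":
    print(asyncio.run(build("Pythagorean theorem proof", "# scene code")))
```

Because the code is generated before either task starts, the narration never constrains the animation; pacing only happens once both durations are known.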
Edit an existing animation
manim-mcp edit <render_id> "Make the vectors red and add axis labels"
List, inspect, delete renders
manim-mcp list --status completed --limit 10
manim-mcp get <render_id>
manim-mcp delete <render_id> --yes
Agent mode
Let the LLM interpret multi-step requests:
manim-mcp prompt "Create a video on eigenvectors, then edit it with better colors"
MCP server
Start the Model Context Protocol server for integration with Claude, Cursor, or other MCP clients:
manim-mcp serve
manim-mcp serve --transport stdio
manim-mcp serve --transport streamable-http
RAG Indexing
Index all knowledge sources for best code generation quality:
# Check current index status
manim-mcp index status
# Index 3b1b video scenes (3,140 scenes)
manim-mcp index 3b1b-videos --path /path/to/3b1b/videos
# Index manimgl API signatures (1,652 signatures)
manim-mcp index api
# Index animation patterns (101 patterns)
manim-mcp index patterns
# Index error patterns (16+ patterns)
manim-mcp index errors
# Index library documentation (470 docs)
manim-mcp index manim-docs
# Clear a collection
manim-mcp index clear patterns --yes
Docker
Run with all dependencies (ChromaDB, MinIO):
export MANIM_MCP_GEMINI_API_KEY=your-api-key
docker compose up
This starts:
- MCP server on port 8000
- ChromaDB on port 8001
- MinIO on ports 9000/9001
Architecture
┌─────────────────────────────────────────────────────────────────────────────┐
│ AUDIO PIPELINE (parallel with video) │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ prompt ──► Code Generation (video-driven) │
│ │ │
│ ┌────────────┴────────────┐ │
│ │ │ PARALLEL │
│ ▼ ▼ │
│ ┌──────────────────┐ ┌──────────────────┐ │
│ │ Render Video │ │ Generate Script │ │
│ │ │ │ + TTS Audio │ │
│ └────────┬─────────┘ └────────┬─────────┘ │
│ │ │ │
│ ▼ ▼ │
│ video.mp4 audio segments │
│ │ │ │
│ └────────────┬────────────┘ │
│ ▼ │
│ Pace audio to video duration │
│ │ │
│ ▼ │
│ Mix Audio + Video ──► S3 upload ──► URL │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────────┐
│ MULTI-AGENT PIPELINE │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ prompt ──► ConceptAnalyzer ──► ScenePlanner ──► CodeGenerator ──► CodeReviewer
│ │ │ │ │ │
│ ▼ ▼ ▼ ▼ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ ChromaDB RAG (5,300+ docs) │ │
│ │ ┌──────────┬──────────┬──────────┬────────┬────────┐ │ │
│ │ │ scenes │ api │ patterns │ docs │ errors │ │ │
│ │ │ (3,140) │ (1,652) │ (101) │ (470) │ (16) │ │ │
│ │ └──────────┴──────────┴──────────┴────────┴────────┘ │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
└──────────────────────────────────┬──────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ RENDER PIPELINE │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ validated code ──► CodeSandbox ──► manimgl (xvfb) ──► S3 upload ──► URL │
│ │ │ │
│ ▼ ▼ │
│ ┌───────────┐ ┌───────────────┐ │
│ │ SQLite │ │ MinIO/S3 │ │
│ │ (tracker) │ │ (storage) │ │
│ └───────────┘ └───────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Components
| Component | Description |
|-----------|-------------|
| ConceptAnalyzer | Extracts domain, complexity, and key concepts from prompts |
| ScenePlanner | Designs animation structure, timing, and transitions |
| CodeGenerator | Generates manimgl code using scenes, API signatures, and animation patterns |
| CodeReviewer | Validates code quality and applies fixes |
| ParameterValidator | Validates method parameters against API signatures |
| GeminiTTSService | Parallel TTS with Gemini voices, generates narration script |
| ChromaDBService | Vector similarity search across 5,300+ indexed documents |
| ProbeSearcher | AST-aware semantic code search using Probe (tree-sitter + BM25) |
| Linter | Pre-generation code validation using ruff |
| SelfCritique | Multi-pass code generation with self-review |
| SchemaGenerator | JSON schema-based structured scene generation |
| TemplateGenerator | Template-first generation (fill-in-the-middle style) |
| CodeSandbox | AST-based security validation (blocks dangerous code) |
| ManimRenderer | Executes manimgl with xvfb for headless rendering |
| S3Storage | Uploads to MinIO/S3 with presigned URLs |
| RenderTracker | Persists job metadata in SQLite |
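
To give a feel for what AST-based security validation means, the sketch below walks a parsed module and flags a few risky constructs. It is a simplified illustration, not the project's CodeSandbox: the blocklists and the `find_violations` name are assumptions, and a static check like this is not a complete sandbox on its own.

```python
import ast

# Illustrative blocklists; the real rule set is not documented here.
BLOCKED_MODULES = {"os", "subprocess", "socket", "shutil"}
BLOCKED_CALLS = {"eval", "exec", "open", "__import__"}


def find_violations(source: str) -> list[str]:
    """Return a description of each blocked import or call found in source."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in BLOCKED_MODULES:
                    violations.append(f"import {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            if (node.module or "").split(".")[0] in BLOCKED_MODULES:
                violations.append(f"from {node.module} import ...")
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            # Only direct calls by name are caught; aliased calls slip through.
            if node.func.id in BLOCKED_CALLS:
                violations.append(f"call to {node.func.id}()")
    return violations


print(find_violations("from manimlib import *\nimport os\nos.system('ls')"))
```

Working on the syntax tree rather than on raw text avoids false positives from comments and string literals, which is the main advantage over a regex filter.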
RAG Collections
| Collection | Documents | Description |
|------------|-----------|-------------|
| manim_scenes | 3,140 | Production 3Blue1Brown scene code |
| manim_api | 1,652 | API signatures with exact parameters |
| animation_patterns | 101 | Reusable animation templates |
| manim_docs | 470 | manimgl library documentation |
| error_patterns | 16+ | Self-learning error/fix patterns |
Self-Learning
The system learns from every error:
- Validation failures - Stored with fixes when LLM corrects them
- Render failures - Stored for future pattern matching
- Successful fixes - Stored as error→fix pairs for RAG retrieval
This creates a feedback loop where the system improves over time.
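
The error→fix idea can be sketched with a small in-memory store. This is a toy model under stated assumptions: the project stores these pairs for RAG retrieval, whereas the sketch swaps in string similarity from the standard library's `difflib`; the `ErrorMemory` class, its methods, and the sample error are illustrative only.

```python
import difflib
from dataclasses import dataclass


@dataclass
class ErrorFix:
    error: str
    fix: str


class ErrorMemory:
    """Toy store of error→fix pairs with nearest-match lookup."""

    def __init__(self):
        self._pairs: list[ErrorFix] = []

    def record(self, error: str, fix: str) -> None:
        self._pairs.append(ErrorFix(error, fix))

    def suggest(self, error: str, min_ratio: float = 0.6):
        """Return the fix of the most similar stored error, or None."""
        best, best_ratio = None, min_ratio
        for pair in self._pairs:
            ratio = difflib.SequenceMatcher(None, error, pair.error).ratio()
            if ratio >= best_ratio:
                best, best_ratio = pair, ratio
        return best.fix if best else None


memory = ErrorMemory()
memory.record(
    "NameError: name 'Create' is not defined",
    "Use ShowCreation instead of Create",
)
print(memory.suggest("NameError: name 'Create' is not defined"))
```

The similarity threshold keeps unrelated errors from pulling in a misleading fix; a vector store would play the same role with embedding distance.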
MCP Tools
When running as an MCP server, these tools are available:
| Tool | Description |
|------|-------------|
| generate_animation | Create an animation from a text prompt |
| edit_animation | Edit an existing animation with instructions |
| list_renders | List past renders with pagination and filtering |
| get_render | Get full details and a fresh download URL |
| delete_render | Permanently delete a render and its files |
| rag_search | Search the RAG database for similar scenes |
| rag_stats | Get collection statistics |
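
MCP clients invoke tools through JSON-RPC 2.0 `tools/call` requests. The sketch below only builds such a message; it does not talk to a server. The argument name `prompt` is an assumption, since the tools' input schemas are not listed here, and `tool_call_request` is a hypothetical helper.

```python
import json
from itertools import count

_request_ids = count(1)


def tool_call_request(tool: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 `tools/call` request for an MCP server."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })


# "prompt" is an assumed argument name, used for illustration only.
print(tool_call_request("generate_animation", {"prompt": "Animate eigenvectors"}))
```

In practice an MCP client library handles this framing and the transport; the message shape is shown only to clarify what a tool invocation carries.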
Recommended Prompts
The system performs best with mathematical and educational topics that have high RAG coverage:
| Topic | Indexed Scenes | Example Prompts |
|-------|----------------|-----------------|
| Linear Algebra | 810+ | "Animate a matrix transformation", "Show eigenvectors during transformation" |
| Geometry | 568+ | "Visual proof of Pythagorean theorem", "Inscribed angle theorem" |
| Probability | 290+ | "Central limit theorem", "Bayes theorem with updating priors" |
| Calculus | 178+ | "Derivative as tangent slope", "Riemann sums converging to integral" |
Development
pip install -e ".[dev,rag]"
pytest
Testing Scripts
# Test all LLM provider combinations (Gemini/Claude × Simple/Advanced × RAG On/Off)
python scripts/test_providers.py
python scripts/test_providers.py --no-audio # Skip audio generation
python scripts/test_providers.py --quick # Only simple mode tests
# Benchmark LLM providers (DeepSeek vs Gemini)
python scripts/benchmark_providers.py
python scripts/benchmark_providers.py --providers gemini,deepseek --categories simple,medium
License
MIT