MCP server by ArtemisAI
Code Execution with MCP - Template Repository
A production-ready template for building AI agents using the Code Execution with MCP pattern. This harness enables AI agents to dynamically discover and execute MCP tools through secure, sandboxed code execution.
Inspired by: This template implements the architectural patterns and design philosophy from Anthropic's Code Execution with MCP engineering blog post and their Skills Repository. We are grateful to Anthropic for openly sharing these patterns.
🌟 Key Features
- Dynamic Tool Discovery - Tools discovered at runtime using list_mcp_tools() and get_mcp_tool_details() (no static files)
- Secure Sandbox Execution - Docker-based isolation with resource limits, read-only filesystem, and network restrictions
- PII Protection - Automatic tokenization/de-tokenization of sensitive data
- Persistent Skills - /skills directory for reusable agent code
- Ephemeral Workspace - /workspace directory for temporary task files
- Multi-Turn Conversations - Support for complex agent workflows
- Extensible Architecture - Easy to customize and extend
💡 Why Code Execution?
The Token Efficiency Problem: Traditional AI agents must describe every computational step in natural language, consuming valuable context window space. Processing 1,000 records might use 50,000 tokens just to describe the transformations.
The Solution: Code execution lets agents write and run code, delegating computation to traditional software while focusing their intelligence on high-level reasoning. The same 1,000-record task uses just ~500 tokens of code.
Key Benefits:
- 📊 Scalability: Handle tasks of any complexity within token limits
- 🔄 Reusability: Save code to /skills for future use
- 🔒 Privacy: PII tokenized before reaching the LLM
- 🎯 Reliability: Deterministic code execution vs. natural language descriptions
📖 Read the full philosophy in docs/PHILOSOPHY.md, which explains the "why" behind this architecture based on Anthropic's research.
🏗️ Architecture
┌─────────────────────────────────────────────────────────────┐
│ User / Application │
└────────────────────┬────────────────────────────────────────┘
│ HTTP Request
▼
┌─────────────────────────────────────────────────────────────┐
│ Agent Orchestrator │
│ ┌──────────────┐ ┌──────────────┐ ┌─────────────────┐ │
│ │ AgentManager │◄─┤ PII Censor │◄─┤ MCP Client │ │
│ └──────┬───────┘ └──────────────┘ └─────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────┐ │
│ │ LLM Provider │ (OpenAI, Anthropic, etc.) │
│ └──────┬───────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────┐ │
│ │ Sandbox Manager (Docker) │ │
│ └──────┬───────────────────────────┘ │
└─────────┼────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ Secure Docker Container │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ Agent Code Execution │ │
│ │ - Runtime API (callMCPTool, fs, utils) │ │
│ │ - Dynamic Tool Discovery │ │
│ │ - /skills (persistent, mounted) │ │
│ │ - /workspace (ephemeral, mounted) │ │
│ └──────────────────────────────────────────────────────┘ │
│ │
│ Security: Non-root user, read-only rootfs, resource limits │
└─────────────────────────────────────────────────────────────┘
│
│ Authenticated API Call
▼
┌─────────────────────────────────────────────────────────────┐
│ MCP Servers │
│ (File System, Databases, APIs, Custom Tools) │
└─────────────────────────────────────────────────────────────┘
🚀 Quick Start
Prerequisites
- Node.js >= 18.0.0
- Docker (for sandbox execution)
- TypeScript knowledge
Installation
# Clone the repository
git clone <your-repo-url>
cd code-execution-with-MCP
# Install dependencies
npm install
# Build the project
npm run build
# Build the Docker sandbox image
npm run build-sandbox
# Create required directories
npm run prepare-workspace
# Start the server
npm start
Development
# Run in development mode with auto-reload
npm run dev
# Type checking only
npm run type-check
# Clean build artifacts
npm run clean
📁 Project Structure
mcp-code-exec-harness/
├── src/
│ ├── agent_orchestrator/ # Main agent logic
│ │ ├── AgentManager.ts # Agent execution loop
│ │ └── prompt_templates.ts # System prompts
│ │
│ ├── sandbox_manager/ # Secure code execution
│ │ ├── SandboxManager.ts # Abstract interface
│ │ └── DockerSandbox.ts # Docker implementation
│ │
│ ├── mcp_client/ # MCP communication
│ │ ├── McpClient.ts # MCP server client
│ │ └── PiiCensor.ts # PII tokenization
│ │
│ ├── agent_runtime/ # Sandbox runtime API
│ │ └── runtime_api.ts # Injected helper functions
│ │
│ ├── tools_interface/ # Dynamic tool discovery
│ │ └── DynamicToolManager.ts
│ │
│ └── index.ts # Main server entry point
│
├── servers/ # MCP server collection (NEW!)
│ ├── official/ # Official MCP servers
│ ├── archived/ # Archived reference servers
│ ├── community/ # Community-contributed servers
│ ├── README.md # Server collection documentation
│ ├── catalog.json # Structured server index
│ └── QUICKSTART.md # Quick start guide
│
├── skills/ # Persistent agent skills (user-specific)
├── workspace/ # Ephemeral execution workspace
├── Dockerfile.sandbox # Secure sandbox container
├── package.json
├── tsconfig.json
└── README.md
🔧 Configuration
Environment Variables
Create a .env file in the root directory:
# Server Configuration
PORT=3000
NODE_ENV=development
# Sandbox Configuration
SANDBOX_IMAGE=sandbox-image-name
SANDBOX_TIMEOUT_MS=30000
SANDBOX_MEMORY_MB=100
SANDBOX_CPU_QUOTA=50000
# LLM Provider (configure for your provider)
LLM_API_KEY=your-api-key-here
LLM_MODEL=your-model-name
# MCP Servers (customize for your setup)
# Add your MCP server configurations here
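The sandbox variables above could be read with defaults and basic validation along these lines. This is a minimal sketch: the SandboxConfig shape and the loadSandboxConfig and readPositiveInt helpers are hypothetical names, not part of the template, and the fallback values simply mirror the sample .env.

```typescript
// Hypothetical config shape; field names are assumptions for illustration.
interface SandboxConfig {
  image: string;
  timeoutMs: number;
  memoryMb: number;
  cpuQuota: number;
}

type Env = Record<string, string | undefined>;

// Parse a positive integer from an env value, falling back to a default when unset.
function readPositiveInt(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`${key} must be a positive integer, got "${raw}"`);
  }
  return value;
}

// Build the sandbox config from an env-like object (pass process.env in Node.js).
function loadSandboxConfig(env: Env): SandboxConfig {
  return {
    image: env["SANDBOX_IMAGE"] ?? "sandbox-image-name",
    timeoutMs: readPositiveInt(env, "SANDBOX_TIMEOUT_MS", 30000),
    memoryMb: readPositiveInt(env, "SANDBOX_MEMORY_MB", 100),
    cpuQuota: readPositiveInt(env, "SANDBOX_CPU_QUOTA", 50000),
  };
}
```

Failing fast on a malformed value at startup tends to be easier to debug than a container that later starts with an unintended limit.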
Customizing the Agent
- Implement LLM Integration - Edit src/agent_orchestrator/AgentManager.ts:
  async function callLLM(prompt: string, tools: any[]): Promise<LLMResponse> {
    // Add your LLM API call here
    // Examples: OpenAI, Anthropic, Google Gemini, etc.
  }
- Connect MCP Servers - Edit src/mcp_client/McpClient.ts:
  private initializeServers(): void {
    // Add your MCP server connections
    // Use @modelcontextprotocol/sdk
  }
- Customize System Prompts - Edit src/agent_orchestrator/prompt_templates.ts
- Adjust Sandbox Security - Edit src/sandbox_manager/DockerSandbox.ts
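To show how the LLM integration and the sandbox fit together, here is a rough sketch of a multi-turn execution loop. Every name in it (runAgentLoop, the CallLLM and RunInSandbox types, the LLMResponse variants) is hypothetical, and its callLLM signature is simplified compared with the placeholder in AgentManager.ts, so treat it as an illustration of the control flow rather than the template's actual code.

```typescript
// All names below are hypothetical; the template's real AgentManager may differ.
type LLMResponse =
  | { kind: "code"; code: string } // the model wants code executed
  | { kind: "final"; answer: string }; // the model is done

type CallLLM = (transcript: string[]) => Promise<LLMResponse>;
type RunInSandbox = (code: string) => Promise<string>;

// Alternate between the LLM and the sandbox until a final answer or the turn limit.
async function runAgentLoop(
  task: string,
  callLLM: CallLLM,
  runInSandbox: RunInSandbox,
  maxTurns = 5
): Promise<string> {
  const transcript: string[] = [`TASK: ${task}`];
  for (let turn = 0; turn < maxTurns; turn++) {
    const response = await callLLM(transcript);
    if (response.kind === "final") return response.answer;
    let output: string;
    try {
      output = await runInSandbox(response.code);
    } catch (err) {
      // Feed failures back so the model can correct its code on the next turn.
      output = `ERROR: ${err instanceof Error ? err.message : String(err)}`;
    }
    transcript.push(`CODE: ${response.code}`, `RESULT: ${output}`);
  }
  throw new Error(`No final answer after ${maxTurns} turns`);
}
```

Injecting callLLM and runInSandbox as parameters keeps the loop provider-agnostic and lets it be exercised with stubs before a real LLM or Docker is wired in.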
🔐 Security Features
Sandbox Isolation
- Non-root execution - Runs as the sandbox user
- Read-only root filesystem - Prevents system modifications
- Resource limits - CPU and memory constraints
- Network restrictions - Configurable network access
- Capability dropping - Minimal container privileges
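The isolation measures listed above map fairly directly onto standard docker run flags. The sketch below assumes a hypothetical SandboxLimits options object and buildDockerRunArgs helper; the template's DockerSandbox.ts may assemble its container settings differently (for example through the Docker API rather than the CLI).

```typescript
// Hypothetical options object; the template's DockerSandbox may use different names.
interface SandboxLimits {
  image: string;
  memoryMb: number;
  cpuQuota: number; // microseconds of CPU time per scheduling period
  allowNetwork: boolean;
  skillsDir: string; // host path mounted at /skills
  workspaceDir: string; // host path mounted at /workspace
}

// Translate the isolation settings into `docker run` flags.
function buildDockerRunArgs(limits: SandboxLimits): string[] {
  const args = [
    "run", "--rm",
    "--user", "sandbox", // non-root execution
    "--read-only", // read-only root filesystem
    "--tmpfs", "/tmp", // writable scratch space despite --read-only
    "--memory", `${limits.memoryMb}m`,
    "--cpu-quota", String(limits.cpuQuota),
    "--cap-drop", "ALL", // capability dropping
    "--security-opt", "no-new-privileges",
    "-v", `${limits.skillsDir}:/skills`,
    "-v", `${limits.workspaceDir}:/workspace`,
  ];
  if (!limits.allowNetwork) args.push("--network", "none");
  args.push(limits.image);
  return args;
}
```

With Docker's default CFS period of 100000 microseconds, a quota of 50000 corresponds to roughly half a CPU core.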
PII Protection
Automatic detection and tokenization of:
- Email addresses
- Phone numbers
- Social Security Numbers
- Credit card numbers
- IP addresses
- Custom patterns (extensible)
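As an illustration of the tokenize/de-tokenize round trip, here is a minimal sketch. SimplePiiCensor is a hypothetical stand-in for the template's PiiCensor: only the addPattern name comes from this README, the token format is an assumption, and the two built-in patterns are deliberately simplistic compared with what real PII detection needs.

```typescript
interface NamedPattern { name: string; regex: RegExp; }
interface TokenEntry { token: string; value: string; }

// Hypothetical sketch; not the template's PiiCensor implementation.
class SimplePiiCensor {
  private patterns: NamedPattern[] = [
    { name: "email", regex: /[\w.+-]+@[\w-]+(?:\.[\w-]+)+/g },
    { name: "ssn", regex: /\b\d{3}-\d{2}-\d{4}\b/g },
  ];
  private entries: TokenEntry[] = [];

  addPattern(name: string, regex: RegExp): void {
    // Rebuild with the global flag so every occurrence is replaced.
    const flags = "g" + (regex.ignoreCase ? "i" : "") + (regex.multiline ? "m" : "");
    this.patterns.push({ name, regex: new RegExp(regex.source, flags) });
  }

  // Replace each sensitive value with a placeholder before text reaches the LLM.
  tokenize(text: string): string {
    let result = text;
    for (const { name, regex } of this.patterns) {
      result = result.replace(regex, (match) => this.tokenFor(name, match));
    }
    return result;
  }

  // Restore the original values in text coming back from the LLM.
  detokenize(text: string): string {
    let result = text;
    for (const { token, value } of this.entries) {
      result = result.split(token).join(value);
    }
    return result;
  }

  // Reuse the token for a value seen before so references stay consistent.
  private tokenFor(name: string, value: string): string {
    for (const entry of this.entries) {
      if (entry.value === value) return entry.token;
    }
    const token = `[${name.toUpperCase()}_${this.entries.length + 1}]`;
    this.entries.push({ token, value });
    return token;
  }
}
```

Mapping a repeated value to the same token lets the model reason about "the same customer" across a conversation without ever seeing the underlying data.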
Authentication
- Session-specific auth tokens for sandbox ↔ host communication
- Validate tokens in production deployment
📚 Usage Examples
Making a Request
curl -X POST http://localhost:3000/task \
  -H "Content-Type: application/json" \
  -d '{
    "userId": "user123",
    "task": "Analyze the latest sales data and create a summary report"
  }'
Agent Code Example
The agent writes code like this (executed in sandbox):
// 1. Discover available tools
const tools = await list_mcp_tools();
console.log("Available tools:", tools);
// 2. Get tool details
const dbTool = await get_mcp_tool_details("database__query");
console.log("Tool info:", dbTool.description);
// 3. Execute tools
const salesData = await callMCPTool("database__query", {
  query: "SELECT * FROM sales WHERE date > '2024-01-01'"
});
// 4. Process data in code
const summary = salesData.reduce((acc, sale) => {
  acc.total += sale.amount;
  acc.count += 1;
  return acc;
}, { total: 0, count: 0 });
// 5. Save to skills for reuse
await fs.writeFile('/skills/sales_summary.js', `
module.exports = async function summarizeSales(data) {
  return data.reduce((acc, sale) => {
    acc.total += sale.amount;
    acc.count += 1;
    return acc;
  }, { total: 0, count: 0 });
};
`);
// 6. Return results
return { summary, totalSales: summary.total, count: summary.count };
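The two discovery calls in steps 1 and 2 could be backed by a registry along the following lines. This is a hypothetical sketch, not the template's DynamicToolManager: the ToolRegistry class, its method names, and the "<server>__<tool>" naming rule are assumptions inferred from the database__query example above.

```typescript
// Hypothetical registry sketch; the template's DynamicToolManager may differ.
interface ToolDefinition {
  name: string; // qualified as "<server>__<tool>", e.g. "database__query"
  description: string;
  inputSchema: object;
}

class ToolRegistry {
  private tools: ToolDefinition[] = [];

  // Store a tool under a server-qualified name to avoid collisions between servers.
  register(server: string, tool: ToolDefinition): void {
    this.tools.push({ ...tool, name: `${server}__${tool.name}` });
  }

  // Cheap listing: names only, so the agent spends few tokens on discovery.
  listTools(): string[] {
    return this.tools.map((t) => t.name);
  }

  // Full details are fetched only for the tools the agent decides to use.
  getToolDetails(name: string): ToolDefinition {
    for (const tool of this.tools) {
      if (tool.name === name) return tool;
    }
    throw new Error(`Unknown tool: ${name}`);
  }
}
```

Returning names first and schemas on demand is what keeps discovery cheap when many servers are connected.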
🛠️ Extending the Template
MCP Servers Collection
This repository includes a comprehensive collection of 18 MCP servers organized for progressive discovery:
- 📦 7 Official Servers - Filesystem, Git, Memory, Fetch, Everything, Time, Sequential Thinking
- 🗄️ 5 Archived Servers - PostgreSQL, Redis, SQLite, Puppeteer, Sentry
- 🌍 6 Community Servers - MongoDB, GreptimeDB, Unstructured, Semgrep, MCP Installer, PostgreSQL Community Fork
Quick Start:
# Browse the server collection
cd servers/
# Read the documentation
cat README.md
# Check the quick start guide
cat QUICKSTART.md
# View the structured catalog
cat catalog.json
Documentation:
- servers/README.md - Complete server collection documentation
- servers/QUICKSTART.md - Quick start guide with common use cases
- servers/catalog.json - Structured server index for programmatic discovery
- Category-specific READMEs in servers/official/, servers/archived/, and servers/community/
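Programmatic discovery over the catalog might look like the sketch below. The CatalogEntry shape and the findServers helper are hypothetical, and the sample descriptions are illustrative only; check servers/catalog.json for the actual schema before relying on any field name.

```typescript
// Hypothetical entry shape; check servers/catalog.json for the real schema.
interface CatalogEntry {
  name: string;
  category: "official" | "archived" | "community";
  description: string;
}

// Return the names of servers in a category whose name or description mentions the keyword.
function findServers(
  catalog: CatalogEntry[],
  category: CatalogEntry["category"],
  keyword = ""
): string[] {
  const needle = keyword.toLowerCase();
  return catalog
    .filter((e) => e.category === category)
    .filter((e) => (e.name + " " + e.description).toLowerCase().indexOf(needle) !== -1)
    .map((e) => e.name);
}

// Illustrative entries; the descriptions are made up for this example.
const sampleCatalog: CatalogEntry[] = [
  { name: "filesystem", category: "official", description: "Read and write files" },
  { name: "git", category: "official", description: "Inspect Git repositories" },
  { name: "mongodb", category: "community", description: "Query MongoDB collections" },
];
```

An agent could call findServers(sampleCatalog, "community", "mongo") to narrow the collection before loading any server-specific documentation.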
Adding New MCP Servers
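// In src/mcp_client/McpClient.ts (imports go at the top of the file)
import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { StdioClientTransport } from '@modelcontextprotocol/sdk/client/stdio.js';
// Method on the McpClient class
async addServer(config: MCPServerConfig): Promise<void> {
  const client = new Client({
    name: config.name,
    version: '1.0.0'
  }, {
    capabilities: { tools: {} }
  });
  const transport = new StdioClientTransport({
    command: config.command,
    args: config.args,
    env: config.env // forward per-server environment variables to the child process
  });
  await client.connect(transport);
  // Discover and register tools; listTools() resolves to an object with a tools array
  const { tools } = await client.listTools();
  tools.forEach(tool => this.registerTool(tool));
}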
Example configurations for servers from the collection:
// Filesystem server (official)
await this.addServer({
name: 'filesystem',
command: 'npx',
  args: ['-y', '@modelcontextprotocol/server-filesystem', '/workspace', '/skills']
});
// MongoDB server (community)
await this.addServer({
name: 'mongodb',
command: 'npx',
args: ['-y', 'mongodb-mcp-server', '--readOnly'],
env: { MDB_MCP_CONNECTION_STRING: process.env.MONGODB_URI }
});
// Git server (official); distributed as a Python package, so it is launched with uvx rather than npx
await this.addServer({
  name: 'git',
  command: 'uvx',
  args: ['mcp-server-git']
});
Custom PII Patterns
// In your code
const piiCensor = new PiiCensor();
piiCensor.addPattern('custom_id', /\bID-\d{6}\b/g);
Alternative Sandbox Implementations
Extend SandboxManager to create custom execution environments:
- WebAssembly-based sandboxes
- Cloud function execution
- Process-based isolation
🧪 Testing
# Test the sandbox
curl -X POST http://localhost:3000/task \
  -H "Content-Type: application/json" \
  -d '{
    "userId": "test",
    "task": "Write a simple hello world function and save it to skills"
  }'
# Check health
curl http://localhost:3000/health
📖 Documentation & References
Core Documentation
- PHILOSOPHY.md - ⭐ Start here! Explains the "why" behind code execution, token efficiency, and design principles based on Anthropic's research
- QUICK_START.md - Get running in 5 minutes
- ARCHITECTURE.md - Technical deep dive into system components
- SECURITY.md - Security best practices and hardening checklist
- DEPLOYMENT.md - Production deployment guides (Docker, K8s, Cloud)
- API_EXAMPLES.md - Usage examples and patterns
Skills & Examples
- skills/examples/ - Example skills following the Anthropic skills pattern
  - template-skill/ - Template for creating new skills
  - data-processor/ - Token-efficient data transformation example
External References
- Code Execution with MCP - Anthropic's engineering blog post describing the dynamic execution model and philosophy
- Anthropic Skills Repository - Open-source examples of skills that extend agent capabilities
- Equipping Agents for the Real World with Agent Skills - Philosophy behind persistent agent capabilities
- Model Context Protocol Documentation - MCP specification and guides
- Docker Security Best Practices - Container security hardening
🤝 Contributing & Community Collaboration
This is a template repository that represents a new paradigm in AI agent development - one where code execution, security, and persistent capabilities work together seamlessly. We believe this approach has the potential to transform how AI agents are built and deployed at scale.
We're Inviting You to Build This Together
The open-source community is fundamental to advancing this paradigm. We welcome contributions in all forms:
Areas We're Looking For Help
- LLM Integrations - Add support for more providers (Claude, GPT-4, Gemini, Llama, etc.)
- MCP Server Connectors - Build adapters for popular services (databases, APIs, file systems)
- Security Hardening - Audit the sandbox, propose additional security measures
- Performance Optimizations - Container pooling, caching strategies, resource tuning
- Monitoring & Observability - Prometheus metrics, logging, distributed tracing
- Skills Library - Create reusable, domain-specific skills for the community
- Documentation - Tutorials, deployment guides, best practices
- Testing & Examples - Integration tests, real-world use cases, benchmarks
- Alternative Sandboxes - WebAssembly, cloud functions, process isolation implementations
- Frontend UI - Dashboard, skill explorer, task monitoring interface
How to Contribute
- Fork & Customize - Start with this template for your specific use case
- Share Improvements - Submit PRs with general-purpose enhancements
- Build Skills - Create reusable skills and submit to the community skills library
- Report Issues - Help us identify bugs and security concerns
- Discuss Ideas - Join conversations about the architecture and design
- Write Documentation - Help others understand and adopt the pattern
The Vision
We're building toward a future where:
- 🧠 AI agents scale beyond token limitations through code execution
- 🔄 Skills accumulate over time, making agents continuously smarter
- 🔒 Privacy is built-in with automatic PII protection
- 🛡️ Security is layered with multiple defense mechanisms
- 🌐 Tools are discovered dynamically, not statically configured
- 📚 Community-driven with shared skills and best practices
Customization Guide for Your Organization
Customize this template for your specific needs:
- Implement your LLM integration - Choose your preferred provider
- Connect your MCP servers - Wire up your tools and data sources
- Customize security policies - Adjust for your threat model
- Extend PII detection - Add patterns for your domain
- Add monitoring and logging - Integrate with your observability stack
- Build domain-specific skills - Create your organization's capability library
- Share back - Contribute generic improvements to help the community
Community Resources
- Issues & Discussions - Ask questions, propose features, discuss architecture
- Skills Repository - Contribute reusable skills to skills/examples/
- Documentation - Help improve guides and examples
- Partnerships - Collaborate on larger initiatives
Recognition
Contributors will be recognized in:
- Project README
- Release notes
- Community Hall of Fame
- Speaking opportunities at community events
Together, we can build the next generation of AI agent infrastructure. Whether you're an AI researcher, DevOps engineer, security expert, or full-stack developer, there's a place for your contributions. Join us in advancing this paradigm!
📝 License
MIT License - See LICENSE file for details
⚠️ Important Notes
- TODO Items: Search for TODO comments in the code for areas requiring implementation
- Security: Review and harden security settings before production deployment
- LLM Integration: The LLM calling function is a placeholder - implement with your provider
- MCP Servers: Mock implementations are provided - replace with actual MCP connections
- Production Readiness: Additional hardening required for production use (monitoring, error handling, scaling)
Built with the Code Execution with MCP pattern for dynamic, secure AI agent workflows.