MCPHunter

The Self-Evolving MCP Security Engine — the only MCP security tool that attacks itself to get stronger.

With L1-L4 (Groq LLM):                    Without API key (L1-L3 only):
  Seeds:       100% (68/68)                  Seeds:       100% (68/68)
  Real-World:  100% (19/19)                  Real-World:   84% (16/19)
  Red Team:     95% (19/20)                  Red Team:     80% (16/20)
  FP:            0% (0/22)                   FP:            0% (0/22)

16 attack types | 20 mutation strategies | 143 rules | 3 novel discoveries

The Problem

MCP (Model Context Protocol) has 97M+ monthly SDK downloads, 10,000+ servers, and catastrophic security:

50+ vulnerabilities documented at vulnerablemcp.info
30+ CVEs in 60 days
Tool poisoning attacks that succeed against the best LLMs
Supply chain compromise (LiteLLM .pth attack, March 24 2026)

Five existing tools (MCP-Scan, MCP Guard, MCPTrust, Pipelock, MCP Guardian) are all static — they ship fixed rules that never improve.

How MCPHunter Is Different

MCPHunter combines three components in an autonomous adversarial loop:

                    HUNTER (Red Team)
                    +-----------------+
                    | 20 Strategies:  |
                    |  Syntactic:     |
                    |   encoding,     |
                    |   homoglyphs,   |
                    |   fragmentation |--- generates --->  Attack Payloads
                    |  Semantic:      |                          |
                    |   role inject,  |                          |
                    |   social proof, |                          v
                    |   benefit frame |               SHIELD (Firewall)
                    +--------^--------+              +------------------+
                             |                       | L1: Regex (141)  |
                    updates  |                       | L2: Encoding     |
                    strategy |                       | L3: Heuristic    |
                             |                       | L4: LLM Judge    |
                    +--------+--------+              +------------------+
                    |   EVOLUTION     |                       |
                    |  (Karpathy)     |<--- learns from ------+
                    |                 |     evasions
                    | Extract rules,  |
                    | classify novel, |
                    | generate        |
                    | advisories      |
                    +-----------------+

The Evolution Story: 68% to 100%

MCPHunter started with 15 hand-written regex rules detecting 68% of seed attacks. Through autonomous adversarial evolution:

| Phase | Detection | Rules | How | |-------|:---------:|:-----:|-----| | Day 1 baseline | 68% | 15 | Manual regex + encoding detection | | + Heuristic scoring | 74% | 15 | 6-feature statistical analysis | | + LLM judge | 86% | 15 | Groq Llama 3.3 70B for semantic analysis | | + 50-iter overnight | 92% | 45 | Heuristic rule extraction | | + LLM learning sprint | 94% | 50 | Groq-powered rule extraction | | + Red team hardening | 98% | 64 | Adversarial self-testing | | + A11-A16 expansion | 100% | 141 | New attack types + evolved rules |

Key moment: Iteration 11 — HUNTER's LLM evasion mode created 6 attacks that evaded SHIELD. The learner extracted 12 rules in a single iteration, the biggest learning event.

Detection Capabilities: 16 Attack Types

| ID | Type | Description | |----|------|-------------| | A1 | Direct Injection | Plaintext malicious instructions in descriptions | | A2 | Unicode Steganography | Zero-width chars, Cyrillic homoglyphs | | A3 | Base64 Payload | Encoded instructions for decode-and-execute | | A4 | Schema Poisoning | Malicious parameter descriptions and enum values | | A5 | Output Injection | Fake system messages in tool return values | | A6 | Cross-Server Shadowing | Override legitimate tools from other servers | | A7 | Rug Pull | Behavior changes after initial approval | | A8 | Error Message Injection | Recovery instructions in error messages | | A9 | Nested Encoding | Multi-layer encoding chains (base64+URL+HTML) | | A10 | Semantic Camouflage | Attacks disguised as documentation | | A11 | Sampling Exploitation | MCP sampling requests hijacking LLM | | A12 | Preference Manipulation | Trick AI into preferring malicious tools | | A13 | Parasitic Toolchain | Chained tools escalating attacks | | A14 | Supply Chain .pth | Python .pth files for persistent backdoors | | A15 | Indirect Content Injection | Poison in fetched content, not tool definition | | A16 | System Prompt Leakage | Extract host LLM's system instructions |

Final Overnight Run

50-iteration evolution with Groq Llama 3.3 70B, 20 mutation strategies, L1-L4:

470 attacks tested, 21 evasions, 26 rules auto-generated
3 errors (Groq JSON parse) — all auto-restarted
Shield evolved from v0.1.78 to v0.1.130
Key moment: iteration 16 — LLM evasion burst created 5 attacks that evaded SHIELD, learner extracted 10 rules
Post-run: 8 overfitting rules disabled after FP validation

Full report: results/overnight_report.md

Real-World Benchmark: 100% Detection

19 attacks recreated from published security research — all detected:

| Source | Attacks | Detected | |--------|:-------:|:--------:| | Invariant Labs (tool poisoning, shadowing) | 4 | 4/4 | | CyberArk (schema poisoning) | 3 | 3/3 | | Docker GitHub (PR/issue injection) | 2 | 2/2 | | vulnerablemcp.info CVEs | 3 | 3/3 | | MCPTox benchmark (arxiv) | 3 | 3/3 | | Invariant Labs (delayed activation) | 3 | 3/3 | | LiteLLM .pth compromise (March 2026) | 1 | 1/1 | | Total | 19 | 19/19 (100%) |

OWASP LLM Top 10 2025 Coverage

| OWASP Risk | MCPHunter Coverage | Attack Types | Status | |---|---|---|---| | LLM01 Prompt Injection | Direct + indirect detection across all MCP surfaces | A1-A5, A9-A11, A15, A16 | Full | | LLM02 Sensitive Info Disclosure | Exfiltration pattern detection in descriptions and outputs | A1, A5, A8 | Partial | | LLM03 Supply Chain | .pth poisoning, dependency attack patterns, double-base64 | A14 | Covered | | LLM04 Data & Model Poisoning | Tool poisoning, rug pull detection, schema manipulation | A4, A6, A7 | Covered | | LLM05 Improper Output Handling | Output injection + error message injection scanning | A5, A8 | Covered | | LLM06 Excessive Agency | Preference manipulation, parasitic toolchain detection | A12, A13 | Covered | | LLM07 System Prompt Leakage | Prompt extraction pattern detection in all surfaces | A16 | Covered | | LLM08 Vector/Embedding Weaknesses | Out of scope (RAG-specific, not MCP tool-level) | -- | N/A | | LLM09 Misinformation | Out of scope (content-level, not tool-security) | -- | N/A | | LLM10 Unbounded Consumption | Resource theft and compute drain detection | A11 | Partial |

Novel Discovery Engine

MCPHunter's evolution loop classified 16 evasion discoveries against a registry of 28 known attack techniques:

3 genuinely novel (whitespace steganography — binary encoding in trailing whitespace, 0% similarity to any known technique)
11 known technique variants (context blending, benefit framing, instruction rephrasing — new twists on documented techniques)
2 known technique rediscoveries (encoding wrapping, synonym rotation)

Each discovery generates a structured security advisory with attack mechanism, closest known technique, and severity rating. Honest classification — we don't inflate novelty counts.

Quick Start

# Install
git clone https://github.com/your-username/mcphunter.git
cd mcphunter
pip install -r requirements.txt

# Configure LLM (Groq free tier: 14,400 RPD)
echo "GROQ_API_KEY=your-key-here" > .env

# Scan a tool description
python -c "
from mcphunter.shield.pipeline import ShieldPipeline
result = ShieldPipeline().scan('Your MCP tool description here')
print(f'{result.verdict.value} ({result.confidence:.0%}) - {result.explanation}')
"

# Run evolution loop (50 iterations, 5-min sleep)
python scripts/evolve.py -n 50 --sleep 300 --attacks 10

# Run with Groq LLM for deep analysis
python scripts/evolve.py -n 10 --sleep 60 --attacks 10

# Sandbox mode (no network, deterministic)
python scripts/evolve.py -n 50 --sleep 0 --sandbox

# Run benchmarks
python benchmarks/run_benchmark.py
python scripts/diagnose.py

# Launch dashboard
python -m mcphunter.dashboard.app

What We Learned

Heuristic-only learning is insufficient for semantic attacks. The overnight sandbox run (50 iterations, no LLM) produced 142 evasions but extracted only 1 rule.

LLM-enabled learning is 14x more effective. A Groq-powered sprint extracted 14 rules in 22 iterations vs 1 rule in 50 heuristic-only iterations.

The optimal configuration mirrors real security operations: automated monitoring (sandbox mode) + periodic expert analysis (LLM sprints). This is how real SOCs operate.

Comparison with Existing Tools

| Feature | MCPHunter | MCP-Scan | MCP Guard | Pipelock | |---------|:-:|:-:|:-:|:-:| | Self-improving | Yes | No | No | No | | Attack types | 16 | ~5 | ~3 | ~2 | | Mutation strategies | 20 | 0 | 0 | 0 | | LLM-powered analysis | Yes (Groq) | No | No | No | | Novel discovery | Yes (9 advisories) | No | No | No | | Encoding detection | 6 types | Basic | No | No | | False positive rate | 0/22 | N/A | N/A | N/A |

MCPHunter is complementary to these tools. MCP-Scan for fast blocking at connection time, MCPHunter for deep analysis and continuous improvement.

Limitations & Future Work

L4 depends on LLM API availability — falls back gracefully to L1-L3 (80-98% detection)
Regex can be evaded by novel semantic attacks — the red team proved 1/20 still evades all 4 layers
LLM-generated rules can overfit — FP validation catches this but 8 rules were disabled post-overnight
Single-tool scanning — doesn't monitor live MCP traffic or inter-tool interactions
Not tested against all 50+ CVEs — our benchmark covers 19 recreated attacks
Evolution --seed is partially reproducible — mutation selection and strategy rotation are deterministic, but attack IDs use uuid4() for uniqueness
Future: recursive encoding, live MCP proxy, cross-tool correlation, community rule sharing

References

License

MIT

MCP Servers

MCPHunter

The Problem

How MCPHunter Is Different

The Evolution Story: 68% to 100%

Detection Capabilities: 16 Attack Types

Final Overnight Run

Real-World Benchmark: 100% Detection

OWASP LLM Top 10 2025 Coverage

Novel Discovery Engine

Quick Start

What We Learned

Comparison with Existing Tools

Limitations & Future Work

References

License

安装包（如果需要）

Cursor 配置 (mcp.json)

MCPHunter

The Problem

How MCPHunter Is Different

The Evolution Story: 68% to 100%

Detection Capabilities: 16 Attack Types

Final Overnight Run

Real-World Benchmark: 100% Detection

OWASP LLM Top 10 2025 Coverage

Novel Discovery Engine

Quick Start

What We Learned

Comparison with Existing Tools

Limitations & Future Work

References

License

安装包 （如果需要）

Cursor 配置 (mcp.json)

安装包（如果需要）