A protocol-aware gateway for Model Context Protocol (MCP) servers — fan many MCP servers behind one endpoint with agent identity (JWT/OIDC) and structured audit logging. Single Go binary.
Helmsman
A gateway for Model Context Protocol servers.
You run Helmsman in front of your fleet of MCP servers — Slack, GitHub, Postgres, your internal tools — and every agent in your company connects to Helmsman instead of connecting to each server directly. Helmsman is the place where agent identity gets stamped, policy gets enforced, and audit logs get written.
It's a single Go binary, ships with sensible defaults, and is designed to be the smallest possible thing you'd put between your agents and production.
Status: v0.1-track. Landed: multi-upstream routing with tool-catalog merging, JWT/OIDC agent identity (with an HS256 dev mode + token minter), structured audit log, structured logs, and operational endpoints (/healthz, /readyz). Still to come: /metrics, helmsman init/doctor, and the release toolchain. See docs/v0.1-plan.md for the full plan.
Why this exists
MCP solved the protocol problem. It did not solve the operational problem. Today, MCP server management is handled client-side — every IDE, agent, and SDK has its own config file listing servers as either stdio commands or HTTP endpoints. That works for individual developers. It does not work for an organization.
Three gaps that a config file fundamentally cannot fill:
Identity. A stdio MCP server has no idea which agent is calling it; an HTTP server gets whatever bearer token the client decides to send. Helmsman is the place where every call gets stamped with a verifiable agent identity, and every audit log row, metric, policy decision, and rate-limit bucket is keyed on that identity.
Enforceable policy. Policy in client config is a suggestion — the agent
process owns the file. Server-side policy is the only kind that survives a
buggy or jailbroken agent. "Agent X may call github.delete_repo only on
repos matching sandbox-*" has to live somewhere the agent cannot reach.
Audit. Per-client config means logs scattered across every developer laptop and CI runner. Helmsman is the single choke point your security team can query.
For a deeper take on why this is worth building separately from a generic API gateway, see docs/why.md.
Quickstart
# Resolve dependencies (populates go.sum; needed after a fresh clone).
go mod tidy
# Build.
make build
# Copy the example config (two upstreams: filesystem + everything).
cp helmsman.example.yaml helmsman.yaml
# Run.
./bin/helmsman --config helmsman.yaml
In another terminal, initialize and call a tool through the gateway —
note the X-Helmsman-Agent header, which v0.1 uses as a development
identity placeholder until JWT/OIDC validation lands:
curl -s http://127.0.0.1:7474/mcp \
-H 'content-type: application/json' \
-H 'x-helmsman-agent: dev-agent' \
-d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-06-18","capabilities":{},"clientInfo":{"name":"curl","version":"0"}}}' \
| jq '.result.serverInfo'
# Tool catalog merged across all configured upstreams.
curl -s http://127.0.0.1:7474/mcp \
-H 'content-type: application/json' \
-d '{"jsonrpc":"2.0","id":2,"method":"tools/list"}' \
| jq '.result.tools | map(.name)'
# Tool call routed to the right upstream by name.
curl -s http://127.0.0.1:7474/mcp \
-H 'content-type: application/json' \
-H 'x-helmsman-agent: dev-agent' \
-d '{"jsonrpc":"2.0","id":3,"method":"tools/call","params":{"name":"list_directory","arguments":{"path":"/tmp"}}}' \
| jq
Audit rows land on stdout by default — pipe helmsman through jq and
you'll see one event per forwarded request:
{"ts":"2026-...Z","request_id":"...","agent":"dev-agent","upstream":"filesystem","method":"tools/call","tool":"list_directory","arg_hash":"sha256:...","arg_size":42,"latency_ms":7,"result":"ok"}
make smoke runs a full end-to-end check against npx @modelcontextprotocol/server-filesystem and server-everything,
including audit-log verification.
Run with Docker
The image bundles the Helmsman binary on a Node runtime with the two reference MCP servers pre-installed, so it runs out of the box.
# Build the image.
docker build -t helmsman:0.1.0 .
# Run it. Files in ./data become visible to the filesystem upstream.
mkdir -p data && echo "hello from docker" > data/hello.txt
docker run --rm -p 7474:7474 -v "$PWD/data:/data" helmsman:0.1.0
Or with Compose:
docker compose up --build
Then, from another terminal:
curl -s http://localhost:7474/readyz
curl -s http://localhost:7474/mcp \
-H 'content-type: application/json' \
-d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}' | jq '.result.tools | map(.name)'
The baked-in config (deploy/helmsman.docker.yaml)
runs in anonymous mode for easy evaluation. For anything real, mount your
own config with an identity: block:
docker run --rm -p 7474:7474 \
-v "$PWD/helmsman.yaml:/etc/helmsman/helmsman.yaml:ro" \
-v "$PWD/data:/data" \
helmsman:0.1.0
See docs/deployment.md for the container model (why stdio upstreams are co-located), TLS, and Kubernetes notes.
Operational endpoints
GET /healthz: process liveness; always 200 while running.
GET /readyz: readiness; 200 once every upstream has finished its initialize handshake. Returns 503 with the unready upstream names otherwise. Wire this to your load balancer.
POST /mcp: the JSON-RPC endpoint. Each response carries an X-Request-Id header that ties back to log lines and audit rows.
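The readiness contract above can be sketched in a few lines of Go. This is an illustration only, not Helmsman's actual handler: the `readiness` type, its methods, and the JSON body shape are all assumptions made for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sort"
	"sync"
)

// readiness tracks which upstreams have finished their initialize handshake.
// Hypothetical type for illustration; not taken from the Helmsman source.
type readiness struct {
	mu    sync.RWMutex
	ready map[string]bool
}

func newReadiness(names ...string) *readiness {
	r := &readiness{ready: make(map[string]bool, len(names))}
	for _, n := range names {
		r.ready[n] = false
	}
	return r
}

// markReady records that the named upstream completed its handshake.
func (r *readiness) markReady(name string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.ready[name] = true
}

// unready returns the sorted names of upstreams that are not ready yet.
func (r *readiness) unready() []string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	var out []string
	for n, ok := range r.ready {
		if !ok {
			out = append(out, n)
		}
	}
	sort.Strings(out)
	return out
}

// ServeHTTP answers 200 when every upstream is ready and 503 otherwise.
// The response body shape is an assumption.
func (r *readiness) ServeHTTP(w http.ResponseWriter, _ *http.Request) {
	pending := r.unready()
	w.Header().Set("Content-Type", "application/json")
	if len(pending) > 0 {
		w.WriteHeader(http.StatusServiceUnavailable)
	}
	json.NewEncoder(w).Encode(map[string]any{"ready": len(pending) == 0, "unready": pending})
}

// probe calls the handler in-process and returns the HTTP status code.
func probe(h http.Handler) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/readyz", nil))
	return rec.Code
}

func main() {
	r := newReadiness("filesystem", "everything")
	fmt.Println("before handshakes:", probe(r))
	r.markReady("filesystem")
	r.markReady("everything")
	fmt.Println("after handshakes:", probe(r))
}
```

Listing the unready names in the 503 body means a failed load-balancer check also tells an operator which upstream to look at.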
Configuration
listen: ":7474"
log:
level: info # debug | info | warn | error
format: text # text (dev) | json (prod / Kubernetes)
audit:
sinks:
- type: stdout
# - type: file
# path: ./audit.jsonl
log_arguments: false # global default; per-upstream override
buffer_size: 1024
upstreams:
- name: filesystem
transport: stdio
command: npx
args: ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
namespaced: false # native tool names visible
audit_log_arguments: true # full args in audit rows for this trusted upstream
- name: everything
transport: stdio
command: npx
args: ["-y", "@modelcontextprotocol/server-everything"]
# namespaced defaults to true: clients see "everything:<tool>"
Tool name namespacing
Two upstreams might both define a tool called list. Helmsman resolves
this with namespacing: each upstream's tools are exposed as
<upstream>:<tool> by default, so filesystem:list and github:list
coexist cleanly. For a single trusted upstream where you want the
original tool names visible, set namespaced: false on that upstream.
If two non-namespaced upstreams declare the same tool name, Helmsman
refuses to start and asks you to disambiguate.
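A minimal sketch of how such a merged registry could be built, assuming each upstream reports a flat list of tool names. The `upstream` and `route` types and the `buildRegistry` function are hypothetical names for this example, not Helmsman's internals.

```go
package main

import (
	"fmt"
	"sort"
)

// upstream is a hypothetical, trimmed-down view of one configured upstream.
type upstream struct {
	Name       string
	Namespaced bool
	Tools      []string // native tool names from the upstream's tools/list
}

// route records where an exposed tool name should be forwarded.
type route struct {
	Upstream string
	Tool     string // native name to send to the upstream
}

// buildRegistry maps exposed tool names to routes. Namespaced upstreams
// expose "<upstream>:<tool>"; any collision between exposed names is an error.
func buildRegistry(ups []upstream) (map[string]route, error) {
	reg := make(map[string]route)
	for _, u := range ups {
		for _, t := range u.Tools {
			exposed := t
			if u.Namespaced {
				exposed = u.Name + ":" + t
			}
			if prev, dup := reg[exposed]; dup {
				return nil, fmt.Errorf("tool %q is exposed by both %q and %q; namespace one of them",
					exposed, prev.Upstream, u.Name)
			}
			reg[exposed] = route{Upstream: u.Name, Tool: t}
		}
	}
	return reg, nil
}

func main() {
	reg, err := buildRegistry([]upstream{
		{Name: "filesystem", Namespaced: false, Tools: []string{"list_directory"}},
		{Name: "everything", Namespaced: true, Tools: []string{"echo"}},
	})
	if err != nil {
		panic(err)
	}
	names := make([]string, 0, len(reg))
	for n := range reg {
		names = append(names, n)
	}
	sort.Strings(names)
	fmt.Println(names)
}
```

Failing at startup rather than picking a winner keeps routing deterministic: a tools/call can never silently land on a different upstream after a config change.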
Argument logging
By default audit rows include only a sha256 hash and size of each tool
call's arguments — never the full args. Per-upstream audit_log_arguments: true opts that upstream into full-argument logging when the operational
need is worth the sensitivity tradeoff.
Identity
Helmsman authenticates agents with JWT bearer tokens. If no identity:
block is configured it runs in anonymous mode (every request is
recorded as agent=anonymous) and prints a loud startup warning — fine
for kicking the tires locally, never for production.
Production (OIDC). Point Helmsman at your issuer; it discovers the
JWKS, validates RS256/ES256 signatures, and checks issuer, audience, and
expiry on every request. The agent identity comes from a configurable
claim (sub by default) and flows into every audit row.
identity:
issuer: https://accounts.example.com
audience: helmsman
agent_claim: sub
A request then carries the token as a normal bearer header:
curl -s http://127.0.0.1:7474/mcp \
-H 'authorization: Bearer eyJhbGc...' \
-H 'content-type: application/json' \
-d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}'
Missing or invalid tokens get 401 with a WWW-Authenticate header.
The /healthz and /readyz endpoints never require authentication.
Local development (HS256). Standing up an IdP just to try Helmsman is friction, so dev mode accepts HS256 tokens signed with a shared secret and ships a minter:
export HELMSMAN_DEV_SECRET="something-long-and-random"
# config: identity.dev_mode.enabled = true
# Mint a 1-hour token for an agent and call through the gateway:
TOKEN=$(helmsman token mint billing-bot --ttl 1h)
curl -s http://127.0.0.1:7474/mcp \
-H "authorization: Bearer $TOKEN" \
-H 'content-type: application/json' \
-d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}'
Dev mode and OIDC are mutually exclusive: dev mode only ever accepts HS256, and OIDC only ever accepts RS256/ES256. A deployment therefore can't be tricked into accepting a symmetric-key token where an asymmetric one is expected (the classic JWT algorithm-confusion attack).
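To make the algorithm-pinning idea concrete, here is a standard-library-only sketch of a dev-mode style check: the `alg` header is compared against HS256 before any signature work happens. The `verifyHS256` and `sign` functions are hypothetical and deliberately skip expiry, issuer, and audience checks; a real deployment would rely on a maintained JWT library rather than this code.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// verifyHS256 validates a compact JWT against a shared secret.
// It rejects every algorithm except HS256 before checking the signature.
// Claim validation (exp, iss, aud) is intentionally omitted from this sketch.
func verifyHS256(token string, secret []byte) (map[string]any, error) {
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return nil, errors.New("malformed token")
	}
	enc := base64.RawURLEncoding
	headerJSON, err := enc.DecodeString(parts[0])
	if err != nil {
		return nil, fmt.Errorf("decode header: %w", err)
	}
	var header struct {
		Alg string `json:"alg"`
	}
	if err := json.Unmarshal(headerJSON, &header); err != nil {
		return nil, fmt.Errorf("parse header: %w", err)
	}
	if header.Alg != "HS256" {
		return nil, fmt.Errorf("algorithm %q is not accepted", header.Alg)
	}
	sig, err := enc.DecodeString(parts[2])
	if err != nil {
		return nil, fmt.Errorf("decode signature: %w", err)
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(parts[0] + "." + parts[1]))
	if !hmac.Equal(sig, mac.Sum(nil)) { // constant-time comparison
		return nil, errors.New("signature mismatch")
	}
	payload, err := enc.DecodeString(parts[1])
	if err != nil {
		return nil, fmt.Errorf("decode payload: %w", err)
	}
	var claims map[string]any
	if err := json.Unmarshal(payload, &claims); err != nil {
		return nil, fmt.Errorf("parse claims: %w", err)
	}
	return claims, nil
}

// sign builds an HMAC-SHA256 signed token with the given alg header value.
// It exists only to produce inputs for the demonstration below.
func sign(alg string, claims map[string]any, secret []byte) string {
	enc := base64.RawURLEncoding
	h, _ := json.Marshal(map[string]string{"alg": alg, "typ": "JWT"})
	p, _ := json.Marshal(claims)
	input := enc.EncodeToString(h) + "." + enc.EncodeToString(p)
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(input))
	return input + "." + enc.EncodeToString(mac.Sum(nil))
}

func main() {
	secret := []byte("something-long-and-random")
	claims, err := verifyHS256(sign("HS256", map[string]any{"sub": "billing-bot"}, secret), secret)
	fmt.Println(claims["sub"], err)
	_, err = verifyHS256(sign("RS256", map[string]any{"sub": "billing-bot"}, secret), secret)
	fmt.Println(err)
}
```

The important property is that the verifier chooses the algorithm, not the token: the header is checked against a fixed allowlist and never used to select a verification routine.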
Helmsman is / is not
| Helmsman is | Helmsman is not |
|---------------------------------------------|---------------------------------------|
| A control plane for your MCP servers | An LLM proxy (no Helicone competitor) |
| Identity-, policy-, audit-aware | A model router or prompt cache |
| A single Go binary, self-hostable | A SaaS product |
| Vendor- and transport-neutral | Tied to any specific agent SDK |
| The choke point your security team queries | A general-purpose API gateway |
Architecture
agent (Cursor, Claude Code, custom)
│
│ POST /mcp X-Helmsman-Agent: <id>
▼
┌─────────────────────┐
│ helmsman │
│ HTTP transport │ /healthz /readyz
│ ─ middleware ─ │ X-Request-Id
│ proxy │ ← identity (v0.2: JWT/OIDC)
│ ├ tool registry │ ← policy (v0.3)
│ └ audit writer ─┐ │
└────────┬─────────┼──┘
│ │
┌─────────────┴───┐ ┌───▼──────────┐
│ upstream A │ │ audit sinks │
│ stdio child │ │ stdout/file │
└─────────────────┘ └──────────────┘
│
┌─────────────┴───┐
│ upstream B │
│ stdio child │
└─────────────────┘
The proxy holds a tool registry built from each upstream's tools/list
response. tools/call is routed by name; non-tool methods are forwarded
to the first configured upstream (proper prompts/resources routing
lands in v0.2). Every forwarded request emits one audit event,
serialized as a JSON line through a bounded async writer.
Roadmap
v0.1 — usable. Multi-upstream routing with tool-catalog merging, JWT/OIDC agent identity, structured audit log, Prometheus metrics, prebuilt binaries + Docker image + Homebrew formula, three deployment recipes. The version a security-conscious team would actually deploy. See docs/v0.1-plan.md for the full plan.
v0.2 — production HA. Helm chart, Redis-backed shared state for multi-replica deployments, SSE / streamable-HTTP upstreams, hot-reload on SIGHUP.
v0.3 — policy. Per-agent, per-tool, per-argument policy. Simple match-rule DSL with Rego as a documented escape hatch. Tool-level deny that no agent can bypass.
v0.4 — transports. Server-push (GET /mcp SSE) downstream. Connection pooling and circuit breaking. Standardized session lifecycle.
v1.0 — production-grade. Embedded admin UI for live audit-log browsing and config inspection. Per-(agent, tool) rate limits with token-bucket semantics. Documented spec contributions for the gaps Helmsman hits.
Contributing
Helmsman is MIT licensed. PRs welcome — start with an issue describing the change so we don't sink time into the wrong direction.
License
MIT.