Helmsman

A gateway for Model Context Protocol servers.

You run Helmsman in front of your fleet of MCP servers — Slack, GitHub, Postgres, your internal tools — and every agent in your company connects to Helmsman instead of connecting to each server directly. Helmsman is the place where agent identity gets stamped, policy gets enforced, and audit logs get written.

It's a single Go binary, ships with sensible defaults, and is designed to be the smallest possible thing you'd put between your agents and production.

Status: v0.1-track. Landed: multi-upstream routing with tool-catalog merging, JWT/OIDC agent identity (with an HS256 dev mode + token minter), structured audit log, structured logs, and operational endpoints (/healthz, /readyz). Still to come: /metrics, helmsman init/doctor, and the release toolchain. See docs/v0.1-plan.md for the full plan.

Why this exists

MCP solved the protocol problem. It did not solve the operational problem. Today, MCP server management is handled client-side — every IDE, agent, and SDK has its own config file listing servers as either stdio commands or HTTP endpoints. That works for individual developers. It does not work for an organization.

Three gaps that a config file fundamentally cannot fill:

Identity. A stdio MCP server has no idea which agent is calling it; an HTTP server gets whatever bearer token the client decides to send. Helmsman is the place where every call gets stamped with a verifiable agent identity, and every audit log row, metric, policy decision, and rate-limit bucket is keyed on that identity.

Enforceable policy. Policy in client config is a suggestion — the agent process owns the file. Server-side policy is the only kind that survives a buggy or jailbroken agent. "Agent X may call github.delete_repo only on repos matching sandbox-*" has to live somewhere the agent cannot reach.

Audit. Per-client config means logs scattered across every developer laptop and CI runner. Helmsman is the single choke point your security team can query.

For a deeper take on why this is worth building separately from a generic API gateway, see docs/why.md.

Quickstart

# Resolve dependencies (populates go.sum; needed after a fresh clone).
go mod tidy

# Build.
make build

# Copy the example config (two upstreams: filesystem + everything).
cp helmsman.example.yaml helmsman.yaml

# Run.
./bin/helmsman --config helmsman.yaml

In another terminal, initialize and call a tool through the gateway. Note the X-Helmsman-Agent header, a development identity placeholder; token-based JWT/OIDC authentication is covered under Identity:

curl -s http://127.0.0.1:7474/mcp \
  -H 'content-type: application/json' \
  -H 'x-helmsman-agent: dev-agent' \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-06-18","capabilities":{},"clientInfo":{"name":"curl","version":"0"}}}' \
  | jq '.result.serverInfo'

# Tool catalog merged across all configured upstreams.
curl -s http://127.0.0.1:7474/mcp \
  -H 'content-type: application/json' \
  -d '{"jsonrpc":"2.0","id":2,"method":"tools/list"}' \
  | jq '.result.tools | map(.name)'

# Tool call routed to the right upstream by name.
curl -s http://127.0.0.1:7474/mcp \
  -H 'content-type: application/json' \
  -H 'x-helmsman-agent: dev-agent' \
  -d '{"jsonrpc":"2.0","id":3,"method":"tools/call","params":{"name":"list_directory","arguments":{"path":"/tmp"}}}' \
  | jq

Audit rows land on stdout by default — pipe helmsman through jq and you'll see one event per forwarded request:

{"ts":"2026-...Z","request_id":"...","agent":"dev-agent","upstream":"filesystem","method":"tools/call","tool":"list_directory","arg_hash":"sha256:...","arg_size":42,"latency_ms":7,"result":"ok"}

make smoke runs a full end-to-end check against npx @modelcontextprotocol/server-filesystem and server-everything, including audit-log verification.
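If you want to consume the audit stream programmatically rather than through jq, the following is a minimal Go sketch. The AuditEvent type and the parseAuditLines helper are hypothetical names, not part of Helmsman; the field names come from the sample row above, and the field types are inferred from that single sample, so treat them as assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// AuditEvent mirrors the keys visible in the sample audit row.
// The Go types are assumptions inferred from that sample.
type AuditEvent struct {
	TS        string  `json:"ts"`
	RequestID string  `json:"request_id"`
	Agent     string  `json:"agent"`
	Upstream  string  `json:"upstream"`
	Method    string  `json:"method"`
	Tool      string  `json:"tool"`
	ArgHash   string  `json:"arg_hash"`
	ArgSize   int     `json:"arg_size"`
	LatencyMS float64 `json:"latency_ms"`
	Result    string  `json:"result"`
}

// parseAuditLines decodes one event per line and skips any line that is
// not a JSON object (for example, text-format log lines on the same stream).
func parseAuditLines(input string) []AuditEvent {
	var events []AuditEvent
	for _, line := range strings.Split(input, "\n") {
		line = strings.TrimSpace(line)
		if !strings.HasPrefix(line, "{") {
			continue
		}
		var ev AuditEvent
		if err := json.Unmarshal([]byte(line), &ev); err != nil {
			continue
		}
		events = append(events, ev)
	}
	return events
}

func main() {
	sample := `{"agent":"dev-agent","upstream":"filesystem","method":"tools/call","tool":"list_directory","arg_size":42,"latency_ms":7,"result":"ok"}`
	for _, ev := range parseAuditLines(sample) {
		fmt.Printf("%s called %s on %s: %s\n", ev.Agent, ev.Tool, ev.Upstream, ev.Result)
	}
}
```

Unknown keys are ignored by encoding/json, so the sketch keeps working if later versions add fields to the row.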

Run with Docker

The image bundles the Helmsman binary on a Node runtime with the two reference MCP servers pre-installed, so it runs out of the box.

# Build the image.
docker build -t helmsman:0.1.0 .

# Run it. Files in ./data become visible to the filesystem upstream.
mkdir -p data && echo "hello from docker" > data/hello.txt
docker run --rm -p 7474:7474 -v "$PWD/data:/data" helmsman:0.1.0

Or with Compose:

docker compose up --build

Then, from another terminal:

curl -s http://localhost:7474/readyz
curl -s http://localhost:7474/mcp \
  -H 'content-type: application/json' \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}' | jq '.result.tools | map(.name)'

The baked-in config (deploy/helmsman.docker.yaml) runs in anonymous mode for easy evaluation. For anything real, mount your own config with an identity: block:

docker run --rm -p 7474:7474 \
  -v "$PWD/helmsman.yaml:/etc/helmsman/helmsman.yaml:ro" \
  -v "$PWD/data:/data" \
  helmsman:0.1.0

See docs/deployment.md for the container model (why stdio upstreams are co-located), TLS, and Kubernetes notes.

Operational endpoints

GET /healthz — process liveness; always 200 while running.
GET /readyz — readiness; 200 once every upstream has finished its initialize handshake. Returns 503 with the unready upstream names otherwise. Wire this to your load balancer.
POST /mcp — the JSON-RPC endpoint. Each response carries an X-Request-Id header that ties back to log lines and audit rows.
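As an illustration of wiring /readyz into a startup check, here is a small Go sketch that polls the endpoint until it returns 200. The waitReady function is a hypothetical helper, and newStubGateway is only a local stand-in for Helmsman so the example runs without a gateway; its one assumption about the real endpoint is the 200/503 behaviour described above.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
	"time"
)

// waitReady polls url until it answers 200 OK or the attempts run out.
func waitReady(client *http.Client, url string, attempts int, interval time.Duration) error {
	lastStatus := 0
	for i := 0; i < attempts; i++ {
		resp, err := client.Get(url)
		if err == nil {
			lastStatus = resp.StatusCode
			resp.Body.Close()
			if resp.StatusCode == http.StatusOK {
				return nil
			}
		}
		time.Sleep(interval)
	}
	return fmt.Errorf("not ready after %d attempts (last status %d)", attempts, lastStatus)
}

// newStubGateway is a stand-in for Helmsman, used only for this demo:
// /readyz answers 503 for the first `failures` requests, then 200.
func newStubGateway(failures int) *httptest.Server {
	var mu sync.Mutex
	seen := 0
	mux := http.NewServeMux()
	mux.HandleFunc("/readyz", func(w http.ResponseWriter, r *http.Request) {
		mu.Lock()
		defer mu.Unlock()
		if seen < failures {
			seen++
			w.WriteHeader(http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	})
	return httptest.NewServer(mux)
}

// demoReady runs waitReady against a stub that fails `failures` times.
func demoReady(failures, attempts int) error {
	srv := newStubGateway(failures)
	defer srv.Close()
	return waitReady(srv.Client(), srv.URL+"/readyz", attempts, time.Millisecond)
}

func main() {
	fmt.Println("ready after retries:", demoReady(2, 5) == nil)
	fmt.Println("gives up when never ready:", demoReady(5, 2) != nil)
}
```

Against a real deployment you would call waitReady with the gateway's address (for example http://127.0.0.1:7474/readyz) and a longer interval.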

Configuration

listen: ":7474"

log:
  level: info       # debug | info | warn | error
  format: text      # text (dev) | json (prod / Kubernetes)

audit:
  sinks:
    - type: stdout
    # - type: file
    #   path: ./audit.jsonl
  log_arguments: false   # global default; per-upstream override
  buffer_size: 1024

upstreams:
  - name: filesystem
    transport: stdio
    command: npx
    args: ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
    namespaced: false           # native tool names visible
    audit_log_arguments: true   # full args in audit rows for this trusted upstream

  - name: everything
    transport: stdio
    command: npx
    args: ["-y", "@modelcontextprotocol/server-everything"]
    # namespaced defaults to true: clients see "everything:<tool>"

See helmsman.example.yaml.

Tool name namespacing

Two upstreams might both define a tool called list. Helmsman resolves this with namespacing: each upstream's tools are exposed as <upstream>:<tool> by default, so filesystem:list and github:list coexist cleanly. For a single trusted upstream where you want the original tool names visible, set namespaced: false on that upstream. If two non-namespaced upstreams declare the same tool name, Helmsman refuses to start and asks you to disambiguate.
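The naming rule can be sketched in a few lines of Go. This is an illustration of the behaviour described above, not Helmsman's actual code: Upstream, Route, and buildRegistry are hypothetical names, and the error wording is invented.

```go
package main

import "fmt"

// Upstream is a simplified view of one configured upstream.
type Upstream struct {
	Name       string
	Namespaced bool
	Tools      []string // tool names as reported by the upstream
}

// Route records where an exposed tool name should be forwarded.
type Route struct {
	Upstream string
	Tool     string // the upstream's native tool name
}

// buildRegistry maps client-visible tool names to routes and fails on
// any collision, mirroring the "refuses to start" behaviour.
func buildRegistry(upstreams []Upstream) (map[string]Route, error) {
	reg := make(map[string]Route)
	for _, u := range upstreams {
		for _, tool := range u.Tools {
			exposed := tool
			if u.Namespaced {
				exposed = u.Name + ":" + tool
			}
			if prev, ok := reg[exposed]; ok {
				return nil, fmt.Errorf("tool %q is exposed by both %q and %q; enable namespacing on one of them",
					exposed, prev.Upstream, u.Name)
			}
			reg[exposed] = Route{Upstream: u.Name, Tool: tool}
		}
	}
	return reg, nil
}

func main() {
	reg, err := buildRegistry([]Upstream{
		{Name: "filesystem", Namespaced: true, Tools: []string{"list"}},
		{Name: "github", Namespaced: true, Tools: []string{"list"}},
	})
	fmt.Println(len(reg), err)

	_, err = buildRegistry([]Upstream{
		{Name: "a", Namespaced: false, Tools: []string{"list"}},
		{Name: "b", Namespaced: false, Tools: []string{"list"}},
	})
	fmt.Println(err != nil)
}
```

Checking every exposed name, rather than only pairs of non-namespaced upstreams, also catches the edge case where a native tool name happens to look like another upstream's namespaced name.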

Argument logging

By default audit rows include only a sha256 hash and size of each tool call's arguments — never the full args. Per-upstream audit_log_arguments: true opts that upstream into full-argument logging when the operational need is worth the sensitivity tradeoff.
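To make the hash-and-size default concrete, here is a minimal Go sketch of how such a summary could be computed. It assumes the hash is taken over the raw JSON bytes of the arguments exactly as received; whether Helmsman canonicalizes the JSON first is not stated here, so two semantically equal argument objects with different key order may or may not hash identically in the real gateway. The summarizeArgs name is hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// summarizeArgs returns a "sha256:<hex>" digest and the byte length of
// the raw arguments, matching the arg_hash / arg_size shape of an audit row.
// Hashing the raw bytes (no canonicalization) is an assumption.
func summarizeArgs(raw []byte) (hash string, size int) {
	sum := sha256.Sum256(raw)
	return "sha256:" + hex.EncodeToString(sum[:]), len(raw)
}

func main() {
	hash, size := summarizeArgs([]byte(`{"path":"/tmp"}`))
	fmt.Println(hash, size)
}
```

The digest lets you correlate repeated calls with identical arguments without storing the arguments themselves.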

Identity

Helmsman authenticates agents with JWT bearer tokens. If no identity: block is configured it runs in anonymous mode (every request is recorded as agent=anonymous) and prints a loud startup warning — fine for kicking the tires locally, never for production.

Production (OIDC). Point Helmsman at your issuer; it discovers the JWKS, validates RS256/ES256 signatures, and checks issuer, audience, and expiry on every request. The agent identity comes from a configurable claim (sub by default) and flows into every audit row.

identity:
  issuer: https://accounts.example.com
  audience: helmsman
  agent_claim: sub

A request then carries the token as a normal bearer header:

curl -s http://127.0.0.1:7474/mcp \
  -H 'authorization: Bearer eyJhbGc...' \
  -H 'content-type: application/json' \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}'

Missing or invalid tokens get 401 with a WWW-Authenticate header. The /healthz and /readyz endpoints never require authentication.

Local development (HS256). Standing up an IdP just to try Helmsman is friction, so dev mode accepts HS256 tokens signed with a shared secret and ships a minter:

export HELMSMAN_DEV_SECRET="something-long-and-random"
# config: identity.dev_mode.enabled = true

# Mint a 1-hour token for an agent and call through the gateway:
TOKEN=$(helmsman token mint billing-bot --ttl 1h)
curl -s http://127.0.0.1:7474/mcp \
  -H "authorization: Bearer $TOKEN" \
  -H 'content-type: application/json' \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}'

Dev mode and OIDC are mutually exclusive — dev mode only ever accepts HS256, and OIDC only ever accepts RS256/ES256, so a deployment can't be tricked into accepting a symmetric-key token where an asymmetric one is expected (the classic JWT algorithm-confusion attack).
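The algorithm pinning described above can be illustrated with a short Go sketch that inspects the JWT header before any signature verification. This is not Helmsman's implementation: Mode, checkAlg, and fakeToken are hypothetical names, and the check is a pre-filter that complements signature, issuer, audience, and expiry validation rather than replacing any of it.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// Mode selects which identity mode the gateway runs in.
type Mode int

const (
	ModeOIDC Mode = iota
	ModeDev
)

// allowedAlgs pins each mode to its own algorithm family.
var allowedAlgs = map[Mode]map[string]bool{
	ModeOIDC: {"RS256": true, "ES256": true},
	ModeDev:  {"HS256": true},
}

// checkAlg rejects a token whose header names an algorithm outside the
// allow-list for the active mode. It does NOT verify the signature.
func checkAlg(mode Mode, token string) error {
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return errors.New("malformed JWT: expected three dot-separated segments")
	}
	headerJSON, err := base64.RawURLEncoding.DecodeString(parts[0])
	if err != nil {
		return fmt.Errorf("decode header: %w", err)
	}
	var header struct {
		Alg string `json:"alg"`
	}
	if err := json.Unmarshal(headerJSON, &header); err != nil {
		return fmt.Errorf("parse header: %w", err)
	}
	if !allowedAlgs[mode][header.Alg] {
		return fmt.Errorf("alg %q is not accepted in this mode", header.Alg)
	}
	return nil
}

// fakeToken builds an unsigned, demo-only token with the given alg header,
// an empty claims object, and a placeholder signature segment.
func fakeToken(alg string) string {
	header := base64.RawURLEncoding.EncodeToString([]byte(`{"alg":"` + alg + `","typ":"JWT"}`))
	return header + ".e30.sig"
}

func main() {
	fmt.Println(checkAlg(ModeOIDC, fakeToken("RS256")))
	fmt.Println(checkAlg(ModeOIDC, fakeToken("HS256")))
	fmt.Println(checkAlg(ModeDev, fakeToken("none")))
}
```

Because the allow-list is chosen by configuration and never by the token, an attacker cannot switch the verifier to a weaker algorithm by editing the header.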

Helmsman is / is not

| Helmsman is                                 | Helmsman is not                       |
|---------------------------------------------|---------------------------------------|
| A control plane for your MCP servers        | An LLM proxy (no Helicone competitor) |
| Identity-, policy-, audit-aware             | A model router or prompt cache        |
| A single Go binary, self-hostable           | A SaaS product                        |
| Vendor- and transport-neutral               | Tied to any specific agent SDK        |
| The choke point your security team queries  | A general-purpose API gateway         |

Architecture

        agent (Cursor, Claude Code, custom)
                      │
                      │ POST /mcp        X-Helmsman-Agent: <id>
                      ▼
            ┌─────────────────────┐
            │      helmsman       │
            │  HTTP transport     │     /healthz  /readyz
            │  ─ middleware ─     │     X-Request-Id
            │  proxy              │     ← identity (v0.2: JWT/OIDC)
            │  ├ tool registry    │     ← policy   (v0.3)
            │  └ audit writer ─┐  │
            └────────┬─────────┼──┘
                     │         │
       ┌─────────────┴───┐ ┌───▼──────────┐
       │   upstream A    │ │  audit sinks │
       │   stdio child   │ │  stdout/file │
       └─────────────────┘ └──────────────┘
                     │
       ┌─────────────┴───┐
       │   upstream B    │
       │   stdio child   │
       └─────────────────┘

The proxy holds a tool registry built from each upstream's tools/list response. tools/call is routed by name; non-tool methods are forwarded to the first configured upstream (proper prompts/resources routing lands in v0.2). Every forwarded request emits one audit event, serialized as a JSON line through a bounded async writer.
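A bounded async writer of the kind mentioned above might look like the following Go sketch. It is an illustration under stated assumptions, not Helmsman's code: asyncWriter and its methods are hypothetical names, and the choice to block the caller when the buffer is full (rather than drop events) is an assumption, since the behaviour at capacity is not specified here.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
)

// asyncWriter serializes events as JSON lines on a background goroutine.
// The channel capacity plays the role of audit.buffer_size.
type asyncWriter struct {
	events chan any
	done   chan struct{}
}

func newAsyncWriter(w io.Writer, bufferSize int) *asyncWriter {
	aw := &asyncWriter{
		events: make(chan any, bufferSize),
		done:   make(chan struct{}),
	}
	go func() {
		defer close(aw.done)
		enc := json.NewEncoder(w) // Encode appends a newline after each value
		for ev := range aw.events {
			_ = enc.Encode(ev) // a real writer would surface this error
		}
	}()
	return aw
}

// Emit queues an event. It blocks when the buffer is full (assumption:
// backpressure is preferred over silently dropping audit events).
func (aw *asyncWriter) Emit(ev any) { aw.events <- ev }

// Close flushes queued events and waits for the goroutine to finish.
// Emit must not be called after Close.
func (aw *asyncWriter) Close() {
	close(aw.events)
	<-aw.done
}

// demoWrite emits n events into an in-memory buffer and returns the output.
func demoWrite(n int) string {
	var buf bytes.Buffer
	aw := newAsyncWriter(&buf, 2)
	for i := 0; i < n; i++ {
		aw.Emit(map[string]any{"method": "tools/call", "seq": i})
	}
	aw.Close()
	return buf.String()
}

func main() {
	fmt.Print(demoWrite(3))
}
```

Keeping serialization off the request path means a slow sink adds latency only once the buffer fills, which is the trade-off a bounded queue makes explicit.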

Roadmap

v0.1 — usable. Multi-upstream routing with tool-catalog merging, JWT/OIDC agent identity, structured audit log, Prometheus metrics, prebuilt binaries + Docker image + Homebrew formula, three deployment recipes. The version a security-conscious team would actually deploy. See docs/v0.1-plan.md for the full plan.

v0.2 — production HA. Helm chart, Redis-backed shared state for multi-replica deployments, SSE / streamable-HTTP upstreams, hot-reload on SIGHUP.

v0.3 — policy. Per-agent, per-tool, per-argument policy. Simple match-rule DSL with Rego as a documented escape hatch. Tool-level deny that no agent can bypass.

v0.4 — transports. Server-push (GET /mcp SSE) downstream. Connection pooling and circuit breaking. Standardized session lifecycle.

v1.0 — production-grade. Embedded admin UI for live audit-log browsing and config inspection. Per-(agent, tool) rate limits with token-bucket semantics. Documented spec contributions for the gaps Helmsman hits.

Contributing

Helmsman is MIT licensed. PRs welcome — start with an issue describing the change so we don't sink time into the wrong direction.

License

MIT.
