# claude-mind-mcp

Apple-native MCP server for persistent memory: Core ML embeddings, Core Data store, optional Postgres+pgvector mirror with hybrid candidate selection.

Local-first Apple-native MCP server for time-aware persistent memory, with an optional Postgres+pgvector mirror.

License: MIT (see LICENSE). Third-party licenses for the bundled Swift packages and the embedding model are in THIRD_PARTY_NOTICES.md.

## ClaudeMindMCP starter

A local-first Swift MCP server for time-aware persistent memory on Apple platforms.
This starter is designed around:
- the official MCP Swift SDK for local `stdio` and remote HTTP transports
- Core Data as the canonical on-device store
- `NSPersistentCloudKitContainer` for optional iCloud-backed sync across the same user's Apple devices
- NaturalLanguage (`NLTagger`, `NLEmbedding`/`NLContextualEmbedding`) and Foundation (`NSDataDetector`) for on-device enrichment
- an optional Postgres + pgvector mirror for filtered semantic retrieval, SQL joins, and larger corpora
## Recommended shape
Use a two-tier design rather than choosing only one store:
- Canonical store: Core Data on device
- Mirror store: Postgres/pgvector behind a feature flag or background sync worker
That gets you:
- offline-first writes
- Apple-native persistence and sync
- fast metadata filters and joins once the corpus grows
- a clean failure mode when Postgres is unavailable
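The two-tier contract above can be sketched in a few lines. This is a minimal illustration with hypothetical names (`TwoTierStore`, a dict standing in for Core Data), not the server's actual types — the point is that the canonical write always succeeds locally and the mirror only sees an append-only outbox:

```python
import uuid
import datetime

class TwoTierStore:
    """Canonical local store plus an outbox for the mirror; mirror trouble never blocks writes."""

    def __init__(self):
        self.local = {}    # canonical store (stands in for Core Data)
        self.outbox = []   # append-only queue drained by a background worker

    def remember(self, text):
        memory_id = str(uuid.uuid4())
        self.local[memory_id] = {
            "id": memory_id,
            "text": text,
            "created_at": datetime.datetime.now(datetime.timezone.utc),
        }                                  # offline-first: always succeeds locally
        self.outbox.append(memory_id)      # mirror catches up later, or never
        return memory_id

store = TwoTierStore()
mid = store.remember("Coffee with Sarah in Oakland.")
```

If Postgres is down, the outbox simply grows; nothing in the write path ever waits on the mirror.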
## Transport split

- Local harness / desktop client: `StdioTransport`
- Cloud-hosted agent / shared connector: Streamable HTTP transport
## Tool surface

Time: `now`, `parse_date`, `relative`, `calendar_context`

Memory: `remember`, `recall`, `recall_around`, `relate`, `traverse`, `list_recent`, `forget`
## Retrieval policy

1. Parse the query for dates, entities, tags, and source hints.
2. Apply structured filters first.
3. Retrieve semantic candidates from the vector index.
4. Expand 1-2 hops through mentions / relations.
5. Re-rank by semantic similarity + recency + graph proximity + explicit filter matches.
6. Return provenance and timestamps with every hit.
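The staged policy can be sketched end to end. This is an illustration, not the server's code: the memory shape (`id`/`vec`/`tags`/`created_at`), the 0.8/0.2 weights, and the recency decay are all assumptions, and graph expansion (step 4) is omitted for brevity:

```python
from datetime import datetime, timedelta, timezone

def cosine(a, b):
    # plain cosine similarity between two vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def recall(query_vec, query_tags, memories, k=3, now=None):
    """Structured filters first, then semantic scoring, then a recency-aware rerank."""
    now = now or datetime.now(timezone.utc)
    # step 2: structured filters before any vector work
    pool = [m for m in memories if not query_tags or query_tags & m["tags"]]

    def score(m):
        age_days = (now - m["created_at"]).total_seconds() / 86400
        recency = 1.0 / (1.0 + age_days)          # illustrative decay
        return 0.8 * cosine(query_vec, m["vec"]) + 0.2 * recency

    # steps 3 + 5: rank the filtered pool, return top-k ids
    return [m["id"] for m in sorted(pool, key=score, reverse=True)[:k]]

now = datetime(2025, 1, 2, tzinfo=timezone.utc)
mems = [
    {"id": "a", "vec": [1.0, 0.0], "tags": {"coffee"}, "created_at": now - timedelta(days=1)},
    {"id": "b", "vec": [0.0, 1.0], "tags": {"coffee"}, "created_at": now - timedelta(days=1)},
    {"id": "c", "vec": [1.0, 0.0], "tags": {"work"},   "created_at": now - timedelta(days=1)},
]
```

Note that `c` never gets scored: the tag filter removes it before the vector stage, which is the whole point of filtering first.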
## Canonical Core Data entities

- `MemoryRecord`: `id: UUID`, `text: String`, `createdAt: Date`, `occurredAt: Date?`, `source: String?`, `conversationID: String?`, `language: String?`, `sentiment: Double`, `embeddingBlob: Data?`, `metadataJSON: Data?`, `tombstoned: Bool`
- `EntityRecord`: `id: UUID`, `canonicalName: String`, `type: String`, `aliasesJSON: Data?`
- `MentionRecord`: `id: UUID`, `memoryID: UUID`, `entityID: UUID`, `startOffset: Int64`, `endOffset: Int64`
- `RelationRecord`: `id: UUID`, `subjectEntityID: UUID`, `predicate: String`, `objectEntityID: UUID`, `provenanceMemoryID: UUID`, `createdAt: Date`
- `TagRecord`: `id: UUID`, `name: String`
- `MemoryTagRecord`: `memoryID: UUID`, `tagID: UUID`
## Postgres mirror

Mirror only the pieces that benefit retrieval:
- memory text
- timestamps
- source / conversation / tags
- entity ids
- embedding vector
- `tsvector` search document
Keep Core Data as the authoritative store and mirror via an append-only outbox table or durable sync queue. Never make Postgres the only place a memory exists unless you intentionally switch the architecture later.
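The outbox contract above can be made concrete. This sketch uses hypothetical dict-shaped rows (the real worker drains Core Data outbox rows); the invariant it demonstrates is per-row failure isolation — one bad row records its error and stays pending, while the rest of the batch still publishes:

```python
def drain_outbox(rows, publish):
    """Append-only outbox drain: mark successes sent, record failures, never raise."""
    for row in rows:
        if row.get("sent"):
            continue                                   # idempotent re-runs skip sent rows
        try:
            publish(row)
            row["sent"] = True
        except Exception as exc:
            row["attempts"] = row.get("attempts", 0) + 1
            row["last_error"] = str(exc)               # observable, never silently dropped

rows = [{"id": 1}, {"id": 2}]

def publish(row):
    if row["id"] == 2:
        raise RuntimeError("pg down")                  # simulate a per-row failure

drain_outbox(rows, publish)
```

Because nothing is ever deleted until it is marked sent, a memory always exists in the canonical store regardless of mirror health.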
## Why this staged design is the safest fit
- It preserves the single-binary local experience from the original time/date server.
- It gives you Apple-native persistence and optional iCloud sync.
- It avoids turning a simple MCP server into a hard dependency on a database daemon.
- It still leaves room for SQL joins and filtered ANN search once recall quality or corpus size makes them worthwhile.
## Claude Code skills pack

For clean `/now`, `/remember`, `/recall`, `/recall_around`, `/list_recent`,
`/parse_date`, and `/forget` slash commands in Claude Code, install the
companion skills pack:

```sh
scripts/install_skills.sh              # symlink skills into ~/.claude/skills/
scripts/install_skills.sh --dry-run    # preview
scripts/install_skills.sh --uninstall  # remove only the symlinks we own
```
The installer is idempotent and refuses to overwrite a non-symlink at the destination, so any hand-rolled skill at the same path is left untouched.
Each skill is a thin wrapper that calls the matching MCP tool — no memory
logic is reimplemented in the skill layer. See
claude-mind-skills/README.md for the full
tool ↔ skill table and the response contract for recall / recall_around.
You still need to register the MCP server with Claude Code separately:

```sh
claude mcp add claude-mind -- /absolute/path/to/.build/release/claude-mind-mcp
```
## Build & smoke-test

```sh
swift build
.build/debug/claude-mind-mcp   # speaks MCP over stdio
```

Run a hand-rolled stdio handshake:

```sh
TMP=$(mktemp -d)/memory.sqlite
{ printf '%s\n' \
  '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-03-26","capabilities":{},"clientInfo":{"name":"smoke","version":"0.0.1"}}}' \
  '{"jsonrpc":"2.0","method":"notifications/initialized"}' \
  '{"jsonrpc":"2.0","id":2,"method":"tools/list"}' \
  '{"jsonrpc":"2.0","id":3,"method":"tools/call","params":{"name":"remember","arguments":{"text":"Coffee with Sarah in Oakland.","tags":["coffee"]}}}' \
  '{"jsonrpc":"2.0","id":4,"method":"tools/call","params":{"name":"recall","arguments":{"query":"who did I see","k":3}}}'
  sleep 5; } | env CLAUDE_MIND_STORE_URL="$TMP" .build/debug/claude-mind-mcp
```
## Status (milestone 2.5, hybrid candidate selection — accepted on real data)

`recall` no longer relies on a single seed branch. The mirror's seed pool is the union of three branches, deduped by `memory_id`:

- vector — cosine over the per-profile pgvector table.
- lexical — `tsvector @@ websearch_to_tsquery` against `memories.search_document`. Query text is OR-tokenized in Swift before binding so multi-word queries don't collapse to AND-joined zero-result tsqueries.
- entity — join through `mentions` and `entities` for any named-entity strings the query's NER pass extracts. NER includes an acronym fallback so technical tokens like `TWAP`, `OBI`, `HIP-3` are picked up, not just `NLTagger`'s `PersonalName`/`PlaceName`/`OrganizationName`.

Each hit carries `seed_source` (`["vector"]`, `["lexical","entity"]`, etc.) and the per-call log emits `vec=N lex=M ent=K unique=U` for branch attribution.
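The union-and-dedupe step is small enough to show directly. A sketch with illustrative inputs (id lists stand in for scored rows); each surviving id keeps the list of branches that produced it, mirroring the per-hit `seed_source` field:

```python
def union_seeds(vector_ids, lexical_ids, entity_ids):
    """Seed pool = union of the three branches, deduped by memory_id."""
    pool = {}
    for branch, ids in (("vector", vector_ids),
                        ("lexical", lexical_ids),
                        ("entity", entity_ids)):
        for memory_id in ids:
            # first branch to surface an id creates the entry; later
            # branches append, so attribution is preserved through dedupe
            pool.setdefault(memory_id, []).append(branch)
    return pool

pool = union_seeds(["m1", "m2"], ["m2", "m3"], ["m3"])
```

From a pool like this, `vec=2 lex=2 ent=1 unique=3` falls straight out of the branch inputs and the deduped key count.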
Per-branch budgets via env vars:
| env var | default | meaning |
|---|---:|---|
| CLAUDE_MIND_KVEC | max(25, k*seedOverfetch) | top-K from vector branch |
| CLAUDE_MIND_KLEX | same | top-K from lexical branch |
| CLAUDE_MIND_KENT | same | top-K from entity branch |
| CLAUDE_MIND_QUERY_ENT_FALLBACK | false | v2.6 experimental. When true and NER on the query returns no entities, generate lowercase name-like candidates (unigrams + adjacent bigrams, stopwords filtered, capped at 12) and use those for the entity branch. Off by default; enable for journal/notes corpora where queries are casual lowercase. Recall log shows ner_entities=N fallback_entities=M fallback_tokens=... so you can see when and what fired. |
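The `CLAUDE_MIND_QUERY_ENT_FALLBACK` candidate generation can be sketched as below. The stopword list and exact tokenization rules here are assumptions, not the server's implementation; what matches the description is the shape: stopword-filtered lowercase unigrams plus adjacent bigrams, capped at 12:

```python
STOPWORDS = {"the", "a", "an", "of", "to", "in", "on", "for", "and", "or",
             "with", "what", "when", "who", "did", "i", "we", "my"}  # illustrative subset

def fallback_entity_candidates(query, cap=12):
    """Lowercase name-like candidates for the entity branch when NER finds nothing."""
    tokens = [t for t in query.lower().split()
              if t.isalpha() and t not in STOPWORDS]
    # adjacent bigrams over the filtered tokens (a simplifying assumption)
    bigrams = [f"{a} {b}" for a, b in zip(tokens, tokens[1:])]
    return (tokens + bigrams)[:cap]

cands = fallback_entity_candidates("coffee with sarah in oakland")
```

For a casual lowercase journal query this yields candidates like `sarah` and `coffee sarah` that the entity join can still match, even though `NLTagger` saw no capitalized names.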
### Phase 2 acceptance (real corpus = private trading-engine commit log, 424 commits, 31 hand-curated queries)

| ablation             | Phase 2 (pre-v2.5) MRR | v2.5 MRR |      Δ |
|----------------------|-----------------------:|---------:|-------:|
| local-only           |                  0.451 |    0.451 |      — |
| mirror-seeds         |                  0.451 |    0.486 | +0.035 |
| mirror + graph       |                  0.456 |    0.477 | +0.021 |
| mirror + graph + lex |                  0.482 |    0.894 | +0.412 |
Best swept config on v2.5: α=0.25, β=0.60, γ=0.10, δ=0.05 → MRR 0.940 / nDCG 0.761 / Success@5 1.000.
Failure modes by stratum (v2.5):

| stratum  |  n | good | partial-top1 | rank-issue | miss | no-truth |
|----------|---:|-----:|-------------:|-----------:|-----:|---------:|
| entity   | 13 |   11 |            1 |          0 |    0 |        0 |
| semantic | 11 |    7 |            2 |          0 |    0 |        0 |
| temporal |  3 |    0 |            1 |          0 |    0 |        2 |
| mixed    |  4 |    1 |            1 |          0 |    0 |        0 |
Zero misses — was 7 entity-name misses in Phase 2 pre-v2.5.
### Calibration that did NOT change

- Default rerank weights stay at `0.55 / 0.20 / 0.10 / 0.15`. Best swept α swings per corpus (synthetic 0.80, Phase 2 commit-log 0.20, v2.5 0.25); env-var override (`CLAUDE_MIND_W_*`) is the right answer until we have a third real corpus.
- Graph weight stays small. Graph contribution is +0.021 MRR on this corpus; probably more useful on entity-rich (journal/notes) data.
- Local fallback path unchanged.
### Caveat: lexical did most of the work on this corpus

Commits use entity strings literally — `TWAP` appears in TWAP commits — so the lexical branch alone surfaces the right memories. The entity branch is correct and contributes on 14/31 queries, but it is largely redundant with lexical here. Phase 3 against journal/notes data is where the entity branch should uniquely matter (common names matching many memories regardless of topic). See docs/quality/findings_v25.md.
### Known: Core Data programmatic-model relationship-faulting bug

`mention.value(forKey: "entity")` returns nil for persisted FKs in our programmatic model, even with `returnsObjectsAsFaults = false` and `relationshipKeyPathsForPrefetching = ["entity"]`. Worked around by adding a redundant `entityID: UUID` attribute on `MentionRecord` and looking up entities by id directly. Standalone repro (50 lines) at docs/coredata-bug-repro/. Regression test at Sources/ClaudeMindRegressionTest/; run with `swift run claude-mind-regression`.
## Status (milestone 2.3, hybrid recall through Postgres mirror)

When the mirror is enabled and the active enricher's profile + dimension match the mirrored profile, recall runs through Postgres:

- Embed the query with the active enricher.
- `RecallService` queries the per-profile pgvector table: `ORDER BY embedding <=> $1::vector` for cosine, plus `ts_rank_cd(search_document, websearch_to_tsquery('english', $2))` for lexical, with all structured filters in the `WHERE` clause.
- Top `k * seedOverfetch` candidate ids come back from PG with `(semantic_score, lexical_score)`.
- Core Data does graph expansion: 1-hop entity neighbors via `MemoryStore.expandGraph(seedIDs:filters:)`.
- Final rerank: `α·cosine + β·recency + γ·graph + δ·lexical` (defaults `0.55/0.20/0.10/0.15`).
- Top `k` returned with `path: "mirror"`, `candidate_count`, `expanded_count`, per-hit `is_seed`, `shared_entity_count`.
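The final rerank is a straight weighted sum with the documented default weights. A one-function sketch (component normalization to [0, 1] is assumed and not shown):

```python
def rerank_score(cosine_sim, recency, graph, lexical,
                 alpha=0.55, beta=0.20, gamma=0.10, delta=0.15):
    """alpha*cosine + beta*recency + gamma*graph + delta*lexical, default weights."""
    return alpha * cosine_sim + beta * recency + gamma * graph + delta * lexical
```

A perfect hit on every axis scores 1.0; a pure semantic match with no recency, graph, or lexical support tops out at 0.55, which is why lexical-heavy corpora reward bumping δ via `CLAUDE_MIND_W_*`.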
Fallback ladder is strict and boring:

- mirror unavailable / SQL error / profile mismatch / no query embedding → log the reason, return the local cosine result with `path: "local"` and `fallback_reason`.
- the mirror never blocks the tool; the local path still works without Postgres.

Logs always emit one line per recall:

```
recall path=mirror profile=<id> candidates=N expanded=M returned=K
recall path=local profile=<id> returned=K reason=<fallback_reason>
```
### Parity check on a fixed corpus

`docs/acceptance/test_v23_recall_parity.py` writes the bench corpus (30 sentences) into a fresh store with the mirror enabled, drains, then runs 8 queries through (a) the mirror path and (b) the local path on the same store with mirror disabled.

```
[summary] mean_jaccard=1.000 threshold=0.60 result=pass
```
All 8 queries agreed on top-5 IDs and exact rank order. Expected on a small corpus where the same embeddings live in both stores; the harness exists to catch divergence as the corpus and rerank weights evolve.
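The `mean_jaccard` metric is just set overlap of the two paths' top-k id lists, averaged across queries. A sketch of the per-query piece (rank order is checked separately in the harness):

```python
def topk_jaccard(mirror_ids, local_ids):
    """Jaccard overlap of two top-k result sets (rank-insensitive)."""
    a, b = set(mirror_ids), set(local_ids)
    return len(a & b) / len(a | b) if a | b else 1.0  # two empty sets agree trivially
```

A score of 1.0 means the paths returned the same ids; the 0.60 threshold tolerates partial divergence as rerank weights evolve.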
## Status (milestone 2.2, runtime-verified against pgvector — Postgres mirror)

Acceptance run (Apple M4, Postgres 17.9, pgvector 0.8.2, debug build):

```sh
# PG: brew install postgresql@17 pgvector (separate instance on port 5433)
psql -p 5433 -c "CREATE DATABASE claude_mind_test"
psql -p 5433 -d claude_mind_test -c "CREATE EXTENSION vector"

env CLAUDE_MIND_ENABLE_PGVECTOR_MIRROR=true \
  CLAUDE_MIND_PG_DSN="postgresql://you@localhost:5433/claude_mind_test?sslmode=disable" \
  .build/debug/claude-mind-mcp
```
| acceptance check | result | notes |
|---|---|---|
| 1. mirror enabled doesn't break stdio/MCP | ✅ | now / remember / recall round-trip, ServiceGroup tears down cleanly on EOF |
| 2. schema creation idempotent | ✅ | 8 tables: memories, embedding_profiles, entities, mentions, relations, tags, memory_tags, memory_embeddings_minilm_l6_v2_3786b7. Re-runs are no-ops. |
| 3a. one remember lands a row in memories | ✅ | id, text, language all match the local row |
| 3b. ... registers profile in embedding_profiles | ✅ | (minilm-l6-v2, CoreML(all), 384, 256) |
| 3c. ... lands an embedding row in the active profile table | ✅ | vector_dims(embedding) = 384 in memory_embeddings_minilm_l6_v2_3786b7 |
| 3d. ... outbox row marked sent | ✅ | Core Data shows pending=0, sent=1 |
| 4a. local writes during DB outage don't block | ✅ | remember returned with id while PG was stopped |
| 4b. drainer catches up after PG restart | ✅ | second-run drainer published 1 row, row visible in PG |
| 4c. attempt_count / last_error increment on live publish failure | ✅ | docs/acceptance/test_4c_live_failure.py: transient PG outage publishes transparently after reconnect (postgres-nio handles it, no spurious failure logs); a real SQL-level failure (DROP memories table mid-flight) increments attemptCount=2, sets lastError, sets lastAttemptAt. |
| 5. heterogeneous-profile rows skip embedding, still mirror metadata | ✅ | docs/acceptance/test_5_heterogeneous_profile.py: a memory stamped under profile A (CoreML/minilm-l6-v2/384) and drained while profile B is active (NL/nl-512/512) lands in memories but is skipped on the embedding axis; B's own memory lands in B's profile-scoped table normally. |
### Build mode policy

- Release build, mirror disabled (default): use `.build/release/claude-mind-mcp`. Core Data path only, no Postgres dependency exercised at runtime.
- Mirror enabled (`CLAUDE_MIND_ENABLE_PGVECTOR_MIRROR=true`): use the debug build (`.build/debug/claude-mind-mcp`). Mirror latency is dominated by Postgres I/O, so the optimization gap is immaterial for that path.

The release binary emits a warning-level startup log when the mirror is enabled, pointing at this section.
### Why the release+mirror combination crashes

A Swift 6.3.1 release-mode codegen issue triggers `freed pointer was not the last allocation` on the second consecutive `PostgresClient.query(...)` only when postgres-nio and Tokenizers (huggingface/swift-transformers) are linked into the same executable. Either dependency on its own works fine in release.

Reduced to a 30-line standalone repro at docs/swift-bug-repro/ — that project pins postgres-nio + swift-transformers and shows the crash with two `SELECT N` queries. Removing `import Tokenizers` from main.swift (without removing the SPM dep) is enough to make release work.

The diagnosis is "Swift release-mode codegen interaction between two dependencies", not a defect in either dependency or in our code. Possibly related (none exact): swiftlang/swift#84793, #81771, #86204. The repro project's README.md is ready to file as an upstream issue.
## Status (milestone 2.2, code-complete — Postgres mirror)

- New `ClaudeMindMirror` library target with postgres-nio. `MirrorWorker` actor connects to PG, runs `SchemaGenerator.canonicalStatements` + `SchemaGenerator.profileStatements(descriptor)` on first start, then loops at 500 ms polling the Core Data outbox in batches of 100.
- Per-row failure isolation: successful rows are marked `sent_at`; failed rows record `last_error`, bump `attempt_count`, set `last_attempt_at`. The mirror loop applies exponential backoff (1 s → 30 s) on consecutive batch failures; `remember` writes are never blocked.
- `outboxStats()` reports pending count, oldest pending timestamp, total attempts; the worker logs threshold warnings (pending >= 1000 or oldest > 1 h). No silent truncation — the backlog grows unbounded by design and is loudly observable.
- Heterogeneous-vector safety: per-row publish skips the embedding upsert if the stored `embeddingProfile`/`embeddingDim` doesn't match the active descriptor. Rows still mirror to `memories`, `tags`, `memory_tags`; the embedding axis stays unmirrored until that profile activates and registers its own table.
- Profile identity is single-source-of-truth (`SchemaGenerator.descriptor(enricher:modelName:seqLen:)` — id, backend, dim all come from the running enricher) and `safeID` is collision-defended (sanitized base + 6-hex-char SHA-256 of `(id, backend, dim)`).
- Manifest sha256 verification before model load. Mismatch fails closed → fall back to NL.
- ServiceGroup is back: when `CLAUDE_MIND_ENABLE_PGVECTOR_MIRROR=true` and `CLAUDE_MIND_PG_DSN` is set, the MCP server and mirror run as sibling services under one ServiceGroup with `successTerminationBehavior: .gracefullyShutdownGroup`, so EOF on stdio cleanly tears both down.
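The described backoff (1 s doubling to a 30 s ceiling on consecutive batch failures) produces a fixed delay schedule. A sketch, assuming a plain doubling curve — the worker's exact curve may differ:

```python
def backoff_delays(failures, base=1.0, cap=30.0):
    """Delay before each retry after N consecutive batch failures."""
    delay, out = base, []
    for _ in range(failures):
        out.append(delay)
        delay = min(delay * 2.0, cap)  # double, clamped at the ceiling
    return out
```

After six consecutive failures the loop is already pinned at the 30 s ceiling, so a long Postgres outage costs one cheap probe every 30 s rather than a hot poll.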
### Running the mirror

```sh
# 1. Postgres with pgvector (this Mac happens to have postgresql@14 + pgvector
#    targeting postgres@17; if you're on @14 you'll need to compile pgvector
#    against @14 or switch instances). pgvector docs: https://github.com/pgvector/pgvector
psql -d mydb -c "CREATE EXTENSION vector;"

# 2. Run the server with mirror enabled.
env \
  CLAUDE_MIND_ENABLE_PGVECTOR_MIRROR=true \
  CLAUDE_MIND_PG_DSN="postgresql://user@localhost:5432/mydb?sslmode=disable" \
  .build/release/claude-mind-mcp
```

The mirror will bootstrap the canonical schema + `memory_embeddings_<safeID>` table, register the active profile in `embedding_profiles`, and start draining the outbox. Mirror health appears in the server's stderr log.
### v2.2 known limits (deferred to v2.3+)

- TLS Postgres connections (`sslmode=require`) error out for now — the v2.2 path is `sslmode=disable`. Adding NIOSSL config is a small follow-up.
- Hybrid recall through Postgres (`SchemaGenerator.recallQuery`) is wired and ready in SQL, but the recall handler still queries Core Data only. v2.3 makes recall route through the mirror when it's enabled and the active profile matches.
- End-to-end mirror runtime hasn't been smoke-tested in this session: the on-machine postgresql@14 didn't have pgvector available (Homebrew's pgvector targets postgres@17). The build, schema, and worker code are complete; the missing step is `psql -c 'CREATE EXTENSION vector'` on a compatible server.
## Status (milestone 2.1, complete — pre-mirror)

- Default backend is now CoreML(all) + MiniLM-L6-v2. Settings env defaults to `embedding_backend=coreml` and `embedding_profile=minilm-l6-v2`. NLContextualEmbedding remains as a fallback (set `CLAUDE_MIND_EMBEDDING_BACKEND=nl`, or it kicks in automatically if the sidecar model isn't installed).
- Sidecar model packaging. Models live at `~/Library/Application Support/claude-mind/models/<name>/` with `model.mlpackage/`, `tokenizer/`, and `manifest.json` (name, version, backend, profile, dim, seq_len, sha256s). `ModelLocator` searches `CLAUDE_MIND_MODELS_DIR` → app-support → `docs/bench/models` (dev fallback). Install via `scripts/install_model.sh`.
- Profile-scoped pgvector schema generator (`SchemaGenerator`) emits a canonical schema (no embedding column on `memories`) plus per-profile `memory_embeddings_<profile>` tables with `vector(<dim>)`. Different backends/dimensions coexist without forcing a single column type. Active profiles are registered in `embedding_profiles`.
- Hygiene. `.gitignore` excludes `.mlpackage` / tokenizer / `.sqlite` / bench JSON. `.gitattributes` is preset for Git LFS if you opt in. The `.mlpackage` is ~88 MB — distribute via release asset or LFS, not as a normal blob.
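The manifest sha256 check (which, per milestone 2.2, fails closed to the NL backend) can be sketched as below. The manifest shape here (`{"sha256s": {"<relative path>": "<hex digest>"}}`) is an assumption based on the fields listed above, not the actual file format:

```python
import hashlib
import json
import pathlib
import tempfile

def verify_manifest(model_dir):
    """Fail-closed: every listed file must hash to its recorded digest."""
    root = pathlib.Path(model_dir)
    manifest = json.loads((root / "manifest.json").read_text())
    for rel_path, expected in manifest["sha256s"].items():
        actual = hashlib.sha256((root / rel_path).read_bytes()).hexdigest()
        if actual != expected:
            return False  # mismatch: caller falls back to the NL backend
    return True

# build a throwaway sidecar dir to exercise the check
tmp = pathlib.Path(tempfile.mkdtemp())
blob = b"fake model weights"
(tmp / "weights.bin").write_bytes(blob)
(tmp / "manifest.json").write_text(json.dumps(
    {"sha256s": {"weights.bin": hashlib.sha256(blob).hexdigest()}}))
```

Failing closed matters here: a truncated download or tampered `.mlpackage` should degrade to the slower NL path, never load a corrupt model.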
## Status (milestone 1.5, complete)

- Eager `NLContextualEmbedding.requestAssets()` warm-up; logs first-load vs cached separately. On this Mac: first-load asset download ≈ 17.7 s, subsequent loads ≈ 100 ms.
- `recall_around` lands on memories within ±window of an anchor (memory id or ISO date), ordered by absolute time delta.
- Single shared `NSManagedObjectContext` per store — read and write operations serialize on its queue. Note: rapid back-to-back writes-then-reads from independent MCP handlers can still race if the read enters the perform queue before the writes finish their pre-NLP work; in real Claude Desktop usage this never manifests because the client awaits each tool result.
- `NSEntityDescription.indexes` replaces deprecated `isIndexed`; single-attribute fetch indexes on hot paths only — compound indexes deferred until profiling justifies them.
- New `claude-mind-bench` executable target measures cold init, cold first embed, warm embed p50/p95/p99/stdev, full-path remember p50/p95/p99/stdev, serial vs limited-concurrency.
- `CoreMLEnricher` lands as a parallel backend with explicit `MLComputeUnits` (cpu / cpu+ane / all); model is BYO (see docs/coreml-embedding-models.md).
## Bench numbers on this machine (Apple M4)

| backend               | embed p50 (ms) | remember p50 (ms) | dim |
|-----------------------|---------------:|------------------:|----:|
| NLContextualEmbedding |          14.30 |             18.91 | 512 |
| CoreML(cpu)           |           6.07 |             10.95 | 384 |
| CoreML(cpu+ane)       |           3.03 |              8.60 | 384 |
| CoreML(all)           |           2.95 |              8.04 | 384 |
Full matrix and findings in docs/bench/results.md. MiniLM-L6-v2 on Core ML(all) is ~5× faster than NLContextualEmbedding for the embed path and ~2.3× faster for the full remember path. ANE startup costs ~1.6 s but amortizes immediately. Concurrency doesn't help any backend — drive embed serially or batch.
## Status (milestone 1, complete)

- Programmatic `NSManagedObjectModel` (no `.xcdatamodeld`, no Xcode project required) — Sources/ClaudeMindCore/ManagedObjectModel.swift.
- Plain `NSPersistentContainer` at `CLAUDE_MIND_STORE_URL` (default `~/Library/Application Support/claude-mind/memory.sqlite`).
- `AppleNLPEnricher` (actor; serialized to avoid NLTagger/NLEmbedding thread-safety issues): `NLContextualEmbedding` preferred, `NLEmbedding.sentenceEmbedding` fallback, dimension runtime-discovered, NER + sentiment + language detection, `NSDataDetector` for explicit dates.
- Tools live: `now`, `parse_date`, `remember`, `recall`, `list_recent`, `forget`. `relate`, `traverse`, `recall_around` return planned-for-v2 errors.
- Outbox row written on every `remember` so the milestone-2 mirror can backfill without changing the write path.
## Known caveats

- `NLContextualEmbedding` requires an asset bundle that may not be local on first run; we currently fall back to `NLEmbedding.sentenceEmbedding`. Eager asset download via `requestEmbeddingAssets()` is a small follow-up — flagged but not done in v1 to avoid surprising network calls during startup.
- Recall does an in-process cosine scan over all non-tombstoned matching memories. Adequate for ≤ ~50k memories; the pgvector mirror is the v2 fix.
- Programmatic model uses the deprecated `isIndexed` shortcut (still functional). Migration to `NSEntityDescription.indexes` is cosmetic.
## Milestone 2

Knowledge structure + scale:

- `relate`, `traverse`, `recall_around`
- Postgres mirror target (`ClaudeMindMirror`) draining the Core Data outbox
- pgvector ANN + `tsvector` lexical, hybrid recall path
- Remote Streamable HTTP transport with origin validation and auth (Anthropic's cloud connects from their network, not the laptop)
- Optional: wrap in `.app` to enable `NSPersistentCloudKitContainer` for cross-device sync