Mnemosyne MCP Server: Architecture

Level 2 (Topic) — Overall system architecture and component relationships.

Concept

mnemosyne-mcp-server is a semantic memory service that lets AI agents store, search, and retrieve structured knowledge using natural language. It implements the Model Context Protocol (MCP) so any MCP-compatible client (Gemini CLI, OpenCode, Pi) can interact with it as a tool provider.

The server connects three layers:

  • Caller (AI agent via MCP protocol) — sends ingestion requests and search queries
  • Embedding engine (Google Gemini gemini-embedding-001) — converts text to 768-dimensional vectors
  • Storage (PostgreSQL + pgvector) — persists memories and performs cosine similarity search

Component Map

cmd/mnemosyne-mcp/main.go          ← entrypoint, env vars, transport selection
internal/
├── mcp/mcp.go                      ← MCP server, tool registration, HTTP/stdio transport
├── logic/logic.go                  ← business logic, async ingestion worker pool
├── embedding/embedding.go          ← Gemini embedding client (REST API)
└── db/db.go                        ← PostgreSQL + pgvector layer

Data Flow

Ingestion (write path)

  1. Agent calls ingest_memory tool with content + timestamp
  2. MCP handler returns {"status": "queued"} immediately
  3. Background worker: SHA-256 dedup check → Gemini embed → pgvector INSERT
  4. Errors are logged but never reported back to the caller
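
The write path above can be sketched as a small channel-backed worker pool. This is a minimal illustration, not the code in internal/logic/logic.go: the names Job, Ingestor, Enqueue, and fakeEmbed are hypothetical, an in-memory map stands in for the dedup check, and a slice stands in for the pgvector table.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"log"
	"sync"
)

// Job is one queued ingestion request (hypothetical type).
type Job struct {
	Content   string
	Timestamp string
}

// Ingestor is a hypothetical stand-in for the async ingestion worker pool.
type Ingestor struct {
	jobs   chan Job
	wg     sync.WaitGroup
	mu     sync.Mutex
	seen   map[string]bool                 // stands in for the SHA-256 dedup check
	stored []string                        // stands in for the pgvector table
	embed  func(string) ([]float32, error) // stands in for the Gemini client
}

func NewIngestor(workers int, embed func(string) ([]float32, error)) *Ingestor {
	in := &Ingestor{jobs: make(chan Job, 64), seen: map[string]bool{}, embed: embed}
	for i := 0; i < workers; i++ {
		in.wg.Add(1)
		go in.worker()
	}
	return in
}

// Enqueue mirrors step 2: it acknowledges before any embedding or DB work runs.
// It blocks only if the buffered queue is full.
func (in *Ingestor) Enqueue(j Job) string {
	in.jobs <- j
	return `{"status": "queued"}`
}

func (in *Ingestor) worker() {
	defer in.wg.Done()
	for j := range in.jobs {
		if err := in.process(j); err != nil {
			log.Printf("ingest failed: %v", err) // logged only, never returned to the caller
		}
	}
}

// process mirrors step 3: dedup check, then embed, then insert.
func (in *Ingestor) process(j Job) error {
	sum := sha256.Sum256([]byte(j.Content))
	key := hex.EncodeToString(sum[:])

	in.mu.Lock()
	dup := in.seen[key]
	in.seen[key] = true
	in.mu.Unlock()
	if dup {
		return nil // identical content was already ingested
	}
	if _, err := in.embed(j.Content); err != nil {
		in.mu.Lock()
		delete(in.seen, key) // allow a later retry of the same content
		in.mu.Unlock()
		return err
	}
	in.mu.Lock()
	in.stored = append(in.stored, j.Content) // the real step would INSERT the vector
	in.mu.Unlock()
	return nil
}

// Close stops accepting jobs and waits for the workers to drain the queue.
func (in *Ingestor) Close() {
	close(in.jobs)
	in.wg.Wait()
}

// StoredCount reports how many memories were persisted.
func (in *Ingestor) StoredCount() int {
	in.mu.Lock()
	defer in.mu.Unlock()
	return len(in.stored)
}

// fakeEmbed is a hypothetical embedder that fails on empty content.
func fakeEmbed(s string) ([]float32, error) {
	if s == "" {
		return nil, errors.New("empty content")
	}
	return []float32{float32(len(s))}, nil
}

func main() {
	in := NewIngestor(2, fakeEmbed)
	in.Enqueue(Job{Content: "first memory", Timestamp: "2024-01-01T00:00:00Z"})
	in.Enqueue(Job{Content: "first memory", Timestamp: "2024-01-02T00:00:00Z"})
	in.Close()
	log.Printf("stored %d memories", in.StoredCount())
}
```

The caller only ever sees the "queued" acknowledgement, so a failed embed call surfaces in the server log and nowhere else, which is the silent-failure trade-off of this design.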

Retrieval (read path)

  1. Agent calls retrieve_memories tool with query + optional limit
  2. MCP handler embeds the query via Gemini
  3. pgvector cosine similarity search (<=> operator)
  4. Results formatted with ID, date, content
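
To make step 3 concrete, the sketch below computes in memory what pgvector's <=> operator returns (cosine distance, that is, 1 minus cosine similarity) and ranks candidates by it. The Memory type, the retrieve and format helpers, the output layout, and the table and column names in exampleQuery are assumptions for illustration; the real search runs inside PostgreSQL.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// exampleQuery shows the general shape of a pgvector search.
// Table and column names are hypothetical.
const exampleQuery = `SELECT id, created_at, content FROM memories ORDER BY embedding <=> $1 LIMIT $2`

// Memory is a hypothetical row with its embedding.
type Memory struct {
	ID      int
	Date    string
	Content string
	Vec     []float64
}

// cosineDistance returns 1 - cosine similarity.
// It assumes equal-length, non-zero vectors.
func cosineDistance(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	return 1 - dot/(math.Sqrt(na)*math.Sqrt(nb))
}

// retrieve returns the memories closest to the query, nearest first.
// A limit <= 0 is treated as "no limit" (an assumption of this sketch).
func retrieve(query []float64, mems []Memory, limit int) []Memory {
	out := append([]Memory(nil), mems...) // copy so the input order is untouched
	sort.SliceStable(out, func(i, j int) bool {
		return cosineDistance(query, out[i].Vec) < cosineDistance(query, out[j].Vec)
	})
	if limit > 0 && limit < len(out) {
		out = out[:limit]
	}
	return out
}

// format renders one result with ID, date, and content (layout is hypothetical).
func format(m Memory) string {
	return fmt.Sprintf("[%d] %s: %s", m.ID, m.Date, m.Content)
}

func main() {
	mems := []Memory{
		{ID: 1, Date: "2024-01-01", Content: "orthogonal note", Vec: []float64{0, 1}},
		{ID: 2, Date: "2024-01-02", Content: "exact match", Vec: []float64{1, 0}},
		{ID: 3, Date: "2024-01-03", Content: "partial match", Vec: []float64{1, 1}},
	}
	for _, m := range retrieve([]float64{1, 0}, mems, 2) {
		fmt.Println(format(m))
	}
}
```

Because the distance depends only on the angle between vectors, scaling an embedding does not change its rank.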

Transport Modes

Mode    Env MCP_TRANSPORT    Endpoint        Use Case
stdio   "" (default)         stdin/stdout    CLI integration, Gemini CLI
HTTP    "http"               :PORT/mcp       Cluster deployment, remote access (SSE streaming disabled to avoid client hangs)
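
A minimal sketch of how an entrypoint might map MCP_TRANSPORT to a mode and endpoint is shown below. The resolveTransport helper, the PORT environment variable, and the choice to reject any value other than "" and "http" are assumptions; the actual logic in cmd/mnemosyne-mcp/main.go may differ.

```go
package main

import (
	"fmt"
	"os"
)

// resolveTransport maps the MCP_TRANSPORT value to a mode and endpoint.
// Rejecting unknown values is an assumption of this sketch.
func resolveTransport(value, port string) (mode, endpoint string, err error) {
	switch value {
	case "":
		return "stdio", "stdin/stdout", nil // default
	case "http":
		return "http", ":" + port + "/mcp", nil
	default:
		return "", "", fmt.Errorf("unsupported MCP_TRANSPORT %q", value)
	}
}

func main() {
	// PORT is a hypothetical variable name for the listen port.
	mode, endpoint, err := resolveTransport(os.Getenv("MCP_TRANSPORT"), os.Getenv("PORT"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("transport=%s endpoint=%s\n", mode, endpoint)
}
```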

Deployment Model

  • Docker image: tazzo/mnemosyne-mcp (multi-stage, distroless final)
  • Cluster: Flux GitOps via tazlab-k8s/apps/base/mnemosyne-mcp/
  • Namespace: tazlab-db
  • Image automation: mcp-<N>-<sha> tag pattern, numerical: asc
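
The tag pattern matters because tags must be compared by their numeric build counter, not as strings. The sketch below illustrates that selection rule in Go; it is not Flux code, the latestTag helper is hypothetical, and treating the <sha> part as lowercase hex is an assumption.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// tagPattern matches mcp-<N>-<sha>; a lowercase hex sha is an assumption.
var tagPattern = regexp.MustCompile(`^mcp-(\d+)-([0-9a-f]+)$`)

// latestTag returns the matching tag with the highest numeric <N>.
// Tags that do not match the pattern are ignored.
func latestTag(tags []string) (string, bool) {
	best, bestN, found := "", -1, false
	for _, t := range tags {
		m := tagPattern.FindStringSubmatch(t)
		if m == nil {
			continue
		}
		n, err := strconv.Atoi(m[1])
		if err != nil {
			continue // counter too large to parse
		}
		if n > bestN {
			best, bestN, found = t, n, true
		}
	}
	return best, found
}

func main() {
	tags := []string{"mcp-9-abc123", "mcp-10-def456", "latest"}
	tag, ok := latestTag(tags)
	fmt.Println(tag, ok)
}
```

A plain string sort would rank mcp-9-... above mcp-10-..., which is why the numeric ordering is needed.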

Key Design Decisions

  1. Async ingestion: callers don’t wait for embedding+DB write; trade-off is silent failure
  2. Dedup via SHA-256: prevents re-ingesting identical content; 10-min in-memory cache + DB unique constraint
  3. Auto-detect dimensions: on startup, queries existing table for vector dimensions; fallback to 3072 (TD-002 risk)
  4. Distroless final image: no shell, minimal attack surface; debugging requires ephemeral debug containers
  5. Disabled HTTP streaming (SSE): the standard Go MCP StreamableHTTPServer serves Server-Sent Events (SSE) via GET and JSON-RPC via POST. Because the SSE GET handler can block clients indefinitely when no events are sent, streaming is disabled with server.WithDisableStreaming(true). GET requests then return 405 Method Not Allowed, which prompts clients (such as the Antigravity CLI) to fall back immediately to plain JSON-RPC over POST and avoids hangs.
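
The observable effect of decision 5 can be illustrated with the standard library alone. The handler below is a hypothetical stand-in, not the MCP library's implementation: it rejects GET with 405 and answers everything else with a placeholder JSON-RPC response. The mcpHandler and statusFor names and the response body are assumptions.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// mcpHandler mimics an endpoint with streaming disabled:
// GET (the SSE channel) is refused, other requests get a JSON reply.
func mcpHandler() http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method == http.MethodGet {
			w.Header().Set("Allow", http.MethodPost)
			http.Error(w, "streaming disabled", http.StatusMethodNotAllowed)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"jsonrpc":"2.0","id":1,"result":{}}`) // placeholder response
	})
}

// statusFor sends one in-process request and returns the HTTP status code.
func statusFor(method string) int {
	body := strings.NewReader(`{"jsonrpc":"2.0","id":1,"method":"ping"}`)
	req := httptest.NewRequest(method, "/mcp", body)
	rec := httptest.NewRecorder()
	mcpHandler().ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println("GET  /mcp ->", statusFor(http.MethodGet))
	fmt.Println("POST /mcp ->", statusFor(http.MethodPost))
}
```

A client that probes with GET gets an immediate, unambiguous refusal instead of an open connection that never delivers an event.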

See Also