Mnemosyne MCP Server: Architecture
Level 2 (Topic) — Overall system architecture and component relationships.
Concept
mnemosyne-mcp-server is a semantic memory service that lets AI agents store, search, and retrieve structured knowledge using natural language. It implements the Model Context Protocol (MCP) so any MCP-compatible client (Gemini CLI, OpenCode, Pi) can interact with it as a tool provider.
The server sits between three layers:
- Caller (AI agent via MCP protocol): sends ingestion requests and search queries
- Embedding engine (Google Gemini `gemini-embedding-001`): converts text to 768-dimensional vectors
- Storage (PostgreSQL + pgvector): persists memories and performs cosine similarity search
Component Map
cmd/mnemosyne-mcp/main.go ← entrypoint, env vars, transport selection
internal/
├── mcp/mcp.go ← MCP server, tool registration, HTTP/stdio transport
├── logic/logic.go ← business logic, async ingestion worker pool
├── embedding/embedding.go ← Gemini embedding client (REST API)
└── db/db.go ← PostgreSQL + pgvector layer
Data Flow
Ingestion (write path)
- Agent calls the `ingest_memory` tool with `content` + `timestamp`
- MCP handler returns `{"status": "queued"}` immediately
- Background worker: SHA-256 dedup check → Gemini embed → pgvector INSERT
- Errors are logged but never reported back to the caller
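The write path above can be sketched with a buffered channel and a small worker pool. This is a minimal illustration under stated assumptions, not the server's actual code: the `Memory` and `Ingestor` types, the worker count, and the in-memory `stored` slice (standing in for the Gemini call and the pgvector INSERT) are all hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sync"
)

// Memory is a hypothetical ingestion request; the field names are assumptions.
type Memory struct {
	Content   string
	Timestamp string
}

// Ingestor queues memories and processes them on background workers.
type Ingestor struct {
	queue chan Memory
	wg    sync.WaitGroup

	mu     sync.Mutex
	seen   map[string]bool // content hashes already accepted
	stored []Memory        // stand-in for the pgvector table
}

// contentHash returns the hex-encoded SHA-256 digest used as the dedup key.
func contentHash(content string) string {
	sum := sha256.Sum256([]byte(content))
	return hex.EncodeToString(sum[:])
}

// NewIngestor starts the given number of workers reading from a buffered queue.
func NewIngestor(workers, buffer int) *Ingestor {
	in := &Ingestor{queue: make(chan Memory, buffer), seen: make(map[string]bool)}
	for i := 0; i < workers; i++ {
		in.wg.Add(1)
		go in.work()
	}
	return in
}

// Enqueue mirrors the handler: it returns a status without waiting for a worker.
// It blocks only if the buffer is full.
func (in *Ingestor) Enqueue(m Memory) string {
	in.queue <- m
	return `{"status": "queued"}`
}

// work drains the queue: dedup check first, then the (simulated) embed and insert.
func (in *Ingestor) work() {
	defer in.wg.Done()
	for m := range in.queue {
		h := contentHash(m.Content)
		in.mu.Lock()
		if in.seen[h] {
			in.mu.Unlock()
			continue // duplicate content: skip embedding and insert
		}
		in.seen[h] = true
		// A real worker would call the embedding API and INSERT here;
		// a failure would only be logged, never returned to the caller.
		in.stored = append(in.stored, m)
		in.mu.Unlock()
	}
}

// Close stops accepting work and waits for the queue to drain.
func (in *Ingestor) Close() {
	close(in.queue)
	in.wg.Wait()
}

// Stored reports how many unique memories were persisted.
func (in *Ingestor) Stored() int {
	in.mu.Lock()
	defer in.mu.Unlock()
	return len(in.stored)
}

func main() {
	in := NewIngestor(2, 16)
	fmt.Println(in.Enqueue(Memory{Content: "pgvector uses <=> for cosine distance", Timestamp: "2025-01-01T00:00:00Z"}))
	in.Enqueue(Memory{Content: "pgvector uses <=> for cosine distance", Timestamp: "2025-01-02T00:00:00Z"})
	in.Enqueue(Memory{Content: "distroless images have no shell", Timestamp: "2025-01-03T00:00:00Z"})
	in.Close()
	fmt.Println("stored:", in.Stored())
}
```

Because `Enqueue` returns before any worker runs, the caller cannot tell a duplicate from a fresh insert or a failed one, which is the silent-failure trade-off of this design.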
Retrieval (read path)
- Agent calls the `retrieve_memories` tool with `query` + optional `limit`
- MCP handler embeds the query via Gemini
- pgvector cosine similarity search (`<=>` operator)
- Results formatted with ID, date, content
Transport Modes
| Mode | Env MCP_TRANSPORT | Endpoint | Use Case |
|---|---|---|---|
| stdio | "" (default) | stdin/stdout | CLI integration, Gemini CLI |
| HTTP | "http" | :PORT/mcp | Cluster deployment, remote access (SSE streaming disabled to avoid client hangs) |
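The table can be read as a small selection function. The sketch below is one possible shape, assuming the port comes from a `PORT` environment variable and that unknown `MCP_TRANSPORT` values or a missing port are rejected; the `Transport` type and `SelectTransport` function are hypothetical names, and the actual behavior of `main.go` may differ.

```go
package main

import (
	"fmt"
	"os"
)

// Transport describes how the server would listen.
type Transport struct {
	Mode     string
	Endpoint string
}

// SelectTransport maps MCP_TRANSPORT and PORT values to a transport.
func SelectTransport(mode, port string) (Transport, error) {
	switch mode {
	case "":
		// Default: speak MCP over stdin/stdout.
		return Transport{Mode: "stdio", Endpoint: "stdin/stdout"}, nil
	case "http":
		if port == "" {
			// Assumption: no default port is invented here.
			return Transport{}, fmt.Errorf("PORT must be set for HTTP transport")
		}
		return Transport{Mode: "http", Endpoint: ":" + port + "/mcp"}, nil
	default:
		// Assumption: unknown values are treated as a configuration error.
		return Transport{}, fmt.Errorf("unsupported MCP_TRANSPORT %q", mode)
	}
}

func main() {
	tr, err := SelectTransport(os.Getenv("MCP_TRANSPORT"), os.Getenv("PORT"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("mode=%s endpoint=%s\n", tr.Mode, tr.Endpoint)
}
```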
Deployment Model
- Docker image: `tazzo/mnemosyne-mcp` (multi-stage, distroless final)
- Cluster: Flux GitOps via `tazlab-k8s/apps/base/mnemosyne-mcp/`
- Namespace: `tazlab-db`
- Image automation: `mcp-<N>-<sha>` tag pattern, `numerical: asc`
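The tag-selection rule can be illustrated in Go: filter tags that match the `mcp-<N>-<sha>` pattern, extract `N`, and pick the highest number, which is what a numerical ascending policy treats as the latest. This is a sketch of the idea rather than Flux's implementation; the regular expression, and in particular the assumption that `<sha>` is lowercase hex, are illustrative.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// tagPattern captures the build number N; the hex-only sha is an assumption.
var tagPattern = regexp.MustCompile(`^mcp-(\d+)-([0-9a-f]+)$`)

// LatestTag returns the matching tag with the highest build number N.
// The second result is false when no tag matches the pattern.
func LatestTag(tags []string) (string, bool) {
	best, bestN, found := "", -1, false
	for _, tag := range tags {
		m := tagPattern.FindStringSubmatch(tag)
		if m == nil {
			continue // not an automation tag, e.g. "latest"
		}
		n, err := strconv.Atoi(m[1])
		if err != nil {
			continue // number too large to parse; ignore the tag
		}
		if !found || n > bestN {
			best, bestN, found = tag, n, true
		}
	}
	return best, found
}

func main() {
	tags := []string{"mcp-9-abc1234", "latest", "mcp-10-def5678", "mcp-2-0a1b2c3"}
	if tag, ok := LatestTag(tags); ok {
		fmt.Println(tag)
	}
}
```

Comparing `N` numerically matters: a plain string sort would rank `mcp-9-…` above `mcp-10-…`.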
Key Design Decisions
- Async ingestion: callers don’t wait for embedding+DB write; trade-off is silent failure
- Dedup via SHA-256: prevents re-ingesting identical content; 10-min in-memory cache + DB unique constraint
- Auto-detect dimensions: on startup, queries existing table for vector dimensions; fallback to 3072 (TD-002 risk)
- Distroless final image: no shell, minimal attack surface; debugging requires ephemeral debug containers
- Disabled HTTP Streaming (SSE): The standard Go MCP `StreamableHTTPServer` uses Server-Sent Events (SSE) via GET and JSON-RPC via POST. Since the SSE GET handler can block clients indefinitely if no events are sent, streaming has been disabled with `server.WithDisableStreaming(true)`. GET requests return `405 Method Not Allowed`, prompting clients (like the Antigravity CLI) to immediately fall back to the standard JSON-RPC over POST transport, avoiding hangs.
See Also
- Parent hub: mnemosyne-mcp-server
- Sibling topics: MCP Tools, Embedding Model, Async Ingestion, Database Schema
- Details: Ingest Memory Detail, Deployment Detail
- Debt: Known Issues