Mnemosyne MCP Server: Embedding Model

Level 2 (Topic) — Gemini embedding integration and dimension handling.

Concept

Mnemosyne converts text into vector embeddings using Google’s gemini-embedding-001 model via the Generative Language REST API. These vectors enable semantic similarity search in PostgreSQL via pgvector.

Model Details

Property            Value
Model               gemini-embedding-001
API                 generativelanguage.googleapis.com
Output dimensions   768
HTTP timeout        15 seconds
Auth                GEMINI_API_KEY environment variable

How It Works

  1. internal/embedding/embedding.go sends a POST request to the Gemini API
  2. Request body: {"content": {"parts": [{"text": "<input>"}]}}
  3. Response contains embedding.values — a []float32 vector
  4. Vector is passed to the DB layer for storage or comparison
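
The request and response shapes in steps 2 and 3 can be sketched as Go types. This is a minimal illustration, not the contents of internal/embedding/embedding.go: the type and function names (embedRequest, embedResponse, buildBody, parseValues) are hypothetical, and only the JSON field names come from the steps above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical types mirroring the JSON shapes described above.
type part struct {
	Text string `json:"text"`
}

type content struct {
	Parts []part `json:"parts"`
}

type embedRequest struct {
	Content content `json:"content"`
}

type embedResponse struct {
	Embedding struct {
		Values []float32 `json:"values"`
	} `json:"embedding"`
}

// buildBody serializes the input text into the embedContent request body.
func buildBody(text string) ([]byte, error) {
	return json.Marshal(embedRequest{Content: content{Parts: []part{{Text: text}}}})
}

// parseValues extracts embedding.values from a raw response body and
// rejects responses that carry no vector.
func parseValues(raw []byte) ([]float32, error) {
	var resp embedResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return nil, fmt.Errorf("decode embedding response: %w", err)
	}
	if len(resp.Embedding.Values) == 0 {
		return nil, fmt.Errorf("response contains no embedding values")
	}
	return resp.Embedding.Values, nil
}

func main() {
	body, err := buildBody("hello world")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))

	// A shortened sample response; a real vector would be much longer.
	vec, err := parseValues([]byte(`{"embedding":{"values":[0.1,0.2,0.3]}}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(len(vec))
}
```

Treating an empty values array as an error is a design choice of this sketch; it keeps a malformed response from reaching the DB layer as a zero-length vector.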

Dimension Mismatch Risk (TD-002)

The DB layer (internal/db/db.go) auto-detects vector dimensions on startup by querying existing rows. If the table is empty, it falls back to 3072 dimensions. The embedding model produces 768-dimensional vectors.

Result: on a fresh database, the memories table is created with vector(3072) columns, and every 768-dim embedding insertion fails with a pgvector dimension error.

Workaround: if the table already contains at least one memory with a correct 768-dimensional embedding, auto-detection picks up 768 and inserts succeed.

Code paths:

  • internal/db/db.go: autoDetectDimensions() function, fallback constant
  • internal/embedding/embedding.go: DefaultModel const

No Model Override

The embedding model is a Go const with no environment variable override. Changing the model requires:

  1. Source code change in internal/embedding/embedding.go
  2. Dimension alignment with the DB table
  3. Re-ingestion of all existing memories (vectors produced by the old model are not comparable with vectors from the new one)
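
For comparison, an environment override is a small change in isolation. The following is a hypothetical sketch of what one could look like, not existing behaviour: the variable name MNEMOSYNE_EMBEDDING_MODEL and the function modelFromEnv are invented for illustration, and the value of DefaultModel is assumed from the Model Details section.

```go
package main

import (
	"fmt"
	"os"
)

// DefaultModel mirrors the const named above; its value here is assumed.
const DefaultModel = "gemini-embedding-001"

// modelFromEnv returns the override when the (hypothetical) variable is
// set to a non-empty value, and DefaultModel otherwise. The lookup
// function is injected so the logic can be exercised without touching
// the real process environment.
func modelFromEnv(getenv func(string) string) string {
	if m := getenv("MNEMOSYNE_EMBEDDING_MODEL"); m != "" {
		return m
	}
	return DefaultModel
}

func main() {
	fmt.Println(modelFromEnv(os.Getenv))
}
```

An override like this would only remove step 1. Dimension alignment and re-ingestion would still be required whenever the selected model changes.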

API Call Pattern

POST https://generativelanguage.googleapis.com/v1beta/models/gemini-embedding-001:embedContent?key=<GEMINI_API_KEY>
Content-Type: application/json

{
  "content": {
    "parts": [{"text": "<text to embed>"}]
  }
}

See Also