Architecture
Engram is a dual-memory AI system modeled after how humans store and retrieve information.
Core Concept
Humans use two complementary memory systems:
- Episodic memory — "What happened?" — specific events, timestamped experiences
- Semantic memory — "What do I know?" — facts, concepts, relationships
Engram mirrors this with a vector store (episodic) and a knowledge graph (semantic), unified by an LLM reasoning engine.
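The split between the two stores can be illustrated with a toy model. This is a minimal sketch and not Engram's actual code: `Episode`, `ToyEpisodicStore`, and `ToySemanticGraph` are hypothetical names, brute-force cosine similarity stands in for a vector database, and a set of triples stands in for the knowledge graph.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
import math


@dataclass
class Episode:
    """One timestamped event ("What happened?")."""
    text: str
    embedding: list[float]
    timestamp: datetime


@dataclass
class ToyEpisodicStore:
    """Hypothetical stand-in for a vector store: brute-force cosine search."""
    episodes: list[Episode] = field(default_factory=list)

    def add(self, episode: Episode) -> None:
        self.episodes.append(episode)

    def search(self, query: list[float], k: int = 3) -> list[Episode]:
        def cosine(a: list[float], b: list[float]) -> float:
            dot = sum(x * y for x, y in zip(a, b))
            norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
            return dot / norm if norm else 0.0

        # Most similar episodes first.
        ranked = sorted(self.episodes, key=lambda e: cosine(query, e.embedding), reverse=True)
        return ranked[:k]


@dataclass
class ToySemanticGraph:
    """Hypothetical stand-in for a knowledge graph ("What do I know?")."""
    triples: set[tuple[str, str, str]] = field(default_factory=set)

    def upsert(self, subject: str, relation: str, obj: str) -> None:
        # A set makes repeated upserts of the same fact idempotent.
        self.triples.add((subject, relation, obj))

    def neighbors(self, entity: str) -> list[tuple[str, str, str]]:
        return sorted(t for t in self.triples if entity in (t[0], t[2]))


if __name__ == "__main__":
    store = ToyEpisodicStore()
    store.add(Episode("deploy failed", [1.0, 0.0], datetime.now(timezone.utc)))
    graph = ToySemanticGraph()
    graph.upsert("api", "depends_on", "postgres")
    print(store.search([0.9, 0.1], k=1)[0].text, graph.neighbors("api"))
```

The episodic side answers "which past events look like this query?", while the graph side answers "what is connected to this entity?". A reasoning step would then combine both kinds of hits.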
System Diagram
flowchart TD
subgraph Interfaces
CLI["CLI (Typer)"]
MCP["MCP Server (stdio)"]
HTTP["HTTP API (FastAPI)"]
WS["WebSocket /ws"]
end
subgraph Agents
CC["Claude Code"]
OC["OpenClaw"]
CU["Cursor"]
ANY["Any MCP Client"]
end
CC & OC & CU & ANY --> MCP
Interfaces --> Auth
Auth["Auth Middleware\n(JWT + RBAC, optional)"]
Auth --> Tenant["TenantContext\n(ContextVar)"]
Tenant --> RP["Recall Pipeline"]
RP --> EP["EpisodicStore\n(Qdrant)"]
RP --> SG["SemanticGraph\n(NetworkX + SQLite/PG)"]
RP --> Fed["Federation Layer"]
EP & SG --> RE["Reasoning Engine\n(Gemini via litellm)"]
EP --> Cache["Redis Cache\n(optional)"]
WS --> EB["Event Bus\n(push events)"]
subgraph Fed["Federated Knowledge"]
M0["mem0"]
LR["LightRAG"]
GR["Graphiti"]
Custom["REST / File / PG / MCP"]
end
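The federation branch of the diagram fans a query out to external providers. The sketch below shows one way such a layer could be shaped. It is an assumption throughout: `KnowledgeProvider`, `InMemoryProvider`, `BrokenProvider`, and `federated_search` are hypothetical names, and the choice to skip a failing provider is illustrative, not documented behavior.

```python
from typing import Protocol


class KnowledgeProvider(Protocol):
    """Hypothetical adapter contract; the real interface may differ."""
    name: str

    def search(self, query: str, limit: int) -> list[dict]: ...


class InMemoryProvider:
    """Toy provider that substring-matches over a list of documents."""

    def __init__(self, name: str, documents: list[str]) -> None:
        self.name = name
        self.documents = documents

    def search(self, query: str, limit: int) -> list[dict]:
        needle = query.lower()
        hits = [d for d in self.documents if needle in d.lower()]
        return [{"source": self.name, "text": d} for d in hits[:limit]]


class BrokenProvider:
    """Toy provider that always fails, to exercise the error path."""
    name = "broken"

    def search(self, query: str, limit: int) -> list[dict]:
        raise ConnectionError("provider unreachable")


def federated_search(
    providers: list[KnowledgeProvider], query: str, limit: int = 5
) -> list[dict]:
    results: list[dict] = []
    for provider in providers:
        try:
            results.extend(provider.search(query, limit))
        except Exception:
            # Assumption: a failing external provider is skipped, not fatal.
            continue
    return results
```

Tagging each hit with its `source` keeps provenance visible when results from several providers are merged.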
Layers
| Layer | Component | Technology |
|---|---|---|
| Interface | CLI, MCP, HTTP API, WebSocket | Typer, FastMCP, FastAPI |
| Auth | JWT + API keys, RBAC | python-jose |
| Tenancy | ContextVar propagation | Python contextvars |
| Recall | Pipeline: decide → resolve → search → fuse | Custom |
| Episodic | Vector store | Qdrant (embedded or server) |
| Semantic | Knowledge graph | NetworkX + SQLite/PostgreSQL |
| Reasoning | LLM synthesis | Gemini via litellm |
| Capture | Session watchers | inotify/watchdog |
| Federation | External providers | REST, File, PG, MCP adapters |
| Cache | Result caching | Redis (optional) |
| Observability | Tracing + audit | OpenTelemetry, JSONL |
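The Tenancy row relies on Python's `contextvars`, which lets a tenant ID follow a request through the call stack without being passed as an argument. The following is a minimal sketch of that mechanism using only the standard library. The names `_current_tenant`, `get_tenant`, `search_store`, and `handle_request` are hypothetical and not taken from Engram.

```python
import asyncio
from contextvars import ContextVar

# Hypothetical variable name; "default" applies when no tenant was set.
_current_tenant: ContextVar[str] = ContextVar("current_tenant", default="default")


def get_tenant() -> str:
    return _current_tenant.get()


async def search_store(query: str) -> str:
    # Deep in the call stack: no tenant parameter, yet the right tenant is visible.
    await asyncio.sleep(0)  # yield so concurrent requests interleave
    return f"{get_tenant()}:{query}"


async def handle_request(tenant_id: str, query: str) -> str:
    token = _current_tenant.set(tenant_id)
    try:
        return await search_store(query)
    finally:
        # Restore the previous value once the request is done.
        _current_tenant.reset(token)


async def main() -> list[str]:
    # gather() runs each coroutine in its own task, and each task
    # gets its own copy of the context, so tenants do not leak.
    return await asyncio.gather(
        handle_request("acme", "q1"),
        handle_request("globex", "q2"),
    )


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Because each asyncio task runs in a copy of the context, two concurrent requests see their own tenant even though they share the same module-level variable.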
Data Flow: Recall
sequenceDiagram
participant Agent
participant Pipeline as Recall Pipeline
participant Episodic as EpisodicStore
participant Graph as SemanticGraph
participant LLM as Reasoning Engine
Agent->>Pipeline: recall("production incidents last week")
Pipeline->>Pipeline: Query Decision (trivial check)
Pipeline->>Pipeline: Temporal Resolution ("last week" → date range)
Pipeline->>Pipeline: Entity Resolution (pronouns → named entities)
par Parallel search
Pipeline->>Episodic: vector similarity search
Pipeline->>Graph: entity + keyword search
end
Episodic-->>Pipeline: episodic results
Graph-->>Pipeline: graph results
Pipeline->>Pipeline: Dedup + composite scoring
Pipeline->>LLM: fuse context (think mode)
LLM-->>Agent: synthesized answer
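The recall sequence above can be sketched in plain Python. This is an illustration under several assumptions, not the actual pipeline. All function names are hypothetical, the search results are canned examples, "last week" is read as the previous Monday to Sunday, and keeping the best score per ID is a simple stand-in for composite scoring. Entity resolution and the LLM fusion step are omitted.

```python
import asyncio
from datetime import date, timedelta

TRIVIAL_QUERIES = {"hi", "hello", "thanks", "ok"}  # hypothetical list


def is_trivial(query: str) -> bool:
    """Query decision: skip recall for small talk."""
    return query.strip().lower() in TRIVIAL_QUERIES


def resolve_temporal(query: str, today: date):
    """Map "last week" to a (start, end) date range; None if no phrase matches."""
    if "last week" in query.lower():
        this_monday = today - timedelta(days=today.weekday())
        start = this_monday - timedelta(days=7)
        return start, start + timedelta(days=6)
    return None


async def search_episodic(query: str) -> list[dict]:
    # Canned results standing in for a vector similarity search.
    return [
        {"id": "e1", "text": "DB outage on Tuesday", "score": 0.9},
        {"id": "shared", "text": "Deploy rollback", "score": 0.6},
    ]


async def search_graph(query: str) -> list[dict]:
    # Canned results standing in for an entity + keyword search.
    return [
        {"id": "shared", "text": "Deploy rollback", "score": 0.8},
        {"id": "g1", "text": "api depends_on postgres", "score": 0.5},
    ]


def dedup_and_rank(*result_sets: list[dict]) -> list[dict]:
    """Keep the best-scoring hit per id, then sort by score descending."""
    best: dict[str, dict] = {}
    for results in result_sets:
        for hit in results:
            previous = best.get(hit["id"])
            if previous is None or hit["score"] > previous["score"]:
                best[hit["id"]] = hit
    return sorted(best.values(), key=lambda h: h["score"], reverse=True)


async def recall(query: str, today: date) -> dict:
    if is_trivial(query):
        return {"skipped": True, "date_range": None, "hits": []}
    date_range = resolve_temporal(query, today)
    # Both stores are queried concurrently, as in the "par" block above.
    episodic, graph = await asyncio.gather(search_episodic(query), search_graph(query))
    return {"skipped": False, "date_range": date_range, "hits": dedup_and_rank(episodic, graph)}


if __name__ == "__main__":
    print(asyncio.run(recall("production incidents last week", date.today())))
```

In a fuller version, the resolved date range would be passed to both searches as a filter, and the ranked hits would be handed to the reasoning engine as context.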
Data Flow: Ingestion
sequenceDiagram
participant Source as Agent / Watcher
participant Gate as Entity Gate
participant Extractor as Entity Extractor
participant Classifier as Memory Classifier
participant Episodic as EpisodicStore
participant Graph as SemanticGraph
Source->>Gate: ingest(messages)
Gate->>Extractor: extract entities (LLM)
Extractor-->>Gate: entities found?
alt No entities
Gate-->>Source: skip (noise filtered)
else Entities found
Gate->>Classifier: classify memory type
Classifier-->>Gate: fact / decision / preference / ...
Gate->>Episodic: store with embedding
Gate->>Graph: upsert entities + relations
Gate-->>Source: stored
end
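The gating logic in this sequence can be sketched as follows. This is a toy version under stated assumptions: the diagram's extractor is an LLM, whereas here `extract_entities` is a hypothetical lookup against a fixed vocabulary, `classify` uses keyword rules, and the stores are a plain list and dict. Embedding is left out.

```python
import re

# Hypothetical vocabulary standing in for LLM entity extraction.
KNOWN_ENTITIES = {"postgresql", "qdrant", "redis"}


def extract_entities(text: str) -> list[str]:
    words = re.findall(r"[A-Za-z0-9]+", text.lower())
    return sorted(set(words) & KNOWN_ENTITIES)


def classify(text: str) -> str:
    """Toy keyword classifier; the label set here is only a subset."""
    lowered = text.lower()
    if "decided" in lowered:
        return "decision"
    if "prefer" in lowered:
        return "preference"
    return "fact"


def ingest(text: str, episodic: list[dict], graph: dict[str, list[str]]) -> str:
    entities = extract_entities(text)
    if not entities:
        # No entities: treat the message as noise and store nothing.
        return "skipped"
    memory_type = classify(text)
    episodic.append({"text": text, "type": memory_type, "entities": entities})
    for entity in entities:
        # Upsert: create the entity on first sight, append afterwards.
        graph.setdefault(entity, []).append(text)
    return "stored"


if __name__ == "__main__":
    episodic_store: list[dict] = []
    graph_store: dict[str, list[str]] = {}
    print(ingest("ok thanks", episodic_store, graph_store))
    print(ingest("We decided to move sessions to Redis", episodic_store, graph_store))
```

The gate runs before any write, so a message with no entities never reaches either store.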
Component Deep Dives
- Episodic Memory — Qdrant vector store, decay, scoring
- Semantic Graph — NetworkX graph, SQLite/PG backend, query DSL
- Recall Pipeline — Full pipeline walkthrough
- Entity-Gated Ingestion — Why and how entities gate storage