The First AI Memory That Thinks Like a Brain
5-layer cognitive architecture. Zero-LLM ingestion pipeline. Self-improving memory that decays, consolidates, and reasons — like a real brain. The memory OS that every other system forgot to build.
npm install mnemosy-ai
Not a Demo. Not a Roadmap. Running Right Now.
Mnemosyne powers a 10-machine AI fleet with sub-200ms retrieval latency. Every feature on this page is in production.
AI Agents Have Amnesia. Mnemosyne Gives Them a Brain.
Every other memory system is a vector database with an LLM wrapper. Mnemosyne is different.
✘ Every Other Memory System
- ✘ Burns LLM tokens on every memory stored (~$0.01 each)
- ✘ Retrieves by single signal only (cosine similarity)
- ✘ No knowledge graph, or paywalled at $249/mo
- ✘ Single-agent only — no multi-agent collaboration
- ✘ Static storage — quality degrades over time
- ✘ No activation decay, no consolidation, no reasoning
✔ Mnemosyne
- ✔ Memories persist across sessions, restarts, and context window resets
- ✔ Retrieval uses 5 signals weighted by detected intent
- ✔ Knowledge grows through auto-linking, graph expansion, and cross-agent corroboration
- ✔ Quality improves via reinforcement learning and 4-phase consolidation
- ✔ Agents collaborate through a real-time memory mesh with pub/sub
- ✔ Zero LLM calls during ingestion — $0 per memory, works offline
5 Lines to Cognitive Memory
Drop-in TypeScript SDK. Zero LLM required for the ingestion pipeline. Your agent has a brain in under a minute.
Only hard requirement: Qdrant (docker run -d -p 6333:6333 qdrant/qdrant). Redis and FalkorDB are optional.
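The published SDK's exact call signatures aren't reproduced on this page, so the snippet below is a toy in-memory stand-in, not the real mnemosy-ai API. It only illustrates the store/recall usage shape; every name here is hypothetical.

```typescript
// Toy stand-in for the store/recall shape — NOT the real mnemosy-ai API.
// Real recall uses 5-signal, intent-aware scoring; this uses naive keywords.
type Memory = { id: number; text: string; storedAt: number };

class ToyMnemosyne {
  private memories: Memory[] = [];
  private nextId = 1;

  store(text: string): Memory {
    const mem = { id: this.nextId++, text, storedAt: Date.now() };
    this.memories.push(mem);
    return mem;
  }

  recall(query: string): Memory[] {
    const terms = query.toLowerCase().split(/\s+/);
    return this.memories.filter((m) =>
      terms.some((t) => m.text.toLowerCase().includes(t))
    );
  }
}

// The five-line usage pattern the real SDK is built around:
const brain = new ToyMnemosyne();
brain.store("Deploy service via blue-green rollout on fleet-07");
const hits = brain.recall("deploy");
```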
5-Layer Cognitive Architecture
Inspired by how the human brain organizes, retrieves, and strengthens memories over time. Each layer is independently toggleable.
graph TB
subgraph L5["L5: Self-Improvement"]
RL["Reinforcement Learning"]
AC["Active Consolidation"]
FR["Flash Reasoning"]
ToMA["Agent Awareness"]
end
subgraph L4["L4: Cognitive"]
AD["Activation Decay"]
MS["Multi-Signal Scoring"]
IA["Intent-Aware Retrieval"]
DR["Diversity Reranking"]
end
subgraph L3["L3: Knowledge Graph"]
TG["Temporal Graph"]
AL["Auto Linking"]
PF["Path Finding"]
TR["Timeline Reconstruction"]
end
subgraph L2["L2: Pipeline"]
EP["Extraction Pipeline"]
TC["Type Classifier"]
DM["Dedup and Merge"]
SF["Security Filter"]
end
subgraph L1["L1: Infrastructure"]
VDB["Qdrant Vector DB"]
GDB["FalkorDB Graph"]
Cache["2-Tier Cache"]
PS["Redis Pub/Sub"]
end
L5 --> L4
L4 --> L3
L3 --> L2
L2 --> L1
style L5 fill:#1e1235,stroke:#a855f7,stroke-width:2px,color:#e2e8f0
style L4 fill:#0c1a1e,stroke:#00f0ff,stroke-width:2px,color:#e2e8f0
style L3 fill:#15102a,stroke:#a855f7,stroke-width:1px,color:#e2e8f0
style L2 fill:#0b171a,stroke:#00f0ff,stroke-width:1px,color:#e2e8f0
style L1 fill:#151520,stroke:#64748b,stroke-width:1px,color:#e2e8f0
33 Features Across 5 Layers
Every feature is independently toggleable — start simple, enable progressively.
Vector Storage
768-dim embeddings on Qdrant with HNSW indexing. Sub-linear search scaling to billions of vectors.
2-Tier Cache
L1 in-memory (50 entries, 5min TTL) + L2 Redis (1hr TTL). Sub-10ms cached recall.
Pub/Sub Broadcast
Real-time memory events across your entire agent mesh via Redis channels.
Knowledge Graph
Temporal entity graph on FalkorDB with auto-linking, path finding, and timeline reconstruction.
Bi-Temporal Model
Every memory tracks eventTime + ingestedAt for temporal queries.
Soft-Delete
Memories are never physically deleted. Full audit trails and recovery at any time.
12-Step Ingestion
Security → embed → dedup → extract → classify → score → link → graph → broadcast.
Zero-LLM Pipeline
Classification, extraction, urgency detection, conflict resolution — all algorithmic. $0 per memory.
Security Filter
3-tier classification (public/private/secret). Blocks API keys, credentials, and private keys.
Smart Dedup & Merge
Cosine similarity ≥ 0.92 triggers a duplicate merge; 0.70–0.92 triggers a conflict alert broadcast.
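The dedup decision is a threshold on cosine similarity to the nearest stored memory. A sketch of the routing logic, using the thresholds above (function names are illustrative):

```typescript
// Cosine similarity between two embedding vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

type DedupAction = "merge" | "conflict-alert" | "store-new";

// Route an incoming memory by similarity to its nearest neighbor.
function routeBySimilarity(sim: number): DedupAction {
  if (sim >= 0.92) return "merge";          // near-duplicate: merge
  if (sim >= 0.70) return "conflict-alert"; // related but divergent: broadcast
  return "store-new";                       // distinct memory
}
```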
Entity Extraction
Automatic identification of people, machines, IPs, dates, technologies, URLs — zero LLM.
7-Type Taxonomy
Episodic, semantic, preference, relationship, procedural, profile, core — classified algorithmically.
Temporal Queries
"What was X connected to as of date Y?" — relationships carry since timestamps.
Auto-Linking
New memories automatically discover and link to related memories. Bidirectional, Zettelkasten-style.
Path Finding
Shortest-path queries between any two entities with configurable max depth.
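Shortest-path with a configurable max depth is classic breadth-first search with a hop cap. A minimal sketch (the adjacency-list graph representation is an assumption, not FalkorDB's storage model):

```typescript
// Breadth-first shortest path between two entities, capped at maxDepth hops.
type Graph = Map<string, string[]>; // adjacency list

function shortestPath(g: Graph, from: string, to: string, maxDepth = 4): string[] | null {
  if (from === to) return [from];
  const parent = new Map<string, string>([[from, from]]);
  let frontier = [from];
  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const node of frontier) {
      for (const nbr of g.get(node) ?? []) {
        if (parent.has(nbr)) continue; // already visited
        parent.set(nbr, node);
        if (nbr === to) {
          // reconstruct the path by walking parents back to the start
          const path = [to];
          let cur = to;
          while (cur !== from) { cur = parent.get(cur)!; path.push(cur); }
          return path.reverse();
        }
        next.push(nbr);
      }
    }
    frontier = next;
  }
  return null; // no path within maxDepth hops
}
```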
Timeline Reconstruction
Ordered history of all memories mentioning a given entity.
Depth-Limited Traversal
Configurable graph exploration (default: 2 hops) balancing relevance vs. noise.
Activation Decay
Logarithmic decay model — critical memories persist for months; core and procedural memories are immune.
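One plausible shape for a logarithmic decay curve is sketched below. The constants are illustrative assumptions, not Mnemosyne's tuned parameters; only the immunity of core/procedural types comes from the text.

```typescript
// Illustrative logarithmic activation decay. Constants are assumptions,
// not Mnemosyne's actual values. Immune types never decay.
const IMMUNE_TYPES = new Set(["core", "procedural"]);

function activation(type: string, importance: number, ageDays: number): number {
  if (IMMUNE_TYPES.has(type)) return 1.0;
  // log-shaped decay: slow early drop-off, long tail for important memories
  const rate = 1 - importance * 0.9; // higher importance → slower decay
  return 1 / (1 + rate * Math.log1p(ageDays));
}
```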
Multi-Signal Scoring
5 signals: similarity, recency, importance×confidence, frequency, type relevance.
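A weighted blend of those five signals might look like the sketch below. The weights are placeholder assumptions; per the retrieval description, the real engine reweights them based on the detected query intent.

```typescript
// Illustrative 5-signal blend. Weights are placeholders — Mnemosyne
// adjusts them per detected query intent.
type Signals = {
  similarity: number;    // cosine similarity, 0..1
  recency: number;       // decayed recency score, 0..1
  importance: number;    // 0..1
  confidence: number;    // 0..1
  frequency: number;     // normalized retrieval frequency, 0..1
  typeRelevance: number; // memory-type match to query intent, 0..1
};

const W = { similarity: 0.4, recency: 0.2, impConf: 0.2, frequency: 0.1, typeRelevance: 0.1 };

function score(s: Signals): number {
  return (
    W.similarity * s.similarity +
    W.recency * s.recency +
    W.impConf * s.importance * s.confidence + // importance × confidence composite
    W.frequency * s.frequency +
    W.typeRelevance * s.typeRelevance
  );
}
```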
Intent-Aware Retrieval
Auto-detects query intent (factual, temporal, procedural, preference, exploratory).
Diversity Reranking
Cluster detection, overlap penalty, type diversity — prevents echo chambers in results.
4-Tier Confidence
Mesh Fact ≥0.85, Grounded 0.65-0.84, Inferred 0.40-0.64, Uncertain <0.40.
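The tier boundaries above map directly to a bucketing function:

```typescript
// Map a confidence score to its tier using the published boundaries:
// Mesh Fact ≥0.85, Grounded 0.65–0.84, Inferred 0.40–0.64, Uncertain <0.40.
type Tier = "mesh-fact" | "grounded" | "inferred" | "uncertain";

function confidenceTier(c: number): Tier {
  if (c >= 0.85) return "mesh-fact"; // corroborated across the mesh
  if (c >= 0.65) return "grounded";
  if (c >= 0.40) return "inferred";
  return "uncertain";
}
```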
Priority Scoring
Urgency × Domain composite — critical+technical = 1.0, background+general = 0.2.
Reinforcement Learning
Feedback loop tracks usefulness, auto-promotes memories with >0.7 ratio after 3+ retrievals.
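The promotion rule is a thresholded usefulness ratio; a sketch using the numbers above (field names are illustrative):

```typescript
// Promote a memory once it has 3+ retrievals and a usefulness ratio > 0.7.
type FeedbackStats = { retrievals: number; markedUseful: number };

function shouldPromote(s: FeedbackStats): boolean {
  if (s.retrievals < 3) return false; // not enough evidence yet
  return s.markedUseful / s.retrievals > 0.7;
}
```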
Active Consolidation
4-phase autonomous maintenance: contradiction detection, dedup merge, promotion, demotion.
Flash Reasoning
BFS traversal through linked memory graphs, reconstructs multi-step logic chains.
Theory of Mind (ToMA)
"What does Agent-B know about X?" — knowledge gap analysis across the mesh.
Cross-Agent Synthesis
3+ agents agree on a fact → auto-synthesized into fleet-level insight.
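The corroboration rule is a quorum count over distinct agents. A minimal sketch (the 3-agent quorum comes from the text; the data shapes are assumptions):

```typescript
// Facts asserted by 3+ distinct agents are synthesized into fleet insights.
type Claim = { agentId: string; fact: string };

function fleetInsights(claims: Claim[], quorum = 3): string[] {
  const byFact = new Map<string, Set<string>>();
  for (const { agentId, fact } of claims) {
    if (!byFact.has(fact)) byFact.set(fact, new Set());
    byFact.get(fact)!.add(agentId); // count distinct agents per fact
  }
  return [...byFact.entries()]
    .filter(([, agents]) => agents.size >= quorum)
    .map(([fact]) => fact);
}
```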
Proactive Recall
Generates speculative queries from incoming prompts, injects context before the agent asks.
Session Survival
Snapshot/recovery across context window resets — zero discontinuity.
Observational Memory
Compresses raw conversation streams into structured, high-signal memory cells.
Procedural Memory
Learned procedures as first-class objects, immune to decay, shared across the mesh.
Mesh Sync
Named, versioned shared state blocks with real-time broadcast propagation.
Research-Grade: 10 Capabilities From Research Papers
These capabilities exist almost exclusively in academic literature and closed research labs. Mnemosyne ships all 10 as deployable infrastructure.
33 Features. 28 That Nobody Else Has.
Every feature listed is in production. Not planned. Not in beta. Shipping.
| Feature | Mnemosyne | Mem0 | Zep | Cognee | LangMem | Letta |
|---|---|---|---|---|---|---|
| Pipeline & Ingestion | | | | | | |
| Zero-LLM Ingestion Pipeline | ✓ | ✗ LLM | ✗ LLM | ✗ LLM | ✗ LLM | ✗ LLM |
| 12-Step Structured Pipeline | ✓ | Partial | Partial | ✓ | ✗ | ✗ |
| Security Filter (Secret Blocking) | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Smart Dedup with Semantic Merge | ✓ | ✓ | ✗ | ✓ | ✗ | ✗ |
| Conflict Detection & Alerts | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| 7-Type Memory Taxonomy | ✓ | ✗ | ✗ | ✗ | ✗ | Partial |
| Entity Extraction (Zero-LLM) | ✓ | LLM-based | LLM-based | LLM-based | ✗ | ✗ |
| Cognitive Features | | | | | | |
| Activation Decay Model | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Multi-Signal Scoring (5 Signals) | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Intent-Aware Retrieval | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Diversity Reranking | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Flash Reasoning Chains | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Reinforcement Learning | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Active Consolidation (4-Phase) | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Proactive Recall | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Session Survival | ✓ | ✗ | ✗ | ✗ | ✗ | ✓ |
| Observational Memory | ✓ | ✗ | ✓ | ✗ | ✗ | ✗ |
| Knowledge Graph | | | | | | |
| Built-in Knowledge Graph | ✓ | $249/mo | ✗ | ✓ | ✗ | ✗ |
| Temporal Graph Queries | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Auto-Linking (Bidirectional) | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Path Finding Between Entities | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Timeline Reconstruction | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Bi-Temporal Data Model | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Multi-Agent | | | | | | |
| Real-Time Broadcast (Pub/Sub) | ✓ | Enterprise | ✗ | ✗ | ✗ | ✗ |
| Theory of Mind (ToMA) | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Cross-Agent Synthesis | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Shared State Blocks (Mesh Sync) | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Infrastructure | | | | | | |
| 2-Tier Caching (L1 + L2) | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Soft-Delete Architecture | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Procedural Memory (Skill Library) | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
Store. Recall. Learn. Collaborate.
Four operations that transform stateless agents into intelligent, collaborative systems.
Store
Input goes through a 12-step zero-LLM pipeline: security filter, embedding, dedup, extraction, classification, scoring, linking, graph ingest, and broadcast. All in <50ms.
Recall
Queries hit the 2-tier cache, then vector search with 5-signal intent-aware scoring, diversity reranking, graph enrichment, and flash reasoning chains.
Learn
Feedback signals promote useful memories and flag poor ones. Active consolidation merges duplicates, resolves contradictions, and promotes popular knowledge.
Collaborate
Agents share memories via pub/sub mesh. Theory of Mind queries what others know. 3+ agents agreeing on a fact triggers fleet-level synthesis.
10-40x Faster Ingestion Than Any Competitor
Real numbers from a 10-machine AI fleet running Mnemosyne in production.
Try It: Live Memory Demo
Store memories to localStorage. See the cognitive pipeline in action — no backend needed.
Mnemosyne Demo Terminal
Built for Agents That Actually Think
From single-agent coding assistants to enterprise-scale agent meshes.
AI Coding Assistants
Session survival, procedural memory, temporal graph. Agents remember project context, deployment procedures, and past debugging sessions.
Enterprise Knowledge Agents
Agent mesh, Theory of Mind, cross-agent synthesis. Specialized agents build domain expertise while sharing verified facts.
Customer Support
Preference tracking, reinforcement learning, procedural memory. Agents remember customer history, resolution patterns, and preferences.
Research Assistants
Flash reasoning, auto-linking, knowledge graph. Agents accumulate domain knowledge and surface non-obvious connections.
DevOps & Infrastructure
Temporal queries, proactive warnings, mesh sync. Agents remember topology, incidents, and answer "What changed since last stable state?"
Personal AI Companions
Activation decay, observational memory, preference modeling. Long-running assistants that develop genuine understanding over time.
9 Tools. Any Agent Framework.
Drop-in tools that work with Claude, GPT, LangChain, CrewAI, AutoGen, or any LLM agent framework.
Give Your Agents the Memory They Deserve
Open source. MIT licensed. 33 features. $0 per memory. Deploy cognitive memory in minutes.
npm install mnemosy-ai
Because intelligence without memory isn't intelligence.