Cognitive Memory OS for AI Agents

The First AI Memory That Thinks Like a Brain

5-layer cognitive architecture. Zero-LLM ingestion pipeline. Self-improving memory that decays, consolidates, and reasons — like a real brain. The memory OS that every other system forgot to build.

$ npm install mnemosy-ai
$0
Per Memory Stored
<50ms
Ingestion Latency
33
Production Features
5
Cognitive Layers

Not a Demo. Not a Roadmap. Running Right Now.

Mnemosyne powers a 10-machine AI fleet with sub-200ms retrieval latency. Every feature on this page is in production.

10
Machines in Fleet
28
Unique Features
<200ms
Retrieval Latency

AI Agents Have Amnesia. Mnemosyne Gives Them a Brain.

Every other memory system is a vector database with an LLM wrapper. Mnemosyne is different.

Every Other Memory System

  • Burns LLM tokens on every memory stored (~$0.01 each)
  • Retrieves by a single signal only (cosine similarity)
  • No knowledge graph, or paywalled at $249/mo
  • Single-agent only — no multi-agent collaboration
  • Static storage — quality degrades over time
  • No activation decay, no consolidation, no reasoning

Mnemosyne

  • Memories persist across sessions, restarts, and context window resets
  • Retrieval uses 5 signals weighted by detected intent
  • Knowledge grows through auto-linking, graph expansion, and cross-agent corroboration
  • Quality improves via reinforcement learning and 4-phase consolidation
  • Agents collaborate through a real-time memory mesh with pub/sub
  • Zero LLM calls during ingestion — $0 per memory, works offline

5 Lines to Cognitive Memory

Drop-in TypeScript SDK. Zero LLM required for the ingestion pipeline. Your agent has a brain in under a minute.

TypeScript

// npm install mnemosy-ai
import { createMnemosyne } from 'mnemosy-ai'

const m = await createMnemosyne({
  vectorDbUrl: 'http://localhost:6333',
  embeddingUrl: 'http://localhost:11434/v1/embeddings',
  agentId: 'my-agent'
})

// Store — full 12-step pipeline, <50ms, $0
await m.store("User prefers dark mode and TypeScript")

// Recall — 5-signal scoring, intent-aware, graph-enriched
const memories = await m.recall("user preferences")

// Feedback — memories learn from use
await m.feedback("positive")

// Theory of Mind — what does another agent know?
const knowledge = await m.toma("devops-agent", "production db")

Only hard requirement: Qdrant (docker run -d -p 6333:6333 qdrant/qdrant). Redis and FalkorDB are optional.

5-Layer Cognitive Architecture

Inspired by how the human brain organizes, retrieves, and strengthens memories over time. Each layer is independently toggleable.

graph TB
  subgraph L5["L5: Self-Improvement"]
    RL["Reinforcement Learning"]
    AC["Active Consolidation"]
    FR["Flash Reasoning"]
    ToMA["Agent Awareness"]
  end
  subgraph L4["L4: Cognitive"]
    AD["Activation Decay"]
    MS["Multi-Signal Scoring"]
    IA["Intent-Aware Retrieval"]
    DR["Diversity Reranking"]
  end
  subgraph L3["L3: Knowledge Graph"]
    TG["Temporal Graph"]
    AL["Auto Linking"]
    PF["Path Finding"]
    TR["Timeline Reconstruction"]
  end
  subgraph L2["L2: Pipeline"]
    EP["Extraction Pipeline"]
    TC["Type Classifier"]
    DM["Dedup and Merge"]
    SF["Security Filter"]
  end
  subgraph L1["L1: Infrastructure"]
    VDB["Qdrant Vector DB"]
    GDB["FalkorDB Graph"]
    Cache["2-Tier Cache"]
    PS["Redis Pub/Sub"]
  end
  L5 --> L4
  L4 --> L3
  L3 --> L2
  L2 --> L1
  style L5 fill:#1e1235,stroke:#a855f7,stroke-width:2px,color:#e2e8f0
  style L4 fill:#0c1a1e,stroke:#00f0ff,stroke-width:2px,color:#e2e8f0
  style L3 fill:#15102a,stroke:#a855f7,stroke-width:1px,color:#e2e8f0
  style L2 fill:#0b171a,stroke:#00f0ff,stroke-width:1px,color:#e2e8f0
  style L1 fill:#151520,stroke:#64748b,stroke-width:1px,color:#e2e8f0
      

33 Features Across 5 Layers

Every feature is independently toggleable — start simple, enable progressively.

Infrastructure (L1)

Vector Storage

768-dim embeddings on Qdrant with HNSW indexing. Sub-linear search scaling to billions of vectors.

2-Tier Cache

L1 in-memory (50 entries, 5min TTL) + L2 Redis (1hr TTL). Sub-10ms cached recall.
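The L1 tier can be sketched as a small bounded map with a TTL and LRU-style eviction. This is an illustrative stand-in, not the shipped implementation: the class and field names are ours, and the Redis-backed L2 tier is omitted.

```typescript
// Minimal sketch of an L1 cache tier: bounded in-memory map (50 entries,
// 5-minute TTL) with least-recently-used eviction. Names are illustrative.
type Entry<V> = { value: V; expiresAt: number };

class L1Cache<V> {
  private entries = new Map<string, Entry<V>>();
  constructor(private maxEntries = 50, private ttlMs = 5 * 60_000) {}

  get(key: string): V | undefined {
    const e = this.entries.get(key);
    if (!e) return undefined;
    if (Date.now() > e.expiresAt) {   // expired: drop and report a miss
      this.entries.delete(key);
      return undefined;
    }
    this.entries.delete(key);         // re-insert to mark as recently used
    this.entries.set(key, e);
    return e.value;
  }

  set(key: string, value: V): void {
    if (this.entries.size >= this.maxEntries && !this.entries.has(key)) {
      // evict the least recently used entry (first key in insertion order)
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: Date.now() + this.ttlMs });
  }
}
```

A real deployment would consult this tier first and fall through to Redis, then to the vector store, on a miss.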

Pub/Sub Broadcast

Real-time memory events across your entire agent mesh via Redis channels.

Knowledge Graph

Temporal entity graph on FalkorDB with auto-linking, path finding, and timeline reconstruction.

Bi-Temporal Model

Every memory tracks eventTime + ingestedAt for temporal queries.

Soft-Delete

Memories are never physically deleted. Full audit trails and recovery at any time.

Pipeline (L2)

12-Step Ingestion

Security → embed → dedup → extract → classify → score → link → graph → broadcast.

Zero-LLM Pipeline

Classification, extraction, urgency detection, conflict resolution — all algorithmic. $0 per memory.

Security Filter

3-tier classification (public/private/secret). Blocks API keys, credentials, and private keys.

Smart Dedup & Merge

Cosine similarity ≥0.92 triggers a duplicate merge; similarity between 0.70 and 0.92 triggers a conflict alert broadcast.
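The thresholds amount to a simple decision rule over cosine similarity. A minimal sketch, with the thresholds taken from the text and the function names ours:

```typescript
// Cosine similarity between two equal-length embedding vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

type DedupAction = "merge" | "conflict-alert" | "store-new";

// Dedup rule from the text: >=0.92 merge, 0.70-0.92 conflict alert,
// below 0.70 store as a new memory.
function dedupDecision(sim: number): DedupAction {
  if (sim >= 0.92) return "merge";
  if (sim >= 0.70) return "conflict-alert";
  return "store-new";
}
```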

Entity Extraction

Automatic identification of people, machines, IPs, dates, technologies, URLs — zero LLM.

7-Type Taxonomy

Episodic, semantic, preference, relationship, procedural, profile, core — classified algorithmically.

Knowledge Graph (L3)

Temporal Queries

"What was X connected to as of date Y?" — relationships carry since timestamps.
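An "as of" query boils down to filtering edges by their since timestamp. A minimal sketch under assumed data shapes (the Edge interface and function name are illustrative, not the FalkorDB query API):

```typescript
// Each relationship carries a `since` timestamp; a point-in-time query
// keeps only edges established on or before the cutoff date.
interface Edge { from: string; to: string; since: string } // ISO dates

function connectionsAsOf(edges: Edge[], entity: string, asOf: string): string[] {
  const cutoff = Date.parse(asOf);
  return edges
    .filter(e => e.from === entity && Date.parse(e.since) <= cutoff)
    .map(e => e.to);
}
```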

Auto-Linking

New memories automatically discover and link to related memories. Bidirectional, Zettelkasten-style.

Path Finding

Shortest-path queries between any two entities with configurable max depth.

Timeline Reconstruction

Ordered history of all memories mentioning a given entity.

Depth-Limited Traversal

Configurable graph exploration (default: 2 hops) balancing relevance vs. noise.

Cognitive (L4)

Activation Decay

Logarithmic decay model — critical memories stay months, core/procedural are immune.
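An illustrative decay curve, not the shipped formula: activation falls off logarithmically with age, higher importance slows the decay, and core/procedural memories are exempt entirely. All constants here are assumptions.

```typescript
type MemoryType = "episodic" | "semantic" | "preference" | "relationship"
  | "procedural" | "profile" | "core";

// Sketch of logarithmic activation decay. Core and procedural memories
// are immune, matching the behavior described in the text.
function activation(
  initial: number,     // starting activation in [0, 1]
  ageDays: number,     // days since last access
  importance: number,  // [0, 1]; higher importance slows decay
  type: MemoryType
): number {
  if (type === "core" || type === "procedural") return initial; // immune
  const decay = Math.log1p(ageDays) * (1 - importance) * 0.1;   // assumed scale
  return Math.max(0, initial - decay);
}
```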

Multi-Signal Scoring

5 signals: similarity, recency, importance×confidence, frequency, type relevance.
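A weighted sum over the five signals can be sketched as follows. The signal names come from the text; the weights are illustrative defaults, not the shipped values (in the real system they shift with detected intent):

```typescript
interface Signals {
  similarity: number;     // vector cosine similarity, [0, 1]
  recency: number;        // decayed recency score, [0, 1]
  importance: number;     // importance x confidence composite, [0, 1]
  frequency: number;      // normalized retrieval frequency, [0, 1]
  typeRelevance: number;  // how well the memory type matches intent, [0, 1]
}

type Weights = Record<keyof Signals, number>;

// Assumed default weighting; intent detection would swap these.
const DEFAULT_WEIGHTS: Weights = {
  similarity: 0.4, recency: 0.2, importance: 0.2, frequency: 0.1, typeRelevance: 0.1,
};

function score(s: Signals, w: Weights = DEFAULT_WEIGHTS): number {
  return (Object.keys(w) as (keyof Signals)[])
    .reduce((sum, k) => sum + w[k] * s[k], 0);
}
```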

Intent-Aware Retrieval

Auto-detects query intent (factual, temporal, procedural, preference, exploratory).

Diversity Reranking

Cluster detection, overlap penalty, type diversity — prevents echo chambers in results.

4-Tier Confidence

Mesh Fact ≥0.85, Grounded 0.65-0.84, Inferred 0.40-0.64, Uncertain <0.40.
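The four tiers are a pure bucketing function over the confidence score; the boundaries below are exactly those in the text:

```typescript
type ConfidenceTier = "mesh-fact" | "grounded" | "inferred" | "uncertain";

// Tier boundaries from the text: >=0.85, 0.65-0.84, 0.40-0.64, <0.40.
function confidenceTier(c: number): ConfidenceTier {
  if (c >= 0.85) return "mesh-fact";
  if (c >= 0.65) return "grounded";
  if (c >= 0.40) return "inferred";
  return "uncertain";
}
```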

Priority Scoring

Urgency × Domain composite — critical+technical = 1.0, background+general = 0.2.
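One way to realize this composite is a product of per-axis scores. The two anchor points come from the text (critical+technical = 1.0, background+general = 0.2); every intermediate value below is an assumption:

```typescript
// Assumed per-axis scores; only the two anchor products are from the text.
const URGENCY = { critical: 1.0, high: 0.8, normal: 0.6, background: 0.4 } as const;
const DOMAIN = { technical: 1.0, personal: 0.7, general: 0.5 } as const;

// Priority is the urgency score times the domain score.
function priority(u: keyof typeof URGENCY, d: keyof typeof DOMAIN): number {
  return URGENCY[u] * DOMAIN[d];
}
```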

Self-Improvement (L5)

Reinforcement Learning

Feedback loop tracks usefulness, auto-promotes memories with >0.7 ratio after 3+ retrievals.
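The promotion rule reduces to a small predicate over usage statistics; a sketch with the thresholds from the text and illustrative type names:

```typescript
interface UsageStats {
  retrievals: number; // total times this memory was retrieved
  positive: number;   // retrievals that received positive feedback
}

// Promote once a memory has 3+ retrievals with a usefulness ratio above 0.7.
function shouldPromote(s: UsageStats): boolean {
  if (s.retrievals < 3) return false;
  return s.positive / s.retrievals > 0.7;
}
```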

Active Consolidation

4-phase autonomous maintenance: contradiction detection, dedup merge, promotion, demotion.

Flash Reasoning

BFS traversal through linked memory graphs, reconstructs multi-step logic chains.
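At its core this is breadth-first search over memory links, recording the chain of hops that connects two memories. A self-contained sketch under an assumed adjacency-list representation:

```typescript
// Memory IDs mapped to the IDs they link to.
type LinkGraph = Map<string, string[]>;

// BFS from a seed memory to a goal; returns the hop chain, or null if
// no chain of links connects the two memories.
function reasoningChain(graph: LinkGraph, start: string, goal: string): string[] | null {
  const prev = new Map<string, string>();
  const queue = [start];
  const seen = new Set([start]);
  while (queue.length > 0) {
    const node = queue.shift()!;
    if (node === goal) {
      const chain = [node];   // walk predecessors back to the seed
      let cur = node;
      while (prev.has(cur)) { cur = prev.get(cur)!; chain.unshift(cur); }
      return chain;
    }
    for (const next of graph.get(node) ?? []) {
      if (!seen.has(next)) { seen.add(next); prev.set(next, node); queue.push(next); }
    }
  }
  return null;
}
```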

Theory of Mind (ToMA)

"What does Agent-B know about X?" — knowledge gap analysis across the mesh.

Cross-Agent Synthesis

3+ agents agree on a fact → auto-synthesized into fleet-level insight.
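The corroboration rule can be sketched as grouping claims by fact and counting distinct agents; the 3-agent threshold is from the text, the data shapes are ours:

```typescript
interface Claim { agentId: string; fact: string }

// A fact becomes a fleet-level insight once at least `minAgents`
// distinct agents have asserted it.
function synthesize(claims: Claim[], minAgents = 3): string[] {
  const byFact = new Map<string, Set<string>>();
  for (const c of claims) {
    if (!byFact.has(c.fact)) byFact.set(c.fact, new Set());
    byFact.get(c.fact)!.add(c.agentId);
  }
  return [...byFact.entries()]
    .filter(([, agents]) => agents.size >= minAgents)
    .map(([fact]) => fact);
}
```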

Proactive Recall

Generates speculative queries from incoming prompts, injects context before the agent asks.

Session Survival

Snapshot/recovery across context window resets — zero discontinuity.

Observational Memory

Compresses raw conversation streams into structured, high-signal memory cells.

Procedural Memory

Learned procedures as first-class objects, immune to decay, shared across the mesh.

Mesh Sync

Named, versioned shared state blocks with real-time broadcast propagation.

Research-Grade: 10 Capabilities From Research Papers

These capabilities exist almost exclusively in academic literature and closed research labs. Mnemosyne ships all 10 as deployable infrastructure.

| Capability | Industry Status | Mnemosyne |
|---|---|---|
| Flash Reasoning (chain-of-thought graph traversal) | Research paper only | Production |
| Theory of Mind for Agents | Research paper only | Production |
| Observational Memory Compression | Research paper only | Production |
| Reinforcement Learning on Memory | Research paper only | Production |
| Self-Improving Consolidation | Not implemented anywhere | Production |
| Cross-Agent Cognitive State | Not implemented anywhere | Production |
| Bi-Temporal Knowledge Graph | Research paper only | Production |
| Proactive Anticipatory Recall | Not implemented anywhere | Production |
| Procedural Memory / Skill Library | Not implemented anywhere | Production |
| Session Survival | Not implemented anywhere | Production |

33 Features. 28 That Nobody Else Has.

Every feature listed is in production. Not planned. Not in beta. Shipping.

| Feature | Mnemosyne | Mem0 · Zep · Cognee · LangMem · Letta |
|---|---|---|
| Pipeline & Ingestion | | |
| Zero-LLM Ingestion Pipeline | ✓ | ✗ (all five are LLM-based) |
| 12-Step Structured Pipeline | ✓ | ✗ (partial in two) |
| Security Filter (Secret Blocking) | ✓ | ✗ |
| Smart Dedup with Semantic Merge | ✓ | ✗ |
| Conflict Detection & Alerts | ✓ | ✗ |
| 7-Type Memory Taxonomy | ✓ | ✗ (partial in one) |
| Entity Extraction (Zero-LLM) | ✓ | ✗ (LLM-based in three) |
| Cognitive Features | | |
| Activation Decay Model | ✓ | ✗ |
| Multi-Signal Scoring (5 Signals) | ✓ | ✗ |
| Intent-Aware Retrieval | ✓ | ✗ |
| Diversity Reranking | ✓ | ✗ |
| Flash Reasoning Chains | ✓ | ✗ |
| Reinforcement Learning | ✓ | ✗ |
| Active Consolidation (4-Phase) | ✓ | ✗ |
| Proactive Recall | ✓ | ✗ |
| Session Survival | ✓ | ✗ |
| Observational Memory | ✓ | ✗ |
| Knowledge Graph | | |
| Built-in Knowledge Graph | ✓ | ✗ (paywalled at $249/mo in one) |
| Temporal Graph Queries | ✓ | ✗ |
| Auto-Linking (Bidirectional) | ✓ | ✗ |
| Path Finding Between Entities | ✓ | ✗ |
| Timeline Reconstruction | ✓ | ✗ |
| Bi-Temporal Data Model | ✓ | ✗ |
| Multi-Agent | | |
| Real-Time Broadcast (Pub/Sub) | ✓ | ✗ (enterprise tier only in one) |
| Theory of Mind (ToMA) | ✓ | ✗ |
| Cross-Agent Synthesis | ✓ | ✗ |
| Shared State Blocks (Mesh Sync) | ✓ | ✗ |
| Infrastructure | | |
| 2-Tier Caching (L1 + L2) | ✓ | ✗ |
| Soft-Delete Architecture | ✓ | ✗ |
| Procedural Memory (Skill Library) | ✓ | ✗ |

Store. Recall. Learn. Collaborate.

Four operations that transform stateless agents into intelligent, collaborative systems.

1

Store

Input goes through a 12-step zero-LLM pipeline: security filter, embedding, dedup, extraction, classification, scoring, linking, graph ingest, and broadcast. All in <50ms.

2

Recall

Queries hit the 2-tier cache, then vector search with 5-signal intent-aware scoring, diversity reranking, graph enrichment, and flash reasoning chains.

3

Learn

Feedback signals promote useful memories and flag poor ones. Active consolidation merges duplicates, resolves contradictions, and promotes popular knowledge.

4

Collaborate

Agents share memories via pub/sub mesh. Theory of Mind queries what others know. 3+ agents agreeing on a fact triggers fleet-level synthesis.

10-40x Faster Ingestion Than Any Competitor

Real numbers from a 10-machine AI fleet running Mnemosyne in production.

<50ms
Store Latency
<10ms
Cached Recall

Try It: Live Memory Demo

Store memories to localStorage. See the cognitive pipeline in action — no backend needed.

Mnemosyne Demo Terminal

Welcome to Mnemosyne's cognitive pipeline demo. This runs entirely in your browser — no backend needed.

Store mode: Type anything to run it through the 12-step ingestion pipeline.
Recall mode: Search your stored memories with multi-signal retrieval.

Try storing: "I prefer TypeScript and dark mode" then recalling: "What language do I like?"
Pipeline: ready Memories: 0 Mode: store

Built for Agents That Actually Think

From single-agent coding assistants to enterprise-scale agent meshes.

AI Coding Assistants

Session survival, procedural memory, temporal graph. Agents remember project context, deployment procedures, and past debugging sessions.

Enterprise Knowledge Agents

Agent mesh, Theory of Mind, cross-agent synthesis. Specialized agents build domain expertise while sharing verified facts.

Customer Support

Preference tracking, reinforcement learning, procedural memory. Agents remember customer history, resolution patterns, and preferences.

Research Assistants

Flash reasoning, auto-linking, knowledge graph. Agents accumulate domain knowledge and surface non-obvious connections.

DevOps & Infrastructure

Temporal queries, proactive warnings, mesh sync. Agents remember topology, incidents, and answer "What changed since last stable state?"

Personal AI Companions

Activation decay, observational memory, preference modeling. Long-running assistants that develop genuine understanding over time.

9 Tools. Any Agent Framework.

Drop-in tools that work with Claude, GPT, LangChain, CrewAI, AutoGen, or any LLM agent framework.

memory_recall
Multi-signal search with intent detection, diversity reranking, graph enrichment, flash reasoning.
memory_store
Full 12-step ingestion: security filter, dedup, classify, link, graph ingest, broadcast.
memory_forget
Soft-delete by ID or semantic search, mesh-wide cache invalidation.
memory_block_get
Read a named shared memory block (Mesh Sync).
memory_block_set
Write/update a named shared block with versioning and broadcast.
memory_feedback
Reinforcement signal — drives memory promotion/demotion.
memory_consolidate
4-phase active maintenance: contradictions, dedup, promotion, demotion.
memory_toma
Query what a specific agent knows about a topic (Theory of Mind).
before_agent_start
Automatic hook: session recovery, proactive recall, context injection.

Open Source. MIT License. Free Knowledge Graph. $0 Per Memory.

Competitors paywall their graph, charge per memory via LLM, and gate multi-agent behind enterprise tiers.

MIT
License (fully open)
Free
All features included
$0.00
Per memory stored
$0
Cost for 100K memories

Give Your Agents the Memory They Deserve

Open source. MIT licensed. 33 features. $0 per memory. Deploy cognitive memory in minutes.

$ npm install mnemosy-ai

Because intelligence without memory isn't intelligence.