Cognitive Memory OS for AI Agents

The First AI Memory ThatThinks Like a Brain

5-layer cognitive architecture. Zero-LLM ingestion pipeline. Self-improving memory that decays, consolidates, and reasons — like a real brain. The memory OS that every other system forgot to build.

Get Started →View on GitHub

$ npm install mnemosy-ai

Per Memory Stored

<0ms

Ingestion Latency

Production Features

Cognitive Layers

Not a Demo. Not a Roadmap. Running Right Now.

Mnemosyne powers a 10-machine AI fleet with sub-200ms retrieval latency. Every feature on this page is in production.

Memories in Production

Concurrent Agents

Unique Features

<0ms

Retrieval Latency

AI Agents Have Amnesia. Mnemosyne Gives Them a Brain.

Every other memory system is a vector database with an LLM wrapper. Mnemosyne is different.

✘ Every Other Memory System

✘ Burns LLM tokens on every memory stored (~$0.01 each)
✘ Retrieves by single signal only (cosine similarity)
✘ No knowledge graph, or paywalled at $249/mo
✘ Single-agent only — no multi-agent collaboration
✘ Static storage — quality degrades over time
✘ No activation decay, no consolidation, no reasoning

✔ Mnemosyne

✔ Memories persist across sessions, restarts, and context window resets
✔ Retrieval uses 5 signals weighted by detected intent
✔ Knowledge grows through auto-linking, graph expansion, and cross-agent corroboration
✔ Quality improves via reinforcement learning and 4-phase consolidation
✔ Agents collaborate through a real-time memory mesh with pub/sub
✔ Zero LLM calls during ingestion — $0 per memory, works offline

5 Lines to Cognitive Memory

Drop-in TypeScript SDK. Zero LLM required for the ingestion pipeline. Your agent has a brain in under a minute.

TypeScript
// npm install mnemosy-ai
import { createMnemosyne } from 'mnemosy-ai'

const m = await createMnemosyne({
  vectorDbUrl: 'http://localhost:6333',
  embeddingUrl: 'http://localhost:11434/v1/embeddings',
  agentId: 'my-agent'
})

// Store — full 12-step pipeline, <50ms, $0
await m.store("User prefers dark mode and TypeScript")

// Recall — 5-signal scoring, intent-aware, graph-enriched
const memories = await m.recall("user preferences")

// Feedback — memories learn from use
await m.feedback("positive")

// Theory of Mind — what does another agent know?
const knowledge = await m.toma("devops-agent", "production db")

Only hard requirement: Qdrant (docker run -d -p 6333:6333 qdrant/qdrant). Redis and FalkorDB are optional.

5-Layer Cognitive Architecture

Inspired by how the human brain organizes, retrieves, and strengthens memories over time. Each layer is independently toggleable.

graph TB
  subgraph L5["L5: Self-Improvement"]
    RL["Reinforcement Learning"]
    AC["Active Consolidation"]
    FR["Flash Reasoning"]
    ToMA["Agent Awareness"]
  end
  subgraph L4["L4: Cognitive"]
    AD["Activation Decay"]
    MS["Multi-Signal Scoring"]
    IA["Intent-Aware Retrieval"]
    DR["Diversity Reranking"]
  end
  subgraph L3["L3: Knowledge Graph"]
    TG["Temporal Graph"]
    AL["Auto Linking"]
    PF["Path Finding"]
    TR["Timeline Reconstruction"]
  end
  subgraph L2["L2: Pipeline"]
    EP["Extraction Pipeline"]
    TC["Type Classifier"]
    DM["Dedup and Merge"]
    SF["Security Filter"]
  end
  subgraph L1["L1: Infrastructure"]
    VDB["Qdrant Vector DB"]
    GDB["FalkorDB Graph"]
    Cache["2-Tier Cache"]
    PS["Redis Pub/Sub"]
  end
  L5 --> L4
  L4 --> L3
  L3 --> L2
  L2 --> L1
  style L5 fill:#1e1235,stroke:#a855f7,stroke-width:2px,color:#e2e8f0
  style L4 fill:#0c1a1e,stroke:#00f0ff,stroke-width:2px,color:#e2e8f0
  style L3 fill:#15102a,stroke:#a855f7,stroke-width:1px,color:#e2e8f0
  style L2 fill:#0b171a,stroke:#00f0ff,stroke-width:1px,color:#e2e8f0
  style L1 fill:#151520,stroke:#64748b,stroke-width:1px,color:#e2e8f0

33 Features Across 5 Layers

Every feature is independently toggleable — start simple, enable progressively.

Infrastructure (L1)

Vector Storage

768-dim embeddings on Qdrant with HNSW indexing. Sub-linear search scaling to billions of vectors.

2-Tier Cache

L1 in-memory (50 entries, 5min TTL) + L2 Redis (1hr TTL). Sub-10ms cached recall.

Pub/Sub Broadcast

Real-time memory events across your entire agent mesh via Redis channels.

Knowledge Graph

Temporal entity graph on FalkorDB with auto-linking, path finding, and timeline reconstruction.

Bi-Temporal Model

Every memory tracks eventTime + ingestedAt for temporal queries.

Soft-Delete

Memories are never physically deleted. Full audit trails and recovery at any time.

Pipeline (L2)

12-Step Ingestion

Security → embed → dedup → extract → classify → score → link → graph → broadcast.

Zero-LLM Pipeline

Classification, extraction, urgency detection, conflict resolution — all algorithmic. $0 per memory.

Security Filter

3-tier classification (public/private/secret). Blocks API keys, credentials, and private keys.

Smart Dedup & Merge

Cosine ≥0.92 = duplicate merge, 0.70-0.92 = conflict alert broadcast.

Entity Extraction

Automatic identification of people, machines, IPs, dates, technologies, URLs — zero LLM.

7-Type Taxonomy

Episodic, semantic, preference, relationship, procedural, profile, core — classified algorithmically.

Knowledge Graph (L3)

Temporal Queries

"What was X connected to as of date Y?" — relationships carry since timestamps.

Auto-Linking

New memories automatically discover and link to related memories. Bidirectional, Zettelkasten-style.

Path Finding

Shortest-path queries between any two entities with configurable max depth.

Timeline Reconstruction

Ordered history of all memories mentioning a given entity.

Depth-Limited Traversal

Configurable graph exploration (default: 2 hops) balancing relevance vs. noise.

Cognitive (L4)

Activation Decay

Logarithmic decay model — critical memories stay months, core/procedural are immune.

Multi-Signal Scoring

5 signals: similarity, recency, importance×confidence, frequency, type relevance.

Intent-Aware Retrieval

Auto-detects query intent (factual, temporal, procedural, preference, exploratory).

Diversity Reranking

Cluster detection, overlap penalty, type diversity — prevents echo chambers in results.

4-Tier Confidence

Mesh Fact ≥0.85, Grounded 0.65-0.84, Inferred 0.40-0.64, Uncertain <0.40.

Priority Scoring

Urgency × Domain composite — critical+technical = 1.0, background+general = 0.2.

Self-Improvement (L5)

Reinforcement Learning

Feedback loop tracks usefulness, auto-promotes memories with >0.7 ratio after 3+ retrievals.

Active Consolidation

4-phase autonomous maintenance: contradiction detection, dedup merge, promotion, demotion.

Flash Reasoning

BFS traversal through linked memory graphs, reconstructs multi-step logic chains.

Theory of Mind (ToMA)

"What does Agent-B know about X?" — knowledge gap analysis across the mesh.

Cross-Agent Synthesis

3+ agents agree on a fact → auto-synthesized into fleet-level insight.

Proactive Recall

Generates speculative queries from incoming prompts, injects context before the agent asks.

Session Survival

Snapshot/recovery across context window resets — zero discontinuity.

Observational Memory

Compresses raw conversation streams into structured, high-signal memory cells.

Procedural Memory

Learned procedures as first-class objects, immune to decay, shared across the mesh.

Mesh Sync

Named, versioned shared state blocks with real-time broadcast propagation.

Research-Grade: 10 Capabilities From Research Papers

These capabilities exist almost exclusively in academic literature and closed research labs. Mnemosyne ships all 10 as deployable infrastructure.

Capability	Industry Status	Mnemosyne
Flash Reasoning (chain-of-thought graph traversal)	Research paper only	Production
Theory of Mind for Agents	Research paper only	Production
Observational Memory Compression	Research paper only	Production
Reinforcement Learning on Memory	Research paper only	Production
Self-Improving Consolidation	Not implemented anywhere	Production
Cross-Agent Cognitive State	Not implemented anywhere	Production
Bi-Temporal Knowledge Graph	Research paper only	Production
Proactive Anticipatory Recall	Not implemented anywhere	Production
Procedural Memory / Skill Library	Not implemented anywhere	Production
Session Survival	Not implemented anywhere	Production

33 Features. 28 That Nobody Else Has.

Every feature listed is in production. Not planned. Not in beta. Shipping.

Feature	Mnemosyne	Mem0	Zep	Cognee	LangMem	Letta
Pipeline & Ingestion
Zero-LLM Ingestion Pipeline	✓	✗ LLM	✗ LLM	✗ LLM	✗ LLM	✗ LLM
12-Step Structured Pipeline	✓	Partial	Partial	✓	✗	✗
Security Filter (Secret Blocking)	✓	✗	✗	✗	✗	✗
Smart Dedup with Semantic Merge	✓	✓	✗	✓	✗	✗
Conflict Detection & Alerts	✓	✗	✗	✗	✗	✗
7-Type Memory Taxonomy	✓	✗	✗	✗	✗	Partial
Entity Extraction (Zero-LLM)	✓	LLM-based	LLM-based	LLM-based	✗	✗
Cognitive Features
Activation Decay Model	✓	✗	✗	✗	✗	✗
Multi-Signal Scoring (5 Signals)	✓	✗	✗	✗	✗	✗
Intent-Aware Retrieval	✓	✗	✗	✗	✗	✗
Diversity Reranking	✓	✗	✗	✗	✗	✗
Flash Reasoning Chains	✓	✗	✗	✗	✗	✗
Reinforcement Learning	✓	✗	✗	✗	✗	✗
Active Consolidation (4-Phase)	✓	✗	✗	✗	✗	✗
Proactive Recall	✓	✗	✗	✗	✗	✗
Session Survival	✓	✗	✗	✗	✗	✓
Observational Memory	✓	✗	✓	✗	✗	✗
Knowledge Graph
Built-in Knowledge Graph	✓	$249/mo	✗	✓	✗	✗
Temporal Graph Queries	✓	✗	✗	✗	✗	✗
Auto-Linking (Bidirectional)	✓	✗	✗	✗	✗	✗
Path Finding Between Entities	✓	✗	✗	✗	✗	✗
Timeline Reconstruction	✓	✗	✗	✗	✗	✗
Bi-Temporal Data Model	✓	✗	✗	✗	✗	✗
Multi-Agent
Real-Time Broadcast (Pub/Sub)	✓	Enterprise	✗	✗	✗	✗
Theory of Mind (ToMA)	✓	✗	✗	✗	✗	✗
Cross-Agent Synthesis	✓	✗	✗	✗	✗	✗
Shared State Blocks (Mesh Sync)	✓	✗	✗	✗	✗	✗
Infrastructure
2-Tier Caching (L1 + L2)	✓	✗	✗	✗	✗	✗
Soft-Delete Architecture	✓	✗	✗	✗	✗	✗
Procedural Memory (Skill Library)	✓	✗	✗	✗	✗	✗

Store. Recall. Learn. Collaborate.

Four operations that transform stateless agents into intelligent, collaborative systems.

Store

Input goes through a 12-step zero-LLM pipeline: security filter, embedding, dedup, extraction, classification, scoring, linking, graph ingest, and broadcast. All in <50ms.

Recall

Queries hit the 2-tier cache, then vector search with 5-signal intent-aware scoring, diversity reranking, graph enrichment, and flash reasoning chains.

Learn

Feedback signals promote useful memories and flag poor ones. Active consolidation merges duplicates, resolves contradictions, and promotes popular knowledge.

Collaborate

Agents share memories via pub/sub mesh. Theory of Mind queries what others know. 3+ agents agreeing on a fact triggers fleet-level synthesis.

10-40x Faster Ingestion Than Any Competitor

Real numbers from a 10-machine AI fleet running Mnemosyne in production.

Memories Managed

<0ms

Store Latency

<0ms

Cached Recall

Concurrent Agents

~0/min

Consolidation Rate

<0ms

Embedding (cached)

Try It: Live Memory Demo

Store memories to localStorage. See the cognitive pipeline in action — no backend needed.

Mnemosyne Demo Terminal

Welcome to Mnemosyne's cognitive pipeline demo. This runs entirely in your browser — no backend needed.\n\nStore mode: Type anything to run it through the 12-step ingestion pipeline.\nRecall mode: Search your stored memories with multi-signal retrieval.\n\nTry storing: "I prefer TypeScript and dark mode" then recalling: "What language do I like?"

Pipeline: ready Memories: 0 Mode: store

Built for Agents That Actually Think

From single-agent coding assistants to enterprise-scale agent meshes.

AI Coding Assistants

Session survival, procedural memory, temporal graph. Agents remember project context, deployment procedures, and past debugging sessions.

Enterprise Knowledge Agents

Agent mesh, Theory of Mind, cross-agent synthesis. Specialized agents build domain expertise while sharing verified facts.

Customer Support

Preference tracking, reinforcement learning, procedural memory. Agents remember customer history, resolution patterns, and preferences.

Research Assistants

Flash reasoning, auto-linking, knowledge graph. Agents accumulate domain knowledge and surface non-obvious connections.

DevOps & Infrastructure

Temporal queries, proactive warnings, mesh sync. Agents remember topology, incidents, and answer "What changed since last stable state?"

Personal AI Companions

Activation decay, observational memory, preference modeling. Long-running assistants that develop genuine understanding over time.

9 Tools. Any Agent Framework.

Drop-in tools that work with Claude, GPT, LangChain, CrewAI, AutoGen, or any LLM agent framework.

memory_recall

Multi-signal search with intent detection, diversity reranking, graph enrichment, flash reasoning.

memory_store

Full 12-step ingestion: security filter, dedup, classify, link, graph ingest, broadcast.

memory_forget

Soft-delete by ID or semantic search, mesh-wide cache invalidation.

memory_block_get

Read a named shared memory block (Mesh Sync).

memory_block_set

Write/update a named shared block with versioning and broadcast.

memory_feedback

Reinforcement signal — drives memory promotion/demotion.

memory_consolidate

4-phase active maintenance: contradictions, dedup, promotion, demotion.

memory_toma

Query what a specific agent knows about a topic (Theory of Mind).

before_agent_start

Automatic hook: session recovery, proactive recall, context injection.

Give Your Agents the Memory
They Deserve

Open source. MIT licensed. 33 features. $0 per memory. Deploy cognitive memory in minutes.

Get Started →Read the Docs

$ npm install mnemosy-ai

Because intelligence without memory isn't intelligence.

The First AI Memory ThatThinks Like a Brain

Not a Demo. Not a Roadmap. Running Right Now.

AI Agents Have Amnesia. Mnemosyne Gives Them a Brain.

✘ Every Other Memory System

✔ Mnemosyne

5 Lines to Cognitive Memory

5-Layer Cognitive Architecture

33 Features Across 5 Layers

Vector Storage

2-Tier Cache

Pub/Sub Broadcast

Knowledge Graph

Bi-Temporal Model

Soft-Delete

12-Step Ingestion

Zero-LLM Pipeline

Security Filter

Smart Dedup & Merge

Entity Extraction

7-Type Taxonomy

Temporal Queries

Auto-Linking

Path Finding

Timeline Reconstruction

Depth-Limited Traversal

Activation Decay

Multi-Signal Scoring

Intent-Aware Retrieval

Diversity Reranking

4-Tier Confidence

Priority Scoring

Reinforcement Learning

Active Consolidation

Flash Reasoning

Theory of Mind (ToMA)

Cross-Agent Synthesis

Proactive Recall

Session Survival

Observational Memory

Procedural Memory

Mesh Sync

Research-Grade: 10 Capabilities From Research Papers

33 Features. 28 That Nobody Else Has.

Store. Recall. Learn. Collaborate.

Store

Recall

Learn

Collaborate

10-40x Faster Ingestion Than Any Competitor

Try It: Live Memory Demo

Mnemosyne Demo Terminal

Built for Agents That Actually Think

AI Coding Assistants

Enterprise Knowledge Agents

Customer Support

Research Assistants

DevOps & Infrastructure

Personal AI Companions

9 Tools. Any Agent Framework.

Open Source. MIT License. Free Knowledge Graph. $0 Per Memory.

Give Your Agents the MemoryThey Deserve

Give Your Agents the Memory
They Deserve