Open source · Self-hosted · Free forever · Releases

Stop living the same session twice

A self-hosted MCP server that gives Claude Code persistent memory across sessions, projects, and machines. Everything runs on your hardware, and it's free.

The problem

Every Claude Code session starts from zero. It doesn't know your project. It doesn't remember what failed last week. It has no idea you spent three hours last Tuesday figuring out why onnxruntime crashes on Alpine before finally landing on something that works.

So you explain everything again. Claude suggests the same broken library again. Same alarm, same song. It's Groundhog Day, you're Bill Murray, and Claude is Punxsutawney Phil.

$ claude
claude> use onnxruntime for the embeddings

... 45 minutes of debugging later ...

SIGILL: illegal instruction (Alpine musl libc)

You fixed this exact problem last Tuesday.
Claude doesn't know that.

$ claude # with OmniMem
claude> use onnxruntime for the embeddings

WARNING: previously abandoned approach
onnxruntime - SIGILL crash on Alpine musl libc (effort: 4/5)
Switched to sentence-transformers instead.

See it in action

A single recall() query pulls relevant context from everywhere OmniMem knows about. Personal preferences, ingested articles, past conversations, project context. It's all ranked together and merged into one useful response.

[Screenshot: OmniMem recall in action, showing multi-source memory retrieval in Claude Code]

Real output from a recall query about markdown editors, annotated to show where each piece of knowledge came from.

Episodic memory

Picked up from your conversations. Your preferences, decisions, and things you mentioned in passing.

Knowledge base

Pulled from RSS articles that were auto-summarised and embedded. They surface when they're relevant to what you're asking.

External references

Gathered from links and posts you've shared. GitHub threads, blog posts, anything worth holding onto.

Project context

Brings in extra context it knows matters to you, like your preferred tools and platforms.

What makes OmniMem different

This isn't just another key-value store with an MCP wrapper. OmniMem tries to model how memory actually works. Things fade over time, they sometimes contradict each other, and the hard-won stuff earns its place.

The Graveyard

Every dead-end gets logged: what you tried, what type it was, why it failed, and how much time you burned on it. Before Claude suggests a library or pattern, it checks the graveyard first. You won't waste another afternoon on something you've already ruled out.
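
A minimal sketch of how a graveyard check could work, assuming a plain keyword scan with no embeddings. The `AbandonedApproach` record, its field names, and `check_graveyard` are hypothetical and only illustrate the idea; OmniMem's real schema and tool signatures may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AbandonedApproach:
    # Hypothetical record shape; field names are illustrative only.
    name: str    # what you tried, e.g. a library name
    kind: str    # what type it was: "library", "pattern", ...
    reason: str  # why it failed
    effort: int  # how much time you burned, on a 1-5 scale


def check_graveyard(text: str, graveyard: list[AbandonedApproach]) -> list[str]:
    """Return one warning per abandoned approach mentioned in `text`."""
    lowered = text.lower()
    return [
        f"WARNING: previously abandoned approach\n"
        f"{entry.name} - {entry.reason} (effort: {entry.effort}/5)"
        for entry in graveyard
        if entry.name.lower() in lowered  # assumed rule: substring match
    ]


graveyard = [
    AbandonedApproach(
        name="onnxruntime",
        kind="library",
        reason="SIGILL crash on Alpine musl libc",
        effort=4,
    )
]

for warning in check_graveyard("use onnxruntime for the embeddings", graveyard):
    print(warning)
```

A substring scan is cheap enough to run before any suggestion, which is the point: the check has to be fast or it won't happen every time.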

Experience scoring

Something that worked first time is handy. But something that took four attempts, two abandoned libraries, and a weird platform workaround to crack? That's gold. The harder it was to figure out, the more prominently it surfaces next time.

Contradiction detection

If a new memory disagrees with something already stored, OmniMem catches it and warns you. There's a fast check on every write, and an optional deeper analysis powered by Claude that you can run on demand. Conflicting memories get linked together so you can sort them out.

Semantic deduplication

When you store something new, it gets compared against what's already there. If it's too similar to an existing memory, you'll get a heads-up instead of a duplicate. You can also run find_duplicates to scan everything and clean up in bulk.
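
As a rough illustration of the write-time check, here is a sketch that compares a new embedding against stored ones using cosine similarity. The threshold value, the `find_near_duplicate` helper, and the toy three-dimensional vectors are all assumptions made for the example; the actual cut-off OmniMem uses is not stated here.

```python
import math

# Hypothetical cut-off; the real threshold is not documented on this page.
DUPLICATE_THRESHOLD = 0.92


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_near_duplicate(new_vec, stored, threshold=DUPLICATE_THRESHOLD):
    """Return (memory_id, similarity) for the closest match above the threshold, else None."""
    best = None
    for memory_id, vec in stored.items():
        sim = cosine_similarity(new_vec, vec)
        if sim >= threshold and (best is None or sim > best[1]):
            best = (memory_id, sim)
    return best


stored = {
    "mem-1": [1.0, 0.0, 0.0],
    "mem-2": [0.0, 1.0, 0.0],
}
print(find_near_duplicate([0.99, 0.05, 0.0], stored))  # matches mem-1
print(find_near_duplicate([0.5, 0.5, 0.7], stored))    # None
```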

Three memory namespaces

Episodic for your decisions, bugs, and patterns. Project for your stack, goals, and current state. Knowledge for RSS articles auto-summarised by Claude Haiku. All three get searched together whenever you recall something.

One-call briefing

One call to briefing() and Claude gets everything it needs: project context, experience stats, stale memories, new articles, contradictions, and anything that might need reinstating. No more three-step warm-up at the start of every session.

Web UI dashboard

Browse, search, and manage all your memories from a web interface built with htmx. Manage RSS feeds, view project contexts, download and restore backups, and see memory details without touching the command line. Runs on port 8080 alongside the MCP server.

Telemetry and metrics

Every recall tracks which memories actually get used. The telemetry dashboard shows recall counts and last-accessed timestamps per memory. There's a Prometheus-compatible /metrics endpoint too, so you can plug it straight into Grafana.
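
For a sense of what a Prometheus scrape of per-memory recall counts might look like, here is a sketch that renders counters in the Prometheus text exposition format. The metric name `omnimem_memory_recalls_total` and the `memory_id` label are invented for illustration; check the real /metrics output for the names OmniMem actually exposes.

```python
def escape_label(value: str) -> str:
    """Escape a label value per the Prometheus text format (backslash, quote, newline)."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metrics(recall_counts: dict[str, int]) -> str:
    """Render per-memory recall counters as Prometheus exposition text."""
    # Hypothetical metric name, used here only as an example.
    lines = [
        "# HELP omnimem_memory_recalls_total Times a memory was returned by recall.",
        "# TYPE omnimem_memory_recalls_total counter",
    ]
    for memory_id, count in sorted(recall_counts.items()):
        label = escape_label(memory_id)
        lines.append(f'omnimem_memory_recalls_total{{memory_id="{label}"}} {count}')
    return "\n".join(lines) + "\n"


print(render_metrics({"mem-2": 3, "mem-1": 7}), end="")
```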

Auto-maintenance

Every few briefing() calls, OmniMem automatically scans for duplicate memories and runs a heuristic contradiction check. Stale duplicates get archived, conflicts get flagged. Runs in the background so you never have to think about it.

Memory is not binary

Most memory systems either remember something or delete it. OmniMem has a proper lifecycle. When you say "forget about X" you usually mean stop bringing it up, not wipe it from existence.

ACTIVE (1.0x weight) → DEPRIORITISED (0.2x weight) → ARCHIVED (0.0x weight) → DELETED (gone)

Deprioritised memories aren't gone for good. You can attach reinstate hints, and if a future query matches one, the memory comes back with a note explaining why it was pushed down in the first place. You can also mute entire topics across all your sessions.
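
Reinstate hints could be sketched like this. The `Memory` dataclass, its field names, and the substring matching rule are assumptions for the sake of the example; the page does not say how hints are actually matched against a query.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Memory:
    # Hypothetical shape, for illustration only.
    text: str
    state: str = "ACTIVE"           # ACTIVE, DEPRIORITISED, ARCHIVED, DELETED
    deprioritised_reason: str = ""  # why it was pushed down
    reinstate_hints: list[str] = field(default_factory=list)


def reinstate_note(query: str, memory: Memory) -> Optional[str]:
    """Return an explanatory note if a deprioritised memory should resurface."""
    if memory.state != "DEPRIORITISED":
        return None
    lowered = query.lower()
    for hint in memory.reinstate_hints:
        if hint.lower() in lowered:  # assumed matching rule: case-insensitive substring
            return (
                f"Reinstated (matched hint '{hint}'). "
                f"Originally deprioritised because: {memory.deprioritised_reason}"
            )
    return None


old = Memory(
    text="musl build notes for the embedding service",
    state="DEPRIORITISED",
    deprioritised_reason="project moved to Debian images",
    reinstate_hints=["alpine", "musl"],
)
print(reinstate_note("Deploying to Alpine again", old))
print(reinstate_note("Deploying to Debian", old))  # None
```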

score = similarity × surface_score × recency × experience_weight

Four factors decide what comes back. Semantic similarity on its own isn't enough. Lifecycle state, age, and how hard it was to figure out all play a role in the final ranking.

| Effort | Meaning | Weight boost |
|---|---|---|
| 1 | Worked first time | 1.0x |
| 2 | Minor friction | 1.1x |
| 3 | Multiple iterations | 1.25x |
| 4 | Significant struggle | 1.5x |
| 5 | Battle-hardened | 1.8x |
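
Putting the formula together with the lifecycle weights and effort boosts above, a ranking function might look roughly like this. The multipliers come from this page; the shape of the recency curve is an assumption (no penalty for 90 days, then a hypothetical 180-day half-life), and the outcome part of the experience weight is left out because it isn't defined here.

```python
# Multipliers as listed on this page. DELETED memories are gone, so they never reach scoring.
SURFACE_SCORE = {"ACTIVE": 1.0, "DEPRIORITISED": 0.2, "ARCHIVED": 0.0}
EXPERIENCE_WEIGHT = {1: 1.0, 2: 1.1, 3: 1.25, 4: 1.5, 5: 1.8}

GRACE_DAYS = 90        # the age penalty starts after 90 days
HALF_LIFE_DAYS = 180   # hypothetical decay rate, chosen for the example


def recency(age_days: float) -> float:
    """1.0 inside the grace period, then an assumed exponential decay."""
    if age_days <= GRACE_DAYS:
        return 1.0
    return 0.5 ** ((age_days - GRACE_DAYS) / HALF_LIFE_DAYS)


def score(similarity: float, state: str, age_days: float, effort: int) -> float:
    """score = similarity x surface_score x recency x experience_weight"""
    return (
        similarity
        * SURFACE_SCORE[state]
        * recency(age_days)
        * EXPERIENCE_WEIGHT[effort]
    )


easy_win = score(similarity=0.9, state="ACTIVE", age_days=10, effort=1)
hard_won = score(similarity=0.8, state="ACTIVE", age_days=10, effort=5)
print(hard_won > easy_win)  # True: the battle-hardened memory ranks first
```

Because the factors multiply, an archived memory scores zero no matter how similar it is, while a hard-won memory can outrank a closer but trivial match.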

How it compares

Claude Code's built-in memory uses flat files with no semantic search. Most third-party MCP memory servers just store and retrieve. OmniMem adds lifecycle states, experience scoring, contradiction detection, and more on top.

| Capability | Claude built-in | Typical MCP memory | OmniMem |
|---|---|---|---|
| Semantic vector search | No | Yes | Yes |
| Memory lifecycle states | No | No | Yes, 4 states |
| Abandoned approach warnings | No | No | Yes, graveyard |
| Experience scoring | No | No | Yes, effort 1-5 |
| Contradiction detection | No | No | Yes, 2-tier |
| Semantic deduplication | No | Partial | Yes, write + batch |
| Topic suppression | No | No | Yes |
| RSS knowledge ingestion | No | No | Yes, auto-summarised |
| Reinstate hints | No | No | Yes |
| Self-hosted / no SaaS | Local files | Varies | Yes, Docker |
| Multi-machine sync | No | Varies | Yes, via proxy |
| Web dashboard | No | No | Yes, htmx |
| Telemetry / Prometheus | No | No | Yes, /metrics |
| Auto-maintenance | No | No | Yes, on briefing |

Architecture

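Four Docker containers and local embeddings. Your memories stay on your machine; only the optional Claude-powered features (RSS summaries and deeper contradiction analysis) call the Anthropic API.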

Claude Code (any machine)              Browser
            |                             |
            | SSE / MCP                   | HTTP :8080
            v                             v
+-------------------------+   +-------------------------+
|  OmniMem MCP Server     |   |  OmniMem Web UI         |
|  Python  fastmcp        |   |  Starlette  htmx        |
|                         |   |  Jinja2 templates       |
|  remember  recall       |   |                         |
|  deprioritise  archive  |   |  Dashboard  Search      |
|  record_experience      |   |  Browse  Create         |
|  warn_if_abandoned      |   |  Projects  Experience   |
|  briefing  health       |   |  Duplicates  Backups    |
+-----------+-------------+   +-----------+-------------+
            |                             |
            +-------------+---------------+
                          |
            +-------------+-------------+
            |                           |
            v                           v
    +---------------+          +------------------+
    | Valkey        |          | RSS Worker       |
    | + search      |   <---   |                  |
    |               |          | feedparser       |
    | idx:episodic  |          | APScheduler      |
    | idx:project   |          | Claude Haiku     |
    | idx:knowledge |          +------------------+
    +---------------+

Both the MCP server and web UI connect directly to Valkey and share the mcp_server/memory/ package.

Recall pipeline:

query
→ abandoned fast-path (keyword scan, no embedding needed)
→ embed query
→ vector search, top 20 candidates per namespace
→ filter archived and deleted
→ filter suppressed topics
→ apply surface_score (lifecycle state multiplier)
→ apply recency decay (age penalty after 90 days)
→ apply experience_weight (effort × outcome multiplier)
→ check reinstate eligibility
→ surface contradiction warnings
→ merge, re-rank, return top_k
→ log recall event + increment per-memory recall counters
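
The filtering and merging stages of the recall pipeline can be sketched as follows. This is not OmniMem's code: the candidate dictionaries, the `recall_pipeline` function, and the pluggable `final_score` callback are hypothetical, and the embedding, reinstate, contradiction, and logging steps are left out to keep the example short.

```python
import heapq
from typing import Callable

# Hypothetical candidate shape: {"id", "state", "topics", "similarity"}
Candidate = dict


def recall_pipeline(
    per_namespace: dict[str, list[Candidate]],
    suppressed_topics: set[str],
    final_score: Callable[[Candidate], float],
    top_k: int = 5,
) -> list[Candidate]:
    """Filter candidates from every namespace, then merge and re-rank them."""
    merged = []
    for namespace, candidates in per_namespace.items():
        # Assumes each list is already sorted by similarity by the vector search.
        for candidate in candidates[:20]:  # top 20 candidates per namespace
            if candidate["state"] in ("ARCHIVED", "DELETED"):
                continue  # filter archived and deleted
            if suppressed_topics & set(candidate.get("topics", ())):
                continue  # filter suppressed topics
            merged.append({**candidate, "namespace": namespace})
    # Merge across namespaces and return the best top_k, highest score first.
    return heapq.nlargest(top_k, merged, key=final_score)


def demo_score(candidate: Candidate) -> float:
    """Stand-in scorer: similarity scaled by the lifecycle multiplier only."""
    return candidate["similarity"] * (0.2 if candidate["state"] == "DEPRIORITISED" else 1.0)


per_namespace = {
    "episodic": [
        {"id": "e2", "state": "ARCHIVED", "topics": [], "similarity": 0.95},
        {"id": "e1", "state": "ACTIVE", "topics": ["alpine"], "similarity": 0.9},
    ],
    "project": [{"id": "p1", "state": "DEPRIORITISED", "topics": [], "similarity": 0.8}],
    "knowledge": [{"id": "k1", "state": "ACTIVE", "topics": ["crypto"], "similarity": 0.85}],
}

results = recall_pipeline(per_namespace, {"crypto"}, demo_score, top_k=2)
print([c["id"] for c in results])  # ['e1', 'p1']
```

Filtering before scoring keeps archived and muted memories from ever competing for a slot, so top_k is spent only on things you would actually want to see.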

Up and running in two minutes

01

Clone and configure

Pick a strong Valkey password. If you want RSS summaries and smarter contradiction detection, add your Anthropic API key too.

git clone https://codeberg.org/ric_harvey/omnimem.git
cd omnimem
cp .env.example .env
# edit .env: set VALKEY_PASSWORD and ANTHROPIC_API_KEY

02

Start the containers

This spins up four containers: Valkey with vector search, the MCP server, the RSS worker, and the web UI dashboard.

docker compose up -d
# Web UI at http://localhost:8080

03

Connect Claude Code

Point Claude Code at OmniMem in your MCP config, then drop the included CLAUDE.md into your project.

Add this to ~/.claude.json or .mcp.json (JSON has no comment syntax, so keep the file free of // lines):
{
  "mcpServers": {
    "omnimem": {
      "type": "sse",
      "url": "http://localhost:8765/sse"
    }
  }
}
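
If you already have other MCP servers configured, you may prefer to merge the entry in rather than paste over the file. Here is a small sketch of that, using only the Python standard library. The `add_omnimem_server` helper is hypothetical and not part of OmniMem; the demo runs against a throwaway file so no real config is touched.

```python
import json
import tempfile
from pathlib import Path


def add_omnimem_server(config_path: Path, url: str = "http://localhost:8765/sse") -> dict:
    """Merge an "omnimem" entry into an MCP config file, keeping other servers."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    config.setdefault("mcpServers", {})["omnimem"] = {"type": "sse", "url": url}
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Demo against a temporary file that already lists another (made-up) server.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / ".mcp.json"
    existing = {"mcpServers": {"other": {"type": "sse", "url": "http://example.invalid/sse"}}}
    path.write_text(json.dumps(existing))
    merged = add_omnimem_server(path)
    print(sorted(merged["mcpServers"]))  # ['omnimem', 'other']
```

To use it for real, call `add_omnimem_server(Path(".mcp.json"))` from your project directory.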
Pro tip

OmniMem's instructions are injected automatically when the MCP server connects to Claude Code. To customise how Claude uses OmniMem across all your projects, add your own overrides to ~/.claude/CLAUDE.md.