
Agent Memory

Python · Persistent Memory · Apache 2.0

Mem0

by Mem0 AI · mem0ai/mem0

Mem0 is a persistent memory layer for AI agents and LLM applications. It solves the stateless nature of LLMs by extracting important facts from conversations — user preferences, decisions, relationships, and context — and storing them in a semantic vector index. On every new interaction, relevant memories are retrieved and injected into the prompt automatically.

Available as an open-source Python library and a managed cloud platform, Mem0 integrates with any LLM (OpenAI, Anthropic, Gemini, Ollama) and any vector database (Qdrant, Pinecone, Chroma, pgvector). Native integrations exist for LangChain, CrewAI, AutoGen, and LlamaIndex.

Memory Type: Semantic (vector embeddings)
LLM Support: Any (OpenAI, Claude, local...)
Vector Stores: 10+ (Qdrant, Pinecone, pgvector...)
License: Apache 2.0 (open source)

Quick Install

pip install mem0ai

Key Features

Semantic Memory Extraction

Automatically distills important facts from conversations using an LLM extraction step. Instead of storing raw message history, Mem0 stores atomic facts like "prefers Python", "has 2 children", or "uses dark mode" — compact, queryable, and always relevant.

Cross-Session Persistence

Memories survive between conversations, sessions, and app restarts. Users get an AI that genuinely remembers them — their preferences, past problems solved, decisions made — without developers building custom memory schemas.

Any LLM Backend

Configure Mem0 with OpenAI, Anthropic, Google Gemini, Groq, Ollama, or any LiteLLM-compatible provider. Both the memory extraction LLM and the embedding model are independently configurable.

Pluggable Vector Stores

Store memories in Qdrant, Pinecone, Chroma, Weaviate, Milvus, Redis, pgvector, or MongoDB Atlas. Use whatever vector database already exists in your infrastructure — no lock-in.

Multi-Level Memory Scopes

Mem0 supports user-level memory (per individual), session-level memory (per conversation), agent-level memory (shared across all users), and run-level memory (ephemeral per task). Mix scopes to build nuanced memory hierarchies.
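
A minimal sketch of how these scopes might be expressed in code. The helper scope_kwargs is hypothetical and not part of Mem0; it only assembles the user_id, agent_id, and run_id keyword arguments that Mem0's add() and search() calls accept. How session-level scope maps onto these identifiers may differ between versions, so treat the mapping as an assumption.

```python
from typing import Optional


def scope_kwargs(
    user_id: Optional[str] = None,
    agent_id: Optional[str] = None,
    run_id: Optional[str] = None,
) -> dict:
    """Build the identifier kwargs for one memory scope (hypothetical helper)."""
    scope = {"user_id": user_id, "agent_id": agent_id, "run_id": run_id}
    # Drop unset identifiers so only the intended scope is passed along.
    scope = {key: value for key, value in scope.items() if value is not None}
    if not scope:
        raise ValueError("at least one of user_id, agent_id, run_id is required")
    return scope


# Per-user memory that follows "alice" across conversations.
user_scope = scope_kwargs(user_id="alice")
# Memory tied to a single task run for the same user (run id is made up).
run_scope = scope_kwargs(user_id="alice", run_id="ticket-1042")

# Intended use, assuming a configured mem0 Memory instance named m:
#   m.add("I prefer TypeScript", **user_scope)
#   m.search("language preference", **run_scope)
```

Keeping scope construction in one place makes it harder to accidentally write a memory without any identifier attached.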

Framework Integrations

Native integrations for LangChain (BaseChatMessageHistory), CrewAI (shared crew memory), AutoGen (ConversableAgent memory), and LlamaIndex (memory module). Drop Mem0 into existing agent pipelines with minimal changes.

Execution Brief

Use this page as a rollout checklist, not just reference text.


Tool Mapping Lens

Organize Tools by Workflow Phase

Catalog-oriented pages work best when users can map discovery, evaluation, and rollout in a clear path instead of reading an undifferentiated list.

  • Define the job-to-be-done first
  • Group tools by stage
  • Prioritize by adoption friction

Actionable Utility Module

Skill Implementation Board

Use this board for Mem0 before rollout. Capture inputs, apply one decision rule, execute the checklist, and log outcome.

Input: Objective

Deliver one measurable improvement using Mem0 as a persistent, cross-session memory layer for an LLM agent.

Input: Baseline Window

20-30 minutes

Input: Fallback Window

8-12 minutes

Decision Trigger | Action | Expected Output
Input: one workflow objective and release owner are defined | Run preview execution with fixed acceptance criteria. | Go or hold decision backed by repeatable evidence.
Input: output quality below baseline or retries increase | Limit scope, isolate root issue, and rerun controlled test. | One confirmed correction path before wider rollout.
Input: checks pass for two consecutive replay windows | Promote to broader traffic with fallback path active. | Stable rollout with low operational surprise.

Execution Steps

  1. Record objective, owner, and stop condition.
  2. Execute one controlled preview run.
  3. Measure quality, latency, and correction burden.
  4. Promote only when pass criteria are stable.

Output Template

tool=mem0
objective=
preview_result=pass|fail
primary_metric=
next_step=rollout|patch|hold

What Is Mem0?

Mem0 is an open-source memory infrastructure layer designed to give AI agents and LLM-powered applications the ability to remember. Without memory, every conversation with an AI starts from scratch — the agent has no knowledge of who the user is, what they prefer, or what was discussed before. Mem0 solves this by acting as a semantic long-term memory store that lives outside the LLM context window and persists across sessions.

The core mechanism is fact extraction: when a conversation ends (or in real-time during a conversation), Mem0 passes the messages through a configurable LLM that identifies memorable facts — user preferences, key decisions, important relationships, stated goals. These facts are embedded as vectors and stored in a database. On the next interaction, Mem0 retrieves semantically relevant memories based on the current conversation and injects them into the system prompt, giving the LLM accurate context without flooding it with raw history.
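
To make the store-then-retrieve half of this mechanism concrete, here is a toy sketch using only the standard library. It is not Mem0's implementation: the LLM extraction step is skipped (facts are supplied already extracted), the "embedding" is a plain word count instead of a model, and ToyMemoryStore is a made-up class. It only illustrates embedding facts, filtering by user, and ranking by cosine similarity.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a real system uses a model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyMemoryStore:
    """In-process stand-in for a vector store, keyed by user."""

    def __init__(self):
        self._facts = []  # (user_id, fact text, vector)

    def add(self, fact: str, user_id: str) -> None:
        self._facts.append((user_id, fact, embed(fact)))

    def search(self, query: str, user_id: str, limit: int = 3) -> list:
        query_vec = embed(query)
        scored = [
            (cosine(query_vec, vec), fact)
            for uid, fact, vec in self._facts
            if uid == user_id
        ]
        # Keep only facts with some overlap, best match first.
        scored = [pair for pair in scored if pair[0] > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [fact for _, fact in scored[:limit]]


store = ToyMemoryStore()
store.add("prefers TypeScript for backend work", user_id="alice")
store.add("uses dark mode", user_id="alice")
store.add("prefers Python for backend work", user_id="bob")

print(store.search("which backend language?", user_id="alice"))
# prints ['prefers TypeScript for backend work']
```

Word overlap is a crude proxy for meaning; a real embedding model would also match queries that share no literal words with the stored fact.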

Mem0 is built for flexibility. The LLM used for extraction, the embedding model, and the vector database are all independently configurable. Supported LLMs include GPT-4o, Claude, Gemini, Llama (via Ollama), and any OpenAI-compatible API. Supported vector stores include Qdrant, Pinecone, Chroma, Weaviate, pgvector, and more. This architecture means Mem0 slots into existing stacks without forcing migrations to new infrastructure.

Beyond the open-source library, Mem0 offers a managed cloud platform at app.mem0.ai that handles storage, search, and scaling automatically. The platform exposes a REST API compatible with the Python SDK, making it easy to move from self-hosted to cloud as usage grows. A free tier supports up to 50 users, making it accessible for early-stage products. Both deployment modes share the same add/search/get API surface, so switching between them requires only a configuration change.

How to Get Better Results with Mem0

Install Mem0 and set your LLM API key: pip install mem0ai. Export OPENAI_API_KEY (or whichever LLM you plan to use). Mem0 defaults to OpenAI for both extraction and embedding, so no additional config is needed for a quick start.

Add memories from a conversation: from mem0 import Memory; m = Memory(); result = m.add("I prefer TypeScript over Python for backend work", user_id="alice"). Mem0 will extract the preference fact and store it. You can pass full conversation message arrays in the same format as OpenAI chat messages.

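Retrieve relevant memories before a new LLM call: memories = m.search("tell me about the project", user_id="alice"). This returns the matching memories as structured results rather than plain strings (in recent releases, a dict whose "results" list holds entries with a "memory" text field; the exact shape varies by version). Extract the text and prepend it to your system prompt. The LLM then has accurate context about Alice without seeing all her past messages.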
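
A small sketch of the "prepend to your system prompt" step. The helper memories_to_prompt is hypothetical, and the input shapes it accepts are assumptions: search output may arrive as a dict with a "results" list or as a bare list, and each item may be a dict with a "memory" field or a plain string. Check what your installed version returns before relying on it.

```python
def memories_to_prompt(search_result, base_instructions: str) -> str:
    """Render search output as a system prompt (hypothetical helper)."""
    # Assumption: either {"results": [...]} or a bare list of items.
    if isinstance(search_result, dict):
        items = search_result.get("results", [])
    else:
        items = search_result or []

    facts = []
    for item in items:
        # Assumption: dict items carry their text under "memory".
        text = item.get("memory") if isinstance(item, dict) else item
        if text:
            facts.append(str(text))

    if not facts:
        # Nothing remembered yet: fall back to the plain instructions.
        return base_instructions

    bullet_list = "\n".join(f"- {fact}" for fact in facts)
    return f"{base_instructions}\n\nKnown facts about this user:\n{bullet_list}"


sample = {"results": [{"memory": "Prefers TypeScript for backend work"}]}
prompt = memories_to_prompt(sample, "You are a helpful assistant.")
```

Formatting memories as a short bullet list, rather than interpolating the raw result object, keeps scores and ids out of the prompt.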

For production, configure a persistent vector store: pass a config such as {"vector_store": {"provider": "qdrant", "config": {"host": "localhost", "port": 6333}}} to Memory.from_config(). For the managed cloud, use the hosted client instead: from mem0 import MemoryClient; client = MemoryClient(api_key="your-mem0-key"), and all storage is handled automatically. Scope memories by agent_id in addition to user_id for multi-agent applications where different agents should share or isolate their memory pools.
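
The Qdrant settings above can be kept in a plain dictionary and handed to Mem0 in one place. This is a sketch under assumptions: build_qdrant_config and make_memory are hypothetical helper names, the host and port are the local defaults quoted in the text, and the config is assumed to be passed through Memory.from_config. The mem0 import is deferred so the config can be built and inspected without the library installed.

```python
def build_qdrant_config(host: str = "localhost", port: int = 6333) -> dict:
    """Return a Mem0-style config pointing at a Qdrant instance."""
    return {
        "vector_store": {
            "provider": "qdrant",
            "config": {"host": host, "port": port},
        }
    }


def make_memory(config: dict):
    """Create a self-hosted Memory from a config dict.

    Requires mem0ai to be installed and an LLM API key in the environment.
    """
    # Deferred import: building the config does not need mem0 itself.
    from mem0 import Memory

    return Memory.from_config(config)


config = build_qdrant_config()
# memory = make_memory(config)  # uncomment once Qdrant and API keys are ready
```

Keeping the config as data makes it straightforward to load different hosts per environment from settings files.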

Treat this page as a decision map. Build a shortlist fast, then run a focused second pass for security, ownership, and operational fit.

When a team keeps one shared selection rubric, tool adoption speeds up because evaluators stop debating criteria every time a new option appears.

Worked Examples

Building a personalized customer support agent with persistent memory

  1. Initialize Mem0 with your vector store: m = Memory.from_config({"vector_store": {"provider": "qdrant", ...}})
  2. At the start of each support session, retrieve user memories: past = m.search(user_query, user_id=customer_id)
  3. Inject memories into system prompt: system = f"You are a support agent. What you know about this user: {past}"
  4. After the conversation, store new facts: m.add(conversation_messages, user_id=customer_id)
  5. On the next session, the agent knows the customer's plan, past issues, preferences, and unresolved problems
  6. Customers never repeat themselves — the agent greets them by name and picks up where they left off

Outcome: A support agent that remembers every customer across unlimited sessions. Support quality improves over time as the memory store grows richer, and customers experience continuity that feels genuinely personal.
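
The steps above can be sketched as a single function that wraps one support turn. This is an illustration, not a reference implementation: run_support_turn, InMemoryStandIn, and echo_llm are hypothetical names, the llm argument is any callable that takes chat messages and returns a string, and the memory argument is anything exposing search(query, user_id=...) and add(messages, user_id=...) as used in the steps. The stand-in class exists only so the flow can be dry-run without Mem0 or an LLM.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def run_support_turn(
    memory,
    llm: Callable[[List[Message]], str],
    customer_id: str,
    user_query: str,
) -> str:
    """Retrieve memories, call the LLM, then store the new exchange."""
    past = memory.search(user_query, user_id=customer_id)
    system = f"You are a support agent. What you know about this user: {past}"
    messages = [
        {"role": "system", "content": system},
        {"role": "user", "content": user_query},
    ]
    reply = llm(messages)
    # Store both sides of the exchange so later sessions can recall it.
    memory.add(
        [
            {"role": "user", "content": user_query},
            {"role": "assistant", "content": reply},
        ],
        user_id=customer_id,
    )
    return reply


class InMemoryStandIn:
    """Dry-run stand-in with the same add/search call shape (not Mem0)."""

    def __init__(self):
        self.stored = []  # (user_id, text)

    def search(self, query, user_id):
        # No ranking here: simply return everything stored for the user.
        return [text for uid, text in self.stored if uid == user_id]

    def add(self, messages, user_id):
        for msg in messages:
            if msg["role"] == "user":
                self.stored.append((user_id, msg["content"]))


def echo_llm(messages: List[Message]) -> str:
    """Placeholder LLM that reports how many messages it received."""
    return f"Seen {len(messages)} messages"


memory = InMemoryStandIn()
first_reply = run_support_turn(
    memory, echo_llm, "cust-1", "My plan is Pro and login fails"
)
```

Swapping the stand-in for a real memory object and echo_llm for a real model call leaves run_support_turn itself unchanged, which is the point of treating memory as a pre/post wrapper.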

Adding memory to a LangChain agent with minimal code changes

  1. Install: pip install mem0ai langchain-openai
  2. Create the memory store: from mem0 import MemoryClient; client = MemoryClient(api_key="...")
  3. Before each agent invocation, fetch context: memories = client.search(input, user_id=user_id)
  4. Prepend to the system message: SystemMessage(content=f"Relevant context: {memories}\n\nYour instructions: ...")
  5. After the invocation, store new facts: client.add([{"role": "user", "content": input}, {"role": "assistant", "content": output}], user_id=user_id)
  6. No changes to the agent graph, tools, or chain logic — memory is a pre/post wrapper

Outcome: An existing LangChain agent gains persistent cross-session memory in under 20 lines of new code. The agent's tools and logic remain unchanged; Mem0 handles the memory extraction and retrieval transparently.

Frequently Asked Questions

What is Mem0?

Mem0 (pronounced "mem-zero") is an open-source memory layer for AI agents and LLM-powered applications. It solves a fundamental limitation of LLMs: they have no persistent memory between conversations. Mem0 extracts important information from conversations — user preferences, facts, past decisions, relationships — and stores them in a vector database. On subsequent interactions, it retrieves relevant memories and injects them into the context, making the AI feel like it genuinely remembers the user. Mem0 works with any LLM (OpenAI, Anthropic, Gemini, Ollama) and any vector store (Qdrant, Pinecone, Chroma, pgvector), and is available as both an open-source library and a managed cloud service.

How does Mem0 extract and retrieve memories?

Mem0 uses a multi-step pipeline. During memory creation, it passes the conversation through an LLM extraction step that identifies facts worth remembering — things like "user prefers Python over JavaScript" or "user has a dog named Max." These facts are embedded as vectors and stored in your configured vector database, tagged with user or agent IDs. During retrieval, Mem0 embeds the current query or conversation and performs semantic similarity search to find the most relevant past memories. You can also retrieve all memories for a user, search by metadata, or get memories filtered by date range. The result is a list of relevant facts that can be prepended to the system prompt.

What vector databases does Mem0 support?

Mem0 supports a wide range of vector databases through its pluggable backend architecture. Supported options include Qdrant, Pinecone, Chroma, Weaviate, Milvus, Redis (with vector search), pgvector (PostgreSQL), MongoDB Atlas Vector Search, and Elasticsearch. For the managed cloud version (Mem0 Platform), storage is handled automatically. For self-hosted deployments, you configure the vector store in the Mem0 config object. This flexibility means you can use whichever vector database already exists in your infrastructure without adopting a new service purely for memory.

How is Mem0 different from just storing conversation history in a database?

Storing raw conversation history and storing semantic memories are fundamentally different. Raw history grows linearly with every message and quickly exceeds context window limits — a user who has chatted 100 times would have thousands of messages that cannot fit in a single prompt. Mem0 distills conversations into atomic facts ("user is vegetarian", "user lives in Berlin", "user prefers dark mode") that are compact and semantically indexed. At query time, only the memories relevant to the current question are retrieved — typically 5-20 facts — regardless of how many total memories exist. This keeps context lean and relevant rather than bloated and noisy.

Can Mem0 be used with LangChain, CrewAI, or AutoGen?

Yes. Mem0 provides native integrations with major agent frameworks. The LangChain integration wraps Mem0 as a BaseChatMessageHistory or a retriever, making it drop-in compatible with existing LangChain chains and agents. For CrewAI, Mem0 can be used as a shared memory store that persists knowledge across crew task runs. For AutoGen, Mem0 connects as a memory backend for ConversableAgent. There is also a LlamaIndex integration that wraps Mem0 as a memory module for index-based pipelines. The Python SDK's simple add() and search() interface makes it easy to integrate with any custom agent framework that is not yet officially supported.

What is the difference between Mem0 open-source and Mem0 Platform?

Mem0 open-source (pip install mem0ai) is a self-hosted Python library where you supply your own LLM API keys, vector database, and infrastructure. You have full control and there are no usage limits beyond your own infrastructure costs. Mem0 Platform is the managed cloud service at app.mem0.ai — it handles storage, search, and scaling automatically, exposes a REST API, and provides a dashboard for browsing and managing memories. The Platform offers a free tier (up to 50 users, limited memory operations per month) and paid plans for higher volume. For production applications with many users, the Platform removes operational overhead; for privacy-sensitive or self-hosted deployments, the open-source library is the right choice.
