Agent Infrastructure

Agent Memory

Agent memory is the infrastructure that lets AI agents accumulate knowledge, remember past decisions, and personalize behavior across sessions. Without it, every agent run starts from zero—no awareness of prior errors, no stored preferences, no continuity across days of work.

This page covers the four core memory types used in production AI agents, the best plugins for adding memory to Claude Code and other agent frameworks, and the anti-patterns that make memory systems fail.

At a glance:

  • In-Context Memory (short-term)
  • Key-Value Storage (long-term, exact lookup)
  • Semantic / Vector Memory (long-term, fuzzy retrieval)
  • Episodic Memory (long-term, event log)

Memory Architecture

The Four Types of AI Agent Memory

In-Context Memory

Short-term

Everything currently inside the active context window. No setup required—it exists naturally in any LLM interaction. Disappears completely when the session ends.

Retrieval

Automatic (LLM attention)

Best for

Within-session reasoning, current task details

Example

The agent remembers earlier steps in the same conversation without needing external storage.

Key-Value Storage

Long-term, exact lookup

Named slots for facts, preferences, and decisions. Fast read/write. Best when you know exactly what you will look up. The Memory MCP server implements this pattern.

Retrieval

Exact key lookup

Best for

User preferences, project facts, stored decisions

Example

Agent stores "preferred_test_framework=vitest" and reads it back at the start of every coding session.
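The pattern can be sketched in a few lines of Python (a minimal illustration, not the Memory MCP server's actual implementation; the file name and class are made up):

```python
import json
from pathlib import Path

class KeyValueMemory:
    """Minimal file-backed key-value memory. Illustrative sketch only."""

    def __init__(self, path="agent_memory.json"):
        self.path = Path(path)
        # Reload persisted entries so memory survives across sessions.
        self.store = json.loads(self.path.read_text()) if self.path.exists() else {}

    def set(self, key, value):
        self.store[key] = value
        self.path.write_text(json.dumps(self.store, indent=2))  # persist every write

    def get(self, key, default=None):
        return self.store.get(key, default)

memory = KeyValueMemory()
memory.set("preferred_test_framework", "vitest")
print(memory.get("preferred_test_framework"))  # vitest
```

A new session constructs `KeyValueMemory` again, reads the same file, and recovers every stored fact by exact key.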

Semantic / Vector Memory

Long-term, fuzzy retrieval

Text chunks converted to embeddings stored in a vector index. Retrieval by nearest-neighbor similarity. Works when queries are natural language phrases rather than exact keys.

Retrieval

Semantic similarity search

Best for

Documentation retrieval, Q&A over large knowledge bases

Example

Agent searches "how did we handle authentication last sprint" and retrieves relevant session notes.
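In outline, semantic retrieval is embed, index, then rank by similarity. The sketch below swaps the learned embedding model for a toy bag-of-words counter so it runs standalone; a real system would use an embedding API and a vector database such as Chroma, and the note texts are invented:

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding'; production systems use a learned model."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

notes = [
    "sprint 12: switched authentication to OAuth device flow",
    "sprint 12: fixed flaky payment webhook retries",
]
index = [(note, embed(note)) for note in notes]  # precomputed vectors

def search(query, k=1):
    # Rank stored notes by nearest-neighbor similarity to the query vector.
    q = embed(query)
    return sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)[:k]

best, _ = search("how did we handle authentication last sprint")[0]
print(best)  # the authentication note ranks first
```

The point of the sketch is the retrieval shape: queries are free-form phrases, and the store returns the closest chunks rather than requiring an exact key.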

Episodic Memory

Long-term, event log

Append-only log of past agent actions and outcomes. Lets agents review their own history to avoid repeating errors or to report on completed work accurately.

Retrieval

Chronological or filtered log scan

Best for

Error avoidance, progress tracking, audit trails

Example

Agent reviews log of failed deployment attempts before choosing a deployment strategy.
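An episodic log of this shape can be kept in SQLite with Python's standard library (a minimal sketch; the table schema and helper names are illustrative, and a real agent would use a file path instead of an in-memory database):

```python
import sqlite3

# Append-only episodic log: one row per action, never updated in place.
db = sqlite3.connect(":memory:")  # use a file path for persistence across sessions
db.execute("""CREATE TABLE IF NOT EXISTS episodes (
    ts TEXT DEFAULT CURRENT_TIMESTAMP,
    action TEXT,
    outcome TEXT,
    error TEXT)""")

def log(action, outcome, error=None):
    db.execute("INSERT INTO episodes (action, outcome, error) VALUES (?, ?, ?)",
               (action, outcome, error))
    db.commit()

def similar_failures(action_prefix):
    """Scan the log for past failures before retrying a risky operation."""
    return db.execute(
        "SELECT ts, action, error FROM episodes "
        "WHERE outcome = 'failure' AND action LIKE ? ORDER BY ts",
        (action_prefix + "%",)).fetchall()

log("deploy staging", "failure", "missing DATABASE_URL")
log("deploy staging", "success")
print(similar_failures("deploy"))
```

Before a new deployment attempt, the agent calls `similar_failures("deploy")` and sees the earlier `DATABASE_URL` error instead of rediscovering it.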

Plugins

Best Agent Memory Plugins for Claude Code

Memory MCP Server

Key-Value

Official MCP key-value store. Persistent across sessions, file-backed. Zero config required.

Install

npx -y @modelcontextprotocol/server-memory

EvoMap MCP Server

Graph / State

Structured project-state graph. Tracks tasks, dependencies, and progress for long-running agent sessions.

Install

npx -y evomap-mcp@latest

Chroma MCP Server

Vector (Semantic)

Connects agents to a local Chroma vector database for semantic retrieval over large document sets.

Install

pip install chromadb && npx -y @modelcontextprotocol/server-chroma

SQLite MCP Server

Relational / Episodic

Lightweight local database. Use for structured episodic logs, preference tables, or knowledge schemas.

Install

npx -y @modelcontextprotocol/server-sqlite --db-path ./agent_memory.db

Anti-Patterns

Agent Memory Anti-Patterns to Avoid

Storing everything in context

Appending all history to the system prompt forces every call to reprocess the full log, so token cost climbs with each turn and the context window fills quickly. Move stable facts to key-value storage and retrieve only what is relevant to the current task.

No retrieval strategy

Dumping all stored memories into every prompt defeats the purpose of external storage. Use exact-key lookup for named facts and vector search for open queries.

Memory without expiry

Stale memory is worse than no memory. Outdated preferences or deprecated decisions mislead agents. Build explicit TTL (time to live) or review cycles into any persistent store.
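A read-time TTL check takes only a few lines (illustrative sketch; the 30-day window is an arbitrary placeholder to tune):

```python
import time

TTL_SECONDS = 30 * 24 * 3600  # expire anything older than ~30 days (illustrative)

store = {}  # key -> (value, written_at)

def remember(key, value):
    store[key] = (value, time.time())  # timestamp every write

def recall(key):
    entry = store.get(key)
    if entry is None:
        return None
    value, written_at = entry
    if time.time() - written_at > TTL_SECONDS:
        del store[key]  # drop stale entries instead of serving them
        return None
    return value
```

Expiry on read keeps the check cheap; a periodic sweep works equally well when reads are rare.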

Single memory layer for all purposes

Different memory needs different storage. Using only key-value storage for documentation retrieval forces brittle exact-key lookups. Match storage type to retrieval pattern.

Execution Brief

Use this page as a rollout checklist, not just reference text.

Tool Mapping Lens

Organize Tools by Workflow Phase

Catalog-oriented pages work best when users can map discovery, evaluation, and rollout in a clear path instead of reading an undifferentiated list.

  • Define the job-to-be-done first
  • Group tools by stage
  • Prioritize by adoption friction

Actionable Utility Module

Skill Implementation Board

Use this board for Agent Memory before rollout. Capture inputs, apply one decision rule, execute the checklist, and log outcome.

Inputs:

  • Objective: deliver one measurable improvement with agent memory
  • Baseline window: 20-30 minutes
  • Fallback window: 8-12 minutes

Decision Rules

Trigger: one workflow objective and a release owner are defined.
Action: run a preview execution with fixed acceptance criteria.
Expected output: a go or hold decision backed by repeatable evidence.

Trigger: output quality falls below baseline or retries increase.
Action: limit scope, isolate the root issue, and rerun a controlled test.
Expected output: one confirmed correction path before wider rollout.

Trigger: checks pass for two consecutive replay windows.
Action: promote to broader traffic with the fallback path active.
Expected output: a stable rollout with low operational surprise.

Execution Steps

  1. Record objective, owner, and stop condition.
  2. Execute one controlled preview run.
  3. Measure quality, latency, and correction burden.
  4. Promote only when pass criteria are stable.

Output Template

tool=agent memory
objective=
preview_result=pass|fail
primary_metric=
next_step=rollout|patch|hold

What Is Agent Memory?

Agent memory refers to the storage and retrieval systems that allow AI agents to maintain knowledge across interactions. Without memory, an LLM-based agent is stateless—it processes each request without any awareness of past sessions, user preferences, or accumulated project context. Memory transforms a stateless assistant into a persistent working partner that improves with use.

The term "AI agent memory" covers several distinct mechanisms. In-context memory is the information currently inside the active prompt window. External memory—stored in databases, files, or vector indexes—persists across sessions. Some agent frameworks further distinguish episodic memory (event logs), semantic memory (embedded knowledge), and procedural memory (stored workflows or skill definitions). Choosing the right mechanism for each use case is the core engineering decision.

Memory architecture matters because retrieval quality directly affects agent behavior. An agent that retrieves outdated decisions makes outdated suggestions. An agent that cannot retrieve past errors will repeat them. Good memory design means choosing appropriate storage for the retrieval pattern, setting expiry or review cycles, and keeping retrieved context focused rather than dumping all stored data into every prompt.

How to Get Better Results with Agent Memory

Start by identifying what your agent needs to remember and for how long. Preferences and project facts that never expire belong in key-value storage. Documentation and natural-language notes belong in vector search. Event histories and decision logs belong in append-only relational tables. Most agents need two or three storage types rather than one.

Install the matching MCP plugin and test retrieval quality before building production workflows. For the Memory MCP server, verify that writes and reads work correctly in a live Claude Code session using /tools and a simple test entry. For vector stores, run a few test queries to confirm semantic similarity returns relevant results at your expected query phrasing.

Build explicit memory hygiene into your agent design. Define a maximum number of stored entries per category. Add timestamps to every write so stale items can be identified. Establish a review cycle—monthly or per-project—where outdated memory is pruned or updated. Agents that accumulate memory without pruning gradually degrade as old facts contradict current reality.
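These hygiene rules can be expressed as a single prune pass (a sketch under assumed limits; `MAX_PER_CATEGORY` and the 30-day review window are placeholders to tune, and the entry format is invented):

```python
from datetime import datetime, timedelta, timezone

MAX_PER_CATEGORY = 50          # cap on stored entries per category (placeholder)
REVIEW_AFTER = timedelta(days=30)  # review window (placeholder)

def prune(entries):
    """entries: dicts with 'category', 'written_at' (aware datetime), 'value'.
    Drops anything past the review window, then keeps only the newest
    MAX_PER_CATEGORY entries in each category."""
    now = datetime.now(timezone.utc)
    fresh = [e for e in entries if now - e["written_at"] <= REVIEW_AFTER]
    by_category = {}
    for e in sorted(fresh, key=lambda e: e["written_at"], reverse=True):
        by_category.setdefault(e["category"], []).append(e)
    return [e for bucket in by_category.values() for e in bucket[:MAX_PER_CATEGORY]]
```

Running a pass like this monthly, or at project boundaries, keeps old facts from contradicting current reality.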

Treat this page as a decision map. Build a shortlist fast, then run a focused second pass for security, ownership, and operational fit.

When a team keeps one shared selection rubric, tool adoption speeds up because evaluators stop debating criteria every time a new option appears.

Worked Examples

Example 1: Persistent coding preferences across sessions

  1. Developer uses the Memory MCP server to store project conventions: test runner, linting rules, deployment script path.
  2. At the start of each session, Claude reads stored keys and applies the conventions without asking again.
  3. When conventions change, developer writes new values and the old behavior stops immediately.

Outcome: Onboarding friction for each session drops to near zero and convention drift stops.

Example 2: Semantic retrieval over documentation

  1. Team embeds 200 internal documentation pages into a local Chroma vector store.
  2. Claude Code queries the vector store with natural language questions during coding tasks.
  3. Retrieved document chunks appear in context alongside the code, so Claude can answer without hallucinating API details.

Outcome: Hallucination rate on internal API questions drops because Claude has accurate documentation context rather than training-data approximations.

Example 3: Episodic log for debugging recurring failures

  1. Agent writes each failed command and its error output to a SQLite episodic log.
  2. Before attempting a risky operation, Claude reads the log for similar failures.
  3. Claude adjusts the approach based on previous failure patterns before executing.

Outcome: Repeated failure cycles for known error classes are eliminated because the agent learns from its own history.

Frequently Asked Questions

What is agent memory?

Agent memory is any mechanism that lets an AI agent retain and retrieve information beyond the current context window. This includes in-session scratchpads, persistent key-value stores, vector databases for semantic search, and structured knowledge graphs. Without memory, agents start every session with no knowledge of past decisions, user preferences, or completed work.

What is the difference between short-term and long-term agent memory?

Short-term memory lives in the active context window and disappears when the session ends. Long-term memory is stored externally—in a database, file, or vector index—and persists across sessions. Most production agents need both: short-term for within-task reasoning and long-term for continuity across days or weeks of work.

What is AI agent memory?

AI agent memory is the set of storage and retrieval systems that give AI agents access to past observations, decisions, and facts. Unlike a single LLM call that starts fresh every time, an agent with memory can accumulate knowledge, avoid repeating mistakes, and personalize behavior based on prior interactions.

How does the Memory MCP server work?

The Memory MCP server (from the official Model Context Protocol server list) runs as a local key-value store. Agents can write named entries, read them back, and list all stored keys. It is session-persistent by default and can be backed by a file so data survives process restarts. Install with: npx -y @modelcontextprotocol/server-memory

What is episodic memory in AI agents?

Episodic memory stores sequences of past events—actions taken, outcomes observed, and errors encountered—in a retrievable log. An agent with episodic memory can recall "last Tuesday I tried approach X and it failed because of Y" and avoid repeating the same failure. It is especially useful for debugging agents and long-running automation workflows.

Can Claude Code have persistent memory?

Yes. Claude Code supports persistent memory through the Memory MCP server plugin, which stores key-value pairs that survive across sessions. You can also use file-based memory by writing a structured CLAUDE.md or project notes file and instructing Claude to read it at session start. For vector search, connect a Chroma or Qdrant MCP server.

How do I choose the right memory type for my agent?

Match memory type to retrieval pattern. Use key-value storage for facts you will look up by exact name. Use vector search (semantic memory) when queries are natural language phrases. Use append-only logs for episodic history. Use a graph database when relationships between entities matter. Most agents start with key-value storage and add vector search once retrieval quality becomes a bottleneck.
