MCP Server

Vector Database · RAG Memory · Apache 2.0

Qdrant MCP Server

by Qdrant · qdrant/mcp-server-qdrant

Qdrant MCP Server gives your AI coding assistant a persistent semantic memory. Instead of losing context between sessions, the agent can store embeddings in Qdrant and retrieve them via similarity search — turning a stateless chat into a long-lived knowledge worker with access to everything it has read before.

Qdrant is open-source, written in Rust, and runs in a single Docker container for local development or scales to billions of vectors on Qdrant Cloud. The MCP server exposes upsert, search, scroll, and collection management as tools your agent can call directly.

Tools Exposed: 6+ (upsert, search, filter...)
Setup: <3min (docker + uvx)
Hosting: Self/Cloud (Docker or managed)
License: Apache 2.0 (open source)

Quick Install

claude mcp add qdrant -- uvx mcp-server-qdrant

Key Features

Upsert Embeddings

Store vectors with arbitrary JSON payloads. The agent can index documents, code snippets, or conversation history with rich metadata for later filtering.

Semantic Search

Query by vector similarity with cosine, dot-product, or Euclidean distance. Returns top-K nearest neighbors with scores and payloads in a single call.
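
The three metrics can rank the same points differently, which is why the choice matters when a collection is created. The following is a minimal pure-Python sketch of that idea, not Qdrant's implementation; the metric labels ("cosine", "dot", "euclid"), the top_k helper, and the sample points are all illustrative assumptions.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Similarity scores: larger means closer.
SIMILARITIES = {"cosine": cosine, "dot": dot}

def top_k(query, points, k, metric="cosine"):
    """Return the k best (score, payload) pairs for the chosen metric."""
    if metric == "euclid":
        # Euclidean is a distance: smaller is better, so sort ascending.
        scored = [(euclidean(query, vec), payload) for vec, payload in points]
        return sorted(scored, key=lambda s: s[0])[:k]
    score = SIMILARITIES[metric]
    scored = [(score(query, vec), payload) for vec, payload in points]
    return sorted(scored, key=lambda s: s[0], reverse=True)[:k]

# Hypothetical 2-dimensional points with payloads.
points = [
    ([1.0, 0.0], {"doc": "a"}),
    ([0.6, 0.8], {"doc": "b"}),
    ([0.0, 3.0], {"doc": "c"}),
]
query = [0.0, 1.0]

for metric in ("cosine", "dot", "euclid"):
    print(metric, [p["doc"] for _, p in top_k(query, points, 2, metric)])
```

With these sample points, cosine and dot-product both rank "c" first, while Euclidean distance ranks "b" first, because "c" points in the same direction as the query but lies farther away.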

Hybrid Filters

Combine vector similarity with payload filters (must / should / must_not). Ask "find similar docs tagged prod and dated after Jan 2025" in one query.
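
To make the must / should / must_not semantics concrete, here is a toy in-memory evaluator. It is a simplified model under stated assumptions, not Qdrant's filter engine: the condition shape ({"key", "match"} or {"key", "gt"}), the payload fields tag and date, and the reading of "after Jan 2025" as "later than 2025-01-31" are all hypothetical choices made for illustration.

```python
def matches(payload, condition):
    """Evaluate one hypothetical condition against a payload dict."""
    value = payload.get(condition["key"])
    if value is None:
        return False
    if "match" in condition:
        return value == condition["match"]
    if "gt" in condition:
        return value > condition["gt"]
    raise ValueError(f"unsupported condition: {condition}")

def passes(payload, flt):
    """must = all hold, should = at least one holds, must_not = none hold."""
    must = flt.get("must", [])
    should = flt.get("should", [])
    must_not = flt.get("must_not", [])
    return (
        all(matches(payload, c) for c in must)
        and (not should or any(matches(payload, c) for c in should))
        and not any(matches(payload, c) for c in must_not)
    )

# Hypothetical payloads; ISO dates compare correctly as strings.
docs = [
    {"id": 1, "tag": "prod", "date": "2025-03-10"},
    {"id": 2, "tag": "prod", "date": "2024-11-02"},
    {"id": 3, "tag": "staging", "date": "2025-04-01"},
]
flt = {"must": [
    {"key": "tag", "match": "prod"},
    {"key": "date", "gt": "2025-01-31"},
]}

kept = [d["id"] for d in docs if passes(d, flt)]
print(kept)
```

In a real hybrid query the filter narrows the candidate set and the vector similarity ranks what remains; this sketch only covers the filtering half.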

Collection Management

Create, list, and delete collections from the agent. Configure vector size, distance metric, and HNSW parameters without leaving your IDE.

Scroll & Pagination

Walk entire collections with cursored scroll. Useful for agent workflows that re-embed documents, deduplicate, or audit stored payloads.
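
Cursored scroll can be sketched as a loop that keeps requesting pages until the store stops returning a cursor. The example below is an in-memory stand-in, assuming a scroll call that returns a batch plus the ID at which the next page starts; the scroll and walk helpers and the sample points are hypothetical, and a real workflow would call the server's scroll tool instead.

```python
def scroll(points, offset=None, limit=2):
    """Return (batch, next_offset); next_offset is None on the last page."""
    ordered = sorted(points, key=lambda p: p["id"])
    if offset is not None:
        # The cursor is the ID of the first point on the next page.
        ordered = [p for p in ordered if p["id"] >= offset]
    batch = ordered[:limit]
    next_offset = ordered[limit]["id"] if len(ordered) > limit else None
    return batch, next_offset

def walk(points, limit=2):
    """Yield every point exactly once by following the cursor."""
    offset = None
    while True:
        batch, offset = scroll(points, offset, limit)
        yield from batch
        if offset is None:
            break

# Hypothetical stored points, deliberately out of order.
points = [{"id": i, "payload": {"text": f"doc {i}"}} for i in (5, 1, 3, 2, 4)]
visited = [p["id"] for p in walk(points)]
print(visited)
```

An audit or deduplication pass would put its per-point logic inside the loop that consumes walk, so memory use stays bounded by the page size rather than the collection size.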

Self-Hostable

Run locally with a single Docker command for zero-cost development, then point at Qdrant Cloud for production without code changes.

Execution Brief

Use this page as a rollout checklist, not just reference text.

Tool Mapping Lens

Organize Tools by Workflow Phase

Catalog-oriented pages work best when users can map discovery, evaluation, and rollout in a clear path instead of reading an undifferentiated list.

  • Define the job-to-be-done first
  • Group tools by stage
  • Prioritize by adoption friction

Actionable Utility Module

Skill Implementation Board

Use this board for Qdrant MCP Server before rollout. Capture inputs, apply one decision rule, execute the checklist, and log outcome.

Input: Objective

Deliver one measurable improvement with the Qdrant MCP Server setup in Claude Code (vector database, RAG memory).

Input: Baseline Window

20-30 minutes

Input: Fallback Window

8-12 minutes

Decision Trigger | Action | Expected Output
Input: one workflow objective and release owner are defined | Run preview execution with fixed acceptance criteria. | Go or hold decision backed by repeatable evidence.
Input: output quality below baseline or retries increase | Limit scope, isolate root issue, and rerun controlled test. | One confirmed correction path before wider rollout.
Input: checks pass for two consecutive replay windows | Promote to broader traffic with fallback path active. | Stable rollout with low operational surprise.

Execution Steps

  1. Record objective, owner, and stop condition.
  2. Execute one controlled preview run.
  3. Measure quality, latency, and correction burden.
  4. Promote only when pass criteria are stable.

Output Template

tool=Qdrant MCP Server
objective=
preview_result=pass|fail
primary_metric=
next_step=rollout|patch|hold

What Is Qdrant MCP Server?

Qdrant MCP Server is a Model Context Protocol bridge to Qdrant, the open-source vector similarity search engine. It exposes Qdrant's upsert, search, filter, and collection-management primitives as MCP tools, allowing any compatible AI client — Claude Code, Cursor, Continue, Windsurf — to use Qdrant as durable semantic memory.

Qdrant itself is written in Rust for performance, supports billion-scale collections with HNSW indexing, and ships with rich payload filtering that can be combined with vector similarity in a single query. The MCP server is a thin, protocol-faithful wrapper that preserves those capabilities for agent workflows.

The practical value is simple: without persistent vector memory, every agent session starts from scratch. With Qdrant MCP wired in, the agent can remember architectural decisions from three months ago, recall prior code reviews, and surface related documentation on demand. It transforms the agent from a chat partner into a long-lived collaborator.

Qdrant pairs naturally with the rest of the MCP ecosystem. Filesystem MCP feeds the agent raw files to index, Postgres MCP provides structured metadata joins, and GitHub MCP sources code and PR context. Combined, they form a hybrid retrieval stack that teams otherwise spend weeks wiring up by hand.

How to Set Up Qdrant MCP Server with Claude Code

Start a Qdrant instance. For local development, run docker run -p 6333:6333 -v $(pwd)/qdrant_storage:/qdrant/storage qdrant/qdrant. For production, create a free cluster on Qdrant Cloud and copy its URL and API key.

Install the MCP server. With uv installed, run: claude mcp add qdrant -- uvx mcp-server-qdrant. Set QDRANT_URL and, if using Cloud, QDRANT_API_KEY. Optionally set COLLECTION_NAME to pin a default collection.

Choose an embedding strategy. The common pattern is to have the agent compute embeddings with OpenAI text-embedding-3-small (1536 dims) or a local BGE model, then pass the vector to Qdrant MCP. Some server builds bundle FastEmbed for zero-config embeddings.

Verify with a round trip. Ask: "Upsert three sample docs with tags science, history, code, then search for content similar to machine learning basics." If the agent returns the science doc first, the pipeline works end-to-end.
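
The round trip described above can be rehearsed offline before touching a real instance. This sketch substitutes a Python list for the Qdrant collection and a toy word-count embedding for a real model, so it only matches shared words rather than meaning; the embed, upsert, and search helpers and the three sample documents are illustrative assumptions.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: lowercase word counts (a stand-in for a real model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

store = []  # in-memory stand-in for a Qdrant collection

def upsert(text, tag):
    store.append({"vector": embed(text), "payload": {"text": text, "tag": tag}})

def search(query, top_k=3):
    q = embed(query)
    ranked = sorted(store, key=lambda p: cosine(q, p["vector"]), reverse=True)
    return [p["payload"] for p in ranked[:top_k]]

upsert("machine learning basics for science students", "science")
upsert("a short history of the roman empire", "history")
upsert("python code for parsing json files", "code")

results = search("machine learning basics")
print(results[0]["tag"])  # science
```

The pass condition is the same as in the live check: the science document comes back first. With a real embedding model the query would not need to share any words with the stored text.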

Treat this page as a decision map. Build a shortlist fast, then run a focused second pass for security, ownership, and operational fit.

When a team keeps one shared selection rubric, tool adoption speeds up because evaluators stop debating criteria every time a new option appears.

Worked Examples

Persistent agent memory across sessions

  1. You configure Claude Code with Qdrant MCP and a memory collection called agent_memory
  2. At the end of each coding session, ask the agent to "summarize today's decisions and upsert to agent_memory with tags project=citerank"
  3. Agent embeds the summary via OpenAI, upserts to Qdrant with payload {project, date, topic}
  4. Next week, you start a fresh session and ask "what did we decide about Stripe Radar last month?"
  5. Agent calls qdrant_search with the query embedding, filtered by project=citerank
  6. Returns the exact decision summary with context — no manual notes required

Outcome: An agent that remembers specific decisions across dozens of sessions without re-pasting transcripts or maintaining notes by hand.

Semantic code search over a large monorepo

  1. You want to find every place the codebase handles "retry logic with exponential backoff"
  2. One-time: ask agent to walk src/ via Filesystem MCP, embed each function, upsert to a code_index collection
  3. Ongoing: ask "find all retry logic implementations across the repo"
  4. Agent embeds the query, calls qdrant_search with top_k=10
  5. Returns five relevant files even though none contained the exact phrase "exponential backoff"
  6. Agent reads each file and summarizes the patterns used

Outcome: A grep-proof semantic code search that surfaces conceptually similar code even when naming and comments differ across the codebase.
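
The one-time indexing step depends on splitting files into function-sized chunks before embedding them. As a rough sketch, assuming the repository is Python, the standard ast module can do the split; the function_chunks helper, the payload fields (path, name, line), and the sample source are hypothetical, and the embedding and upsert calls are left out.

```python
import ast

def function_chunks(source, path):
    """Split Python source into one chunk per function, with payload metadata."""
    tree = ast.parse(source)
    chunks = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            chunks.append({
                # Exact source text of the function, ready to be embedded.
                "text": ast.get_source_segment(source, node),
                "payload": {"path": path, "name": node.name, "line": node.lineno},
            })
    return chunks

# Hypothetical file contents; in practice these come from Filesystem MCP.
sample = '''
import time

def fetch_with_retry(call, attempts=3):
    for i in range(attempts):
        try:
            return call()
        except IOError:
            time.sleep(2 ** i)

def parse(raw):
    return raw.strip()
'''

chunks = function_chunks(sample, "src/net.py")
for chunk in chunks:
    print(chunk["payload"])
```

Storing the path and line number in the payload lets the agent jump straight to the matching function after a search, instead of re-reading whole files.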

Frequently Asked Questions

What is the Qdrant MCP Server?

Qdrant MCP Server is a Model Context Protocol integration that exposes a Qdrant vector database to AI coding assistants. It lets Claude Code, Cursor, or any MCP client store embeddings, run similarity search, create collections, and filter by payload metadata — turning Qdrant into a durable semantic memory layer for your agent.

How do I set up Qdrant MCP?

Run Qdrant locally with Docker (docker run -p 6333:6333 qdrant/qdrant) or use Qdrant Cloud. Then register the MCP server with Claude Code: claude mcp add qdrant -- uvx mcp-server-qdrant. Set QDRANT_URL and optionally QDRANT_API_KEY as environment variables. The server auto-creates collections on first write.
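
For clients that read a JSON configuration file instead of a CLI registration, the same setup can be written as an mcpServers entry. The sketch below generates one; the file name and location vary by client and are not specified here, the mcp_config helper is hypothetical, and the collection name agent_memory is only an example value.

```python
import json

def mcp_config(qdrant_url, collection=None, api_key=None):
    """Build an mcpServers-style entry that launches the server through uvx."""
    env = {"QDRANT_URL": qdrant_url}
    if api_key:
        env["QDRANT_API_KEY"] = api_key  # only needed for Qdrant Cloud
    if collection:
        env["COLLECTION_NAME"] = collection  # optional default collection
    return {
        "mcpServers": {
            "qdrant": {
                "command": "uvx",
                "args": ["mcp-server-qdrant"],
                "env": env,
            }
        }
    }

config = mcp_config("http://localhost:6333", collection="agent_memory")
print(json.dumps(config, indent=2))
```

Keeping the API key out of the generated file and supplying it from the shell environment is usually the safer choice for shared repositories.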

Do I need a separate embedding model?

It depends on the server build. Some builds bundle a FastEmbed default, so simple setups need no separate model. Otherwise the MCP server only stores and queries vectors, and the embedding call happens in the agent before the vector is sent to Qdrant. In that case most users pair it with OpenAI text-embedding-3-small, Cohere embed-v3, or a local BGE/E5 model.

Qdrant vs Pinecone — which MCP should I pick?

Qdrant is open-source, self-hostable, and free at small scale, which suits local dev, on-prem, and cost-sensitive projects. Pinecone is fully managed and has a mature serverless tier, so scaling needs no infrastructure work on your side. For agent memory in a solo developer setup, Qdrant via Docker is the lowest friction. For production RAG at scale without infra overhead, Pinecone is the simpler choice.

What are typical use cases for Qdrant MCP?

Common uses are agent long-term memory (remembering conversation history across sessions), semantic code search over a large repo, RAG over internal documentation, near-duplicate detection for incoming content, and recommendations over user-generated embeddings. Pairing it with a filesystem MCP lets the agent index a folder and query it semantically.

Can I use payload filters alongside vector search?

Yes. Qdrant supports rich payload filters (must / should / must_not) combined with vector similarity. You can ask the agent to "find similar docs tagged #architecture and written after 2025-01-01" and the MCP server translates that into a hybrid query. This is one of Qdrant's strongest features versus simpler vector stores.
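
As an illustration of what such a hybrid query might look like on the wire, the sketch below builds a request body modelled on Qdrant's REST search request (vector, filter, limit, with_payload). The payload field names tags and written_at are hypothetical, the three-element vector is shortened for readability, and the exact body the MCP server sends is not documented here, so treat this as an assumption to check against your Qdrant version.

```python
import json

def hybrid_search_body(vector, tag, after, limit=5):
    """Combine a query vector with a payload filter in one request body."""
    return {
        "vector": vector,
        "limit": limit,
        "with_payload": True,
        "filter": {
            "must": [
                # Keep only points whose tags field contains this value.
                {"key": "tags", "match": {"value": tag}},
                # Keep only points written strictly after the given timestamp.
                {"key": "written_at", "range": {"gt": after}},
            ]
        },
    }

body = hybrid_search_body(
    [0.12, -0.07, 0.33],
    tag="architecture",
    after="2025-01-01T00:00:00Z",
)
print(json.dumps(body, indent=2))
```

Both conditions sit under must, so a point has to satisfy the tag and the date to be ranked at all.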
