What Is Haystack Agents?
Haystack is a Python framework by deepset for building LLM-powered applications — search, RAG, agents, and hybrids of all three. Haystack Agents specifically refers to the agent-building primitives introduced in Haystack 2.x, which layer tool-calling behavior on top of the framework's typed Pipeline and Component model.
The framework's core philosophy is composability with contracts. Each component declares its input and output types explicitly, and the Pipeline validates that connections are well-formed at construction time. This catches a whole class of wiring bugs that plague more dynamic frameworks, and makes Haystack pipelines safer to evolve over months of production use.
Haystack Agents give you two primary shapes: a ReAct-style reasoning loop for open-ended tool use, and a tool-call loop driven by native OpenAI / Anthropic function-calling APIs. The Agent class handles the loop, message formatting, and tool invocation so you can focus on defining tools and guarding them with validation.
The deepset team has invested heavily in evaluation. Haystack includes evaluators for retrieval quality, answer correctness, and end-to-end faithfulness. Running a pipeline through eval is a single function call, which means teams can tie CI pipelines to quality thresholds — something most agent frameworks still leave as an exercise to the reader.
How to Calculate Better Results with haystack agents deepset rag framework python agent production
Install Haystack with pip install haystack-ai. Add the integrations you need, for example pip install qdrant-haystack for Qdrant or ollama-haystack for local models. Each integration is a separate package so the core stays lean.
Define your tools. A tool is a Python function decorated (or wrapped) with the Tool class, with a clear name, description, and parameters schema. The agent uses the description to decide when to call the tool, so write it like a product spec, not a dev comment.
Instantiate a ChatGenerator (OpenAIChatGenerator, AnthropicChatGenerator, OllamaChatGenerator, etc.) and pass it plus your tools into Agent. Call agent.run(messages=[ChatMessage.from_user("...")]) and read the final assistant message out.
Wrap the agent in a Pipeline if you need RAG. Connect a retriever and reranker upstream, have them hand context to the agent, and optionally add a Router downstream for structured outputs. Serialize the pipeline to YAML for deployment.
Treat this page as a decision map. Build a shortlist fast, then run a focused second pass for security, ownership, and operational fit.
When a team keeps one shared selection rubric, tool adoption speeds up because evaluators stop debating criteria every time a new option appears.
Worked Examples
Internal support agent over company docs
- Index your help-center articles into Qdrant via a Haystack indexing pipeline
- Build a query pipeline: QdrantRetriever -> TransformersSimilarityRanker -> Agent
- Define tools: search_tickets (Zendesk API), create_jira_issue, send_email
- Wrap with an OpenAI gpt-4o-mini generator and give a clear system prompt
- User asks "ticket 12345 is about SSO — find the root cause and file a Jira"
- Agent retrieves SSO docs from Qdrant, calls search_tickets, reasons, and calls create_jira_issue with a composed summary
Outcome: An internal agent that combines documentation retrieval with real-world tool use, running on a stack your ops team can deploy and monitor like any other FastAPI service.
Quality-gated production deployment
- You have a Haystack RAG pipeline ready for production
- Write a labeled eval set of 200 Q/A pairs drawn from real user questions
- Run the pipeline through evaluate() with retrieval + faithfulness evaluators
- CI asserts retrieval MRR@5 > 0.75 and faithfulness > 0.9 before merging
- A PR that swaps in a cheaper LLM fails the gate and gets rejected automatically
- Only pipelines meeting the bar ship to production — quality regressions are caught pre-merge
Outcome: A quality safety net that turns "LLM in production" from a gut feeling into a measurable metric gated by CI.
Frequently Asked Questions
What is Haystack and Haystack Agents?
Haystack is an open-source Python framework by deepset for building LLM applications — including RAG pipelines, search systems, and tool-using agents. Haystack Agents is the agent layer introduced in Haystack 2.x that lets you assemble a ReAct-style or tool-calling agent on top of Haystack's typed Pipeline and Component primitives. It focuses on production concerns: evaluation, observability, deployment, and swappable backends.
How does Haystack compare to LangChain and LlamaIndex?
LangChain has the widest integration surface and fastest-moving ecosystem, but its abstractions have shifted repeatedly. LlamaIndex is focused on retrieval and index structures. Haystack's strength is typed components, explicit pipelines, and a production ethos inherited from deepset's search background. If you value clear contracts, stable APIs, and strong evaluation tooling, Haystack is often the steadier choice for production.
How do I build an agent with Haystack?
Install haystack-ai via pip. Create tool functions and wrap them with @tool or the Tool class. Instantiate a ChatGenerator (OpenAI, Anthropic, local vLLM, etc.), pass it and the tools to Agent, then call agent.run(messages=[...]). The agent loops internally, calling tools until it produces a final assistant message. Pipelines let you wrap an agent alongside retrievers for hybrid RAG + agent flows.
Does Haystack support local LLMs?
Yes. Haystack integrates with Ollama, Hugging Face Transformers, llama.cpp, vLLM, and any OpenAI-compatible endpoint. Swap the generator component and the rest of the pipeline stays the same. This makes Haystack particularly well-suited to on-prem or regulated environments where you cannot call a hosted API.
What does evaluation look like in Haystack?
Haystack ships evaluators for retrieval (recall, MRR, NDCG), generation (semantic answer similarity, faithfulness, context relevance), and end-to-end pipeline metrics. You define a labeled eval set once and run the same pipeline through evaluation with a one-liner. That tight loop between build and measure is one of Haystack's biggest production advantages.
Is Haystack suitable for enterprise deployments?
Yes. deepset offers deepset Cloud (hosted Haystack) and deepset Studio (a no-code builder that exports Haystack pipelines). Self-hosted, Haystack runs well in FastAPI services behind an API gateway. It has first-class support for pipeline serialization to YAML, making deployments reproducible across dev, staging, and prod environments.