
Agent Framework

RAG-first · Python & TS · MIT License

LlamaIndex Agents

by LlamaIndex · run-llama/llama_index

LlamaIndex Agents is the agentic layer of LlamaIndex, the most widely used toolkit for retrieval-augmented generation. Where most agent frameworks treat RAG as an afterthought, LlamaIndex starts from retrieval and builds the agent loop on top — giving you production-grade parsers, retrievers, rerankers, and evaluators out of the box, tightly integrated with ReAct, function-calling, and workflow-style agents.

Available in Python and TypeScript, with 160+ data connectors and integrations with every major vector database and LLM provider, LlamaIndex is the default choice when your agent's value comes from grounding answers in your own documents rather than from clever prompting alone.

Languages: Py + TS (dual SDK)
Connectors: 160+ (data loaders)
Vector DBs: 40+ (Qdrant, Pinecone...)
GitHub Stars: 40K+ (leading RAG lib)

Quick Install

pip install llama-index  |  npm install llamaindex

Key Features

Function-Calling Agents

First-class support for OpenAI and Anthropic tool-calling APIs. Pass any Python function or QueryEngineTool and the agent decides when to call it.

ReAct Agents

Classic reason-then-act loop for models without native function calling. Useful for local LLMs via Ollama or older providers.
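To make the reason-then-act loop concrete, here is a minimal, library-independent sketch. It does not use LlamaIndex's ReActAgent. The names scripted_llm, TOOLS, and react_loop, and the "Action: tool[input]" / "Answer: ..." line format, are hypothetical stand-ins chosen for illustration, and a real model call would replace scripted_llm.

```python
import re
from typing import Callable, Dict, List

# Hypothetical tool registry: tool name -> plain Python callable.
TOOLS: Dict[str, Callable[[str], str]] = {
    "word_count": lambda text: str(len(text.split())),
}

# Assumed reply format for an action: "Action: tool_name[tool input]".
ACTION_RE = re.compile(r"^Action: (\w+)\[(.*)\]$")


def scripted_llm(transcript: List[str]) -> str:
    """Stand-in for a real model: act once, then answer from the observation."""
    observations = [t for t in transcript if t.startswith("Observation: ")]
    if not observations:
        return "Action: word_count[reset your api key]"
    count = observations[-1][len("Observation: "):]
    return "Answer: the phrase has " + count + " words"


def react_loop(question: str, llm: Callable[[List[str]], str], max_steps: int = 5) -> str:
    """Alternate model replies and tool observations until an answer appears."""
    transcript = ["Question: " + question]
    for _ in range(max_steps):
        reply = llm(transcript)
        transcript.append(reply)
        if reply.startswith("Answer: "):
            return reply[len("Answer: "):]
        match = ACTION_RE.match(reply)
        if match is None:
            transcript.append("Observation: could not parse action")
            continue
        name, arg = match.groups()
        tool = TOOLS.get(name)
        result = tool(arg) if tool else "unknown tool " + name
        transcript.append("Observation: " + result)
    return "stopped after max_steps without an answer"
```

Because the loop only needs text in and text out, the same pattern works with models that have no native tool-calling API, which is the case this agent type targets.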

Workflows

Event-driven multi-step agent framework with branching, retries, parallel steps, and human-in-the-loop. The modern alternative to rigid chains.

Query Engines

High-level RAG primitives: sub-question decomposition, router query engines, SQL table agents. Wrap any index as a tool for an agent.

Retrievers & Rerankers

Composable retrieval stack — dense, sparse, hybrid, fusion, Cohere/Jina rerankers, auto-merging retrievers. State-of-the-art recall out of the box.
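As an illustration of what "fusion" means here, the sketch below merges a dense ranking and a sparse ranking with reciprocal rank fusion (RRF), one common fusion method. It is a plain-Python sketch, not LlamaIndex's implementation; the function name, the document IDs, and the constant k=60 are assumptions for the example.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of doc ids into one ranking.

    Each document scores 1 / (k + rank) in every list it appears in,
    and the per-list scores are summed.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


dense = ["doc_a", "doc_b", "doc_c"]   # e.g. results from vector search
sparse = ["doc_b", "doc_d", "doc_a"]  # e.g. results from keyword search
fused = reciprocal_rank_fusion([dense, sparse])
```

Documents that rank well in both lists rise to the top, which is why fusion tends to help when dense and sparse retrieval each miss different relevant chunks. A reranker would then reorder only this fused shortlist.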

LlamaParse + Connectors

160+ data loaders plus LlamaParse, a high-accuracy PDF and document parser purpose-built for RAG on messy real-world files.

Execution Brief

Use this page as a rollout checklist, not just reference text.


Tool Mapping Lens

Organize Tools by Workflow Phase

Catalog-oriented pages work best when users can map discovery, evaluation, and rollout in a clear path instead of reading an undifferentiated list.

  • Define the job-to-be-done first
  • Group tools by stage
  • Prioritize by adoption friction

Actionable Utility Module

Skill Implementation Board

Use this board for LlamaIndex Agents before rollout. Capture inputs, apply one decision rule, execute the checklist, and log outcome.

Input: Objective

Deliver one measurable improvement with LlamaIndex Agents

Input: Baseline Window

20-30 minutes

Input: Fallback Window

8-12 minutes

Decision Trigger | Action | Expected Output
Input: one workflow objective and release owner are defined | Run preview execution with fixed acceptance criteria. | Go or hold decision backed by repeatable evidence.
Input: output quality below baseline or retries increase | Limit scope, isolate root issue, and rerun controlled test. | One confirmed correction path before wider rollout.
Input: checks pass for two consecutive replay windows | Promote to broader traffic with fallback path active. | Stable rollout with low operational surprise.

Execution Steps

  1. Record objective, owner, and stop condition.
  2. Execute one controlled preview run.
  3. Measure quality, latency, and correction burden.
  4. Promote only when pass criteria are stable.

Output Template

tool=LlamaIndex Agents
objective=
preview_result=pass|fail
primary_metric=
next_step=rollout|patch|hold
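One way to apply the decision rule and fill in the output template is sketched below. The PreviewRun fields, the decide and render helpers, and the mapping from triggers to patch, hold, and rollout are assumptions made for illustration; adjust the thresholds to your own pass criteria.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class PreviewRun:
    """Hypothetical record of one controlled preview run."""
    objective: str
    primary_metric: str
    quality: float            # measured score on the preview run
    baseline: float           # score the run has to meet or beat
    retries: int              # retries observed during the run
    baseline_retries: int     # retries seen in the baseline window
    consecutive_passes: int   # passing replay windows, including this one


def decide(run: PreviewRun) -> Tuple[str, str]:
    """Return (preview_result, next_step) for the output template."""
    passed = run.quality >= run.baseline and run.retries <= run.baseline_retries
    if not passed:
        return "fail", "patch"      # limit scope, isolate the issue, rerun
    if run.consecutive_passes >= 2:
        return "pass", "rollout"    # two stable windows: promote
    return "pass", "hold"           # passing, but not yet stable


def render(run: PreviewRun, tool: str = "LlamaIndex Agents") -> str:
    """Fill in the output template as key=value lines."""
    preview_result, next_step = decide(run)
    return "\n".join([
        f"tool={tool}",
        f"objective={run.objective}",
        f"preview_result={preview_result}",
        f"primary_metric={run.primary_metric}",
        f"next_step={next_step}",
    ])
```

Logging the rendered record after every preview run keeps the go or hold decision tied to the same evidence each time.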

What Is LlamaIndex Agents?

LlamaIndex Agents is the agent-building layer inside LlamaIndex, the Python and TypeScript toolkit that has become the de facto standard for retrieval-augmented generation. It provides pre-built agent patterns — FunctionCallingAgent, ReActAgent, and the newer Workflows framework — that plug directly into LlamaIndex's retrieval primitives so your agent can combine tool use with high-quality grounding over your data.

The architectural choice that defines LlamaIndex is RAG-first: parsers, chunkers, retrievers, rerankers, and evaluators are first-class citizens, not afterthoughts bolted onto a generic chain framework. Any agent can wrap a query engine as a tool, so answers stay grounded in source documents, and the retrieved source nodes returned with each response can be surfaced as citations.

In practice, that means building a docs-chatbot takes a dozen lines: load documents, build an index, wrap it as a QueryEngineTool, pass it to an agent. Scaling that to production adds more sophisticated pieces — LlamaParse for messy PDFs, hybrid retrieval with reranking, Workflows for multi-step planning — but the learning curve stays gentle because the core primitives compose naturally.

LlamaIndex pairs well with the rest of the ecosystem. Qdrant and Pinecone are the most common vector-store backends; OpenAI, Anthropic, and Gemini are the common LLMs; Cohere rerankers and LlamaParse handle the hard retrieval problems. For shipping a production RAG agent in 2026, LlamaIndex is usually the fastest path.

How to Get Started with LlamaIndex Agents

Install the SDK. For Python: pip install llama-index llama-index-llms-anthropic. For TypeScript: npm install llamaindex. Set your API keys as environment variables: ANTHROPIC_API_KEY for the Claude model used below, plus OPENAI_API_KEY if you keep the default OpenAI embedding model.

Load and index your data. The one-liner pattern: documents = SimpleDirectoryReader("./docs").load_data(); index = VectorStoreIndex.from_documents(documents). For production, swap the default in-memory store for Qdrant or Pinecone with a StorageContext.

Wrap the index as a tool and build the agent. query_engine = index.as_query_engine(); tool = QueryEngineTool.from_defaults(query_engine, name="docs", description="search product docs"); agent = FunctionCallingAgent.from_tools([tool], llm=Anthropic(model="claude-sonnet-4-5")).

Run the agent. response = await agent.achat("How do I reset my API key?"). The agent plans a retrieval call, fetches relevant chunks, and produces a cited answer. For complex flows, migrate to Workflows to add branching, retries, and parallel steps.
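Put together, the steps above might look like the sketch below. It assumes a llama-index release that still exports FunctionCallingAgent from llama_index.core.agent; newer releases may expose a different agent class, so check the import paths against your installed version. The model ID, the ./docs folder, and the helper names build_docs_agent and main are also assumptions. Imports are deferred into the function so the module can be loaded without the packages installed.

```python
def build_docs_agent(docs_dir: str = "./docs"):
    """Index a folder of documents and return an agent that can search it.

    Assumes llama-index and llama-index-llms-anthropic are installed and
    that ANTHROPIC_API_KEY is set. The index uses the default embedding
    model, which needs OPENAI_API_KEY unless you configure another one.
    """
    from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
    from llama_index.core.agent import FunctionCallingAgent  # version-dependent
    from llama_index.core.tools import QueryEngineTool
    from llama_index.llms.anthropic import Anthropic

    # Model ID is an assumption; use one your Anthropic account exposes.
    llm = Anthropic(model="claude-sonnet-4-5")

    # Load and index the documents (in-memory vector store by default).
    documents = SimpleDirectoryReader(docs_dir).load_data()
    index = VectorStoreIndex.from_documents(documents)

    # Wrap the index as a tool; pass the LLM so answer synthesis
    # does not fall back to the library's default model.
    tool = QueryEngineTool.from_defaults(
        index.as_query_engine(llm=llm),
        name="docs",
        description="search product docs",
    )
    return FunctionCallingAgent.from_tools([tool], llm=llm)


async def main() -> None:
    agent = build_docs_agent()
    response = await agent.achat("How do I reset my API key?")
    print(response)
```

To try it, call asyncio.run(main()) from a script after importing asyncio. Nothing runs at import time, so the file can sit alongside other modules without triggering API calls.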

Treat this page as a decision map. Build a shortlist fast, then run a focused second pass for security, ownership, and operational fit.

When a team keeps one shared selection rubric, tool adoption speeds up because evaluators stop debating criteria every time a new option appears.

Worked Examples

Internal knowledge base agent over Notion + Slack

  1. Use LlamaHub connectors to pull Notion pages and the last 90 days of Slack channel history
  2. Parse PDFs in attachments via LlamaParse for tables and structured content
  3. Build a hybrid vector + BM25 index in Qdrant with source metadata (notion vs slack)
  4. Wrap as a QueryEngineTool with Cohere Rerank-v3 for final ranking
  5. Plug into FunctionCallingAgent with Anthropic Claude Sonnet and deploy behind a FastAPI endpoint
  6. Route the endpoint into a Slack bot — employees ask questions in DMs and get cited answers

Outcome: A company-wide knowledge agent launched in a week that replaces endless "has anyone seen X" Slack pings with grounded, cited answers from the existing document corpus.
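The source metadata from step 3 is what makes cited answers possible in step 6. The sketch below shows one way to turn retrieved chunks into a numbered source list. It is library-independent: the RetrievedChunk type, its field names, and format_cited_answer are hypothetical, and the sample references are made up for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RetrievedChunk:
    """Hypothetical shape of one retrieved chunk after reranking."""
    text: str
    source: str   # "notion" or "slack", attached at ingestion time
    ref: str      # page title or channel name, for display
    score: float  # relevance score from the reranker


def format_cited_answer(answer: str, chunks: List[RetrievedChunk], top_k: int = 3) -> str:
    """Append a numbered source list built from the top-scoring chunks."""
    ranked = sorted(chunks, key=lambda c: c.score, reverse=True)[:top_k]
    lines = [answer, "", "Sources:"]
    for i, chunk in enumerate(ranked, start=1):
        lines.append(f"[{i}] {chunk.source}: {chunk.ref}")
    return "\n".join(lines)


chunks = [
    RetrievedChunk("Rotate keys in Settings.", "notion", "API key policy", 0.92),
    RetrievedChunk("I reset mine yesterday.", "slack", "#eng-help", 0.71),
]
reply = format_cited_answer("Reset the key under Settings > API.", chunks)
```

Keeping the source type in metadata also lets the bot filter or label results, for example showing Notion pages ahead of Slack messages.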

SQL agent that answers business questions over a data warehouse

  1. Register NLSQLTableQueryEngine on top of a read-only Postgres or Snowflake connection
  2. Provide row samples and a schema description so the agent understands column semantics
  3. Wrap as a tool and add it to a Workflow with a router step — route numeric questions to SQL and conceptual questions to the docs index
  4. User asks: "What was MRR growth last quarter by plan tier?"
  5. Workflow routes to SQL tool, generates the query, executes it, and returns a formatted table plus a short summary
  6. Same Workflow routes "what does MRR mean in our pricing?" to the docs tool instead

Outcome: A self-service analytics agent that turns natural language questions into verified SQL, bridging business users and the data warehouse without an analyst in the loop.
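The routing and read-only ideas in this example can be sketched without any framework. Below, a keyword heuristic stands in for the router step (a real Workflow would more likely ask the LLM to classify the question), and an in-memory SQLite table stands in for the warehouse. NUMERIC_HINTS, route, run_read_only, and the mrr table are all hypothetical.

```python
import sqlite3
from typing import List, Tuple

# Hypothetical phrases that suggest a numeric, SQL-answerable question.
NUMERIC_HINTS = ("how many", "how much", "growth", "total", "average", "by plan")


def route(question: str) -> str:
    """Pick a tool name: 'sql' for numeric questions, 'docs' otherwise."""
    q = question.lower()
    return "sql" if any(hint in q for hint in NUMERIC_HINTS) else "docs"


def run_read_only(conn: sqlite3.Connection, sql: str) -> List[Tuple]:
    """Run generated SQL only if it is a SELECT.

    This prefix check is a crude second line of defense; the primary
    safeguard should be a database role with read-only permissions.
    """
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("only SELECT statements are allowed")
    return conn.execute(sql).fetchall()


# Tiny stand-in for the data warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mrr (plan TEXT, quarter TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO mrr VALUES (?, ?, ?)",
    [("pro", "Q1", 100.0), ("pro", "Q2", 120.0)],
)
rows = run_read_only(conn, "SELECT plan, SUM(amount) FROM mrr GROUP BY plan")
```

With this split, the two questions from the steps above land on different tools: the MRR growth question goes to SQL and the "what does MRR mean" question goes to the docs index.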

Frequently Asked Questions

What are LlamaIndex Agents?

LlamaIndex Agents are the agentic layer of LlamaIndex, the Python (and TypeScript) toolkit widely used for retrieval-augmented generation. The agents module provides ReAct, Function-Calling, and Workflow-based agent patterns that natively integrate with LlamaIndex query engines, retrievers, and tool specs — so your agent can reason, call tools, and retrieve documents in one coherent loop.

LlamaIndex vs LangChain — when should I pick LlamaIndex?

LlamaIndex is RAG-first: if your core workload is retrieval over documents (docs chat, knowledge base Q&A, enterprise search), LlamaIndex has deeper primitives — parsers, retrievers, rerankers, evaluators — out of the box. LangChain is broader but thinner on retrieval. For pure agent orchestration without heavy RAG, LangChain or LangGraph may be cleaner. Most teams end up using both.

How do I build my first LlamaIndex agent?

Install with pip install llama-index. Create an index from your data (SimpleDirectoryReader or one of 160+ connectors), wrap it as a QueryEngineTool, and pass it to FunctionCallingAgent or ReActAgent. The agent can then answer questions by calling your query engine alongside other tools. The quickstart is 20 lines of Python.

Does LlamaIndex support Workflows for complex agents?

Yes. LlamaIndex Workflows are an event-driven framework for building multi-step agents with branching, retries, human-in-the-loop steps, and parallel execution. They replace rigid chains with a more flexible step-and-event model, similar to LangGraph but with tighter RAG integration.
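The step-and-event model can be illustrated with a small, library-independent sketch: each step consumes one event type and returns the next event, and the run ends when a stop event appears. This is not the LlamaIndex Workflows API (there, steps are methods on a Workflow subclass). The event classes below are local dataclasses, and STEPS, run_workflow, and the retry policy are assumptions for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class StartEvent:
    question: str


@dataclass
class RetrievedEvent:
    question: str
    context: str


@dataclass
class StopEvent:
    result: str


def retrieve(ev: StartEvent) -> RetrievedEvent:
    # Stub retrieval; a real step would query an index here.
    return RetrievedEvent(ev.question, context="stub context for: " + ev.question)


def synthesize(ev: RetrievedEvent) -> StopEvent:
    # Stub synthesis; a real step would call an LLM here.
    return StopEvent(result=f"answer based on [{ev.context}]")


# Dispatch table: the type of the incoming event selects the step to run.
STEPS: Dict[type, Callable] = {StartEvent: retrieve, RetrievedEvent: synthesize}


def run_workflow(event, max_retries: int = 2) -> str:
    """Drive events through steps until a StopEvent is produced."""
    while not isinstance(event, StopEvent):
        step = STEPS[type(event)]
        for attempt in range(max_retries + 1):
            try:
                event = step(event)
                break
            except Exception:
                if attempt == max_retries:
                    raise  # retries exhausted: surface the error
    return event.result
```

Branching falls out of the same design: a step can return different event types depending on its input, and the dispatch table sends each one to a different next step.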

Which LLMs and vector DBs does LlamaIndex support?

LlamaIndex supports virtually every major provider: OpenAI, Anthropic, Google Gemini, Mistral, Cohere, Bedrock, Azure, Ollama, and dozens more. Vector stores include Qdrant, Pinecone, Weaviate, Chroma, Milvus, pgvector, Elasticsearch, MongoDB Atlas — the integrations catalog is the widest in the RAG ecosystem.

What are typical LlamaIndex Agent use cases?

Documentation Q&A chatbots, internal knowledge assistants that query multiple sources (Notion + Confluence + Slack history), research agents that plan multi-step investigations, SQL agents over data warehouses, and domain-specific copilots (legal, medical, financial) where citations and grounding are non-negotiable.
