Why multi-agent?
When developers say they need “multi-agent,” they’re usually looking for one or more of these capabilities:Context management
Surface relevant knowledge without overwhelming the context window. Different tasks (agents) need different context.
Distributed development
Let different teams develop and maintain capabilities independently with clear boundaries.
Parallelization
Spawn specialized workers for subtasks and execute them concurrently for faster results.
Sequential constraints
Enforce step-by-step workflows. Unlock tools and actions only after preconditions are met.
Patterns
Here are the main patterns for building multi-agent systems, each suited to different use cases:| Pattern | How it works |
|---|---|
| Subagents | A main agent coordinates subagents and background jobs as tools. Centralized control — all routing passes through the main agent. Multiple coordination approaches available. |
| Handoffs | Behavior changes dynamically based on state. Tool calls update a state variable, and the system adjusts behavior accordingly. Supports both handoffs between distinct agents and dynamic configuration changes. |
| Skills | Specialized prompts loaded on-demand. The main agent stays in control and gains additional context as needed. |
| Router | A routing step classifies input and directs it to one or more specialized agents. Results are collected and returned to the user. |
| Custom workflow | Build bespoke logic with LangGraph, mixing deterministic and agentic steps. Reuse or customize agents as needed. |
- Invoke sub-agents (subagents)
- Update state to trigger routing or configuration changes (handoffs)
- Load context on-demand (skills)
- Invoke entire multi-agent systems (wrapping a router as a tool)
Choosing a pattern
Use this table to match your requirements to the right pattern:- Distributed development: Can different teams maintain components independently?
- Parallelization: Can multiple agents execute concurrently?
- Multi-hop: Does the pattern support multiple hops between agents?
- Direct user interaction: Can subagents converse directly with the user?
Subagents
In the subagents architecture, a central main agent (often referred to as a supervisor) coordinates subagents by calling them as tools. The main agent decides which subagent to invoke, what input to provide, and how to combine results. Subagents are stateless—they don’t remember past interactions, with all conversation memory maintained by the main agent. This provides context isolation: each subagent invocation works in a clean context window, preventing context bloat in the main conversation. Key characteristics:- Centralized control: All routing passes through the main agent
- No direct user interaction: Subagents return results to the main agent, not the user
- Subagents via tools: Subagents are invoked via tools
- Parallel execution: The main agent can invoke multiple subagents in a single turn
Tutorial: Build an agent with subagents
Learn how to build a personal assistant using the subagents pattern, where a central main agent (supervisor) coordinates specialized worker agents.
Sync vs async
By default, subagent calls are synchronous—the main agent waits for each subagent to complete before continuing. This is simple and works well for most cases. For long-running tasks (reviewing contracts, conducting research, auditing code), use asynchronous execution. The main agent kicks off a background job and continues conversing with the user while the work completes.Background job pattern
Background job pattern
Key characteristics:
- Three-tool pattern: Kick off job (returns job ID), check status, get results
- Asynchronous execution: Work proceeds in the background while main agent remains responsive
- User-initiated checks: Main agent checks job status when the user asks, not on a polling schedule
HumanMessage like “Check job_123 and summarize the results.”Tool patterns
There are two main ways to expose subagents as tools:| Pattern | Best for | Trade-off |
|---|---|---|
| Tool per agent | Fine-grained control over each subagent’s input/output | More setup, but more customization |
| Single dispatch tool | Many agents, distributed teams, convention over configuration | Simpler composition, less per-agent customization |
Tool per agent
The key idea is wrapping subagents as tools that the main agent can call:Single dispatch tool
An alternative approach uses a single parameterized tool to spawn ephemeral sub-agents for independent tasks. Unlike the tool per agent approach where each sub-agent is wrapped as a separate tool, this uses a convention-based approach with a singletask tool: the task description is passed as a human message to the sub-agent, and the sub-agent’s final message is returned as the tool result.
Key characteristics:
- Single task tool: One parameterized tool that can invoke any registered sub-agent by name
- Convention-based invocation: Agent selected by name, task passed as human message, final message returned as tool result
- Team distribution: Different teams can develop and deploy agents independently
- Agent discovery: Sub-agents can be discovered via system prompt (listing available agents) or through progressive disclosure (loading agent information on-demand via tools)
Agent registry with task dispatcher
Agent registry with task dispatcher
Context engineering
Control how context flows between the main agent and its subagents:| Category | Purpose | Impacts |
|---|---|---|
| Subagent specs | Ensure subagents are invoked when they should be | Main agent routing decisions |
| Subagent inputs | Ensure subagents can execute well with optimized context | Subagent performance |
| Subagent outputs | Ensure the supervisor can act on subagent results | Main agent performance |
Subagent specs
The name and description you give a subagent tool determine when the main agent decides to invoke it. These are prompting levers—choose them carefully.- Name: How the main agent refers to the sub-agent. Keep it clear and action-oriented (e.g.,
research_agent,code_reviewer). - Description: What the main agent knows about the sub-agent’s capabilities. Be specific about what tasks it handles and when to use it.
Subagent inputs
Customize what context the subagent receives to execute its task. Add input that isn’t practical to capture in a static prompt—full message history, prior results, or task metadata—by pulling from the agent’s state.Subagent inputs
Subagent inputs
Subagent outputs
Customize what the main agent receives back so it can make good decisions. Two strategies:- Prompt the sub-agent: Specify exactly what should be returned. A common failure mode is that the sub-agent performs tool calls or reasoning but doesn’t include results in its final message—remind it that the supervisor only sees the final output.
- Format in code: Adjust or enrich the response before returning it. For example, pass specific state keys back in addition to the final text using a
Command.
Subagent outputs
Subagent outputs
Handoffs
In the handoffs architecture, behavior changes dynamically based on state. The core mechanism: tools update a state variable (e.g.,current_step or active_agent) that persists across turns, and the system reads this variable to adjust behavior—either applying different configuration (system prompt, tools) or routing to a different agent. This pattern supports both handoffs between distinct agents and dynamic configuration changes within a single agent.
Key characteristics:
- State-driven behavior: Behavior changes based on a state variable (e.g.,
current_steporactive_agent) - Tool-based transitions: Tools update the state variable to move between states
- Direct user interaction: Each state’s configuration handles user messages directly
- Persistent state: State survives across conversation turns
Tutorial: Build a customer support agent using handoffs
Learn how to build a customer support agent using the handoffs pattern, where a single agent transitions between different configurations.
Single agent with middleware
A single agent changes its behavior based on state. Middleware intercepts each model call and dynamically adjusts the system prompt and available tools. Tools update the state variable to trigger transitions:Complete example: Customer support with middleware
Complete example: Customer support with middleware
Multiple agent subgraphs
Multiple distinct agents exist as separate nodes in a graph. Handoff tools navigate between agent nodes usingCommand.PARENT to specify which node to execute next:
Complete example: Sales and support with handoffs
Complete example: Sales and support with handoffs
This example shows a multi-agent system with separate sales and support agents. Each agent is a separate graph node, and handoff tools allow agents to transfer conversations to each other.
- Conversation history: Decide what conversation history each agent/state receives—full history, filtered portions, or summaries.
- Tool semantics: Clarify whether handoff tools only update routing state or also perform actions (e.g., should
transfer_to_sales()also create a ticket?).
Skills
In the skills architecture, specialized capabilities are packaged as invokable “skills” that augment an agent’s behavior. Skills are primarily prompt-driven specializations that an agent can invoke on-demand. Key characteristics:- Prompt-driven specialization: Skills are primarily defined by specialized prompts
- Progressive disclosure: Skills become available based on context or user needs
- Team distribution: Different teams can develop and maintain skills independently
- Lightweight composition: Skills are simpler than full sub-agents
Loading skills on-demand
Loading skills on-demand
Extending the pattern
When writing custom implementations, you can extend the basic skills pattern in several ways: Dynamic tool registration: Combine progressive disclosure with state management to register new tools as skills load. For example, loading a “database_admin” skill could both add specialized context and register database-specific tools (backup, restore, migrate). This uses the same tool-and-state mechanisms used across multi-agent patterns—tools updating state to dynamically change agent capabilities. Hierarchical skills: Skills can define other skills in a tree structure, creating nested specializations. For instance, loading a “data_science” skill might make available sub-skills like “pandas_expert”, “visualization”, and “statistical_analysis”. Each sub-skill can be loaded independently as needed, allowing for fine-grained progressive disclosure of domain knowledge. This hierarchical approach helps manage large knowledge bases by organizing capabilities into logical groupings that can be discovered and loaded on-demand.Tutorial: Build an agent with on-demand skill loading
Learn how to implement skills with progressive disclosure, where the agent loads specialized prompts and schemas on-demand rather than upfront.
Router
In the router architecture, a routing step classifies input and directs it to specialized agents. This is useful when you have distinct verticals—separate knowledge domains that each require their own agent. Key characteristics:- Router decomposes the query
- Zero or more specialized agents are invoked in parallel
- Results are synthesized into a coherent response
- Stateless routers address each request independently
- Stateful routers maintain conversation history across requests
Stateless
Each request is routed independently—no memory between calls. For multi-turn conversations, see Stateful routers.Building a multi-source knowledge base router
Building a multi-source knowledge base router
Your organization’s knowledge lives in multiple places: GitHub repositories, Notion wikis, and Slack conversations. These are three distinct verticals, each requiring specialized tools and context. When users ask questions like “How do I authenticate API requests?”, the answer may require information from multiple sources. This example builds a router that decomposes queries, identifies which verticals to consult, queries them in parallel, and synthesizes results.
Stateful
For multi-turn conversations, you need to maintain context across invocations.Tool wrapper
The simplest approach: wrap the stateless router as a tool that a conversational agent can call. The conversational agent handles memory and context; the router stays stateless. This avoids the complexity of managing conversation history across multiple parallel agents.Full persistence
If you need the router itself to maintain state, use persistence to store message history. When routing to an agent, fetch previous messages from state and selectively include them in the agent’s context—this is a lever for context engineering.Custom workflow
In the custom workflow architecture, you define your own bespoke execution flow using LangGraph. You have complete control over the graph structure—including sequential steps, conditional branches, loops, and parallel execution. Use custom workflows when:- Standard patterns (subagents, skills, etc.) don’t fit your requirements
- You need to mix deterministic logic with agentic behavior
- Your use case requires complex routing or multi-stage processing
Custom RAG workflow
Custom RAG workflow
The workflow demonstrates three types of nodes:
- Model node (Rewrite): Rewrites the user query for better retrieval using structured output.
- Deterministic node (Retrieve): Performs vector similarity search — no LLM involved.
- Agent node (Agent): Reasons over retrieved context and can fetch additional information via tools.