Multi-agent

Multi-agent systems break complex applications into coordinated components. Importantly, “multi-agent” doesn’t necessarily mean multiple distinct agents — a single agent with dynamic behavior can achieve similar capabilities.

Why multi-agent?

When developers say they need “multi-agent,” they’re usually looking for one or more of these capabilities:

Context management

Surface relevant knowledge without overwhelming the context window. Different tasks (agents) need different context.

Distributed development

Let different teams develop and maintain capabilities independently with clear boundaries.

Parallelization

Spawn specialized workers for subtasks and execute them concurrently for faster results.

Sequential constraints

Enforce step-by-step workflows. Unlock tools and actions only after preconditions are met.

Multi-agent patterns are particularly valuable when a single agent has too many tools and makes poor decisions about which to use, when tasks require specialized knowledge with extensive context (long prompts and domain-specific tools), or when you need to enforce sequential constraints that unlock capabilities only after certain conditions are met.

At the center of multi-agent design is context engineering—deciding what information each agent sees. The quality of your system depends on ensuring each agent has access to the right data for its task.

Patterns

Here are the main patterns for building multi-agent systems, each suited to different use cases:

Pattern	How it works
Subagents	A main agent coordinates subagents and background jobs as tools. Centralized control — all routing passes through the main agent. Multiple coordination approaches available.
Handoffs	Behavior changes dynamically based on state. Tool calls update a state variable, and the system adjusts behavior accordingly. Supports both handoffs between distinct agents and dynamic configuration changes.
Skills	Specialized prompts loaded on-demand. The main agent stays in control and gains additional context as needed.
Router	A routing step classifies input and directs it to one or more specialized agents. Results are collected and returned to the user.
Custom workflow	Build bespoke logic with LangGraph, mixing deterministic and agentic steps. Reuse or customize agents as needed.

Tool calling is the primary coordination mechanism across all patterns. Tools can:

Invoke sub-agents (subagents)
Update state to trigger routing or configuration changes (handoffs)
Load context on-demand (skills)
Invoke entire multi-agent systems (wrapping a router as a tool)

Choosing a pattern

Use this table to match your requirements to the right pattern:

Pattern	Distributed development	Parallelization	Multi-hop	Direct user interaction
Subagents	⭐⭐⭐	⭐⭐⭐	⭐⭐⭐	—
Handoffs	—	—	⭐⭐⭐	⭐⭐⭐
Skills	⭐⭐⭐	⭐⭐	⭐⭐⭐	⭐⭐⭐
Router	⭐⭐	⭐⭐⭐	—	⭐⭐

Distributed development: Can different teams maintain components independently?
Parallelization: Can multiple agents execute concurrently?
Multi-hop: Does the pattern support multiple hops between agents?
Direct user interaction: Can subagents converse directly with the user?

You can mix patterns! For example, a subagents pattern can manage workflow sub-graphs or use the router pattern as a tool (querying multiple knowledge bases in parallel, then synthesizing results). A handoffs state machine can invoke skills at specific stages (loading specialized context only when reaching certain steps). The single dispatch tool approach can work within a custom workflow to parallelize independent tasks while maintaining a deterministic overall structure.

Subagents

In the subagents architecture, a central main agent (often referred to as a supervisor) coordinates subagents by calling them as tools. The main agent decides which subagent to invoke, what input to provide, and how to combine results. Subagents are stateless—they don’t remember past interactions, with all conversation memory maintained by the main agent. This provides context isolation: each subagent invocation works in a clean context window, preventing context bloat in the main conversation. Key characteristics:

Centralized control: All routing passes through the main agent
No direct user interaction: Subagents return results to the main agent, not the user
Subagents via tools: The main agent invokes each subagent through a tool call
Parallel execution: The main agent can invoke multiple subagents in a single turn

Use the subagents pattern when you have multiple distinct domains (e.g., calendar, email, CRM, database), subagents don’t need to converse directly with users, or you want centralized workflow control. For simpler cases with just a few tools, use a single agent.

Tutorial: Build an agent with subagents

Learn how to build a personal assistant using the subagents pattern, where a central main agent (supervisor) coordinates specialized worker agents.

Sync vs async

By default, subagent calls are synchronous—the main agent waits for each subagent to complete before continuing. This is simple and works well for most cases. For long-running tasks (reviewing contracts, conducting research, auditing code), use asynchronous execution. The main agent kicks off a background job and continues conversing with the user while the work completes.

Background job pattern

Key characteristics:

Three-tool pattern: Kick off job (returns job ID), check status, get results
Asynchronous execution: Work proceeds in the background while main agent remains responsive
User-initiated checks: Main agent checks job status when the user asks, not on a polling schedule

Handling job completion: When a job finishes, your application needs to notify the user. One approach: surface a notification that, when clicked, sends a HumanMessage like “Check job_123 and summarize the results.”

Tool patterns

There are two main ways to expose subagents as tools:

Pattern	Best for	Trade-off
Tool per agent	Fine-grained control over each subagent’s input/output	More setup, but more customization
Single dispatch tool	Many agents, distributed teams, convention over configuration	Simpler composition, less per-agent customization

Tool per agent

The key idea is wrapping subagents as tools that the main agent can call:

from langchain.tools import tool
from langchain.agents import create_agent

# Create a sub-agent
subagent = create_agent(model="...", tools=[...])

# Wrap it as a tool
@tool("subagent_name", description="subagent_description")
def call_subagent(query: str):
    result = subagent.invoke({"messages": [{"role": "user", "content": query}]})
    return result["messages"][-1].content

# Main agent with subagent as a tool
main_agent = create_agent(model="...", tools=[call_subagent])

The main agent invokes the subagent tool when it decides the task matches the subagent’s description, receives the result, and continues orchestration. See Context engineering for fine-grained control.

Single dispatch tool

An alternative approach uses a single parameterized tool to spawn ephemeral sub-agents for independent tasks. Unlike the tool per agent approach where each sub-agent is wrapped as a separate tool, this uses a convention-based approach with a single task tool: the task description is passed as a human message to the sub-agent, and the sub-agent’s final message is returned as the tool result. Key characteristics:

Single task tool: One parameterized tool that can invoke any registered sub-agent by name
Convention-based invocation: Agent selected by name, task passed as human message, final message returned as tool result
Team distribution: Different teams can develop and deploy agents independently
Agent discovery: Sub-agents can be discovered via system prompt (listing available agents) or through progressive disclosure (loading agent information on-demand via tools)

Use this approach when you want to distribute agent development across multiple teams, need to isolate complex tasks into separate context windows, need a scalable way to add new agents without modifying the coordinator, or prefer convention over customization. This approach trades flexibility in context engineering for simplicity in agent composition and strong context isolation.

An interesting aspect of this approach is that sub-agents may have the exact same capabilities as the main agent. In such cases, the primary reason to spawn a sub-agent is context isolation: complex, multi-step tasks run in isolated context windows without bloating the main agent’s conversation history. The sub-agent completes its work autonomously and returns only a concise summary, keeping the main thread focused and efficient.

Agent registry with task dispatcher

from langchain.tools import tool
from langchain.agents import create_agent

# Sub-agents developed by different teams
research_agent = create_agent(
    model="gpt-4o",
    system_prompt="You are a research specialist..."
)

writer_agent = create_agent(
    model="gpt-4o",
    system_prompt="You are a writing specialist..."
)

# Registry of available sub-agents
SUBAGENTS = {
    "research": research_agent,
    "writer": writer_agent,
}

@tool
def task(
    agent_name: str,
    description: str
) -> str:
    """Launch an ephemeral subagent for a task.

    Available agents:
    - research: Research and fact-finding
    - writer: Content creation and editing
    """
    agent = SUBAGENTS[agent_name]
    result = agent.invoke({
        "messages": [
            {"role": "user", "content": description}
        ]
    })
    return result["messages"][-1].content

# Main coordinator agent
main_agent = create_agent(
    model="gpt-4o",
    tools=[task],
    system_prompt=(
        "You coordinate specialized sub-agents. "
        "Available: research (fact-finding), "
        "writer (content creation). "
        "Use the task tool to delegate work."
    ),
)
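
The dispatcher above looks up `SUBAGENTS[agent_name]` directly, so a misspelled name raises a `KeyError`, and the list of agents is repeated by hand in the docstring and the system prompt. One possible refinement, sketched here in plain Python, derives the listing from the registry and returns a recoverable message for unknown names. The registry shape (description plus callable) and the helpers `describe_agents` and `dispatch` are assumptions for illustration; the lambdas stand in for real agents.

```python
from typing import Callable

# Hypothetical stand-ins: each "agent" is a function from task text to a reply.
# In the example above these would be the agents created by each team.
SUBAGENTS: dict[str, tuple[str, Callable[[str], str]]] = {
    "research": ("Research and fact-finding", lambda task: f"[research] {task}"),
    "writer": ("Content creation and editing", lambda task: f"[writer] {task}"),
}


def describe_agents() -> str:
    """Build the 'Available agents' listing from the registry."""
    return "\n".join(
        f"- {name}: {desc}" for name, (desc, _) in sorted(SUBAGENTS.items())
    )


def dispatch(agent_name: str, description: str) -> str:
    """Run the named sub-agent, or return a message the model can act on."""
    entry = SUBAGENTS.get(agent_name)
    if entry is None:
        return f"Unknown agent '{agent_name}'. Available agents:\n{describe_agents()}"
    _, run = entry
    return run(description)
```

With this shape, adding a new agent means adding one registry entry; the listing used for agent discovery stays in sync automatically.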

Context engineering

Control how context flows between the main agent and its subagents:

Category	Purpose	Impacts
Subagent specs	Ensure subagents are invoked when they should be	Main agent routing decisions
Subagent inputs	Ensure subagents can execute well with optimized context	Subagent performance
Subagent outputs	Ensure the supervisor can act on subagent results	Main agent performance

Subagent specs

The name and description you give a subagent tool determine when the main agent decides to invoke it. These are prompting levers—choose them carefully.

Name: How the main agent refers to the sub-agent. Keep it clear and action-oriented (e.g., research_agent, code_reviewer).
Description: What the main agent knows about the sub-agent’s capabilities. Be specific about what tasks it handles and when to use it.

Subagent inputs

Customize what context the subagent receives to execute its task. Add input that isn’t practical to capture in a static prompt—full message history, prior results, or task metadata—by pulling from the agent’s state.

from langchain.agents import AgentState
from langchain.tools import tool, ToolRuntime

class CustomState(AgentState):
    example_state_key: str

@tool(
    "subagent1_name",
    description="subagent1_description"
)
def call_subagent1(query: str, runtime: ToolRuntime[None, CustomState]):
    # Apply any logic needed to transform the messages into a suitable input
    subagent_input = some_logic(query, runtime.state["messages"])
    result = subagent1.invoke({
        "messages": subagent_input,
        # You could also pass other state keys here as needed.
        # Make sure to define these in both the main and subagent's
        # state schemas.
        "example_state_key": runtime.state["example_state_key"]
    })
    return result["messages"][-1].content
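
The example above leaves `some_logic` undefined. As a rough sketch of what such a function might do, the version below keeps only the most recent conversational turns and appends the delegated query. It assumes messages are plain dicts with `role` and `content` keys, which is a simplification; the function name `build_subagent_input` and the `max_turns` parameter are hypothetical.

```python
def build_subagent_input(
    query: str, history: list[dict], max_turns: int = 4
) -> list[dict]:
    """Keep the most recent user/assistant turns and append the delegated query."""
    # Drop tool chatter and system messages; the subagent has its own prompt.
    dialogue = [m for m in history if m.get("role") in ("user", "assistant")]
    # Guard against max_turns == 0, since dialogue[-0:] would return everything.
    recent = dialogue[-max_turns:] if max_turns > 0 else []
    return recent + [{"role": "user", "content": query}]
```

Trimming this way trades completeness for a smaller context window; whether four turns is enough depends on the task.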

Subagent outputs

Customize what the main agent receives back so it can make good decisions. Two strategies:

Prompt the sub-agent: Specify exactly what should be returned. A common failure mode is that the sub-agent performs tool calls or reasoning but doesn’t include results in its final message—remind it that the supervisor only sees the final output.
Format in code: Adjust or enrich the response before returning it. For example, pass specific state keys back in addition to the final text using a Command.

from typing import Annotated
from langchain.messages import ToolMessage
from langchain.tools import tool, InjectedToolCallId
from langgraph.types import Command


@tool(
    "subagent1_name",
    description="subagent1_description"
)
def call_subagent1(
    query: str,
    tool_call_id: Annotated[str, InjectedToolCallId],
) -> Command:
    result = subagent1.invoke({
        "messages": [{"role": "user", "content": query}]
    })
    return Command(update={
        # Pass back additional state from the subagent
        "example_state_key": result["example_state_key"],
        "messages": [
            ToolMessage(
                content=result["messages"][-1].content,
                tool_call_id=tool_call_id
            )
        ]
    })

Handoffs

In the handoffs architecture, behavior changes dynamically based on state. The core mechanism: tools update a state variable (e.g., current_step or active_agent) that persists across turns, and the system reads this variable to adjust behavior—either applying different configuration (system prompt, tools) or routing to a different agent. This pattern supports both handoffs between distinct agents and dynamic configuration changes within a single agent.

The term handoffs was coined by OpenAI for using tool calls (e.g., transfer_to_sales_agent) to transfer control between agents or states.

Key characteristics:

State-driven behavior: Behavior changes based on a state variable (e.g., current_step or active_agent)
Tool-based transitions: Tools update the state variable to move between states
Direct user interaction: Each state’s configuration handles user messages directly
Persistent state: State survives across conversation turns

Use the handoffs pattern when you need to enforce sequential constraints (unlock capabilities only after preconditions are met), the agent needs to converse directly with the user across different states, or you’re building multi-stage conversational flows. This pattern is particularly valuable for customer support scenarios where you need to collect information in a specific sequence — for example, collecting a warranty ID before processing a refund.
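
The core mechanism (a tool updates a state variable, and the system reads it to pick a configuration) can be illustrated without any framework. The sketch below follows the warranty-before-refund example; the step names, tool names, and the helpers `record_warranty_id` and `active_config` are all hypothetical.

```python
# Hypothetical step table: each step has its own prompt and allowed tools.
STEPS = {
    "collect_warranty": {
        "prompt": "Ask the customer for their warranty ID.",
        "tools": ["record_warranty_id"],
    },
    "process_refund": {
        "prompt": "Process the refund for warranty {warranty_id}.",
        "tools": ["issue_refund"],
    },
}


def record_warranty_id(state: dict, warranty_id: str) -> dict:
    """Tool-based transition: store the ID and advance to the next step."""
    return {**state, "warranty_id": warranty_id, "current_step": "process_refund"}


def active_config(state: dict) -> dict:
    """Read the state variable and return the configuration for this turn."""
    step = STEPS[state["current_step"]]
    return {"prompt": step["prompt"].format(**state), "tools": step["tools"]}


state = {"current_step": "collect_warranty"}
before = active_config(state)  # the refund tool is not yet available
state = record_warranty_id(state, "W-1001")
after = active_config(state)  # the refund tool is now unlocked
```

The refund tool never appears in the tool list until the warranty ID has been recorded, which is how the sequential constraint is enforced.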

Tutorial: Build a customer support agent using handoffs

Learn how to build a customer support agent using the handoffs pattern, where a single agent transitions between different configurations.

There are two ways to implement handoffs: single agent with middleware (one agent with dynamic configuration) or multiple agent subgraphs (distinct agents as graph nodes).

Single agent with middleware

A single agent changes its behavior based on state. Middleware intercepts each model call and dynamically adjusts the system prompt and available tools. Tools update the state variable to trigger transitions:

from langchain_core.tools import tool
from langchain.tools import ToolRuntime
from langchain.messages import ToolMessage
from langgraph.types import Command

@tool
def record_warranty_status(
    status: str,
    runtime: ToolRuntime[None, SupportState]
) -> Command:
    """Record warranty status and transition to next step."""
    return Command(
        update={
            "messages": [
                ToolMessage(
                    content=f"Warranty status recorded: {status}",
                    tool_call_id=runtime.tool_call_id
                )
            ],
            "warranty_status": status,
            "current_step": "specialist"  # Update state to trigger transition
        }
    )

Complete example: Customer support with middleware

from langchain.agents import AgentState, create_agent
from langchain.agents.middleware import wrap_model_call, ModelRequest, ModelResponse
from langchain.tools import tool, ToolRuntime
from langchain.messages import ToolMessage
from langgraph.checkpoint.memory import InMemorySaver
from langgraph.types import Command
from typing import Callable

# 1. Define state with current_step tracker
class SupportState(AgentState):  
    """Track which step is currently active."""
    current_step: str = "triage"
    warranty_status: str | None = None

# 2. Tools update current_step via Command
@tool
def record_warranty_status(
    status: str,
    runtime: ToolRuntime[None, SupportState]
) -> Command:  
    """Record warranty status and transition to next step."""
    return Command(update={
        "messages": [
            ToolMessage(
                content=f"Warranty status recorded: {status}",
                tool_call_id=runtime.tool_call_id
            )
        ],
        "warranty_status": status,
        "current_step": "specialist"  # Transition to next step
    })

# 3. Middleware applies dynamic configuration based on current_step
@wrap_model_call
def apply_step_config(
    request: ModelRequest,
    handler: Callable[[ModelRequest], ModelResponse]
) -> ModelResponse:
    """Configure agent behavior based on current_step."""
    step = request.state.get("current_step", "triage")  

    # Map steps to their configurations
    configs = {
        "triage": {
            "prompt": "Collect warranty information...",
            "tools": [record_warranty_status]
        },
        "specialist": {
            "prompt": "Provide solutions based on warranty: {warranty_status}",
            "tools": [provide_solution, escalate]
        }
    }

    config = configs[step]
    request = request.override(  
        system_prompt=config["prompt"].format(**request.state),  
        tools=config["tools"]  
    )
    return handler(request)

# 4. Create agent with middleware
# `model`, `provide_solution`, and `escalate` are assumed to be defined elsewhere.
agent = create_agent(
    model,
    tools=[record_warranty_status, provide_solution, escalate],
    state_schema=SupportState,
    middleware=[apply_step_config],
    checkpointer=InMemorySaver()  # Persist state across turns
)

Multiple agent subgraphs

Multiple distinct agents exist as separate nodes in a graph. Handoff tools return a Command whose goto names the agent node to execute next and whose graph=Command.PARENT applies that navigation in the parent graph:

@tool
def transfer_to_sales():
    """Transfer to the sales agent."""
    return Command(
        goto="sales_agent",  # Navigate to the sales agent node
        update={"active_agent": "sales_agent"},
        graph=Command.PARENT  # Navigate in parent graph
    )

Complete example: Sales and support with handoffs

This example shows a multi-agent system with separate sales and support agents. Each agent is a separate graph node, and handoff tools allow agents to transfer conversations to each other.

from typing import Literal
from langchain.agents import AgentState, create_agent
from langgraph.graph import StateGraph, START
from langgraph.types import Command
from langchain_core.tools import tool

# 1. Define state with active_agent tracker
class MultiAgentState(AgentState):
    active_agent: str = "sales_agent"  # Track which agent is active

# 2. Create handoff tools
@tool
def transfer_to_sales():
    """Transfer to the sales agent."""
    return Command(
        goto="sales_agent",
        update={"active_agent": "sales_agent"},
        graph=Command.PARENT
    )

@tool
def transfer_to_support():
    """Transfer to the support agent."""
    return Command(
        goto="support_agent",
        update={"active_agent": "support_agent"},
        graph=Command.PARENT
    )

# 3. Create agents with handoff tools
sales_agent = create_agent(
    model="anthropic:claude-3-5-sonnet-latest",
    tools=[transfer_to_support],
    system_prompt="You are a sales agent. Help with sales inquiries."
)

support_agent = create_agent(
    model="anthropic:claude-3-5-sonnet-latest",
    tools=[transfer_to_sales],
    system_prompt="You are a support agent. Help with technical issues."
)

# 4. Create agent nodes that invoke the agents
def call_sales_agent(state: MultiAgentState):
    """Node that calls the sales agent."""
    response = sales_agent.invoke(state)
    return response

def call_support_agent(state: MultiAgentState):
    """Node that calls the support agent."""
    response = support_agent.invoke(state)
    return response

# 5. Create router node
def route_to_agent(state: MultiAgentState) -> Literal["sales_agent", "support_agent"]:
    """Route to the active agent based on state."""
    return state["active_agent"]

# 6. Build the graph
builder = StateGraph(MultiAgentState)
builder.add_node("sales_agent", call_sales_agent)
builder.add_node("support_agent", call_support_agent)

# Start with conditional routing based on initial active_agent
builder.add_conditional_edges(
    START,
    route_to_agent,
    ["sales_agent", "support_agent"]
)

# After each agent, route to the active agent (enables handoffs)
builder.add_conditional_edges(
    "sales_agent",
    route_to_agent,
    ["sales_agent", "support_agent"]
)
builder.add_conditional_edges(
    "support_agent",
    route_to_agent,
    ["sales_agent", "support_agent"]
)

graph = builder.compile()

Use single agent with middleware for most handoffs use cases—it’s simpler. Only use multiple agent subgraphs when you need bespoke agent implementations (e.g., a node that’s itself a complex graph with reflection or retrieval steps).

Implementation considerations:

Conversation history: Decide what conversation history each agent/state receives—full history, filtered portions, or summaries.
Tool semantics: Clarify whether handoff tools only update routing state or also perform actions (e.g., should transfer_to_sales() also create a ticket?).

Skills

In the skills architecture, specialized capabilities are packaged as invokable “skills” that augment an agent’s behavior. Skills are primarily prompt-driven specializations that an agent can invoke on-demand. Key characteristics:

Prompt-driven specialization: Skills are primarily defined by specialized prompts
Progressive disclosure: Skills become available based on context or user needs
Team distribution: Different teams can develop and maintain skills independently
Lightweight composition: Skills are simpler than full sub-agents

Use the skills pattern when you want a single agent with many possible specializations, you don’t need to enforce specific constraints between skills, or different teams need to develop capabilities independently. Common examples include coding assistants (skills for different languages or tasks), knowledge bases (skills for different domains), and creative assistants (skills for different formats).

This pattern is conceptually identical to llms.txt (introduced by Jeremy Howard), which uses tool calling for progressive disclosure of documentation. The skills pattern applies the same approach to specialized prompts and domain knowledge rather than just documentation pages.

Loading skills on-demand

from langchain.tools import tool
from langchain.agents import create_agent

@tool
def load_skill(skill_name: str) -> str:
    """Load a specialized skill prompt.

    Available skills:
    - write_sql: SQL query writing expert
    - review_legal_doc: Legal document reviewer

    Returns the skill's prompt and context.
    """
    # Load skill content from file/database
    ...

agent = create_agent(
    model="gpt-4o",
    tools=[load_skill],
    system_prompt=(
        "You are a helpful assistant. "
        "You have access to two skills: "
        "write_sql and review_legal_doc. "
        "Use load_skill to access them."
    ),
)
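
The body of `load_skill` is elided above. As one possible shape for it, the sketch below reads skill prompts from an in-memory mapping and returns a recoverable message for unknown names. The `SKILLS` contents and the helper name `load_skill_text` are hypothetical; the same lookup could be backed by files or a database.

```python
# Hypothetical skill store; the content could equally come from files or a database.
SKILLS = {
    "write_sql": "You are a SQL expert. Prefer explicit JOINs and name every column.",
    "review_legal_doc": "You are a legal reviewer. Flag indemnity and termination clauses.",
}


def load_skill_text(skill_name: str) -> str:
    """Return the prompt for a skill, or a listing the model can recover from."""
    prompt = SKILLS.get(skill_name)
    if prompt is None:
        available = ", ".join(sorted(SKILLS))
        return f"Unknown skill '{skill_name}'. Available skills: {available}"
    return f"# Skill: {skill_name}\n{prompt}"
```

Returning the list of valid names on a miss lets the agent correct itself on the next tool call rather than surfacing an exception.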

Extending the pattern

When writing custom implementations, you can extend the basic skills pattern in several ways:

Dynamic tool registration: Combine progressive disclosure with state management to register new tools as skills load. For example, loading a “database_admin” skill could both add specialized context and register database-specific tools (backup, restore, migrate). This uses the same tool-and-state mechanisms used across multi-agent patterns: tools update state to dynamically change agent capabilities.
Hierarchical skills: Skills can define other skills in a tree structure, creating nested specializations. For instance, loading a “data_science” skill might make available sub-skills like “pandas_expert”, “visualization”, and “statistical_analysis”. Each sub-skill can be loaded independently as needed, allowing for fine-grained progressive disclosure of domain knowledge. This hierarchical approach helps manage large knowledge bases by organizing capabilities into logical groupings that can be discovered and loaded on-demand.
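
Hierarchical skills could be modeled as a small tree in which loading a skill returns its prompt together with the names of its sub-skills. The sketch below is one way to do that under stated assumptions: the `Skill` dataclass, the slash-separated path convention, and `load_skill_path` are all hypothetical, and the prompts are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    prompt: str
    children: dict[str, "Skill"] = field(default_factory=dict)


# Hypothetical skill tree mirroring the data_science example above.
SKILL_TREE = {
    "data_science": Skill(
        prompt="General data science guidance.",
        children={
            "pandas_expert": Skill("Idiomatic pandas usage."),
            "visualization": Skill("Chart selection and styling."),
            "statistical_analysis": Skill("Choosing and interpreting tests."),
        },
    ),
}


def load_skill_path(path: str) -> str:
    """Load a skill by slash-separated path and list its sub-skills."""
    level = SKILL_TREE
    skill = None
    for part in path.split("/"):
        skill = level.get(part)
        if skill is None:
            return f"Unknown skill '{path}'."
        level = skill.children
    listing = ", ".join(sorted(skill.children)) or "none"
    return f"{skill.prompt}\nSub-skills: {listing}"
```

The agent sees sub-skill names only after loading the parent, so deeper levels of the tree cost no context until they are needed.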

Tutorial: Build an agent with on-demand skill loading

Learn how to implement skills with progressive disclosure, where the agent loads specialized prompts and schemas on-demand rather than upfront.

Router

In the router architecture, a routing step classifies input and directs it to specialized agents. This is useful when you have distinct verticals—separate knowledge domains that each require their own agent. Key characteristics:

Router decomposes the query
Zero or more specialized agents are invoked in parallel
Results are synthesized into a coherent response

Two approaches:

Stateless routers address each request independently
Stateful routers maintain conversation history across requests

Stateless

Each request is routed independently—no memory between calls. For multi-turn conversations, see Stateful routers.

Stateless router vs Subagents: The subagents pattern can also route to multiple agents. Use the stateless router when you need specialized preprocessing or custom routing logic. Use the subagents pattern when you want the LLM to decide which agents to call dynamically.

Building a multi-source knowledge base router

Your organization’s knowledge lives in multiple places: GitHub repositories, Notion wikis, and Slack conversations. These are three distinct verticals, each requiring specialized tools and context. When users ask questions like “How do I authenticate API requests?”, the answer may require information from multiple sources. This example builds a router that decomposes queries, identifies which verticals to consult, queries them in parallel, and synthesizes results.

from typing import TypedDict
from langgraph.graph import StateGraph, START, END
from langgraph.types import Send
from langchain.agents import create_agent
from langchain_openai import ChatOpenAI

class RouterState(TypedDict):
    query: str
    routes: list[str]  # Which knowledge bases to query
    github_result: str | None
    notion_result: str | None
    slack_result: str | None
    final_answer: str

# Specialized agents for each vertical
github_agent = create_agent(
    model="openai:gpt-4o",
    tools=[search_code, search_issues, search_prs],
    system_prompt="You are a GitHub expert. Answer questions about code, API references, and implementation details.",
    name="github_agent"
)

notion_agent = create_agent(
    model="openai:gpt-4o",
    tools=[search_notion, get_page],
    system_prompt="You are a Notion expert. Answer questions about internal processes, policies, and team documentation.",
    name="notion_agent"
)

slack_agent = create_agent(
    model="openai:gpt-4o",
    tools=[search_slack, get_thread],
    system_prompt="You are a Slack expert. Answer questions by searching relevant threads and discussions.",
    name="slack_agent"
)

router_llm = ChatOpenAI(model="gpt-4o-mini")

def decompose_query(state: RouterState) -> dict:  
    """Decompose query and determine which knowledge bases to consult."""
    response = router_llm.invoke([
        {
            "role": "system",
            "content": "Analyze this query and determine which knowledge bases to consult. Return a JSON list with one or more of: 'github', 'notion', 'slack'."
        },
        {"role": "user", "content": state["query"]}
    ])
    # Parse LLM response to get routes (simplified for example)
    routes = ["github", "notion"]  # In practice, parse from LLM response
    return {"routes": routes}

# Route to multiple agents in parallel
def route_to_agents(state: RouterState) -> list[Send]:  
    """Fan out to multiple agents in parallel."""
    return [Send(route, state) for route in state["routes"]]  

def query_github(state: RouterState) -> dict:
    result = github_agent.invoke({
        "messages": [{"role": "user", "content": state["query"]}]
    })
    return {"github_result": result["messages"][-1].content}

def query_notion(state: RouterState) -> dict:
    result = notion_agent.invoke({
        "messages": [{"role": "user", "content": state["query"]}]
    })
    return {"notion_result": result["messages"][-1].content}

def query_slack(state: RouterState) -> dict:
    result = slack_agent.invoke({
        "messages": [{"role": "user", "content": state["query"]}]
    })
    return {"slack_result": result["messages"][-1].content}

def synthesize_results(state: RouterState) -> dict:  
    """Combine results from multiple agents into a coherent answer."""
    results = []
    if state.get("github_result"):
        results.append(f"GitHub: {state['github_result']}")
    if state.get("notion_result"):
        results.append(f"Notion: {state['notion_result']}")
    if state.get("slack_result"):
        results.append(f"Slack: {state['slack_result']}")

    # Use LLM to synthesize
    synthesis_response = router_llm.invoke([
        {"role": "system", "content": "Synthesize these search results into a coherent answer."},
        {"role": "user", "content": "\n\n".join(results)}
    ])
    return {"final_answer": synthesis_response.content}

# Build workflow with parallel execution
workflow = (
    StateGraph(RouterState)
    .add_node("decompose", decompose_query)
    .add_node("github", query_github)
    .add_node("notion", query_notion)
    .add_node("slack", query_slack)
    .add_node("synthesize", synthesize_results)
    .add_edge(START, "decompose")
    .add_conditional_edges("decompose", route_to_agents, ["github", "notion", "slack"])  
    .add_edge("github", "synthesize")
    .add_edge("notion", "synthesize")
    .add_edge("slack", "synthesize")
    .add_edge("synthesize", END)
    .compile()
)

result = workflow.invoke({"query": "How do I authenticate API requests?"})
print(result["final_answer"])
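
The `decompose_query` function above hardcodes its routes and notes that they should be parsed from the model's reply. A minimal sketch of that parsing step follows. The helper name `parse_routes` is hypothetical, and falling back to every source when parsing fails is an assumption; you may prefer to raise an error or retry instead.

```python
import json

VALID_ROUTES = {"github", "notion", "slack"}


def parse_routes(llm_text: str) -> list[str]:
    """Extract a validated list of routes from the router model's reply."""
    try:
        # Tolerate prose around the JSON by taking the outermost [...] span.
        start, end = llm_text.index("["), llm_text.rindex("]") + 1
        candidates = json.loads(llm_text[start:end])
    except ValueError:  # covers missing brackets and json.JSONDecodeError
        return sorted(VALID_ROUTES)  # assumed fallback: consult every source
    routes = [r for r in candidates if isinstance(r, str) and r in VALID_ROUTES]
    # Deduplicate while preserving order so each node is dispatched once.
    return list(dict.fromkeys(routes)) or sorted(VALID_ROUTES)
```

Validating against `VALID_ROUTES` matters because each route name is later used as a node name in the fan-out; an unrecognized name would otherwise fail at dispatch time.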

Stateful

For multi-turn conversations, you need to maintain context across invocations.

Tool wrapper

The simplest approach: wrap the stateless router as a tool that a conversational agent can call. The conversational agent handles memory and context; the router stays stateless. This avoids the complexity of managing conversation history across multiple parallel agents.

@tool
def search_docs(query: str) -> str:
    """Search across multiple documentation sources."""
    result = workflow.invoke({"query": query})
    return result["final_answer"]

# Conversational agent uses the router as a tool
conversational_agent = create_agent(
    model,
    tools=[search_docs],
    system_prompt="You are a helpful assistant. Use search_docs to answer questions."
)

Full persistence

If you need the router itself to maintain state, use persistence to store message history. When routing to an agent, fetch previous messages from state and selectively include them in the agent’s context—this is a lever for context engineering.

Stateful routers require custom history management. If the router switches between agents across turns, conversations may not feel fluid to end users when agents have different tones or prompts. With parallel invocation, you’ll need to maintain history at the router level (inputs and synthesized outputs) and leverage this history in routing logic. Consider the handoffs pattern or subagents pattern instead—both provide clearer semantics for multi-turn conversations.
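
To make router-level history concrete, the sketch below stores only user inputs and synthesized answers, then builds a context string for the next routing decision or agent call. It is a framework-free illustration: the `RouterHistory` class and its methods are hypothetical, and a real system would persist this state rather than hold it in memory.

```python
from dataclasses import dataclass, field


@dataclass
class RouterHistory:
    """Router-level memory: user inputs and synthesized answers only."""

    turns: list[tuple[str, str]] = field(default_factory=list)

    def record(self, query: str, answer: str) -> None:
        """Store one completed turn after synthesis."""
        self.turns.append((query, answer))

    def context_for(self, query: str, max_turns: int = 3) -> str:
        """Prefix the new query with recent turns for routing or agent input."""
        recent = self.turns[-max_turns:] if max_turns > 0 else []
        lines = [f"User: {q}\nAssistant: {a}" for q, a in recent]
        return "\n\n".join(lines + [f"User: {query}"])
```

Keeping per-agent intermediate messages out of this history is a deliberate simplification; it avoids merging several parallel transcripts into one thread.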

Custom workflow

In the custom workflow architecture, you define your own bespoke execution flow using LangGraph. You have complete control over the graph structure—including sequential steps, conditional branches, loops, and parallel execution. Use custom workflows when:

Standard patterns (subagents, skills, etc.) don’t fit your requirements
You need to mix deterministic logic with agentic behavior
Your use case requires complex routing or multi-stage processing

Each node in your workflow can be a simple function, an LLM call, or an entire agent with tools. You can also compose other architectures within a custom workflow—for example, embedding a multi-agent system as a single node. The router pattern is an example of a custom workflow.
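To make the mix of deterministic logic and agentic behavior concrete, here is a minimal sketch written as plain Python functions. Nodes and routing functions are ordinary callables that take the state and return an update, so they can be exercised without building a graph. Everything named here (`SupportState`, `FAQ`, `lookup_node`, `route_after_lookup`, `make_agent_node`, `run`) is hypothetical, the agent is a stub, and the `run` driver is only a stand-in for graph wiring; in LangGraph, a routing function like this would typically be attached with `add_conditional_edges`.

```python
from typing import Callable, TypedDict


class SupportState(TypedDict, total=False):
    query: str
    answer: str
    needs_agent: bool


# Hypothetical canned answers; a real system might consult a cache or rules engine.
FAQ = {"what are your hours?": "We are open 9am to 5pm, Monday to Friday."}


def lookup_node(state: SupportState) -> dict:
    """Deterministic step: answer from the FAQ table when the query matches exactly."""
    answer = FAQ.get(state["query"].strip().lower())
    if answer is None:
        return {"needs_agent": True}
    return {"answer": answer, "needs_agent": False}


def route_after_lookup(state: SupportState) -> str:
    """Routing function: name the next step based on the current state."""
    return "agent" if state["needs_agent"] else "done"


def make_agent_node(run_agent: Callable[[str], str]) -> Callable[[SupportState], dict]:
    """Wrap any callable that maps a query to an answer (a stand-in for a real agent)."""
    def agent_node(state: SupportState) -> dict:
        return {"answer": run_agent(state["query"])}
    return agent_node


# Stub agent so the sketch runs without a model; swap in a real agent call.
agent_node = make_agent_node(lambda q: f"[agent] researching: {q}")


def run(query: str) -> SupportState:
    """Stand-in driver: apply the nodes in order and merge their state updates."""
    state: SupportState = {"query": query}
    state.update(lookup_node(state))
    if route_after_lookup(state) == "agent":
        state.update(agent_node(state))
    return state
```

Keeping the cheap deterministic check in front of the agent means the model is only invoked when the lookup cannot answer, which is one common reason to reach for a custom workflow.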

Calling a LangChain agent from a LangGraph node: The main insight when mixing LangChain and LangGraph is that you can call a LangChain agent directly inside any LangGraph node. This lets you combine the flexibility of custom workflows with the convenience of pre-built agents:

from langchain.agents import create_agent

agent = create_agent(model="openai:gpt-4o", tools=[...])

def agent_node(state: State) -> dict:
    """A LangGraph node that invokes a LangChain agent."""
    result = agent.invoke({
        "messages": [{"role": "user", "content": state["query"]}]
    })
    return {"answer": result["messages"][-1].content}

Example: RAG pipeline — A common use case is combining retrieval with an agent. This example builds a WNBA stats assistant that retrieves from a knowledge base and can fetch live news.

Custom RAG workflow

The workflow demonstrates three types of nodes:

Model node (Rewrite): Rewrites the user query for better retrieval using structured output.
Deterministic node (Retrieve): Performs vector similarity search — no LLM involved.
Agent node (Agent): Reasons over retrieved context and can fetch additional information via tools.

You can use LangGraph state to pass information between workflow steps. This allows each part of your workflow to read and update structured fields, making it easy to share data and context across nodes.

from typing import TypedDict
from pydantic import BaseModel
from langgraph.graph import StateGraph, START, END
from langchain.agents import create_agent
from langchain.tools import tool
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_core.vectorstores import InMemoryVectorStore

class State(TypedDict):
    question: str
    rewritten_query: str
    documents: list[str]
    answer: str

# WNBA knowledge base with rosters, game results, and player stats
embeddings = OpenAIEmbeddings()
vector_store = InMemoryVectorStore(embeddings)
vector_store.add_texts([
    # Rosters
    "New York Liberty 2024 roster: Breanna Stewart, Sabrina Ionescu, Jonquel Jones, Courtney Vandersloot.",
    "Las Vegas Aces 2024 roster: A'ja Wilson, Kelsey Plum, Jackie Young, Chelsea Gray.",
    "Indiana Fever 2024 roster: Caitlin Clark, Aliyah Boston, Kelsey Mitchell, NaLyssa Smith.",
    # Game results
    "2024 WNBA Finals: New York Liberty defeated Minnesota Lynx 3-2 to win the championship.",
    "June 15, 2024: Indiana Fever 85, Chicago Sky 79. Caitlin Clark had 23 points and 8 assists.",
    "August 20, 2024: Las Vegas Aces 92, Phoenix Mercury 84. A'ja Wilson scored 35 points.",
    # Player stats
    "A'ja Wilson 2024 season stats: 26.9 PPG, 11.9 RPG, 2.6 BPG. Won MVP award.",
    "Caitlin Clark 2024 rookie stats: 19.2 PPG, 8.4 APG, 5.7 RPG. Won Rookie of the Year.",
    "Breanna Stewart 2024 stats: 20.4 PPG, 8.5 RPG, 3.5 APG.",
])
retriever = vector_store.as_retriever(search_kwargs={"k": 5})

@tool
def get_latest_news(query: str) -> str:
    """Get the latest WNBA news and updates."""
    # Your news API here
    return "Latest: The WNBA announced expanded playoff format for 2025..."

agent = create_agent(
    model="openai:gpt-4o",
    tools=[get_latest_news],
)

model = ChatOpenAI(model="gpt-4o")

class RewrittenQuery(BaseModel):
    query: str

def rewrite_query(state: State) -> dict:
    """Rewrite the user query for better retrieval."""
    system_prompt = """Rewrite this query to retrieve relevant WNBA information.
The knowledge base contains: team rosters, game results with scores, and player statistics (PPG, RPG, APG).
Focus on specific player names, team names, or stat categories mentioned."""
    response = model.with_structured_output(RewrittenQuery).invoke([
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": state["question"]}
    ])
    return {"rewritten_query": response.query}

def retrieve(state: State) -> dict:
    """Retrieve documents based on the rewritten query."""
    docs = retriever.invoke(state["rewritten_query"])
    return {"documents": [doc.page_content for doc in docs]}

def call_agent(state: State) -> dict:
    """Generate answer using retrieved context."""
    context = "\n\n".join(state["documents"])
    prompt = f"Context:\n{context}\n\nQuestion: {state['question']}"
    response = agent.invoke({"messages": [{"role": "user", "content": prompt}]})
    # Use the final message's content so the value matches `answer: str` in State
    return {"answer": response["messages"][-1].content}

workflow = (
    StateGraph(State)
    .add_node("rewrite", rewrite_query)
    .add_node("retrieve", retrieve)
    .add_node("agent", call_agent)
    .add_edge(START, "rewrite")
    .add_edge("rewrite", "retrieve")
    .add_edge("retrieve", "agent")
    .add_edge("agent", END)
    .compile()
)

result = workflow.invoke({"question": "Who won the 2024 WNBA Championship?"})
print(result["answer"])
