LangGraph Tutorial: Build Stateful AI Agents (2026)
LangGraph is the tool that takes you from LangChain scripts to real agents — ones that loop, branch, remember where they are, and wait for humans to approve risky actions. This tutorial covers everything from core concepts to production-grade patterns, with working Python code throughout.
What is LangGraph and why it exists
LangGraph is LangChain's agent orchestration layer, released in 2024 and now the de facto standard for building production agents in the Python ecosystem. The name is literal: you define your agent as a directed graph where nodes do work (call LLMs, run tools, transform data) and edges define control flow (including conditional branches and cycles).
The reason LangGraph exists is that LangChain's original chains are fundamentally linear — data flows from step A to step B to step C and exits. Real agents don't work that way. A ReAct agent needs to loop: observe → think → act → observe again. A customer support agent needs to branch: is this a billing question or a technical question? A code-writing agent needs to retry when tests fail. LangGraph gives you the graph primitives to express all of this.
The other key thing LangGraph adds is explicit state. Every node reads from and writes to a typed state dict. The graph knows what the state is at every step, which makes it possible to serialize, checkpoint, resume, and inspect runs — things that are nearly impossible with unstructured chains.
LangGraph is well-suited for:
- ReAct-style tool-using agents (search, code execution, API calls)
- Multi-agent workflows where specialized sub-agents hand off to each other
- Long-running processes that need to survive restarts
- Any workflow that requires a human approval gate
- Self-correcting loops (generate → evaluate → revise)
It is not the right choice for simple one-shot prompts, basic RAG pipelines, or tasks where a linear chain works fine. Don't reach for it when LCEL (LangChain Expression Language) is sufficient.
Core concepts: StateGraph, nodes, edges, conditional routing
StateGraph
The StateGraph is the top-level object that holds your entire agent. You instantiate it with a schema — a TypedDict that defines every field your agent will read and write. This schema is the contract between nodes.
from typing import TypedDict, Annotated
from langgraph.graph import StateGraph
import operator
class AgentState(TypedDict):
messages: Annotated[list, operator.add] # list that accumulates
tool_calls: list
iteration: int
final_answer: str | None
graph = StateGraph(AgentState)
The Annotated[list, operator.add] pattern tells LangGraph how to merge state updates from parallel branches — in this case, concatenate lists rather than overwrite.
Nodes
A node is any Python callable that accepts the current state and returns a dict of updates. Nodes are added to the graph with a string name:
def call_llm(state: AgentState) -> dict:
# state["messages"] contains the full conversation so far
response = llm.invoke(state["messages"])
return {"messages": [response], "iteration": state["iteration"] + 1}
graph.add_node("llm", call_llm)
Edges and conditional routing
Edges connect nodes. There are three kinds:
- Normal edges (
add_edge) — always go from A to B. - Conditional edges (
add_conditional_edges) — call a routing function that returns the name of the next node. - Entry/exit points —
set_entry_pointandset_finish_point(or the specialENDconstant).
from langgraph.graph import END
def route_after_llm(state: AgentState) -> str:
last_msg = state["messages"][-1]
if hasattr(last_msg, "tool_calls") and last_msg.tool_calls:
return "tools" # LLM wants to use a tool
if state["iteration"] >= 10:
return END # safety limit
return "llm" # keep going
graph.add_conditional_edges("llm", route_after_llm)
graph.add_edge("tools", "llm") # always return to llm after tool call
graph.set_entry_point("llm")
Your first LangGraph agent (ReAct with tool use)
Here is a complete, runnable ReAct agent that can search the web and do arithmetic. This is the pattern underlying most production agents.
import os
from typing import TypedDict, Annotated
import operator
from langchain_anthropic import ChatAnthropic
from langchain_community.tools import DuckDuckGoSearchRun
from langchain_core.messages import BaseMessage, HumanMessage, ToolMessage
from langgraph.graph import StateGraph, END
from langgraph.prebuilt import ToolNode
# --- State ---
class AgentState(TypedDict):
messages: Annotated[list[BaseMessage], operator.add]
# --- Tools ---
search = DuckDuckGoSearchRun()
tools = [search]
# --- LLM bound to tools ---
llm = ChatAnthropic(
model="claude-sonnet-4-5",
api_key=os.environ["ANTHROPIC_API_KEY"],
)
llm_with_tools = llm.bind_tools(tools)
# --- Nodes ---
def agent_node(state: AgentState) -> dict:
response = llm_with_tools.invoke(state["messages"])
return {"messages": [response]}
tool_node = ToolNode(tools) # handles tool dispatch automatically
# --- Routing ---
def should_continue(state: AgentState) -> str:
last = state["messages"][-1]
if hasattr(last, "tool_calls") and last.tool_calls:
return "tools"
return END
# --- Build graph ---
graph = StateGraph(AgentState)
graph.add_node("agent", agent_node)
graph.add_node("tools", tool_node)
graph.set_entry_point("agent")
graph.add_conditional_edges("agent", should_continue)
graph.add_edge("tools", "agent")
app = graph.compile()
# --- Run ---
result = app.invoke({
"messages": [HumanMessage(content="What is the current price of Nvidia stock? Then multiply it by 100.")]
})
print(result["messages"][-1].content)
Install dependencies: pip install langgraph langchain-anthropic langchain-community duckduckgo-search.
What's happening: the agent node calls Claude with the user's message. If Claude decides to use a tool, the routing function sends execution to the ToolNode, which runs the tool and returns a ToolMessage. Control loops back to the agent, which sees the tool result and either calls another tool or generates a final answer.
State management and persistence
By default, LangGraph state lives in memory and is gone the moment the process exits. For anything beyond quick demos you need a checkpointer — an object that serializes state after every node and lets you resume from any point.
MemorySaver (development only)
from langgraph.checkpoint.memory import MemorySaver
checkpointer = MemorySaver()
app = graph.compile(checkpointer=checkpointer)
# thread_id ties all runs for one user/session together
config = {"configurable": {"thread_id": "user-123"}}
result = app.invoke({"messages": [HumanMessage(content="Hello")]}, config=config)
SqliteSaver (local / single-server)
from langgraph.checkpoint.sqlite import SqliteSaver
import sqlite3
conn = sqlite3.connect("checkpoints.db", check_same_thread=False)
checkpointer = SqliteSaver(conn)
app = graph.compile(checkpointer=checkpointer)
PostgresSaver (production)
from langgraph.checkpoint.postgres import PostgresSaver
import psycopg
conn = psycopg.connect(os.environ["DATABASE_URL"])
checkpointer = PostgresSaver(conn)
checkpointer.setup() # creates tables on first run
app = graph.compile(checkpointer=checkpointer)
Install: pip install langgraph-checkpoint-postgres psycopg[binary].
With a persistent checkpointer you can do things like inspect past state with app.get_state(config), replay a run from any step with app.update_state, and build time-travel debugging into your tooling. The thread_id in the config groups all state for a single conversation — changing it starts a fresh session.
State schema best practices
- Keep state flat — nested objects are harder to merge and debug.
- Use
Annotated[list, operator.add]for any list that accumulates across steps (messages, tool results). - Add an explicit
error: str | Nonefield to track failures without raising exceptions that kill the graph. - Version your state schema — adding fields is backward-compatible, removing or renaming fields breaks existing checkpoints.
Human-in-the-loop patterns
Human-in-the-loop (HITL) is one of LangGraph's strongest differentiators. You can pause execution before any node, serialize the entire graph state, wait for human input (hours or days), and resume exactly where you left off. This requires a persistent checkpointer.
interrupt_before
The simplest HITL pattern: compile the graph with interrupt_before pointing at a node, and execution will pause just before that node runs:
app = graph.compile(
checkpointer=checkpointer,
interrupt_before=["tools"], # pause before every tool call
)
config = {"configurable": {"thread_id": "approval-flow-1"}}
# First invoke — runs until it hits "tools", then stops
result = app.invoke({"messages": [HumanMessage(content="Delete all records older than 90 days")]}, config=config)
# Inspect what the agent is about to do
state = app.get_state(config)
pending_tool = state.next # ["tools"]
tool_call = state.values["messages"][-1].tool_calls[0]
print(f"Agent wants to call: {tool_call['name']} with args {tool_call['args']}")
# Human approves — resume by calling invoke again with the same config
# LangGraph picks up from the saved checkpoint
app.invoke(None, config=config) # None = no new input, just resume
# Human rejects — update state to inject an error and skip the tool
app.update_state(config, {
"messages": [ToolMessage(content="Action rejected by human reviewer.", tool_call_id=tool_call["id"])]
}, as_node="tools")
app.invoke(None, config=config)
Approval as a separate node
For more control, model the approval itself as a node with its own logic:
def approval_node(state: AgentState) -> dict:
# In a web app, this would write to a DB and return immediately.
# A webhook or polling loop resumes the graph when a human acts.
pending = state["messages"][-1].tool_calls[0]
print(f"[APPROVAL REQUIRED] Tool: {pending['name']}, Args: {pending['args']}")
approved = input("Approve? (y/n): ").strip().lower() == "y"
if not approved:
return {"messages": [ToolMessage(content="Rejected.", tool_call_id=pending["id"])]}
return {} # empty dict = no state change, proceed to tool
graph.add_node("approval", approval_node)
graph.add_edge("agent", "approval")
graph.add_conditional_edges("approval", ...)
In a real web application, the approval node writes a pending approval record to your database, returns immediately, and the API endpoint responds to the human's action by calling app.invoke(None, config=config) to resume. This pattern scales to async/queued workflows with no blocking.
LangGraph vs CrewAI, AutoGen, and raw LangChain
| Framework | Best for | Abstraction level | State model | HITL support |
|---|---|---|---|---|
| LangGraph | Complex, custom agents; production systems | Low — you define the graph | Typed, explicit, checkpointable | First-class |
| CrewAI | Role-based multi-agent teams, fast prototyping | High — roles + tasks | Implicit, per-agent memory | Limited |
| AutoGen | Conversational multi-agent, research workflows | Medium — chat-centric | Conversation history | Via UserProxyAgent |
| Raw LangChain (LCEL) | Linear pipelines, simple RAG | Medium — chain composition | Passed explicitly | Not built-in |
| Raw API calls | Maximum control, no dependencies | None | DIY | DIY |
The honest take: CrewAI is faster to get a demo running if you think in terms of "roles" (researcher, writer, reviewer). But its abstraction becomes a ceiling when you need custom routing, fine-grained state control, or production-grade reliability. LangGraph starts slower but scales to production without rewrites.
AutoGen is excellent for research and agentic coding assistants but its conversational model makes it awkward for structured workflows with side effects. Raw LangChain is the right choice when you don't need cycles — don't reach for LangGraph to build a basic RAG chain.
Production patterns and gotchas
Use LangGraph Platform or LangServe for deployment
LangGraph Platform (the managed hosted version) handles concurrency, scaling, and persistent storage out of the box. Self-hosting? Use langgraph serve or wrap your compiled graph in a FastAPI endpoint. The key thing to get right is thread isolation — every user or session needs its own thread_id.
Set recursion limits
Without a recursion limit, a buggy routing function can loop forever. Set it at compile time:
app = graph.compile(checkpointer=checkpointer)
config = {"configurable": {"thread_id": "xyz"}, "recursion_limit": 25}
25 is a reasonable default for most agents. Bump to 50 for complex multi-step workflows. If you're hitting the limit regularly, your routing logic has a bug.
Handle tool errors gracefully
Wrap tool nodes in try/except and write errors into state instead of raising exceptions. A crashed node kills the graph run; a state update with error: "tool timed out" lets the LLM decide what to do next:
def safe_tool_node(state: AgentState) -> dict:
try:
result = run_tool(state)
return {"messages": [result], "error": None}
except Exception as e:
tool_call_id = state["messages"][-1].tool_calls[0]["id"]
return {
"messages": [ToolMessage(content=f"Error: {str(e)}", tool_call_id=tool_call_id)],
"error": str(e),
}
Add observability from day one
Wire up Langfuse or LangSmith before you do anything else. LangGraph emits traces automatically if you set the environment variables:
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "your-langsmith-key"
# Or for Langfuse:
os.environ["LANGFUSE_SECRET_KEY"] = "your-key"
os.environ["LANGFUSE_PUBLIC_KEY"] = "your-public-key"
You'll get a full trace of every node, every token, and every tool call — impossible to debug production agents without this.
Stream responses to users
For a good UX, stream tokens as they arrive rather than waiting for the full response:
for chunk in app.stream(
{"messages": [HumanMessage(content="Analyze this dataset")]},
config=config,
stream_mode="values",
):
# chunk["messages"][-1] has the latest partial message
print(chunk["messages"][-1].content, end="", flush=True)
Common gotchas
- State type errors. If a node returns a key that isn't in your TypedDict, LangGraph silently ignores it. Define your schema strictly and add validation in development.
- Parallel branch merging. LangGraph supports parallel branches with
add_edgefrom multiple nodes to one. The merger uses your Annotated reducers — make sure they're correct before going parallel. - Checkpointer connection pooling. PostgresSaver needs a connection per thread. In production, use a connection pool (e.g.,
psycopg_pool) rather than a single connection. - Thread ID collision. If two concurrent requests share a thread ID, they'll corrupt each other's state. Use UUID4 for thread IDs in production.
FAQ
What is LangGraph used for?
LangGraph is used to build stateful, multi-step AI agents. It handles cyclic workflows, branching logic, and persistent state — things that LangChain's linear chains cannot express cleanly. Common use cases include ReAct agents with tool use, multi-agent orchestration, self-correcting code generators, and any workflow with human approval gates.
Is LangGraph better than CrewAI?
It depends on the use case. LangGraph gives you lower-level control over graph topology and state, making it better for complex, custom agents and production deployments. CrewAI is higher-abstraction and faster to prototype role-based multi-agent teams (researcher, writer, reviewer), but its ceiling is lower — you'll hit it when you need custom routing or production-grade reliability. Most teams start with CrewAI and migrate to LangGraph when they need more control.
Does LangGraph require LangChain?
LangGraph is built on top of LangChain's ecosystem and shares its message types, but you can use it with any LLM client. The graph primitives — StateGraph, nodes, edges — are independent of LangChain chains. You can call the OpenAI SDK directly inside a node without using any LangChain abstractions.
How does LangGraph persistence work?
LangGraph uses checkpointers to serialize the full graph state after every node execution. The default MemorySaver is in-process only and lost on restart. For production, use SqliteSaver (single server) or PostgresSaver (multi-instance, recommended). Checkpoints are keyed by thread_id, so you can resume any conversation or run from the last successful step — useful for long-running agents and recovery from failures.
What is human-in-the-loop in LangGraph?
Human-in-the-loop means pausing graph execution before a critical node — like a destructive tool call or write operation — so a human can approve or reject the pending action. LangGraph implements this via interrupt_before, which suspends the run at a specific node, serializes the full state to the checkpointer, and waits. A human reviews the pending action via your UI, approves or rejects, and the backend calls app.invoke(None, config) to resume from exactly where it paused. Requires a persistent checkpointer (SqliteSaver or PostgresSaver).