ReAct — Reason + Act
Early agent systems struggled to combine two essential capabilities: reasoning about a problem and acting in the external world.
Chain-of-Thought prompting enabled strong reasoning but kept the model isolated from real data and tools.
Pure tool-calling systems could act but often lacked structured thinking, leading to hallucinations and poor decision-making.
The ReAct framework, introduced in the 2022 paper “ReAct: Synergizing Reasoning and Acting in Language Models” (Yao et al.), solved this by tightly interleaving the two.
The ReAct Pattern
ReAct structures agent behavior as a repeating cycle:
Thought → Action → Observation → Thought → ...

- Thought: The agent reasons about the current situation and decides what to do next.
- Action: The agent calls a tool (search, calculator, database, etc.).
- Observation: The agent receives the tool result and incorporates it into its reasoning.
This loop continues until the agent has enough information to produce a Final Answer.
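The cycle can be modeled as a sequence of typed steps accumulated into a trace. The sketch below is illustrative (the `ReActStep` record and `format_trace` helper are not from any particular framework):

```python
from dataclasses import dataclass

@dataclass
class ReActStep:
    thought: str       # the agent's reasoning for this step
    action: str        # name of the tool invoked, e.g. "web_search"
    action_input: str  # arguments passed to the tool
    observation: str   # result returned by the tool

def format_trace(steps: list[ReActStep]) -> str:
    """Render accumulated steps in the canonical ReAct trace format."""
    lines = []
    for step in steps:
        lines.append(f"Thought: {step.thought}")
        lines.append(f"Action: {step.action}({step.action_input})")
        lines.append(f"Observation: {step.observation}")
    return "\n".join(lines)

trace = format_trace([ReActStep(
    thought="The capital of Germany is Berlin. I need its population.",
    action="web_search",
    action_input='"Berlin population"',
    observation="Berlin has about 3.68 million residents.",
)])
```

The rendered trace is exactly what gets fed back into the model's prompt on the next iteration, so the model always sees its own prior reasoning and the grounded results.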
ReAct Example
Task: “What is the population of the capital of Germany?”
Thought: The capital of Germany is Berlin. I need its current population.
Action: web_search("Berlin population 2026")
Observation: Berlin has an estimated population of 3.68 million as of 2026.
Thought: This answers the question directly.
Final Answer: The population of Berlin, the capital of Germany, is approximately 3.68 million.

Notice how reasoning guides the action, and the observation grounds the next thought. This tight integration reduces hallucinations and makes the agent’s process highly interpretable.
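Because the trace format is so regular, each model response can be parsed mechanically. A minimal sketch using regular expressions (the exact patterns are an assumption; production agents often request structured output instead):

```python
import re

def parse_step(response: str):
    """Split a model response into its ReAct components.

    Returns ("final", answer, None) when the model emits a Final Answer,
    otherwise ("act", thought, (tool_name, raw_args)).
    """
    final = re.search(r"Final Answer:\s*(.+)", response, re.DOTALL)
    if final:
        return ("final", final.group(1).strip(), None)
    thought = re.search(r"Thought:\s*(.+)", response)
    action = re.search(r"Action:\s*(\w+)\((.*)\)", response)
    return ("act",
            thought.group(1).strip() if thought else "",
            (action.group(1), action.group(2)) if action else None)

kind, thought, act = parse_step(
    'Thought: The capital of Germany is Berlin. I need its current population.\n'
    'Action: web_search("Berlin population 2026")'
)
```

Checking for `Final Answer:` first gives the model an unambiguous way to terminate the loop.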
Why ReAct Works So Well
ReAct delivers three major advantages over earlier approaches:
- Grounded Reasoning: Every claim is backed by fresh tool results rather than potentially outdated or fabricated knowledge.
- Adaptive Problem Solving: The agent can adjust its strategy dynamically based on what it learns (e.g., “The first search returned incomplete data → try a different query or source”).
- Improved Interpretability & Debugging: The explicit Thought → Action → Observation trace reads like human problem-solving, making it easier to understand, diagnose, and improve agent behavior.
ReAct vs Chain-of-Thought
| Method | Focus | Strength | Limitation |
|---|---|---|---|
| Chain-of-Thought | Pure reasoning | Strong step-by-step logic | No access to external data |
| ReAct | Reasoning + Acting | Grounded, adaptive, interactive | Can still follow suboptimal paths |
Chain-of-Thought example (isolated):

Thought: Germany’s capital is Berlin. Berlin has roughly 3.6 million people.
Final Answer: 3.6 million.

ReAct (grounded):

Thought: I should verify the latest population.
Action: web_search(...)
Observation: 3.68 million (2026 estimate).
Final Answer: ...

ReAct keeps reasoning honest by forcing interaction with the real world.
Implementing a Basic ReAct Loop
Here’s a minimal ReAct-style agent skeleton:
```python
def react_agent(goal: str, llm, tools, max_steps: int = 10):
    scratchpad = ""
    for step in range(max_steps):
        prompt = f"""Goal: {goal}

Previous steps:
{scratchpad}

Respond in the format:
Thought: <reasoning>
Action: <tool_name> with args: <arguments>
Or:
Final Answer: <answer>"""
        response = llm.generate(prompt)
        # Terminate as soon as the model produces a final answer.
        if "Final Answer" in response:
            return extract_final_answer(response)
        # Otherwise parse the next Thought/Action pair and execute the tool.
        thought, action = parse_thought_and_action(response)
        observation = tools.execute(action.name, action.args)
        # Append the full step so the next prompt sees the whole trace.
        scratchpad += f"Thought: {thought}\nAction: {action}\nObservation: {observation}\n\n"
    return "Max steps reached."
```

The same loop in Rust:

```rust
fn react_agent(goal: &str, llm: &LLMClient, tools: &ToolManager, max_steps: u32) -> String {
    let mut scratchpad = String::new();
    for _ in 0..max_steps {
        let prompt = format!(
            "Goal: {}\n\nPrevious steps:\n{}\n\nThought + Action or Final Answer:",
            goal, scratchpad
        );
        let response = llm.generate(&prompt);
        // Terminate as soon as the model produces a final answer.
        if let Some(answer) = extract_final_answer(&response) {
            return answer;
        }
        // Otherwise parse the next Thought/Action pair and execute the tool.
        let (thought, action) = parse_response(&response);
        let observation = tools.execute(&action);
        scratchpad.push_str(&format!(
            "Thought: {}\nAction: {}\nObservation: {}\n\n",
            thought, action, observation
        ));
    }
    "Maximum steps reached.".to_string()
}
```

In practice, production implementations add parsing robustness, structured outputs (JSON mode), memory management, and termination logic.
ReAct in Modern Agent Frameworks
ReAct (or close variants) remains a foundational pattern in 2026. You’ll find it in:
- LangChain / LangGraph — Core ReAct agents and graph-based extensions
- CrewAI — Role-based agents with ReAct-style reasoning
- AutoGen and many others
Many frameworks enhance the original pattern with better memory, multi-step planning, or parallel tool use, but the Thought → Action → Observation backbone is still widely used.
Strengths and Limitations
Strengths:
- Simple yet effective for a wide range of tasks
- Excellent grounding and reduced hallucination
- Human-readable traces for debugging
Limitations:
- Single-path reasoning (no exploration of multiple branches)
- Can get stuck in local optima or inefficient loops
- Less suited to tasks requiring deep lookahead or complex orchestration
For more challenging problems, ReAct is often combined with higher-level planning techniques (e.g., Plan-and-Execute or Tree-of-Thoughts).
Looking Ahead
ReAct was a pivotal breakthrough because it showed that reasoning and acting are far more powerful together than apart.
It laid the groundwork for today’s sophisticated agent systems.
→ Continue to 3.3 — Chain-of-Thought Planning: Techniques that enhance reasoning depth before or alongside action.