September 7, 2025 · agents · 4 minutes
A practical explanation of the ReAct framework—how LLMs combine reasoning, tool use, and observations to act as agents.
Large Language Models (LLMs) can be seen as text-in, text-out systems: they take textual input and produce textual output.
Recently, the idea of the AI agent has gained huge traction. In simple terms, an agent is more than just a model—it’s a system where an LLM dynamically directs its own reasoning, decides when to use external tools, and manages how tasks get accomplished. (The same intuition also extends to multimodal models, but we’ll focus here on text.)
For a concise introduction to agents versus workflows, I recommend Anthropic’s article: Building effective agents.
One general definition of an agent is:
Agents are systems where LLMs dynamically manage their own reasoning and tool use, maintaining control over how they accomplish tasks.
ReAct is one of the most influential frameworks in the development of modern AI agents, introduced by Yao et al. (2023): ReAct: Synergizing Reasoning and Acting in Language Models (ICLR).
At its core, ReAct combines two processes:
- Reasoning: the model generates natural-language thought traces that plan what to do next.
- Acting: the model emits actions (such as tool calls) whose results come back as observations.
In the original paper, ReAct is compared to other prompting strategies (Figure 1 of the paper): (a) standard prompting, (b) chain-of-thought (reasoning only), (c) acting only, and (d) ReAct, which interleaves reasoning and acting.
Notice that in panel (d) the cycle alternates between Thought → Action → Observation. But how does this look in practice, given that an LLM is still just a text generator? Let’s break it down.
Modern LLMs can generate explicit “thinking” traces (sometimes hidden, sometimes shown). With a proper prompt, you can ask the model to produce reasoning steps before its final output. For example:
Thought: ... reasoning ... + Answer: ...

This part is straightforward: it’s just instructing the model to verbalize its reasoning before giving the final answer.
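As a concrete sketch, a system prompt along these lines is enough to elicit the pattern, together with a small helper to split the completion. The prompt wording and the helper name are my own illustration, not taken from the paper or any particular library:

```python
# A minimal sketch: a system prompt that asks the model to verbalize its
# reasoning before answering. The wording is illustrative, not canonical.
SYSTEM_PROMPT = (
    "You are a helpful assistant.\n"
    "First write your reasoning on a line starting with 'Thought:'.\n"
    "Then give your final response on a line starting with 'Answer:'."
)

def split_thought_and_answer(output: str) -> tuple[str, str]:
    """Split a completion of the form 'Thought: ...' followed by 'Answer: ...'."""
    thought, _, answer = output.partition("Answer:")
    return thought.removeprefix("Thought:").strip(), answer.strip()
```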
Here’s where agents become powerful. You can equip your system with external tools (e.g., a web search API, a calculator, or a database lookup).
To enable tool use, you extend the model’s prompt with:
- a description of the available tools and when to call them, and
- the expected call format (e.g., a JSON payload like {"query": "textual_query"} wrapped in special tokens like <tool_use> … </tool_use>).

Example model output:
Thought: I need to look this up.
Action: <tool_use>{"query": "LLM ReAct framework"}</tool_use>
Your code can then parse the JSON between <tool_use> tags and call the actual tool. If the model decides not to use a tool, it simply produces a direct answer.
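A thin parsing layer is enough for this. The sketch below assumes a single hypothetical "search" tool; the TOOLS registry, regex, and function names are illustrative, not part of the original post:

```python
import json
import re

# Hypothetical tool registry: tool name -> callable. The "search" entry stands
# in for a real web-search API call.
TOOLS = {"search": lambda query: f"(search results for {query!r} would go here)"}

TOOL_USE_RE = re.compile(r"<tool_use>(.*?)</tool_use>", re.DOTALL)

def maybe_run_tool(model_output: str) -> str | None:
    """Parse an Action, if any, and return the tool result as text.

    Returns None when the model answered directly without requesting a tool.
    """
    match = TOOL_USE_RE.search(model_output)
    if match is None:
        return None
    payload = json.loads(match.group(1))
    return TOOLS["search"](payload["query"])
```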
After a tool call, the system receives results (the observation). These are stored as text and appended back into the model’s context.
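In code, this can be as simple as serializing the result to text and concatenating it onto the running transcript; the small helper below (my own naming) illustrates the idea:

```python
def append_observation(transcript: str, observation: str) -> str:
    """Append a tool result to the context as plain text, ReAct-style."""
    return transcript + f"Observation: {observation}\n"
```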
A ReAct agent runs inside a loop:
1. Initial input: system prompt (with tool instructions) + user query
2. Model output: Thought + Action (→ tool call), or Thought + Answer (→ stop, task complete)
3. If Action: run the tool and capture the resulting Observation
4. Next iteration input: prompt + query + thought1 + action1 + observation1
5. Repeat until the model produces a final Answer.
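Putting the pieces together, here is a minimal sketch of the loop just described. It reuses the illustrative SYSTEM_PROMPT, maybe_run_tool, and append_observation helpers from the earlier sketches, and call_llm stands in for whatever text-in, text-out model call you use:

```python
def react_agent(user_query: str, call_llm, max_steps: int = 5) -> str:
    """Run the ReAct loop: Thought -> Action -> Observation until an Answer."""
    # In practice the system prompt also describes the available tools and the
    # <tool_use> call format shown earlier.
    transcript = SYSTEM_PROMPT + "\n\nQuestion: " + user_query + "\n"
    for _ in range(max_steps):
        output = call_llm(transcript)        # Thought + (Action or Answer)
        transcript += output + "\n"
        if "Answer:" in output:              # final answer -> task complete
            return output.split("Answer:", 1)[1].strip()
        observation = maybe_run_tool(output)
        if observation is not None:          # tool was called -> feed result back
            transcript = append_observation(transcript, observation)
    return "No final answer produced within the step budget."
```

Before wiring in a real model, you can test the control flow with a stub call_llm that returns canned Thought/Action/Answer strings.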
The vanilla ReAct loop works, but it has limitations: for example, the context keeps growing as every thought, action, and observation is appended.
✦ That’s the essence of how a ReAct agent works in practice.
Future posts will dive into improvements like memory compression, evaluation strategies, and deployment tips.