In the original ReAct paper ([2210.03629] ReAct: Synergizing Reasoning and Acting in Language Models), the model “thinks” (reasons) and “acts” (calls tools) in an interleaved manner.
Here are the original (few-shot) ReAct prompts for HotpotQA: https://raw.githubusercontent.com/ysymyth/ReAct/refs/heads/master/prompts/prompts_naive.json
From the above, you can see that the “thoughts” appear before the “actions”.
However, none of the (official) articles on tool calling that I can find make any provision for having the LLM “think” first and then “act”.
For example, from the OpenAI and HF articles on tool calling, I inferred that the model EITHER outputs a content attribute (text) OR a tool_calls attribute (structured JSON output).
But what I want are two consecutive outputs: content (“thought”) → tool_calls (“action”).
Basically, I want to faithfully re-implement the original ReAct paper (interleaving thoughts and actions). The ReAct loop looks like “thought” → “action” → “thought” → “action” → … → until a “stop action” is called.
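Concretely, this is roughly the control flow I’m after. One way to get it out of a plain chat-completions API is to make two model calls per ReAct step and steer each one with the tool_choice parameter: "none" forbids tool calls (so you get a content “thought”), and "required" forbids plain text (so you get a tool_calls “action”). Below is a minimal sketch against the OpenAI Python SDK; the search and finish tools, the model name, and the prompts are my own placeholders, not anything from the paper or the official docs:

# Sketch of a "thought" -> "action" -> "observation" ReAct loop (assumes openai>=1.x).
import json
from openai import OpenAI

client = OpenAI()

def search(query: str) -> str:
    """Placeholder tool; swap in a real Wikipedia lookup for HotpotQA."""
    return f"(stub search result for {query!r})"

tools = [
    {"type": "function", "function": {
        "name": "search",
        "description": "Look up a fact and return a short answer.",
        "parameters": {"type": "object",
                       "properties": {"query": {"type": "string"}},
                       "required": ["query"]}}},
    {"type": "function", "function": {
        "name": "finish",
        "description": "The stop action: return the final answer.",
        "parameters": {"type": "object",
                       "properties": {"answer": {"type": "string"}},
                       "required": ["answer"]}}},
]

messages = [
    {"role": "system", "content": "Solve the task by alternating a short Thought with "
                                  "exactly one tool call. Call `finish` when you are done."},
    {"role": "user", "content": "Where was the author of 'De revolutionibus' born?"},
]

for _ in range(10):                                    # safety cap on ReAct steps
    # "Thought": tool calls are disabled, so the model must reply with text (content).
    thought = client.chat.completions.create(
        model="gpt-4o-mini", messages=messages,
        tools=tools, tool_choice="none",
    ).choices[0].message
    messages.append({"role": "assistant", "content": thought.content})

    # "Action": plain text is disabled, so the model must emit tool_calls.
    action = client.chat.completions.create(
        model="gpt-4o-mini", messages=messages,
        tools=tools, tool_choice="required",
    ).choices[0].message
    messages.append(action)

    # "Observation": execute each requested tool and feed the result back as a 'tool' message.
    finished = False
    for call in action.tool_calls:
        args = json.loads(call.function.arguments)
        if call.function.name == "finish":             # the "stop action"
            result, finished = args["answer"], True
        else:
            result = search(args["query"])
        messages.append({"role": "tool", "tool_call_id": call.id, "content": result})
    if finished:
        print(result)
        break

Here the “stop action” is modeled as an explicit finish tool, which matches the paper’s Finish[answer] action more closely than treating any plain-text reply as the final answer.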
By the way, neither smolagents’ ToolCallingAgent nor LangGraph’s ReAct agent is a faithful implementation of the original ReAct paper. In these libraries, the “agentic” loop looks like the following: “action” → “action” → “action” → … → until a “final answer” is generated. These libraries aren’t making the agent “think” before acting.
This is very interesting.
At first glance this looks as if they are building out a GoT (graph of thought) and allowing tools to be implemented as nodes that are part of the reasoning process. Each node can be interconnected and is part of the reasoning chain. This would allow context (task) → thinking → action, but the thinking and the tool would be directly interconnected in the overall thinking chain.
You could train a model to do this through emergence, but I think that would be far more difficult without very specific training data or a graph embedding that guides the model on what should be reasoned over and how it should reason over it with tool nodes or symbolic tool nodes.
I am very interested in hearing more on this.
I took a look at LangGraph. Quoting from this official LangGraph article:
In our basic ReAct agent there are only two nodes, one for calling the model and one for using tools…
In the above quote, a node “for calling the model” refers to a tool call request by the model, and a node “for using tools” refers to an actual tool call execution. (The result of the tool call execution becomes a new message with the role 'tool'.)
As long as the loop continues, every model invocation is expected to produce tool call requests. In other words, LangGraph isn’t offering a graph of thoughts at all. It’s offering a graph of (i) tool call requests and (ii) tool call executions.
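For reference, here is roughly how that two-node graph gets wired up, following the StateGraph pattern and the prebuilt helpers (ToolNode, tools_condition) from LangGraph’s documentation; the search tool and model name below are my own placeholders, so treat this as a sketch rather than the article’s exact code:

from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.graph import MessagesState, START, StateGraph
from langgraph.prebuilt import ToolNode, tools_condition

@tool
def search(query: str) -> str:
    """Placeholder tool; returns a canned string."""
    return f"(stub result for {query!r})"

tools = [search]
llm_with_tools = ChatOpenAI(model="gpt-4o-mini").bind_tools(tools)

def call_model(state: MessagesState):
    # Node (i): every invocation either requests tool calls or emits the final text.
    return {"messages": [llm_with_tools.invoke(state["messages"])]}

builder = StateGraph(MessagesState)
builder.add_node("agent", call_model)                      # "calling the model"
builder.add_node("tools", ToolNode(tools))                 # "using tools"
builder.add_edge(START, "agent")
builder.add_conditional_edges("agent", tools_condition)    # tool_calls -> "tools", else -> END
builder.add_edge("tools", "agent")
graph = builder.compile()

The conditional edge only inspects whether the last message contains tool_calls; there is no separate “thought” node anywhere in the graph.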
You can confirm this by taking a look at the while loop in this official article: How to create a ReAct agent from scratch (Functional API). It looks like the following:
while True:
if not llm_response.tool_calls:
break
...
In other words, the moment the LLM’s response isn’t a tool call (i.e., it is actual text), the while loop breaks. Generating text is interpreted as generating a “final answer”.
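For comparison, here is that library-style loop written out as self-contained code against the OpenAI Python SDK rather than LangGraph, just to make the control flow explicit (the search tool, model name, and question are my own placeholders):

import json
from openai import OpenAI

client = OpenAI()

def search(query: str) -> str:
    """Placeholder tool; returns a canned string."""
    return f"(stub search result for {query!r})"

tools = [{"type": "function", "function": {
    "name": "search",
    "description": "Look up a fact and return a short answer.",
    "parameters": {"type": "object",
                   "properties": {"query": {"type": "string"}},
                   "required": ["query"]}}}]

messages = [{"role": "user",
             "content": "Where was the author of 'De revolutionibus' born?"}]

while True:
    llm_response = client.chat.completions.create(
        model="gpt-4o-mini", messages=messages,
        tools=tools, tool_choice="auto",          # the model decides: text OR tool_calls
    ).choices[0].message
    messages.append(llm_response)

    if not llm_response.tool_calls:               # plain text => taken as the final answer
        print(llm_response.content)
        break

    for call in llm_response.tool_calls:          # otherwise, execute every requested tool call
        args = json.loads(call.function.arguments)
        messages.append({"role": "tool", "tool_call_id": call.id,
                         "content": search(args["query"])})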
On the other hand, in the original ReAct paper, the loop looks like the following: “thought” → “action” → “thought” → “action” → … → until a “stop action” is called.