{"id":2863485,"date":"2023-09-04T10:00:03","date_gmt":"2023-09-04T14:00:03","guid":{"rendered":"https:\/\/wordpress-1016567-4521551.cloudwaysapps.com\/plato-data\/react-reasoning-and-acting-augments-llms-with-tools-kdnuggets\/"},"modified":"2023-09-04T10:00:03","modified_gmt":"2023-09-04T14:00:03","slug":"react-reasoning-and-acting-augments-llms-with-tools-kdnuggets","status":"publish","type":"station","link":"https:\/\/platodata.io\/plato-data\/react-reasoning-and-acting-augments-llms-with-tools-kdnuggets\/","title":{"rendered":"ReAct, Reasoning and Acting augments LLMs with Tools! – KDnuggets"},"content":{"rendered":"

Short for Reasoning and Acting, this paper introduces a new concept that improves the performance of LLMs and also provides us with more explainability and interpretability.

Achieving AGI could be one of the most important goals for human civilization. Imagine creating an artificial intelligence that could generalize to many problems. There are many interpretations of what AGI is, and of when we can say we have achieved it.

The most promising path toward AGI in recent decades has been reinforcement learning, in particular what DeepMind achieved on hard tasks with AlphaGo, AlphaStar, and so many other breakthroughs…

However, ReAct outperforms imitation and reinforcement learning methods by an absolute success rate of 34% and 10% respectively, while being prompted with only one or two in-context examples.

With this kind of result (of course, provided there is no data leakage and we can trust the evaluation methods provided in the paper), we can no longer ignore LLMs' potential to reason and divide complex tasks into logical steps.

This paper starts from the observation that LLMs have so far been impressive at language understanding: they have been used to generate chain-of-thought (CoT) reasoning to solve problems, and they have also been used for acting and plan generation.

Although these two capabilities have been studied separately, the paper aims to combine reasoning and acting in an interleaved manner to enhance LLMs' performance.

The intuition behind this idea comes from thinking about how you, as a human, behave in order to execute a task.

The first step is that you use "inner speech", or you write things down or otherwise communicate with yourself, saying: "How do I execute task X? To do task X I need to first do step 1, then step 2, and so on."

More concretely, if you were to cook up a dish in the kitchen, you could ReAct something like this:

You can reason in language to track your progress ("Now that everything is cut, I should heat up the pot of water"), to handle exceptions or adjust the plan according to the situation ("I don't have salt, so let me use soy sauce and pepper instead"), and to realize when external information is needed ("How do I prepare dough? Let me search on the Internet").

You can also act (open a cookbook to read the recipe, open the fridge, check ingredients) to support the reasoning and answer questions ("What dish can I make right now?").

This combination of reasoning and acting is what lets humans learn and accomplish tasks even under previously unseen circumstances or when faced with information uncertainty.

Previous work has demonstrated the reasoning capabilities of LLMs; for example, chain-of-thought prompting showed that a model can come up with plans to answer questions in arithmetic, commonsense, and symbolic reasoning.
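To make this concrete, here is a minimal sketch of chain-of-thought prompting; the worked example is the well-known tennis-ball question from the CoT literature, and the `ask_llm` callable is an assumed stand-in for whatever completion API you use:

```python
# Minimal chain-of-thought prompt: one worked example whose answer is written
# out step by step, followed by the new question. `ask_llm` is a hypothetical
# function that sends a prompt to an LLM and returns its completion.

COT_PROMPT = """Q: Roger has 5 tennis balls. He buys 2 more cans of 3 tennis balls each.
How many tennis balls does he have now?
A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 balls.
5 + 6 = 11. The answer is 11.

Q: {question}
A:"""

def chain_of_thought(question: str, ask_llm) -> str:
    """Prompt the model with a worked, step-by-step example so it reasons aloud."""
    return ask_llm(COT_PROMPT.format(question=question))
```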

However, the model here is still a "static black box": it uses its internal language representation to answer these questions, and this representation may not always be accurate or up to date, which leads to fact hallucination (making up facts from its own imagination) or error propagation (one error in the chain of thought propagates to a wrong answer).

Without the ability to take some sort of action and update its knowledge, the model is limited.

There have also been studies that employed LLMs to perform actions based on language. These studies usually take in multimodal inputs (audio, text, and images), convert them to text, use the model to generate in-domain actions, and then use a controller to execute those actions.
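As a purely illustrative sketch of that kind of pipeline (every name below is a placeholder I am assuming, not an interface from any of those studies):

```python
# Illustrative "LLM picks an action, a controller executes it" pipeline.
# `ask_llm` and `controller` are hypothetical stand-ins; the observation is
# assumed to already be converted to text (e.g., by captioning/transcription).

def act_on_observation(observation_text: str, allowed_actions: list[str],
                       ask_llm, controller):
    """Let the LLM choose one in-domain action, then hand it to the controller."""
    action = ask_llm(
        f"Observation: {observation_text}\n"
        f"Choose exactly one action from {allowed_actions} and reply with it only."
    ).strip()
    if action not in allowed_actions:
        raise ValueError(f"Out-of-domain action proposed: {action!r}")
    return controller.execute(action)
```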

Without the ability to plan some steps and reason about what to do, the model will simply output the wrong actions.

The proposal of this paper is to combine both methods mentioned above. ReAct prompts LLMs to generate both verbal reasoning traces and actions pertaining to a task in an interleaved manner, which allows the model to perform dynamic reasoning to create, maintain, and adjust high-level plans for acting (reason to act), while also interacting with external environments (e.g., Wikipedia) to incorporate additional information into reasoning (act to reason).

This is shown in the figure below:


\"ReAct,
Difference between Reason, Act and ReAct (Photo taken from the paper)<\/span> <\/p>\n
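A minimal sketch of what this interleaving can look like in code; the prompt wording, the `ask_llm` helper, and the `wikipedia_search` tool are assumptions for illustration, not the paper's exact implementation:

```python
# Sketch of a ReAct loop: the model emits Thought/Action text, the program
# executes the action against an external tool (here a Wikipedia-style search
# passed in by the caller) and appends the Observation back into the prompt.

def react(question: str, ask_llm, wikipedia_search, max_steps: int = 6) -> str:
    prompt = (
        "Answer the question by alternating Thought, Action and Observation lines.\n"
        "Allowed actions: search[query], finish[answer].\n"
        f"Question: {question}\n"
    )
    for _ in range(max_steps):
        step = ask_llm(prompt)                      # expected: "Thought: ... Action: ..."
        prompt += step + "\n"
        if "finish[" in step:                       # reason to act: the model concludes
            return step.split("finish[", 1)[1].split("]", 1)[0]
        if "search[" in step:                       # act to reason: fetch external facts
            query = step.split("search[", 1)[1].split("]", 1)[0]
            prompt += f"Observation: {wikipedia_search(query)}\n"
    return "No answer found within the step budget."
```

The point of the loop is exactly the interleaving: each Observation from the environment is fed back into the prompt before the next Thought is generated.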

So, to make the reasoning prompting more effective, the authors design an action space: three actions that the model is allowed to use when answering questions.

This is done through a Wikipedia API that provides the following: