Artificial Intelligence is rapidly gaining popularity, and for good reason. With the introduction of Large Language Models like GPT, BERT, and LLaMA, almost every industry, including healthcare, finance, e-commerce, and media, is using these models for tasks such as Natural Language Understanding (NLU), Natural Language Generation (NLG), question answering, programming, and information retrieval. The well-known ChatGPT, which has been in the headlines ever since its release, is built on the transformer-based GPT-3.5 and GPT-4 models.
These human-imitating AI systems depend heavily on the development of agents capable of exhibiting human-like problem-solving abilities. There are three primary approaches for developing agents that can address complex interactive reasoning tasks: Deep Reinforcement Learning (RL), which trains agents through trial and error; Behavior Cloning (BC) through Sequence-to-Sequence (seq2seq) learning, which trains agents by imitating the behavior of expert agents; and prompting LLMs, in which generative agents built on prompted LLMs produce reasonable plans and actions for complex tasks.
RL-based and seq2seq-based BC approaches have some limitations, such as difficulty with task decomposition, an inability to maintain long-term memory, poor generalization to unknown tasks, and weak exception handling. Prior prompting-based approaches, in turn, are computationally expensive because they repeat LLM inference at every time step.
Recently, a framework called SWIFTSAGE has been proposed to address these challenges and enable agents to mimic how humans solve complex, open-world tasks. SWIFTSAGE aims to integrate the strengths of behavior cloning and prompting LLMs to improve task completion performance in complex interactive tasks. The framework draws inspiration from dual-process theory, which suggests that human cognition involves two distinct systems: System 1 and System 2. System 1 involves rapid, intuitive, and automatic thinking, while System 2 entails methodical, analytical, and deliberate thought processes.
The SWIFTSAGE framework consists of two modules: the SWIFT module and the SAGE module. Like System 1, the SWIFT module represents quick and intuitive thinking. It is implemented as a compact encoder-decoder language model that has been fine-tuned on the action trajectories of an oracle agent. The SWIFT module encodes short-term memory components such as previous actions, observations, visited locations, and the current environment state, and then decodes the next individual action, thus aiming to simulate the rapid and instinctive decision-making process shown by humans.
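To make the idea of encoding short-term memory more concrete, here is a minimal sketch of how those components might be flattened into a single input string for a seq2seq model. The `ShortTermMemory` class, the `build_swift_input` function, the field labels, the `[SEP]` separator, and the `window` parameter are all hypothetical names chosen for illustration; the actual input format used by SWIFTSAGE may look quite different.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ShortTermMemory:
    """Hypothetical container for the short-term context described above."""
    task: str
    current_state: str
    previous_actions: List[str] = field(default_factory=list)
    observations: List[str] = field(default_factory=list)
    visited_locations: List[str] = field(default_factory=list)


def last_n(items: List[str], n: int) -> List[str]:
    """Return the last n items (an empty list when n is 0)."""
    return items[max(0, len(items) - n):]


def build_swift_input(memory: ShortTermMemory, window: int = 3) -> str:
    """Serialize the memory into one string an encoder could consume.

    Only the most recent `window` actions and observations are kept,
    which is an assumption made here to keep the input short.
    """
    parts = [
        f"task: {memory.task}",
        f"state: {memory.current_state}",
        "actions: " + " | ".join(last_n(memory.previous_actions, window)),
        "observations: " + " | ".join(last_n(memory.observations, window)),
        "visited: " + ", ".join(memory.visited_locations),
    ]
    return " [SEP] ".join(parts)


memory = ShortTermMemory(
    task="boil water",
    current_state="in kitchen, holding pot",
    previous_actions=["go to kitchen", "open cupboard", "pick up pot"],
    observations=["You are in the kitchen.", "The cupboard is open.", "You pick up the pot."],
    visited_locations=["hallway", "kitchen"],
)
print(build_swift_input(memory, window=2))
```

A fine-tuned encoder-decoder model would take a string like this as input and decode the next action as output; the serialization step itself is the only part sketched here.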
The SAGE module, on the other hand, imitates the thought processes of System 2 and uses LLMs such as GPT-4 for subgoal planning and grounding. In the planning stage, LLMs are prompted to locate necessary items, plan and track subgoals, and detect and correct potential mistakes. In the grounding stage, LLMs are used to transform the subgoals produced by the planning stage into a sequence of executable actions.
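The two-stage structure can be sketched as two separate LLM calls, one for planning and one for grounding. Everything below is an illustrative assumption: the prompt wording, the `plan_subgoals` and `ground_subgoals` functions, and the `fake_llm` stand-in (used so the example runs without any API access) are not taken from the SWIFTSAGE code.

```python
from typing import Callable, List

# Hypothetical LLM interface: takes a prompt string, returns generated text.
LLM = Callable[[str], str]


def parse_lines(text: str) -> List[str]:
    """Split LLM output into non-empty items, dropping list bullets."""
    cleaned = (line.strip("-* \t") for line in text.splitlines())
    return [line for line in cleaned if line]


def plan_subgoals(llm: LLM, task: str, history: List[str]) -> List[str]:
    """Planning stage: ask the LLM for subgoals given the task and history."""
    prompt = (
        "PLAN\n"
        f"Task: {task}\n"
        f"Recent history: {'; '.join(history)}\n"
        "List the items to locate and the remaining subgoals, one per line."
    )
    return parse_lines(llm(prompt))


def ground_subgoals(llm: LLM, subgoals: List[str], action_types: List[str]) -> List[str]:
    """Grounding stage: ask the LLM to turn subgoals into executable actions."""
    prompt = (
        "GROUND\n"
        f"Subgoals: {'; '.join(subgoals)}\n"
        f"Allowed action types: {', '.join(action_types)}\n"
        "Write one executable action per line."
    )
    return parse_lines(llm(prompt))


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model so the sketch runs offline."""
    if prompt.startswith("PLAN"):
        return "- find the thermometer\n- measure the water temperature"
    return "pick up thermometer\nuse thermometer on water"


subgoals = plan_subgoals(fake_llm, "measure water temperature", ["look around"])
actions = ground_subgoals(fake_llm, subgoals, ["pick up", "use"])
print(subgoals)
print(actions)
```

Separating the stages this way keeps the open-ended reasoning in the first call and constrains the second call to the environment's action vocabulary.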
The SWIFT and SAGE modules are integrated through a heuristic algorithm that determines when to activate or deactivate the SAGE module and how to combine the outputs of both modules using an action buffer mechanism. Unlike previous methods that generate only the immediate next action, SWIFTSAGE engages in longer-term action planning.
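One way to picture the interplay between the two modules and the action buffer is the small controller below. It is only a sketch under stated assumptions: the article does not specify the switching heuristic, so the rule used here (hand control to the slow planner after `patience` consecutive steps without reward) is an illustrative stand-in, and `SwiftSageController`, `swift_policy`, and `sage_planner` are hypothetical names.

```python
from collections import deque
from typing import Callable, Deque, List


class SwiftSageController:
    """Toy controller that mixes a fast policy with a slow planner."""

    def __init__(
        self,
        swift_policy: Callable[[str], str],        # fast: state -> one action
        sage_planner: Callable[[str], List[str]],  # slow: state -> action sequence
        patience: int = 3,                         # assumed switching threshold
    ) -> None:
        self.swift_policy = swift_policy
        self.sage_planner = sage_planner
        self.patience = patience
        self.buffer: Deque[str] = deque()  # planned actions awaiting execution
        self.stalled_steps = 0             # consecutive steps with no reward

    def record_reward(self, reward: float) -> None:
        """Track progress; any positive reward resets the stall counter."""
        self.stalled_steps = 0 if reward > 0 else self.stalled_steps + 1

    def next_action(self, state: str) -> str:
        # 1. Planned actions in the buffer take priority over everything else.
        if self.buffer:
            return self.buffer.popleft()
        # 2. If the agent looks stuck, activate the slow planner and
        #    queue its multi-step plan in the buffer.
        if self.stalled_steps >= self.patience:
            self.buffer.extend(self.sage_planner(state))
            self.stalled_steps = 0
            if self.buffer:
                return self.buffer.popleft()
        # 3. Otherwise fall back to the fast policy for a single action.
        return self.swift_policy(state)


controller = SwiftSageController(
    swift_policy=lambda state: "look around",
    sage_planner=lambda state: ["open door", "go to kitchen"],
    patience=2,
)
trace = []
for reward in [0.0, 0.0, 0.0, 0.0]:
    trace.append(controller.next_action("hallway"))
    controller.record_reward(reward)
print(trace)
```

In this toy run the fast policy acts twice without earning reward, after which the planner's two-step plan is buffered and executed in order. The buffer is what lets a single planner call cover several future steps instead of requiring a call at every time step.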
To evaluate the performance of SWIFTSAGE, experiments were conducted on 30 tasks from the ScienceWorld benchmark. The results show that SWIFTSAGE significantly outperforms other existing methods, such as SayCan, ReAct, and Reflexion. It achieves higher scores and demonstrates superior effectiveness in solving complex real-world tasks.
In conclusion, SWIFTSAGE is a promising framework that combines the strengths of behavior cloning and prompting LLMs. It could therefore be very useful for enhancing action planning and improving performance in complex reasoning tasks.
Tanya Malhotra is a final-year undergraduate at the University of Petroleum & Energy Studies, Dehradun, pursuing a BTech in Computer Science Engineering with a specialization in Artificial Intelligence and Machine Learning.
She is a Data Science enthusiast with strong analytical and critical thinking skills, along with an ardent interest in acquiring new skills, leading groups, and managing work in an organized manner.