Large Language Models (LLMs) have taken center stage as technology advances by leaps and bounds. These LLMs are highly sophisticated computer programs that can understand, generate, and interact with human language in a remarkably natural way. Recent research has unveiled an innovative embodied conversational agent known as FurChat. LLMs like GPT-3.5 have pushed the boundaries of what’s possible in natural language processing. They can understand context, answer questions, and even generate text that reads as if it were written by a human. This capability has opened up opportunities in many domains, including robotics.
Researchers at Heriot-Watt University and Alana AI propose FurChat, a system that can function as a receptionist, engage in dynamic conversations, and convey emotions through facial expressions. FurChat’s deployment at the National Robotarium exemplifies its potential, facilitating natural conversations with visitors and offering information on facilities, news, research, and upcoming events.
The Furhat robot, a humanoid robotic bust, has a three-dimensional mask that closely resembles a human face and uses a micro projector to project animated facial features onto this mask. The robot is mounted on a motorized platform that allows its head to move and nod, enhancing its lifelike interactions. To facilitate communication, Furhat is equipped with a microphone array and speakers, enabling it to recognize and respond to human speech.
The system is designed for seamless operation. Dialogue management involves three main components: natural language understanding (NLU), a dialogue manager (DM), and a custom database. NLU analyzes incoming text, classifies intents, and assesses confidence. The DM maintains conversational flow, sends prompts to the LLM, and processes responses. The custom database is created by web-scraping the National Robotarium’s website and provides data relevant to user intents. Prompt engineering ensures natural responses from the LLM: it combines few-shot learning and prompt-learning techniques to generate context-aware replies. Gesture parsing leverages the Furhat SDK’s facial gestures and the LLM’s sentiment recognition from text to synchronize facial expressions with speech, creating an immersive interaction. Amazon Polly, which is available in FurhatOS, is used for text-to-speech conversion.
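To make the flow described above more concrete, here is a minimal Python sketch of how the pieces could fit together: intent classification with a confidence score, a database lookup, a few-shot prompt, and gesture parsing. It is an illustration under stated assumptions, not the researchers’ implementation. The keyword-based classifier, the placeholder database entries, the bracketed gesture tag format, and every function name here are hypothetical, and the LLM is passed in as a plain callable rather than a real API client.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical stand-in for the web-scraped database (placeholder text only).
DATABASE = {
    "ask_facilities": "PLACEHOLDER: description of facilities.",
    "ask_events": "PLACEHOLDER: list of upcoming events.",
}

# Hypothetical keyword sets standing in for a trained intent classifier.
INTENT_KEYWORDS = {
    "ask_facilities": {"facilities", "lab", "labs", "cafe"},
    "ask_events": {"event", "events", "happening"},
}

# Assumed few-shot examples; each reply starts with a gesture tag.
FEW_SHOT = [
    ("Hi there!", "[smile] Hello, welcome! How can I help?"),
    ("I lost my badge.", "[concern] I'm sorry to hear that."),
]

GESTURE_TAG = re.compile(r"^\s*\[(\w+)\]\s*")


@dataclass
class NLUResult:
    intent: str
    confidence: float


def classify_intent(text: str) -> NLUResult:
    """Toy NLU: score each intent by the fraction of its keywords present."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    best_intent, best_score = "fallback", 0.0
    for intent, keywords in INTENT_KEYWORDS.items():
        score = len(words & keywords) / len(keywords)
        if score > best_score:
            best_intent, best_score = intent, score
    return NLUResult(best_intent, best_score)


def build_prompt(user_text: str, facts: str,
                 history: List[Tuple[str, str]]) -> str:
    """Assemble instructions, retrieved facts, few-shot examples, and history."""
    lines = ["You are a robot receptionist. "
             "Start each reply with a gesture tag in square brackets."]
    if facts:
        lines.append(f"Relevant facts: {facts}")
    for question, answer in FEW_SHOT + history:
        lines += [f"Visitor: {question}", f"Robot: {answer}"]
    lines += [f"Visitor: {user_text}", "Robot:"]
    return "\n".join(lines)


def parse_gesture(reply: str) -> Tuple[str, str]:
    """Split a leading [gesture] tag from the text that should be spoken."""
    match = GESTURE_TAG.match(reply)
    if not match:
        return "neutral", reply.strip()
    return match.group(1).lower(), reply[match.end():].strip()


def respond(user_text: str, llm: Callable[[str], str],
            history: List[Tuple[str, str]],
            min_confidence: float = 0.2) -> Tuple[str, str]:
    """One dialogue turn: NLU, database lookup, prompt, LLM, gesture parsing."""
    nlu = classify_intent(user_text)
    # Only trust the database lookup when the classifier is confident enough.
    facts = DATABASE.get(nlu.intent, "") if nlu.confidence >= min_confidence else ""
    raw_reply = llm(build_prompt(user_text, facts, history))
    gesture, speech = parse_gesture(raw_reply)
    history.append((user_text, raw_reply))
    return gesture, speech
```

In a real deployment, the returned gesture name would be mapped to a facial gesture on the robot and the speech text would be sent to the text-to-speech engine. Those steps are left out here because they depend on SDK calls that this sketch does not attempt to reproduce.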
Looking ahead, the researchers are gearing up to expand FurChat’s capabilities. They have their sights set on enabling multiuser interactions, an area of active research in the field of receptionist robots. Moreover, to tackle the challenge posed by hallucinations in language models, they plan to explore strategies such as fine-tuning the language model and experimenting with direct conversation generation, reducing reliance on NLU components. A major milestone for the researchers is the demonstration of FurChat at the SIGDIAL conference. It will serve as a platform to show the system’s capabilities to a broader audience of peers and experts.
Check out the Paper. All credit for this research goes to the researchers on this project.
Astha Kumari
Astha Kumari is a consulting intern at MarktechPost. She is currently pursuing a dual degree in the Department of Chemical Engineering at the Indian Institute of Technology (IIT), Kharagpur. She is a machine learning and artificial intelligence enthusiast and is keen on exploring their real-life applications in various fields.