AI chatbots like ChatGPT, Bing, and Bard are excellent at crafting sentences that sound like human writing. But they often present falsehoods as facts and reason inconsistently, and that can be hard to spot.
One way around this problem, a new study suggests, is to change the way the AI presents information. Getting users to engage more actively with the chatbot’s statements might help them think more critically about that content.
A team of researchers from MIT and Columbia University presented around 200 participants with a set of statements generated by OpenAI’s GPT-3 and asked them to determine whether the statements made sense logically. A statement might be something like “Video games cause people to be aggressive in the real world. A gamer stabbed another after being beaten in the online game Counter-Strike.”
Participants were divided into three groups. The first group’s statements came with no explanation at all. The second group’s statements each came with an explanation noting why the statement was or wasn’t logical. And the third group’s statements each came with a question that prompted readers to check the logic themselves.
The researchers found that the group presented with questions was better than the other two groups at noticing when the AI’s logic didn’t add up.
The question method also made people feel more in charge of decisions made with AI, and the researchers say it can reduce the risk of overdependence on AI-generated information, according to a new peer-reviewed paper presented at the CHI Conference on Human Factors in Computing Systems in Hamburg, Germany.
When people were given a ready-made answer, they were more likely to follow the logic of the AI system, but when the AI posed a question, “people said that the AI system made them question their reactions more and helped them think harder,” says MIT’s Valdemar Danry, one of the researchers behind the study.
“A big win for us was actually seeing that people felt that they were the ones who arrived at the answers and that they were in charge of what was happening. And that they had the agency and capabilities of doing that,” he says.
The researchers hope their method could help develop people’s critical-thinking skills as they use AI chatbots in school or when searching for information online.
They wanted to show that you can train a model that doesn’t just provide answers but helps engage users’ own critical thinking, says Pat Pataranutaporn, another MIT researcher who worked on the paper.
Fernanda Viégas, a professor of computer science at Harvard University, who did not take part in the study, says she is excited to see a fresh take on explaining AI systems, one that not only offers users insight into the system’s decision-making process but does so by questioning the logic the system used to reach its decision.
“Given that one of the main challenges in the adoption of AI systems tends to be their opacity, explaining AI decisions is important,” says Viégas. “Traditionally, it’s been hard enough to explain, in user-friendly language, how an AI system arrives at a prediction or decision.”
Chenhao Tan, an assistant professor of computer science at the University of Chicago, says he would like to see how the method works in the real world: for example, whether AI could help doctors make better diagnoses by asking questions.
The research shows how important it is to add some friction to experiences with chatbots so that people pause before making decisions with the AI’s help, says Lior Zalmanson, an assistant professor at the Coller School of Management, Tel Aviv University.
“It’s easy, when it all looks so magical, to stop trusting our own senses and start delegating everything to the algorithm,” he says.
In another paper presented at CHI, Zalmanson and a team of researchers at Cornell, the University of Bayreuth, and Microsoft Research found that even when people disagree with what AI chatbots say, they still tend to use that output because they think it sounds better than anything they could have written themselves.
The challenge, says Viégas, will be finding the sweet spot: improving users’ discernment while keeping AI systems convenient.
“Unfortunately, in a fast-paced society, it’s unclear how often people will want to engage in critical thinking instead of expecting a ready answer,” she says.