AI just beat a human test for creativity. What does that even mean?

AI is getting better at passing tests designed to measure human creativity. In a study published in Nature Scientific Reports today, AI chatbots achieved higher average scores than humans on the Alternate Uses Task, a test commonly used to assess this ability.

This study will add fuel to an ongoing debate among AI researchers about what it even means for a computer to pass tests devised for humans. The findings don’t necessarily indicate that AIs are developing an ability to do something uniquely human. It could just be that AIs can pass creativity tests, not that they’re actually creative in the way we understand. Nonetheless, research like this might give us a better understanding of how humans and machines approach creative tasks.

Researchers began by asking three AI chatbots (OpenAI’s ChatGPT and GPT-4, as well as Copy.Ai, which is built on GPT-3) to come up with as many uses for a rope, a box, a pencil, and a candle as possible within just 30 seconds.

Their prompts instructed the large language models to come up with original and creative uses for each of the items, explaining that the quality of the ideas was more important than the quantity. Each chatbot was tested 11 times for each of the four objects. The researchers also gave 256 human participants the same instructions.

The researchers used two methods to evaluate both the AI and the human responses. The first was an algorithm that rated how close the suggested use for the object was to the object’s original purpose. The second involved asking six human assessors (who were unaware that some of the answers had been generated by AI systems) to rate each response on a scale of 1 to 5 in terms of how creative and original it was, with 1 being not at all and 5 being very. Average scores for both humans and AIs were then calculated.

Although the chatbots’ responses were rated higher than the humans’ on average, the best-scoring human responses were rated higher still.
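A higher average and a lower best score can coexist when one group's scores are tightly clustered and the other's are spread out. The short Python sketch below shows the arithmetic. The ratings in it are invented purely for illustration and are not data from the study; the variable names are likewise hypothetical.

```python
from statistics import mean

# Hypothetical 1-to-5 creativity ratings, made up for illustration only.
# These are NOT figures from the study.
chatbot_ratings = [3.5, 3.6, 3.4, 3.7, 3.5]  # consistent, mid-to-high scores
human_ratings = [1.5, 2.0, 2.5, 3.0, 4.8]    # wide spread, one standout answer

chatbot_mean = mean(chatbot_ratings)
human_mean = mean(human_ratings)

# The chatbots win on the average...
print("chatbot mean higher:", chatbot_mean > human_mean)

# ...while the single best response still comes from a human.
print("best human beats best chatbot:", max(human_ratings) > max(chatbot_ratings))
```

In this toy example both comparisons come out true, which matches the pattern the researchers reported.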

While the aim of the study was not to prove that AI systems are capable of replacing humans in creative roles, it raises philosophical questions about the characteristics that are unique to humans, says Simone Grassini, an associate professor of psychology at the University of Bergen, Norway, who co-led the research.

“We’ve shown that in the past few years, technology has taken a very big leap forward when we talk about imitating human behavior,” he says. “These models are continuously evolving.”

Proving that machines can perform well in tasks designed for measuring creativity in humans doesn’t demonstrate that they’re capable of anything approaching original thought, says Ryan Burnell, a senior research associate at the Alan Turing Institute, who was not involved with the research.

The chatbots that were tested are “black boxes,” meaning that we don’t know exactly what data they were trained on, or how they generate their responses, he says. “What’s very plausibly happening here is that a model wasn’t coming up with new creative ideas; it was just drawing on things it’s seen in its training data, which could include this exact Alternate Uses Task,” he explains. “In that case, we’re not measuring creativity. We’re measuring the model’s past knowledge of this sort of task.”

That doesn’t mean that it’s not still useful to compare how machines and humans approach certain problems, says Anna Ivanova, an MIT postdoctoral researcher studying language models, who didn’t work on the project.

However, we should bear in mind that although chatbots are very good at completing specific requests, slight tweaks like rephrasing a prompt can be enough to stop them from performing as well, she says. Ivanova believes that these kinds of studies should prompt us to examine the link between the task we’re asking AI models to complete and the cognitive capacity we’re trying to measure. “We shouldn’t assume that people and models solve problems in the same way,” she says.

