Google DeepMind has used a large language model to crack a famous unsolved problem in pure mathematics. In a paper published in Nature today, the researchers say it is the first time a large language model has been used to discover a solution to a long-standing scientific puzzle, producing verifiable and valuable new information that did not previously exist. "It's not in the training data; it wasn't even known," says coauthor Pushmeet Kohli, vice president of research at Google DeepMind.
Large language models have a reputation for making things up, not for providing new facts. Google DeepMind's new tool, called FunSearch, could change that. It shows that they can indeed make discoveries, if they are coaxed just so and if you throw out most of what they come up with.
FunSearch (so called because it searches for mathematical functions, not because it's fun) continues a streak of discoveries in fundamental math and computer science that DeepMind has made using AI. First AlphaTensor found a way to speed up a calculation at the heart of many different kinds of code, beating a 50-year record. Then AlphaDev found ways to make key algorithms used trillions of times a day run faster.
Yet those tools did not use large language models. Built on top of DeepMind's game-playing AI AlphaZero, both solved math problems by treating them as if they were puzzles in Go or chess. The trouble is that they are stuck in their lanes, says Bernardino Romera-Paredes, a researcher at the company who worked on both AlphaTensor and FunSearch: "AlphaTensor is great at matrix multiplication, but basically nothing else."
FunSearch takes a different tack. It combines a large language model called Codey, a version of Google's PaLM 2 that is fine-tuned on computer code, with other systems that reject incorrect or nonsensical answers and plug good ones back in.
"To be very honest with you, we have hypotheses, but we don't know exactly why this works," says Alhussein Fawzi, a research scientist at Google DeepMind. "At the beginning of the project, we didn't know whether this would work at all."
The researchers began by sketching out the problem they wanted to solve in Python, a popular programming language. But they left out the lines in the program that would specify how to solve it. That is where FunSearch comes in. It gets Codey to fill in the blanks: in effect, to suggest code that will solve the problem.
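The paper's actual setup is more elaborate, but the shape of the idea can be sketched in a few lines of Python: a hand-written scaffold builds and checks a solution, while one scoring function is deliberately left blank for the model to write. (The names `priority`, `solve`, and `is_valid` here are illustrative, not DeepMind's code.)

```python
def priority(candidate) -> float:
    """Deliberately left blank: this is the hole FunSearch asks the
    language model to fill with proposed code."""
    raise NotImplementedError

def solve(candidates, is_valid):
    """Hand-written scaffold: greedily accept candidates in order of the
    model-written priority, keeping only those that remain valid."""
    solution = []
    for c in sorted(candidates, key=priority, reverse=True):
        if is_valid(solution, c):
            solution.append(c)
    return solution
```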
A second algorithm then checks and scores what Codey comes up with. The best suggestions, even if not yet correct, are saved and given back to Codey, which tries to complete the program again. "Many will be nonsensical, some will be sensible, and a few will be truly inspired," says Kohli. "You take those truly inspired ones and you say, 'Okay, take these ones and repeat.'"
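In outline, that check, score, and resubmit cycle is an evolutionary loop over programs. A minimal sketch, where `propose` (ask the model for new code, prompted with the best examples so far) and `evaluate` (run the code and grade its output, returning None on a crash or an invalid answer) are hypothetical stand-ins for FunSearch's real components:

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    code: str
    score: float

def evolve(propose, evaluate, seed, iterations=10_000):
    """Keep a pool of scored programs: prompt with the best ones,
    score what comes back, and discard anything that fails."""
    pool = list(seed)
    for _ in range(iterations):
        best = sorted(pool, key=lambda c: c.score, reverse=True)[:2]
        code = propose([c.code for c in best])  # prompt built from top scorers
        score = evaluate(code)
        if score is not None:                   # throw out the nonsense
            pool.append(Candidate(code, score))
    return max(pool, key=lambda c: c.score)
```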
After a couple of million suggestions and a few dozen repetitions of the overall process, which took a few days, FunSearch was able to come up with code that produced a correct and previously unknown solution to the cap set problem, which involves finding the largest size of a certain type of set. Imagine plotting dots on graph paper. The cap set problem is like trying to figure out how many dots you can put down without three of them ever forming a straight line.
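The version mathematicians actually study lives in n-dimensional space over the integers mod 3, where three distinct points form a line exactly when they sum to zero in every coordinate. That makes any proposed cap set cheap to verify mechanically, which is what allows FunSearch to score suggestions automatically. A straightforward checker (a sketch, not the paper's evaluator):

```python
import itertools

def is_cap_set(points: list[tuple[int, ...]]) -> bool:
    """Return True if no three distinct points lie on a line in Z_3^n.
    In Z_3^n, distinct points a, b, c are collinear exactly when
    a + b + c == 0 (mod 3) in every coordinate."""
    for a, b, c in itertools.combinations(points, 3):
        if all((x + y + z) % 3 == 0 for x, y, z in zip(a, b, c)):
            return False
    return True
```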
It's super niche, but important. Mathematicians don't even agree on how to solve it, let alone what the solution is. (It is also connected to matrix multiplication, the computation that AlphaTensor found a way to speed up.) Terence Tao at the University of California, Los Angeles, who has won many of the top awards in mathematics, including the Fields Medal, called the cap set problem "perhaps my favorite open question" in a 2007 blog post.
Tao is intrigued by what FunSearch can do. "This is a promising paradigm," he says. "It is an interesting way to leverage the power of large language models."
A key advantage that FunSearch has over AlphaTensor is that it can, in theory, be used to find solutions to a wide range of problems. That is because it produces code: a recipe for generating the solution, rather than the solution itself. Different code will solve different problems. FunSearch's results are also easier to understand: a recipe is often clearer than the weird mathematical solution it produces, says Fawzi.
To test its versatility, the researchers used FunSearch to approach another hard problem in math: the bin packing problem, which involves trying to pack items into as few bins as possible. It is important for a range of applications in computer science, from data center management to e-commerce. FunSearch came up with a way to solve it that is faster than human-devised ones.
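For a sense of the human-devised baselines FunSearch was up against, here is the classic best-fit heuristic, which puts each incoming item into the open bin with the least room that can still hold it. (This is the standard textbook rule, shown for comparison; it is not the heuristic FunSearch discovered.)

```python
def best_fit(items: list[float], capacity: float = 1.0) -> list[list[float]]:
    """Classic best-fit heuristic: place each item in the open bin
    with the least remaining space that still fits it; otherwise
    open a new bin."""
    bins: list[list[float]] = []   # contents of each bin
    space: list[float] = []        # remaining capacity of each bin
    for item in items:
        fits = [i for i, s in enumerate(space) if s >= item]
        if fits:
            i = min(fits, key=lambda j: space[j])  # tightest fit wins
            bins[i].append(item)
            space[i] -= item
        else:
            bins.append([item])
            space.append(capacity - item)
    return bins
```

For example, `best_fit([0.6, 0.4, 0.7, 0.3])` packs the four items into two bins, [0.6, 0.4] and [0.7, 0.3].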
Mathematicians are "still trying to figure out the best way to incorporate large language models into our research workflow in ways that harness their power while mitigating their drawbacks," Tao says. "This definitely indicates one possible way forward."