Saturday, January 18, 2025

How machines that can solve complex math problems might usher in more powerful AI

But the news item that really stood out to me was one that didn’t get as much attention as it should have. It has the potential to usher in more powerful AI and scientific discovery than previously possible. 

Last Thursday, Google DeepMind announced it had built AI systems that can solve complex math problems. The systems, called AlphaProof and AlphaGeometry 2, worked together to solve four out of six problems from this year’s International Mathematical Olympiad, a prestigious competition for high school students. Their performance was the equivalent of winning a silver medal. It’s the first time any AI system has achieved such a high success rate on these kinds of problems. My colleague Rhiannon Williams has the news.

Math! I can already imagine your eyes glazing over. But bear with me. This announcement is not just about math. In fact, it signals an exciting new development in the kind of AI we can now build. AI search engines that you can chat with may add to the illusion of intelligence, but systems like Google DeepMind’s could improve the actual intelligence of AI. For that reason, building systems that are better at math has been a goal for many AI labs, such as OpenAI.  

That’s because math is a benchmark for reasoning. To complete these exercises, which are aimed at high school students, the AI systems needed to do very complex things, such as planning, in order to understand and solve abstract problems. The systems were also able to generalize, allowing them to solve a whole range of different problems in various branches of mathematics.

“What we’ve seen here is that you can combine [reinforcement learning] that was so successful in things like AlphaGo with large language models and produce something which is extremely capable in the space of text,” David Silver, principal research scientist at Google DeepMind and indisputably a pioneer of deep reinforcement learning, said in a press briefing. In this case, that capability was used to construct programs in the computer language Lean that represent mathematical proofs. He says the International Mathematical Olympiad represents a test for what’s possible and paves the way for further breakthroughs. 
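To give a rough sense of what a program in Lean that represents a mathematical proof looks like, here is a tiny illustrative sketch in Lean 4. It is not taken from AlphaProof or from any Olympiad problem, and the theorem name `add_comm_example` is made up for this example. The point is that Lean accepts the file only if every step checks, which is what makes correctness unambiguous.

```lean
-- Illustrative only: a toy statement, far simpler than an Olympiad problem.
-- Claim: adding two natural numbers gives the same result in either order.
-- `add_comm_example` is a hypothetical name; `Nat.add_comm` is the
-- corresponding lemma from Lean 4's core library.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A concrete fact that Lean verifies by direct computation.
example : 2 + 2 = 4 := rfl
```

If the proof term were wrong, Lean would reject the file rather than return a plausible-looking answer.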

This same recipe could be applied in any situation with really clear, verified reward signals for reinforcement-learning algorithms and an unambiguous way to measure correctness as you can in mathematics, said Silver. One potential application would be coding, for example. 
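As a minimal sketch of what a clear, verified reward signal might look like for coding, the Python below scores a candidate function with a pass-or-fail check against test cases. This is an assumption about how such a signal could be set up, not a description of Google DeepMind's system; the names `verified_reward`, `ADD_CASES`, `correct_add`, and `buggy_add` are all hypothetical.

```python
from typing import Callable, List, Tuple

# One test case: the two integer arguments and the expected result.
Case = Tuple[Tuple[int, int], int]


def verified_reward(candidate: Callable[[int, int], int],
                    cases: List[Case]) -> float:
    """Return 1.0 only if the candidate passes every case, else 0.0."""
    for args, expected in cases:
        try:
            if candidate(*args) != expected:
                return 0.0
        except Exception:
            # A crash counts as a failure, so the signal stays unambiguous.
            return 0.0
    return 1.0


# Hypothetical test suite for an "add two integers" task.
ADD_CASES: List[Case] = [((1, 2), 3), ((0, 0), 0), ((-4, 4), 0)]


def correct_add(a: int, b: int) -> int:
    return a + b


def buggy_add(a: int, b: int) -> int:
    return a - b


print(verified_reward(correct_add, ADD_CASES))  # 1.0
print(verified_reward(buggy_add, ADD_CASES))    # 0.0
```

The all-or-nothing score mirrors the situation in mathematics, where a proof either checks or it does not; a real training setup would likely need far more careful test design.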

Now for a compulsory reality check: AlphaProof and AlphaGeometry 2 can still only solve hard high-school-level problems. That’s a long way from the extremely hard problems top human mathematicians can solve. Google DeepMind stressed that its systems did not, at this point, add anything to the body of mathematical knowledge humans have created. But that wasn’t the point.

“We are aiming to provide a system that can prove anything,” Silver said. Think of an AI system as reliable as a calculator, for example, that can provide proofs for many challenging problems, or verify tests for computer software or scientific experiments. Or perhaps build better AI tutors that can give feedback on exam results, or fact-check news articles. 
