The computing world celebrates the 2024 Association for Computing Machinery A.M. Turing Award, often dubbed the “Nobel Prize of computing.” The award recognizes Andrew Barto, professor emeritus at the University of Massachusetts Amherst, and Richard Sutton, professor of computer science at the University of Alberta, Canada, for their groundbreaking contributions to reinforcement learning (RL).
Reinforcement Learning: A Transformative Legacy
Barto and Sutton are lauded as pioneers of modern computational reinforcement learning. RL addresses the challenge of teaching agents how to act based on evaluative feedback. Their work has provided the foundational conceptual and algorithmic framework for RL, profoundly influencing the future of both artificial intelligence and decision-making systems.
The impact of RL spans multiple disciplines, including computer science (machine learning), engineering (optimal control), mathematics (operations research), neuroscience (optimal decision-making), psychology (classical and operant conditioning), and economics (rational choice theory). Researchers across these fields continue to be inspired by the work of Sutton and Barto.
NSF Funding: Fueling AI Innovation
Barto’s research was made possible by numerous U.S. National Science Foundation (NSF)-funded projects that supported AI research long before its recent explosion in popularity and funding. These grants, which came from programs such as the National Robotics Initiative, Robust Intelligence, and Artificial Intelligence and Cognitive Science, drove the long-term, fundamental advances in machine learning that we see today.
“Barto’s research exemplifies the power of foundational computational research that has not only advanced state-of-the-art decision-making machines and intelligent systems but has also provided critical insights into understanding intelligence itself,” noted Greg Hager, NSF assistant director for Computer and Information Science and Engineering.
Michael Littman, director for the NSF Division of Information and Intelligent Systems, echoed this sentiment: “Andy Barto’s work laid the foundation for modern reinforcement learning, influencing generations of researchers, including myself. His insights with Rich Sutton into how agents can learn and adapt in complex environments form the backbone of how automated behavior is generated in the field of artificial intelligence. Without his pioneering research, many of today’s — and tomorrow’s — AI breakthroughs wouldn’t be possible.”
Impact of Barto and Sutton’s Work
NSF has long supported foundational AI research, and Barto’s work stands out as particularly influential. Barto and Sutton formalized key RL concepts over decades of research, beginning with Sutton’s time as Barto’s first doctoral student. Their collaboration continued when Sutton joined Barto at UMass Amherst, producing many of the foundational RL approaches that are still in use today.
Reinforcement learning methods, built on Sutton and Barto’s work, power a wide array of technologies, including:
- Chatbots: Conversational AI agents employ RL to answer questions helpfully and accurately, as demonstrated by ChatGPT and similar bots.
- Games: RL algorithms have enabled computer players to achieve world-class performance in games like Jeopardy! and Go, even influencing top human players’ strategies.
- Robot motor skill learning: RL enables robots to learn, through trial and error, how to perform intricate tasks independently.
- Microprocessor layout and circuit design: RL systems make decisions about composing components that make up computer chips.
- Personalized recommendations: Online services such as Netflix and YouTube rely on RL techniques to tailor recommendations.
- Autonomous vehicles: RL models help self-driving cars navigate complex traffic environments.
- Supply chain optimization: RL systems learn how and where to store items so customers receive goods quickly and at a low cost.
- Algorithm design: Researchers have found novel solutions to long-standing problems using RL.
Breakthroughs in RL have spurred a multibillion-dollar industry, with companies such as DeepMind and OpenAI relying on RL as a core technology. Numerous major tech firms now have dedicated RL research teams, and RL is increasingly recognized as a core topic of study. For example, RL was added to the Computer Science Standards of Learning for Virginia Public Schools earlier this year.
Bridging AI and Neuroscience
The work of Barto and Sutton extends beyond computer science and AI, forging crucial connections between RL and brain sciences, including cognitive science, psychology, and neuroscience. Their research has provided revolutionary insights into the mechanisms of learning, in both machines and the human brain.
One of their initial breakthroughs emerged in 1981 when they showed that temporal difference (TD) learning could describe certain learning behaviors that the then-existing Rescorla-Wagner model could not. This discovery opened the door to a new way of understanding how learning occurs. Building on this, a 1995 study found a link between the TD algorithm and the behavior of dopamine neurons in the brain. That link provided the foundation for later experiments, which confirmed that TD learning accurately describes how dopamine influences reward-based learning.
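For readers who want to see the distinction concretely, the following is a minimal Python sketch, not a reconstruction of the 1981 or 1995 studies. It assumes a simplified conditioning trial in which a cue is followed a few time steps later by a reward, and it uses textbook forms of both update rules. The function names, the step counts, and the learning rate are all hypothetical choices made for illustration. The Rescorla-Wagner rule yields one prediction error per trial, while the TD rule yields one per time step, so only the TD version can show where within a trial the error occurs.

```python
def simulate_rescorla_wagner(n_trials=200, alpha=0.2, reward=1.0):
    """Trial-level rule: one prediction error per trial, no notion of timing."""
    v = 0.0  # associative strength of the cue
    errors = []
    for _ in range(n_trials):
        error = reward - v       # difference between outcome and expectation
        v += alpha * error
        errors.append(error)
    return errors


def simulate_td(n_trials=200, cue_step=2, reward_step=8, n_steps=12,
                alpha=0.2, gamma=1.0):
    """TD(0) over time steps within each trial.

    Assumption: w[k] is the predicted future reward k steps after cue onset.
    Steps before the cue have a fixed prediction of zero, modeling a cue
    whose arrival cannot be anticipated.
    """
    w = [0.0] * (reward_step - cue_step)

    def value(t):
        k = t - cue_step
        return w[k] if 0 <= k < len(w) else 0.0

    history = []
    for _ in range(n_trials):
        deltas = []
        for t in range(n_steps - 1):
            # Reward arrives on the transition into reward_step.
            r = 1.0 if t + 1 == reward_step else 0.0
            delta = r + gamma * value(t + 1) - value(t)  # TD error
            k = t - cue_step
            if 0 <= k < len(w):
                w[k] += alpha * delta
            deltas.append(delta)
        history.append(deltas)
    return history


rw_errors = simulate_rescorla_wagner()
td_history = simulate_td()
cue_index, reward_index = 1, 7  # transitions into cue_step=2 and reward_step=8

print("Rescorla-Wagner error, first vs. last trial:",
      rw_errors[0], round(rw_errors[-1], 4))
print("TD error at reward, first vs. last trial:",
      td_history[0][reward_index], round(td_history[-1][reward_index], 4))
print("TD error at cue onset, first vs. last trial:",
      td_history[0][cue_index], round(td_history[-1][cue_index], 4))
```

Under these assumptions, the TD error starts out at the moment of reward and, with training, fades there while appearing at cue onset. That shift in timing is the kind of within-trial pattern that researchers later compared with dopamine neuron recordings, and it has no counterpart in the single per-trial error of the Rescorla-Wagner sketch.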
The 2024 A.M. Turing Award, recognizing Barto and Sutton’s lifetime achievements, underscores how essential sustained federal investment in basic research has been, the type of support that has fueled AI’s breakthroughs over the last four decades.