In 1977, at the University of Massachusetts, Amherst, researcher Andrew Barto began exploring a revolutionary idea: that neurons functioned like hedonists, striving to maximize pleasure and minimize pain. This concept, that the human brain is driven by billions of such cells, became the foundation for significant advancements in artificial intelligence.
A year later, Richard Sutton joined Barto, and together they sought to explain human intelligence using this simple yet powerful concept. Their collaboration produced “reinforcement learning,” a method by which AI systems learn through digital rewards and punishments.
Recently, the Association for Computing Machinery (ACM), the world’s largest society of computing professionals, announced that Drs. Barto and Sutton are the recipients of this year’s Turing Award. Established in 1966, the Turing Award is often called the Nobel Prize of computing. The two scientists will share the $1 million prize equally.
Over the past decade, reinforcement learning has been essential to the remarkable advances in artificial intelligence. It is a key technique behind systems such as Google’s AlphaGo and OpenAI’s ChatGPT, both of which have captured the world’s technological imagination. Those systems build on the foundational work of Dr. Barto and Dr. Sutton.
“They are the undisputed pioneers of reinforcement learning,” said Oren Etzioni, a professor emeritus of computer science at the University of Washington and founding chief executive of the Allen Institute for Artificial Intelligence. “They conceived the key ideas and also literally wrote the book on the subject.”
Their seminal work, “Reinforcement Learning: An Introduction,” first published in 1998, continues to serve as the definitive text on the subject. Many experts believe the full potential of reinforcement learning is only beginning to be realized.