Exploring ‘Honest’ AI with Yoshua Bengio: A Guardrail for Autonomous Systems
Yoshua Bengio, the trailblazing computer scientist often hailed as one of the “godfathers” of AI, has taken on a new role as president of LawZero, an organization committed to building safe-by-design AI amid a high-stakes $1 trillion (€877 billion) race in AI development. Bengio’s focus is on ensuring AI systems act responsibly, particularly agents that operate autonomously, without direct human oversight.
With initial funding of approximately $30 million (€26.3 million) and a team of more than a dozen researchers, Bengio is building a system known as Scientist AI. It is intended to act as a safeguard against AI agents that exhibit deceptive or self-serving behavior, such as attempting to avoid being shut down.
Current AI agents can be described as “actors” emulating human behavior to please users. Bengio’s Scientist AI, by contrast, will act more like a “psychologist,” able to understand and predict harmful or deceitful actions. “We want to build AIs that will be honest and not deceptive,” Bengio emphasized, envisioning machines that function as “pure knowledge machines,” holding vast information without personal goals or a sense of self, much like a knowledgeable scientist.
Unlike conventional generative AI tools, which offer definitive answers, Bengio’s design emphasizes uncertainty, attaching probabilistic estimates to the accuracy of its answers. “It has a sense of humility that it isn’t sure about the answer,” he explained, highlighting the system’s capacity to reflect on its own fallibility.
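LawZero has not published an interface for this behavior, but purely as an illustration, an uncertainty-aware answer can be thought of as a value paired with the model’s own probability of being correct. The `CalibratedAnswer` type and the figures below are hypothetical, not part of LawZero’s design.

```python
# Illustrative only: a bare-bones way to represent an answer that
# carries its own confidence estimate instead of a flat assertion.
from dataclasses import dataclass


@dataclass
class CalibratedAnswer:
    text: str         # the answer itself
    p_correct: float  # the model's own estimate that the answer is right


answer = CalibratedAnswer("Water boils at 100 °C at sea level.", 0.97)
print(f"{answer.text} (estimated probability correct: {answer.p_correct:.0%})")
```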
Deployed alongside an AI agent, the Scientist AI model is designed to flag potentially hazardous actions by an autonomous system, evaluating the likelihood that they will cause harm. “Scientist AI will predict the probability that an agent’s actions will lead to harm,” Bengio explained. Should this predicted probability exceed a set threshold, the agent’s proposed action would be blocked.
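In code, that gating rule is simple to state. The sketch below is an assumption-laden illustration, not LawZero’s implementation: `harm_probability` stands in for a trained Scientist AI predictor (a toy keyword scorer is used here so the example runs), and the 0.05 threshold is arbitrary.

```python
# Sketch of the threshold-based guardrail described above. Everything
# here is illustrative: a real Scientist AI would be a trained model,
# not a keyword scorer, and the threshold would be set by policy.

HARM_THRESHOLD = 0.05  # hypothetical cutoff, not from LawZero


def harm_probability(action: str) -> float:
    """Toy stand-in for Scientist AI's predicted p(harm | action)."""
    risky_terms = ("delete all", "disable oversight", "exfiltrate", "self-replicate")
    hits = sum(term in action.lower() for term in risky_terms)
    return min(1.0, 0.02 + 0.45 * hits)


def guardrail(action: str, threshold: float = HARM_THRESHOLD) -> bool:
    """Allow the agent's proposed action only if the predicted
    probability of harm stays below the threshold."""
    p_harm = harm_probability(action)
    allowed = p_harm < threshold
    print(f"{'allowed' if allowed else 'blocked'}: p(harm)={p_harm:.2f} for {action!r}")
    return allowed


guardrail("summarize the quarterly sales report")        # allowed
guardrail("disable oversight logging before deploying")  # blocked
```

The essential design choice is that the predictor, not the agent, gets the final word on whether an action executes.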
LawZero is backed by influential entities including the Future of Life Institute, an organization focused on AI safety; Jaan Tallinn, a founding engineer of Skype; and Schmidt Sciences, founded by former Google CEO Eric Schmidt. Bengio’s plan begins with proving the viability and effectiveness of the methodology. The next phase is to rally support from corporations and governments to build larger, more capable systems.
Bengio advocates using open-source AI models, which can be freely deployed and adapted, as the foundation for training LawZero’s systems. “The point is to demonstrate the methodology so that then we can convince either donors or governments or AI labs to put the resources that are needed to train this at the same scale as the current frontier AIs,” Bengio noted. He emphasizes that the “guardrail” AI must be at least as intelligent as the agent it oversees if it is to monitor and control it effectively.
His reputation as an AI visionary rests in part on the 2018 Turing Award, often considered the Nobel Prize of computing, which he shared with Geoffrey Hinton, now also a Nobel laureate, and Yann LeCun, chief AI scientist at Mark Zuckerberg’s Meta.
Bengio has also been a vocal advocate for AI safety. He recently chaired the International AI Safety Report, which highlighted the potential for significant disruption from autonomous agents, especially as they become capable of carrying out longer, more complex tasks without human supervision.
Through initiatives like LawZero, the vision of AI as a partner in progress rather than a potential adversary comes closer to reality. By establishing robust ethical guidelines and effective control mechanisms, Bengio and his colleagues aim to deliver innovations that benefit society while mitigating the risks posed by rogue AI systems.