AI companies are falling short on their promises to the White House
A year after leading AI companies made voluntary commitments to the White House, adherence is patchy
Last July, leading AI companies made voluntary commitments to the White House on AI safety. In what was seen at the time as a landmark event, the companies promised to conduct safety research, establish bug bounty programs, and red-team their models before deployment. According to President Biden, the commitments were “real” and “concrete”, and would help the industry “fulfill its fundamental obligation to Americans to develop safe, secure, and trustworthy technologies.”
But one year on, adherence to the commitments is patchy — raising questions about the efficacy of self-regulation.
Zach Stein-Perlman, an independent researcher who has assessed labs’ safety efforts for AI Lab Watch, said the companies have repeatedly failed to meet their commitments. OpenAI’s “bug bounty” program excludes issues with its models, despite that being part of the commitments. Microsoft and Meta, meanwhile, do not appear to test their models for capabilities such as self-replication — a task explicitly mentioned in the commitments to the White House. (When asked about this, a Meta spokesperson said that while the company doesn’t specifically test for self-replication, it does assess its models for autonomous cyber capabilities, which could be considered a precursor to self-replication.)
While most companies are technically meeting most commitments, the implementation is often weak, Stein-Perlman noted. Take bug bounties, for instance. Companies promised the White House they would set up incentives for third parties to discover and report issues and vulnerabilities, but the language is so vague that “even a bug bounty that excludes many issues with models arguably complies with the commitment,” Stein-Perlman said. Google, Microsoft, and Meta’s bounty programs technically meet the commitment but exclude important safety concerns such as jailbreaks or getting a model to help with misuse.
Bug bounties aren’t the only example of weak implementation. According to a recent Washington Post article, OpenAI rushed through internal testing for GPT-4o in one week. “They planned the launch after-party prior to knowing if it was safe to launch,” an OpenAI whistleblower told the Washington Post. “We basically failed at the process.”
Grant Fergusson of EPIC, a digital policy advocacy group, concurred with Stein-Perlman on the weakness of voluntary commitments. “By and large, it seems like they have put together the trappings of really responsible AI development and use, without a lot of the follow-through,” he said.
Fergusson noted that poor transparency from the companies makes it hard to track how well they are complying with the commitments. Even for companies claiming they did comprehensive red-teaming, “we don’t actually know the results of their red teaming, and we don’t know exactly what they are testing for when they are red teaming.”
Despite these shortcomings, voluntary commitments continue to be the primary mechanism by which AI safety measures are implemented. In May, companies promised to create responsible scaling policies, which outline how labs will assess risks and the thresholds at which they will mitigate them. However, there are still no binding rules in the US to ensure follow-through.
Several people told Transformer that the voluntary commitments were a good first step, offering speed and flexibility compared to legislation and allowing policymakers to see prototypes of potential policies in action. Companies have also expressed support for the White House’s approach. “The White House commitments have really served as a foundation for a lot of the work here with policy makers, but also with governments globally,” Amazon VP of Public Policy Shannon Kellogg recently told the Washington AI Network.
However, many are skeptical that voluntary measures are sufficient. “I don’t think we can rely solely on voluntary testing and evaluations indefinitely,” said Chris Painter from METR, a research nonprofit focused on AI evaluation. “Eventually we’re going to need external assessments with teeth.”
Disclosure: The author’s partner works at Google DeepMind.