OpenAI is haemorrhaging safety talent
Since the failed Altman ouster, many of the company's most safety-conscious employees have left
Safety-minded employees of OpenAI appear to be quitting in droves. While much attention this week has gone to the resignations of Superalignment co-leads Ilya Sutskever and Jan Leike, those are just two of a recent spate of departures that suggest a hollowing out of the company’s supposed safety-first culture.
A total of eight people known for their concerns about AI safety have recently left OpenAI. William Saunders, who worked on Sutskever and Leike’s Superalignment team, resigned in February, while fellow Superalignment researchers Leopold Aschenbrenner and Pavel Izmailov were fired last month for “allegedly leaking information”, according to The Information. Ryan Lowe, another alignment researcher at the company, left in March.
Other departures include Daniel Kokotajlo, a governance researcher, and policy researcher Cullen O’Keefe, who left OpenAI in April, according to his LinkedIn.
Some departing researchers have been clear that they left over safety concerns. Jan Leike said “safety culture and processes have taken a backseat to shiny products” at the company, complaining that his team was “struggling for compute” — despite OpenAI’s promise to provide the team with 20% of the company’s total compute. Leike said he “reached a breaking point” with leadership, after “disagreeing with [them] about the company's core priorities for quite some time”. He urged OpenAI to become a “safety-first AGI company”, calling on employees to “act with the gravitas appropriate for what you're building”.
Kokotajlo, meanwhile, wrote that he quit “due to losing confidence that [OpenAI] would behave responsibly around the time of AGI”. Kokotajlo also said he gave up equity to be able to criticise the company in future.
On Friday, OpenAI said it was shutting down the Superalignment team, and “integrating the group more deeply across its research efforts”, according to Bloomberg.
Other notable departures include Chris Clark, OpenAI’s head of nonprofit and strategic initiatives, and Sherry Lachman, head of social impact. Both are listed as contributors on a December 2023 post announcing Superalignment Fast Grants. Of that post’s ten contributors outside administrative and comms roles, only three remain at the company (Aleksander Madry, Collin Burns, and Nat McAleese).
The departures come in the wake of the dramatic failed ouster of Sam Altman last November. Reporting since then has suggested that OpenAI’s board moved to fire Altman after he allegedly misled them, most notably by making one board member falsely believe that another member, Tasha McCauley, wanted Helen Toner (a third board member) removed. Sutskever also reportedly compiled a list of “20 examples of when he believed Altman misled OpenAI executives over the years”. Other reporting around the time suggested that board members suspected a “pattern of manipulative behaviour by Altman”.
The board’s attempt to fire Altman, however, proved unsuccessful after an employee revolt, with 95% of staff signing a letter demanding the board resign and reinstate Altman. Notably, Aschenbrenner, O’Keefe, Kokotajlo and Leike all appear not to have signed that letter, though Leike did call on the board to resign. (When I pointed this out on Twitter, Aschenbrenner liked the tweet.)
The failed ouster raised widespread concern about OpenAI’s structure and governance. Those concerns appear to have been amplified in March, when Altman was reappointed to the board following an investigation which found that his “conduct did not mandate removal”. The Washington Post reported that the investigation gave employees no way to confidentially share relevant information with investigators.
Shortly after Altman’s reinstatement, Helen Toner and Tasha McCauley — two of the board members who voted to oust Altman — put out a pointed statement emphasising that “deception, manipulation, and resistance to thorough oversight should be unacceptable”.
The current exodus echoes the departure of several safety-minded people in 2021, including Paul Christiano, who was recently made head of AI safety at the US AI Safety Institute, and Dario Amodei, Daniela Amodei, and Jack Clark, who went on to found Anthropic.
Updated on Friday 17th May with news of the Superalignment team closing and new statements from Jan Leike.