**In defence of Helen Toner, Adam D'Angelo, and Tasha McCauley**

I understand a common view in EA- or AI-governance land is that Toner, D'Angelo and McCauley (TDM for short) really messed things up at OpenAI, and AI, the fate of the world, etc. has gotten worse thanks to them. I am confident this is completely wrong: ex ante TDM have acquitted themselves with extraordinary ability and valor (instead of a 'maybe-understandable maybe-not' massive screw-up); ex post, their achievements are consequential enough to vindicate the entire AI governance community as a whole.

I argue:
1) TDM's actions have left the situation at OpenAI as it stands considerably better than it would have been vs. the counterfactual where they did nothing.
2) In terms of the expected or realised good or bad outcomes, one should find the former pleasantly surprising and the latter essentially priced in, given the situation at OpenAI was already very bad from a safety perspective.
3) Whether you're an 'honour and integrity-maxxer' or 'ruthless strategist', TDM's actions generally fare well-to-excellent by either light.

(Note: Anon mostly for wanting a quiet life. I have no inside info, nor any dog in the fight. Re. 'credentials', I don't work in AI gov, but am pretty experienced in an area which rewards strategic acumen and abundant cynicism, and I made some prescient calls on the 'story so far'. But as an anon this is little more than 'source: trust me bro', so you shouldn't unless what I argue persuades.)

**What went down**
I think the most accurate account has been given by Zvi and Gwern over on LessWrong (also NYT reporting). Basically: Altman attempted to knife Helen Toner to gain control of OpenAI's board (i.e. with 3 to 2 Altman can appoint his allies to stack the board, knife McCauley later, etc.). Ilya baulked, briefly defected to the TDM 'safety faction', who then gained control themselves and fired Sam. All the subsequent events are widely reported. (My argument relies on this being ~the real story, so if you're sure that isn't what happened, you can stop reading).

This suggests Altman is a nasty piece of work, and definitely not the person you'd want running a major AI effort: highly machiavellian, power seeking, and desiring to remove any check on his unilateral vision of how OpenAI should go. I take events like the Reddit board reverse takeover, Paul Graham's 'compliments' re. his superlative corporate knife-fighting skills, sparking a Microsoft/Google LLM arms race, and seemingly trying to keep the board mostly in the dark re. concerns around GPT-4 as further evidence of this.

Although my argument doesn't need quite such an extremely adverse judgement of Altman, I am mystified why some in EA/AI governance land think well of him: there's a huge stack of apparent negatives, and the only positive I can see is 'sometimes makes kinda the right noises re. AI safety when convenient to him' (cheap shot: SBF also made the right noises much more consistently...) But even if Altman is great, trying to depose an independent board member of the governing non-profit on the pretext of something they wrote which is adverse to OpenAI's commercial interest is wholly unacceptable.

**What things look like now**
Sam is (likely) back as CEO, and the board appears set to be 1 Sam ally, D'Angelo (1 person who got him kicked), and Summers presumably to act as balance. Tasha, Helen, and Ilya are out, and Greg and Sam do not return.

Comparing this board vs. the counterfactual where Altman got Toner deposed, this is a board which Altman has much less control over. Firstly, he's not on it, and secondly he now has one third rather than a majority clearly allied to him. Of course, this may change: I'd expect Altman to continue his attempts to gain control of the board, and given his abilities he may well succeed sooner or later; even if not, maybe he can arrange 'facts on the ground' so the board is ineffectual at stopping him running the company as he pleases. Or maybe Microsoft swoops in and essentially runs the show from now on.

But maybe none of these things, and maybe the board does grow into an effective check on Altman and those happy to race to the precipice. At least the good guys still have play at the table, instead of being effectively wiped out had TDM let Sam succeed in his coup. Sometimes that's the best you can hope for.

But was it? There's a bunch of other outcomes which seem bad, e.g.:
* Sam is back as CEO, with a lot of laudatory PR. If he is as bad as I think he is, that's not good.
* OpenAI staff breaking ~unanimously in Sam's favour.
* Blowback/distrust/ridicule/etc. against 'AI safety', 'EAs', 'Decels', etc.

Essentially, yes: TDM had basically no chance of avoiding these downsides (so credible apparent lapses are basically 'no harm, no foul'), and even with avid Monday-morning quarterbacking it seems they basically took their best shot given the very adverse position they were in, and the execution surpassed reasonable expectations.

**Why the bad things should have basically already been priced in**
The current drama suggests three main power blocs: the board (well, TDM), the staff, and investors. The latter two should have been reliably modelled in advance to have their main bottom line be "$" rather than "the mission" if push comes to big enough shove. And the potential shoves here were massive (e.g. staff becoming overnight millionaires if the equity vests, MS's billions in investment). So if they got to vote between "keeping the talismanic CEO promising to keep the astonishingly good times rolling" vs. "mission/safety concerns" one should expect them to overwhelmingly back the former virtually no matter what.

But, at least ostensibly, and definitely by design (Altman's previous remarks on the importance of OpenAI's odd governance structure are ironic here) the board ultimately calls the shots, and is set up to be (unlike the latter two groups) insensitive to financial pressure. Although other parties (inc. outsiders like most of the financial press) presume the power vested in them by the charter is essentially cosmetic PR or LARP, it does reserve some realpolitik teeth: a lot of resources seem to be tied up under 501(c)(3), and although the board can't stop the staff emigrating nor investors starting a 'we couldn't GAF about safety' alternative AI firm, they can credibly threaten to torch a lot of 'shareholder value' already placed in OpenAI, and so fight for concessions if these parties return to the 'inside option'.

In other words: board power is the only realistic leverage TDM could achieve, which they did (masterfully) and exploited as much as practicable. That a lot of people - insiders or observers - who don't care about AI safety despise or ridicule them (and perhaps by extension AI safety writ large) for acting by the lights of a cause they think is stupid (or opposed to their interests) is an inevitable cost of doing business. Primarily, the events revealed dynamics which were latent but pre-existing, more than galvanising or strengthening them.

**Doing benefit of hindsight better**
Humans are fallible, and hindsight is 20-20, so it can be easy to be too critical of actions in the fog of war which seem stupid once it clears. But one can err in the opposite direction: you can almost headcanon anything and anyone as a brilliant hero if you carefully stipulate what they could or could not have 'reasonably foreseen' or 'reasonably be led to believe'.

Another challenge is there are two perspectives available. One is the more archly/naively consequentialist version: OpenAI governance is an IRL game of Diplomacy, so TDM should take an opportunity to do a 'stab' if the EV is high enough, etc. Another is the more principled 'honor at all costs' which seems in vogue among the rationalists: TDM should act as paladins of the charter, so should take a stand even if they knew for sure it would be a futile and counter-productive gesture, and be kinda lawful-stupid in general, as something something pre-commitment something something UDT this is actually the best multiverse-wide policy, etc.

These perspectives usually converge: even if the optimal frequency of (e.g.) 'backstabbing' is not 'none' it is at most 'very rare'. As indeed TDM acquitted themselves well, their choices generally (but not universally) look somewhere between 'defensible' and 'exemplary' whichever way you prefer. This is best seen by looking at various 'why didn't they X' questions.

*Why didn't they give peace a chance?*
TDM could, presumably, have picked a lesser response to Altman than throwing him out completely. Perhaps they could have kicked Sam and Greg from the board but kept him as CEO, or they could have used their majority to appoint a bunch of safetyist members to block future coup attempts (or both). Or they could have called a meeting with Sam to discuss his behaviour in the hopes of sorting things out rather than the hasty 'counter-coup' they went for. Perhaps it would have been in vain, but there's option value, and maybe gives them a better position if (e.g.) Altman resigns in response and does much the same as he actually did.

In realpolitik terms, this seems extremely dumb. Even ignoring any prior concerns about Altman, the coup attempt indicates Altman is out to get you. Ilya's defection offers the only window of opportunity, with uncertain (and with hindsight, evidently short) duration to get him first. Keeping him around as a CEO hostile to your interests seems to be begging him to try other options to depose you de facto (e.g. just making sure you don't get consulted on anything consequential) even if you make yourself immune to him repeating this particular attack. Even if you should have foreseen that you would likely have to strike a bargain where he comes back as CEO but without a board seat anyway (cf. what actually happened), you got to freeroll a chance at succeeding in replacing him with someone better, and re-appointing Altman is an extra horse you can trade in negotiations.

Although principled people might dislike corporate sharp practice in principle, TDM's knifing of Altman is credibly justified as a response to his attempted knifing of Toner - they didn't fight 'dirty' first - and there's a natural justice 'let's give you a taste of your own medicine' aspect here. Even if you don't think tit-for-tat is appropriate, I think the 'paladin of the charter' view would say Altman's summary dismissal is either reasonable or straightforwardly required. Criticising an independent board member overseeing you for writing something you don't like about your company is already out of line. Using this as a pretext for kicking them off the board, because they have opposed and will continue to oppose your desire for rapid commercialization, and thereby ending any ability of the board to effectively oppose you, crosses a red one. That this appears pre-meditated, alongside a background of trying to end-run the board etc., means the board should not have confidence in the CEO justly developing AI technology for the benefit of all.

An archly principled person could dock TDM points if it is believed TDM kinda knew Ilya would later regret what he voted for, so rushed him into it before he could think better of it (/have Greg's wife talk him down). But this conjoins a lot of propositions about what TDM did/should have thought/foreseen at the time, and requires resolving a variety of potential even-if-so defences against them. A realpolitik perspective would be extremely impressed: TDM were ambushed by (reputedly) one of the best corporate knife-fighters around, yet it was they who ended up stabbing him.

*Why not explain clearly what happened?*
I agree this looks like a mistake, albeit of little consequence. Even if "Altman tried to coup us, so we couped back" wasn't going to get much traction against the Altman/SV media full-court press, nor cut much mustard with the staff ("Well, if it's the board or Altman, we definitely prefer Altman"), it seems better than offering nothing substantial. The principled approach would say you owe staff a full explanation if you are doing things which could lose them a lot of money.

Reasonable explanations could emerge: maybe there are overriding principles which oblige silence, maybe it's canny to reserve 'we will sign NDAs/non-disparagement contracts and not go on a media tour after we leave' as a further bargaining chip, maybe they (reasonably, if mistakenly) followed very cautious legal advice, or something else. But it seems mistaken, and presuming TDM did the right things to explain away apparently wrong things in the course of arguing they did the right things begs the question.

*Why did they cave?/Why didn't they burn it all down?*
If you're sure Altman is such bad news, and you were willing to 'full send aggro' to kick him, why would you let him return? Why not make OpenAI die on this hill out of principle - or at least make Altman's setting up under MS a Pyrrhic victory given the likely legal headaches from scavenging OpenAI's corpse for valuable AI resources? Why not stick up the middle finger and gift everything you have legal title over to someone else?

I would guess these headaches, and the credible threat that TDM could torch a lot of OpenAI 'assets' if they wanted to, are what brought Altman to the table, and why he has agreed in principle to a deal much less like the 'abject surrender to him' that his pet press coverage kept suggesting was imminent or the only thing he'd accept. I think what TDM have won for their side is the prospect (not guarantee) of more safety conscious governance which Altman will have to abide by, which looks better for the safety perspective (at least to me) than him setting up directly under Microsoft, even if he would suffer a (substantial) one-off switching cost.

*Won't someone please think of the second order effects?!*
But what about the wider world ramifications? Was it worth it if EA/AIS/whatever is now a dirty word or object of scorn amongst the technical or financial elite? If TDM look stupid (even if they weren't) will that reflect badly on the rest of us? Could this stop anyone investing in safety-primary governance structures for big tech ever again? Will OS AI development proliferate as no one wants to trust an API which could be snuffed out if >50% of a small number of people decide to do it?

They're fine:
1) The principled anti-PR view would say one should accept that people sincerely think less of you or oppose you for what you actually believe and stand for. Tact is fine, open compromise is also fine, but maintaining a pretence you're (e.g.) pro cautious capabilities when you're really 'pause AI' is wrong. So social reality/higher simulacrum levels should not enter into the evaluation of what TDM did.
2) If you don't buy that, there are typically a lot of putative second order effects going this way and that. Maybe OpenAI having a big blow up gives impetus to government involvement? Maybe Altman having to be more 'mask off' to cling on at OpenAI allows others to be on their guard when things really start to count? Maybe this governance clearly failing (or clearly being seen to fail) means the same mistakes are not repeated? So if in first order terms TDM did the right thing, maybe the second order stuff (especially ex ante) is an approximate wash.
3) Even if you're an arch-wannabe-Kissinger who jerks off to reading the Melian Dialogue every night, and you see the real game of AI governance efforts as infiltrating the corridors of power under cover so you and yours are well-placed to make a treacherous turn for good at an opportune time, the position TDM found themselves in was just such a time. If you're not willing to risk some social capital on a reasonable shot to stop the morally dubious CEO of a leading AI power deposing his own board and making it his own personal empire, when will you ever spend it?

**Summing up**
Perhaps not much time will be needed to see this amounts to cope or fan fiction. Maybe (e.g.) TDM tell their own story, and it sounds more like they were scared, impulsive, or stupid. Perhaps another turn in the tale means the supposed story of how things initially went down is just wrong, and the justificatory edifice I built upon it collapses.

But I don't think so. In terms of virtue, they were close to the only people who actually took the 'mission' seriously, could do something about it, and tried to do the right thing despite extraordinary pressure. In terms of performance, they were ambushed by the talismanic CEO, yet not only successfully defended themselves but got him kicked out; and despite having him, the staff, the investors, and a generally hostile media ranged against them, they ransomed part of the OpenAI board back in exchange for significant concessions. Doesn't seem like a bad week's work to me.