If you're attending in person, please register at the Meetup event as well so Timothy has accurate numbers: https://www.meetup.com/sydney-ai/events/301294322/

We'll also have a virtual event: https://us02web.zoom.us/j/87089646620?pwd=9p71Ue6vhWNcVVdBPHrkMMCoIrVny6.1
“The future is going to be good for the AI regardless, it would be nice if it were good for humans as well” - Ilya Sutskever, OpenAI co-founder and former Chief Scientist
“I used to be annoyed at being the villain of the EAs (effective altruists) until I met their heroes” - Sam Altman, CEO of OpenAI
“As we told the investigators, deception, manipulation and resistance to thorough oversight should be unacceptable” - Helen Toner and Tasha McCauley, former OpenAI board members with links to the EA movement
On the 15th of May, the two co-leads of OpenAI’s Superalignment team, Jan Leike and Ilya Sutskever, resigned. While Ilya’s exit was diplomatic, Jan curtly tweeted “I resigned”, then created a media storm three days later by complaining that “safety culture and processes” at OpenAI had “taken a backseat to shiny products”.
Six months earlier, Ilya and other board members had stunned the world by firing the (now reinstated) CEO Sam Altman, with media stories identifying the OpenAI app store and concerns about excessive commercialization as key points of contention between Sam and the board.
Superalignment, a project to “steer and control AI systems much smarter than us”, had been a pet project of Ilya, who had become increasingly persuaded that AGI was coming soon. A year earlier, OpenAI had announced the project along with a commitment to allocate 20% of the compute it had secured to date, making it the best-funded AI safety research project up to that point.
This talk will delve into the research conducted by the team to ensure that humanity remains in control of the future, as well as their views on the most promising directions for investigation. It will discuss where the idea of AI as an existential risk came from and the role this played in the founding of both DeepMind and OpenAI. It will also cover how safety research led to the development of reinforcement learning from human feedback (RLHF), the key technical innovation behind ChatGPT. Finally, it will paint a picture of the tumultuous back-and-forth between the “safety” and “accelerationist” factions, including much of the safety team quitting to start Anthropic, Sam Altman’s firing and reinstatement, the formation and dissolution of the Superalignment team, and the controversy over OpenAI’s coercive non-disparagement agreements.
The main talk will be accessible to people without a strong technical background, but I’d be excited to dive deeper into the technical details during the Q&A or after the talk.
Event Venue
622-632 Harris St, Ultimo NSW 2007, Australia