![SRI Seminar Series: David Duvenaud [HYBRID EVENT]](https://cdn.stayhappening.com/events7/banners/58cb829254b9a1c93c3c73bbc4226329fc5cc42078b90473725a1aaf01522f0b-rimg-w1200-h601-dcc2d9ed-gmir.jpg?v=1759525881)
About this Event
Our weekly SRI Seminar Series welcomes David Duvenaud, associate professor in the Department of Computer Science at the University of Toronto and a Schwartz Reisman Chair in Technology and Society. A founding faculty member and Canada CIFAR AI Chair at the Vector Institute, Duvenaud is widely recognized for his contributions to AI safety, probabilistic deep learning, and generative modeling.
Duvenaud’s current research focuses on assessing dangerous capabilities in frontier AI models, mitigating catastrophic risks, and developing institutional frameworks for post-AGI futures. He recently served with the Alignment Science team at Anthropic and is a member of ISED Canada’s Safe and Secure AI Advisory Group.
This special in-person seminar is jointly presented with the Department of Computer Science as part of a lecture series that has welcomed top minds to U of T to talk about key issues in computer science for more than a decade.
Moderator: Sheila McIlraith, Department of Computer Science
Venue: Schwartz Reisman Innovation Campus, University of Toronto, Room W240 (second floor)
Entrance: 108 College St, Toronto, ON M5G 0C6
Talk title: "The big picture of LLM dangerous capability evals"
Abstract:
How can we avoid AI disasters? The plan so far is mostly to check the extent to which AIs could cause catastrophic harms based on tests in controlled conditions. However, there are obvious problems with this approach, both technical and due to its limited scope. I'll give an overview of the work my team at Anthropic did to evaluate risks due to models feigning incompetence, colluding, or sabotaging human decision-making. I'll also discuss the idea of "control" techniques, which use AIs to monitor and set traps to look for bad behavior in other AIs. Finally, I'll outline the main problems beyond the scope of these approaches, in particular that of robustly aligning our institutions to human interests.
Suggested reading:
- “Sabotage Evaluations for Frontier Models,” Anthropic Alignment Safety Team, October 18, 2024.
- “SHADE-Arena: Evaluating sabotage and monitoring in LLM agents,” Anthropic Alignment Safety Team, June 16, 2025.
- “Full-Stack Alignment: Co‑Aligning AI and Institutions with Thick Models of Value,” ICML 2025 Workshop MoFA, June 10, 2025.
About the speaker
David Duvenaud is an associate professor in the Departments of Computer Science and Statistical Sciences at the University of Toronto, where he holds a Schwartz Reisman Chair in Technology and Society. A leading voice in AI safety and artificial general intelligence (AGI) governance, Duvenaud currently focuses on evaluating dangerous capabilities in advanced AI systems, mitigating catastrophic risks from future models, and developing institutional designs for post-AGI futures. Duvenaud is a Canada CIFAR AI Chair and a founding faculty member at the Vector Institute, a member of Innovation, Science and Economic Development Canada's Safe and Secure AI Advisory Group, and recently completed an extended sabbatical with the Alignment Science team at Anthropic.
Duvenaud's early research helped shape the field of probabilistic deep learning, with contributions including neural ordinary differential equations, gradient-based hyperparameter optimization, and generative models for molecular design. He has received numerous honors, including the Sloan Research Fellowship, Ontario Early Researcher Award, and best paper awards at NeurIPS, ICML, and ICFP. Before joining the University of Toronto, Duvenaud was a postdoctoral fellow in the Harvard Intelligent Probabilistic Systems group and completed his PhD at the University of Cambridge under Carl Rasmussen and Zoubin Ghahramani.
About the SRI Seminar Series
The SRI Seminar Series brings together the Schwartz Reisman community and beyond for a robust exchange of ideas that advance scholarship at the intersection of technology and society. Each seminar is led by a leading or emerging scholar and features extensive discussion.
To register for all seminar events in the Winter 2025 season, please contact us directly at [email protected].
About the Department of Computer Science
This event is co-sponsored by the University of Toronto's Department of Computer Science and supported in part by a gift from the Webster Family Charitable Giving Foundation.
About the Schwartz Reisman Institute for Technology and Society
The Schwartz Reisman Institute for Technology and Society is a research institute at the University of Toronto that explores the ethical and societal implications of technology. Our mission is to deepen knowledge of technologies, societies, and humanity by integrating research across traditional boundaries to build human-centred solutions.