About this Event
This event takes place at G.03, Informatics Forum.
Register to attend by noon on 16 April.
Explainable AI for NLP
Machine learning models now underpin NLP systems, from sentiment classification to conversational agents. This lecture maps what current AI enables in NLP and makes explicit what can go wrong as model capacity (and opacity) increases, covering failure modes such as spurious correlations and poor generalization, hallucinations in generative settings, and strategic underperformance (sandbagging). We then survey the main tools for diagnosing and mitigating these risks: explainers (e.g., attribution- and counterfactual-based analyses), deployment guardrails (monitoring and auditing practices), and advanced mechanistic interpretability techniques (disentangled representations, ablations, and steering). The lecture closes by highlighting open problems, especially the limits of current guarantees and the remaining gaps for robust, scalable oversight across both discriminative and generative NLP.
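For a flavour of the attribution-based explainers mentioned above, here is a minimal sketch of gradient-times-input saliency on a toy classifier; the model, vocabulary size, and token IDs are illustrative placeholders, not material from the lecture itself.

```python
# Minimal sketch of gradient x input saliency (a common attribution-based
# explainer). The tiny model and token IDs below are hypothetical stand-ins.
import torch
import torch.nn as nn

torch.manual_seed(0)

VOCAB, DIM, CLASSES = 100, 16, 2

# Toy sentiment classifier: mean-pooled token embeddings into a linear head.
embed = nn.Embedding(VOCAB, DIM)
head = nn.Linear(DIM, CLASSES)

tokens = torch.tensor([[3, 17, 42, 8]])   # one 4-token "sentence"
emb = embed(tokens)                        # shape (1, 4, DIM)
emb.retain_grad()                          # keep gradients on the embeddings

logits = head(emb.mean(dim=1))             # shape (1, CLASSES)
logits[0, logits.argmax()].backward()      # gradient of the predicted class

# Gradient x input: per-token contribution to the predicted logit.
saliency = (emb.grad * emb).sum(dim=-1).squeeze(0).detach()
for tok, score in zip(tokens[0].tolist(), saliency.tolist()):
    print(f"token {tok:3d}  attribution {score:+.4f}")
```

Tokens with large positive scores push the model toward its prediction; large negative scores push against it. The same recipe scales to real transformer classifiers by attributing over their input embeddings.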
About the Speaker
Alan Perotti is an Italian researcher based in Turin, currently working at the Intesa Sanpaolo AI Research lab. He holds a PhD focused on neuro-symbolic integration and specializes in Explainable AI and Mechanistic Interpretability. His work spans both fundamental research and applied projects, including industrial collaborations and European research initiatives. He is also active as a lecturer and science communicator, regularly engaging with academic institutions and the broader public on topics related to artificial intelligence.
Event Venue
G.03 in Informatics Forum, The University of Edinburgh, 10 Crichton Street, Edinburgh, United Kingdom
Admission: Free (GBP 0.00)