
About this Event
From memes that act as cultural shibboleths to deepfakes that erode political trust, multimodal misinformation is reshaping how we interpret and engage with online content. This workshop brings together scholars and researchers to explore the symbolic, technical, and societal dimensions of this growing challenge.
Speakers will examine memes as powerful in-group markers (Dr. Marko Skorić), the global political implications of deepfakes (Dr. Saifuddin Ahmed and Ms. Ruolan Deng), and new advances in multimodal detection systems such as TRUST-VL (Dr. Peng Qi). The discussion also highlights how cultural logics shape the reception of online visuals (Dr. Rituparna Banerjee), and how innovative methods in augmentation and prompt design can strengthen multimodal hate detection (Mr. Sahajpreet Singh).
By engaging with diverse perspectives, this workshop seeks to deepen our understanding of multimodal misinformation and spark dialogue on building more resilient and trustworthy digital ecosystems.
Agenda
🕑: 02:00 PM - 02:05 PM
Welcome
Host: Asst. Prof. Kokil Jaidka (NUS)
🕑: 02:05 PM - 02:20 PM
Memes as Shibboleths
Host: A/P Marko Skorić (CityU)
Info: This talk examines internet memes as a powerful mode of communication rooted in imitation, storytelling, and cultural transmission. Viewed as modern-day shibboleths, memes mark in-group identity and exclude outsiders, granting communities the power to define norms of belonging. In polarised digital spaces, such “memetic shibboleths” act as political tools that mobilise supporters, reinforce stereotypes, ridicule opponents, distort facts, and strengthen group solidarity—all while resisting conventional fact-checking and regulatory measures. By highlighting vivid examples, the talk illustrates how memes circumvent content filtering and censorship, and underscores their persuasive potential in shaping socio-political discourse.
🕑: 02:20 PM - 02:35 PM
Multimodal Misinformation: Deepfakes and the Erosion of Political Trust
Host: A/P Saifuddin Ahmed (NTU)
Info: This talk examines the growing challenge of deepfakes in political contexts, where artificially generated images, audio, or video can convincingly depict events or statements that never occurred. Our research shows that exposure to deepfakes—such as a fabricated bridge collapse—significantly reduces trust in government, while awareness of their manipulative potential undermines confidence in both media and electoral processes. By illustrating deepfakes’ direct and indirect effects on political attitudes, we highlight how their strategic use can erode institutional trust and raise doubts about democratic integrity, with implications extending well beyond the United States to a global scale.
🕑: 02:35 PM - 02:50 PM
Discussion 1
🕑: 02:50 PM - 03:10 PM
Tea Break
🕑: 03:10 PM - 03:20 PM
TRUST-VL: An Explainable News Assistant for General Multimodal Misinformation
Host: Dr. Peng Qi (NUS CTIC)
Info: This talk presents TRUST-VL, a unified and explainable vision-language model designed to detect multimodal misinformation spanning text, images, and cross-modal manipulations. Unlike existing approaches that focus on single distortion types, TRUST-VL leverages shared reasoning capabilities and task-specific features to improve generalisation. Supported by TRUST-Instruct, a large-scale instruction dataset of nearly 200K samples aligned with human fact-checking workflows, the model achieves state-of-the-art performance while offering interpretability. Through extensive experiments, we show how TRUST-VL advances both robustness and transparency in combating misinformation amplified by generative AI.
🕑: 03:20 PM - 03:30 PM
Seeing Is Not Believing: Rethinking Online Visuals through Cultural Logics
Host: Dr. Rituparna Banerjee (NUS CTIC)
Info: This talk rethinks multimodal misinformation by emphasising how cultural logics and national myths shape the circulation and reception of online visuals. Drawing on the visual politics of women politicians in West Bengal, India, the study highlights how tropes of purity, sacrifice, and respectability inform digital self-presentation and political legitimacy. Such findings challenge universalist models of visual communication by exposing symbolic and culture-specific codes often overlooked in technical frameworks. By foregrounding the role of cultural resonance alongside technological affordances, this analysis underscores the need for culture-sensitive approaches in strengthening digital resilience.
🕑: 03:30 PM - 03:40 PM
Labels or Input? Rethinking Augmentation in Multimodal Hate Detection
Host: Mr. Sahajpreet Singh (NUS SoC)
Info: This talk explores new methods for detecting hateful memes, where harmful intent emerges subtly through text-image interplay. We propose a dual approach: optimising prompt structures and scaling supervision to enhance model robustness, and introducing a multimodal augmentation pipeline that generates counterfactually neutral memes to reduce spurious correlations. Experiments demonstrate that both prompt design and data composition critically influence model performance, with structured prompts improving even small models and InternVL2 achieving the best results. Our findings highlight the importance of combining prompt optimisation and synthetic data to build more trustworthy, context-sensitive hate detection systems.
🕑: 03:40 PM - 04:15 PM
Discussion 2
Event Venue
Conference Room, #01-05, Innovation 4.0, 3 Research Link, Singapore
USD 0.00