About this Event
About this workshop
Just as humans have multiple senses to perceive the world around them, computers have a variety of sensors to help perceive the human world. In the health industry, computed tomography (CT) scans provide a 3D representation used to detect potentially dangerous abnormalities. In the robotics industry, lidars are used to help robots see depth and navigate the complex topology around them. In this course, learners will develop neural-network-based multimodal models.
In this advanced training, we explore how to design and orchestrate intelligent AI agents that can understand many different data types by applying a range of fusion techniques.
🔹 What participants will learn:
• Early & Late Fusion of camera and LiDAR data
• Contrastive pretraining and multimodal embeddings
• Building and querying vector databases
• Converting LLMs into Vision-Language Models (VLMs)
• Processing PDFs with OCR pipelines
• Video understanding using NVIDIA Cosmos Nemotron
• Agent orchestration with NVIDIA AI Blueprints
This workshop provides practical experience in building production-ready multimodal AI agents using NVIDIA’s cutting-edge ecosystem.
🎯 Ideal for:
AI researchers, engineers, data scientists, graduate students, and industry professionals interested in agentic AI, multimodal learning, and next-generation AI systems.
Workshop attendees will receive an NVIDIA Deep Learning Institute (DLI) certificate upon completing an assessment test at the end of the workshop.
Seats are limited. Join us to explore the future of multimodal and agentic AI.
Event Venue
50 Quai Charles de Gaulle, Lyon, France
EUR 0.00






