About this Event
Join us at the Milano vLLM Meetup!
Excited about large language models? Come hang out in person with fellow practitioners in Milano. It’s a great chance to chat, learn, and exchange cool ideas about vLLM tech in a relaxed setting. Whether you're a pro or just curious, everyone's welcome!
What to Expect
- Deep technical sessions from vLLM maintainers, committers, and teams using vLLM at scale
- Live demos focused on real workflows
- Great networking with food and drinks
Who Should Attend
- vLLM users and contributors
- ML and infra engineers working on inference and serving
- Platform teams running GenAI in production
- Anyone curious about efficient inference across local, cloud, and Kubernetes
Don't miss out on this awesome meetup – see you there!
Agenda
🕑: 05:00 PM - 05:30 PM
Doors Open, Check-In
🕑: 05:30 PM - 05:40 PM
Welcome and Opening Remarks
Host: Andrea Ferretti, AI Researcher, Bending Spoons
🕑: 05:40 PM - 06:00 PM
Intro to vLLM and Project Update
Host: Nicolò Lucchesi, vLLM Maintainer, Red Hat
🕑: 06:00 PM - 06:30 PM
Powering Evernote AI features with vLLM
Host: Ludovico Papavassiliou, AI Engineer, Bending Spoons
🕑: 06:30 PM - 06:50 PM
vLLM & Mistral AI
Host: AI Engineers from Mistral AI
🕑: 06:50 PM - 07:10 PM
Distributed large-scale serving with llm-d
Host: AI Engineers from Red Hat AI
🕑: 07:10 PM - 07:20 PM
Coffee & Tea Break
🕑: 07:20 PM - 07:40 PM
The New Open-Weight Frontier: Hybrid Models in vLLM
Host: Thomas Parnell, vLLM Maintainer, IBM
🕑: 07:40 PM - 08:00 PM
Model Compression for Fast Inference on vLLM
Host: AI Engineers from Nvidia
🕑: 08:00 PM - 09:00 PM
Networking, Food and Drinks
Event Venue & Nearby Stays
Bending Spoons, 10 Via Nino Bonnet, Milano, Italy
EUR 0.00