About this Event
The 2026 Vanderbilt Center for Quantitative Sciences (CQS) Summer Institute will be held July 20-24 and July 27-31 in Nashville, Tennessee. This year's lineup features four dynamic weeklong half-day courses (each course offers 15 hours of classroom instruction):
- Genomic Data Analysis: From Sequencing to Biological Insights
- Introduction to PyMC and Bayesian Modeling
- Terra-Based Cloud Computing*
- Introduction to Causal Inference
*This course is available only to Vanderbilt Health and Vanderbilt University employees and trainees.
Each course will be held in person at 2525 West End Avenue. Breakfast, lunch, and parking are included. Purchasing tickets via this Eventbrite form enrolls you in your selected course(s). You can enroll in multiple courses on one order.
Tuition (per course)
$950 - regular
$700 - Vanderbilt Health and Vanderbilt University faculty and staff
$450 - Vanderbilt Health and Vanderbilt University students
Note: A 20% Early Bird discount is automatically applied from May 1 through May 31.
Genomic Data Analysis: From Sequencing to Biological Insights
July 20-24, 9 a.m.-noon
Instructor
Qi Liu, PhD - professor of biostatistics and biomedical informatics, Vanderbilt University School of Medicine; associate director of bioinformatics and omics coordinating center director, CQS; technical director, VANGARD (Vanderbilt Technologies for Advanced Genomics Analysis and Research Design)
Overview
This course introduces the statistical and bioinformatic approaches used to analyze large-scale omics data in modern biomedical research. Students will explore key technologies and analytical frameworks for DNA sequencing, RNA sequencing, single-cell RNA sequencing, and spatial transcriptomics. Core topics include sequencing technologies, data preprocessing, quality control, read alignment, transcript quantification, dimensionality reduction, differential expression analysis, and functional enrichment.
Through a combination of lectures and hands-on practical sessions, students will learn how to transform raw sequencing data into biologically meaningful insights. Using real datasets, students will gain practical experience with commonly used genomics analysis workflows, with a particular focus on RNA-seq and single-cell RNA-seq. By the end of the course, students will understand the principles behind modern genomics pipelines and will be able to conduct basic analyses of high-throughput sequencing data.
Prerequisites
Course participants should have basic or entry-level knowledge of R programming, Linux/Unix commands, and biostatistics. See "Preparing for this course" on the course details page for information on software to download and materials to review before July 20.
Introduction to PyMC and Bayesian Modeling
July 20-24, 1-4 p.m.
Instructor
Chris Fonnesbeck, PhD - principal data scientist at PyMC Labs; adjoint associate professor of biostatistics, Vanderbilt University School of Medicine
Overview
This course provides a comprehensive introduction to Bayesian statistical modeling using PyMC. Participants will progress from foundational concepts through applied modeling techniques, building practical skills through hands-on coding exercises with real-world datasets. Each session combines conceptual instruction with interactive notebook-based exercises.
Prerequisites
- Working knowledge of Python programming
- Familiarity with basic statistical concepts (e.g., distributions, regression)
No prior Bayesian experience is required.
Please see the course README page on GitHub for what to download and set up before July 20. The page also features a detailed schedule for each day of the course.
Terra-Based Cloud Computing
July 27-31, 9 a.m.-noon
Instructor
Quanhu (Tiger) Sheng, PhD - associate professor of biostatistics, Vanderbilt University School of Medicine; associate director of advanced computing, CQS; deputy technical director, VANGARD
Teaching Assistant
Hua-Chang Chen, PhD - data scientist assistant II, Department of Biostatistics, Vanderbilt Health
Overview
This course provides an in-depth exploration of Terra-based cloud computing with a focus on genome-wide association study (GWAS) analysis using BioVU whole genome sequencing data. Students will use Visual Studio Code to navigate course materials and engage in hands-on exercises. The curriculum introduces key concepts and tools, including the Terra environment, Docker image creation, Workflow Description Language (WDL), cohort building with the BioVU synthetic derivative BigQuery database, and GWAS analysis using Regenie4. Through practical activities, participants will develop skills in cloud-based GWAS analysis, covering environment setup, software packaging, cohort construction, and data processing.
*This course is available only to Vanderbilt Health and Vanderbilt University employees and trainees. Space is limited to 20 participants.
Prerequisites
- Knowledge of genomics and GWAS fundamentals
- Familiarity with Python and Jupyter Notebook
- Basic proficiency in SQL
- Experience using Linux command-line interfaces
See the course details page for instructions on preparing for the course. Some action items must be completed by July 20 (one week before the start of the course).
Introduction to Causal Inference
July 27-31, 1-4 p.m.
Instructor
Andrew Spieker, PhD - associate professor of biostatistics, Vanderbilt University School of Medicine; board member, Society for Causal Inference
Overview
Many have likely heard that “correlation does not imply causation,” but that raises the question: what exactly is causation in the first place? This five-day short course will provide a framework for modern causal inference. The first day will cover the potential outcomes framework and the theory of directed acyclic graphs (DAGs). The second and third days will cover commonly used causal inference methods for cross-sectional data, including standardization, matching, inverse weighting, and instrumental variables. The fourth day will focus on methods for longitudinal data, including marginal structural models and g-computation. The fifth day will likely feature miscellaneous advanced topics, which may include sensitivity analyses, parametric identification, and Bayesian methods. Throughout the course, emphasis will be placed on graphical representation of variables through DAGs and software-based implementation with real-world data.
Upon completion of this course, participants should be able to:
- explain the potential outcomes framework.
- explain causal identifiability assumptions.
- use directed acyclic graphs to characterize relationships between variables.
- choose between and implement causal methods suitable for real-world cross-sectional and longitudinal data.
- assess covariate balance and positivity violations.
Prerequisites
- A basic understanding of biostatistical methods, including linear and logistic regression.
- Prior experience working in R will be helpful, although it is not strictly necessary.
Event Venue
2525 West End Avenue, Nashville, United States
USD 450.00 to USD 950.00