Name: Introduction to Analysis of Public Survey Data
Start: 2025-10-29T13:30:00-04:00
End: 2025-10-29T16:30:00-04:00
Location: Parkway Central Library (ROOM 407)

Introduction to Analysis of Public Survey Data

This hands-on workshop mixes PowerPoint-free discussion of survey methodology, open-access microdata, and an introduction to R code. Bring your laptop!

About this Event

This short course examines how to conduct original research with free and public survey data.

This half-day session equips attendees with survey analysis skills for quantitative research, economics, public policy, demography, and many other fields reliant on social statistics to better understand individuals and businesses.

Please feel free to browse the accompanying open-source reference textbook at http://asdfree.com/

Learning Outcomes

Review of why governments fund complex sample surveys rather than simple random samples.
R programming introduction with hands-on coding and scripting - beginners are welcome!
Replication of published statistics with free & publicly available one-row-per-person microdata.

Agenda

🕑: 01:30 PM - 02:00 PM
Lecture to introduce complex sampling

Info: Fundamentally, a complex sample survey aims to save money on interviewer transportation costs by sampling geographies first and then people (or businesses or structures) within those geographies. Instead of sampling individuals nationwide, a survey administrator samples twenty towns and cities across the country and then samples multiple individuals within each of those areas. Before any sampling occurs, everyone nationwide can still have the same probability of being sampled. But once the first stage of sampling occurs - when geographies are sampled - the residents of the sampled geographies have a much higher probability of inclusion, and everyone else's inclusion probability goes to zero. Now, instead of sending in-person survey teams to ten thousand scattered interview locations across the country, the administrator only needs to send them to twenty towns and cities. Suddenly, the interviewer transportation budget looks much nicer.
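
As a rough illustration of the two-stage idea described above, here is a minimal Python sketch (the workshop itself uses R, and this is not taken from its materials). It assumes a hypothetical country of 1,000 equal-sized towns with 500 residents each, simple random sampling at both stages, and made-up sample sizes; every constant and variable name is an assumption chosen to keep the arithmetic easy to follow.

```python
import random
from fractions import Fraction

# Hypothetical population: every number below is an assumption for illustration.
N_TOWNS = 1000            # towns in the country
RESIDENTS_PER_TOWN = 500  # equal-sized towns keep the arithmetic simple
TOWNS_SAMPLED = 20        # stage 1: towns drawn at random
RESIDENTS_SAMPLED = 50    # stage 2: residents drawn within each sampled town

# Inclusion probabilities under simple random sampling at each stage.
p_town = Fraction(TOWNS_SAMPLED, N_TOWNS)                    # a given town is drawn
p_within = Fraction(RESIDENTS_SAMPLED, RESIDENTS_PER_TOWN)   # a resident, given the town was drawn
p_overall = p_town * p_within                                # the same for everyone before stage 1
design_weight = 1 / p_overall                                # people represented by each respondent

rng = random.Random(2025)  # fixed seed so the sketch is repeatable

# Two-stage sample: draw towns first, then residents inside each drawn town.
sampled_towns = rng.sample(range(N_TOWNS), TOWNS_SAMPLED)
two_stage = [
    (town, person)
    for town in sampled_towns
    for person in rng.sample(range(RESIDENTS_PER_TOWN), RESIDENTS_SAMPLED)
]

# Simple random sample of the same size, drawn from the whole country at once.
population_size = N_TOWNS * RESIDENTS_PER_TOWN
srs_ids = rng.sample(range(population_size), len(two_stage))
srs_towns = {person_id // RESIDENTS_PER_TOWN for person_id in srs_ids}

print("overall inclusion probability:", p_overall)
print("probability once your town is drawn:", p_within)
print("design weight per respondent:", design_weight)
print("towns to visit, two-stage design:", len(set(sampled_towns)))
print("towns to visit, simple random sample:", len(srs_towns))
```

Under these assumptions both designs interview the same number of people, but the simple random sample scatters them across far more towns, which is the travel cost the complex design avoids. The design weight (the inverse of the overall inclusion probability) is what lets the clustered sample still add up to the full population.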

🕑: 02:00 PM - 02:30 PM
Discussion of experience with datasets

Info: Course participants will discuss any publication history or other experience with publicly available datasets, and which research questions they have answered (or would like to answer) with them.

🕑: 02:30 PM - 03:00 PM
Review of one http://asdfree.com/ entry

Info: Each dataset presented on http://asdfree.com/ includes three major components: 1. Download automation or data acquisition; 2. Annotated analysis examples; 3. Replication of published estimates to prove correct methodology. We will walk through each of these components for one dataset, with participants testing out the same R code on their own laptops. Because every entry follows a very similar structure, participants should quickly see that once they can get started with any one of these entries, it's straightforward to apply the same knowledge to *all* of them.

🕑: 03:00 PM - 03:00 PM
BREAK FOR SNACKS!

🕑: 03:00 PM - 03:30 PM
Discussion of research goals across surveys

Info: As an example, if a participant mentions interest in health insurance coverage in the United States, we might review the strengths and weaknesses of different surveys on the topic. SIPP interviews individuals every year for multiple years, and asks about every single month of coverage. CPS interviews individuals with the full ASEC one time, asking about health insurance at the point of interview and also monthly through the prior year. CPS also asks many labor force questions, and is representative at the state level. NHIS asks about health insurance only at the single interview, but also asks many health status and health behavior questions. BRFSS asks only a single question about health insurance, but it has a large sample size, even at the state level.

🕑: 03:30 PM - 04:30 PM
Hands-on session

Info: Participants will select any dataset from the list of available datasets on http://asdfree.com/ and follow that single entry from start to finish.