Applied Statistical Analysis with Missing Data, Aarhus 2016
Preliminary programme
Teachers: Morten Frydenberg and Henrik Støvring (Section of Biostatistics, Department of Public Health)
Last revision Henrik Støvring: November 28, 2016

The programme below will be continuously updated during the course with links to the lecture notes, exercises and datasets.

Day 1: Monday, December 12, 2016
The first day will focus on learning the basic principles and how to conduct simple MI analyses in Stata

Slides and datasets (Updated 14/12 2016): Missingdata_MI.pdf, Wooddata1.dta, ess2e03_scand.dta (version 13: Wooddata_13.dta)

Exercise 2 link:

Exercise 5 link:

9.00 - 10.00

Introduction – the ordinary analysis and its shortcomings, with a brief look ahead to sensitivity analyses and an example of a simple analysis based on multiple imputation.

10.00 - 12.00

Understanding the missing data mechanism (MCAR, MAR and MNAR) – what can we learn from analyzing the data?

12.00 - 13.00

Lunch break

13.00 - 14.00

Our first imputation model – understanding the concept of random data, prediction and filling in missing observations

14.00 - 15.00

Multiple Imputation by Chained Equations (MICE) – an iterative procedure for imputing missing values based on regression analyses (black box version)

15.00 - 16.00

MICE continued – how to analyze multiply imputed data
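
As a rough illustration of the workflow covered in this session, the sketch below imputes with chained equations and then pools the analysis results. It assumes Stata's documentation dataset mheart5 (fictional heart attack data), not the course datasets; the choice of 10 imputations, the seed and the analysis model are arbitrary assumptions made for the example.

```stata
* Sketch only: uses Stata's documentation dataset, not the course data
webuse mheart5, clear

* Declare the MI storage style and the variables that will be imputed
mi set mlong
mi register imputed age bmi

* Impute age and bmi by chained equations; rseed() makes the run reproducible
mi impute chained (regress) age bmi = attack smokes hsgrad female, add(10) rseed(12345)

* Fit the analysis model in each imputed dataset and pool with Rubin's rules
mi estimate: logit attack smokes age bmi hsgrad female
```

The pooled output from -mi estimate- reports combined coefficients and standard errors that reflect both within- and between-imputation variability.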

Day 2: Wednesday, December 14, 2016
The second day will focus on how to conduct more advanced MI analyses with Stata

Slides and datasets: Missingdata_MI_II.pdf, Wooddata2.dta, Wooddata2_13.dta, Wooddata2_11.dta

Google document for checking assumptions: Exercise 4 - assumptions checking

Do-file example with log:, day2.log

9.00 - 10.00

A closer look at how Stata handles imputed datasets – the -mi- commands in Stata; MI data styles; adding and extracting imputed datasets; examining missing and imputed values
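
The commands named above could be tried along the following lines. This is a minimal sketch that again assumes Stata's documentation dataset mheart5 rather than the course data; the number of imputations and the seed are arbitrary.

```stata
* Sketch only: documentation dataset, not the course data
webuse mheart5, clear
mi set mlong
mi register imputed age bmi
mi impute chained (regress) age bmi = attack smokes hsgrad female, add(5) rseed(2016)

* Overview of the MI setup: style, number of imputations, registered variables
mi describe

* Pattern of missing values in the original data (m = 0)
mi misstable summarize

* Summarize the imputed variables within imputation 1 only
mi xeq 1: summarize age bmi

* Switch to another storage style (here full long) without changing the content
mi convert flong, clear

* Pull out imputation 1 as an ordinary, non-mi dataset
mi extract 1, clear
```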

10.00 - 12.00

Tailor-made regression equations with MICE – how to exploit insights about the missing data mechanism and associations between variables when imputing missing values
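
One way to tailor the individual equations is through the omit() and include() suboptions of -mi impute chained-. The sketch below assumes Stata's documentation dataset mheart5, and the decision to drop hsgrad from the bmi equation is purely illustrative, not a recommendation for any real analysis.

```stata
* Sketch only: documentation dataset, not the course data
webuse mheart5, clear
mi set mlong
mi register imputed age bmi

* dryrun lists the conditional specifications without imputing anything,
* so the tailored equations can be checked first
mi impute chained (regress, omit(hsgrad)) bmi (regress) age ///
    = attack smokes hsgrad female, dryrun

* Same specification, now actually imputing
mi impute chained (regress, omit(hsgrad)) bmi (regress) age ///
    = attack smokes hsgrad female, add(10) rseed(12345)
```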

12.00 - 13.00

Lunch break

13.00 - 14.00

Advanced concepts in MI – passive variables, choosing a regression type for categorical imputed variables (logit, ologit, mlogit), debugging strategies when your -mi impute- will not run
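
A passive variable is derived from imputed variables and therefore has to be recomputed in every imputed dataset. The sketch below shows one way to do this with -mi passive-, assuming Stata's documentation dataset mheart5; the variable name obese and the cut-off of 30 are hypothetical choices made for the example. The noisily option is included because seeing the estimation output of each conditional model is often a useful first step when an imputation will not run.

```stata
* Sketch only: documentation dataset, not the course data
webuse mheart5, clear
mi set mlong
mi register imputed age bmi

* noisily shows the output of each conditional model during imputation
mi impute chained (regress) age bmi = attack smokes hsgrad female, ///
    add(10) rseed(12345) noisily

* Create the derived variable in every imputation; the if-condition keeps it
* missing wherever bmi is missing in the original data
mi passive: generate byte obese = (bmi >= 30) if bmi < .

* Passive variables can be used in the analysis model like any other variable
mi estimate: logit attack smokes age obese female
```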

14.00 - 16.00

Examples based on students’ own projects

Day 3: Friday November 29 2013
The third day will focus on the methodological background of the methods

9.00 - 12.00

Statistical inference – fundamental principles, likelihood, estimates, uncertainty and bias. Slides
Morning exercise

12.00 - 12.30

Lunch break

12.30 - 13.30

Sensitivity analysis and other methods for handling missing data problems. Slides

13.30 - 15.00

A case study. Slides

15.00 - 15.30

Final remarks and course evaluation

Link to homepage on missing data