P173/5 Applied Statistical Analysis with Missing Data, Aarhus 2017
Preliminary programme
Teachers: Morten Frydenberg and Henrik Støvring (Section of Biostatistics, Department of Public Health)
Last revision Morten Frydenberg: December 1st, 2017

Note: Microsoft Edge might not download Stata files automatically. Use the option "Save target as".

Things you have to do before we meet:

The programme below will be continuously updated during the course with links to the lecture notes, exercises and datasets.

Day 1: Monday, December 11, 2017
The first day will focus on learning the basic principles and how to conduct simple MI analyses in Stata

Slides and datasets (Updated 10/12 2017): Missingdata_MI.pdf, Wooddata1.dta, ess2e03_scand.dta (version 13: Wooddata_13.dta)
The paper on wood dust: Jacobsen2008

Exercise 2 link: https://docs.google.com/spreadsheets/d/11GxmvTPvH2_xbj93lGqgrBuwh13rnTR3JKzuyrv_PLY/edit?usp=sharing

Exercise 5 link:


A tentative solution do-file for the exercises: solution_wednesday.do

9.00 - 10.00

Introduction – the ordinary analysis and its shortcomings, with a brief look ahead to sensitivity analyses and an example of a simple analysis based on multiple imputation.

10.00 - 12.00

Understanding the missing data mechanism (MCAR, MAR and MNAR) – what can we learn from analyzing the data?
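As a rough illustration of the three mechanisms (not part of the course material), the Python sketch below simulates an outcome y that depends on a covariate x and then deletes values of y under MCAR, MAR and MNAR. All names, probabilities and the seed are assumptions chosen for the example. Under MCAR the observed mean of y stays close to the full-data mean; under MAR and MNAR it is shifted, but only in the MAR case can the shift be explained by the observed x.

```python
import random
import statistics

rng = random.Random(2017)  # arbitrary seed, assumed for reproducibility
n = 20000

# Simulated data: y depends on a fully observed covariate x.
x = [rng.gauss(0, 1) for _ in range(n)]
y = [xi + rng.gauss(0, 1) for xi in x]

def observed_mean(values, p_missing):
    """Mean of the values that remain observed.

    p_missing[i] is the probability that values[i] is deleted.
    """
    kept = [v for v, p in zip(values, p_missing) if rng.random() >= p]
    return statistics.fmean(kept)

full_mean = statistics.fmean(y)

# MCAR: missingness is unrelated to x and y.
mcar_mean = observed_mean(y, [0.3] * n)

# MAR: missingness depends only on the observed covariate x.
mar_mean = observed_mean(y, [0.6 if xi > 0 else 0.1 for xi in x])

# MNAR: missingness depends on the (possibly unobserved) value of y itself.
mnar_mean = observed_mean(y, [0.6 if yi > 0 else 0.1 for yi in y])

print(f"full data: {full_mean:.3f}")
print(f"MCAR:      {mcar_mean:.3f}")
print(f"MAR:       {mar_mean:.3f}")
print(f"MNAR:      {mnar_mean:.3f}")
```

The observed data alone cannot distinguish MAR from MNAR here, which is one reason the question "what can we learn from analyzing the data?" has a limited answer.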

12.00 - 13.00

Lunch break

13.00 - 14.00

Our first imputation model – understanding the concept of random data, prediction and filling in missing observations
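The idea of "prediction plus random data" can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the course's Stata code: the data are simulated, the helper names `fit_line` and `impute_once` are hypothetical, and the regression parameters are treated as known once estimated (a proper imputation would also draw them from their sampling distribution).

```python
import random
import statistics

def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b, residual sd)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    resid = [y - (a + b * x) for x, y in zip(xs, ys)]
    sd = (sum(r * r for r in resid) / (len(xs) - 2)) ** 0.5
    return a, b, sd

def impute_once(xs, ys, rng):
    """Fill in missing y (None) with a prediction plus a random residual."""
    observed = [(x, y) for x, y in zip(xs, ys) if y is not None]
    a, b, sd = fit_line([x for x, _ in observed], [y for _, y in observed])
    return [
        y if y is not None else a + b * x + rng.gauss(0, sd)
        for x, y in zip(xs, ys)
    ]

rng = random.Random(2017)  # arbitrary seed
xs = [rng.gauss(50, 10) for _ in range(200)]
ys_full = [2 + 0.5 * x + rng.gauss(0, 3) for x in xs]
# Delete roughly 30% of the outcomes completely at random.
ys = [y if rng.random() >= 0.3 else None for y in ys_full]

completed = impute_once(xs, ys, rng)
print(sum(y is None for y in ys), "values imputed")
```

Adding the random residual, rather than filling in the bare prediction, keeps the spread of the imputed values comparable to that of the observed ones.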

14.00 - 15.00

Multiple Imputation by Chained Equations (MICE) – an iterative procedure for imputing missing values based on regression analyses (black box version)

15.00 - 16.00

MICE continued – how to analyze multiply imputed data
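Analyses of multiply imputed data are usually combined with Rubin's rules: the pooled estimate is the average of the per-imputation estimates, and the total variance adds the average within-imputation variance to the between-imputation variance inflated by (1 + 1/m). The Python sketch below shows the arithmetic only; the function name `pool_rubin` and the input numbers are made up for illustration.

```python
import statistics

def pool_rubin(estimates, variances):
    """Combine m per-imputation estimates and their variances.

    Returns (pooled estimate, total variance) using Rubin's rules:
    T = W + (1 + 1/m) * B.
    """
    m = len(estimates)
    q_bar = statistics.fmean(estimates)   # pooled point estimate
    w = statistics.fmean(variances)       # within-imputation variance
    b = statistics.variance(estimates)    # between-imputation variance (divisor m - 1)
    total = w + (1 + 1 / m) * b
    return q_bar, total

# Hypothetical results from m = 3 imputed datasets.
estimate, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(estimate, total_var ** 0.5)  # pooled estimate and its standard error
```

The between-imputation term is what carries the extra uncertainty caused by the missing values; analysing a single imputed dataset would leave it out.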

Day 2: Wednesday, December 13, 2017
The second day will focus on how to conduct more advanced MI analyses with Stata

Slides and datasets: Missingdata_MI_II.pdf, Wooddata2.dta, Wooddata2_13.dta

9.00 - 10.00

A closer look at how Stata handles imputed datasets – the -mi- commands in Stata; MI data types; adding and extracting imputed datasets; examining missing and imputed values

10.00 - 12.00

Tailor-made regression equations with MICE – how to exploit insights about the missing data mechanism and associations between variables when imputing missing values

12.00 - 13.00

Lunch break

13.00 - 16.00

Advanced concepts in MI – passive variables, choosing a regression type for categorical imputed variables (logit, ologit, mlogit), debugging strategies when your -mi impute- will not run

Day 3: Friday, December 15, 2017
The third day will focus on the methodological background of the methods

9.00 - 12.00

Statistical inference – fundamental principles, likelihood, estimates, uncertainty and bias. Slides
Morning exercise

12.00 - 12.30

Lunch break

12.30 - 13.30

Sensitivity analysis and other methods for handling missing data problems. Slides

13.30 - 15.00

A case study. Slides

15.00 - 15.30

Final remarks and course evaluation

Link to homepage on missing data