Preliminary plan
Notes, Data sets and programs
for
Postgraduate course
Linear regression models for continuous and binary data
Notes, lectures, and exercises are under revision, and the hyperlinks will be dead during this revision.

Edited February 17th, 2017
By Morten Frydenberg
morten@biostat.au.dk



A list of all data sets: All data

Where to store downloaded and homemade Stata programs (so-called ado files):
By default Stata assumes that the downloaded ado and help files are located in
C:\ado\personal (for your personal/homemade programs),
C:\ado\plus (for the ones you download from the net), and
C:\ado (ado files made many years ago).

You can check the settings with the command sysdir in Stata.

If you are running Stata via CITRIX, you might not be able to store downloaded Stata programs on the C drive.
The solution can be to store them on your personal drive (here H:). You can change the locations with the commands

sysdir set PERSONAL "H:\ado\personal\"
sysdir set PLUS     "H:\ado\plus\"
sysdir set OLDPLACE "H:\ado\"

These lines should either be in your profile.do file or at the beginning of every do file you use.

We use some extra Stata commands. You can install these in Stata by
net from http://www.biostat.au.dk/teaching/Ados
net install cis
net install regeq
net install gr42_7.pkg ,from(http://www.stata-journal.com/software/sj16-3)
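
After installing, it can be worth confirming that Stata actually finds the new commands. The following is only a minimal sketch. It assumes that the packages cis and regeq install commands with the same names (the package contents are not shown here) and that gr42_7 is the package providing qplot:

```stata
* Show the directories Stata searches for ado files
sysdir

* Report where each command is found; an error means it is not installed.
* ASSUMPTION: the command names cis and regeq match the package names.
which cis
which regeq
which qplot

* List the user-installed packages Stata knows about
ado dir
```

If which reports a path under the PLUS directory shown by sysdir, the installation went to the expected place.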

The command qplot can make several qq-plots at the same time. Note that you have to apply the option "trscale(invnorm(@))".
An example: to make qnorm plots of res for the two sexes: qplot res, trscale(invnorm(@)) by(sex)
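
A slightly fuller sketch of the same idea, using Stata's built-in auto data as a stand-in, since the course data sets are not reproduced here. The model, the variable name res, and the grouping variable foreign are chosen only for illustration:

```stata
* Load an example data set shipped with Stata (stand-in for the course data)
sysuse auto, clear

* Fit a simple linear regression and save the residuals as res
regress mpg weight
predict res, residuals

* Normal quantile plot of the residuals, one panel per group
qplot res, trscale(invnorm(@)) by(foreign)
```

If the residuals are approximately normal, the points in each panel should lie close to a straight line.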

Day 1: Monday November 6th 2017 Updated
9.15 - 10.30 Lecture: Simple linear regression - 1.
The model, the parameters, estimation and inference.

All Stata code used at the lecture.
Data set used at the lecture: lung
10.30 - 12.00 Exercises.
Data set used at the exercises: lung.
12.00 - 13.00 Lunch break
13.00 - 14.30 Lecture: Simple linear regression - 2.
Checking the model, residuals, leverage, diagnostic plots, transformation of variables.

Most of the Stata code used at the lecture.
Data sets used at the lecture: lung and gfrdata.
14.30 - 16.00 Exercises.
Data sets used at the exercises: gfrdata and glyco.
How to get from the log-model to the original scale.


Day 2: Wednesday November 8th 2017
Updated
9.00 - 9.15 Summarizing Monday's exercises.
9.15 - 10.15 Lecture: Multiple linear regression - 1
The model, the parameters, estimation and inference.
Checking the model.

All of the Stata code used at the lectures today.
Data set used at the lecture: fram200.
10.15 - 12.00 Exercises.
Data sets used at the exercises: lung and fram200.
12.00 - 12.30 Lunch break
12.30 - 14.00 Lecture: Multiple linear regression - 2
Stata 13
Working with categorical explanatory variables
Interaction/effect modification.
14.00 - 15.15 Exercises.
Data sets used at the exercises: lung and fram200.


Day 3: Friday November 10th 2017
Updated
9.00 - 12.00 Exercises.
Data set used at the exercises: serumchol.
Some answers to the exercises
12.00 - 12.30 Lunch break
12.30 - 13.30 Exercises continued.
13.30 - 15.15 Lecture: Linear regression, collinearity, splines and extensions
Collinearity
Restricted cubic splines
Clustered data

Some of the Stata code used at the lecture.
Data sets used at the lecture: serumchol194, Framingham, and FEV.

Homework Updated
The homework for the last week is to go through the lectures on logistic regression from day 7 of the Basic Statistics course: Day7.pdf (Day7.do).
After that, you should complete the exercises in Homework.pdf using the data set postterm and the do-file HomeworkPartB.do.


Day 4: Monday November 20th 2017
Updated
9.00 - 11.00 Lecture: Regression model for binary data.
The logistic regression model in general
Most of the Stata code used at the lectures.
Data set used at the lecture: obese.
11.00 - 12.00 Morning exercises
Data set used at the exercises: obese.
12.00 - 12.30 Lunch break.
12.30 - 14.00 The lecture continued.
14.00 - 15.15 Afternoon exercises
Data set used at the exercises: obese.


Day 5: Wednesday November 22nd 2017
Updated
9.00 - 10.00 Exercises: Monday afternoon continued
10.00 - 12.00 Lecture: Model building in regression models
Model building: things to consider
Confounding and adjustment
Model selection and its consequences
Over-fitting
A strategy
12.00 - 12.30 Lunch break
12.30 - 15.15 Exercises
Data set used at the exercises: coffee.


Day 6: Friday November 24th 2017 Updated

9.00 - 12.00 Working with Wednesday's exercise
12.00 - 12.30 Lunch break
12.30 - 13.30 Discussing Wednesday's exercise
13.30 - 15.00 Lecture: Working with logistic regression models and extensions.
Diagnostics for logistic regression
Conditional logistic regression
Missing data
Binary data with several random components

Some of the Stata code used at the lectures.
15.00 - 15.15 Course evaluation