Preliminary plan
Notes, Data sets and programs
for
Postgraduate course
in
Linear and logistic regression
The course in September will NOT use the added features in the new version 11 of Stata.

Edited September 24, 2009
By Morten Frydenberg
morten@biostat.au.dk



If you are using your own laptop at the exercises, then download the data sets and do-file.
A list of all data sets: All data
If you want the data in SAS or R go here.

Day 1: Monday September 14th 2009
9.15 - 10.30 Lecture: Simple linear regression - 1.
The model, the parameters, estimation and inference.

All Stata code used at the lecture.
Data set used at the lecture: lung
10.30 - 12.00 Exercises.
Data set used at the exercises: lung.
12.00 - 13.00 Lunch break
13.00 - 14.30 Lecture: Simple linear regression - 2.
Checking the model, residuals, leverage, diagnostic plots, transformation of variables.

Most of the Stata code used at the lecture.
Data set used at the lecture: lung and gfrdata.
14.30 - 16.00 Exercises.
Data set used at the exercises: gfrdata and glyco.


Day 2: Wednesday September 16th 2009

9.15 - 9.30 Summarizing Monday's exercises.
9.30 - 10.30 Lecture: Multiple linear regression - 1 (modified version of the plot part).
The model, the parameters, estimation and inference.
Checking the model.

All of the Stata code used at the lectures today.
Stata code used for plotting. Modified version.
Data set used at the lecture: fram200.
10.30 - 12.00 Exercises.
Data set used at the exercises: lung and fram200.
12.00 - 12.30 Lunch break
12.30 - 14.00 Lecture: Multiple linear regression - 2
Working with categorical explanatory variables
Interaction/effect modification.
14.00 - 15.30 Exercises.
Data set used at the exercises: lung and fram200.


Day 3: Friday September 18th 2009

9.15 - 12.00 Exercises.
Data set used at the exercises: serumchol.
12.00 - 12.30 Lunch break
12.30 - 13.30 Exercises continued.
13.30 - 15.30 Lecture: Linear regression, collinearity, splines and extensions
Collinearity
Restricted cubic splines
Random coefficient models
Clustered data

Some of the Stata code used at the lecture.
Data set used at the lecture: serumchol194, Framingham, FEV and greymatter.

Homework
The homework for the last week is to go through the lecture on logistic regression (day 7) from the Basic Statistics course: Day7.pdf (Day7.do).
After that you should complete the exercises in Exercise7.pdf with the data sets postterm and tatsoib.
SPSS users should substitute exercise 7.1 with SPSSday7_1.pdf.


Day 4: Monday September 28th 2009

9.15 - 10.00 Discussing the homework.
10.15 - 12.00 Lecture: Logistic regression.
Odds ratios via logistic regression
Continuous independent variables
Categorical independent variables
Interactions
Wald and likelihood ratio tests
The logistic regression model in general

Most of the Stata code used at the lectures.
Data set used at the lecture: obese and case_control.
12.00 - 12.30 Lunch break.
12.30 - 14.00 The lecture continued.
14.00 - 15.30 Exercises (updated September 24, 2009).
Data set used at the exercises: obese.
A short sketch of a solution: plots and log


Day 5: Wednesday September 30th 2009

9.15 - 10.00 Exercises: Monday afternoon continued.
10.15 - 12.00 Lecture: Model building in regression models
Model building: things to consider
Confounding and adjustment
Model selection and its consequences
Over-fitting
A strategy
12.00 - 12.30 Lunch break
12.30 - 15.30 Exercises (updated September 29, 2009).
Data set used at the exercises: coffee.


Day 6: Friday October 2nd 2009

9.15 - 10.00 Discussing Wednesday's exercise
10.15 - 12.00 Lecture: Working with logistic regression models and extensions (updated October 12th, 2009).
Diagnostics for logistic regression
ROC curves and the area under the ROC-curve
Conditional logistic regression
Models for relative risk or risk differences
Clustered binary data

Some of the Stata code used at the lectures.
Data set used at the lecture: obese and euroscore.
12.00 - 12.30 Lunch break
12.30 - 14.45 Case studies
14.45 - 15.30 Course evaluation


The assignment