Preliminary plan
Notes, Data sets and programs
for
Postgraduate course
in
Linear and logistic regression

Edited May 14th 2004
By Morten Frydenberg
morten@biostat.au.dk



Thursday April 29 2004
9.15 - 10.30 Simple linear regression 1.
The model, the parameters, estimation and inference.

All STATA code used at the lecture.
Data set used at the lecture STATA: lung.dta. SPSS: lung.sav.
10.30 - 12.00 Exercises.
The lung data. STATA: lung.dta. SPSS: lung.sav.
12.00 - 13.00 Lunch break
13.00 - 14.30 Simple linear regression 2.
Checking the model, residuals, leverage, diagnostic plots, transformation of variables.

Most of the STATA code used at the lecture.
Data set used at the lecture: STATA: lung.dta and gfrdata.dta. SPSS: lung.sav and gfrdata.sav.
14.30 - 16.00 Exercises.
The gfr data. STATA: gfrdata.dta. SPSS: gfrdata.sav.
The glyco data. STATA: glyco.dta. SPSS: glyco.sav.


Friday April 30 2004

9.15 - 9.30 Summarizing yesterday's exercises.
9.30 - 10.30 Multiple linear regression 1.
The model, the parameters, estimation and inference.
Checking the model.

Data set used at the lecture STATA: fram200.dta. SPSS: fram200.sav.
10.30 - 12.00 Exercises.
Data. STATA: lung.dta and fram200.dta. SPSS: lung.sav and fram200.sav.
12.00 - 13.00 Lunch break
13.00 - 14.30 Multiple linear regression 2.
Working with categorical explanatory variables.
Interaction/effect modification.

All of the STATA code used at the lecture.
Data set used at the lecture STATA: fram200.dta. SPSS: fram200.sav.
14.30 - 16.00 Exercises.
Data. STATA: lung.dta and fram200.dta. SPSS: lung.sav and fram200.sav.

Homework
The homework, with data sets. STATA: case_control.dta and serumchol.dta. SPSS: case_control.sav and serumchol.sav.



Thursday May 13 2004

9.15 - 10.00 Summarizing the homework exercises.
10.15 - 12.00 Logistic regression
Odds ratios via logistic regression
Continuous independent variables
Categorical independent variables
Interactions
Wald and likelihood ratio tests
The logistic regression model in general

Most of the STATA code used at the lecture.
Data set used at the lecture STATA: obese.dta. SPSS: obese.sav.
12.00 - 13.00 Lunch break
13.00 - 14.30 The lecture continued
14.30 - 16.00 Exercises.
The prostate cancer data set. STATA: prossub.dta. SPSS: prossub.sav.



Friday May 14 2004

8.30 - 9.15 Exercises, continued from Thursday afternoon.
9.30 - 11.15 Working with linear and logistic regression models
Diagnostics for logistic regression
Tests and estimation after the model has been fitted in STATA
Collinearity
Things to consider when specifying a model
Model selection and its consequences

All STATA code used at the lecture.
Data sets used at the lecture STATA: obese.dta and serumchol194.dta.
SPSS: obese.sav and serumchol194.sav.
11.15 - 11.45 Lunch break
11.45 - 13.45 Extensions
Conditional logistic regression
Models for relative risk or risk differences
Clustered data
Non-linear regression

All STATA code used at the lecture.
Data set used at the lecture STATA: obese.dta.
13.45 - 14.45 Course evaluation

The examination
The assignment and the data STATA: examF2004.dta. SPSS: examF2004.sav.

Participants who are taking the course as the final part of the mandatory Ph.D. course in biostatistics
are required to hand in a solution to an exercise that is given at the end of the course.
The exercise takes the form of a statistical analysis of a data set, and each participant must return an individual solution in the form of a short paper before

Monday June 7 2004 at 12 a.m. at the Department of Biostatistics.
To pass the examination the paper should give a satisfactory analysis of the data and answer the questions posed in the exercise.
Note that the exercise will also evaluate the Basic course in Biostatistics.