******************************************************************************* **** **** **** Stata-program (do-file) associated with the lectures on day two. **** **** **** ******************************************************************************* * Specifying the path to where the data are, for example cd "C:\ANOVA\" ****** Body weights of 27 rats in 3 treatment groups ****** * Reading in the data (the data are in the wide format). use "rat.dta", clear * Plotting the individual curves. * Here we need data in the long format. reshape long Weight, i(Rat) j(Week) scatter Weight Week, connect(L) sort(Week) by(Rat) scatter Weight Week, connect(L) sort(Rat Week) by(Group) * Plotting the mean curves. egen groupmean = mean(Weight), by(Group Week) sort Group Week Rat twoway (connected groupmean Week if Group==1, msymbol(Oh)) /// (connected groupmean Week if Group==2, msymbol(O)) /// (connected groupmean Week if Group==3, msymbol(+)), /// legend(label(1 "Control") label(2 "Thyroxin") label(3 "Thiouracil")) /// ytitle("Mean weight (gram)") * Analysing the data using the multivariate repeated measurments ANOVA * Here we need the data in the wide format drop groupmean reshape wide Weight, i(Rat) j(Week) generate W10 = Weight1 - Weight0 generate W21 = Weight2 - Weight1 generate W32 = Weight3 - Weight2 generate W43 = Weight4 - Weight3 * Testing the hypothesis of parallel mean curves in group 1 and 3 (Test 1) mvtest means W10 W21 W32 W43 if Group==1 | Group==3, by(Group) * Testing the hypothesis of equal standard deviations and correlations mvtest cov W10 W21 W32 W43 if Group==1 | Group==3, by(Group) * In order to access the standardized residuals we need to use mixed. * The data need to be on long format to use mixed reshape long Weight, i(Rat) j(Week) xi: mixed Weight bn.Group#bn.Week || Rat: i.Week if Group==1 | Group==3, /// cov(un) reml predict res, rstandard qnorm res * Analysing group 1 and 2 keep Rat Week Weight Group reshape wide Weight, i(Rat) j(Week) generate W10 = Weight1 - Weight0 generate W21 = Weight2 - Weight1 generate W32 = Weight3 - Weight2 generate W43 = Weight4 - Weight3 * Testing the hypothesis of parallel mean curves in group 1 and 2 (Test 1) mvtest means W10 W21 W32 W43 if Group==1 | Group==2, by(Group) * Testing the hypothesis of no difference between group 1 and 2 from * parallel mean curves (Test 2) generate ave = (Weight0+Weight1+Weight2+Weight3+Weight4)/5 sdtest ave if Group==1 | Group==2, by(Group) ttest ave if Group==1 | Group==2, by(Group) * Testing the hypothesis of no difference between group 1 and 2 from *arbitrary mean curves (Test X) mvtest means Weight0 Weight1 Weight2 Weight3 Weight4 if Group==1 | Group==2, /// by(Group) ****** VO2 for CHF patients and healthy during exercise ****** * Reading in the data (the data are in the long format). use "vo2.dta", clear * Plotting the individual curves. scatter vo2 workload, connect(L) sort(workload) by(subject) scatter vo2 workload, connect(L) sort(subject workload) by(group) * Plotting the mean curves. egen groupmean = mean(vo2), by(group workload) sort group workload subject twoway (connected groupmean workload if group==1, msymbol(Oh)) /// (connected groupmean workload if group==2, msymbol(O)), /// legend(label(1 "Healthy") label(2 "CHF")) /// ytitle("Mean VO2 max (ml/kg/min)") * Analysing the data using the multivariate repeated measurments ANOVA * Here we need the data in the wide format drop groupmean reshape wide vo2, i(subject) j(workload) generate v30_0 = vo230 - vo20 generate v60_30 = vo260 - vo230 generate v90_60 = vo290 - vo260 generate v120_90 = vo2120 - vo290 mvtest means v30_0 v60_30 v90_60 v120_90, by(group) * Calculating the observed standard deviations and correlations bysort group: pwcorr vo20 vo230 vo260 vo290 vo2120 tabstat vo20 vo230 vo260 vo290 vo2120, nototal by(group) stat(sd) * Analysing the data using the multivariate repeated measurments ANOVA * and mixed (so that all observations are used). reshape long vo2, i(subject) j(workload) * Multivariate model with different curves in the two groups. The * model is saved so that we can compare it to other models using * a likelihood ratio test. xi: mixed vo2 bn.workload#bn.group || subject: i.workload, cov(un) /// technique(bfgs) estimates store model1 * Multivariate model with parallel curves in the two groups. xi: mixed vo2 bn.workload bn.group || subject: i.workload, cov(un) /// technique(bfgs) estimates store model2 * Likelihood ratio test for parallel curves in the two groups. lrtest model1 model2 * This should work and produce approximate F-tests, but it doesn't!! xi: mixed vo2 bn.workload bn.group bn.workload#bn.group || /// subject: i.workload, cov(un) technique(bfgs) reml dfmethod(kroger) testparm workload#group, small * or contrast group@workload, small ****************************************************************************** *** The following test the hypothesis of equal sd's and correlations in the *** two groups mixed vo2 bn.workload#bn.group || subject: , nocons residual(un, t(workload) /// by(group)) reml estimates store m1a mixed vo2 bn.workload#bn.group || subject: , nocons /// residual(un, t(workload)) reml estimates store m2a lrtest m1a m2a * Plot of estimated means with 95%-confidence intervals xi: mixed vo2 bn.workload#bn.group, nocons || subject: i.workload, cov(un) /// reml technique(bfgs) margins workload#group marginsplot, title("") ytitle("Mean VO2 (ml/kg/min)") * All pairwise comparisons pwcompare workload#group, eff * Estimates of group differences at each workload. contrast group@workload, eff asobserved ****** The effect of cholagogues on gallbladder volume ****** * Reading in the data (the data are in the long format). use "gb.dta", clear * Plotting the individual curves. scatter vol time, connect(L) sort(time) by(dog) scatter vol time, connect(L) sort(dog time) by(group) * Plotting the mean curves. egen groupmean = mean(vol), by(group time) sort group time dog twoway (connected groupmean time if group==1, msymbol(Oh)) /// (connected groupmean time if group==2, msymbol(O)) /// (connected groupmean time if group==3, msymbol(+)), /// legend(label(1 "Cholechystokynin") label(2 "Clanobutin") /// label(3 "Control")) /// ytitle("Mean Gallbladder volume (cm3)") * Analysing the data using the multivariate repeated measurments ANOVA * Here we need the data in the wide format drop groupmean reshape wide vol, i(dog) j(time) generate v10_0 = vol10 - vol0 generate v20_10 = vol20 - vol10 generate v30_20 = vol30 - vol20 generate v40_30 = vol40 - vol30 generate v50_40 = vol50 - vol40 generate v60_50 = vol60 - vol50 generate v70_60 = vol70 - vol60 generate v80_70 = vol80 - vol70 generate v90_80 = vol90 - vol80 generate v100_90 = vol100 - vol90 generate v110_100 = vol110 - vol100 generate v120_110 = vol120 - vol110 mvtest means v10_0 v20_10 v30_20 v40_30 v50_40 v60_50 v70_60 v80_70 v90_80 /// v100_90 v110_100 v120_110, by(group) * Univariate repeated measurements model reshape long vol, i(dog) j(time) * When using anova to fit the univariate repeated measurements ANOVA to the * data we also get various corrections to the F-test statistic. anova vol group/dog time time#group, bse(dog) repeated(time) * In order to get the residuals we fit the univariate repeated measurements * ANOVA to the data using mixed mixed vol bn.time#bn.group || dog: , reml predict v_fit if e(sample), xb predict v_res if e(sample), rstandard scatter v_res v_fit, yline(0) name(q1,replace) qnorm v_res, title ("residual probability-plot") name(q2,replace) graph combine q1 q2 * Looking at the standard deviations for each combination of group and * time-point keep vol dog time group reshape wide vol, i(dog) j(time) tabstat vol0 vol10 vol20 vol30 vol40 vol50 vol60 vol70 vol80 vol90 vol100 /// vol110 vol120, nototal by(group) stat(sd) * Testing the hypothesis of parallel mean curves using the univariate repeated * measurements ANOVA with unequal between and within subject variation in the * three groups reshape long vol, i(dog) j(time) mixed vol bn.time#bn.group || dog: , nocons residual(ex, t(time) by(group)) mle estimates store m1 mixed vol bn.time bn.group || dog: , nocons residual(ex, t(time) by(group)) mle estimates store m2 lrtest m1 m2 * Inspecting the residuals mixed vol bn.time#bn.group || dog: , nocons residual(ex, t(time) by(group)) reml predict v_fit if e(sample), xb predict v_res if e(sample), rstandard scatter v_res v_fit, yline(0) name(q1,replace) qnorm v_res, title ("residual probability-plot") name(q2,replace) graph combine q1 q2 * Testing the hypothesis of parallel mean curves using an exponential * auto-correlation function (something in between the multivariate and * the univariate repeated measurements model) mixed vol bn.time#bn.group || dog: , residual(exp, t(time) by(group)) mle estimates store m1 mixed vol bn.time bn.group || dog: , residual(exp, t(time) by(group)) mle estimates store m2 lrtest m1 m2 * Inspecting the residuals mixed vol bn.time#bn.group || dog: , residual(exp, t(time) by(group)) reml predict v_f if e(sample), xb predict v_r if e(sample), rstandard scatter v_r v_f, yline(0) name(q1,replace) qnorm v_r, title ("residual probability-plot") name(q2,replace) graph combine q1 q2 * This does indeed give a better description of the data