Advertisement
Not a member of Pastebin yet?
Sign Up,
it unlocks many cool features!
- ===================HW7======================================================================================
- ============================================================================================================
- 1. For the dental study example, suppose we assume that every girl has the same intercept and slope (no random effects) and every boy has the same intercept and slope (no random effects). The intercept and slope for girls may be different from those of boys. Compare the intercepts and slopes of girls and boys group. Consider a few different variance-covariance structure of the error term within each subject: (i) Independence with the same variance (this may be accomplished by not including a repeated statement at all) (ii) Homogeneous compound symmetry (iii) Homogeneous AR(1). Choose the one you think fit the model best.
- a. Write down the statistical model you analyze.
- b. Are slopes statistically different from each other?
- ++++++++++++++++++>Code+++++++++++++++++++++++++++++++++
- /*******************************************************************
- Analysis of the dental study data by fitting a random coefficient
- model in time using PROC MIXED.
- - the repeated measurement factor is age (time)
- - there is one "treatment" factor, gender
- The model for each child is assumed to be a straight line.
- The intercepts and slopes may have different means depending on
- gender, with the same covariance matrix D for each gender.
- We use the RANDOM and REPEATED statements to fit models that
- make several different assumptions about the forms of the matrices
- Ri and D.
- *******************************************************************/
- options ls=80 ps=59 nodate; run;
- /******************************************************************
- Read in the data set (See Example 1 of Chapter 4)
- *******************************************************************/
- data dent1; infile 'C:\Users\Min\Box Sync\teaching\Spring 2015\Stat 382\datasets\dental.dat';
- input obsno child age distance gender;
- run;
- * Independent case;
- proc mixed method=ml data=dent1;
- class gender child;
- model distance = gender gender*age / noint solution;
- * estimate 'diff in mean slope' gender 0 0 gender*age 1 -1;
- * contrast 'overall gender diff' gender 1 -1, gender*age 1 -1 /chisq;
- run;
- * CS structure;
- proc mixed method=ml data=dent1;
- class child gender;
- model distance = gender gender*age / noint solution;
- repeated / type=cs subject=child;
- * estimate 'diff in mean slope' gender 0 0 gender*age 1 -1;
- * contrast 'overall gender diff' gender 1 -1, gender*age 1 -1 /chisq;
- run;
- * AR(1);
- proc mixed method=ml data=dent1;
- class child gender;
- model distance = gender gender*age / noint solution;
- repeated / type=ar(1) subject=child;
- * estimate 'diff in mean slope' gender 0 0 gender*age 1 -1;
- * contrast 'overall gender diff' gender 1 -1, gender*age 1 -1 /chisq;
- run;
- * UN;
- proc mixed method=ml data=dent1;
- class child gender;
- model distance = gender gender*age / noint solution;
- repeated / type=un subject=child;
- * estimate 'diff in mean slope' gender 0 0 gender*age 1 -1;
- * contrast 'overall gender diff' gender 1 -1, gender*age 1 -1 /chisq;
- run;
- ==========================================================================>
- ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 2
- ==========================================================================>
- 2. For the weight-lifting example, suppose we assume that a subject’s intercept depends on the subject’s weight, age, and previous weight experience only (no random effects), and slope depends weight, age, previous weight experience and the treatment effect (diet or no-diet). Consider a few different variance-covariance structure of the error term within each subject: (i) Independence with the same variance (this may be accomplished by not including a repeated statement at all) (ii) Homogeneous compound symmetry (iii) Homogeneous AR(1). Choose the one you think fit the model best.
- a. Write down the statistical model you analyze
- b. Is the diet treatment effective?
- ++++++++++++++>SAS CODE+++++++++++++++++++++++++++++++
- /*******************************************************************
- Illustration of fitting a linear mixed effects model derived
- from a random coefficient model, where the mean slope in each
- group depends on a continuous covariate.
- The model for each man is assumed to be a straight line.
- The intercepts are taken to depend on baseline covariates.
- The slopes are taken to depend on baseline covariates, differentially
- by group (diet or not).
- We take D to be common for both groups and take Ri to be
- common to both groups of the form Ri = sigma^2 I.
- *******************************************************************/
- options ls=80 ps=59 nodate; run;
- /******************************************************************
- Read in the data set
- *******************************************************************/
- data pdat; infile "C:\Users\Min\Box Sync\teaching\Spring 2015\Stat 382\datasets\press.dat";
- input id time press weight age prev diet;
- run;
- /*******************************************************************
- Use PROC MIXED to fit linear mixed effects model (i); we use
- normal ML rather than REML to get likelihood ratio tests
- *******************************************************************/
- ** independent;
- proc mixed method=ml data=pdat;
- class id;
- model press = weight prev age
- time time*diet time*age time*prev time*diet*prev;
- run;
- ** CS struture;
- proc mixed method=ml data=pdat;
- class id;
- model press = weight prev age
- time time*diet time*age time*prev time*diet*prev;
- repeated /type=cs subject=id;
- run;
- ** AR(1) structure;
- proc mixed method=ml data=pdat;
- class id;
- model press = weight prev age
- time time*diet time*age time*prev time*diet*prev;
- repeated /type=Ar(1) subject=id;
- run;
- proc mixed data=pdat;
- class id;
- model press = weight prev age
- time time*diet time*prev time*diet*prev / solution;
- repeated /type=AR(1) subject=id;
- /* estimate "slp, diet, no prev" time 1 time*diet 1;
- estimate "slp, no diet, prev" time 1 time*prev 1;
- estimate "slp, diet, prev" time 1 time*prev 1 time*diet 1 time*diet*prev 1;
- contrast "overall slp diff" time*diet 1,
- time*prev 1,
- time*diet*prev 1 / chisq;
- contrast "prev effect" time*prev 1, time*diet*prev 1 / chisq;
- */
- contrast "diet effect" time*diet 1, time*diet*prev 1 /chisq;
- run;
- ====================================================================================>
- ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
- ====================================================================================> 3
- 3. In a clinical trial of studying the effects of AZT in slowing the development of AIDS symptoms. In the study, 338 veterans whose immune systems were beginning to falter after infection with the AIDS virus were randomly assigned either to receive AZT immediately or to wait until their T cells showed severe immune weakness. The following table shows whether they received AZT treatment and whether they developed AIDS symptoms during the 3-year study.
- Use logistic regression model to analyze this table.
- a. Write down the statistical model you analyze
- b. What is the odds ratio between AZT use and development of AIDS symptoms? What is the confidence interval?
- +++++++++++++++++++++>SAS CODE================================>
- data AZT;
- input race $ AZT $ Symptoms $ count;
- cards;
- white yes yes 14
- white yes no 93
- white no yes 32
- white no no 81
- black yes yes 11
- black yes no 52
- black no yes 12
- black no no 43
- ;
- proc logistic data=AZT;
- class race AZT/param=ref;
- model Symptoms(event='no')=race azt;
- weight count;
- run;
- ============================================================>
- +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++> 4
- ==============================================================>
- options nodate ls=75; run;
- /* Read in the data */
- data lead; infile "C:\Users\Min\Box Sync\teaching\Spring 2015\Stat 382\datasets\lead.dat";
- input id age gender week lead trt;
- time=week;
- run;
- /* Fit of model with common D and common within-child variance */
- **fixed model;
- title "M1";
- proc mixed method=ml data=lead;
- class trt id week;
- model lead = age gender age*gender time*trt time*age*trt time*trt*gender time*age*gender*trt;
- repeated week/subject=id;
- run;
- title "M2";
- proc mixed method=ml data=lead;
- class trt id week;
- model lead = age gender age*gender time*trt time*age*trt time*trt*gender time*age*gender*trt;
- repeated week/type=cs subject=id;
- run;
- title "M3";
- proc mixed method=ml data=lead;
- class trt id week;
- model lead = age gender age*gender time*trt time*age*trt time*trt*gender time*age*gender*trt;
- repeated week/type=ar(1) subject=id;
- run;
- title "M4";
- proc mixed method=ml data=lead;
- class trt id week;
- model lead = age gender age*gender time*trt time*age*trt time*trt*gender time*age*gender*trt;
- repeated week/type=un subject=id;
- run;
- title "check gender effects";
- proc mixed method=ml data=lead;
- class trt id week;
- model lead = age time*trt time*age*trt / solution;
- repeated week/type=un subject=id;
- run;
- proc mixed method=reml data=lead;
- class trt id week;
- model lead = age gender age*gender time*trt time*age*trt time*trt*gender time*age*gender*trt;
- repeated week/type=un subject=id;
- contrast "gender effects" gender 1, age*gender 1, time*trt*gender 1 0 0, time*trt*gender 0 1 0,
- time*trt*gender 0 0 1, time*age*gender*trt 1 0 0,
- time*age*gender*trt 0 1 0, time*age*gender*trt 0 0 1/chisq;
- run;
- **random model;
- title "M1";
- proc mixed method=ml data=lead;
- class trt id week;
- model lead = age gender age*gender time*trt time*age*trt time*trt*gender time*age*gender*trt;
- random intercept time/type=un subject=id;
- * repeated week/subject=id;
- run;
- title "M2";
- proc mixed method=ml data=lead;
- class trt id week;
- model lead = age gender age*gender time*trt time*age*trt time*trt*gender time*age*gender*trt;
- random intercept time/type=un subject=id;
- repeated week/type=cs subject=id;
- run;
- title "M3";
- proc mixed method=ml data=lead;
- class trt id week;
- model lead = age gender age*gender time*trt time*age*trt time*trt*gender time*age*gender*trt;
- random intercept time/type=un subject=id;
- repeated week/type=ar(1) subject=id;
- run;
- title "M4";
- proc mixed method=ml data=lead;
- class trt id week;
- model lead = age gender age*gender time*trt time*age*trt time*trt*gender time*age*gender*trt;
- random intercept time/type=un subject=id;
- repeated week/type=un subject=id;
- run;
- title "check gender effects";
- proc mixed method=ml data=lead;
- class trt id week;
- model lead = age time*trt time*age*trt;
- random intercept time/type=un subject=id;
- * repeated week/subject=id;
- run;
- proc mixed method=reml data=lead;
- class trt id week;
- model lead = age gender age*gender time*trt time*age*trt time*trt*gender time*age*gender*trt;
- random intercept time/type=un subject=id;
- contrast "gender effects" gender 1, age*gender 1, time*trt*gender 1 0 0, time*trt*gender 0 1 0,
- time*trt*gender 0 0 1, time*age*gender*trt 1 0 0,
- time*age*gender*trt 0 1 0, time*age*gender*trt 0 0 1/chisq;
- run;
- =======================================================================================>
- =======================================================================================================
- ============================= HW 6 TOM=========================================================================
- =======================================================================================>
- =======================================================================================================
- ======================================================================================================
- * Tom Holmes;
- *Problem 1;
- *H0: Diet A is not better than Diet B;
- *H1: Diet A is better than Diet B;
- proc power;
- twosamplemeans groupmeans=(0 8) stddev=16 alpha=0.05 sides=1 power=0.5 npergroup=.;
- run;
- /*
- She needs to assume that the blood glucose levels for each group is normally distributed.
- Note that it doesn't matter what the group means actually are as long as they differ by 8.
- Power = .9
- Actual Power N Total N Per Group
- 0.903 140 70
- Power = .8
- Actual Power N Total N Per Group
- 0.806 102 51
- Power = .7
- Actual Power N Total N Per Group
- 0.707 88 44
- Power = .6
- Actual Power N Total N Per Group
- 0.606 60 30
- Power = .5
- Actual Power N Total N Per Group
- 0.510 46 23
- */
- *Problem 2;
- *H0: Men are not better at detecting the sound than women.;
- *H1: Men are better are detecting the sound than women.;
- proc power;
- twosamplemeans groupmeans=(0 .25) stddev=.2 alpha=0.05 sides=1 power=. npergroup=20;
- run;
- /*
- She needs to assume that the response time for each group is normally distributed.
- Mean Difference Power
- .05 0.193
- .1 0.463
- .15 0.753
- .2 0.928
- .25 0.987
- */
- *Problem 3;
- *Same hypothesis as problem 1;
- *H0: Diet A is not better than Diet B;
- *H1: Diet A is better than Diet B;
- proc power;
- twosamplewilcoxon
- vardist("DietA") = uniform(50,80)
- vardist("DietB") = normal(75, 100)
- variables = "DietA" | "DietB"
- alpha=0.05
- sides=1
- power = 0.5
- ntotal = .;
- run;
- /*
- That specified variance is absolutely crazy.
- Actual Power N Total N Per Group
- 0.900 2106 1053
- Actual Power N Total N Per Group
- 0.800 1470 735
- Actual Power N Total N Per Group
- 0.700 1082 541
- Actual Power N Total N Per Group
- 0.600 798 399
- Actual Power N Total N Per Group
- 0.501 572 286
- */
- *Problem 4;
- *Same hypothesis as problem 2;
- *H0: Men are not better at detecting the sound than women.;
- *H1: Men are better are detecting the sound than women.;
- proc power;
- twosamplewilcoxon
- vardist("Men") = gamma(2, .05)
- vardist("Women") = exponential(.2)
- variables = "Men" | "Women"
- alpha=0.05
- sides=1
- power =.
- npergroup =40;
- run;
- /*
- N Per Group Power
- 10 0.308
- 20 0.474
- 30 0.606
- 40 0.709
- */
- data problem5;
- input FeedRate Catalyst AgitRate Temp Concen ReactionPercentage @@;
- datalines;
- 10.0 1.0 100 140 6.0 37.5
- 10.0 1.0 120 180 3.0 28.5
- 10.0 2.0 100 180 3.0 40.4
- 10.0 2.0 120 140 6.0 48.2
- 15.0 1.0 100 180 6.0 50.7
- 15.0 1.0 120 140 3.0 28.9
- 15.0 2.0 100 140 3.0 43.5
- 15.0 2.0 120 180 6.0 64.5
- 12.5 1.5 110 160 4.5 39.0
- 12.5 1.5 110 160 4.5 40.3
- 12.5 1.5 110 160 4.5 38.7
- 12.5 1.5 110 160 4.5 39.7
- 10 2.0 120 130 7.0 .
- ;
- run;
- proc print; run;
- proc reg data=problem5;
- model ReactionPercentage=FeedRate Catalyst AgitRate Temp Concen;
- run;
- /*
- Problem 5
- 1.
- The model form with all five factors as independent variables is:
- ReactionPercentage = B_0 + B_1*FeedRate + B_2*Catalyst + B_3*AgitRate + B_4*Temp + B_5*Concen
- The actual model is:
- ReactionPercentage = -43.69167 + 1.65000*FeedRate + 12.75000*Catalyst + -0.02500*AgitRate + 0.16250*Temp + 4.96667*Concen
- R-Square of this model is 0.9652 and the Adjusted R-Square is 0.9362
- */
- proc glm data=problem5;
- model ReactionPercentage=FeedRate Catalyst AgitRate Temp Concen;
- run;
- proc glm data=problem5;
- model ReactionPercentage=FeedRate Catalyst Temp Concen;
- run;
- proc reg data=problem5;
- model ReactionPercentage=FeedRate Catalyst Temp Concen;
- run;
- /*
- 2.
- It seems like the variable AgitRate is not significant.
- Removing that model, we have the following:
- R-Square = 0.9647 (Lower), but the Adjusted R-Square is 0.9446 Which is higher.
- We will trust the Adjusted R-Square and assume we made the right choice.
- 3. The Final Model is the following
- ReactionPercentage = -46.44167 + 1.65000*FeedRate + 12.75000*Catalyst + 0.16250*Temp + 4.96667*Concen
- */
- ods select all;
- proc reg data=problem5 ;
- model ReactionPercentage = FeedRate Catalyst Temp Concen/clb cli clm p;
- output out=info p=rpredict lcl=lp ucl=up lclm=lm uclm=um;
- run;
- /*
- 4.
- The 95% confidence intervals of the parameters are:
- Intercept(B_0): -66.16853 -26.71480
- FeedRate 0.88954 2.41046
- Catalyst 8.94769 16.55231
- Temp 0.06744 0.25756
- Concen 3.69923 6.23410
- */
- /*
- 5. The 95% confidence interval of the expected mean of the first observation is:
- 31.2514 39.4653
- */
- proc mixed data=problem5 cl;
- model ReactionPercentage = FeedRate Catalyst Temp Concen/solution;
- estimate 'Part6' intercept 1 FeedRate 10 Catalyst 2 Temp 130 Concen 7;
- run;
- proc print data=info; run;
- /*
- 6.
- The 95% confidence interval of the expected mean of the given entry is:
- 46.1764 56.7236
- The 95% confidence interval of the predicted value is:
- 43.9183 58.9817
- The predicted value is:
- 51.4500
- */
- proc reg data=problem5;
- *model ReactionPercentage=FeedRate Catalyst AgitRate Temp Concen
- / selection=forward details=summary;
- *model ReactionPercentage=FeedRate Catalyst AgitRate Temp Concen
- / selection=backward details=summary;
- model ReactionPercentage=FeedRate Catalyst AgitRate Temp Concen
- / selection=maxr details=summary;
- run;
- proc reg data=problem5;
- model ReactionPercentage = FeedRate Catalyst Temp Concen/clb cli clm p;
- output out=info p=rpredict lcl=lp ucl=up lclm=lm uclm=um;
- run;
- proc print data=info; run;
- =======================================================================================>
- =======================================================================================================
- ============================= HW 6 ME =========================================================================
- =======================================================================================>
- =======================================================================================================
- ======================================================================================================
- *number one;
- proc power;
- twosamplemeans test=diff
- meandiff = 8
- stddev = 16
- power = .5, .6, .7, .8, .9
- npergroup = . ;
- run;
- *number two;
- proc power;
- twosamplemeans test=diff
- meandiff = 0.05, 0.1, 0.15, 0.2, .25
- stddev = .2
- power = .
- npergroup = 20 ;
- run;
- *number three;
- proc power;
- twosamplewilcoxon
- VARDIST("Group A") = UNIFORM(50,80)
- VARDIST("Group B") = NORMAL(75,100)
- variables= "Group A" | "Group B"
- nperg= .
- power=.5, .6, .7, .8, .9;
- run;
- *number four;
- proc power;
- twosamplewilcoxon
- VARDIST("Male")=GAMMA(2,.05)
- VARDIST("Female")=EXPONENTIAL(.2)
- Variables="Male"|"Female"
- nperg=10, 20, 30, 40
- power= . ;
- run;
- *number five;
- data materials;
- input FeedRate Catalyst AgitRate Temp Concen ReactionPercentage @@;
- datalines;
- 10 1 100 140 6 37.5
- 10 1 120 180 3 28.5
- 10 2 100 180 3 40.4
- 10 2 120 140 6 48.2
- 15 1 100 180 6 50.7
- 15 1 120 140 3 28.9
- 15 2 100 140 3 43.5
- 15 2 120 180 6 64.5
- 12.5 1.5 110 160 4.5 39
- 12.5 1.5 110 160 4.5 40.3
- 12.5 1.5 110 160 4.5 38.7
- ;
- proc reg data=materials;
- model ReactionPercentage=FeedRate Catalyst AgitRate Temp Concen;
- run;
- proc reg data=materials;
- model ReactionPercentage=FeedRate Catalyst Temp Concen;
- run;
- pro reg data=materials;
- model ReactionPercentage=FeedRate Catalyst AgitRate Temp Concen/clb cli clm p;
- output out=a2 p=b lcl=c ucl=d lclm=e uclm=f;
- run;
- ===============================================================================
- +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++==
- ===============================================================================
- +++++++++++++++++++++++++++ EXAM 2 ++++++++++++++++++++++++++++++++++++++++++++++==
- ===============================================================================
- +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++==
- http://rpubs.com/mxmrt/stat382_exam2
- 1. A company conducted a clinical trial to test a drug treatment for treating bilirubin abnormalities. 86 patients were treated with an experimental drug for 3 months. The following is the summary of the data.
- Pre-treatment Post-treatment
- Yes No
- Yes 6 14
- No 6 60
- “Yes” means total bilirubin is above upper limit of normal range
- Is there evidence of a change in the pre to post-treatment rates of abnormalities?
- 2. Lamb’squarter is a common weed in corn fields. A researcher planted corn at the same rate in 8 small plots of ground, then weeded the corn rows by hand to allow no weeds in 4 randomly selected plots and exactly 3 lamb’s-quarter plants per meter of row in the other 4 plots. Here are the yields of corn (bushels per acre) in each of the plots:
- 0 weeds per meter 166.7, 172.2, 165.0, 176.9
- 3 weeds per meter 158.6, 176.4, 153.1, 156.0
- Choose a proper statistical test to determine whether the presence of small numbers of weeds reduce the yield of corn.
- 3. To test whether Drug 6-MP is effective, 10 leukaemia patients (for easy data entry, we just use 10 data point) were divided into two treatment groups: Drug 6-MP and Control. The remission times in weeks for these patients are as following:
- Drug 6-MP 6, 6, 6*, 7, 9*
- Control 1, 2, 2, 3, 4
- *indicates the survival time is censored.
- Can we conclude that Drug 6-MP is effective in term of survival time compared with control treatment?
- For the following two problems, you only need to write down the null and alternative hypotheses, as well as the appropriate statistical test for testing the hypotheses.
- 4. A study considered the effect of prednisolone on severe hypercalcaemia in women with metastatic breast cancer. Of 30 patients, 15 were randomly selected to receive prednisolone. The other 15 formed a control group. Normalization in their level of serum-ionized calcium was achieved by 7 of the treated patients and none of the control group. Analyze whether results were significantly better for treatment than for control.
- 5. In 1878 Charles Darwin recorded some data on the heights of Zea mays plants to determine what effect cross-fertilization or self- fertilization had on the height of Zea mays. The experiment was to select one cross-fertilization and self- fertilization, grow them in the same pot, and later measure their heights. The experiment was conducted on 6 pots. Let (Xi,Yi), be the height of cross-fertilization and the height of self-fertilization at pot i, i=1,…,6. Charles Darwin wanted to know whether the cross-fertilization plants are generally taller than the self-fertilization plants. Set up null and alternative hypotheses, and describe your plan to perform the statistical test.
Advertisement
Add Comment
Please, Sign In to add comment
Advertisement