---
title: "Statistics2, lab 1: 1-way ANOVA"
author: "Gijs Danoe s3494888"
date: "April 12th, 2019"
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

## 1. Load libraries and data

```{r}
# If you do not have these packages yet, uncomment the install.packages()
# lines below, run them once, and then comment them out again; otherwise R
# will try to reinstall the packages every time you knit the file.
# install.packages("foreign")
# install.packages("car")

library(foreign)
library(car)
mydata <- read.spss("lab1_reading.sav", to.data.frame = TRUE)
# If lab1_reading.sav is in another folder and you are using Windows, use
# something like:
# mydata <- read.spss("C:/.../reading.sav", to.data.frame = TRUE)
# where you need to replace "..." with the folder that contains the file.
# This command produces a warning, which can safely be ignored.
```

## 2. Investigate variable Post3

```{r 1load}
boxplot(mydata$POST3 ~ mydata$Group)
```
## 3. Hypotheses

H0: $\mu_1 = \mu_2 = \mu_3$

Ha: not all of the $\mu_i$ are equal.
## 4. Test normality

```{r}
mydata.aov <- aov(POST3 ~ Group, data = mydata)
mydata.aov.res <- residuals(object = mydata.aov)  # extract the residuals
qqnorm(y = mydata.aov.res); qqline(y = mydata.aov.res)

aggregate(POST3 ~ Group, data = mydata, function(x) shapiro.test(x)$p.value)
```

In the Q-Q plot the residuals follow the reference line closely, so they are roughly normally distributed. The Shapiro-Wilk test agrees: the p-value is above 0.05 in every group, so we do not reject normality.
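
As an optional cross-check (not part of the lab instructions), the Shapiro-Wilk test can also be run on the pooled residuals of the model in one call:

```{r}
# normality check on all residuals at once; a p-value above 0.05
# means we do not reject the hypothesis of normality
shapiro.test(mydata.aov.res)
```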
  52.  
  53. ## 5. Test variance
  54.  
  55. ```{r}
  56. leveneTest(mydata.aov)
  57.  
  58. ```
  59.  
  60. The assumption that the variance is homogenous is fulfilled because the p value is higher than 0.05.
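
Bartlett's test is a base-R alternative to Levene's test for equal variances (shown here only as an optional cross-check; note that it is more sensitive to departures from normality):

```{r}
# null hypothesis: the group variances are equal
bartlett.test(POST3 ~ Group, data = mydata)
```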
  61.  
  62. ## 6. Test variance with Hartley's test, by hand
  63.  
  64. ```{r}
  65. var(mydata[mydata$Group == "Basal",]$POST3)
  66. var(mydata[mydata$Group == "DRTA",]$POST3)
  67. var(mydata[mydata$Group == "Strat",]$POST3)
  68. var(mydata[mydata$Group == "DRTA",]$POST3)/var(mydata[mydata$Group == "Basal",]$POST3)
  69. ```
  70.  
  71. Because k = 3 and n - 1 = 22-1=21, the table for $\alpha$=0.05 shows us that the Hartley's test reaches significance at 2.95. Because my score is lower (1.72 < 2.95) the test is significant and the assumption that the variance is homogenous is fulfilled.
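
The same ratio can be computed without picking out the largest and smallest variances by eye (a small sketch using the same data frame):

```{r}
# Hartley's F_max statistic: largest group variance divided by the smallest
group_vars <- tapply(mydata$POST3, mydata$Group, var)
max(group_vars) / min(group_vars)
```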
  72.  
  73. ## 7. 1-way ANOVA
  74.  
  75. ```{r}
  76. summary(mydata.aov)
  77. ```
  78.  
  79. Our p value is lower than the $\alpha$ value of 0.05, which means the test is significant and we reject H0 and accept Ha that not all of the $\mu_i$ are equal.

## 8. Effect size

```{r}
# regular R^2, computed from the sums of squares
mean_data <- mean(mydata$POST3)                 # grand mean
(SST <- sum((mydata$POST3 - mean_data)^2))      # total sum of squares
group_means <- tapply(mydata$POST3, mydata$Group, mean)
(SSG <- sum(22 * (group_means - mean_data)^2))  # between-group sum of squares (22 observations per group)

SSG / SST                                       # R^2

summary(lm(POST3 ~ Group, mydata))$adj.r.squared  # adjusted R^2
```

Both $R^2$ and adjusted $R^2$ indicate a medium effect size.
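
As a sanity check on the hand calculation, the unadjusted $R^2$ reported by the linear model should match SSG/SST:

```{r}
# should agree with SSG/SST computed above (up to rounding)
summary(lm(POST3 ~ Group, mydata))$r.squared
```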

## 9. Post-hoc test Bonferroni

```{r}
pairwise.t.test(x = mydata$POST3, g = mydata$Group, p.adjust.method = "bonferroni")
```

According to this test, the clearest difference (smallest adjusted p-value) is between DRTA and Basal.

## 10. Post-hoc test TukeyHSD

```{r}
TukeyHSD(mydata.aov)
```

This test agrees with my statement in exercise 9: the biggest difference is between DRTA and Basal.
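
The Tukey intervals can also be plotted; confidence intervals that do not cover 0 correspond to significant pairwise differences:

```{r}
plot(TukeyHSD(mydata.aov))
```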