---
title: "Homework 2"
author: "Joshua Kim"
date: "10/19/2019"
output: pdf_document
---

## Question 1
**1. A laboratory is estimating the rate of tumorigenesis (the formation of tumors) in two strains of mice, A and B. They have tumor count data for 10 mice in strain A and 13 mice in strain B. Type A mice have been well studied, and information from other laboratories suggests that type A mice have tumor counts that are approximately Poisson-distributed. Tumor count rates for type B mice are unknown, but type B mice are related to type A mice. Assuming a Poisson sampling distribution for each group with rates $\theta_A$ and $\theta_B$, and based on previous research, you settle on the following prior distributions:**\newline

\centerline{$\theta_A \sim \text{gamma}(120, 10), \quad \theta_B \sim \text{gamma}(12, 1)$}


**(a) Before seeing any data, which group do you expect to have a higher average incidence of cancer? Which group are you more certain about a priori? Your answers should be based on the priors specified above.**

Under these priors, both groups have the same expected tumor rate: $E(\theta_A) = 120/10 = 12$ and $E(\theta_B) = 12/1 = 12$. However, I am more certain a priori about group A, since its prior variance $120/10^2 = 1.2$ is much smaller than group B's prior variance $12/1^2 = 12$.
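
A quick check of the prior moments in R supports this; this is a minimal sketch, and the object names below are my own rather than part of the assignment.

```{r}
# Gamma(a, b) has mean a/b and variance a/b^2 (prior moments for each strain)
prior_A <- c(mean = 120 / 10, var = 120 / 10^2)  # mean 12, variance 1.2
prior_B <- c(mean = 12 / 1,   var = 12 / 1^2)    # mean 12, variance 12
rbind(prior_A, prior_B)
```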


**(b) After the experiment is complete, you observe the following tumor counts for the two populations:**

\centerline{$y_A = (12, 9, 12, 14, 13, 13, 15, 8, 15, 6)$}
\centerline{$y_B = (11, 11, 10, 9, 9, 8, 7, 10, 6, 8, 8, 9, 7)$}
```{r}
# Observed tumor counts for strain A and strain B, and their sums
y_a = c(12,9,12,14,13,13,15,8,15,6)
sum_y_a = sum(y_a) #117

y_b = c(11,11,10,9,9,8,7,10,6,8,8,9,7)
sum_y_b = sum(y_b) #113
```
$$\sum_{i=1}^{10} y_{A,i} = 117, \quad \sum_{i=1}^{13} y_{B,i} = 113$$

**Write down the posterior distributions, posterior means, posterior variances, and 95% quantile-based credible intervals for $\theta_A$ and $\theta_B$.**

Posterior distribution of $\theta_A$ under the prior $\theta_A \sim \text{Gamma}(120, 10)$:

$$p(\theta_A \mid y_A) \propto \text{Poisson}(y_A \mid \theta_A) \times \text{Gamma}(\theta_A)$$
$$\text{Poisson}(y_A \mid \theta_A) \propto \theta_A^{\sum_{i=1}^{10} y_{A,i}}\, e^{-10\theta_A}$$
$$\text{Gamma}(\theta_A) \propto \theta_A^{120-1}\, e^{-10\theta_A}$$

$$p(\theta_A \mid y_A) \propto \left[\theta_A^{\sum_{i=1}^{10} y_{A,i}}\, e^{-10\theta_A}\right] \times \left[\theta_A^{119}\, e^{-10\theta_A}\right]$$
$$\propto \theta_A^{\sum_{i=1}^{10} y_{A,i} + 119}\, e^{-20\theta_A}$$

The Gamma prior is conjugate, so this is the kernel of a Gamma distribution with

$\alpha_A - 1 = \sum_{i=1}^{10} y_{A,i} + 119$, \space $\beta_A = 20$

Therefore, the posterior distribution of $\theta_A$ is
$$\theta_A \mid y_A \sim \text{Gamma}\Big(\sum_{i=1}^{10} y_{A,i} + 120,\ 20\Big)$$

The mean of a $\text{Gamma}(\alpha, \beta)$ distribution is $\frac{\alpha}{\beta}$. Therefore,

$$E(\theta_A \mid y_A) = \frac{117 + 120}{20} = 11.85$$
The variance of a $\text{Gamma}(\alpha, \beta)$ distribution is $\frac{\alpha}{\beta^2}$. Therefore,

$$Var(\theta_A \mid y_A) = \frac{117 + 120}{400} = 0.5925$$
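
As a quick numerical check of these values, the conjugate update for strain A can be computed directly in R; this is a minimal sketch, and the variable names are my own.

```{r}
# Posterior for theta_A: Gamma(120 + sum(y_a), 10 + n_A)
y_a <- c(12, 9, 12, 14, 13, 13, 15, 8, 15, 6)
shape_A <- 120 + sum(y_a)      # 237
rate_A  <- 10 + length(y_a)    # 20
c(mean = shape_A / rate_A, variance = shape_A / rate_A^2)  # 11.85, 0.5925
```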

Posterior distribution of $\theta_B$ under the prior $\theta_B \sim \text{Gamma}(12, 1)$:

$$p(\theta_B \mid y_B) \propto \text{Poisson}(y_B \mid \theta_B) \times \text{Gamma}(\theta_B)$$
$$\text{Poisson}(y_B \mid \theta_B) \propto \theta_B^{\sum_{i=1}^{13} y_{B,i}}\, e^{-13\theta_B}$$
$$\text{Gamma}(\theta_B) \propto \theta_B^{12-1}\, e^{-\theta_B}$$

$$p(\theta_B \mid y_B) \propto \left[\theta_B^{\sum_{i=1}^{13} y_{B,i}}\, e^{-13\theta_B}\right] \times \left[\theta_B^{11}\, e^{-\theta_B}\right]$$
$$\propto \theta_B^{\sum_{i=1}^{13} y_{B,i} + 11}\, e^{-14\theta_B}$$

The Gamma prior is conjugate, so this is the kernel of a Gamma distribution with

$\alpha_B - 1 = \sum_{i=1}^{13} y_{B,i} + 11$, \space $\beta_B = 14$

Therefore, the posterior distribution of $\theta_B$ is
$$\theta_B \mid y_B \sim \text{Gamma}\Big(\sum_{i=1}^{13} y_{B,i} + 12,\ 14\Big)$$
The mean of a $\text{Gamma}(\alpha, \beta)$ distribution is $\frac{\alpha}{\beta}$. Therefore,

$$E(\theta_B \mid y_B) = \frac{113 + 12}{14} = 8.929$$
The variance of a $\text{Gamma}(\alpha, \beta)$ distribution is $\frac{\alpha}{\beta^2}$. Therefore,

$$Var(\theta_B \mid y_B) = \frac{113 + 12}{196} = 0.638$$
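
These summaries for $\theta_B$ can also be approximated by Monte Carlo simulation from the $\text{Gamma}(125, 14)$ posterior; this is a sketch, and the object names are my own.

```{r}
# Simulation-based check of the Gamma(125, 14) posterior for theta_B
set.seed(1)
theta_b_draws <- rgamma(100000, shape = 125, rate = 14)
mean(theta_b_draws)                       # should be close to 8.929
var(theta_b_draws)                        # should be close to 0.638
quantile(theta_b_draws, c(0.025, 0.975))  # approximates the interval computed below
```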

```{r}
# Posterior for theta_A: Gamma(sum(y_a) + 120, 20)
y_a = c(12,9,12,14,13,13,15,8,15,6)
sum_y_a = sum(y_a) #117
a_post_A = sum_y_a + 120
b_post_A = 20
alpha = 1 - 0.95
low_A  = qgamma(alpha/2, a_post_A, b_post_A)
high_A = qgamma(1 - alpha/2, a_post_A, b_post_A)

# Posterior for theta_B: Gamma(sum(y_b) + 12, 14)
y_b = c(11,11,10,9,9,8,7,10,6,8,8,9,7)
sum_y_b = sum(y_b) #113
a_post_B = sum_y_b + 12
b_post_B = 14
low_B  = qgamma(alpha/2, a_post_B, b_post_B)
high_B = qgamma(1 - alpha/2, a_post_B, b_post_B)

# 95% quantile-based credible intervals
print(c(low_A, high_A))
print(c(low_B, high_B))
```

95% quantile-based credible interval for $\theta_A$: approximately (10.39, 13.41)

95% quantile-based credible interval for $\theta_B$: approximately (7.43, 10.56)

**(c) Compute and plot the posterior expectation of $\theta_B$ given $y_B$ under the prior distribution $\text{gamma}(12 n_0,\ n_0)$ for each value of $n_0 \in \{1, 2, \ldots, 50\}$. As a reminder, $n_0$ can be thought of as the number of prior observations (or pseudo-counts).**

```{r}
n_0 = 1:50
y_b = c(11,11,10,9,9,8,7,10,6,8,8,9,7)
# Under the prior Gamma(12*n_0, n_0), the posterior is
# Gamma(12*n_0 + sum(y_b), n_0 + length(y_b)), so the posterior
# expectation is shape / rate:
posterior_exp = (12*n_0 + sum(y_b)) / (n_0 + length(y_b))
plot(n_0, posterior_exp,
     main = "Posterior Expectation of theta_B",
     xlab = "n_0", ylab = "E(theta_B | y_B)",
     col = "blue", type = "l")
```

The posterior expectation is a weighted average of the prior mean (12) and the sample mean $\bar{y}_B \approx 8.69$, so it rises from about 8.9 at $n_0 = 1$ toward 12 as $n_0$ increases.

**(d) Should knowledge about population A tell us anything about population B? Discuss whether or not it makes sense to have $p(\theta_A, \theta_B) = p(\theta_A) \times p(\theta_B)$.**