s243a

Draft: Mask Statistical Analysis

Aug 7th, 2020
  1. \documentclass[11pt]{article}
  2. %Gummi|065|=)
  3. \title{\textbf{Mask Statistical Analysis}}
  4. \author{s243a\\
  5.         No One Else}
  6. \date{}
  7. \usepackage[utf8]{inputenc}
  8. \usepackage[english]{babel}
  9.  
  10. \usepackage{hyperref}
  11.  
  12. \usepackage[outputdir=/tmp]{minted}
  13.  
  14. \usepackage[most]{tcolorbox}
  15.  
  16. \newtcblisting{commandshell}{colback=black,colupper=white,colframe=yellow!75!black,
  17. listing only,listing options={language=sh},
  18. every listing line={\textcolor{red}{\small\ttfamily\bfseries DeathStar \$> }}}
  19. \begin{document}
  20. \maketitle
  21. \section{First Section}
The Guardian gave the following absolute risks of contracting COVID-19: \newline
13\% at less than 1\,m (no physical distancing) \newline
3\% at more than 1\,m (physical distancing) \newline
The Guardian cited the following short literature review: \newline
  26. \emph{C Raina MacIntyre, \href{https://www.thelancet.com/journals/lancet/article/PIIS0140-6736(20)31183-1/fulltext}{Physical distancing, face masks, and eye protection for prevention of COVID-19} - (\href{https://www.thelancet.com/action/showPdf?pii=S0140-6736\%2820\%2931183-1}{pdf}), \href{https://www.pearltrees.com/s243a/distancing-protection/id33469256}{pt} }  \newline
  27. but MacIntyre cited the following paper:\newline
  28. \emph{Derek K Chu, \href{https://www.thelancet.com/journals/lancet/article/PIIS0140-6736(20)31142-9/fulltext}{Physical distancing, face masks, and eye protection to prevent person-to-person transmission of SARS-CoV-2 and COVID-19: a systematic review and meta-analysis} (\href{https://www.thelancet.com/action/showPdf?pii=S0140-6736\%2820\%2931142-9}{pdf}) (\href{https://www.pearltrees.com/s243a/distancing-transmission/id32761118}{pt})} \newline
The exact numbers from Chu were: \newline
12.8\% at less than 1\,m (no physical distancing) \newline
2.6\% at more than 1\,m (physical distancing) \newline
  32.  
  33.  
Note that the paper itself prints these numbers with a raised decimal point, as 12·8\% and 2·6\% respectively.
  35.  
The paper notes that duration of exposure was not taken into account, although in most studies the exposure lasted at least 1\,h. This is problematic if one is basing risk estimates on the Independent Action Hypothesis (IAH), because the accumulated dose, and with it the risk, grows with exposure time. I also doubt that anyone stayed at a fixed distance for over an hour, so I wonder how the distance values were assigned. Chu's paper is a meta-analysis, and some studies to which Chu assigned a distance of 0 may be better thought of as close contacts. For example, reference~46 of Chu's paper contains the following statement:
  37.  
\begin{quote}``On inspection of the living quarters, the field team found that most of the windows in the bedrooms were closed and sealed and that ventilation within the bedrooms was poor. Initial open-ended interviews with some residents informed the study team that residents shared the same kitchen and dining room within the villa but did not typically eat together or share food at mealtimes. There were no designated social spaces; however, residents reported gathering around laptops to watch movies together.''\end{quote}
  39.  
  40. \url{https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6759265/} \newline
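To make the duration concern concrete, here is a minimal sketch of how the IAH is commonly written. The symbols $p$ (probability that a single particle causes infection), $r$ (rate at which particles are received) and $t$ (exposure time) are notation assumed here for illustration and do not come from Chu's paper:

\begin{equation}
P_{\mathrm{inf}}(t) = 1 - (1-p)^{rt} \approx 1 - e^{-prt}
\end{equation}

The approximation holds for small $p$. Under this form the risk at a given distance is not a single number but rises with $t$, so pooling studies with different exposure times mixes different points on this curve.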
  41.  
I suppose that we can expect a bit of hand-waving in these types of observational studies, because of the ethical issues that would arise in trying to obtain this data experimentally. Therefore, for now, let's accept the distance assignments given in ``Figure 2'' of Chu's paper, which is actually tabular data that gives:
  43.  
  44. 1. distance measures for mitigating against spread,
  45.  
  46. 2. both the sample size and the number of people infected at the shorter distance (i.e. the baseline)
  47.  
3. both the sample size and the number of people infected at the longer distance (i.e. the intervention, aka mitigating measures)
  49.  
  50. 4. the relative risk (RR)
  51.  
  52.  
The first study in this table seems to have ``0'' given for the ``further distance'', which doesn't make sense to me. For each individual study the calculation of the risk ratio is transparent. For instance, on the second line of the table, for the study ``Arwady et al (2016) [35]'', we have:
  54.  
Events, further distance ($n/N$) = 1/10

Events, shorter distance ($n/N$) = 8/20

and the relative risk is $RR = (1/10)/(8/20) = 0.25$.
  61.  
  62.  
What is less apparent is how a relative risk is calculated for a combination of studies. Each study appears to have a weight, called the ``random weight'', and presumably one can use these weights to obtain a combined value for the relative risk (RR). Two ways that one might try to use these weights are as follows:
  64.  
  65. MATLAB Code
  66. \definecolor{bg}{rgb}{0.95,0.95,0.95}
  67. \begin{minted}[linenos=true,bgcolor=bg]{matlab}
% Risk ratios for the MERS physical distancing studies
RR = [0.05 0.25 0.72 0.59];
% Corresponding random weights for RR
W = [5.5 2.6 3.2 1.6];
% Option 1: weighted arithmetic mean of the risk ratios (sum(W) is 12.9)
RR1 = W*RR'/sum(W)
% Option 2: variant motivated by weighted least squares
RR2 = sqrt(W*RR'/(W*W'))
  76. \end{minted}
  77.  
  78. Output
  79. \begin{commandshell}
  80. RR1 =  0.32349
  81. RR2 =  0.28944
  82. \end{commandshell}
  83.  
Neither of these methods of combining the risk ratios (RR) for the MERS studies reproduces the value of 0.23 from Chu's paper, but the method akin to weighted least squares comes closest. Weighted least squares is the minimum-variance linear unbiased estimator only when the weights are the reciprocals of the true variances. Here the weights are not known a priori and have to be estimated, so that guarantee no longer holds exactly. In weighted least squares the weights are: \newline
  85.  
  86. 1. the reciprocal of the variance for uncorrelated events or
  87.  
  88. 2. the inverse of the covariance matrix  \newline
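A third option is to average on the log scale, since meta-analyses commonly pool $\log(RR)$ rather than RR itself. The following is a minimal sketch, assuming that the random weights from the table can be used directly as weights on $\log(RR)$. That assumption is not confirmed by Chu's paper, and the function name \texttt{pooled\_log\_rr} is introduced here for illustration only.

\begin{minted}[linenos=true,bgcolor=bg]{python}
import math

# Risk ratios and random weights copied from the MATLAB listing above
rr = [0.05, 0.25, 0.72, 0.59]
w = [5.5, 2.6, 3.2, 1.6]

def pooled_log_rr(rr, w):
    """Weighted mean of log(RR), transformed back to the RR scale."""
    mean_log = sum(wi * math.log(ri) for wi, ri in zip(w, rr)) / sum(w)
    return math.exp(mean_log)

rr3 = pooled_log_rr(rr, w)
print(f"RR3 = {rr3:.3f}")
\end{minted}

With these inputs the sketch gives roughly 0.18, which also differs from 0.23, so the printed weights and risk ratios alone do not settle how the pooled value was obtained.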
  89.  
  90.  
  91. For each study the paper calculates a confidence interval for the risk ratio. Wikipedia gives the following formula.
  92.  
  93. \begin{equation}
  94. CI_{1 - \alpha}(\log(RR)) = \log(RR)\pm SE(\log(RR))\times z_\alpha
  95. \end{equation}
  96.  
  97. where:
  98. \begin{equation}
  99. SE(\log(RR)) = \sqrt{\frac{IN}{IE(IE + IN)} + \frac{CN}{CE(CE + CN)}}
  100. \end{equation}
  101.  
$IE$ = events in the intervention group

$IN$ = non-events in the intervention group

$CE$ = events in the control group

$CN$ = non-events in the control group \newline
  109.  
$z_\alpha$ is the standard score, given by:

\begin{equation}
z = {x- \bar{x} \over S}
\end{equation}

where:

$\bar{x}$ is the mean of the sample.

$S$ is the standard deviation of the sample. \newline
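The interval formula can be tried on a single study. The following is a minimal sketch that applies the SE expression quoted from Wikipedia to the Arwady et al counts given earlier (1/10 and 8/20). The function name and argument names are chosen here for illustration, and the standard score is taken from the standard normal distribution, which is about 1.96 for a 95\% interval.

\begin{minted}[linenos=true,bgcolor=bg]{python}
import math
from statistics import NormalDist

def rr_confidence_interval(ie, i_n, ce, cn, alpha=0.05):
    """Risk ratio and its (1 - alpha) interval from event/non-event counts."""
    rr = (ie / (ie + i_n)) / (ce / (ce + cn))
    se = math.sqrt(i_n / (ie * (ie + i_n)) + cn / (ce * (ce + cn)))
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, se, lower, upper

# Arwady et al (2016): 1 event and 9 non-events at the further distance,
# 8 events and 12 non-events at the shorter distance
rr, se, lower, upper = rr_confidence_interval(ie=1, i_n=9, ce=8, cn=12)
print(f"RR = {rr:.2f}, SE = {se:.3f}, CI = ({lower:.3f}, {upper:.2f})")
\end{minted}

For this row the sketch gives $RR = 0.25$, a standard error of about 0.99 on the log scale, and a 95\% interval of roughly 0.04 to 1.73.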
  121.  
For a normal distribution the $z$ score of a 95\% confidence interval is approximately $\pm 1.96$ (often rounded to $\pm 2$). In general the form of the confidence interval (CI) given by Wikipedia looks correct to me, but the expression for the standard error (SE) did not seem to reproduce the confidence intervals in Chu's paper. The formula given by Wikipedia is supposedly derived via the delta method, for which the source below was given. The delta method provides the following rule for variance:
  123.  
  124. \begin{equation}
  125. Var(G(X)) = G'(\mu) Var(X)G'(\mu)^T
  126. \end{equation}
\url{https://www.stata.com/support/faqs/statistics/delta-method/}
  128.  
  129. and confidence interval for a transformed variable:
  130. \begin{equation}
\left[g^{-1}(g(B) - z \times se(g(B))), g^{-1}(g(B) + z \times se(g(B)))\right]
  132. \end{equation}
  133.  
  134. \url{https://www.stata.com/support/faqs/statistics/delta-rule/}\newline
  135.  
where $z$ is the ``standard score'' and denotes how many standard deviations are required to obtain a given confidence level.\newline
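As a check on where the SE expression comes from, the delta-method rule can be applied to the log of a single group's risk. This is a sketch under the usual assumptions (binomial counts, independent groups), with $\hat{p}_I = IE/(IE+IN)$ introduced here as shorthand for the observed risk in the intervention group. Taking $G(p) = \log p$, so that $G'(p) = 1/p$, and estimating $p_I$ by $\hat{p}_I$ in the last step:

\begin{equation}
Var(\log \hat{p}_I) \approx \left(\frac{1}{p_I}\right)^2 \frac{p_I(1-p_I)}{IE+IN} = \frac{1-p_I}{p_I(IE+IN)} \approx \frac{IN}{IE(IE + IN)}
\end{equation}

The control group gives $CN/(CE(CE+CN))$ in the same way. Because $\log(RR) = \log\hat{p}_I - \log\hat{p}_C$ and the two groups are independent, the two variances add, and the square root of their sum is the expression for $SE(\log(RR))$ quoted from Wikipedia.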
  137.  
  138. **Make this an image
  139.  
  140. \url{https://en.wikipedia.org/wiki/File:The_Normal_Distribution.svg}
  141.  
The motivation behind the transformation is that $\log(RR)$ is supposed to be closer to a normal distribution than the risk ratio itself. The mean and variance of the risk ratio (RR) are given respectively by:
\begin{equation}
E[RR] = \int {(n_1/N_1) \over (n_2/N_2)}B(n_1,p_1)B(n_2,p_2) dn_1 dn_2
\end{equation}
\begin{equation}
Var(RR) = \int \left({(n_1/N_1) \over (n_2/N_2)}-E[RR] \right)^2B(n_1,p_1)B(n_2,p_2) dn_1 dn_2
\end{equation}
  149.  
where $B(n_1,p_1)$ and $B(n_2,p_2)$ are the binomial distributions for the intervention case and the baseline case respectively, matching the ratio of further distance to shorter distance used for RR above.\newline
  155.  
Each binomial distribution, with $N$ trials and event probability $p$, has the following mean and variance for the event count $n$:
\begin{equation}
\mu=Np
\end{equation}

\begin{equation}
\sigma^2=Npq, \quad q = 1-p
\end{equation}
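The mean and variance of the risk ratio can also be evaluated numerically. The following is a minimal sketch with several assumptions that are mine rather than Chu's: the counts are discrete, so it sums over the binomial probabilities instead of integrating; the risk ratio is undefined when $n_2 = 0$, so it conditions on $n_2 \geq 1$; and the observed Arwady et al proportions are used as if they were the true $p_1$ and $p_2$.

\begin{minted}[linenos=true,bgcolor=bg]{python}
from math import comb

def binom_pmf(n, N, p):
    """Probability of n events in N trials with event probability p."""
    return comb(N, n) * p**n * (1 - p)**(N - n)

def rr_moments(N1, p1, N2, p2):
    """Mean and variance of RR = (n1/N1)/(n2/N2), conditional on n2 >= 1."""
    norm = 1 - binom_pmf(0, N2, p2)  # probability that RR is defined
    mean = 0.0
    second = 0.0
    for n1 in range(N1 + 1):
        for n2 in range(1, N2 + 1):
            w = binom_pmf(n1, N1, p1) * binom_pmf(n2, N2, p2) / norm
            rr = (n1 / N1) / (n2 / N2)
            mean += w * rr
            second += w * rr**2
    return mean, second - mean**2

# Hypothetical inputs: observed proportions 1/10 and 8/20 treated as true values
mean, var = rr_moments(N1=10, p1=1/10, N2=20, p2=8/20)
print(f"E[RR] = {mean:.3f}, Var(RR) = {var:.3f}")
\end{minted}

With these inputs the mean comes out somewhat above the point estimate of 0.25, which reflects the right skew of the risk ratio and is one reason for working with $\log(RR)$ instead.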
  164.  
We can consider a variable substitution so that $n_1=\exp(x_1)$ and $n_2=\exp(x_2)$, which gives $x_3 = x_1 - x_2 = \ln(n_1/n_2)$.
  166.  
  167. \end{document}
  168.  