Advertisement
Guest User

LW/EA

a guest
May 13th, 2015
284
0
Never
Not a member of Pastebin yet? Sign Up, it unlocks many cool features!
text 13.89 KB | None | 0 0
  1. ["Portrait of EAs I know"](http://su3su2u1.tumblr.com/post/117473264133/portrait-of-eas-i-know), su3su2u1:
  2.  
  3. > But I note from googling for surveys that the median charitable donation for an EA in the Less Wrong survey was 0.
  4.  
  5. [Yvain](http://slatestarscratchpad.tumblr.com/post/117484095321/portrait-of-eas-i-know):
  6.  
  7. > Two years ago I got a paying residency, and since then I’ve been donating 10% of my salary, which works out to about \$5,000 a year. In two years I’ll graduate residency, start making doctor money, and then I hope to be able to donate maybe eventually as much as \$25,000 - \$50,000 per year. But if you’d caught me five years ago, I would have been one of those people who wrote a lot about it and was very excited about it but put down \$0 in donations on the survey.
  8.  
  9. Data preparation:
  10.  
  11. ~~~{.R}
  12. set.seed(2015-05-13)
  13. survey2013 <- read.csv("http://www.gwern.net/docs/lwsurvey/2013.csv", header=TRUE)
  14. survey2013$EffectiveAltruism2 <- NA
  15. s2013 <- subset(survey2013, select=c(Charity,Effective.Altruism,EffectiveAltruism2,Work.Status,
  16. Profession,Degree,Age,Income))
  17. colnames(s2013) <- c("Charity","EffectiveAltruism","EffectiveAltruism2","WorkStatus","Profession",
  18. "Degree","Age","Income")
  19. s2013$Year <- 2013
  20. survey2014 <- read.csv("http://www.gwern.net/docs/lwsurvey/2014.csv", header=TRUE)
  21. s2014 <- subset(survey2014, PreviousSurveys!="Yes", select=c(Charity,EffectiveAltruism,EffectiveAltruism2,
  22. WorkStatus,Profession,Degree,Age,Income))
  23. s2014$Year <- 2014
  24. survey <- rbind(s2013, s2014)
  25.  
  26. # replace empty fields with NAs:
  27. survey[survey==""] <- NA; survey[survey==" "] <- NA
  28.  
  29. # convert money amounts from string to number:
  30. survey$Charity <- as.numeric(as.character(survey$Charity))
  31. survey$Income <- as.numeric(as.character(survey$Income))
  32. # both Charity & Income are skewed, like most monetary amounts, so log transform as well:
  33. survey$CharityLog <- log1p(survey$Charity)
  34. survey$IncomeLog <- log1p(survey$Income)
  35.  
  36. # age:
  37. survey$Age <- as.integer(as.character(survey$Age))
  38. # prodigy or no, I disbelieve any LW readers are <10yo (bad data? malicious responses?):
  39. survey$Age <- ifelse(survey$Age >= 10, survey$Age, NA)
  40.  
  41. # convert Yes/No to boolean TRUE/FALSE:
  42. survey$EffectiveAltruism <- (survey$EffectiveAltruism == "Yes")
  43. survey$EffectiveAltruism2 <- (survey$EffectiveAltruism2 == "Yes")
  44. summary(survey)
  45. ## Charity EffectiveAltruism EffectiveAltruism2 WorkStatus
  46. ## Min. : 0.000 Mode :logical Mode :logical Student :905
  47. ## 1st Qu.: 0.000 FALSE:1202 FALSE:450 For-profit work :736
  48. ## Median : 50.000 TRUE :564 TRUE :45 Self-employed :154
  49. ## Mean : 1070.931 NA's :487 NA's :1758 Unemployed :149
  50. ## 3rd Qu.: 400.000 Academics (on the teaching side):104
  51. ## Max. :110000.000 (Other) :179
  52. ## NA's :654 NA's : 26
  53. ## Profession Degree Age
  54. ## Computers (practical: IT programming etc.) :478 Bachelor's :774 Min. :13.00000
  55. ## Other :222 High school:597 1st Qu.:21.00000
  56. ## Computers (practical: IT, programming, etc.):201 Master's :419 Median :25.00000
  57. ## Mathematics :185 None :125 Mean :27.32494
  58. ## Engineering :170 Ph D. :125 3rd Qu.:31.00000
  59. ## (Other) :947 (Other) :189 Max. :72.00000
  60. ## NA's : 50 NA's : 24 NA's :28
  61. ## Income Year CharityLog IncomeLog
  62. ## Min. : 0.00 2013:1547 Min. : 0.000000 Min. : 0.000000
  63. ## 1st Qu.: 10000.00 2014: 706 1st Qu.: 0.000000 1st Qu.: 9.210440
  64. ## Median : 33000.00 Median : 3.931826 Median :10.404293
  65. ## Mean : 75355.69 Mean : 3.591102 Mean : 9.196442
  66. ## 3rd Qu.: 80000.00 3rd Qu.: 5.993961 3rd Qu.:11.289794
  67. ## Max. :10000000.00 Max. :11.608245 Max. :16.118096
  68. ## NA's :993 NA's :654 NA's :993
  69.  
  70. # lavaan doesn't like categorical variables and doesn't automatically expand out into dummies like lm/glm,
  71. # so have to create the dummies myself:
  72. survey$Degree <- gsub("2","two",survey$Degree)
  73. survey$Degree <- gsub("'","",survey$Degree)
  74. survey$Degree <- gsub("/","",survey$Degree)
  75. survey$WorkStatus <- gsub("-","", gsub("\\(","",gsub("\\)","",survey$WorkStatus)))
  76. library(qdapTools)
  77. survey <- cbind(survey, mtabulate(strsplit(gsub(" ", "", as.character(survey$Degree)), ",")),
  78. mtabulate(strsplit(gsub(" ", "", as.character(survey$WorkStatus)), ",")))
  79. write.csv(survey, file="2013-2014-lw-ea.csv", row.names=FALSE)
  80. ~~~
  81.  
  82. Analysis:
  83.  
  84. ~~~{.R}
  85. survey <- read.csv("http://www.gwern.net/docs/lwsurvey/2013-2014-lw-ea.csv")
  86. # treat year as factor for fixed effect:
  87. survey$Year <- as.factor(survey$Year)
  88.  
  89. median(survey[survey$EffectiveAltruism,]$Charity, na.rm=TRUE)
  90. ## [1] 100
  91. median(survey[!survey$EffectiveAltruism,]$Charity, na.rm=TRUE)
  92. ## [1] 42.5
  93. # t-tests are inappropriate due to non-normal distribution of donations:
  94. wilcox.test(Charity ~ EffectiveAltruism, conf.int=TRUE, data=survey)
  95. ## Wilcoxon rank sum test with continuity correction
  96. ##
  97. ## data: Charity by EffectiveAltruism
  98. ## W = 214215, p-value = 4.811186e-08
  99. ## alternative hypothesis: true location shift is not equal to 0
  100. ## 95% confidence interval:
  101. ## -4.999992987e+01 -1.275881408e-05
  102. ## sample estimates:
  103. ## difference in location
  104. ## -19.99996543
  105.  
  106. library(ggplot2)
  107. qplot(Age, CharityLog, color=EffectiveAltruism, data=survey) + geom_point(size=I(3))
  108. ## https://i.imgur.com/wd5blg8.png
  109. qplot(Age, CharityLog, color=EffectiveAltruism,
  110. data=na.omit(subset(survey, select=c(Age, CharityLog, EffectiveAltruism)))) +
  111. geom_point(size=I(3)) + stat_smooth()
  112. ## https://i.imgur.com/UGqf8wn.png
  113.  
  114. # you might think that we can't treat Age linearly because this looks like a quadratic or
  115. # logarithm, but when I fitted some curves, charity donations did not seem to flatten out
  116. # appropriately, and the GAM/loess wiggly-but-increasing line seems like a better summary.
  117. # Try looking at the asymptotes & quadratics split by group as follows:
  118. #
  119. ## n1 <- nls(CharityLog ~ SSasymp(as.integer(Age), Asym, r0, lrc),
  120. ## data=survey[survey$EffectiveAltruism,], start=list(Asym=6.88, r0=-4, lrc=-3))
  121. ## n2 <- nls(CharityLog ~ SSasymp(as.integer(Age), Asym, r0, lrc),
  122. ## data=survey[!survey$EffectiveAltruism,], start=list(Asym=6.88, r0=-4, lrc=-3))
  123. ## with(survey, plot(Age, CharityLog))
  124. ## points(predict(n1, newdata=data.frame(Age=0:70)), col="blue")
  125. ## points(predict(n2, newdata=data.frame(Age=0:70)), col="red")
  126. ##
  127. ## l1 <- lm(CharityLog ~ Age + I(Age^2), data=survey[survey$EffectiveAltruism,])
  128. ## l2 <- lm(CharityLog ~ Age + I(Age^2), data=survey[!survey$EffectiveAltruism,])
  129. ## with(survey, plot(Age, CharityLog));
  130. ## points(predict(l1, newdata=data.frame(Age=0:70)), col="blue")
  131. ## points(predict(l2, newdata=data.frame(Age=0:70)), col="red")
  132. #
  133. # So I will treat Age as a linear additive sort of thing.
  134. ~~~
  135.  
  136. ![2013-2014 LW survey respondents: self-reported charity donation vs self-reported age, split by self-identifying as EA or not](https://i.imgur.com/fscMfsX.png)
  137. ![Likewise, but with GAM-smoothed curves for EA vs non-EA](https://i.imgur.com/SSQ4qnU.png)
  138.  
  139. ~~~{.R}
  140. # for the regression, we want to combine EffectiveAltruism/EffectiveAltruism2 into a single measure, EA, so
  141. # a latent variable in a SEM; then we use EA plus the other covariates to estimate the CharityLog.
  142. library(lavaan)
  143. model1 <- " # estimate EA latent variable:
  144. EA =~ EffectiveAltruism + EffectiveAltruism2
  145. CharityLog ~ EA + Age + IncomeLog + Year +
  146. # Degree dummies:
  147. None + Highschool + twoyeardegree + Bachelors + Masters + Other +
  148. MDJDotherprofessionaldegree + PhD. +
  149. # WorkStatus dummies:
  150. Independentlywealthy + Governmentwork + Forprofitwork +
  151. Selfemployed + Nonprofitwork + Academicsontheteachingside +
  152. Student + Homemaker + Unemployed
  153. "
  154. fit1 <- sem(model = model1, missing="fiml", data = survey); summary(fit1)
  155. ## lavaan (0.5-16) converged normally after 197 iterations
  156. ##
  157. ## Number of observations 2253
  158. ##
  159. ## Number of missing patterns 22
  160. ##
  161. ## Estimator ML
  162. ## Minimum Function Test Statistic 90.659
  163. ## Degrees of freedom 40
  164. ## P-value (Chi-square) 0.000
  165. ##
  166. ## Parameter estimates:
  167. ##
  168. ## Information Observed
  169. ## Standard Errors Standard
  170. ##
  171. ## Estimate Std.err Z-value P(>|z|)
  172. ## Latent variables:
  173. ## EA =~
  174. ## EffectvAltrsm 1.000
  175. ## EffctvAltrsm2 0.355 0.123 2.878 0.004
  176. ##
  177. ## Regressions:
  178. ## CharityLog ~
  179. ## EA 1.807 0.621 2.910 0.004
  180. ## Age 0.085 0.009 9.527 0.000
  181. ## IncomeLog 0.241 0.023 10.468 0.000
  182. ## Year 0.319 0.157 2.024 0.043
  183. ## None -1.688 2.079 -0.812 0.417
  184. ## Highschool -1.923 2.059 -0.934 0.350
  185. ## twoyeardegree -1.686 2.081 -0.810 0.418
  186. ## Bachelors -1.784 2.050 -0.870 0.384
  187. ## Masters -2.007 2.060 -0.974 0.330
  188. ## Other -2.219 2.142 -1.036 0.300
  189. ## MDJDthrprfssn -1.298 2.095 -0.619 0.536
  190. ## PhD. -1.977 2.079 -0.951 0.341
  191. ## Indpndntlywlt 1.175 2.119 0.555 0.579
  192. ## Governmentwrk 1.183 1.969 0.601 0.548
  193. ## Forprofitwork 0.677 1.940 0.349 0.727
  194. ## Selfemployed 0.603 1.955 0.309 0.758
  195. ## Nonprofitwork 0.765 1.973 0.388 0.698
  196. ## Acdmcsnthtchn 1.087 1.970 0.551 0.581
  197. ## Student 0.879 1.941 0.453 0.650
  198. ## Homemaker 1.071 2.498 0.429 0.668
  199. ## Unemployed 0.606 1.956 0.310 0.757
  200. ##
  201. ## Intercepts:
  202. ## EffectvAltrsm 0.319 0.011 28.788 0.000
  203. ## EffctvAltrsm2 0.109 0.012 8.852 0.000
  204. ## CharityLog -0.284 0.737 -0.385 0.700
  205. ## EA 0.000
  206. ##
  207. ## Variances:
  208. ## EffectvAltrsm 0.050 0.056
  209. ## EffctvAltrsm2 0.064 0.008
  210. ## CharityLog 7.058 0.314
  211. ## EA 0.168 0.056
  212. # simplify:
  213. model2 <- " # estimate EA latent variable:
  214. EA =~ EffectiveAltruism + EffectiveAltruism2
  215. CharityLog ~ EA + Age + IncomeLog + Year
  216. "
  217. fit2 <- sem(model = model2, missing="fiml", data = survey); summary(fit2)
  218. ## lavaan (0.5-16) converged normally after 55 iterations
  219. ##
  220. ## Number of observations 2253
  221. ##
  222. ## Number of missing patterns 22
  223. ##
  224. ## Estimator ML
  225. ## Minimum Function Test Statistic 70.134
  226. ## Degrees of freedom 6
  227. ## P-value (Chi-square) 0.000
  228. ##
  229. ## Parameter estimates:
  230. ##
  231. ## Information Observed
  232. ## Standard Errors Standard
  233. ##
  234. ## Estimate Std.err Z-value P(>|z|)
  235. ## Latent variables:
  236. ## EA =~
  237. ## EffectvAltrsm 1.000
  238. ## EffctvAltrsm2 0.353 0.125 2.832 0.005
  239. ##
  240. ## Regressions:
  241. ## CharityLog ~
  242. ## EA 1.770 0.619 2.858 0.004
  243. ## Age 0.085 0.009 9.513 0.000
  244. ## IncomeLog 0.241 0.023 10.550 0.000
  245. ## Year 0.329 0.156 2.114 0.035
  246. ##
  247. ## Intercepts:
  248. ## EffectvAltrsm 0.319 0.011 28.788 0.000
  249. ## EffctvAltrsm2 0.109 0.012 8.854 0.000
  250. ## CharityLog -1.331 0.317 -4.201 0.000
  251. ## EA 0.000
  252. ##
  253. ## Variances:
  254. ## EffectvAltrsm 0.049 0.057
  255. ## EffctvAltrsm2 0.064 0.008
  256. ## CharityLog 7.111 0.314
  257. ## EA 0.169 0.058
  258. # simplify even further:
  259. summary(lm(CharityLog ~ EffectiveAltruism + EffectiveAltruism2 + Age + IncomeLog, data=survey))
  260. ## ...Residuals:
  261. ## Min 1Q Median 3Q Max
  262. ## -7.6813410 -1.7922422 0.3325694 1.8440610 6.5913961
  263. ##
  264. ## Coefficients:
  265. ## Estimate Std. Error t value Pr(>|t|)
  266. ## (Intercept) -2.06062203 0.57659518 -3.57378 0.00040242
  267. ## EffectiveAltruismTRUE 1.26761425 0.37515124 3.37894 0.00081163
  268. ## EffectiveAltruism2TRUE 0.03596335 0.54563991 0.06591 0.94748766
  269. ## Age 0.09411164 0.01869218 5.03481 7.7527e-07
  270. ## IncomeLog 0.32140793 0.04598392 6.98957 1.4511e-11
  271. ##
  272. ## Residual standard error: 2.652323 on 342 degrees of freedom
  273. ## (1906 observations deleted due to missingness)
  274. ## Multiple R-squared: 0.2569577, Adjusted R-squared: 0.2482672
  275. ## F-statistic: 29.56748 on 4 and 342 DF, p-value: < 2.2204e-16
  276. ~~~
  277.  
  278. Note these increases are on a log-dollars scale.
Advertisement
Add Comment
Please, Sign In to add comment
Advertisement