Advertisement
Not a member of Pastebin yet?
Sign Up,
it unlocks many cool features!
- ["Portrait of EAs I know"](http://su3su2u1.tumblr.com/post/117473264133/portrait-of-eas-i-know), su3su2u1:
- > But I note from googling for surveys that the median charitable donation for an EA in the Less Wrong survey was 0.
- [Yvain](http://slatestarscratchpad.tumblr.com/post/117484095321/portrait-of-eas-i-know):
- > Two years ago I got a paying residency, and since then I’ve been donating 10% of my salary, which works out to about \$5,000 a year. In two years I’ll graduate residency, start making doctor money, and then I hope to be able to donate maybe eventually as much as \$25,000 - \$50,000 per year. But if you’d caught me five years ago, I would have been one of those people who wrote a lot about it and was very excited about it but put down \$0 in donations on the survey.
- Data preparation:
- ~~~{.R}
- set.seed(2015-05-13)
- survey2013 <- read.csv("http://www.gwern.net/docs/lwsurvey/2013.csv", header=TRUE)
- survey2013$EffectiveAltruism2 <- NA
- s2013 <- subset(survey2013, select=c(Charity,Effective.Altruism,EffectiveAltruism2,Work.Status,
- Profession,Degree,Age,Income))
- colnames(s2013) <- c("Charity","EffectiveAltruism","EffectiveAltruism2","WorkStatus","Profession",
- "Degree","Age","Income")
- s2013$Year <- 2013
- survey2014 <- read.csv("http://www.gwern.net/docs/lwsurvey/2014.csv", header=TRUE)
- s2014 <- subset(survey2014, PreviousSurveys!="Yes", select=c(Charity,EffectiveAltruism,EffectiveAltruism2,
- WorkStatus,Profession,Degree,Age,Income))
- s2014$Year <- 2014
- survey <- rbind(s2013, s2014)
- # replace empty fields with NAs:
- survey[survey==""] <- NA; survey[survey==" "] <- NA
- # convert money amounts from string to number:
- survey$Charity <- as.numeric(as.character(survey$Charity))
- survey$Income <- as.numeric(as.character(survey$Income))
- # both Charity & Income are skewed, like most monetary amounts, so log transform as well:
- survey$CharityLog <- log1p(survey$Charity)
- survey$IncomeLog <- log1p(survey$Income)
- # age:
- survey$Age <- as.integer(as.character(survey$Age))
- # prodigy or no, I disbelieve any LW readers are <10yo (bad data? malicious responses?):
- survey$Age <- ifelse(survey$Age >= 10, survey$Age, NA)
- # convert Yes/No to boolean TRUE/FALSE:
- survey$EffectiveAltruism <- (survey$EffectiveAltruism == "Yes")
- survey$EffectiveAltruism2 <- (survey$EffectiveAltruism2 == "Yes")
- summary(survey)
- ## Charity EffectiveAltruism EffectiveAltruism2 WorkStatus
- ## Min. : 0.000 Mode :logical Mode :logical Student :905
- ## 1st Qu.: 0.000 FALSE:1202 FALSE:450 For-profit work :736
- ## Median : 50.000 TRUE :564 TRUE :45 Self-employed :154
- ## Mean : 1070.931 NA's :487 NA's :1758 Unemployed :149
- ## 3rd Qu.: 400.000 Academics (on the teaching side):104
- ## Max. :110000.000 (Other) :179
- ## NA's :654 NA's : 26
- ## Profession Degree Age
- ## Computers (practical: IT programming etc.) :478 Bachelor's :774 Min. :13.00000
- ## Other :222 High school:597 1st Qu.:21.00000
- ## Computers (practical: IT, programming, etc.):201 Master's :419 Median :25.00000
- ## Mathematics :185 None :125 Mean :27.32494
- ## Engineering :170 Ph D. :125 3rd Qu.:31.00000
- ## (Other) :947 (Other) :189 Max. :72.00000
- ## NA's : 50 NA's : 24 NA's :28
- ## Income Year CharityLog IncomeLog
- ## Min. : 0.00 2013:1547 Min. : 0.000000 Min. : 0.000000
- ## 1st Qu.: 10000.00 2014: 706 1st Qu.: 0.000000 1st Qu.: 9.210440
- ## Median : 33000.00 Median : 3.931826 Median :10.404293
- ## Mean : 75355.69 Mean : 3.591102 Mean : 9.196442
- ## 3rd Qu.: 80000.00 3rd Qu.: 5.993961 3rd Qu.:11.289794
- ## Max. :10000000.00 Max. :11.608245 Max. :16.118096
- ## NA's :993 NA's :654 NA's :993
- # lavaan doesn't like categorical variables and doesn't automatically expand out into dummies like lm/glm,
- # so have to create the dummies myself:
- survey$Degree <- gsub("2","two",survey$Degree)
- survey$Degree <- gsub("'","",survey$Degree)
- survey$Degree <- gsub("/","",survey$Degree)
- survey$WorkStatus <- gsub("-","", gsub("\\(","",gsub("\\)","",survey$WorkStatus)))
- library(qdapTools)
- survey <- cbind(survey, mtabulate(strsplit(gsub(" ", "", as.character(survey$Degree)), ",")),
- mtabulate(strsplit(gsub(" ", "", as.character(survey$WorkStatus)), ",")))
- write.csv(survey, file="2013-2014-lw-ea.csv", row.names=FALSE)
- ~~~
- Analysis:
- ~~~{.R}
- survey <- read.csv("http://www.gwern.net/docs/lwsurvey/2013-2014-lw-ea.csv")
- # treat year as factor for fixed effect:
- survey$Year <- as.factor(survey$Year)
- median(survey[survey$EffectiveAltruism,]$Charity, na.rm=TRUE)
- ## [1] 100
- median(survey[!survey$EffectiveAltruism,]$Charity, na.rm=TRUE)
- ## [1] 42.5
- # t-tests are inappropriate due to non-normal distribution of donations:
- wilcox.test(Charity ~ EffectiveAltruism, conf.int=TRUE, data=survey)
- ## Wilcoxon rank sum test with continuity correction
- ##
- ## data: Charity by EffectiveAltruism
- ## W = 214215, p-value = 4.811186e-08
- ## alternative hypothesis: true location shift is not equal to 0
- ## 95% confidence interval:
- ## -4.999992987e+01 -1.275881408e-05
- ## sample estimates:
- ## difference in location
- ## -19.99996543
- library(ggplot2)
- qplot(Age, CharityLog, color=EffectiveAltruism, data=survey) + geom_point(size=I(3))
- ## https://i.imgur.com/wd5blg8.png
- qplot(Age, CharityLog, color=EffectiveAltruism,
- data=na.omit(subset(survey, select=c(Age, CharityLog, EffectiveAltruism)))) +
- geom_point(size=I(3)) + stat_smooth()
- ## https://i.imgur.com/UGqf8wn.png
- # you might think that we can't treat Age linearly because this looks like a quadratic or
- # logarithm, but when I fitted some curves, charity donations did not seem to flatten out
- # appropriately, and the GAM/loess wiggly-but-increasing line seems like a better summary.
- # Try looking at the asymptotes & quadratics split by group as follows:
- #
- ## n1 <- nls(CharityLog ~ SSasymp(as.integer(Age), Asym, r0, lrc),
- ## data=survey[survey$EffectiveAltruism,], start=list(Asym=6.88, r0=-4, lrc=-3))
- ## n2 <- nls(CharityLog ~ SSasymp(as.integer(Age), Asym, r0, lrc),
- ## data=survey[!survey$EffectiveAltruism,], start=list(Asym=6.88, r0=-4, lrc=-3))
- ## with(survey, plot(Age, CharityLog))
- ## points(predict(n1, newdata=data.frame(Age=0:70)), col="blue")
- ## points(predict(n2, newdata=data.frame(Age=0:70)), col="red")
- ##
- ## l1 <- lm(CharityLog ~ Age + I(Age^2), data=survey[survey$EffectiveAltruism,])
- ## l2 <- lm(CharityLog ~ Age + I(Age^2), data=survey[!survey$EffectiveAltruism,])
- ## with(survey, plot(Age, CharityLog));
- ## points(predict(l1, newdata=data.frame(Age=0:70)), col="blue")
- ## points(predict(l2, newdata=data.frame(Age=0:70)), col="red")
- #
- # So I will treat Age as a linear additive sort of thing.
- ~~~
- ![2013-2014 LW survey respondents: self-reported charity donation vs self-reported age, split by self-identifying as EA or not](https://i.imgur.com/fscMfsX.png)
- ![Likewise, but with GAM-smoothed curves for EA vs non-EA](https://i.imgur.com/SSQ4qnU.png)
- ~~~{.R}
- # for the regression, we want to combine EffectiveAltruism/EffectiveAltruism2 into a single measure, EA, so
- # a latent variable in a SEM; then we use EA plus the other covariates to estimate the CharityLog.
- library(lavaan)
- model1 <- " # estimate EA latent variable:
- EA =~ EffectiveAltruism + EffectiveAltruism2
- CharityLog ~ EA + Age + IncomeLog + Year +
- # Degree dummies:
- None + Highschool + twoyeardegree + Bachelors + Masters + Other +
- MDJDotherprofessionaldegree + PhD. +
- # WorkStatus dummies:
- Independentlywealthy + Governmentwork + Forprofitwork +
- Selfemployed + Nonprofitwork + Academicsontheteachingside +
- Student + Homemaker + Unemployed
- "
- fit1 <- sem(model = model1, missing="fiml", data = survey); summary(fit1)
- ## lavaan (0.5-16) converged normally after 197 iterations
- ##
- ## Number of observations 2253
- ##
- ## Number of missing patterns 22
- ##
- ## Estimator ML
- ## Minimum Function Test Statistic 90.659
- ## Degrees of freedom 40
- ## P-value (Chi-square) 0.000
- ##
- ## Parameter estimates:
- ##
- ## Information Observed
- ## Standard Errors Standard
- ##
- ## Estimate Std.err Z-value P(>|z|)
- ## Latent variables:
- ## EA =~
- ## EffectvAltrsm 1.000
- ## EffctvAltrsm2 0.355 0.123 2.878 0.004
- ##
- ## Regressions:
- ## CharityLog ~
- ## EA 1.807 0.621 2.910 0.004
- ## Age 0.085 0.009 9.527 0.000
- ## IncomeLog 0.241 0.023 10.468 0.000
- ## Year 0.319 0.157 2.024 0.043
- ## None -1.688 2.079 -0.812 0.417
- ## Highschool -1.923 2.059 -0.934 0.350
- ## twoyeardegree -1.686 2.081 -0.810 0.418
- ## Bachelors -1.784 2.050 -0.870 0.384
- ## Masters -2.007 2.060 -0.974 0.330
- ## Other -2.219 2.142 -1.036 0.300
- ## MDJDthrprfssn -1.298 2.095 -0.619 0.536
- ## PhD. -1.977 2.079 -0.951 0.341
- ## Indpndntlywlt 1.175 2.119 0.555 0.579
- ## Governmentwrk 1.183 1.969 0.601 0.548
- ## Forprofitwork 0.677 1.940 0.349 0.727
- ## Selfemployed 0.603 1.955 0.309 0.758
- ## Nonprofitwork 0.765 1.973 0.388 0.698
- ## Acdmcsnthtchn 1.087 1.970 0.551 0.581
- ## Student 0.879 1.941 0.453 0.650
- ## Homemaker 1.071 2.498 0.429 0.668
- ## Unemployed 0.606 1.956 0.310 0.757
- ##
- ## Intercepts:
- ## EffectvAltrsm 0.319 0.011 28.788 0.000
- ## EffctvAltrsm2 0.109 0.012 8.852 0.000
- ## CharityLog -0.284 0.737 -0.385 0.700
- ## EA 0.000
- ##
- ## Variances:
- ## EffectvAltrsm 0.050 0.056
- ## EffctvAltrsm2 0.064 0.008
- ## CharityLog 7.058 0.314
- ## EA 0.168 0.056
- # simplify:
- model2 <- " # estimate EA latent variable:
- EA =~ EffectiveAltruism + EffectiveAltruism2
- CharityLog ~ EA + Age + IncomeLog + Year
- "
- fit2 <- sem(model = model2, missing="fiml", data = survey); summary(fit2)
- ## lavaan (0.5-16) converged normally after 55 iterations
- ##
- ## Number of observations 2253
- ##
- ## Number of missing patterns 22
- ##
- ## Estimator ML
- ## Minimum Function Test Statistic 70.134
- ## Degrees of freedom 6
- ## P-value (Chi-square) 0.000
- ##
- ## Parameter estimates:
- ##
- ## Information Observed
- ## Standard Errors Standard
- ##
- ## Estimate Std.err Z-value P(>|z|)
- ## Latent variables:
- ## EA =~
- ## EffectvAltrsm 1.000
- ## EffctvAltrsm2 0.353 0.125 2.832 0.005
- ##
- ## Regressions:
- ## CharityLog ~
- ## EA 1.770 0.619 2.858 0.004
- ## Age 0.085 0.009 9.513 0.000
- ## IncomeLog 0.241 0.023 10.550 0.000
- ## Year 0.329 0.156 2.114 0.035
- ##
- ## Intercepts:
- ## EffectvAltrsm 0.319 0.011 28.788 0.000
- ## EffctvAltrsm2 0.109 0.012 8.854 0.000
- ## CharityLog -1.331 0.317 -4.201 0.000
- ## EA 0.000
- ##
- ## Variances:
- ## EffectvAltrsm 0.049 0.057
- ## EffctvAltrsm2 0.064 0.008
- ## CharityLog 7.111 0.314
- ## EA 0.169 0.058
- # simplify even further:
- summary(lm(CharityLog ~ EffectiveAltruism + EffectiveAltruism2 + Age + IncomeLog, data=survey))
- ## ...Residuals:
- ## Min 1Q Median 3Q Max
- ## -7.6813410 -1.7922422 0.3325694 1.8440610 6.5913961
- ##
- ## Coefficients:
- ## Estimate Std. Error t value Pr(>|t|)
- ## (Intercept) -2.06062203 0.57659518 -3.57378 0.00040242
- ## EffectiveAltruismTRUE 1.26761425 0.37515124 3.37894 0.00081163
- ## EffectiveAltruism2TRUE 0.03596335 0.54563991 0.06591 0.94748766
- ## Age 0.09411164 0.01869218 5.03481 7.7527e-07
- ## IncomeLog 0.32140793 0.04598392 6.98957 1.4511e-11
- ##
- ## Residual standard error: 2.652323 on 342 degrees of freedom
- ## (1906 observations deleted due to missingness)
- ## Multiple R-squared: 0.2569577, Adjusted R-squared: 0.2482672
- ## F-statistic: 29.56748 on 4 and 342 DF, p-value: < 2.2204e-16
- ~~~
- Note these increases are on a log-dollars scale.
Advertisement
Add Comment
Please, Sign In to add comment
Advertisement