- > head(df)
- Empst Gender Age Agegroup Marst Education State Year Month
- 1 Employed Female 58 50-60 Married Some college or associate degree AL 2008 12
- 2 Not in labor force Male 63 61+ Married Less than a high school diploma AL 2008 12
- 3 Employed Male 60 50-60 Single Some college or associate degree AL 2008 12
- 4 Not in labor force Male 55 50-60 Single High school graduates, no college AL 2008 12
- 5 Employed Male 36 30-39 Single Some college or associate degree AL 2008 12
- 6 Employed Female 42 40-49 Married Bachelor's degree or higher AL 2008 12
- YYYYMM Weight
- 1 200812 1876.356
- 2 200812 2630.503
- 3 200812 2763.981
- 4 200812 2693.110
- 5 200812 2905.784
- 6 200812 3511.313
- sum(df[df$Empst=="Unemployed",]$Weight) /
- sum(df[df$Empst %in% c("Employed","Unemployed"),]$Weight)
- UnR<-vector()
- for(i in levels(factor(df$YYYYMM))){
- temp<-sum(df[df$Empst=="Unemployed" & df$YYYYMM == i,]$Weight) /
- sum(df[df$Empst %in% c("Employed","Unemployed") & df$YYYYMM == i,]$Weight)
- UnR<-append(UnR,temp)
- rm(temp)
- }
- Empst Gender Age Agegroup Marst
- Not in universe : 11423 Male :1266475 Min. :16.00 16-19:187734 Married:1441114
- Employed :1600882 Female:1377638 1st Qu.:31.00 20-29:422699 Married: 0
- Unemployed : 132344 Median :45.00 30-39:431298 Single :1202999
- Not in labor force: 899464 Mean :45.81 40-49:490533 Single : 0
- 3rd Qu.:59.00 50-60:518633 Single : 0
- Max. :85.00 61+ :593216 Single : 0
- Education State Year Month
- Less than a high school diploma :418636 CA : 221244 Min. :2008 Min. : 1.000
- High school graduates, no college:802141 TX : 132650 1st Qu.:2008 1st Qu.: 4.000
- Some college or associate degree :719492 NY : 114282 Median :2009 Median : 6.000
- Bachelor's degree or higher :703844 FL : 106116 Mean :2009 Mean : 6.385
- PA : 82482 3rd Qu.:2009 3rd Qu.: 9.000
- IL : 80816 Max. :2010 Max. :12.000
- (Other):1906523
- YYYYMM Weight
- Min. :200804 Min. : 0
- 1st Qu.:200810 1st Qu.: 1176
- Median :200904 Median : 2496
- Mean :200887 Mean : 2226
- 3rd Qu.:200910 3rd Qu.: 3139
- Max. :201004 Max. :16822
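- library(plyr)
- unemployment_rate.df <- ddply(.data = df,
-   .variables = "YYYYMM",
-   .fun = function(x){
-     #Weighted unemployed total divided by the weighted labor force (employed + unemployed).
-     return(sum(x$Weight[x$Empst == "Unemployed"]) /
-            sum(x$Weight[x$Empst %in% c("Employed", "Unemployed")]))
-   })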
- YYYYMM V1
- 200812 0.13
- 200901 0.1
- 200902 0.43
- #Create an output vector. We can specify its length, because we know there'll
- #be one entry for each unique value in the YYYYMM column.
- #That saves time because R fills the preallocated vector instead of growing it.
- months <- sort(unique(df$YYYYMM))
- UnR <- numeric(length(months))
- #And now, the for loop. We loop over positions, not over the month values themselves,
- #so that UnR[i] refers to an existing slot of the output vector.
- for(i in seq_along(months)){
- #Instead of creating a temporary object (which takes time), and then appending
- #(which takes time), we can just assign the result to the ith element of the
- #output vector.
-   UnR[i]<-sum(df[df$Empst=="Unemployed" & df$YYYYMM == months[i],]$Weight) /
-     sum(df[df$Empst %in% c("Employed","Unemployed") & df$YYYYMM == months[i],]$Weight)
- }
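The per-month calculation can also be done without an explicit loop. The following is a minimal base R sketch, assuming the same column names as above (Empst, YYYYMM, Weight); the `toy` data frame and its values are made up for illustration and are not taken from the survey data.

```r
# Toy data (hypothetical values, for illustration only)
toy <- data.frame(
  Empst  = c("Employed", "Unemployed", "Not in labor force",
             "Employed", "Unemployed", "Employed"),
  YYYYMM = c(200812, 200812, 200812, 200901, 200901, 200901),
  Weight = c(300, 100, 500, 150, 50, 300)
)

# Keep only the labor force (employed + unemployed)
lf <- toy[toy$Empst %in% c("Employed", "Unemployed"), ]

# Weighted unemployed total per month; the logical comparison zeroes out
# the weights of employed respondents
unemp_by_month <- tapply(lf$Weight * (lf$Empst == "Unemployed"), lf$YYYYMM, sum)

# Weighted labor force total per month
lf_by_month <- tapply(lf$Weight, lf$YYYYMM, sum)

# Unemployment rate per month, named by YYYYMM
# 200812: 100 / 400 = 0.25; 200901: 50 / 500 = 0.10
UnR_toy <- unemp_by_month / lf_by_month
```

Because `tapply` groups by month in a single pass, no output vector has to be preallocated, and the result carries the month labels as names.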
- library(dplyr)
- df %>%
-   group_by(YYYYMM) %>%
-   summarize(UnR = sum(Weight[Empst == "Unemployed"]) /
-                   sum(Weight[Empst %in% c("Employed", "Unemployed")]))