Untitled

a guest
Apr 23rd, 2014
> head(df)
               Empst Gender Age Agegroup   Marst                         Education State Year Month
1           Employed Female  58    50-60 Married  Some college or associate degree    AL 2008    12
2 Not in labor force   Male  63      61+ Married   Less than a high school diploma    AL 2008    12
3           Employed   Male  60    50-60  Single  Some college or associate degree    AL 2008    12
4 Not in labor force   Male  55    50-60  Single High school graduates, no college    AL 2008    12
5           Employed   Male  36    30-39  Single  Some college or associate degree    AL 2008    12
6           Employed Female  42    40-49 Married       Bachelor's degree or higher    AL 2008    12
  YYYYMM   Weight
1 200812 1876.356
2 200812 2630.503
3 200812 2763.981
4 200812 2693.110
5 200812 2905.784
6 200812 3511.313

sum(df[df$Empst=="Unemployed",]$Weight) /
  sum(df[df$Empst %in% c("Employed","Unemployed"),]$Weight)

UnR<-vector()
for(i in levels(factor(df$YYYYMM))){
  temp<-sum(df[df$Empst=="Unemployed" & df$YYYYMM == i,]$Weight) /
    sum(df[df$Empst %in% c("Employed","Unemployed") & df$YYYYMM == i,]$Weight)
  UnR<-append(UnR,temp)
  rm(temp)
}

                Empst             Gender           Age         Agegroup           Marst
 Not in universe   :  11423   Male  :1266475   Min.   :16.00   16-19:187734   Married:1441114
 Employed          :1600882   Female:1377638   1st Qu.:31.00   20-29:422699   Married:      0
 Unemployed        : 132344                    Median :45.00   30-39:431298   Single :1202999
 Not in labor force: 899464                    Mean   :45.81   40-49:490533   Single :      0
                                               3rd Qu.:59.00   50-60:518633   Single :      0
                                               Max.   :85.00   61+  :593216   Single :      0

                              Education          State             Year          Month
 Less than a high school diploma  :418636   CA     : 221244   Min.   :2008   Min.   : 1.000
 High school graduates, no college:802141   TX     : 132650   1st Qu.:2008   1st Qu.: 4.000
 Some college or associate degree :719492   NY     : 114282   Median :2009   Median : 6.000
 Bachelor's degree or higher      :703844   FL     : 106116   Mean   :2009   Mean   : 6.385
                                            PA     :  82482   3rd Qu.:2009   3rd Qu.: 9.000
                                            IL     :  80816   Max.   :2010   Max.   :12.000
                                            (Other):1906523
     YYYYMM           Weight
 Min.   :200804   Min.   :    0
 1st Qu.:200810   1st Qu.: 1176
 Median :200904   Median : 2496
 Mean   :200887   Mean   : 2226
 3rd Qu.:200910   3rd Qu.: 3139
 Max.   :201004   Max.   :16822

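library(plyr)  # provides ddply
unemployment_rate.df <- ddply(.data = df,
                              .variables = "YYYYMM",
                              .fun = function(x){
                                # Column names and factor levels are case-sensitive: Weight, "Unemployed".
                                # Denominator is the labor force (Employed + Unemployed), as in the ratio at the top.
                                return(sum(x$Weight[x$Empst == "Unemployed"]) /
                                         sum(x$Weight[x$Empst %in% c("Employed", "Unemployed")]))
                              })
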
YYYYMM   V1
200812 0.13
200901 0.1
200902 0.43

#Create an output vector. We can specify its length, because we know there'll
#be one entry for each unique value in the YYYYMM column.
#That saves time because R can fill the vector in place instead of growing it.
months <- sort(unique(df$YYYYMM))
UnR <- numeric(length(months))

#And now, the for loop.
for(i in seq_along(months)){

  #Instead of creating a temporary object (which takes time), and then appending
  #(which takes time), we can just assign the result to the i-th element of the
  #output vector. The index has to be a position: indexing with the month value
  #itself (e.g. UnR["200812"]) would add new named elements instead of filling
  #the preallocated slots.
  in_month <- df$YYYYMM == months[i]
  UnR[i] <- sum(df$Weight[df$Empst=="Unemployed" & in_month]) /
    sum(df$Weight[df$Empst %in% c("Employed","Unemployed") & in_month])
}

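library(dplyr)
#%>% replaces the older %.% chaining operator, which current dplyr no longer provides.
#The numerator must be the unemployed weight; using "Employed" would give the employment rate.
df %>%
  group_by(YYYYMM) %>%
  summarize(UnR = sum(Weight[Empst == "Unemployed"]) /
              sum(Weight[Empst %in% c("Employed", "Unemployed")]))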
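
For comparison, the same per-month ratio can be sketched in base R with tapply, with no loop and no extra packages. This is only an illustration under stated assumptions: the `toy` data frame and its values are made up so that the snippet runs on its own, and only the column names (Empst, YYYYMM, Weight) are taken from df above.

```r
# Hypothetical toy data: same column names as df, invented values.
toy <- data.frame(
  Empst  = c("Employed", "Unemployed", "Not in labor force",
             "Employed", "Employed", "Unemployed"),
  YYYYMM = c(200812, 200812, 200812, 200901, 200901, 200901),
  Weight = c(300, 100, 500, 450, 450, 100)
)

# Logical masks; multiplying by Weight zeroes out rows that should not count.
is_unemployed  <- toy$Empst == "Unemployed"
in_labor_force <- toy$Empst %in% c("Employed", "Unemployed")

# Both tapply calls group over the same YYYYMM values, so the months line up.
UnR_toy <- tapply(toy$Weight * is_unemployed, toy$YYYYMM, sum) /
  tapply(toy$Weight * in_labor_force, toy$YYYYMM, sum)

print(UnR_toy)
# 200812: 100 / (300 + 100) = 0.25
# 200901: 100 / (450 + 450 + 100) = 0.1
```

Grouping both sums over every row, rather than subsetting first, keeps a month in the result even when it has no unemployed respondents (its rate is then 0 instead of missing).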