a guest
May 23rd, 2018
> summary(lm(y~., data=mydf))

Call:
lm(formula = y ~ ., data = mydf)

Residuals:
    Min      1Q  Median      3Q     Max
-73.111  -9.528  -0.897   8.907  78.653

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  107.20300    2.83286  37.843  < 2e-16
age           -0.87090    0.12356  -7.048 1.97e-12  # SIGNIFICANT
genderM       -6.34184    0.33625 -18.861  < 2e-16  # SIGNIFICANT
htcm          -0.05992    0.02657  -2.255  0.02415  # SIGNIFICANT
wtkg           0.01247    0.04037   0.309  0.75745
waistcm        0.08095    0.03434   2.358  0.01842  # SIGNIFICANT
cityP          1.18070    0.38454   3.070  0.00214  # SIGNIFICANT
seasonsummer   0.28349    0.66278   0.428  0.66886
seasonwinter  -1.25711    0.67247  -1.869  0.06161

Residual standard error: 14.32 on 7767 degrees of freedom
  (396 observations deleted due to missingness)
Multiple R-squared:  0.08514,  Adjusted R-squared:  0.08419
F-statistic: 90.35 on 8 and 7767 DF,  p-value: < 2.2e-16

> summary(aov(y~., data=mydf))
              Df  Sum Sq Mean Sq F value  Pr(>F)
age            1   68902   68902 335.992 < 2e-16  # SIGNIFICANT
gender         1   72243   72243 352.280 < 2e-16  # SIGNIFICANT
htcm           1     149     149   0.726 0.39409
wtkg           1    1592    1592   7.762 0.00535  # SIGNIFICANT
waistcm        1     767     767   3.738 0.05323
city           1     829     829   4.043 0.04440  # SIGNIFICANT
season         2    3742    1871   9.124 0.00011  # SIGNIFICANT
Residuals   7767 1592791     205
396 observations deleted due to missingness
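
# The lm and aov tables above flag different terms (htcm, wtkg, waistcm) because
# summary(lm(...)) tests each coefficient adjusted for all the others, while
# summary(aov(...)) uses sequential sums of squares, which depend on the order of
# the terms. A minimal sketch on simulated data (demo, x1 and x2 are hypothetical
# names, not columns of mydf) comparing the two kinds of test with drop1():

```r
# Hypothetical data: x1 and x2 are correlated, so term order matters for aov().
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- 0.8 * x1 + rnorm(n, sd = 0.6)
y  <- 1 + 0.5 * x2 + rnorm(n)
demo <- data.frame(y, x1, x2)

fit_demo <- lm(y ~ x1 + x2, data = demo)

# Sequential (Type I) F-tests: x1 is tested first, x2 after adjusting for x1.
seq_tab <- anova(fit_demo)

# Marginal F-tests: each term is tested after adjusting for all other terms.
marg_tab <- drop1(fit_demo, test = "F")

# For 1-df terms the marginal F value equals the squared t value from summary(lm).
t_vals <- coef(summary(fit_demo))[c("x1", "x2"), "t value"]

print(seq_tab)
print(marg_tab)
print(t_vals^2)
```

# Only the last term in the formula gets the same F value in both tables.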

> bestglm(mydf)
Morgan-Tatar search since factors present with more than 2 levels.
BIC
Best Model:
              Df  Sum Sq Mean Sq F value Pr(>F)
age            1   68902   68902   334.8 <2e-16  # SIGNIFICANT
gender         1   72243   72243   351.0 <2e-16  # SIGNIFICANT
Residuals   7773 1599869     206
396 observations deleted due to missingness

> library(randomForest)
> fit <- randomForest(y~., data=mydf, importance=TRUE)
> print(fit)

Call:
 randomForest(formula = y ~ ., data = mydf)
               Type of random forest: regression
                     Number of trees: 500
No. of variables tried at each split: 2

          Mean of squared residuals: 207.2199
                    % Var explained: 7.45

# FOLLOWING IS FROM fit$importance:

        IncNodePurity
htcm        219809.13
waistcm     196753.10
wtkg        181179.19
age         119446.90
gender       83154.71
season       42938.42
city         27040.10

          %IncMSE
htcm    72.663197
wtkg    68.040321
age     48.075415
waistcm 33.267517
gender  26.680004
season   5.932131
city     3.905936

var     importance

gender  55.4005861
waistcm 34.4082250
age     32.3720673
htcm    28.6817975
wtkg    26.7268140
season   8.0689392
city     7.9994742
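
# The two fit$importance columns above measure different things: %IncMSE comes
# from permuting each predictor on the out-of-bag data, IncNodePurity from the
# total decrease in residual sum of squares over the splits on that predictor.
# A minimal sketch, assuming the randomForest package is installed and using
# simulated data (demo, x1, x2 and noise are hypothetical names, not columns of
# mydf), of extracting each measure with importance():

```r
library(randomForest)

# Hypothetical data: y depends on x1 and x2; noise is unrelated to y.
set.seed(1)
n <- 300
demo <- data.frame(x1 = rnorm(n), x2 = rnorm(n), noise = rnorm(n))
demo$y <- 2 * demo$x1 + demo$x2 + rnorm(n)

rf <- randomForest(y ~ ., data = demo, importance = TRUE, ntree = 200)

perm_imp   <- importance(rf, type = 1)  # %IncMSE (permutation based)
purity_imp <- importance(rf, type = 2)  # IncNodePurity (decrease in node RSS)

# Rank the predictors under each measure; the two orderings need not agree.
print(rownames(perm_imp)[order(perm_imp[, 1], decreasing = TRUE)])
print(rownames(purity_imp)[order(purity_imp[, 1], decreasing = TRUE)])
```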