Term Paper

Dec 8th, 2016
############################################################
## Script Description ##
############################################################
## My name: Simon Gustafsson
## I collaborate with: Olle Jönsson
## Simon.gustafsson1996@gmail.com
## 05/12/16
## The title of the work:
## The data and the packages needed to run your script etc.
## The version of the software and packages you use.
############################################################
############################################################
## PREAMBLE ##
############################################################
## 1. Purpose
## 2. DATA
## 3. Variables used in your application
## 4. Data management
## 5. Descriptive statistics
## 6. Estimations
## 7. Interpretations of your results
## 8. References
############################################################
## Purpose ##
############################################################
## You should write what you are doing in this section.
## Please repeat the question in the assignment instead of
## referring to the assignment.
############################################################
## DATA ##
############################################################
## Source: Describe the source.
## Data collection: How is the data collected? Is it
## survey data? Is it administrative records?
## The sampling structure.
## Sample selection. If you for some good reason only use a
## part of the original data.
install.packages("downloader") # Used to download
install.packages("Ecdat")
install.packages("lmtest")
install.packages("sandwich")

## Importing data
library(Ecdat)
data(Wages)
attach(Wages, warn.conflicts = FALSE)

## Checking for invalid entries in the data set:
summary(is.na(Wages))
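## A possible complement to the check above (a sketch, not part of the
## assignment template): colSums() returns the number of missing values
## per variable, which is quicker to scan than the logical summary.
colSums(is.na(Wages))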
############################################################
## Variables used in your application ##
############################################################
## Variable description including the definition of the
## variables, coding of missing values.
############################################################
## Data management ##
############################################################
## This section includes your variable transformations and
## recoding of the variables for your specific purpose.

## Recodes the variable industry from having values {0,1} -> {1,2}
## and stores the values as Recoded.industry.

## The actual function changes every value of {0} to {1} and {1} to {2}.
## In case there are any invalid entries, that is, neither {0} nor {1},
## they will be coded as {NA}.
Recoded.industry <- ifelse(Wages$ind == 0, 1, ifelse(Wages$ind == 1, 2, NA))

## Comparing the old coding and the new one.
## Looking at key statistics
summary(Wages$ind)
summary(Recoded.industry)

## Checking that there are no invalid entries, in original and recoded form.
summary(is.na(Wages$ind))
summary(is.na(Recoded.industry))

## Logical test of whether the two standard deviations are equal.
## all.equal() is used instead of identical() because the two values
## can differ by floating-point rounding.
isTRUE(all.equal(sd(Wages$ind), sd(Recoded.industry)))

## As seen by the comparison above, nothing but the actual coding is changed

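## One further way to verify the recoding (a sketch; it assumes that
## Wages$ind only takes the values 0 and 1, as the recoding does):
## a cross-tabulation of the old coding against the new one. All
## observations should fall in the cells (0, 1) and (1, 2), and no NA
## row or column should appear.
table(original = Wages$ind, recoded = Recoded.industry, useNA = "ifany")
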
## Potential outliers
############################################################
## Descriptive statistics ##
############################################################
## Here you should present the descriptive statistics such
## as means, standard deviations, minimum, maximum of your
## variables. Sometimes graphic presentation of variables
## is very informative
summary(Wages)

plot(density(Wages$lwage)) # Not decided if
plot(density(Wages$exp)) # these plots should
plot(density(Wages$wks)) # be kept in the final
plot(density(Wages$ed)) # form of the report.
plot(density(scale(Wages$lwage)))

## As we are interested in the distribution of the numerical
## variables, we compute the standard deviations.
sd(Wages$lwage)
## Returns the st. dev. of the logarithm of wages.

sd(Wages$exp)
## Returns the st. dev. of work experience.

sd(Wages$wks)
## Returns the st. dev. of working weeks.

sd(Wages$ed)
## Returns the st. dev. of education.
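
## The statistics asked for in this section can also be collected in one
## table. This is only a sketch: the helper `describe` and the vector
## `num.vars` are our own names and do not come from any package.
num.vars <- c("lwage", "exp", "wks", "ed")
describe <- function(x) c(mean = mean(x), sd = sd(x), min = min(x), max = max(x))
## One row per variable, one column per statistic.
round(t(sapply(Wages[num.vars], describe)), 3)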
############################################################
## Estimations ##
############################################################
## Loading the libraries needed for t-tests with robust standard errors.
library(lmtest)
library(sandwich)
#################### Model 1 ###############################

model1 <- lm(lwage ~ ed, data = Wages)
plot(model1)
# Basically just used for comparison

coeftest(model1) # Regular coefficient printout with t-test
coeftest(model1, vcov = vcovHC(model1)) # Coefficient printout with t-test,
# heteroskedasticity-robust std. errors applied (vcovHC defaults to
# type "HC3"; use type = "HC0" for White's original estimator).
## The robust option changes only the standard errors and t-statistics;
## the coefficient estimates are the same in both printouts.
## Notice the over-estimation of the effect of education on wages
## in Model 1 compared to Model 2 below. This indicates a positive
## bias (a positive correlation between the error and education).

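## A side-by-side view of the conventional and the robust standard errors
## for Model 1 (a sketch; the object names se.conv and se.rob are our own).
## The estimate column is common to both; only the standard errors differ.
se.conv <- sqrt(diag(vcov(model1)))   # conventional OLS standard errors
se.rob <- sqrt(diag(vcovHC(model1)))  # robust standard errors (vcovHC default type)
cbind(estimate = coef(model1), se.conv, se.rob)
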
#################### Model 2 ###############################
## Constructing a linear model
model2 <- lm(lwage ~ ed + wks +                   # The actual model to be used,
               exp + bluecol + Recoded.industry + # with the variable industry recoded
               south + smsa + married +
               sex + union + black,
             data = Wages)

## Coefficient testing
coeftest(model2) # without robust standard errors, with the variable industry recoded.
coeftest(model2, vcov = vcovHC(model2)) # Robust std.err applied, with the variable industry recoded.

anova(model2)

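## Whether robust standard errors are warranted can also be checked
## formally. A sketch using the Breusch-Pagan test from lmtest (bptest):
## a small p-value rejects homoskedasticity, which would support relying
## on the robust printout. This test is our addition and is not required
## by the assignment template.
bptest(model2)
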
############################################################
## Interpretations of your results ##
############################################################
## Interpretations of the results you obtain from your
## estimations.
############################################################
## References ##
############################################################
## Here you list your references. This is an example
# "@Unpublished{KoenkerZeileis,
# author = {Koenker, Roger and Zeileis, Achim},
# year = {2007},
# title = {Reproducible Econometric Research
# (A Critical Review of the State of the
# Art)},
# note = {Report 60, Department of Statistics and
# Mathematics, Wirtschaftsuniversität Wien,
# Research Report Series},
# url = {http://www.econ.uiuc.edu/~roger/research/repro/}
# }"