- ############################################################
- ## Script Description ##
- ############################################################
- ## My name: Simon Gustafsson
- ## I collaborate with: Olle Jönsson
- ## Simon.gustafsson1996@gmail.com
- ## 05/12/16
- ## The title of the work:
- ## The data and the packages needed to run your script etc.
- ## The version of the software and packages you use.
- ############################################################
- ############################################################
- ############################################################
- ## PREAMBLE ##
- ############################################################
- ## 1. Purpose
- ## 2. DATA
- ## 3. Variables used in your application
- ## 4. Data management
- ## 5. Descriptive statistics
- ## 6. Estimations
- ## 7. Interpretations of your results
- ## 8. References
- ############################################################
- ## Purpose ##
- ############################################################
- ## You should write what you are doing in this section.
- ## Please repeat the question from the assignment instead of
- ## referring to the assignment.
- ############################################################
- ## DATA ##
- ############################################################
- ## Source: Describe the source.
- ## Data collection: How is the data collected? Is it
- ## survey data? Is it administrative records?
- ## The sampling structure.
- ## Sample selection. If you for some good reason only use a
- ## part of the original data.
- install.packages("downloader") # Used to download
- install.packages("Ecdat")
- install.packages("lmtest")
- install.packages("sandwich")
- ## Importing data
- library(Ecdat)
- data(Wages)
- attach(Wages, warn.conflicts = FALSE)
- ## Checking for invalid entries in the data set:
- summary(is.na(Wages))
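- ## The missing-value check above can also be summarised per column.
- ## A minimal sketch in base R, assuming a hypothetical toy data frame
- ## (`toy`, not part of the Wages data) so that it runs on its own:

```r
# Hypothetical toy data standing in for Wages (values are made up)
toy <- data.frame(lwage = c(5.6, NA, 6.2),
                  ed    = c(9, 12, NA),
                  wks   = c(32, 43, 40))

# Number of missing values in each column
na.counts <- colSums(is.na(toy))
print(na.counts)

# TRUE if at least one value anywhere in the data frame is missing
any.missing <- any(is.na(toy))
print(any.missing)
```

- ## Applied to the real data, colSums(is.na(Wages)) would give one count
- ## per variable instead of a TRUE/FALSE table.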
- ############################################################
- ## Variables used in your application ##
- ############################################################
- ## Variable description including the definition of the
- ## variables, coding of missing values.
- ############################################################
- ## Data management ##
- ############################################################
- ## This section includes your variable transformations and
- ## recoding of the variables for your specific purpose.
- ## Recodes the variable industry from the values {0,1} to {1,2}
- ## and stores the result as Recoded.industry.
- ## The function changes every {0} to {1} and every {1} to {2}.
- ## Any invalid entry, that is a value that is neither {0} nor {1},
- ## is coded as {NA}.
- Recoded.industry <- ifelse(Wages$ind == 0, 1, ifelse(Wages$ind == 1, 2, NA))
- ## Comparing the old coding and the new one.
- ## Looking at key statistics
- summary(Wages$ind)
- summary(Recoded.industry)
- ## Checking that there are no invalid entries, in original and recoded form.
- summary(is.na(Wages$ind))
- summary(is.na(Recoded.industry))
- ## Testing whether the two standard deviations are equal. all.equal()
- ## tolerates floating-point rounding, which identical() does not.
- isTRUE(all.equal(sd(Wages$ind), sd(Recoded.industry)))
- ## As seen by the comparison above, nothing but the actual coding is changed
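- ## The recoding check above can also be done value by value with a
- ## cross-tabulation. A minimal sketch, assuming a hypothetical vector
- ## `ind.toy` in place of Wages$ind (it includes one NA on purpose):

```r
# Hypothetical 0/1 indicator with one missing entry (values are made up)
ind.toy <- c(0, 1, 1, 0, NA, 1)

# Same recoding rule as in the script: 0 -> 1, 1 -> 2, anything else -> NA
recoded.toy <- ifelse(ind.toy == 0, 1, ifelse(ind.toy == 1, 2, NA))

# Cross-tabulate old against new coding; every observation should fall
# in the cells (0,1), (1,2) or (NA,NA)
check <- table(original = ind.toy, recoded = recoded.toy, useNA = "ifany")
print(check)

# A shift by a constant leaves the spread unchanged (up to rounding)
same.spread <- isTRUE(all.equal(sd(ind.toy, na.rm = TRUE),
                                sd(recoded.toy, na.rm = TRUE)))
print(same.spread)
```

- ## Any count outside those cells would point to a recoding error.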
- ## Potential outliers
- ############################################################
- ## Descriptive statistics ##
- ############################################################
- ## Here you should present the descriptive statistics such
- ## as means, standard deviations, minimum, maximum of your
- ## variables. Sometimes graphic presentation of variables
- ## is very informative
- summary(Wages)
- plot(density(Wages$lwage)) # Not decided if
- plot(density(Wages$exp)) # these plots should
- plot(density(Wages$wks)) # be kept in the final
- plot(density(Wages$ed)) # form of the report.
- plot(density(scale(Wages$lwage))) #
- ## As we are interested in the distribution of the numerical
- ## variables, we compute the standard deviations.
- sd(Wages$lwage)
- ## Returns the st. dev. of the logarithm of wages.
- sd(Wages$exp)
- ## Returns the st. dev. of work experience.
- sd(Wages$wks)
- ## Returns the st. dev. of working weeks.
- sd(Wages$ed)
- ## Returns the st. dev. of education.
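- ## The four sd() calls above can be collapsed into one call with sapply().
- ## A minimal sketch, assuming a hypothetical toy data frame `toy` that
- ## only mimics the column names of Wages:

```r
# Hypothetical toy data (values are made up for illustration)
toy <- data.frame(lwage = c(5.6, 6.1, 6.2, 6.9),
                  exp   = c(3, 10, 21, 30),
                  wks   = c(32, 43, 40, 48),
                  ed    = c(9, 12, 12, 16))

# Standard deviation of every column, returned as a named vector
spread <- sapply(toy, sd)
print(round(spread, 3))
```

- ## On the real data the equivalent would be
- ## sapply(Wages[c("lwage", "exp", "wks", "ed")], sd).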
- ############################################################
- ## Estimations ##
- ############################################################
- ## Loading the packages used for coefficient t-tests (lmtest)
- ## and robust covariance matrices (sandwich).
- library(lmtest)
- library(sandwich)
- #################### Model 1 ###############################
- model1 <- lm(Wages$lwage ~ Wages$ed)
- plot(model1)
- # Basically just used for comparison
- coeftest(model1) # Regular coefficient printout with t-test
- coeftest(model1, vcov = vcovHC(model1)) # Coefficient printout with t-test and
- # heteroskedasticity-robust std.err. (vcovHC defaults to type "HC3";
- # type = "HC0" gives White's original estimator).
- ## Note that model 1 over-estimates the effect of education on wages
- ## compared with model 2 below. This indicates an upward bias in model 1,
- ## that is, a positive correlation between the error term and education.
- #################### Model 2 ###############################
- ## Constructing a linear model
- model2 <- lm(Wages$lwage ~ Wages$ed + Wages$wks # The actual model to be used,
- + Wages$exp + Wages$bluecol + Recoded.industry # with the variable industry recoded
- + Wages$south + Wages$smsa + Wages$married
- + Wages$sex + Wages$union + Wages$black)
- ## Coefficient testing
- coeftest(model2) # without robust standard errors, with the variable industry recoded.
- coeftest(model2, vcov = (vcovHC(model2))) # Robust std.err applied, with the variable industry recoded.
- anova(model2)
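- ## To show what the robust standard errors above correct for, here is a
- ## minimal sketch that computes White's HC0 sandwich estimator by hand in
- ## base R. The data are simulated (hypothetical names `ed.sim` and
- ## `lwage.sim`), and HC0 is used for simplicity; it is not the HC3
- ## default of vcovHC(), so the numbers would differ slightly.

```r
set.seed(1)
n <- 200

# Simulated regressor and outcome with error variance growing in ed.sim
ed.sim    <- runif(n, min = 4, max = 17)
lwage.sim <- 5 + 0.07 * ed.sim + rnorm(n, sd = 0.05 * ed.sim)

fit <- lm(lwage.sim ~ ed.sim)
X   <- model.matrix(fit)   # n x 2 design matrix (intercept, ed.sim)
e   <- residuals(fit)      # OLS residuals

# Sandwich: (X'X)^-1  X' diag(e^2) X  (X'X)^-1
bread    <- solve(crossprod(X))
meat     <- crossprod(X * e)   # each row of X scaled by its residual
vcov.hc0 <- bread %*% meat %*% bread

# Classical versus heteroskedasticity-robust standard errors
se.classical <- sqrt(diag(vcov(fit)))
se.robust    <- sqrt(diag(vcov.hc0))
print(cbind(classical = se.classical, robust = se.robust))
```

- ## With the sandwich package loaded, vcovHC(fit, type = "HC0") should
- ## give the same matrix as vcov.hc0.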
- ############################################################
- ## Interpretations of your results ##
- ############################################################
- ## Interpretations of the results you obtain from your
- ## estimations.
- ############################################################
- ## References ##
- ############################################################
- ## Here you list your references. This is an example
- # "@Unpublished{KoenkerZeileis,
- # author = {Koenker, Roger and Zeileis, Achim},
- # year = {2007},
- # title = {Reproducible Econometric Research
- # (A Critical Review of the State of the
- # Art)},
- # note = {Report 60, Department of Statistics and
- # Mathematics, Wirtschaftsuniversität Wien,
- # Research Report Series},
- # url =
- # {http://www.econ.uiuc.edu/~roger/research/repro/}"