---
title: "Course5Project1"
author: "Nevon"
date: "June 12, 2018"
output:
  html_document:
    keep_md: yes
---

```{r setup, include=TRUE}
knitr::opts_chunk$set(echo = TRUE)
```

1. First we load the walking activity data.
```{r}
## Load data
fitness <- read.csv("activity.csv")
df <- data.frame(fitness)
```
2. Next we calculate the base measures: the total steps per day, and the mean and median of those totals.
```{r}
## Calculations / metrics
stepsbyday <- aggregate(steps ~ date, data = df, sum)  ## total steps per day
avgsteps <- mean(stepsbyday$steps)       ## mean of the daily totals
mediansteps <- median(stepsbyday$steps)  ## median of the daily totals (named so it does not mask median())
```
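One detail worth keeping in mind for the daily totals: the formula interface of `aggregate()` omits rows with a missing `steps` value before summing, so a day with no recorded values drops out of the result instead of appearing as zero. A minimal sketch on made-up data (the `toy` data frame and its values are illustrative, not taken from `activity.csv`):

```r
# Toy data: 2018-01-02 has no recorded steps at all (hypothetical values)
toy <- data.frame(
  date  = rep(c("2018-01-01", "2018-01-02", "2018-01-03"), each = 2),
  steps = c(10, 20, NA, NA, 5, NA)
)

# Formula interface: NA rows are omitted, so 2018-01-02 is dropped entirely
toy_totals <- aggregate(steps ~ date, data = toy, sum)
print(toy_totals)

# tapply with na.rm = TRUE keeps every day; an all-NA day sums to 0
toy_totals_all <- tapply(toy$steps, toy$date, sum, na.rm = TRUE)
print(toy_totals_all)
```

Dropping the empty days keeps them from pulling the mean and median toward zero; whether that is the desired treatment depends on the question being asked.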
3. Following the base measures, we plot a histogram of the total steps per day and mark the mean and median.
```{r}
## Plot histogram & report figures
hist(stepsbyday$steps, xlab = "Number of Steps Per Day", main = "Total Steps Per Day", breaks = 4, col = "royalblue")
## Add metrics (the median line is drawn thicker so both lines stay visible if they nearly coincide)
abline(v = median(stepsbyday$steps), col = "red", lwd = 10)
abline(v = mean(stepsbyday$steps), col = "yellow", lwd = 2)
legend(x = "topright", legend = c("Median", "Mean"), col = c("red", "yellow"), lwd = c(10, 2))
```
4. Afterwards we look at the average steps per interval.
4a. We first remove the NAs by creating a new data set, then plot the averages and find the interval with the highest average.
```{r}
## Calculate steps by interval
library(ggplot2)
Intervals <- df[!is.na(df$steps), ]  ## remove NAs
intrv <- aggregate(steps ~ interval, data = Intervals, mean)
## Create plot (column names are used directly inside aes())
g <- ggplot(intrv, aes(x = interval, y = steps))
g + geom_line() + xlab("Intervals") + ylab("Avg Steps") + ggtitle("Avg Number of Steps by Interval")
## Find the interval with the maximum average steps
maxinterval <- intrv[which.max(intrv$steps), ]
print(maxinterval)
```
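To read off the interval with the highest average, `which.max()` returns the row index of the largest value in a column, and that index can then select the whole row. A minimal sketch with hypothetical numbers (`toy_intrv` is illustrative only):

```r
# Hypothetical per-interval averages
toy_intrv <- data.frame(
  interval = c(0, 5, 10, 15),
  steps    = c(1.5, 7.2, 3.0, 0.4)
)

# Row index of the largest average, then the full row for that index
peak_row <- which.max(toy_intrv$steps)
toy_peak <- toy_intrv[peak_row, ]
print(toy_peak)
```

Calling `max()` on the whole data frame would instead return the largest number found in any column, which here would be an interval label rather than a step count.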
5. Then we count the missing values, i.e. how many NAs the data contains.
5a. We also replace each missing value with the average number of steps for its interval, computed from the non-missing rows, to create a clean data set.
5b. Furthermore, we create a new data set that merges the original data set with the filled-in values.
5c. Lastly, we plot the new set and find the new measures.

```{r}
## Count missing values
Missing_Val <- sum(is.na(df))
print(Missing_Val)
## Substitute NAs with the average steps for the same interval
dfvalues <- df  ## start from the full data so the NA rows are still present
avgsteps_day <- tapply(Intervals$steps, Intervals$interval, mean, na.rm = TRUE, simplify = TRUE)  ## per-interval means
NAdata <- is.na(dfvalues$steps)
dfvalues$steps[NAdata] <- avgsteps_day[as.character(dfvalues$interval[NAdata])]

newstepstotal <- tapply(dfvalues$steps, dfvalues$date, sum, na.rm = TRUE, simplify = TRUE)  ## daily totals from the filled-in data
newstepstotal <- newstepstotal[!is.na(newstepstotal)]
## Plot new histogram & find new metrics
hist(x = newstepstotal,
     col = "royalblue",
     breaks = 10,
     xlab = "Daily Steps",
     main = "Total Steps Per Day (NAs Replaced)")

## Find metrics of newstepstotal
summary(newstepstotal)
```
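The fill-in step can be checked on a small example. The sketch below uses made-up data (`toy_act` and its values are hypothetical) and follows the same pattern: compute the mean per interval, then look up that mean for each missing row by interval label.

```r
# Toy data: two days, three intervals, two missing values (hypothetical)
toy_act <- data.frame(
  date     = rep(c("2018-01-01", "2018-01-02"), each = 3),
  interval = rep(c(0, 5, 10), times = 2),
  steps    = c(4, NA, 6, 2, 8, NA)
)

# Mean steps per interval, ignoring missing values
toy_means <- tapply(toy_act$steps, toy_act$interval, mean, na.rm = TRUE)

# Look up the interval mean for each missing row and fill it in
toy_filled <- toy_act
toy_missing <- is.na(toy_filled$steps)
toy_filled$steps[toy_missing] <-
  toy_means[as.character(toy_filled$interval[toy_missing])]

print(toy_filled)
```

The fill has to start from a data frame that still contains the `NA` rows; starting from a copy with them already removed leaves nothing to replace.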
6. The last part of our analysis explores the differences between weekends and weekdays.
6a. First, create a new variable for weekends/weekdays.
6b. Next, find the average steps per interval for each day type.
6c. Lastly, plot the data.
```{r}
## Segment data into weekdays / weekends
## (weekdays() returns locale-dependent names; this assumes English day names)
wd <- !(weekdays(as.Date(df$date)) %in% c("Saturday", "Sunday"))
wknd <- ifelse(wd, "Weekday", "Weekend")
df[, "dayType"] <- factor(wknd)  ## new dayType variable

wk_df <- aggregate(steps ~ dayType + interval, data = df, mean)  ## average steps per day type and interval
library(lattice)
xyplot(steps ~ interval | factor(dayType),
       layout = c(1, 2),
       xlab = "Interval",
       ylab = "Number of Steps",
       type = "l",
       lty = 1,
       data = wk_df)
```
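`weekdays()` returns day names in the current locale, so comparing against "Saturday" and "Sunday" only works where day names are in English. A locale-independent sketch, assuming the platform's `format()` supports the `%u` conversion (ISO weekday number); the dates and `toy_` names are illustrative:

```r
# Friday through Monday (illustrative dates)
toy_dates <- as.Date(c("2018-06-15", "2018-06-16", "2018-06-17", "2018-06-18"))

# "%u" gives the ISO weekday number as text: "1" (Monday) through "7" (Sunday)
toy_daynum <- as.integer(format(toy_dates, "%u"))

# 6 and 7 are Saturday and Sunday
toy_daytype <- factor(ifelse(toy_daynum >= 6, "Weekend", "Weekday"))
print(data.frame(date = toy_dates, dayType = toy_daytype))
```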