["~#iM",["preFetchedData",["^0",["course",["^0",["status","SUCCESS","data",["^ ","id",5977,"title","Human Resources Analytics in R: Exploring Employee Data","description","HR analytics, people analytics, workforce analytics -- whatever you call it, businesses are increasingly counting on their human resources departments to answer questions, provide insights, and make recommendations using data about their employees. In this course, you'll learn how to manipulate, visualize, and perform statistical tests on HR data through a series of HR analytics case studies.","short_description","Manipulate, visualize, and perform statistical tests on HR data. ","author_field",null,"author_bio",null,"author_image","placeholder.png","nb_of_subscriptions",4157,"slug","human-resources-analytics-in-r-exploring-employee-data","image_url","https://assets.datacamp.com/production/course_5977/shields/thumb/shield_image_course_5977_20181024-15-yuf5oq?1540391108","image_thumbnail_url","https://assets.datacamp.com/production/course_5977/shields/thumb_home/shield_image_course_5977_20181024-15-yuf5oq?1540391108","last_updated_on","24/10/2018","link","https://www.datacamp.com/courses/human-resources-analytics-in-r-exploring-employee-data","should_cache",true,"type","datacamp","difficulty_level",1,"state","live","university",null,"sharing_links",["^ ","twitter","http://bit.ly/1eWTMJh","facebook","http://bit.ly/1iS42Do"],"programming_language","r","paid",true,"time_needed",null,"xp",4750,"topic_id",12,"reduced_outline",null,"runtime_config",null,"lti_only",false,"chapters",[["^ ","id",19221,"title_meta",null,"^1","Identifying the best recruiting source","^2","In this chapter, you will get an introduction to how data science is used in a human resources context. Then you will dive into a case study where you'll analyze and visualize recruiting data to determine which source of new candidates ultimately produces the best new hires. 
The dataset you'll use in this and the other chapters in this course is synthetic, to maintain the privacy of actual employees.","number",1,"^8","identifying-the-best-recruiting-source","nb_exercises",11,"badge_completed_url","https://assets.datacamp.com/production/default/badges/missing.png","badge_uncompleted_url","https://assets.datacamp.com/production/default/badges/missing_unc.png","^;","24/10/2018","slides_link","https://s3.amazonaws.com/assets.datacamp.com/production/course_5977/slides/chapter1.pdf","free_preview",true,"xp",850],["^ ","id",19222,"^M",null,"^1","What is driving low employee engagement?","^2","Gallup defines engaged employees as those who are involved in, enthusiastic about and committed to their work and workplace.  There is disagreement about the strength of the connection between employee engagement and business outcomes, but the idea is that employees that are more engaged will be more productive and stay with the organization longer. In this chapter, you'll  look into potential reasons that one department's engagement scores are lower than the rest.","^N",2,"^8","what-is-driving-low-employee-engagement","^O",12,"^P","https://assets.datacamp.com/production/default/badges/missing.png","^Q","https://assets.datacamp.com/production/default/badges/missing_unc.png","^;","24/10/2018","^R","https://s3.amazonaws.com/assets.datacamp.com/production/course_5977/slides/chapter2.pdf","^S",null,"xp",950],["^ ","id",19223,"^M",null,"^1","Are new hires getting paid too much?","^2","When employers make a new hire, they must determine what the new employee will be paid. If the employer is not careful, the new hires can come in with a higher salary than the employees that currently work at the same job, which can cause  employee turnover and dissatisfaction. 
In this chapter, you will check whether new hires are really getting paid more than current employees, and learn how to double-check your initial observations.","^N",3,"^8","are-new-hires-getting-paid-too-much","^O",12,"^P","https://assets.datacamp.com/production/default/badges/missing.png","^Q","https://assets.datacamp.com/production/default/badges/missing_unc.png","^;",
"24/10/2018","^R","https://s3.amazonaws.com/assets.datacamp.com/production/course_5977/slides/chapter3.pdf","^S",null,"xp",950],["^ ","id",19224,"^M",null,"^1","Are performance ratings being given consistently?","^2","Performance management helps an organization keep track of which employees are providing extra value, or below-average value, and compensating them accordingly. Whether performance is a rating or the result of a questionnaire, whether employees are rated each year or more often than that, the process is somewhat subjective. An organization should check that ratings are being given with regard to performance, and not individual managers' preferences, or even biases (conscious or subconscious).","^N",4,"^8","are-performance-ratings-being-given-consistently","^O",12,"^P","https://assets.datacamp.com/production/default/badges/missing.png","^Q","https://assets.datacamp.com/production/default/badges/missing_unc.png","^;","24/10/2018","^R","https://s3.amazonaws.com/assets.datacamp.com/production/course_5977/slides/chapter4.pdf","^S",null,"xp",950],["^ ","id",19225,"^M",null,"^1","Improving employee safety with data","^2","In many industries, workplace safety is a critical consideration. Maintaining a safe workplace provides employees with confidence and reduces costs for workers' compensation and legal liabilities. 
In this chapter, you'll look for  explanations for an increase in workplace accidents.","^N",5,"^8","improving-employee-safety-with-data","^O",13,"^P","https://assets.datacamp.com/production/default/badges/missing.png","^Q","https://assets.datacamp.com/production/default/badges/missing_unc.png","^;","24/10/2018","^R","https://s3.amazonaws.com/assets.datacamp.com/production/course_5977/slides/chapter5.pdf","^S",null,"xp",1050]]]]],"chapter",["^0",["status","SUCCESS","data",["^ ","id",19221,"^M",null,"^1","Identifying the best recruiting source","^2","In this chapter, you will get an introduction to how data science is used in a human resources context. Then you will dive into a case study where you'll analyze and visualize recruiting data to determine which source of new candidates ultimately produces the best new hires. The dataset you'll use in this and the other chapters in this course is synthetic, to maintain the privacy of actual employees.","^N",1,"^8","identifying-the-best-recruiting-source","^O",11,"^P","https://assets.datacamp.com/production/default/badges/missing.png","^Q","https://assets.datacamp.com/production/default/badges/missing_unc.png","^;","24/10/2018","^R","https://s3.amazonaws.com/assets.datacamp.com/production/course_5977/slides/chapter1.pdf","^S",true,"xp",850]]],"exercises",["^0",["status","SUCCESS","data",[["^ ","id",222104,"^>","VideoExercise","assignment",null,"^1","Welcome to the course!","sample_code","","instructions",null,"^N",1,"sct","","pre_exercise_code","","solution","","hint",null,"attachments",null,"xp",50,"possible_answers",[],"feedbacks",[],"question","","video_link","//player.vimeo.com/video/154783078","video_hls","//videos.datacamp.com/transcoded/000_placeholders/v1/hls-temp.master.m3u8","aspect_ratio",56.25,"projector_key","course_5977_1f1b78a54071ef29773cb5e0c0aa8a3c","language","r","randomNumber",0.6250374303150088],["^ ","id",222105,"^>","PureMultipleChoiceExercise","^T","<p>Based on the video, which of the following 
would not be considered an application of HR analytics?</p>","^1","Applications of human resources (HR) analytics","^U","","^V",null,"^N",2,"sct","","^W","","^X","","^Y","<p>The thrust of human resources analytics is using data about a company's workforce to create value. Which of these examples don't use data about an organization's workforce?</p>","^Z",null,"xp",50,"^[",["Identifying drivers of employee attrition","[Determining which product to produce next]","Testing whether employee promotion rates are equal for all demographics","Reducing accidents in a workplace"],"^10",["No, understanding why employees leave is a great example of HR analytics.","That's right. Choosing a product direction is important, but it would not be considered HR
analytics.","Try again. HR analytics includes understanding and improving employee fairness and diversity.","Incorrect. Employee safety is another example of an HR analytics problem."],"^11","","^16","r","^17",0.640631600688103],["^ ","id",222106,"^>","NormalExercise","^T","<p>Real HR datasets are tough to find because of privacy and ethical concerns about sharing sensitive employee data. The dataset you'll be using throughout this course is a synthetic one produced by <a href=https://www.ibm.com/communities/analytics/watson-analytics-blog/hr-employee-attrition/>IBM</a>, and modified for learning purposes. </p>\\n<p>In this chapter, you'll be focusing on the sales department and the recruiting channels they were hired from.</p>","^1","Looking at the recruiting data","^U","# Load the readr package\\n___\\n\\n# Import the recruitment data\\nrecruitment <- read_csv(recruitment_data.csv)\\n\\n# Look at the first few rows of the dataset\\n___","^V","<ul>\\n<li>Load the <code>readr</code> package so you can use <code>read_csv()</code>.</li>\\n<li>Look at the first rows of <code>recruitment</code> with <code>head()</code>. </li>\\n</ul>","^N",3,"sct","\\ntest_library_function(readr)\\n\\nex() %>% {\\n    check_object(., recruitment) %>% check_equal(incorrect_msg = 'Do not modify the code that imports `recruitment_data.csv` into `recruitment`.', append = FALSE)\\n    check_output_expr(., head(recruitment), missing_msg = Did you call `head()` on `recruitment`?, append = FALSE)\\n    check_error(.)\\n}\\n\\nsuccess_msg(Excellent. 
Time to learn how to think about analyzing this data.)","^W","download.file(http://s3.amazonaws.com/assets.datacamp.com/production/course_5977/datasets/recruitment_data.csv, destfile = recruitment_data.csv)","^X","# Load the readr package\\nlibrary(readr)\\n\\n# Import the recruitment data\\nrecruitment <- read_csv(recruitment_data.csv)\\n\\n# Look at the first few rows of the dataset\\nhead(recruitment)","^Y","<ul>\\n<li>Use <code>library()</code> to load external packages. </li>\\n<li>The <code>head()</code> function only requires a single argument, which is the data frame you want to examine.</li>\\n</ul>","^Z",null,"xp",100,"^[",[],"^10",[],"^11","","^16","r","^17",0.8325200594162925],["^ ","id",222107,"^>","VideoExercise","^T",null,"^1","Recruiting and quality of hire","^U","","^V",null,"^N",4,"sct","","^W","","^X","","^Y",null,"^Z",null,"xp",50,"^[",[],"^10",[],"^11","","^12","//player.vimeo.com/video/154783078","^13","//videos.datacamp.com/transcoded/000_placeholders/v1/hls-temp.master.m3u8","^14",56.25,"^15","course_5977_a1c52bd79de1f53bb11b8a9b610ab9c5","^16","r","^17",0.6698748494912818],["^ ","id",267862,"^>","TabExercise","^T","<p>You would like to help the talent acquisition team understand which recruiting channel will produce the best sales hires. You can apply the HR analytics process to help them. 
Start by examining the recruiting channels in the data.</p>","^1","Identifying groups in data","^U","","^V",null,"^N",5,"sct","","^W","download.file(http://s3.amazonaws.com/assets.datacamp.com/production/course_5977/datasets/recruitment_data.csv, destfile = recruitment_data.csv)\\nlibrary(readr)\\nrecruitment <- read_csv(recruitment_data.csv)","^X","","^Y",null,"^Z",null,"xp",100,"^[",[],"^10",[],"^11","","subexercises",[["^ ","id",292535,"^>","NormalExercise","^T",null,"^1",null,"^U","# Load the dplyr package\\n___\\n","^V","<ul>\\n<li>Load the <code>dplyr</code> package.</li>\\n</ul>","^N",1,"sct","test_library_function(dplyr)\\n","^W","","^X","# Load the dplyr package\\nlibrary(dplyr)\\n","^Y","<ul>\\n<li>Use <code>library()</code> to load external packages.</li>\\n</ul>","^Z",null,"xp",10,"^[",[],"^10",[],"^11",""],["^ ","id",292536,"^>","NormalExercise","^T",null,"^1",null,"^U","# Load the dplyr package\\nlibrary(dplyr)\\n\\n# Get an overview of the recruitment data\\n___\\n","^V","<p>Take a look at the sales recruiting data, <code>recruitment</code>, with <code>summary()</code>.</p>","^N",2,"sct","test_library_function(dplyr)\\n\\nex() %>% check_output_expr(summary(recruitment), missing_
msg = Did you call `summary()` on `recruitment`?, append = FALSE)\\n","^W","","^X","# Load the dplyr package\\nlibrary(dplyr)\\n\\n# Get an overview of the recruitment data\\nsummary(recruitment)\\n","^Y","<p>The <code>summary()</code> function takes a single argument, which is the data frame you want to examine.</p>","^Z",null,"xp",40,"^[",[],"^10",[],"^11",""],["^ ","id",292537,"^>","NormalExercise","^T",null,"^1",null,"^U","# Load the dplyr package\\nlibrary(dplyr)\\n\\n# Get an overview of the recruitment data\\nsummary(recruitment)\\n\\n# See which recruiting sources the company has been using\\n___","^V","<p>Using <code>summary()</code> didn't tell you much about the <code>recruiting_source</code> variable, because <code>read_csv()</code> imported it as a character vector. Use <code>count()</code> on the <code>recruiting_source</code> column to get more information.</p>","^N",3,"sct","test_library_function(dplyr)\\n\\nex() %>% check_output_expr(summary(recruitment), missing_msg = Did you call `summary()` on `recruitment`?, append = FALSE)\\n\\nex() %>% {\\n    check_output_expr(., count(recruitment, recruiting_source), missing_msg = Did you call `count()` on the `recruiting_source` column?, append = FALSE)\\n    check_error(.)\\n}\\n\\nsuccess_msg(Great job! It looks like some employees don't have a recruiting source, but you won't need to worry about that for this analysis.)","^W","","^X","# Load the dplyr package\\nlibrary(dplyr)\\n\\n# Get an overview of the recruitment data\\nsummary(recruitment)\\n\\n# See which recruiting sources the company has been using\\nrecruitment %>% \\n  count(recruiting_source)","^Y","<p>The <code>count()</code> function can be used in a pipeline, just like in the slides from the video. 
For this exercise, it takes a column name as an argument.</p>","^Z",null,"xp",50,"^[",[],"^10",[],"^11",""]],"^16","r","^17",0.5771138393051058],["^ ","id",222109,"^>","TabExercise","^T","<p>Which recruiting channel produces the best salespeople? One quality of hire metric you can use is sales quota attainment, or how much a salesperson sold last year relative to their quota. An employee whose <code>sales_quota_pct</code> equals .75 sold 75% of their quota, for example. This metric can be helpful because raw sales numbers are not always comparable between employees. </p>\\n<p>Calculate the average sales quota attainment achieved by hires from each recruiting source.</p>","^1","Sales numbers by recruiting source","^U","","^V",null,"^N",6,"sct","","^W","download.file(http://s3.amazonaws.com/assets.datacamp.com/production/course_5977/datasets/recruitment_data.csv, destfile = recruitment_data.csv)\\n# Load packages\\nlibrary(readr)\\n\\n# Import the recruitment data\\nrecruitment <- read_csv(recruitment_data.csv)\\nlibrary(dplyr)","^X","","^Y",null,"^Z",null,"xp",100,"^[",[],"^10",[],"^11","","^18",[["^ ","id",292630,"^>","NormalExercise","^T",null,"^1",null,"^U","# Find the average sales quota attainment \\nrecruitment %>%\\n  ___","^V","<p><code>recruitment</code> and <code>dplyr</code> are loaded in your workspace. </p>\\n<ul>\\n<li>Use <code>summarize()</code> to calculate the average sales quota attainment. Store it in a new column called <code>avg_sales_quota_pct</code>.</li>\\n</ul>","^N",1,"sct","\\nmsg <- Looks like you didn't calculate the average sales quota attainment correctly. 
Did you use `summarize()` and `mean()` functions, and name the resulting column `avg_sales_quota_pct`?\\n\\ncheck_correct({\\n    ex() %>% check_output_expr(summarize(recruitment, avg_sales_quota_pct = mean(sales_quota_pct)), missing_msg = msg, append = FALSE)\\n}, {\\n    ex() %>% {\\n        check_function(., summarize) %>% {\\n            check_arg(., .data) %>% check_equal(incorrect_msg = Did you pipe `recruitment` into `summarize()`?, append = FALSE)\\n            check_result(.) %>% check_equal(incorrect_msg = msg, append = FALSE)\\n        }\\n    }\\n})","^W","","^X","# Find the average sales quota attainment\\nrecruitment %>%\\n  summarize(avg_sales_quota_pct = mean(sales_quota_pct)) ","^Y","<p>The
<code>summarize()</code> function applies a function over the dataset. Use <code>mean()</code> to find the average sales quota attainment.</p>","^Z",null,"xp",50,"^[",[],"^10",[],"^11",""],["^ ","id",292632,"^>","NormalExercise","^T",null,"^1",null,"^U","# Find the average sales quota attainment for each recruiting source\\navg_sales <- ___\\n\\n# Display the result\\navg_sales","^V","<p>Use <code>summarize()</code> to calculate the average sales quota attainment <strong>within each recruiting source</strong>. Store it in a new column called <code>avg_sales_quota_pct</code>. Assign the result to <code>avg_sales</code>.</p>","^N",2,"sct","\\nmsg <- Looks like you didn't calculate the average sales quota attainment for each recruiting source correctly. You need to use `group_by()`, `summarize()` and `mean()` functions, and name the resulting column `avg_sales_quota_pct`.\\n\\ncheck_correct({\\n   ex() %>% check_object(avg_sales) %>% check_equal(incorrect_msg = msg, append = FALSE)\\n}, {\\n    ex() %>% {\\n        check_function(., group_by) %>% {\\n            check_arg(., .data) %>% check_equal(incorrect_msg = Did you pipe `recruitment` into `group_by()`?, append = FALSE)\\n            check_result(.) %>% check_equal(incorrect_msg = Did you group by `recruiting_source`?, append = FALSE)\\n        }\\n        check_function(., summarize) %>% {\\n            check_arg(., .data) %>% check_equal(incorrect_msg = Did you pipe the result of `group_by()` into `summarize()`?, append = FALSE)\\n            check_result(.) %>% check_equal(incorrect_msg = msg, append = FALSE)\\n        }\\n        \\n    }\\n})\\n\\nex() %>% {\\n    check_output_expr(., avg_sales, missing_msg = Do not remove the line of code that prints `avg_sales` to console., append = FALSE)\\n    check_error(.)\\n}\\n\\nsuccess_msg(You did it! Look at the output. 
Which recruiting source produces hires with the highest sales?)","^W","","^X","# Find the average sales quota attainment for each recruiting source\\navg_sales <- recruitment %>%\\n  group_by(recruiting_source) %>% \\n  summarize(avg_sales_quota_pct = mean(sales_quota_pct)) \\n  \\n# Display the result\\navg_sales","^Y","<p>Use <code>group_by(recruiting_source)</code> to add <code>recruiting_source</code> as a grouping variable. </p>","^Z",null,"xp",50,"^[",[],"^10",[],"^11",""]],"^16","r","^17",0.664953081591213],["^ ","id",222110,"^>","NormalExercise","^T","<p>Another quality of hire metric you can consider is the attrition rate, or how often hires leave the company. Determine which recruiting channels have the highest and lowest attrition rates. </p>\\n<p>In the last exercise, the output was a data frame with the recruiting channels and the average quota attainment. It would have been easier to tell which channel had the highest-performing employees if it were sorted with <code>arrange()</code>.</p>","^1","Attrition rates by recruiting source","^U","# Find the average attrition for the sales team, by recruiting source, sorted from lowest attrition rate to highest\\navg_attrition <- recruitment %>%\\n  ___ %>% \\n  ___ %>% \\n  ___\\n\\n# Display the result\\navg_attrition","^V","<p><code>recruitment</code> and <code>dplyr</code> are loaded in your workspace. </p>\\n<ul>\\n<li>Use <code>summarize()</code> to calculate the attrition rate <strong>within each recruiting source</strong>. 
Store it in a new column called <code>attrition_rate</code>.</li>\\n<li>Sort the result by attrition rate, from lowest to highest, and assign it to <code>avg_attrition</code>.</li>\\n</ul>","^N",7,"sct","\\ncheck_correct({\\n    ex() %>% check_object(avg_attrition) %>% check_equal()\\n}, {\\n    ex() %>% {\\n        check_function(., group_by) %>% {\\n            check_arg(., .data) %>% check_equal(incorrect_msg = Did you pipe `recruitment` into `group_by()`?, append = FALSE)\\n            check_result(.) %>% check_equal(incorrect_msg = You need to group by the `recruiting_source` column., append = FALSE)\\n        }\\n        check_function(., summarize) %>% {\\n            check_arg(., .data) %>% check_equal(incorrect_
msg = Did you pipe the output of `group_by()` into `summarize()`?, append = FALSE)\\n            check_result(.) %>% check_equal(incorrect_msg = Did you calculate the mean attrition rate as `attrition_rate`?, append = FALSE)\\n        }\\n        check_function(., arrange) %>% {\\n            check_arg(., .data) %>% check_equal(incorrect_msg = Did you pipe the output of `summarize()` into `arrange()`?, append = FALSE)\\n            check_result(.) %>% check_equal(incorrect_msg = Did you arrange by `attrition_rate`?, append = FALSE)\\n        }\\n    }\\n})\\n\\nex() %>% {\\n    check_output_expr(., avg_attrition, missing_msg = Do not remove the line of code that prints `avg_attrition` to console., append = FALSE)\\n    check_error(.)\\n}\\n\\nsuccess_msg(Excellent! Now let's learn how to visualize those results.)","^W","download.file(http://s3.amazonaws.com/assets.datacamp.com/production/course_5977/datasets/recruitment_data.csv, destfile = recruitment_data.csv)\\n\\n# Load packages\\nlibrary(readr)\\n\\n# Import the recruitment data\\nrecruitment <- read_csv(recruitment_data.csv)\\n\\nlibrary(dplyr)","^X","# Find the average attrition for the sales team, by recruiting source, sorted from lowest attrition rate to highest\\navg_attrition <- recruitment %>%\\n  group_by(recruiting_source) %>% \\n  summarize(attrition_rate = mean(attrition)) %>% \\n  arrange(attrition_rate)\\n  \\n# Display the result\\navg_attrition","^Y","<ul>\\n<li>The attrition rate is calculated with <code>mean()</code>. Be sure to <code>group_by()</code> the recruiting source first. 
This exercise is very similar to the last one.</li>\\n<li>To sort by a variable <code>x</code> within a pipeline, use <code>arrange(x)</code>.</li>\\n</ul>","^Z",null,"xp",100,"^[",[],"^10",[],"^11","","^16","r","^17",0.2201465533206446],["^ ","id",267863,"^>","VideoExercise","^T",null,"^1","Visualizing the recruiting data","^U","","^V",null,"^N",8,"sct","","^W","","^X","","^Y",null,"^Z",null,"xp",50,"^[",[],"^10",[],"^11","","^12",null,"^13",null,"^14",56.25,"^15","course_5977_60f9b307a3483cc6cd3d0498827cbcce","^16","r","^17",0.9741008948072372],["^ ","id",267864,"^>","NormalExercise","^T","<p>The last step in the HR analytics process is to test and plot the results. For now, you'll focus on visualizing the data from the previous exercises. You'll be making a bar chart so you can more easily see the average sales quota attainment for each recruiting channel.</p>","^1","Visualizing the sales performance differences","^U","# Load the ggplot2 package\\n___\\n\\n# Plot the bar chart\\n___\\n","^V","<p>The summarized data, <code>avg_sales</code>, is available in your workspace. </p>\\n<ul>\\n<li>Load the <code>ggplot2</code> package (<code>dplyr</code> is already loaded).</li>\\n<li>Using <code>ggplot()</code>, plot a bar chart from the <code>avg_sales</code> data. Place the recruiting source on the x-axis and <code>avg_sales_quota_pct</code> on the y-axis. 
</li>\\n</ul>","^N",9,"sct","\\nmsg <- Did you call `geom_col()` or `geom_bar(stat = 'identity')` to plot the bar chart?\\n\\ntest_library_function(ggplot2)\\n\\nex() %>% {\\n    check_function(., ggplot) %>% check_arg(data) %>% check_equal() \\n    check_function(., aes) %>% {\\n            check_arg(., x) %>% check_equal(eval = FALSE)\\n            check_arg(., y) %>% check_equal(eval = FALSE)\\n    }\\n}\\n\\ncheck_or(\\n    ex() %>% check_function(geom_col, not_called_msg = msg, append = FALSE), \\n    ex() %>% override_solution(geom_bar(stat = 'identity')) %>% check_function(geom_bar) %>% check_arg(stat) %>% check_equal(incorrect_msg = msg, append = FALSE)\\n)\\n\\nex() %>% check_error()\\n\\nsuccess_msg(Great! Take a moment to admire your work.)","^W","download.file(http://s3.amazonaws.com/assets.datacamp.com/production/course_5977/datasets/recruitment_data.csv, destfile = recruitment_data.csv)\\n\\n# Load packages\\nlibrary(readr)\\nlibrary(dplyr)\\n\\n# Import the recruitment data\\nrecruitment <- read_csv(recruitment_data.csv)\\n\\n# Create the avg_sales dataframe\\navg_sales <- recruitment %>%\\n  group_by(
recruiting_source) %>% \\n  summarize(avg_sales_quota_pct = mean(sales_quota_pct)) ","^X","# Load the ggplot2 package\\nlibrary(ggplot2)\\n\\n# Plot the bar chart\\nggplot(avg_sales, aes(x = recruiting_source, y = avg_sales_quota_pct)) +\\n  geom_col()","^Y","<p>Look at the <code>ggplot()</code> code sample from the slides. Your data is <code>avg_sales</code>, and the x and y-axes should be specified inside <code>aes()</code>. </p>","^Z",null,"xp",100,"^[",[],"^10",[],"^11","","^16","r","^17",0.6790770393259535],["^ ","id",222111,"^>","NormalExercise","^T","<p>You've been using two quality of hire metrics to compare the recruiting channels. In addition to looking at the sales output of the hires, you are also looking at the attrition rates. Plot a bar chart again, but this time plot average attrition instead of sales quota attainment.</p>","^1","Visualizing the attrition differences","^U","# Load ggplot2\\nlibrary(ggplot2)\\n\\n# Plot the bar chart\\n\\n","^V","<p>The summarized data, <code>avg_attrition</code>, is available in your workspace.</p>\\n<ul>\\n<li>Plot a bar chart from the <code>avg_attrition</code> data. Place the recruiting source on the x-axis and the average attrition on the y-axis. 
</li>\\n</ul>","^N",10,"sct","\\nmsg <- Did you call `geom_col()` or `geom_bar(stat = 'identity')` to plot the bar chart?\\n\\ntest_library_function(ggplot2)\\n\\nex() %>% {\\n    check_function(., ggplot) %>% check_arg(data) %>% check_equal() \\n    check_function(., aes) %>% {\\n            check_arg(., x) %>% check_equal(eval = FALSE)\\n            check_arg(., y) %>% check_equal(eval = FALSE)\\n    }\\n}\\n\\ncheck_or(\\n    ex() %>% check_function(geom_col, not_called_msg = msg, append = FALSE), \\n    ex() %>% override_solution(geom_bar(stat = 'identity')) %>% check_function(geom_bar) %>% check_arg(stat) %>% check_equal(incorrect_msg = msg, append = FALSE)\\n)\\n\\nex() %>% check_error()\\n\\n\\nsuccess_msg(Well done.)","^W","download.file(http://s3.amazonaws.com/assets.datacamp.com/production/course_5977/datasets/recruitment_data.csv, destfile = recruitment_data.csv)\\n\\n# Load packages\\nlibrary(readr)\\nlibrary(dplyr)\\n\\n# Import the recruitment data\\nrecruitment <- read_csv(recruitment_data.csv)\\n\\n# Create the avg_attrition dataframe\\navg_attrition <- recruitment %>%\\n  group_by(recruiting_source) %>% \\n  summarize(attrition_rate = mean(attrition)) %>% \\n  arrange(attrition_rate)","^X","# Load ggplot2\\nlibrary(ggplot2)\\n\\n# Plot the bar chart\\nggplot(avg_attrition, aes(x = recruiting_source, y = attrition_rate)) +\\n  geom_col()","^Y","<ul>\\n<li>Repeat the bar chart from the previous exercise, but use the <code>avg_attrition</code> data.</li>\\n<li>If you need to remember what the <code>avg_attrition</code> dataset looks like, you can view it in the console with <code>head(avg_attrition)</code>.</li>\\n</ul>","^Z",null,"xp",100,"^[",[],"^10",[],"^11","","^16","r","^17",0.821329965303601],["^ ","id",222112,"^>","MultipleChoiceExercise","^T","<p>Now it's time to draw conclusions from your analysis. Both <code>avg_attrition</code> and <code>avg_sales</code> are available in the workspace. 
Which of the recruiting sources in this dataset produced the best hires, measured by attrition and sales? Which source produced the worst hires?</p>","^1","Drawing conclusions","^U","","^V",["Best: <strong>NA</strong>, Worst: <strong>Search Firm</strong>","Best: <strong>Search Firm</strong>, Worst: <strong>Campus</strong>","Best: <strong>Applied Online</strong>, Worst: <strong>Search Firm</strong>","Best: <strong>Referral</strong>, Worst: <strong>Applied Online</strong>"],"^N",11,"sct","msg1 = Incorrect. You correctly identified NA as the 'group' with the best numbers, but remember that NA is not a hiring source; NA means that the hiring source is missing. \\nmsg2 = Try again. You can look at `avg_attrition` and `avg_sales` in the console.\\nmsg3 = You got it!\\nmsg4 = That's not it. Try looking at `avg_attrition` and `avg_sales` in the console.\\ntest_mc(3, feedback_msgs = c(msg1, msg2, msg3, msg4))","^W","download.
file(http://s3.amazonaws.com/assets.datacamp.com/production/course_5977/datasets/recruitment_data.csv, destfile = recruitment_data.csv)\\n\\n# Load packages\\nlibrary(readr)\\nlibrary(dplyr)\\nlibrary(ggplot2)\\n\\n# Import the recruitment data\\nrecruitment <- read_csv(recruitment_data.csv)\\n\\n# Recreate the data frames\\navg_sales <- recruitment %>%\\n  group_by(recruiting_source) %>% \\n  summarize(avg_sales_quota_pct = mean(sales_quota_pct)) %>% \\n  arrange(avg_sales_quota_pct)\\n  \\navg_attrition <- recruitment %>%\\n  group_by(recruiting_source) %>% \\n  summarize(attrition_rate = mean(attrition)) %>% \\n  arrange(attrition_rate)","^X","","^Y","<p>A hiring source is considered better if it produces candidates with high sales and low attrition. Look at <code>avg_attrition</code> and <code>avg_sales</code> in the console to see which hiring sources are the best and worst by those criteria.</p>","^Z",null,"xp",50,"^[",[],"^10",[],"^11","","^16","r","^17",0.01085104029736339]]]],"activeImage",["^0",["status","SUCCESS","data","course-5977-master:adb6bc60c57bd224e4985e6e7fdff45d-20181024140727475"]],"sharedImage",["^0",["status","SUCCESS","data","shared-r:cf2b7a605da05ca9c8e92f76d8eab758-20181130092348867"]]]],"systemStatus",["^0",["indicator","none","description","No status has been fetched from the Status 
Page."]],"backendSession",["^0",["status",["^0",["code","none","text",""]],"isInitSession",false,"message",null]],"settings",["^0",["uiTheme","LIGHT","isOnboarding",false]],"autocomplete",["^0",[]],"user",["^0",["status",null,"settings",["^0",[]]]],"fileBrowser",["^0",["isVisible",true,"fileSelected",null]],"chapter",["^0",["current",["^0",["badge_uncompleted_url","https://assets.datacamp.com/production/default/badges/missing_unc.png","number",1,"slug","identifying-the-best-recruiting-source","last_updated_on","24/10/2018","title_meta",null,"nb_exercises",11,"free_preview",true,"slides_link","https://s3.amazonaws.com/assets.datacamp.com/production/course_5977/slides/chapter1.pdf","title","Identifying the best recruiting source","xp",850,"id",19221,"description","In this chapter, you will get an introduction to how data science is used in a human resources context. Then you will dive into a case study where you'll analyze and visualize recruiting data to determine which source of new candidates ultimately produces the best new hires. 
The dataset you'll use in this and the other chapters in this course is synthetic, to maintain the privacy of actual employees.","badge_completed_url","https://assets.datacamp.com/production/default/badges/missing.png"]]]],"boot",["^0",["bootState","PRE_BOOTED","error",null]],"location",["^0",["current",["^0",["pathname","/courses/human-resources-analytics-in-r-exploring-employee-data/identifying-the-best-recruiting-source","query",["^0",["ex","1"]]]],"canonical","https://campus.datacamp.com/courses/human-resources-analytics-in-r-exploring-employee-data/identifying-the-best-recruiting-source?ex=1","before",["^0",["pathname","/courses/human-resources-analytics-in-r-exploring-employee-data/identifying-the-best-recruiting-source","query",["^0",["ex",1]]]]]],"course",["^0",["difficulty_level",1,"reduced_outline",null,"shared_image","shared-r:cf2b7a605da05ca9c8e92f76d8eab758-20181130092348867","active_image","course-5977-master:adb6bc60c57bd224e4985e6e7fdff45d-20181024140727475","author_field",null,"chapters",["~#iL",[["^0",["badge_uncompleted_url","https://assets.datacamp.com/production/default/badges/missing_unc.png","number",1,"slug","identifying-the-best-recruiting-source","last_updated_on","24/10/2018","title_meta",null,"nb_exercises",11,"free_preview",true,"slides_link","https://s3.amazonaws.com/assets.datacamp.com/production/course_5977/slides/chapter1.pdf","title","Identifying the best recruiting source","xp",850,"id",19221,"description","In this chapter, you will get an introduction to how data science is used in a human resources context. Then you will dive into a case study where you'll analyze and visualize recruiting data to determine which source of new candidates ultimately
produces the best new hires. The dataset you'll use in this and the other chapters in this course is synthetic, to maintain the privacy of actual employees.","badge_completed_url","https://assets.datacamp.com/production/default/badges/missing.png"]],["^0",["badge_uncompleted_url","https://assets.datacamp.com/production/default/badges/missing_unc.png","number",2,"slug","what-is-driving-low-employee-engagement","last_updated_on","24/10/2018","title_meta",null,"nb_exercises",12,"free_preview",null,"slides_link","https://s3.amazonaws.com/assets.datacamp.com/production/course_5977/slides/chapter2.pdf","title","What is driving low employee engagement?","xp",950,"id",19222,"description","Gallup defines engaged employees as those who are involved in, enthusiastic about and committed to their work and workplace.  There is disagreement about the strength of the connection between employee engagement and business outcomes, but the idea is that employees that are more engaged will be more productive and stay with the organization longer. In this chapter, you'll  look into potential reasons that one department's engagement scores are lower than the rest.","badge_completed_url","https://assets.datacamp.com/production/default/badges/missing.png"]],["^0",["badge_uncompleted_url","https://assets.datacamp.com/production/default/badges/missing_unc.png","number",3,"slug","are-new-hires-getting-paid-too-much","last_updated_on","24/10/2018","title_meta",null,"nb_exercises",12,"free_preview",null,"slides_link","https://s3.amazonaws.com/assets.datacamp.com/production/course_5977/slides/chapter3.pdf","title","Are new hires getting paid too much?","xp",950,"id",19223,"description","When employers make a new hire, they must determine what the new employee will be paid. If the employer is not careful, the new hires can come in with a higher salary than the employees that currently work at the same job, which can cause  employee turnover and dissatisfaction. 
In this chapter, you will check whether new hires are really getting paid more than current employees, and how to double-check your initial observations.","badge_completed_url","https://assets.datacamp.com/production/default/badges/missing.png"]],["^0",["badge_uncompleted_url","https://assets.datacamp.com/production/default/badges/missing_unc.png","number",4,"slug","are-performance-ratings-being-given-consistently","last_updated_on","24/10/2018","title_meta",null,"nb_exercises",12,"free_preview",null,"slides_link","https://s3.amazonaws.com/assets.datacamp.com/production/course_5977/slides/chapter4.pdf","title","Are performance ratings being given consistently?","xp",950,"id",19224,"description","Performance management helps an organization keep track of which employees are providing extra value, or below-average value, and compensating them accordingly. Whether performance is a rating or the result of a questionnaire, whether employees are rated each year or more often than that, the process is somewhat subjective. An organization should check that ratings are being given with regard to performance, and not individual managers' preferences, or even biases (conscious or subconscious).","badge_completed_url","https://assets.datacamp.com/production/default/badges/missing.png"]],["^0",["badge_uncompleted_url","https://assets.datacamp.com/production/default/badges/missing_unc.png","number",5,"slug","improving-employee-safety-with-data","last_updated_on","24/10/2018","title_meta",null,"nb_exercises",13,"free_preview",null,"slides_link","https://s3.amazonaws.com/assets.datacamp.com/production/course_5977/slides/chapter5.pdf","title","Improving employee safety with data","xp",1050,"id",19225,"description","In many industries, workplace safety is a critical consideration. Maintaining a safe workplace provides employees with confidence and reduces costs for workers' compensation and legal liabilities. 
In this chapter, you'll look for explanations for an increase in workplace accidents.","badge_completed_url","https://assets.datacamp.com/production/default/badges/missing.png"]]]],"time_needed",null,"
author_image","placeholder.png","runtime_config",null,"lti_only",false,"image_url","https://assets.datacamp.com/production/course_5977/shields/thumb/shield_image_course_5977_20181024-15-yuf5oq?1540391108","topic_id",12,"slug","human-resources-analytics-in-r-exploring-employee-data","last_updated_on","24/10/2018","paid",true,"university",null,"state","live","author_bio",null,"should_cache",true,"sharing_links",["^0",["twitter","http://bit.ly/1eWTMJh","facebook","http://bit.ly/1iS42Do"]],"title","Human Resources Analytics in R: Exploring Employee Data","xp",4750,"image_thumbnail_url","https://assets.datacamp.com/production/course_5977/shields/thumb_home/shield_image_course_5977_20181024-15-yuf5oq?1540391108","short_description","Manipulate, visualize, and perform statistical tests on HR data. ","nb_of_subscriptions",4157,"type","datacamp","link","https://www.datacamp.com/courses/human-resources-analytics-in-r-exploring-employee-data","id",5977,"description","HR analytics, people analytics, workforce analytics -- whatever you call it, businesses are increasingly counting on their human resources departments to answer questions, provide insights, and make recommendations using data about their employees. 
In this course, you'll learn how to manipulate, visualize, and perform statistical tests on HR data through a series of HR analytics case studies.","programming_language","r"]],"exercises",["^0",["current",0,"all",["^19",[["^0",["sample_code","","sct","","aspect_ratio",56.25,"instructions",null,"question","","hint",null,"possible_answers",["^19",[]],"number",1,"video_hls","//videos.datacamp.com/transcoded/000_placeholders/v1/hls-temp.master.m3u8","user",["^0",["isHintShown",false,"rstudio",["^0",["isReady",false,"settings",["^0",[]],"showHistory",false,"cards",["^0",["messages",["^19",[]],"currentRow",0]]]],"editorTabs",["^0",["files/script.R",["^0",["title","script.R","props",["^0",["active",true,"isClosable",false,"code","","extra",["^0",[]],"resetCode",""]]]]]],"fileBrowser",["^0",["sampleCode",["^0",["fileSelected",null,"files",["^0",["name","files","isOpen",true,"children",["^19",[["^0",["name","script.R","initialContent","","content","","isClosable",false]]]]]]]],"solution",["^0",["fileSelected",null,"files",["^0",["name","solution","isOpen",true,"children",["^19",[["^0",["name","solution.R","initialContent","","content","","isClosable",false]]]]]]]]]],"outputMarkdownTabs",["^0",[]],"markdown",["^0",["titles",["^19",["Knit PDF","Knit HTML"]],"activeTitle","Knit HTML"]],"currentXp",50,"graphicalTabs",["^0",["plot",["^0",["extraClass","animation--flash","title","Plots","props",["^0",["sources",["^19",[]],"currentIndex",0]],"dimension",["^0",["isRealSize",false,"width",1,"height",1]]]],"html",["^0",["extraClass","animation--flash","title","HTML Viewer","props",["^0",["sources",["^19",[]],"currentIndex",0]]]]]],"feedbackMessages",["^19",[]],"lastSubmittedCode",null,"ltiStatus",["^0",[]],"lastSubmitActiveEditorTab",null,"consoleSqlTabs",["^0",["query_result",["^0",["extraClass","","title","query result","props",["^0",["active",true,"isNotView",true,"message","No query executed yet..."]]]]]],"consoleTabs",["^0",["console",["^0",["title","R 
Console","props",["^0",["active",true]]]],"slides",["^0",["title","Slides","props",["^0",["active",false]]]]]],"inputMarkdownTabs",["^0",[]]]],"randomNumber",0.6250374303150088,"assignment",null,"feedbacks",["^19",[]],"attachments",null,"title","Welcome to the course!","xp",50,"language","r","pre_exercise_code","","solution","","type","VideoExercise","id",222104,"projector_key","course_5977_1f1b78a54071ef29773cb5e0c0aa8a3c","video_link","//player.vimeo.com/video/154783078"]],["^0",["sample_code","","sct","","instructions",null,"question","","hint","<p>The thrust of human resources analytics is using data about a company's workforce to create value. Which of these examples don't use data about an organization's workforce?</p>","possible_answers",["^19",["Identifying drivers of employee attrition","[Determining which product to produce next]","Testing whether employee promotion rates are
equal for all demographics","Reducing accidents in a workplace"]],"number",2,"randomNumber",0.640631600688103,"assignment","<p>Based on the video, which of the following would not be considered an application of HR analytics?</p>","feedbacks",["^19",["No, understanding why employees leave is a great example of HR analytics.","That's right. Choosing a product direction is important, but it would not be considered HR analytics.","Try again. HR analytics includes understanding and improving employee fairness and diversity.","Incorrect. Employee safety is another example of an HR analytics problem."]],"attachments",null,"title","Applications of human resources (HR) analytics","xp",50,"language","r","pre_exercise_code","","solution","","type","PureMultipleChoiceExercise","id",222105]],["^0",["sample_code","# Load the readr package\\n___\\n\\n# Import the recruitment data\\nrecruitment <- read_csv(recruitment_data.csv)\\n\\n# Look at the first few rows of the dataset\\n___","sct","\\ntest_library_function(readr)\\n\\nex() %>% {\\n    check_object(., recruitment) %>% check_equal(incorrect_msg = 'Do not modify the code that imports `recruitment_data.csv` into `recruitment`.', append = FALSE)\\n    check_output_expr(., head(recruitment), missing_msg = Did you call `head()` on `recruitment`?, append = FALSE)\\n    check_error(.)\\n}\\n\\nsuccess_msg(Excellent. Time to learn how to think about analyzing this data.)","instructions","<ul>\\n<li>Load the <code>readr</code> package so you can use <code>read_csv()</code>.</li>\\n<li>Look at the first rows of <code>recruitment</code> with <code>head()</code>. </li>\\n</ul>","question","","hint","<ul>\\n<li>Use <code>library()</code> to load external packages. 
</li>\\n<li>The <code>head()</code> function only requires a single argument, which is the data frame you want to examine.</li>\\n</ul>","possible_answers",["^19",[]],"number",3,"randomNumber",0.8325200594162925,"assignment","<p>Real HR datasets are tough to find because of privacy and ethical concerns about sharing sensitive employee data. The dataset you'll be using throughout this course is a synthetic one produced by <a href=https://www.ibm.com/communities/analytics/watson-analytics-blog/hr-employee-attrition/>IBM</a>, and modified for learning purposes. </p>\\n<p>In this chapter, you'll be focusing on the sales department and the recruiting channels they were hired from.</p>","feedbacks",["^19",[]],"attachments",null,"title","Looking at the recruiting data","xp",100,"language","r","pre_exercise_code","download.file(http://s3.amazonaws.com/assets.datacamp.com/production/course_5977/datasets/recruitment_data.csv, destfile = recruitment_data.csv)","solution","# Load the readr package\\nlibrary(readr)\\n\\n# Import the recruitment data\\nrecruitment <- read_csv(recruitment_data.csv)\\n\\n# Look at the first few rows of the dataset\\nhead(recruitment)","type","NormalExercise","id",222106]],["^0",["sample_code","","sct","","aspect_ratio",56.25,"instructions",null,"question","","hint",null,"possible_answers",["^19",[]],"number",4,"video_hls","//videos.datacamp.com/transcoded/000_placeholders/v1/hls-temp.master.m3u8","randomNumber",0.6698748494912818,"assignment",null,"feedbacks",["^19",[]],"attachments",null,"title","Recruiting and quality of hire","xp",50,"language","r","pre_exercise_code","","solution","","type","VideoExercise","id",222107,"projector_key","course_5977_a1c52bd79de1f53bb11b8a9b610ab9c5","video_link","//player.vimeo.com/video/154783078"]],["^0",["sample_code","","sct","","instructions",null,"question","","hint",null,"possible_answers",["^19",[]],"number",5,"randomNumber",0.5771138393051058,"assignment","<p>You would like to help the talent acquisition 
team understand which recruiting channel will produce the best sales hires. You can apply the HR analytics process to help them. Start by examining the recruiting channels in the data.</p>","feedbacks",["^19",[]],"attachments",null,"title","Identifying groups in data","xp",100,"language","r","pre_exercise_code","download.file(http://s3.amazonaws.com/assets.datacamp.com/
production/course_5977/datasets/recruitment_data.csv, destfile = recruitment_data.csv)\\nlibrary(readr)\\nrecruitment <- read_csv(recruitment_data.csv)","solution","","type","TabExercise","id",267862,"subexercises",["^19",[["^0",["sample_code","# Load the dplyr package\\n___\\n","sct","test_library_function(dplyr)\\n","instructions","<ul>\\n<li>Load the <code>dplyr</code> package.</li>\\n</ul>","question","","hint","<ul>\\n<li>Use <code>library()</code> to load external packages.</li>\\n</ul>","possible_answers",["^19",[]],"number",1,"assignment",null,"feedbacks",["^19",[]],"attachments",null,"title",null,"xp",10,"pre_exercise_code","","solution","# Load the dplyr package\\nlibrary(dplyr)\\n","type","NormalExercise","id",292535]],["^0",["sample_code","# Load the dplyr package\\nlibrary(dplyr)\\n\\n# Get an overview of the recruitment data\\n___\\n","sct","test_library_function(dplyr)\\n\\nex() %>% check_output_expr(summary(recruitment), missing_msg = Did you call `summary()` on `recruitment`?, append = FALSE)\\n","instructions","<p>Take a look at the sales recruiting data, <code>recruitment</code>, with <code>summary()</code>.</p>","question","","hint","<p>The <code>summary()</code> function takes a single argument, which is the data frame you want to examine.</p>","possible_answers",["^19",[]],"number",2,"assignment",null,"feedbacks",["^19",[]],"attachments",null,"title",null,"xp",40,"pre_exercise_code","","solution","# Load the dplyr package\\nlibrary(dplyr)\\n\\n# Get an overview of the recruitment data\\nsummary(recruitment)\\n","type","NormalExercise","id",292536]],["^0",["sample_code","# Load the dplyr package\\nlibrary(dplyr)\\n\\n# Get an overview of the recruitment data\\nsummary(recruitment)\\n\\n# See which recruiting sources the company has been using\\n___","sct","test_library_function(dplyr)\\n\\nex() %>% check_output_expr(summary(recruitment), missing_msg = Did you call `summary()` on `recruitment`?, append = FALSE)\\n\\nex() %>% {\\n    
check_output_expr(., count(recruitment, recruiting_source), missing_msg = Did you call `count()` on the `recruiting_source` column?, append = FALSE)\\n    check_error(.)\\n}\\n\\nsuccess_msg(Great job! It looks like some employees don't have a recruiting source, but you won't need to worry about that for this analysis.)","instructions","<p>Using <code>summary()</code> didn't tell you much about the <code>recruiting_source</code> variable, because <code>read_csv()</code> imported it as a character vector. Use <code>count()</code> on the <code>recruiting_source</code> column to get more information.</p>","question","","hint","<p>The <code>count()</code> function can be used in a pipeline, just like in the slides from the video. For this exercise, it takes a column name as an argument.</p>","possible_answers",["^19",[]],"number",3,"assignment",null,"feedbacks",["^19",[]],"attachments",null,"title",null,"xp",50,"pre_exercise_code","","solution","# Load the dplyr package\\nlibrary(dplyr)\\n\\n# Get an overview of the recruitment data\\nsummary(recruitment)\\n\\n# See which recruiting sources the company has been using\\nrecruitment %>% \\n  count(recruiting_source)","type","NormalExercise","id",292537]]]]]],["^0",["sample_code","","sct","","instructions",null,"question","","hint",null,"possible_answers",["^19",[]],"number",6,"randomNumber",0.664953081591213,"assignment","<p>Which recruiting channel produces the best salespeople? One quality of hire metric you can use is sales quota attainment, or how much a salesperson sold last year relative to their quota. An employee whose <code>sales_quota_pct</code> equals .75 sold 75% of their quota, for example. This metric can be helpful because raw sales numbers are not always comparable between employees. 
</p>\\n<p>Calculate the average sales quota attainment achieved by hires from each recruiting source.</p>","feedbacks",["^19",[]],"attachments",null,"title","Sales numbers by recruiting source","xp",100,"language","r","pre_exercise_code","download.file(http://s3.amazonaws.com/assets.datacamp.com/production/course_5977/datasets/
recruitment_data.csv, destfile = recruitment_data.csv)\\n# Load packages\\nlibrary(readr)\\n\\n# Import the recruitment data\\nrecruitment <- read_csv(recruitment_data.csv)\\nlibrary(dplyr)","solution","","type","TabExercise","id",222109,"subexercises",["^19",[["^0",["sample_code","# Find the average sales quota attainment \\nrecruitment %>%\\n  ___","sct","\\nmsg <- Looks like you didn't calculate the average sales quota attainment correctly. Did you use `summarize()` and `mean()` functions, and name the resulting column `avg_sales_quota_pct`?\\n\\ncheck_correct({\\n    ex() %>% check_output_expr(summarize(recruitment, avg_sales_quota_pct = mean(sales_quota_pct)), missing_msg = msg, append = FALSE)\\n}, {\\n    ex() %>% {\\n        check_function(., summarize) %>% {\\n            check_arg(., .data) %>% check_equal(incorrect_msg = Did you pipe `recruitment` into `summarize()`?, append = FALSE)\\n            check_result(.) %>% check_equal(incorrect_msg = msg, append = FALSE)\\n        }\\n    }\\n})","instructions","<p><code>recruitment</code> and <code>dplyr</code> are loaded in your workspace. </p>\\n<ul>\\n<li>Use <code>summarize()</code> to calculate the average sales quota attainment. Store it in a new column called <code>avg_sales_quota_pct</code>.</li>\\n</ul>","question","","hint","<p>The <code>summarize()</code> function applies a function over the dataset. 
Use <code>mean()</code> to find the average sales quota attainment.</p>","possible_answers",["^19",[]],"number",1,"assignment",null,"feedbacks",["^19",[]],"attachments",null,"title",null,"xp",50,"pre_exercise_code","","solution","# Find the average sales quota attainment\\nrecruitment %>%\\n  summarize(avg_sales_quota_pct = mean(sales_quota_pct)) ","type","NormalExercise","id",292630]],["^0",["sample_code","# Find the average sales quota attainment for each recruiting source\\navg_sales <- ___\\n\\n# Display the result\\navg_sales","sct","\\nmsg <- Looks like you didn't calculate the average sales quota attainment for each recruiting source correctly. You need to use `group_by()`, `summarize()` and `mean()` functions, and name the resulting column `avg_sales_quota_pct`.\\n\\ncheck_correct({\\n   ex() %>% check_object(avg_sales) %>% check_equal(incorrect_msg = msg, append = FALSE)\\n}, {\\n    ex() %>% {\\n        check_function(., group_by) %>% {\\n            check_arg(., .data) %>% check_equal(incorrect_msg = Did you pipe `recruitment` into `group_by()`?, append = FALSE)\\n            check_result(.) %>% check_equal(incorrect_msg = Did you group by `recruiting_source`?, append = FALSE)\\n        }\\n        check_function(., summarize) %>% {\\n            check_arg(., .data) %>% check_equal(incorrect_msg = Did you pipe the result of `group_by()` into `summarize()`?, append = FALSE)\\n            check_result(.) %>% check_equal(incorrect_msg = msg, append = FALSE)\\n        }\\n        \\n    }\\n})\\n\\nex() %>% {\\n    check_output_expr(., avg_sales, missing_msg = Do not remove the line of code that prints `avg_sales` to console., append = FALSE)\\n    check_error(.)\\n}\\n\\nsuccess_msg(You did it! Look at the output. Which recruiting source produces hires with the highest sales?)","instructions","<p>Use <code>summarize()</code> to calculate the average sales quota attainment <strong>within each recruiting source</strong>. 
Store it in a new column called <code>avg_sales_quota_pct</code>. Assign the result to <code>avg_sales</code>.</p>","question","","hint","<p>Use <code>group_by(recruiting_source)</code> to add <code>recruiting_source</code> as a grouping variable. </p>","possible_answers",["^19",[]],"number",2,"assignment",null,"feedbacks",["^19",[]],"attachments",null,"title",null,"xp",50,"pre_exercise_code","","solution","# Find the average sales quota attainment for each recruiting source\\navg_sales <- recruitment %>%\\n  group_by(recruiting_source) %>% \\n  summarize(avg_sales_quota_pct = mean(sales_quota_pct)) \\n  \\n# Display the result\\navg_sales","type","NormalExercise","id",292632]]]]]],["^0",["sample_code","# Find the average attrition
for the sales team, by recruiting source, sorted from lowest attrition rate to highest\\navg_attrition <- recruitment %>%\\n  ___ %>% \\n  ___ %>% \\n  ___\\n\\n# Display the result\\navg_attrition","sct","\\ncheck_correct({\\n    ex() %>% check_object(avg_attrition) %>% check_equal()\\n}, {\\n    ex() %>% {\\n        check_function(., group_by) %>% {\\n            check_arg(., .data) %>% check_equal(incorrect_msg = Did you pipe `recruitment` into `group_by()`?, append = FALSE)\\n            check_result(.) %>% check_equal(incorrect_msg = You need to group by the `recruiting_source` column., append = FALSE)\\n        }\\n        check_function(., summarize) %>% {\\n            check_arg(., .data) %>% check_equal(incorrect_msg = Did you pipe the output of `group_by()` into `summarize()`?, append = FALSE)\\n            check_result(.) %>% check_equal(incorrect_msg = Did you calculate the mean attrition rate as `attrition_rate`?, append = FALSE)\\n        }\\n        check_function(., arrange) %>% {\\n            check_arg(., .data) %>% check_equal(incorrect_msg = Did you pipe the output of `summarize()` into `arrange()`?, append = FALSE)\\n            check_result(.) %>% check_equal(incorrect_msg = Did you arrange by `attrition_rate`?, append = FALSE)\\n        }\\n    }\\n})\\n\\nex() %>% {\\n    check_output_expr(., avg_attrition, missing_msg = Do not remove the line of code that prints `avg_attrition` to console., append = FALSE)\\n    check_error(.)\\n}\\n\\nsuccess_msg(Excellent! Now let's learn how to visualize those results.)","instructions","<p><code>recruitment</code> and <code>dplyr</code> are loaded in your workspace. </p>\\n<ul>\\n<li>Use <code>summarize()</code> to calculate the attrition rate <strong>within each recruiting source</strong>. 
Store it in a new column called <code>attrition_rate</code>.</li>\\n<li>Sort the result by attrition rate, from lowest to highest, and assign it to <code>avg_attrition</code>.</li>\\n</ul>","question","","hint","<ul>\\n<li>The attrition rate is calculated with <code>mean()</code>. Be sure to <code>group_by()</code> the recruiting source first. This exercise is very similar to the last one.</li>\\n<li>To sort by a variable <code>x</code> within a pipeline, use <code>arrange(x)</code>.</li>\\n</ul>","possible_answers",["^19",[]],"number",7,"randomNumber",0.2201465533206446,"assignment","<p>Another quality of hire metric you can consider is the attrition rate, or how often hires leave the company. Determine which recruiting channels have the highest and lowest attrition rates. </p>\\n<p>In the last exercise, the output was a data frame with the recruiting channels and the average quota attainment. It would have been easier to tell which channel had the highest-performing employees if it were sorted with <code>arrange()</code>.</p>","feedbacks",["^19",[]],"attachments",null,"title","Attrition rates by recruiting source","xp",100,"language","r","pre_exercise_code","download.file(http://s3.amazonaws.com/assets.datacamp.com/production/course_5977/datasets/recruitment_data.csv, destfile = recruitment_data.csv)\\n\\n# Load packages\\nlibrary(readr)\\n\\n# Import the recruitment data\\nrecruitment <- read_csv(recruitment_data.csv)\\n\\nlibrary(dplyr)","solution","# Find the average attrition for the sales team, by recruiting source, sorted from lowest attrition rate to highest\\navg_attrition <- recruitment %>%\\n  group_by(recruiting_source) %>% \\n  summarize(attrition_rate = mean(attrition)) %>% \\n  arrange(attrition_rate)\\n  \\n# Display the 
result\\navg_attrition","type","NormalExercise","id",222110]],["^0",["sample_code","","sct","","aspect_ratio",56.25,"instructions",null,"question","","hint",null,"possible_answers",["^19",[]],"number",8,"video_hls",null,"randomNumber",0.9741008948072372,"assignment",null,"feedbacks",["^19",[]],"attachments",null,"title","Visualizing the recruiting data","xp",50,"language","r","pre_exercise_code","","solution","","type","VideoExercise","id",267863,"projector_key","course_5977_60f9b307a3483cc6cd3d0498827cbcce","video_link",null]],["^0",["
sample_code","# Load the ggplot2 package\\n___\\n\\n# Plot the bar chart\\n___\\n","sct","\\nmsg <- Did you call `geom_col()` or `geom_bar(stat = 'identity')` to plot the bar chart?\\n\\ntest_library_function(ggplot2)\\n\\nex() %>% {\\n    check_function(., ggplot) %>% check_arg(data) %>% check_equal() \\n    check_function(., aes) %>% {\\n            check_arg(., x) %>% check_equal(eval = FALSE)\\n            check_arg(., y) %>% check_equal(eval = FALSE)\\n    }\\n}\\n\\ncheck_or(\\n    ex() %>% check_function(geom_col, not_called_msg = msg, append = FALSE), \\n    ex() %>% override_solution(geom_bar(stat = 'identity')) %>% check_function(geom_bar) %>% check_arg(stat) %>% check_equal(incorrect_msg = msg, append = FALSE)\\n)\\n\\nex() %>% check_error()\\n\\nsuccess_msg(Great! Take a moment to admire your work.)","instructions","<p>The summarized data, <code>avg_sales</code>, is available in your workspace. </p>\\n<ul>\\n<li>Load the <code>ggplot2</code> package (<code>dplyr</code> is already loaded).</li>\\n<li>Using <code>ggplot()</code>, plot a bar chart from the <code>avg_sales</code> data. Place the recruiting source on the x-axis and <code>avg_sales_quota_pct</code> on the y-axis. </li>\\n</ul>","question","","hint","<p>Look at the <code>ggplot()</code> code sample from the slides. Your data is <code>avg_sales</code>, and the x and y-axes should be specified inside <code>aes()</code>. </p>","possible_answers",["^19",[]],"number",9,"randomNumber",0.6790770393259535,"assignment","<p>The last step in the HR analytics process is to test and plot the results. For now, you'll focus on visualizing the data from the previous exercises. 
You'll be making a bar chart so you can more easily see the average sales quota attainment for each recruiting channel.</p>","feedbacks",["^19",[]],"attachments",null,"title","Visualizing the sales performance differences","xp",100,"language","r","pre_exercise_code","download.file(http://s3.amazonaws.com/assets.datacamp.com/production/course_5977/datasets/recruitment_data.csv, destfile = recruitment_data.csv)\\n\\n# Load packages\\nlibrary(readr)\\nlibrary(dplyr)\\n\\n# Import the recruitment data\\nrecruitment <- read_csv(recruitment_data.csv)\\n\\n# Create the avg_sales dataframe\\navg_sales <- recruitment %>%\\n  group_by(recruiting_source) %>% \\n  summarize(avg_sales_quota_pct = mean(sales_quota_pct)) ","solution","# Load the ggplot2 package\\nlibrary(ggplot2)\\n\\n# Plot the bar chart\\nggplot(avg_sales, aes(x = recruiting_source, y = avg_sales_quota_pct)) +\\n  geom_col()","type","NormalExercise","id",267864]],["^0",["sample_code","# Load ggplot2\\nlibrary(ggplot2)\\n\\n# Plot the bar chart\\n\\n","sct","\\nmsg <- Did you call `geom_col()` or `geom_bar(stat = 'identity')` to plot the bar chart?\\n\\ntest_library_function(ggplot2)\\n\\nex() %>% {\\n    check_function(., ggplot) %>% check_arg(data) %>% check_equal() \\n    check_function(., aes) %>% {\\n            check_arg(., x) %>% check_equal(eval = FALSE)\\n            check_arg(., y) %>% check_equal(eval = FALSE)\\n    }\\n}\\n\\ncheck_or(\\n    ex() %>% check_function(geom_col, not_called_msg = msg, append = FALSE), \\n    ex() %>% override_solution(geom_bar(stat = 'identity')) %>% check_function(geom_bar) %>% check_arg(stat) %>% check_equal(incorrect_msg = msg, append = FALSE)\\n)\\n\\nex() %>% check_error()\\n\\n\\nsuccess_msg(Well done.)","instructions","<p>The summarized data, <code>avg_attrition</code>, is available in your workspace.</p>\\n<ul>\\n<li>Plot a bar chart from the <code>avg_attrition</code> data. Place the recruiting source on the x-axis and the average attrition on the y-axis. 
</li>\\n</ul>","question","","hint","<ul>\\n<li>Repeat the bar chart from the previous exercise, but use the <code>avg_attrition</code> data.</li>\\n<li>If you need to remember what the <code>avg_attrition</code> dataset looks like, you can view it in the console with <code>head(avg_attrition)</code>.</li>\\n</ul>","possible_answers",["^19",[]],"number",10,"randomNumber",0.821329965303601,"assignment","<p>You've been using two quality of
hire metrics to compare the recruiting channels. In addition to looking at the sales output of the hires, you are also looking at the attrition rates. Plot a bar chart again, but this time plot average attrition instead of sales quota attainment.</p>","feedbacks",["^19",[]],"attachments",null,"title","Visualizing the attrition differences","xp",100,"language","r","pre_exercise_code","download.file(http://s3.amazonaws.com/assets.datacamp.com/production/course_5977/datasets/recruitment_data.csv, destfile = recruitment_data.csv)\\n\\n# Load packages\\nlibrary(readr)\\nlibrary(dplyr)\\n\\n# Import the recruitment data\\nrecruitment <- read_csv(recruitment_data.csv)\\n\\n# Create the avg_attrition dataframe\\navg_attrition <- recruitment %>%\\n  group_by(recruiting_source) %>% \\n  summarize(attrition_rate = mean(attrition)) %>% \\n  arrange(attrition_rate)","solution","# Load ggplot2\\nlibrary(ggplot2)\\n\\n# Plot the bar chart\\nggplot(avg_attrition, aes(x = recruiting_source, y = attrition_rate)) +\\n  geom_col()","type","NormalExercise","id",222111]],["^0",["sample_code","","sct","msg1 = Incorrect. You correctly identified NA as the 'group' with the best numbers, but remember that NA is not a hiring source; NA means that the hiring source is missing. \\nmsg2 = Try again. You can look at `avg_attrition` and `avg_sales` in the console.\\nmsg3 = You got it!\\nmsg4 = That's not it. Try looking at `avg_attrition` and `avg_sales` in the console.\\ntest_mc(3, feedback_msgs = c(msg1, msg2, msg3, msg4))","instructions",["^19",["Best: <strong>NA</strong>, Worst: <strong>Search Firm</strong>","Best: <strong>Search Firm</strong>, Worst: <strong>Campus</strong>","Best: <strong>Applied Online</strong>, Worst: <strong>Search Firm</strong>","Best: <strong>Referral</strong>, Worst: <strong>Applied Online</strong>"]],"question","","hint","<p>A hiring source is considered better if it produces candidates with high sales and low attrition. 
Look at <code>avg_attrition</code> and <code>avg_sales</code> in the console to see which hiring sources are the best and worst by those criteria.</p>","possible_answers",["^19",[]],"number",11,"randomNumber",0.01085104029736339,"assignment","<p>Now it's time to draw conclusions from your analysis. Both <code>avg_attrition</code> and <code>avg_sales</code> are available in the workspace. Which of the recruiting sources in this dataset produced the best hires, measured by attrition and sales? Which source produced the worst hires?</p>","feedbacks",["^19",[]],"attachments",null,"title","Drawing conclusions","xp",50,"language","r","pre_exercise_code","download.file(http://s3.amazonaws.com/assets.datacamp.com/production/course_5977/datasets/recruitment_data.csv, destfile = recruitment_data.csv)\\n\\n# Load packages\\nlibrary(readr)\\nlibrary(dplyr)\\nlibrary(ggplot2)\\n\\n# Import the recruitment data\\nrecruitment <- read_csv(recruitment_data.csv)\\n\\n# Recreate the data frames\\navg_sales <- recruitment %>%\\n  group_by(recruiting_source) %>% \\n  summarize(avg_sales_quota_pct = mean(sales_quota_pct)) %>% \\n  arrange(avg_sales_quota_pct)\\n  \\navg_attrition <- recruitment %>%\\n  group_by(recruiting_source) %>% \\n  summarize(attrition_rate = mean(attrition)) %>% \\n  arrange(attrition_rate)","solution","","type","MultipleChoiceExercise","id",222112]]]]]]]]"