Venciity

Data Science Terms and Hints

Nov 16th, 2017
Histograms
Help you understand the distribution of a numeric variable in a way that the mean or median alone cannot

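To illustrate the point about histograms, here is a minimal sketch using only the Python standard library. The two samples are made up for this example (they are not from the notes). They share the same mean and median, yet a crude text histogram shows they are distributed very differently. The helper name text_histogram and the bin width of 20 are arbitrary choices for the illustration.

```python
from collections import Counter
from statistics import mean, median

BIN_WIDTH = 20  # arbitrary bin width chosen for this illustration

def text_histogram(values, bin_width):
    """Count values per bin; each bin is keyed by its lower edge."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

# Two made-up samples with the same mean and the same median.
clustered = [48, 49, 50, 50, 50, 51, 52]
two_peaks = [10, 10, 10, 50, 90, 90, 90]

for name, data in [("clustered", clustered), ("two_peaks", two_peaks)]:
    print(name, "mean =", mean(data), "median =", median(data))
    for lower, count in text_histogram(data, BIN_WIDTH).items():
        # One '#' per value that falls in the bin.
        print(f"  {lower:>3}-{lower + BIN_WIDTH - 1:<3} {'#' * count}")
```

The summary statistics are identical for both samples, but the bin counts are not: one sample sits in a single bin while the other splits into two distant groups.
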
Time Series
Any chart that shows a trend over time

Scatter Plots
A way to visualize how two numeric variables are related in your data
Help you find outliers

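On a scatter plot you would normally spot outliers by eye. As a rough numeric stand-in for that, the sketch below fits a straight line through made-up points and flags any point that lies far from it. The least-squares fit, the "more than 2 standard deviations of the residuals" rule, and the names fit_line and find_outliers are all assumptions for this illustration, not something the notes prescribe.

```python
from statistics import mean, pstdev

def fit_line(xs, ys):
    """Ordinary least-squares line through the points; returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def find_outliers(xs, ys, threshold=2.0):
    """Points whose distance from the fitted line exceeds threshold * residual spread."""
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    spread = pstdev(residuals)
    return [(x, y) for x, y, r in zip(xs, ys, residuals) if abs(r) > threshold * spread]

# Made-up data: y is roughly 2 * x, except for one point.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2, 4, 6, 8, 30, 12, 14, 16]

print(find_outliers(xs, ys))
```

A single extreme point also pulls the fitted line toward itself, so this rule is only a first pass. Looking at the actual plot is still the more reliable check.
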
Bar Graphs
A convenient way to compare numeric values across several groups
Good for comparing multiple values per group (example: revenue and cost)

What to do when your data is too big
1. Data Aggregation
A way to summarize your data (for example, as totals or averages per group) so that the important information is easy to see

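As a small sketch of data aggregation, the code below collapses individual records into one summary row per group. The records, the region names, and the function name aggregate_by_region are hypothetical and exist only for this example.

```python
from collections import defaultdict

# Hypothetical raw records: (region, revenue, cost).
orders = [
    ("north", 120.0, 80.0),
    ("south", 200.0, 150.0),
    ("north", 90.0, 60.0),
    ("east", 50.0, 45.0),
    ("south", 130.0, 70.0),
]

def aggregate_by_region(rows):
    """Collapse individual records into per-region order counts and totals."""
    totals = defaultdict(lambda: {"orders": 0, "revenue": 0.0, "cost": 0.0})
    for region, revenue, cost in rows:
        entry = totals[region]
        entry["orders"] += 1
        entry["revenue"] += revenue
        entry["cost"] += cost
    return dict(totals)

summary = aggregate_by_region(orders)
for region, entry in sorted(summary.items()):
    print(region, entry)
```

Five records become three rows here. On real data the same idea turns millions of rows into a table small enough to read or to draw as a bar graph.
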
2. Sampling
The idea is that you can select a random subset of the original data to learn about the properties of the original data
Example: When polling firms survey people about particular political candidates, they use a similar technique, because it's impossible to survey everyone in a given country
Notes:
* Remember to try sampling a few times to make sure that the result you're seeing doesn't depend on the particular samples you selected.
* The fact that you can sample a subset of your data and still get meaningful results suggests that perhaps you didn't need to collect all the data in the first place. For example, if your website has 100 million users, it might not be necessary to keep track of all of their behaviors on the website. It might be sufficient to sample only 10 percent of the users for data analysis purposes.
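
The sampling notes above can be tried out with a short sketch. The population here is simulated (100,000 made-up session lengths), and the 10 percent fraction, the seeds, and the name sample_mean are assumptions for the illustration. The code draws several independent samples and compares each sample mean with the mean of the full data, which is the "try sampling a few times" check from the first note.

```python
import random

def average(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)

# Hypothetical population: 100,000 simulated session lengths in minutes.
population_rng = random.Random(0)
population = [population_rng.expovariate(1 / 5) for _ in range(100_000)]

def sample_mean(data, fraction, seed):
    """Mean of a random sample, drawn without replacement, of the given fraction."""
    rng = random.Random(seed)
    k = int(len(data) * fraction)
    return average(rng.sample(data, k))

full_mean = average(population)
# Repeat the sampling with different seeds to see how much the result moves.
sample_means = [sample_mean(population, 0.10, seed) for seed in range(5)]

print(f"full data: {full_mean:.3f}")
for seed, value in enumerate(sample_means):
    print(f"sample {seed}:  {value:.3f}")
```

If the sample means stay close to each other and to the full-data mean, the 10 percent sample is probably enough for this particular statistic. Rarer properties of the data may need a larger sample.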