DTruth14

Labeling data slices, task 5

Feb 19th, 2020
Python 0.52 KB
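import pandas as pd

# Load the tab-separated visits log.
data = pd.read_csv('/datasets/visits_eng.csv', sep='\t')

# Parse the timestamps and shift them back 7 hours to get local time.
data['local_time'] = (
    pd.to_datetime(data['date_time'], format='%Y-%m-%dT%H:%M:%S')
    + pd.Timedelta(hours=-7)
)

# Flag visits with time_spent below 60 as too fast.
data['too_fast'] = data['time_spent'] < 60
#print(data.head())
#print(data['too_fast'].mean())

# pivot_table aggregates with the mean by default, so each row holds
# the share of too-fast visits for one id.
too_fast_stat = data.pivot_table(index='id', values='too_fast')
#print(too_fast_stat.head())
too_fast_stat.hist(bins=30)

# Same slice for visits with time_spent above 1000 (too slow).
data['too_slow'] = data['time_spent'] > 1000
data.pivot_table(index='id', values='too_slow').hist(bins=30)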