Untitled

a guest
May 25th, 2018
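import numpy as np
import pandas as pd
from scipy.stats import iqr  # assumption: the original never shows where iqr comes from

data = {'code': ['a', 'b', 'a', 'c', 'c', 'c', 'c'],
        'cost': [10, 20, 100, 10, 10, 500, 10]}
df = pd.DataFrame(data)

# Per-code total and mean cost; agg() already returns a DataFrame,
# so the trailing .apply(pd.Series) was a no-op and is dropped.
grouped = df.groupby('code')['cost'].agg(['sum', 'mean'])

def is_outlier(s):
    """Return the share of values at or above Q3 + 1.5 * IQR, or NaN for small samples."""
    # Only calculate outliers when we have at least 100 observations
    if s.count() >= 100:
        return np.where(s >= s.quantile(0.75) + 1.5 * iqr(s), 1, 0).mean()
    else:
        return np.nan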
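
The paste defines is_outlier but never calls it. One plausible use, sketched here as an assumption rather than the author's stated intent, is to pass it to the same groupby aggregation next to 'sum' and 'mean'. The helper name outlier_share and its min_count parameter are hypothetical; min_count is lowered to 4 only so the seven-row sample produces a non-NaN value. The IQR is computed from pandas quantiles, so SciPy is not needed.

```python
import numpy as np
import pandas as pd


def outlier_share(s, min_count=100):
    """Share of values at or above Q3 + 1.5 * IQR; NaN when the group is too small.

    Hypothetical variant of is_outlier with a configurable threshold.
    """
    if s.count() < min_count:
        return np.nan
    q1, q3 = s.quantile(0.25), s.quantile(0.75)
    upper_fence = q3 + 1.5 * (q3 - q1)
    return (s >= upper_fence).mean()


data = {'code': ['a', 'b', 'a', 'c', 'c', 'c', 'c'],
        'cost': [10, 20, 100, 10, 10, 500, 10]}
df = pd.DataFrame(data)

# Named aggregation (pandas 0.25 or later) places the custom
# function in its own column beside the built-in reducers.
grouped = df.groupby('code')['cost'].agg(
    sum='sum',
    mean='mean',
    outlier_share=lambda s: outlier_share(s, min_count=4),
)
print(grouped)
```

With the original threshold of 100, every group in this sample would return NaN, because no code has that many rows.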