Untitled

a guest
Jul 24th, 2017
import pandas as pd

df1 = pd.DataFrame({
    'date': ['31-05-2017', '31-05-2017', '31-05-2017', '31-05-2017', '01-06-2017', '01-06-2017'],
    'tag': ['A', 'B', 'B', 'B', 'A', 'A'],
    'metric1': [0, 0, 0, 1, 1, 1],
    'metric2': [0, 1, 1, 0, 1, 0]
})

df2 = pd.DataFrame({
    'date': ['31-05-2017', '31-05-2017', '01-06-2017'],
    'tag': ['A', 'B', 'A'],
    'metric3': [25, 3, 7]
})

# Desired result:
date       | tag | metric1_sum | metric2_sum | metric2_percentage | metric3
-----------|-----|-------------|-------------|--------------------|--------
31-05-2017 | A   | 0           | 0           | 0                  | 25
31-05-2017 | B   | 1           | 2           | 0.667              | 3
01-06-2017 | A   | 2           | 1           | 0.5                | 7

>>> g = df1.groupby(['date', 'tag']).agg(sum)
>>> g
                metric1  metric2
date       tag
01-06-2017 A          2        1
31-05-2017 A          0        0
           B          1        2

>>> g.groupby(level=0).apply(lambda x: x/float(x.sum()))
                metric2
date       tag
01-06-2017 A        1.0
31-05-2017 A        0.0
           B        1.0

>>> pd.merge(g, df2, how='left', on=['date', 'tag'])
KeyError: 'date'
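
One possible way to get from df1 and df2 to the desired table, sketched under a few assumptions. The KeyError most likely arises because 'date' and 'tag' are index levels of g after the groupby, not columns, so the pandas version in use cannot find them as merge keys. Calling reset_index() turns them back into columns. The sketch also assumes that metric2_percentage means the mean of metric2 within each (date, tag) group, which matches the 0, 0.667 and 0.5 values in the table, and it uses named aggregation, which needs pandas 0.25 or later. The name `result` is only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({
    'date': ['31-05-2017', '31-05-2017', '31-05-2017', '31-05-2017', '01-06-2017', '01-06-2017'],
    'tag': ['A', 'B', 'B', 'B', 'A', 'A'],
    'metric1': [0, 0, 0, 1, 1, 1],
    'metric2': [0, 1, 1, 0, 1, 0]
})

df2 = pd.DataFrame({
    'date': ['31-05-2017', '31-05-2017', '01-06-2017'],
    'tag': ['A', 'B', 'A'],
    'metric3': [25, 3, 7]
})

# Aggregate per (date, tag). sort=False keeps groups in order of first
# appearance, so the string dates are not reordered alphabetically.
g = (
    df1.groupby(['date', 'tag'], sort=False)
       .agg(
           metric1_sum=('metric1', 'sum'),
           metric2_sum=('metric2', 'sum'),
           # Assumption: "percentage" is the share of rows with metric2 == 1
           metric2_percentage=('metric2', 'mean'),
       )
       .reset_index()  # move 'date' and 'tag' from the index back to columns
)

# With the keys available as ordinary columns, the left merge can find them.
result = g.merge(df2, how='left', on=['date', 'tag'])
print(result)
```

If the grouped frame has to keep its MultiIndex, an alternative would be to give df2 the same index with df2.set_index(['date', 'tag']) and use g.join(...) instead of merge.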