Untitled

a guest
Jun 18th, 2019
   Year  Month  Day     Qty   Amount   Item  Customer
0  2003      9    1    30.0   220.80  N2719   3110361
1  2003      9    1     1.0    75.17  X1046   3126034
2  2003      9    1   240.0   379.20  D5853   0008933
3  2003      9    1  2112.0  2787.84  D5851   0008933
4  2003      9    1  3312.0  4371.84  D5851   0008933
...
...
<2.7M rows>

# Index by item/customer/date, then sort by all index levels
df.set_index(['Item', 'Customer', 'Year', 'Month', 'Day'], inplace=True, drop=True)
df.sort_index(inplace=True)

Item   Customer  Year  Month  Day    Qty  Amount
X1046  3126034   2003  9      1      1.0   75.17
       < ... other transactions for X1046/3126034 item/customer combination ...>
       3126035   2005  1      2     50.0  500.00
       < ... other transactions for X1046/3126035 item/customer combination ...>
       < ... 48 other customers for X1046 ...>

N2719  3110361   2003  9      1     30.0  220.80
       < ... other transactions for N2719/3110361 item/customer combination ...>
       3110362   2004  9      10     9.0  823.00
       < ... other transactions for N2719/3110362 item/customer combination ...>
       < ... 198 other customers for N2719 ... >
< ... 6998 other items ... >

# Number of distinct customers per item, largest first
item_by_customers = df.reset_index().groupby('Item')['Customer'].nunique().sort_values(ascending=False)

Item
N2719    200
X1046     50
<... 6998 other rows ...>

# Attempt 1: use the per-item ordering as the new index
sorted_data = df.set_index(item_by_customers.index)
< ... gives me ValueError: Length mismatch: Expected axis has 2.7M elements, new values have 7000 elements ...>

# Attempt 2: reindex against the per-item ordering
sorted_data = df.reindex(index=item_by_customers.index, columns=['Item'])
< ... gives me Exception: cannot handle a non-unique multi-index! ...>

# Desired result: items ordered by number of distinct customers, descending
Item   Customer  Year  Month  Day    Qty  Amount
N2719  3110361   2003  9      1     30.0  220.80
       < ... other transactions for N2719/3110361 item/customer combination ...>
       3110362   2004  9      10     9.0  823.00
       < ... other transactions for N2719/3110362 item/customer combination ...>
       < ... 198 other customers for N2719 ... >

X1046  3126034   2003  9      1      1.0   75.17
       < ... other transactions for X1046/3126034 item/customer combination ...>
       3126035   2005  1      2     50.0  500.00
       < ... other transactions for X1046/3126035 item/customer combination ...>
       < ... 48 other customers for X1046 ...>

< ... 6998 other items ... >

# Item and Customer are index levels at this point, so flatten before grouping on them
flat = df.reset_index()
# Per-row distinct-customer count of the row's item, negated so the largest sorts first;
# mergesort is stable, so items with equal counts stay contiguous
i = flat.groupby('Item')['Customer'].transform('nunique').mul(-1).argsort(kind='mergesort')
sorted_data = flat.iloc[i.to_numpy()].set_index(['Item', 'Customer', 'Year', 'Month', 'Day'])
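
The transform/argsort idea above can be tried on a small frame. This is only a sketch under assumptions: the six rows, the short item and customer codes, and the names flat, counts and order are made up for illustration and are not taken from the 2.7M-row data.

```python
import pandas as pd

# Hypothetical miniature of the transaction table (illustrative values only).
df = pd.DataFrame({
    "Item":     ["X1", "N2", "N2", "X1", "N2", "D5"],
    "Customer": ["c1", "c2", "c3", "c1", "c4", "c9"],
    "Qty":      [1.0, 30.0, 9.0, 2.0, 5.0, 240.0],
})
df = df.set_index(["Item", "Customer"]).sort_index()

# Group on a flat copy, since Item and Customer are index levels here.
flat = df.reset_index()

# Distinct customers per item, largest first.
item_by_customers = (
    flat.groupby("Item")["Customer"].nunique().sort_values(ascending=False)
)

# Broadcast each item's count onto its own rows, negate it so the largest
# count sorts first, and take a stable argsort so rows that tie keep their
# existing (item-sorted) order.
counts = flat.groupby("Item")["Customer"].transform("nunique")
order = counts.mul(-1).argsort(kind="mergesort").to_numpy()

# Reorder the rows by position, then restore the MultiIndex.
sorted_data = flat.iloc[order].set_index(["Item", "Customer"])
print(sorted_data)
```

Because the reordering is positional, it never needs a unique index, which is likely why it avoids both errors shown earlier. Every one of the original rows is kept, just in a different order.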