- Year Month Day Qty Amount Item Customer
- 0 2003 9 1 30.0 220.80 N2719 3110361
- 1 2003 9 1 1.0 75.17 X1046 3126034
- 2 2003 9 1 240.0 379.20 D5853 0008933
- 3 2003 9 1 2112.0 2787.84 D5851 0008933
- 4 2003 9 1 3312.0 4371.84 D5851 0008933
- ...
- ...
- <2.7M rows>
- df.set_index(['Item', 'Customer', 'Year', 'Month', 'Day'], inplace=True, drop=True)
- df.sort_index(inplace=True)
- Item Customer Year Month Day Qty Amount
- X1046 3126034 2003 9 1 1.0 75.17
- < ... other transactions for X1046/3126034 item/customer combination ...>
- 3126035 2005 1 2 50.0 500.00
- < ... other transactions for X1046/3126035 item/customer combination ...>
- < ... 48 other customers for X1046 ...>
- N2719 3110361 2003 9 1 30.0 220.80
- < ... other transactions for N2719/3110361 item/customer combination ...>
- 3110362 2004 9 10 9.0 823.00
- < ... other transactions for N2719/3110362 item/customer combination ...>
- < ... 198 other customers for N2719 ... >
- < ... 6998 other items ... >
- item_by_customers = df.reset_index().groupby('Item')['Customer'].nunique().sort_values(ascending=False)
- Item
- N2719 200
- X1046 50
- <... 6998 other rows ...>
- sorted_data = df.set_index(item_by_customers.index)
- < ... gives me ValueError: Length mismatch: Expected axis has 2.7M elements, new values have 7000 elements ...>
- sorted_data = df.reindex(index=item_by_customers.index, columns=['Item'])
- < ... gives me Exception: cannot handle a non-unique multi-index! ...>
- Item Customer Year Month Day Qty Amount
- N2719 3110361 2003 9 1 30.0 220.80
- < ... other transactions for N2719/3110361 item/customer combination ...>
- 3110362 2004 9 10 9.0 823.00
- < ... other transactions for N2719/3110362 item/customer combination ...>
- < ... 198 other customers for N2719 ... >
- X1046 3126034 2003 9 1 1.0 75.17
- < ... other transactions for X1046/3126034 item/customer combination ...>
- 3126035 2005 1 2 50.0 500.00
- < ... other transactions for X1046/3126035 item/customer combination ...>
- < ... 48 other customers for X1046 ...>
- < ... 6998 other items ... >
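- # 'Customer' is an index level after set_index, so reset the index before grouping.
- # A stable sort (mergesort) keeps each item's rows in their existing order.
- i = df.reset_index().groupby('Item')['Customer'].transform('nunique').mul(-1).argsort(kind='mergesort')
- sorted_data = df.iloc[i.to_numpy()]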
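The ordering asked for above (items with the most distinct customers first, each item's rows kept together) can also be sketched with an explicit helper column and `sort_values`. This is a minimal sketch on a tiny made-up sample, not the original 2.7M-row data. The helper column name `n_customers` and the alphabetical tie-break on `Item` are assumptions for illustration.

```python
import pandas as pd

# Tiny made-up sample in the same shape as the flat table (values are illustrative).
df = pd.DataFrame({
    "Year": [2003, 2003, 2004, 2005, 2003],
    "Month": [9, 9, 9, 1, 9],
    "Day": [1, 1, 10, 2, 1],
    "Qty": [1.0, 30.0, 9.0, 50.0, 240.0],
    "Amount": [75.17, 220.80, 823.00, 500.00, 379.20],
    "Item": ["X1046", "N2719", "N2719", "X1046", "D5853"],
    "Customer": ["3126034", "3110361", "3110362", "3126034", "0008933"],
})

# Broadcast each item's distinct-customer count to every one of its rows.
# 'n_customers' is a hypothetical helper column, dropped again below.
flat = df.copy()
flat["n_customers"] = flat.groupby("Item")["Customer"].transform("nunique")

# Sort by the count (descending), then by the index keys so that rows of the
# same item stay contiguous even when two items have the same count.
keys = ["Item", "Customer", "Year", "Month", "Day"]
sorted_data = (
    flat.sort_values(["n_customers"] + keys,
                     ascending=[False] + [True] * len(keys))
        .drop(columns="n_customers")
        .set_index(keys)
)

print(sorted_data)
```

Sorting on the count plus the key columns avoids relying on the stability of `argsort`, at the cost of one temporary column.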