- import pandas as pd
- parentData = [(1, 'A', 100), (2, 'B', 200)]
- parentCols = ['PID', 'PATTR1', 'PATTR2']
- parentDf = pd.DataFrame.from_records(parentData, columns=parentCols)
- Parent Dataframe
- PID PATTR1 PATTR2
- 0 1 A 100
- 1 2 B 200
- childData = [(201, 1, 'AA', 2100), (202, 2, 'BB', 2200), (203, 2, 'CC', 2300)]
- childCols = ['CID', 'PID', 'CATTR1', 'CATTR2']
- childDf = pd.DataFrame.from_records(childData, columns=childCols)
- Child Dataframe
- CID PID CATTR1 CATTR2
- 0 201 1 AA 2100
- 1 202 2 BB 2200
- 2 203 2 CC 2300
- mergedDf = parentDf.merge(childDf, left_on='PID', right_on='PID', how='outer')
- Parent merged with Child dataframe
- PID PATTR1 PATTR2 CID CATTR1 CATTR2
- 0 1 A 100 201 AA 2100
- 1 2 B 200 202 BB 2200
- 2 2 B 200 203 CC 2300
- | ???? | ????
- PID PATTR1 PATTR2 | CID CATTR1 CATTR2 | CID CATTR1 CATTR2
- 0 1 A 100 | 201 AA 2100 |
- 1 2 B 200 | 202 BB 2200 | 203 CC 2300
- mergedDf.assign(G=mergedDf.groupby('PID').cumcount()).set_index(['PID','PATTR1','PATTR2','G']).unstack().swaplevel(0, 1, axis=1).sort_index(axis=1, level=0)
- Out[218]:
- G 0 1
- CATTR1 CATTR2 CID CATTR1 CATTR2 CID
- PID PATTR1 PATTR2
- 1 A 100 AA 2100.0 201.0 None NaN NaN
- 2 B 200 BB 2200.0 202.0 CC 2300.0 203.0
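
The one-liner above can be unpacked into named steps. The following is a minimal sketch under a few assumptions: the snake_case variable names, the `G` helper column, and the `CID_0`/`CID_1` style flattened column names are illustrative choices, not part of the original paste. It numbers each child within its parent with `cumcount`, unstacks that number into the columns, and then flattens the two-level column header so that each parent occupies a single row.

```python
import pandas as pd

# Same sample data as in the paste above.
parent_df = pd.DataFrame.from_records(
    [(1, "A", 100), (2, "B", 200)],
    columns=["PID", "PATTR1", "PATTR2"],
)
child_df = pd.DataFrame.from_records(
    [(201, 1, "AA", 2100), (202, 2, "BB", 2200), (203, 2, "CC", 2300)],
    columns=["CID", "PID", "CATTR1", "CATTR2"],
)

merged = parent_df.merge(child_df, on="PID", how="outer")

# Number the children of each parent: 0 for the first, 1 for the second, ...
merged["G"] = merged.groupby("PID").cumcount()

wide = (
    merged.set_index(["PID", "PATTR1", "PATTR2", "G"])
    .unstack("G")                 # columns become (attribute, G)
    .swaplevel(0, 1, axis=1)      # columns become (G, attribute)
    .sort_index(axis=1, level=0)  # keep each child's columns together
)

# Flatten the (G, attribute) header into names such as "CID_0" (illustrative).
wide.columns = [f"{name}_{g}" for g, name in wide.columns]
wide = wide.reset_index()

print(wide)
```

Parents with fewer children than the maximum get missing values in the unused child columns, which is why the numeric child columns come back as floats.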