Not a member of Pastebin yet?
Sign Up,
it unlocks many cool features!
- ID color begin_date end_date location
- 1 red 2017-01-01 2017-01-07 initial
- 1 green 2017-01-05 2017-01-07 nursing
- 1 blue 2017-01-07 2017-01-15 rehab
- 1 red 2017-01-11 2017-01-22 Health
- 2 red 2017-02-22 2017-02-26 initial
- 2 green 2017-02-26 2017-02-28 nursing
- 2 blue 2017-02-26 2017-02-28 rehab
- 3 red 2017-03-11 2017-03-22 initial
- 4 red 2017-04-01 2017-04-07 initial
- 4 green 2017-04-05 2017-04-07 nursing
- 4 blue 2017-04-10 2017-04-15 Health
- ID first_site
- 1 rehab
- 2 nursing
- 3 home
- 4 Health
- SELECT
- OVER( PARTITION ID CASE WHEN end_date[0] = begin_date THEN location
- WHEN location = 'Health' THEN 'Health'
- ELSE 'Home' end) AS [first_site]
- FROM table
- def conditions(x):
- #compare each group first
- val = x.loc[x['begin_date'] == x['end_date'].iloc[0], 'location']
- #if at least one match (not return empty `Series` get first value)
- if not val.empty:
- return val.iloc[0]
- #if value is empty, check if value 'Health' exists within the group
- elif (x['location'] == 'Health').any():
- return 'Health'
- else:
- return 'Home'
- final = df.groupby('ID').apply(conditions).reset_index(name='first_site')
Add Comment
Please, Sign In to add comment