Guest User

search for string of one column in another column

a guest
Feb 2nd, 2020
138
0
Never
Not a member of Pastebin yet? Sign Up, it unlocks many cool features!
Python 1.47 KB | None | 0 0
  1. import pandas as pd
  2. df = pd.DataFrame({'_id':'Y100','paper_title':'abc','reference':['sdfaqdtsdf','sdfkdsjgkgg','fjafjhafkj']},{'_id':'Y101','paper_title':'efg','reference':['cdfabctzdi','vjedbvjbdjk','efhlghjehg']},{'_id':'Y102','paper_title':'lmn','reference':['zdfabdtssf','boblfbjbsfb','qwhfefqwfob']},........)
  3. df.set_index(['_id','paper_title'], inplace = True)
  4. print(df)
  5. Out[1]:
  6. _id   paper_title   reference
  7. Y100   abc          sdfaqdtsdf
  8.        abc          sdfklmngkgg
  9.        abc          fjafefgfkj
  10. Y101   efg          cdfabdtzdi
  11.        efg          vjedbvjbdjk
  12.        efg          efhlmnjehg
  13. Y102   lmn          zdfabdtssf
  14.        lmn          boblfbjbsfb
  15.        lmn          qwhfefqwfob
  16.  
  17. Expected results:
  18. _id   paper_title    reference                                   this_paper_presented_in
  19. Y100   abc          ['sdfaqdtsdf','sdfklmngkgg','fjafefgfkj']     [Y101,Y102]
  20. Y101   efg          ['cdfabdtzdi','vjedbvjbdjk','efhlmnjehg']     [Y102]
  21. Y102   lmn          ['zdfavdtssf','boblfbjbsfb','qwhfefqwfob']    [None(if this paper_title not present in column reference)]
  22.  
  23. SideNote:
  24. Same paper_title can not be present in it's own reference row i.e (Y100 paper_title can not be present in same reference)
  25. `this_paper_presented_in` column is the one which will have list of _id's if the `paper_title` values of those _id's are present in `reference` column
  26. Here is the actual dataframe [https://imgur.com/73t6D26] if someone might want to look the original dataframe.
Add Comment
Please, Sign In to add comment