Untitled

a guest
Jul 21st, 2019
import pandas as pd
import numpy as np

# Sample data: a 1 in Selection_Values marks the last fragment of a group.
df = pd.DataFrame(columns=['Text','Selection_Values'])
df["Text"] = ["Hi", "this is", "just", "a", "single", "sentence.", "This", np.nan, "is another one.","This is", "a", "third", "sentence","."]
df["Selection_Values"] = [0,0,0,0,0,1,0,0,1,0,0,0,0,0]
print(df)

               Text  Selection_Values
0                Hi                 0
1           this is                 0
2              just                 0
3                 a                 0
4            single                 0
5         sentence.                 1
6              This                 0
7               NaN                 0
8   is another one.                 1
9           This is                 0
10                a                 0
11            third                 0
12         sentence                 0
13                .                 0

[["Hi this is just a single sentence."],["This is another one."], ["This is a third sentence ."]]

[["Hi this is"], ["just a"], ["single sentence."],["This is another one."], ["This is"], ["a third sentence ."]]

# Split Text after every row where Selection_Values == 1, then join each piece with spaces (str.cat skips NaN).
[[s.str.cat(sep=' ')] for s in np.split(df.Text, df[df.Selection_Values == 1].index+1) if not s.empty]

[["Hi this is just a single sentence."],["This is another one."], ["This is a third sentence ."]]
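
The same grouping can also be sketched without np.split, which may be worth having because passing a pandas Series to np.split appears to rely on behaviour that newer pandas releases warn about (an assumption worth checking against your installed version). This is only one possible variant, not the paste author's method: the names group_id and sentences are made up for illustration, and the group label is built from a shifted cumulative sum so that each row carrying a 1 stays in the group it closes.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Text": ["Hi", "this is", "just", "a", "single", "sentence.", "This", np.nan,
             "is another one.", "This is", "a", "third", "sentence", "."],
    "Selection_Values": [0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0],
})

# Running count of the 1s, shifted down one row, so the row that carries
# the 1 is still labelled with the group it ends (hypothetical helper name).
group_id = df["Selection_Values"].cumsum().shift(fill_value=0)

# Join the non-missing fragments of each group with single spaces;
# groups that contain only NaN are skipped.
sentences = [
    [" ".join(chunk.dropna())]
    for _, chunk in df["Text"].groupby(group_id)
    if chunk.notna().any()
]
print(sentences)
```

Here dropna() plays the role that str.cat's default NaN handling plays in the one-liner above, so row 7 contributes nothing to the second group.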