Untitled

a guest
Jan 19th, 2020
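import re

# Course code pattern: four capitals, a non-breaking space, five digits (e.g. "ABCD\xa012345").
COURSE_NUM = r'[A-Z]{4}\xa0\d{5}'

# The paste shows only a function body; the name and signature below are placeholders.
# INDEX_IGNORE is defined elsewhere and is not part of this paste.
def extract_words(text):
    # Split the first course code into its two parts. The original indexed
    # re.findall(...)[0] directly, which raises IndexError when no code is present.
    match = re.search(COURSE_NUM, text)
    nums = match.group().split('\xa0') if match else []
    # Words of two or more characters that start with a letter or digit.
    other = re.findall(r'[a-zA-Z0-9]\w+', text)
    return [x.lower() for x in other if (x not in INDEX_IGNORE) and (x not in nums)]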
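
A minimal, self-contained sketch of how the snippet above might be exercised. The function name extract_words, the sample INDEX_IGNORE set, and the sample text are assumptions for illustration only; none of them appear in the paste.

```python
import re

# Hypothetical stop-word set; the real INDEX_IGNORE is not shown in the paste.
INDEX_IGNORE = {"and", "the", "of", "to"}

# Four capitals, a non-breaking space, five digits.
COURSE_NUM = r"[A-Z]{4}\xa0\d{5}"


def extract_words(text):
    """Return lowercased words, skipping ignored words and the first course code."""
    match = re.search(COURSE_NUM, text)
    # The two halves of the course code, or nothing if the text has no code.
    nums = match.group().split("\xa0") if match else []
    words = re.findall(r"[a-zA-Z0-9]\w+", text)
    # Membership is tested before lowercasing, as in the original snippet.
    return [w.lower() for w in words if w not in INDEX_IGNORE and w not in nums]


sample = "ABCD\xa012345 Introduction to the study of parsing"
result = extract_words(sample)
print(result)  # ['introduction', 'study', 'parsing']
```

Because the ignore check runs before lowercasing, a capitalized stop word such as "The" would pass through as "the" unless INDEX_IGNORE also lists the capitalized form; whether that is intended depends on how INDEX_IGNORE is defined.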