Script

a guest
Jun 26th, 2019
Python
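import json
import re

path = 'F:\\wikititle\\wiki3.txt'
# Read the whole input file; the with-block closes it automatically.
with open(path, encoding="utf-8") as infile:
    var = infile.read()

# Match 4 letters, an underscore, then 6 or 7 letters ({6,7}, not {6|7}),
# not adjacent to other alphanumerics. Lookarounds keep the delimiters out
# of the match, so tokens at the very start or end of the file, or
# separated by a single character, are still found.
list2 = re.findall(r"(?<![a-zA-Z0-9])([a-zA-Z]{4}_[a-zA-Z]{6,7})(?![a-zA-Z0-9])", var)

# re.findall already returns a list, so no split() is needed.
# De-duplicate while preserving first-seen order.
seen = set()
uniq = [x for x in list2 if x not in seen and not seen.add(x)]

# Mode 'x' creates the file and raises FileExistsError if it already exists.
with open('F:\\wikititle\\kidsb.txt', 'x', encoding="utf-8") as f:
    f.write(json.dumps(uniq))
print("done")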
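
The matching and de-duplication steps can be checked without touching any files. The following is a minimal sketch, not the author's code: the helper name `unique_tokens` and the sample string are made up for illustration, and the pattern assumes the intent was "4 letters, underscore, 6 or 7 letters".

```python
import re

# Assumed intent of the script's pattern: 4 letters, an underscore, then
# 6 or 7 letters, not touching any other letter or digit on either side.
PATTERN = re.compile(r"(?<![a-zA-Z0-9])([a-zA-Z]{4}_[a-zA-Z]{6,7})(?![a-zA-Z0-9])")


def unique_tokens(text):
    """Return matching tokens in first-seen order, without duplicates.

    Hypothetical helper, not part of the original paste.
    """
    # dict.fromkeys preserves insertion order (Python 3.7+) and drops repeats.
    return list(dict.fromkeys(PATTERN.findall(text)))


# Made-up sample text, not taken from wiki3.txt.
sample = "abcd_efghij, wxyz_abcdefg; abcd_efghij toolong_abcdefgh ab_cdefgh"
print(unique_tokens(sample))  # ['abcd_efghij', 'wxyz_abcdefg']
```

`dict.fromkeys` is used here in place of the `seen` set idiom; both keep the first occurrence of each token and preserve order.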