- import re
- import codecs
- def cs(text, output_file):
-     for line in text:
-         #...some other replacements with regex and normal characters.
-         # Backslashes restored: \u escapes in the pattern, \1 backreference in the replacement.
-         line = re.sub(ur'(\u0f62\u0f0b|\u0f60\u0f72\u0f0b)/ES ', ur' \1', line)
-         print line
-         output_file.write(line)
- text = codecs.open('test.txt', encoding='utf-8')
- # Open the output file once, before the loop, so every processed line is written to it.
- output_file = codecs.open('outputtest.txt', 'w', encoding='utf-8')
- cs(text, output_file)
- output_file.close()
- text.close()
- འདུལ་//X བ་/E ག་/S བཞུགས་/S སོ/S །/S <utt>
- འདུལ་/X བ་/Y གཞི/E །/S <utt>
- བམ་/X པོ་/E ལྔ་/S བཅུ་/S ལྔ་/X པ/E །/S <utt>
- ཐུན་/X མོང་/E མ་/S ཡིན་/X པ་/E གང་/S ཞེ་/S ན/S །/S <utt>
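To check the substitution on its own, without any file I/O, a minimal sketch along these lines may help. It is written for Python 3 (plain string literals instead of `ur'...'`), it assumes the pattern in the script was meant to read `(\u0f62\u0f0b|\u0f60\u0f72\u0f0b)/ES `, and the function name `retag` and the first sample string are made up for illustration; they are not part of the paste.

```python
import re

# Assumed pattern: a captured syllable (U+0F62 U+0F0B or U+0F60 U+0F72 U+0F0B)
# followed by the tag "/ES " (including the trailing space).
PATTERN = re.compile('(\u0f62\u0f0b|\u0f60\u0f72\u0f0b)/ES ')


def retag(line):
    """Drop the "/ES " tag and put a space in front of the captured syllable."""
    return PATTERN.sub(r' \1', line)


# Hypothetical input that contains the "/ES " tag.
tagged = '\u0f44\u0f62\u0f0b/ES \u0f0d/S'
# A line with no "/ES " tag, which should pass through unchanged.
untagged = 'X/E Y/S <utt>'

print(retag(tagged))
print(retag(untagged))
```

If the first line comes back with the tag removed and the second comes back unchanged, the regex itself is behaving as intended and any remaining problem is in how the file is read or written.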