Untitled

a guest
Nov 12th, 2018
import re
import unicodedata

try:
    unicode  # Python 2 built-in
except NameError:
    unicode = str  # Python 3: str is already a unicode string

# Combining Diacritical Marks, their Supplement, Marks for Symbols, and Half Marks
regex_combining = re.compile(u'[\u0300-\u036f\u1dc0-\u1dff\u20d0-\u20ff\ufe20-\ufe2f]', re.U)

def remove_diacritics(s):
    """ Decomposes string, then removes combining characters.
    Hand this a unicode string, not an encoded one.
    """
    # TODO: Figure out whether the NFC is unnecessary
    return unicodedata.normalize('NFC',
        regex_combining.sub('', unicodedata.normalize('NFD', unicode(s)))
    )
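
A minimal Python 3 sketch of how the function above behaves, and one way to probe the TODO about the final NFC pass. The names `COMBINING` and `strip_marks` and the `recompose` flag are hypothetical, introduced only for this illustration; the character ranges are copied from the paste. NFD also splits precomposed Hangul syllables into jamo, which fall outside the four stripped ranges, so without the closing NFC the text would stay decomposed.

```python
import re
import unicodedata

# Same four combining-mark ranges as the paste above.
COMBINING = re.compile('[\u0300-\u036f\u1dc0-\u1dff\u20d0-\u20ff\ufe20-\ufe2f]')


def strip_marks(s, recompose=True):
    """Hypothetical Python 3 variant; `recompose` toggles the final NFC pass."""
    stripped = COMBINING.sub('', unicodedata.normalize('NFD', s))
    return unicodedata.normalize('NFC', stripped) if recompose else stripped


cafe = strip_marks('caf\u00e9')   # U+00E9 decomposes to 'e' + U+0301
hangul = '\ud55c'                 # one precomposed Hangul syllable
without_nfc = strip_marks(hangul, recompose=False)
with_nfc = strip_marks(hangul)

print(cafe)                              # cafe
print(len(without_nfc), len(with_nfc))   # 3 1
```

Under these assumptions the NFC step does real work for scripts whose decomposed parts are not in the stripped ranges. Letters with no canonical decomposition, such as U+00F8 (o with stroke), pass through unchanged either way.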