Untitled

a guest
Jul 16th, 2018
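# Python 2: a plain (byte) string literal. Calling .encode() on it first
# decodes the bytes as ASCII, which fails on the non-ASCII characters.
A = "Diga sí por cualquier número de otro cuidador.".encode("utf-8")

# Source encoding declaration (must be on the first or second line of the file).
# -*- coding: utf-8 -*-

# A unicode literal (u"...") can be encoded directly.
A = u"Diga sí por cualquier número de otro cuidador.".encode("utf-8")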

Preliminaries:
>>> import unicodedata
>>> unicodedata.name(u'\xed')
'LATIN SMALL LETTER I WITH ACUTE'
>>> uc = u'Diga s\xed por'

What happens if the file is encoded in UTF-8:
>>> infile = uc.encode('utf8')
>>> infile
'Diga s\xc3\xad por'
>>> infile.encode('utf8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 6: ordinal not in range(128)
#### NOT the message reported in the question ####

What happens if the file is encoded in cp1252 or latin1 or similar:
>>> infile = uc.encode('cp1252')
>>> infile
'Diga s\xed por'
>>> infile.encode('utf8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xed in position 6: ordinal not in range(128)
#### As reported in the question ####
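
The two sessions above fail at the same position but on different bytes, because Python 2's str.encode() silently decodes the byte string as ASCII before encoding it. The sketch below spells that hidden step out so the difference can be reproduced outside a Python 2 prompt. The helper names py2_style_encode and failing_byte are made up for this illustration and are not part of Python or of the original sessions.

```python
# -*- coding: utf-8 -*-
# Sketch only: py2_style_encode and failing_byte are hypothetical helpers.

uc = u'Diga s\xed por'


def py2_style_encode(raw_bytes, target='utf-8'):
    """Spell out what Python 2 does for str.encode(): ASCII-decode, then encode."""
    return raw_bytes.decode('ascii').encode(target)


def failing_byte(raw_bytes):
    """Return (byte value, position) of the first byte the ASCII decode rejects."""
    try:
        py2_style_encode(raw_bytes)
    except UnicodeDecodeError as exc:
        # bytearray() gives integer byte values on both Python 2 and Python 3.
        return bytearray(exc.object)[exc.start], exc.start
    return None


print(failing_byte(uc.encode('utf-8')))   # file saved as UTF-8
print(failing_byte(uc.encode('cp1252')))  # file saved as cp1252 / latin1
```

The first call should report byte 195 (0xc3) and the second byte 237 (0xed), both at position 6, which matches the two tracebacks. Checking which byte the error names is therefore a quick way to guess how the source file was actually saved.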

# Encoding: UTF-8

>> type(u"zażółć gęślą jaźń")
-> <type 'unicode'>

>> type("zażółć gęślą jaźń")
-> <type 'str'>

u"Diga sí por cualquier número de otro cuidador.".encode("utf-8")

# -*- coding: utf-8 -*-
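
Since only a unicode object can be encoded directly, a byte string that is already in hand has to be decoded with its real encoding before it is re-encoded. A minimal sketch of that round trip follows, assuming the bytes came from a cp1252 file. The helper to_utf8 is a hypothetical name introduced here, and the caller is assumed to know the source encoding.

```python
# -*- coding: utf-8 -*-
# Sketch only: to_utf8 is a hypothetical helper, and the source encoding
# must be known (or assumed) by the caller.


def to_utf8(raw_bytes, source_encoding):
    """Decode bytes with their real encoding, then re-encode as UTF-8."""
    return raw_bytes.decode(source_encoding).encode('utf-8')


cp1252_bytes = b'Diga s\xed por'   # byte string as read from a cp1252 file
utf8_bytes = to_utf8(cp1252_bytes, 'cp1252')
print(repr(utf8_bytes))
```

Passing the wrong source_encoding either raises UnicodeDecodeError or quietly produces the wrong characters, so this only helps once the file's encoding has been established.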