Encoding problems in Python 2


Author: yifei / Created: May 30, 2017, 10:59 a.m. / Modified: May 30, 2017, 11:01 a.m.

To represent text, we use a sequence of characters, and we use numbers (code points) to represent characters. Unicode code points range from 0 to 0x10FFFF, while ASCII covers only 0-127.
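As a small illustration of "numbers represent characters", the built-in ord() returns the code point of a single character. The variable names here are only illustrative, and the snippet is written so that it behaves the same on Python 2 and Python 3:

```python
# ord() maps a single character to its code point.
ascii_char = u'a'        # inside the ASCII range (0-127)
cjk_char = u'\u4e2d'     # a CJK character, far outside the ASCII range

print(ord(ascii_char))       # 97
print(hex(ord(cjk_char)))    # 0x4e2d
```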

In Python 2, str is text represented as bytes, while unicode is text represented as characters. You decode bytes into unicode, and you encode unicode into bytes, in both cases with a specific encoding. That is:

>>> 'abc'.decode('utf-8')  # str to unicode
u'abc'
>>> u'abc'.encode('utf-8') # unicode to str
'abc' 
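With pure ASCII text the two forms look identical, so here is a minimal sketch with non-ASCII text, where the difference between characters and bytes becomes visible. The names text and data are illustrative, and the snippet avoids Python 2-only syntax so it also runs on Python 3:

```python
text = u'\u4e2d\u6587'        # two characters (a unicode object in Python 2)
data = text.encode('utf-8')   # the same text as UTF-8 bytes (a str in Python 2)

print(len(text))   # 2, counted in characters
print(len(data))   # 6, counted in bytes: each of these characters takes 3 bytes in UTF-8

# Decoding with the same encoding restores the original text.
roundtrip = data.decode('utf-8')
print(roundtrip == text)   # True
```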

The problem with Python 2 is:

The default encoding used for implicit conversions between str and unicode is ASCII! Even worse, Python raises an error as soon as it encounters a non-ASCII byte!
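The failure can be reproduced by making the ASCII step explicit. Python 2 does the equivalent of decode('ascii') behind the scenes when a str and a unicode object are mixed. The helper try_decode below is a hypothetical name used only for this sketch, not part of any library:

```python
raw = b'\xe4\xb8\xad'   # UTF-8 bytes of the character U+4E2D

def try_decode(data, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        # This is the error Python 2 raises when its ASCII default meets a non-ASCII byte.
        return None

print(try_decode(raw, 'ascii') is None)          # True: 0xe4 is not in the range 0-127
print(try_decode(raw, 'utf-8') == u'\u4e2d')     # True: naming the right encoding works
```

The usual way around the problem is what the second call does: decode and encode explicitly with a named encoding instead of relying on the default.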


If you have any questions, feel free to email kongyifei (at) gmail.com to discuss.