Encoding in Python 2

Author: yifei / Created: May 30, 2017, 10:59 a.m. / Modified: May 30, 2017, 11:01 a.m.

To represent text, we use an array of characters, and we use numbers (code points) to represent the characters. Unicode uses the range 0 to 0x10FFFF, while ASCII uses only 0-127.
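A minimal sketch of the idea that characters are just numbers. The variable names (text, code_points, fits_in_ascii) are made up for illustration, and the code is written so that it runs on both Python 2 and Python 3:

```python
# A unicode string holding one ASCII character and one non-ASCII character.
text = u'a\u4f60'

# ord() gives the number (code point) behind each character.
code_points = [ord(ch) for ch in text]

# ASCII only covers 0-127, so anything above that cannot be ASCII.
fits_in_ascii = [cp < 128 for cp in code_points]

print(code_points)    # [97, 20320]
print(fits_in_ascii)  # [True, False]
```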

str is text represented as bytes; unicode is text represented as characters. You decode bytes (str) into unicode, and you encode unicode into bytes (str), each time with some encoding. That is:

>>> 'abc'.decode('utf-8')  # str to unicode
>>> u'abc'.encode('utf-8') # unicode to str
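With pure ASCII text like 'abc' the difference between bytes and characters is hard to see, so here is a small round-trip sketch with non-ASCII text. The names text, data and round_trip are only for illustration, and the b'' / u'' prefixes are used so the same code runs on Python 2 and Python 3:

```python
# Two characters, written with escapes so the source file stays ASCII.
text = u'\u4f60\u597d'

# Encoding turns characters into bytes; UTF-8 needs 3 bytes for each of these.
data = text.encode('utf-8')

# Decoding with the same encoding gives the original characters back.
round_trip = data.decode('utf-8')

print(len(text))           # 2 characters
print(len(data))           # 6 bytes
print(round_trip == text)  # True
```

The byte length and the character length only match when every character fits in one byte, which is why ASCII-only examples hide the problem.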

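Python 2's problem is:

The default encoding is ASCII! Even worse, whenever Python has to convert between str and unicode implicitly, it raises an error (UnicodeDecodeError or UnicodeEncodeError) as soon as it hits a non-ASCII byte or character!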
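A sketch of what goes wrong, under one assumption: in Python 2 the conversion happens implicitly (for example when a str and a unicode are concatenated), while here the 'ascii' codec is called explicitly so the same code also runs on Python 3. The names raw, decode_error, encode_error and fixed are illustrative only:

```python
# UTF-8 bytes for one non-ASCII character.
raw = b'\xe4\xbd\xa0'

# What Python 2 effectively does behind the scenes: decode with 'ascii'.
try:
    raw.decode('ascii')
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = type(exc).__name__

# The same failure in the other direction: unicode to bytes with 'ascii'.
try:
    u'\u4f60'.encode('ascii')
    encode_error = None
except UnicodeEncodeError as exc:
    encode_error = type(exc).__name__

# Naming the real encoding explicitly avoids the error.
fixed = raw.decode('utf-8')

print(decode_error)  # UnicodeDecodeError
print(encode_error)  # UnicodeEncodeError
```

The usual way around this is to decode bytes explicitly as soon as they come in and encode explicitly just before they go out, so Python never has to guess.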