To represent text, we use an array of characters, and we use numbers (code points) to represent the characters. Unicode code points range from 0 to 0x10FFFF, while ASCII covers only 0-127.
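As a small illustration of characters being numbers, the sketch below looks up two code points with the built-in ord(). The variable names are made up for this example, and the snippet is written so that it should run on both Python 2.7 and Python 3.3+.

```python
# Each character maps to a number (its code point).
ascii_code = ord(u'A')        # 65, inside the ASCII range 0-127
euro_code = ord(u'\u20ac')    # 0x20AC (the euro sign), outside ASCII

# Highest code point that Unicode defines.
max_code_point = 0x10FFFF

# Only the first of the two characters fits in ASCII.
is_ascii = [code <= 127 for code in (ascii_code, euro_code)]
```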
In Python 2, str represents text as bytes, while unicode represents text as characters. You decode bytes (str) into unicode, and you encode unicode into bytes (str), using some encoding. That is:
>>> 'abc'.decode('utf-8')  # str to unicode
u'abc'
>>> u'abc'.encode('utf-8')  # unicode to str
'abc'
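With plain ASCII text like 'abc' the two types look almost identical, so here is a rough sketch using a non-ASCII character, where the byte form and the character form clearly differ. The sample word and variable names are just illustrative choices; the snippet should behave the same on Python 2.7 and Python 3.3+.

```python
text = u'caf\xe9'                   # 4 characters; the last one is U+00E9
data = text.encode('utf-8')         # characters -> bytes
roundtrip = data.decode('utf-8')    # bytes -> characters

char_count = len(text)              # counts characters
byte_count = len(data)              # counts bytes; U+00E9 takes 2 bytes in UTF-8
```

Decoding with the same encoding that was used for encoding gives back the original text.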
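Python2's problem is:
the default encoding is ASCII! Whenever a str and a unicode are mixed, Python 2 implicitly converts between them using ASCII. Even worse, it raises an error (UnicodeDecodeError, or UnicodeEncodeError in the other direction) as soon as it finds a non-ASCII byte or character!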
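The failure can be reproduced by asking for the ASCII codec explicitly, which is roughly what Python 2 does behind the scenes during an implicit conversion. This is a minimal sketch with made-up variable names, meant to run on both Python 2.7 and Python 3.3+.

```python
data = u'caf\xe9'.encode('utf-8')   # contains the non-ASCII bytes 0xC3 0xA9

try:
    # Explicit stand-in for Python 2's implicit str -> unicode conversion.
    data.decode('ascii')
    error_name = None
except UnicodeDecodeError as exc:
    error_name = type(exc).__name__

# Naming the real encoding explicitly avoids the error.
decoded = data.decode('utf-8')
```

The usual way around this is to decode bytes explicitly at the point where they enter the program, and encode explicitly where text leaves it, instead of relying on the default.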