Revision 4 as of 2007-08-11 00:16:39


Python 2.x

Python 2.x has two types to store a string: str (a byte string) and unicode (a character string).

Both classes have the same methods and are very similar.

Python 3000

Python 3000 uses two very different types: bytes (a sequence of integers in the range 0 to 255) and str (a sequence of Unicode characters).

old str and new bytes

Differences between Python 2.x "str" and Python 3000 "bytes":

choose between bytes and str

When you migrate from Python 2.x to Python 3000, you have to ask yourself: do I manipulate characters or bytes (integers)? "A" is a character and 65 is an integer. Examples:
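The character/integer distinction can be sketched as follows. This is a minimal illustration that assumes an ASCII encoding; the variable names are made up for the example.

```python
# Text is str: a sequence of Unicode characters.
text = "A"

# Binary data is bytes: a sequence of integers in range(256).
# ASCII is an assumption here; real code must pick the right encoding.
data = text.encode("ascii")

code_point = ord(text)             # character -> integer
first_byte = data[0]               # indexing bytes yields an integer
round_trip = data.decode("ascii")  # bytes -> str

print(code_point, first_byte, round_trip)
```

If the value is meant to be read by humans, keep it as str; if it goes to a file, a socket, or a binary format, convert it to bytes at that boundary.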

bytes and loops (for)

The following code displays 97, 98, 99, because iterating over bytes yields integers, not characters!

for item in b'abc':
    print(item)
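If one-byte bytes objects or characters are wanted instead of integers, a few possible approaches are sketched below. The variable names are illustrative only, and the ASCII decoding is an assumption about the data.

```python
data = b'abc'

# Plain iteration yields integers.
ints = [item for item in data]

# Two ways to get one-byte bytes objects instead:
# take a slice of length 1, or wrap each integer in a list.
by_slice = [data[i:i + 1] for i in range(len(data))]
by_wrap = [bytes([item]) for item in data]

# For text, decode first and then iterate over characters.
chars = list(data.decode('ascii'))
```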

compare bytes

>>> b'xyz' == b'xyz'    # case 1
True
>>> b'xyz' == 'xyz'     # case 2
False
>>> b'xyz'[0] == b'x'   # case 3
False
>>> b'xyz'[0]
120

Case 2 shows that bytes and str (unicode) are never equal because they are different types. Case 3 shows an important point: indexing a bytes object returns an integer (120), not a bytes object of length 1. This behaviour differs from Python 2.x:

# In Python 2.x
>>> "xyz"[0]
'x'
>>> type("xyz"), type("xyz"[0])
(<type 'str'>, <type 'str'>)
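Code ported from Python 2.x that compares a single item against a one-character string needs adjusting. A minimal sketch of three possible rewrites, with illustrative variable names:

```python
data = b'xyz'

# Indexing gives an integer, so compare against an integer...
index_matches_int = data[0] == ord('x')

# ...or take a length-1 slice, which stays a bytes object.
slice_matches_bytes = data[0:1] == b'x'

# startswith() also avoids the integer/bytes mismatch.
prefix_matches = data.startswith(b'x')
```

The slice form is the closest to the Python 2.x idiom, since it keeps both sides of the comparison the same type.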

open issues

hash(bytes)

bytes is mutable and therefore not hashable. Hacks/Workarounds:

Other solutions:

A hash is needed when a bytes object is used as a dictionary key.
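One possible workaround is to take an immutable snapshot of the data before using it as a key. The sketch below assumes a released Python 3 interpreter, where the mutable byte type is spelled bytearray and bytes itself is immutable and hashable; that naming differs from the design described on this page, and the variable names are illustrative.

```python
mutable = bytearray(b'key')   # mutable, so not hashable

try:
    hash(mutable)
    hashable = True
except TypeError:
    hashable = False

# Take an immutable snapshot to use as a dictionary key.
frozen = bytes(mutable)
table = {frozen: 'value'}

# Later changes to the mutable buffer do not affect the key.
mutable[0] = ord('K')
lookup = table[b'key']
```

The cost is a copy of the data for each key, which is usually acceptable for short keys.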
