Revision 6 as of 2007-08-23 18:41:11

Python 2.x

Python 2.x has two types that can be used to store a string: str (byte strings) and unicode (character strings).

Both classes have the same methods and are very similar.

Python 3000

Python 3000 uses two very different types: bytes (a sequence of integers) and str (Unicode character strings).
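
As a rough illustration, assuming a released Python 3 interpreter, the text type and the byte type can be told apart and converted into each other as sketched below. The sample string and the choice of UTF-8 are assumptions made for the example only.

```python
# Text (str) holds characters; bytes holds integers in range(256).
text = "caf\u00e9"              # 4 characters
data = text.encode("utf-8")     # 5 bytes: the last character needs two bytes

print(type(text), len(text))
print(type(data), len(data))

# Decoding with the same encoding gives back the original text.
assert data.decode("utf-8") == text
```

The two lengths differ because one character may be stored as several bytes, which is why the two types are not interchangeable.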

Differences Between Python 2.x's "str" and Python 3000's "bytes"

Differences between Python 2.x's str and Python 3000's bytes include:

Choosing Between "bytes" and "str"

When you migrate from Python 2.x to Python 3000, you have to ask yourself: do I manipulate characters or bytes (integers)? "A" is a character and 65 is an integer. Examples:
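
The character/integer distinction can be sketched as follows, assuming Python 3 semantics. The variable names are illustrative, and ASCII is an assumed encoding for the example.

```python
# A character and its integer code are different objects.
char = "A"
code = ord(char)            # character -> integer
assert code == 65
assert chr(code) == "A"     # integer -> character

# Encoding the character gives a bytes object whose single item is that integer.
encoded = char.encode("ascii")
assert list(encoded) == [65]
```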

bytes and loops (for)

The following code displays 97, 98 and 99, because iterating over a bytes object yields integers, not characters:

for item in b'abc':
    print(item)

compare bytes

>>> b'xyz' == b'xyz'    # case 1
True
>>> b'xyz' == 'xyz'     # case 2
False
>>> b'xyz'[0] == b'x'   # case 3
False
>>> b'xyz'[0]
120

Case 2 shows that bytes and unicode objects never compare equal, because they are different types. Case 3 shows an important point: indexing a bytes object returns an integer (120), not a bytes object of length 1. This behaviour differs from Python 2.x:

# In Python 2.x
>>> "xyz"[0]
'x'
>>> type("xyz"), type("xyz"[0])
(<type 'str'>, <type 'str'>)
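
If a bytes object of length 1 is wanted rather than an integer, slicing is one option. The sketch below assumes Python 3 semantics as in case 3; the variable names are illustrative.

```python
data = b"xyz"

first_int = data[0]       # indexing gives an integer
first_bytes = data[0:1]   # slicing gives a bytes object of length 1

assert first_int == 120
assert first_bytes == b"x"
assert first_int == ord("x")   # compare against the integer code of the character
```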

open issues

hash(bytes)

bytes is mutable, so it is not hashable. Hacks/Workarounds:

Other solutions:

The hash is needed when a bytes object is used as a dictionary key.
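
One possible workaround, sketched here rather than taken from this page, is to take an immutable snapshot of the mutable object before using it as a key. The sketch assumes a released Python 3 interpreter, where the mutable byte type is named bytearray and bytes itself is immutable and hashable. The dictionary cache and the helper as_key are hypothetical names.

```python
def as_key(data):
    """Return an immutable, hashable snapshot of a byte sequence."""
    return bytes(data)

buffer = bytearray(b"abc")     # mutable, so hash(buffer) raises TypeError

cache = {}
cache[as_key(buffer)] = "value"

# Later mutation of the buffer does not affect the stored key.
buffer[0] = ord("x")
assert cache[b"abc"] == "value"
assert as_key(buffer) not in cache
```

The snapshot costs a copy, but it keeps the dictionary consistent even if the original buffer changes afterwards.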
