Revision 12 as of 2008-11-15 14:00:02

Strings in Python 2.x

Python 2.x has two types that can be used to store a string: str, a byte string, and unicode, a character string.

Both classes have the same methods and are very similar.

Strings in Python 3000

Python 3000 uses two very different types: bytes, a sequence of integers in the range 0 to 255, and str, a character (Unicode) string.

Differences Between Python 2.x's "str" and Python 3000's "bytes"

Differences between Python 2.x's str and Python 3000's bytes include:

Choosing Between "bytes" and "str" in Python 3000

When you migrate from Python 2.x to Python 3000, you have to ask yourself: do I manipulate characters or bytes (integers)? "A" is a character and 65 is an integer. Examples:
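
As a rough illustration of the question above, the sketch below handles the same data first as characters and then as bytes. The variable names and the choice of UTF-8 are assumptions made for this example only.

```python
# Text: a sequence of characters, stored in a str.
text = "A"

# Binary data: a sequence of integers in range(256), stored in bytes.
# UTF-8 is an assumption here; use whatever encoding your data really has.
data = text.encode("utf-8")

first_char = text[0]   # a one-character str
first_byte = data[0]   # an int

# Going back from bytes to characters means naming the encoding again.
round_trip = data.decode("utf-8")

print(repr(first_char), first_byte, repr(round_trip))
```

If the code slices, searches, or formats human-readable text, str is usually the better fit; if it reads or writes sockets, files opened in binary mode, or packed structures, bytes is.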

Iterating over "bytes"

It's important to note that the bytes iterator generates integers and not characters:

>>> for item in b'abc':
...     print(item)
...
97
98
99
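
If one-byte bytes objects are wanted instead of integers, each integer can be wrapped back into a bytes object, or the data can be sliced. This is a minimal sketch with illustrative variable names:

```python
data = b'abc'

# Iterating over bytes yields integers.
as_ints = list(data)

# bytes([n]) builds a one-byte bytes object from a single integer.
as_single_bytes = [bytes([n]) for n in data]

# Slicing with a length of one also yields bytes objects.
as_slices = [data[i:i + 1] for i in range(len(data))]

print(as_ints)
print(as_single_bytes)
```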

Comparing "bytes"

Comparing one bytes object to another works as expected:

>>> b'xyz' == b'xyz'
True
>>> b'xyz' == b'abc'
False

However, the bytes type is completely distinct from the str type in Python 3000. A bytes object never compares equal to a str object, even when the contents look the same:

>>> b'xyz' == 'xyz'
False

Running the interpreter with the -b option issues a BytesWarning for such comparisons (-bb turns the warning into an error), which makes incomplete transitions evident. Without it, the mismatch is silent, so you really can't mix the two types very well:

>>> L = ["1", b"1"]
>>> "1" in L
True
>>> L = [b"1"]
>>> "1" in L
False
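
When both types can turn up in the same container, one way to compare them meaningfully is to convert everything to a common type first. The helper below, as_text, is hypothetical (it is not part of Python), and it assumes the bytes hold UTF-8 encoded text.

```python
def as_text(value, encoding="utf-8"):
    """Return value as str, decoding it first if it is a bytes object.

    Hypothetical helper for illustration; the default encoding is an assumption.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


items = ["1", b"1", b"2"]

# Normalize once, then compare str with str only.
normalized = [as_text(item) for item in items]

print("1" in normalized)
print("2" in normalized)
print(as_text(b"xyz") == "xyz")
```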

As mentioned earlier, indexing a bytes object returns an integer, not a bytes object:

>>> b'xyz'[0] == b'x'
False
>>> b'xyz'[0]
120
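
To test the first byte against a bytes literal, a slice, an integer comparison, or startswith() can be used instead of plain indexing. A short sketch with illustrative names:

```python
data = b'xyz'

first_as_int = data[0]       # indexing gives an int
first_as_bytes = data[0:1]   # slicing gives a bytes object of length one

# Three equivalent ways to ask "does the data start with x?"
starts_with_x_slice = first_as_bytes == b'x'
starts_with_x_int = first_as_int == ord('x')
starts_with_x_method = data.startswith(b'x')

print(first_as_int, first_as_bytes)
```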

This behaviour differs from Python 2.x:

# In Python 2.x
>>> "xyz"[0]
'x'
>>> type("xyz"), type("xyz"[0])
(<type 'str'>, <type 'str'>)

Hashing "bytes"

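bytes is immutable and hashable, like str, so bytes objects can be used as keys in dictionaries. Its mutable counterpart, bytearray, is not hashable. Among other things, this means that bytearray objects can't be used as keys in dictionaries.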

Hacks and workarounds for this include:

Other solutions include:
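
For reference, here is a sketch of how this looks in released Python 3 versions, where the immutable bytes type is hashable and the mutable type is named bytearray. Converting a bytearray to bytes, or to a tuple of integers, gives a hashable key. The variable names are illustrative.

```python
buffer = bytearray(b'key')

# A mutable bytearray cannot be hashed.
try:
    hash(buffer)
    bytearray_hashable = True
except TypeError:
    bytearray_hashable = False

table = {}
# An immutable copy of the contents works as a dictionary key.
table[bytes(buffer)] = "stored under an immutable copy"
# So does a tuple of the integer values.
table[tuple(buffer)] = "stored under a tuple of integers"

print(bytearray_hashable)
print(table[b'key'])
```

Later changes to the bytearray do not affect keys created this way, because bytes() and tuple() copy the contents.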
