Revision 3 as of 2013-08-07 21:33:33

Writing code that runs under Python 2 and 3

The intent of this page is to provide specific guidelines, in a quick reference format, for writing code that is compatible with both Python 2 and Python 3. The idea is that you can check this single page once you're familiar with the basic concepts and approaches but need a refresher on a specific coding technique.

Most entries here link to a more in-depth explanation of the recipe, in case you need more than a simple refresher on the subject. At the bottom of this page, you will find resources that go deeper into supporting Python 3 from the pure-Python, C extension module, packaging, and other perspectives.

Before you start

Here are recommendations for you to follow before you start porting.

I cannot overemphasize the last point. Without a clear separation in your mind and data model between bytes and strings, your port will likely be much more painful than it needs to be. This is the biggest distinction between Python 2 and Python 3. Python 2 let you be sloppy: its 8-bit strings served as both binary data and ASCII text, with automatic (but error-prone) conversions between 8-bit strings and unicodes. In Python 3 there are only bytes and strings (i.e. unicodes), with no automatic conversion between the two. This is A Good Thing.
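A minimal sketch of that separation, assuming UTF-8 data: decode bytes into text as soon as they enter the program, work with text internally, and encode again only on the way out. The helper names to_text and to_bytes are hypothetical, chosen for illustration.

```python
def to_text(raw, encoding='utf-8'):
    """Decode incoming bytes into text (unicode on Python 2, str on Python 3)."""
    return raw.decode(encoding)


def to_bytes(text, encoding='utf-8'):
    """Encode text back into bytes just before it leaves the program."""
    return text.encode(encoding)


wire_data = b'caf\xc3\xa9'               # bytes, e.g. as read from a socket or file
text = to_text(wire_data)                # text from here on: 4 characters
round_tripped = to_bytes(text.upper())   # bytes again, only at the output boundary
```

The same code runs on both versions because every conversion is explicit; nothing relies on Python 2's implicit ASCII coercion.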

Basic compatibility

Put the following at the top of all your Python files:

from __future__ import absolute_import, division, print_function, unicode_literals

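This turns on some important compatibility flags in Python 2: absolute imports, true division for the / operator, print() as a function, and unicode string literals by default.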

In your code, make these changes:

Commentary: Some folks don't like to import unicode_literals because it has the potential to change your API (e.g. possibly returning unicodes where before it returned 8-bit strings in Python 2). One developer recommends making this change in your tests only, or only doing this if you have exceptionally good test coverage. YMMV.

built-ins

import sys
import mock  # the PyPI package; on Python 3, `from unittest import mock` also works

def mock_it(builtin_name):
    name = ('builtins.%s' if sys.version_info >= (3,) else '__builtin__.%s') % builtin_name
    return mock.patch(name)

[more info]
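As a usage sketch for the helper above: the example below patches open() so that a function under test never touches the filesystem. The read_first_line function and the file path are hypothetical, and the import fallback assumes either Python 3's unittest.mock or the third-party mock package is available.

```python
import sys

try:
    from unittest import mock   # Python 3.3+
except ImportError:
    import mock                 # third-party backport on Python 2


def mock_it(builtin_name):
    name = ('builtins.%s' if sys.version_info >= (3,) else '__builtin__.%s') % builtin_name
    return mock.patch(name)


def read_first_line(path):
    # Hypothetical code under test.
    with open(path) as handle:
        return handle.readline()


with mock_it('open') as fake_open:
    # Configure the object returned by `with open(...) as handle`.
    fake_open.return_value.__enter__.return_value.readline.return_value = 'hello\n'
    first_line = read_first_line('/no/such/file')
```

Because the builtins module is named differently on the two versions, the helper is the only place that needs a version check.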

codecs

from codecs import getencoder
encoder = getencoder('rot-13')
rot13string = encoder(mystring)[0]

[more info]
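A runnable round trip may make the recipe clearer. Python 3 does not accept rot-13 through str.encode() because it is not a text encoding, so going through the codecs module is the approach that works on both versions. The sample string is arbitrary.

```python
from codecs import getdecoder, getencoder

encoder = getencoder('rot-13')
decoder = getdecoder('rot-13')

# Each call returns a (result, length_consumed) tuple, hence the [0].
secret = encoder('Hello, world')[0]
plain = decoder(secret)[0]
```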

sys.stdout = io.TextIOWrapper(sys.stdout.detach(), encoding='UTF-8', line_buffering=True)
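To see what the wrapper does without replacing the real sys.stdout, the sketch below wraps an in-memory binary stream instead. The io.BytesIO object is a stand-in chosen for illustration; the encoding and line_buffering arguments are the same as in the line above.

```python
import io

raw = io.BytesIO()   # stand-in for the binary stream behind sys.stdout
stream = io.TextIOWrapper(raw, encoding='UTF-8', line_buffering=True)

stream.write(u'caf\u00e9\n')   # text goes in; the newline triggers a flush
written = raw.getvalue()       # UTF-8 encoded bytes come out
```

With line_buffering=True the bytes reach the underlying stream as soon as a newline is written, which is usually what you want for terminal output.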

dictionaries

doctests

from __future__ import absolute_import, print_function, unicode_literals
def setUp(testobj):
    testobj.globs['absolute_import'] = absolute_import
    testobj.globs['print_function'] = print_function
    testobj.globs['unicode_literals'] = unicode_literals

[more info]
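The setUp hook works because the doctest runner inspects each test's globals for __future__ feature objects and compiles the examples with the matching flags. The sketch below calls the hook by hand on a hypothetical one-line doctest; in a real project you would more likely pass setUp=setUp to doctest.DocFileSuite() or doctest.DocTestSuite(). It fetches the feature objects through `import __future__` only so that the snippet does not have to sit at the top of a file.

```python
import __future__
import doctest


def setUp(testobj):
    # Same idea as the recipe: expose the feature objects in the doctest globals.
    for feature in ('absolute_import', 'print_function', 'unicode_literals'):
        testobj.globs[feature] = getattr(__future__, feature)


# Hypothetical doctest text that relies on print() being a function.
source = ">>> print('a', 'b')\na b\n"
test = doctest.DocTestParser().get_doctest(source, {}, 'example', None, 0)
setUp(test)
results = doctest.DocTestRunner(verbose=False).run(test)
```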

def print_bytes(obj):
    if bytes is not str:
        obj = repr(obj)[2:-1]
    print(obj)
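The helper exists because the repr of a bytes object carries a b'' prefix on Python 3 but not on Python 2, so a doctest that shows bytes directly cannot match on both. A short usage sketch, with an arbitrary sample value:

```python
def print_bytes(obj):
    if bytes is not str:          # Python 3: strip the b'...' wrapper from the repr
        obj = repr(obj)[2:-1]
    print(obj)


print_bytes(b'abc')               # prints abc on both Python 2 and Python 3
```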

gettext

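kwargs = {}
if sys.version_info < (3,):
    # Python 2 only: make the installed _() return unicodes. Python 3's
    # gettext.install() has no 'unicode' argument and always returns str.
    kwargs['unicode'] = True
gettext.install(domain, LOCALEDIR, **kwargs)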
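A self-contained sketch of the same idea follows. The 'unicode' keyword exists only in Python 2's gettext.install(), so it is passed only there. The wrapper name install_translations and the domain 'example_domain' are hypothetical; with no message catalog on disk, gettext falls back to returning the message unchanged.

```python
import gettext
import sys


def install_translations(domain, localedir=None):
    kwargs = {}
    if sys.version_info < (3,):
        # Python 2 only: ask for a _() that returns unicodes.
        kwargs['unicode'] = True
    gettext.install(domain, localedir, **kwargs)


install_translations('example_domain')   # hypothetical domain with no catalogs
greeting = _('hello')                    # _() is now available as a built-in
```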

iterators

metaclasses

# Define the Enum class using metaclass syntax compatible with both Python 2
# and Python 3.
Enum = EnumMetaclass(str('Enum'), (), {
    '__doc__': 'The public API Enum class.',
    })

Here EnumMetaclass is the metaclass (duh!) and Enum is the class you're creating which has the custom metaclass. You pass in the base classes (of which there are none, hence the empty tuple) and the dictionary of attributes for the class you're creating. The use of str() here is a bit odd, but it's because the enum module uses from __future__ import unicode_literals for Python 2, but in Python 2, class names must be str/bytes. We can't use the b'' prefix though because in Python 3, class names must be unicodes.
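To make the pattern concrete, here is a minimal sketch with a stand-in metaclass. This EnumMetaclass body is hypothetical and much simpler than a real enum implementation; the point is only that calling the metaclass directly avoids the version-specific metaclass syntax.

```python
class EnumMetaclass(type):
    """Hypothetical stand-in metaclass that tags the classes it creates."""

    def __new__(mcs, name, bases, namespace):
        cls = super(EnumMetaclass, mcs).__new__(mcs, name, bases, namespace)
        cls.created_by = mcs.__name__
        return cls


# Calling the metaclass works the same way on Python 2 and Python 3.
Enum = EnumMetaclass(str('Enum'), (), {
    '__doc__': 'The public API Enum class.',
    })
```

The result is an ordinary class whose type is the metaclass, exactly as if it had been declared with the class statement.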

operators

from collections import Sequence
return isinstance(obj, Sequence)
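Note that importing Sequence directly from collections stopped working in Python 3.10; the abstract base classes have lived in collections.abc since Python 3.3. A version-tolerant sketch, with the hypothetical wrapper name is_sequence:

```python
try:
    from collections.abc import Sequence   # Python 3.3+
except ImportError:
    from collections import Sequence       # Python 2


def is_sequence(obj):
    """Return True for lists, tuples, strings and other Sequence types."""
    return isinstance(obj, Sequence)
```

Keep in mind that strings count as sequences here, while dictionaries and sets do not.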

raise

strings/bytes/unicodes

subprocess

zope.interfaces

Python extension modules

Compatibility macros

C types

There are lots of differences you need to be aware of when defining types in C extensions. A few important ones:

PyArg_Parse()

PyCObject

reprs

#define REPRV(obj) \
    (PyUnicode_Check(obj) ? (obj) : NULL), \
    (PyUnicode_Check(obj) ? NULL : PyBytes_AS_STRING(obj))

and use it like this:

return PyUnicode_FromFormat("...%V...", REPRV(parent_repr));

[more info]

Third party packages

Resources
