Revision 7 as of 2004-12-11 16:36:03


Python speed

People are often worried about the speed of their Python programs; doesn't using Python mean an unacceptable loss in performance? Some people just jump to the conclusion that "hey, it's an interpreted scripting language, and those all run very slow!" Other people have actually tried Python and have found it performs well enough. Sometimes, though, you have a program that just runs too slowly.

Why is raw speed important? Or isn't it?

Some people are inappropriately obsessed with speed and think that just because C can provide better performance for certain types of problem, it must therefore be a better language for all purposes. Other people think that speed of development is far more important, and choose Python even for those applications where it will run slower. Often, they are surprised to find Python code can run at quite acceptable speeds, and in some cases even faster than what they could get from C/C++ with a similar amount of development time invested.

Usually it is not absolute speed that matters; instead, think about what would be an acceptable speed of execution. Optimisations beyond that point waste resources (usually your time, and thus money).

Techniques for Improving Performance and Scalability

Here are some coding guidelines for applications that demand peak performance (in terms of memory utilization, speed, or scalability).

Use the best algorithms and fastest tools

String concatenation is best done with ''.join(seq), which is an O(n) process. In contrast, using the '+' or '+=' operators can result in an O(n**2) process because new strings may be built for each intermediate step. The CPython 2.4 interpreter mitigates this issue somewhat; however, ''.join(seq) remains the best practice.
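A minimal sketch of the two concatenation approaches, written for Python 3. The function names and sample data are illustrative only and do not come from the original text.

```python
def concat_with_plus(parts):
    """Build a string with repeated +=; each step may copy the partial result."""
    result = ""
    for part in parts:
        result += part
    return result


def concat_with_join(parts):
    """Build the same string in a single pass with str.join."""
    return "".join(parts)


words = ["spam", "eggs", "ham"]
joined = concat_with_join(words)
```

Both functions return the same string; only the amount of intermediate copying differs.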

Many tools come in both list form and iterator form (range and xrange, map and itertools.imap, list comprehensions and generator expressions, dict.items and dict.iteritems). In general, the iterator forms are more memory friendly and more scalable. They are preferred whenever a real list is not required.
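The list-versus-iterator distinction can be sketched as follows. This assumes Python 3, where range, map, and dict.items are already lazy, so the contrast is shown with a list comprehension and a generator expression; the variable names are illustrative.

```python
import itertools

# List form: materializes every square in memory before summing.
squares_list = [n * n for n in range(10)]
total_from_list = sum(squares_list)

# Iterator form: the generator expression yields one square at a time.
total_from_generator = sum(n * n for n in range(10))

# islice takes a prefix of an unbounded iterator without building a full list.
first_three = list(itertools.islice((n * n for n in itertools.count()), 3))
```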

Many of Python's core building blocks are coded in optimized C. Applications that take advantage of them can make substantial performance gains. The building blocks include all of the builtin datatypes (lists, tuples, sets, and dictionaries) and extension modules like array, itertools, and collections.deque.

Likewise, builtin functions run faster than hand-built equivalents. For example, map(operator.add, v1, v2) is faster than map(lambda x,y: x+y, v1, v2).
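The two map() calls from the example above, as a small runnable sketch. It assumes Python 3, where map() returns an iterator and is wrapped in list() here; the vectors v1 and v2 are sample data.

```python
import operator

v1 = [1, 2, 3]
v2 = [10, 20, 30]

# operator.add is implemented in C in CPython, so no Python-level
# function body runs for each pair of elements.
sums_builtin = list(map(operator.add, v1, v2))

# The lambda gives the same result but executes Python bytecode per pair.
sums_lambda = list(map(lambda x, y: x + y, v1, v2))
```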

Lists perform well as either fixed-length arrays or variable-length stacks. However, for queue applications using pop(0) or insert(0,v), collections.deque() offers superior O(1) performance because it avoids the O(n) step of rebuilding a full list for each insertion or deletion.
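A short sketch of the stack and queue usage described above. The sample values are illustrative.

```python
from collections import deque

# A list works well as a stack: append and pop act on the right end.
stack = [1, 2, 3]
stack.append(4)
top = stack.pop()

# A deque works well as a queue: both ends support O(1) insertion and removal.
queue = deque(["a", "b", "c"])
queue.append("d")          # enqueue on the right
first = queue.popleft()    # dequeue from the left, the counterpart of list.pop(0)
queue.appendleft("z")      # the counterpart of list.insert(0, v)
```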

Custom sort ordering is best performed with sort's key= option or with the traditional decorate-sort-undecorate technique. Both approaches call the key function just once per element. In contrast, sort's cmp= option is called many times per element during a sort. For example, sort(key=str.lower) is faster than sort(cmp=lambda a,b: cmp(a.lower(), b.lower())).
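Both single-call-per-element approaches can be sketched as below. This is written for Python 3, which no longer accepts cmp=, so only the key= and decorate-sort-undecorate forms are shown; the word list is sample data.

```python
words = ["banana", "Cherry", "apple"]

# key= form: str.lower is called once per element.
by_key = sorted(words, key=str.lower)

# Decorate-sort-undecorate: pair each item with its key, sort the pairs,
# then strip the keys. The index keeps equal keys in their original order
# and avoids comparing the original items on ties.
decorated = [(word.lower(), index, word) for index, word in enumerate(words)]
decorated.sort()
by_dsu = [word for _, _, word in decorated]
```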

Take advantage of interpreter optimizations

In functions, local variables are accessed more quickly than global variables, builtins, and attribute lookups. So, it is sometimes worth localizing variable access in inner loops. For example, the code for random.shuffle() localizes access with the line, random=self.random. That saves the shuffling loop from having to repeatedly look up self.random. Outside of loops, the gain is minimal and rarely worth it.
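A hypothetical illustration of localizing lookups before a loop. The functions below are not taken from the standard library; they only show the pattern.

```python
import math


def roots_global_lookup(values):
    """Looks up math, then math.sqrt, and result.append on every iteration."""
    result = []
    for v in values:
        result.append(math.sqrt(v))
    return result


def roots_local_lookup(values):
    """Binds the attribute lookups to local names once, before the loop."""
    sqrt = math.sqrt
    result = []
    append = result.append
    for v in values:
        append(sqrt(v))
    return result
```

The two functions return identical results; whether the local-name version is measurably faster depends on the interpreter version and the size of the loop, so it is worth timing before adopting it.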

The interpreter does not attempt to factor constant expressions out of loops. Likewise, constant folding needs to be done manually. Inside loops, write "x=3" instead of "x=1+2".
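Hoisting a loop-invariant expression by hand might look like the sketch below. The function names and the scaling computation are made up for illustration.

```python
import math


def scale_naive(values, degrees):
    """Recomputes the loop-invariant factor on every iteration."""
    result = []
    for v in values:
        result.append(v * math.cos(math.radians(degrees)))
    return result


def scale_hoisted(values, degrees):
    """Computes the loop-invariant factor once, outside the loop."""
    factor = math.cos(math.radians(degrees))
    result = []
    for v in values:
        result.append(v * factor)
    return result
```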

Function call overhead is large compared to other instructions. Accordingly, it is sometimes worth in-lining code inside time-critical loops.

The interpreter optimizes "while 1" to just a single jump. In contrast, "while True" takes several more steps. While the latter is preferred for clarity, time-critical code should use the first form.

Multiple assignment is slower than individual assignment. For example, "x,y=a,b" is slower than "x=a; y=b". However, multiple assignment is faster for variable swaps. For example, "x,y=y,x" is faster than "t=x; x=y; y=t".

Chained comparisons are faster than using the "and" operator. Write "x < y < z" instead of "x < y and y < z".

A few fast approaches should be considered hacks and reserved for only the most demanding applications. For example, "not not x" is faster than "bool(x)".

Take advantage of diagnostic tools

Profilers help identify performance bottlenecks. The profile module can distinguish between time spent in pure Python and time spent in C code.
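A minimal profiling sketch, assuming a current Python 3 standard library. It uses cProfile, the C-accelerated counterpart of the profile module, together with pstats; the workload function slow_sum is hypothetical.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately simple workload to give the profiler something to measure."""
    total = 0
    for i in range(n):
        total += i * i
    return total


# Run the workload under the profiler and keep its return value.
profiler = cProfile.Profile()
result = profiler.runcall(slow_sum, 10000)

# Render the collected statistics into a string, sorted by cumulative time.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
```

Printing report shows one row per profiled function, which is usually enough to see where the time goes before optimizing anything.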

Timing tools offer immediate performance comparisons between alternative approaches to writing individual lines of code.
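One standard-library tool suited to this kind of comparison is the timeit module. The sketch below assumes Python 3; the statements being timed and the repetition count are arbitrary choices for illustration.

```python
import timeit

# Time two ways of building the same string.
# number= sets how many times each statement is executed.
join_seconds = timeit.timeit(
    "''.join(parts)",
    setup="parts = ['x'] * 100",
    number=1000,
)
plus_seconds = timeit.timeit(
    "s = ''\nfor p in parts:\n    s += p",
    setup="parts = ['x'] * 100",
    number=1000,
)
```

The absolute numbers vary by machine and interpreter version, so only the relative difference between the two timings is meaningful.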

Performance can dictate overall strategy

Consider external tools for enhancing performance

* NumPy is essential for high-volume numeric work.

* Psyco and Pyrex can help achieve the speed of native code.

More Performance Tips

More performance tips and examples can be found at PythonSpeed/PerformanceTips.
