Differences between revisions 44 and 45
Revision 44 as of 2006-05-16 06:55:58
Size: 6846
Editor: host32
Comment:
Revision 45 as of 2006-05-17 21:59:04
Size: 7186
Editor: 24-107-147-211
Comment:

   PatrickObrien: One definition of "ordered dictionary" is a dictionary that keeps its keys in insertion order. The best Python implementation, including extensive unit tests, is available here: [http://dev.schevo.org/schevo/browser/Schevo/trunk/schevo/lib/odict.py source code], [http://dev.schevo.org/schevo/browser/Schevo/trunk/tests/test_odict.py unit tests].
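
To make the "order of insertion" definition concrete, here is a minimal sketch in present-day Python 3. It is not the Schevo odict linked above; the class name InsertionOrderedDict and its internals are hypothetical, chosen only to illustrate the behaviour.

```python
class InsertionOrderedDict(dict):
    """Hypothetical sketch: a dict that remembers key insertion order."""

    def __init__(self):
        super().__init__()
        self._order = []  # keys, in the order they were first inserted

    def __setitem__(self, key, value):
        if key not in self:
            self._order.append(key)
        super().__setitem__(key, value)

    def __delitem__(self, key):
        super().__delitem__(key)
        self._order.remove(key)  # O(n) removal; acceptable for a sketch

    def __iter__(self):
        return iter(self._order)

    def keys(self):
        return list(self._order)

    def items(self):
        return [(key, self[key]) for key in self._order]


d = InsertionOrderedDict()
d["b"] = 1
d["a"] = 2
d["b"] = 3  # re-assigning an existing key keeps its original position
```

Iterating over d yields "b" before "a", regardless of hash order. A production version would also need to cover update(), pop(), setdefault() and the constructor arguments, which this sketch leaves out.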

Sprint topics following the NeedForSpeed theme

You can add additional topics below. Please discuss the specific goals and possible approaches to these tasks!

CPython

  • Evaluate the PEPs for optimizing global and attribute lookups
  • Can floating point ops be sped up by avoiding flag/exception checks at every step? Can some floating point ops be inlined in ceval.c?
    • TimPeters: By default, CPython does no flag or exception checks on float ops -- the PyFPE_START_PROTECT and PyFPE_END_PROTECT macros normally have empty expansions. Inlining is possible but probably undesirable. Doing masses of fp ops in one gulp via the NumPy (or whatever it's called now) extension is the sanest approach.

  • Implement portions of the decimal module in C
    • RichardJones: Looks like this has a good chance of being done in SummerOfCode.

      RaymondHettinger: We can get them off to a good start by laying the groundwork (the C struct, some access macros, and implementing a couple of methods that they can use as a model).

      GeorgBrandl: I've created a skeleton C module since I was expecting to work on this.

  • Build out the struct module to support fast, high-volume binary conversions -- perhaps with regexp analogs such as struct.compile() and struct.finditer()
    • SeanReifschneider: I am interested in working on this. The first job would be figuring out how best to do it. Can or should it be any faster than a tight C loop that creates the required objects? I think object creation may dominate execution speed, but I would appreciate input.
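
One way to picture the struct.compile() idea is a format that is parsed once and then reused across many records. The sketch below uses struct.Struct and its iter_unpack() method from the present-day standard library; whether that matches what was intended here is an assumption, and the record layout is invented for illustration.

```python
import struct

# Hypothetical record layout: little-endian uint16 id followed by a float32 value.
record = struct.Struct("<Hf")

# Build a buffer holding three packed records.
payload = b"".join(record.pack(i, i * 0.5) for i in range(3))

# iter_unpack() walks the buffer in record-sized steps, reusing the parsed format.
rows = list(record.iter_unpack(payload))
```

The format string is parsed when the Struct is created rather than on every call, which is the analog of compiling a regular expression once and reusing the pattern object.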

  • Create a string subclass that provides lazy slicing without copying
    • FredrikLundh: Creating the class is easy, but integrating it into Python is harder (most code that handles e.g. 8-bit strings assumes a PyString). For Py3K, it would be quite interesting to "instrument" a Python interpreter, mapping all PyString macros to functions, and gathering some kind of usage statistics.

      TimPeters: Just noting a subtlety: PyString objects are always NUL-terminated, so that passing them to random C libraries doesn't require copying the guts. String slices won't have that property.

      RaymondHettinger: I will search for a patch that Skip wrote for a possible implementation of str.partition() which returns a string subclass with pointers to the slices instead of a full copy. Any subsequent string ops lazily evaluate the slice into a full string.
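
The lazy-slicing idea discussed above can be sketched in pure Python. This is an illustration only: LazySlice is a hypothetical standalone class rather than a true string subclass, it is not Skip's patch, and it handles only non-negative, in-range indexes.

```python
class LazySlice:
    """Hypothetical view over part of a str; copies only when materialized."""

    def __init__(self, base, start=0, stop=None):
        self._base = base
        self._start = start
        self._stop = len(base) if stop is None else stop
        self._cached = None  # filled in on first real use

    def __len__(self):
        return self._stop - self._start

    def slice(self, start, stop):
        # Sub-slicing only adjusts offsets into the shared base string.
        return LazySlice(self._base, self._start + start, self._start + stop)

    def __str__(self):
        # First real use: copy the characters once and cache the result.
        if self._cached is None:
            self._cached = self._base[self._start:self._stop]
        return self._cached


text = "GET /index.html HTTP/1.0"
path = LazySlice(text).slice(4, 15)  # no characters copied yet
name = path.slice(1, 6)              # still no copy
```

Calling str(path) performs the single copy. As TimPeters notes, a view like this is not NUL-terminated in C terms, so handing it to a C library would still force the copy.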

  • Allow selective removal of unused features such as profiling support
  • Faster parsing of strings and bytes into int, long, etc.
    • TimPeters: part of that is algorithms, and part is the sheer depth of the call stack. There's at least one patch pending to use faster algorithms for conversion of decimal strings to ints/longs. The platform C library converts decimal strings to floats.

      SeanReifschneider: If the string is less than 9 bytes, can we just call strtol? Would that help? I suspect the check may cost too much, but maybe something in between like special-casing short strings to call strtol?

  • Buffer for use with network I/O
    • FredrikLundh: Also see the stringobject comment below.

      SeanReifschneider: If we have buffering of network I/O, we can change readline() so that there is not a system call for every character in the line. What about file I/O?
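
The buffering point above can be sketched as follows: read from the socket in large chunks and serve lines out of an in-memory buffer, so most readline() calls need no system call at all. BufferedLineReader is a hypothetical helper written for this illustration, not an existing API, and the chunk size is an arbitrary assumption.

```python
import socket


class BufferedLineReader:
    """Hypothetical helper: read lines from a socket in large chunks."""

    def __init__(self, sock, chunk_size=4096):
        self._sock = sock
        self._chunk_size = chunk_size
        self._buffer = b""

    def readline(self):
        # Only call recv() when the buffer does not already hold a full line.
        while b"\n" not in self._buffer:
            chunk = self._sock.recv(self._chunk_size)
            if not chunk:  # peer closed: return whatever is left
                line, self._buffer = self._buffer, b""
                return line
            self._buffer += chunk
        line, _, self._buffer = self._buffer.partition(b"\n")
        return line + b"\n"


# Demonstration over a connected socket pair.
sender, receiver = socket.socketpair()
sender.sendall(b"first\nsecond\npartial")
sender.close()

reader = BufferedLineReader(receiver)
lines = [reader.readline() for _ in range(4)]
receiver.close()
```

Here lines ends up as the two complete lines, the unterminated tail, and an empty bytes object marking end of stream.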

  • Build out the collections module for optimized data structures:
  • Create a 64 bit PyInt type (for 32 bit machines)

    • FredrikLundh: PyInt64, I hope? Or a configuration option? Or a polymorphic-under-the-hood PyLong type?

      TimPeters: Guido would probably be happy if "short" Python ints were in fact 64 bits on all boxes; that's come up before.

  • Optimize methods in stringobject.c
    • FredrikLundh: I'd like to work on refactoring the string method implementations into a "polymorphic" (SRE-style) support library. This would let us share source code between 8-bit and Unicode strings, and make it easier to reuse code also for future array/buffer/bytes types (etc).

      SeanReifschneider: I'd help with that. Adding rfind() to string was horribly painful because of the code duplication.

  • Add itertools.imerge() and itertools.izip_longest()
  • Guido has a standing request to have threading.py written in C
  • Revisit Armin's zombie frame idea for reducing function call overhead.
    • RaymondHettinger: IIRC, the unsolved problem was how to save partially constructed frames without impacting the performance of recursive functions.

  • Other ideas for speeding up function calls. For example, [http://www.python.org/sf/1479611 patch 1479611].

  • Are there any other patches in the [http://sourceforge.net/tracker/?group_id=5470&atid=305470 patch tracker] that are worth investigating?

    • SeanReifschneider: I'd be willing to work on going through the tracker for performance patches.

  • Improve gzip's readline performance (e.g. [http://www.python.org/sf/1281707 patch 1281707]).

  • Improve interpreter startup time, as in patch 921466.
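
For the itertools.imerge() and itertools.izip_longest() item in the list above, the intended behaviour can be approximated with functions from the present-day standard library: heapq.merge() for merging sorted inputs and itertools.zip_longest() for zipping inputs of unequal length. Treating these as equivalents of the proposed names is an assumption; the sample data is invented.

```python
import heapq
import itertools

evens = [0, 2, 4]
odds = [1, 3, 5, 7]

# Merge already-sorted inputs into one sorted stream, lazily.
merged = list(heapq.merge(evens, odds))

# Zip to the length of the longest input, padding the shorter one.
padded = list(itertools.zip_longest(evens, odds, fillvalue=None))
```

Both calls return iterators, so neither needs to hold all of its inputs in memory at once; list() is used here only to show the results.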

Pure Python Projects

Twisted

  • Speed improvements to select and poll reactors
  • Reactor based on /dev/epoll
  • Better integration with psyco
  • Improvements against twisted benchmark

Psyco

  • Support for generator expressions
  • Support for nested scopes
  • Support for more dictionary operations
  • Speed up float arithmetic
  • Support for more built-ins (e.g. int(), long(), float(), etc.)
  • Upgrade for Python 2.5
  • Better tools for profiling the psyco-ness of an application
  • Investigate usefulness of IVM (with the aim of producing a more streamlined dispatch loop)
  • LLVM backend
  • Virtualized longs (for long longs)
  • Virtualized slots (ability to cache getattribute() values)

Py3000

  • Make a wishlist for possible performance gains in Py3.0

NeedForSpeed/Goals (last edited 2008-11-15 13:59:37 by localhost)
