Differences between revisions 1 and 26 (spanning 25 versions)
Revision 1 as of 2006-05-22 16:51:26
Size: 49
Editor: SteveHolden
Comment:
Revision 26 as of 2006-05-24 21:36:04
Size: 2803
Editor: TimPeters
Comment:


Things we think the Python community will like.

  • Frame optimizations: once a function is called it retains the allocated frame for use in future calls, avoiding allocation and initialization overhead. Frame size has also been slightly reduced.

    PyStone is over 10% higher on RichardJones' test machine, compared to Python 2.4 (from 20242 to 22935). PyBench reports an overall slowdown, attributed to a 150% slowdown in a piece of code that wasn't changed. Sigh.

  • Made Gzip readline 30-40% faster (BobIppolito)

  • Sped up Unicode operations (AndrewDalke, FredrikLundh). Most notably, repeat is much faster, and most search operations (find, index, count, in) are a LOT faster (20x for the related stringbench tests). Also, rsplit is now as fast as split, and splitlines is nearly as fast as a plain split("\n"). Current stringbench results:

        str(ms) uni(ms) %       comment
        -----------------------------------------------------
        2271.31 3608.32 62.9    TOTAL 2.5a2
        2261.85 1187.84 190.4   TOTAL tuesday
        2247.84  875.13 256.9   TOTAL wednesday

    (Yes, the Unicode string type is now more than twice as fast as the 8-bit str type on this set of tests, and over 4 times faster than when we started. YMMV.)
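
For anyone who wants to try a comparison of this kind themselves, here is a rough sketch. It is not stringbench: the workload, the operation list and the names `ratio_percent`, `best_ms` and `compare` are all made up for illustration. It also assumes a current Python 3, where the 8-bit vs. Unicode comparison becomes bytes vs. str. The `ratio_percent` helper encodes the assumption that the % column above is the str time divided by the uni time, which matches the three TOTAL rows.

```python
import timeit

def ratio_percent(str_ms, uni_ms):
    """Return str time as a percentage of uni time (assumed meaning of the % column)."""
    return 100.0 * str_ms / uni_ms

def best_ms(stmt, setup, number=50):
    """Best of three runs of `stmt`, in milliseconds for `number` executions."""
    return min(timeit.repeat(stmt, setup=setup, repeat=3, number=number)) * 1000.0

# Hypothetical workload, not taken from stringbench: a long haystack
# with a single match at the very end.
BYTES_SETUP = "s = b'a' * 100000 + b'needle'; n = b'needle'"
TEXT_SETUP = "s = 'a' * 100000 + 'needle'; n = 'needle'"
OPERATIONS = ["s.find(n)", "s.count(n)", "n in s", "s.split(n)", "s.rsplit(n)"]

def compare():
    """Time each operation on bytes and on text; return one row per operation."""
    rows = []
    for stmt in OPERATIONS:
        bytes_ms = best_ms(stmt, BYTES_SETUP)
        text_ms = best_ms(stmt, TEXT_SETUP)
        rows.append((stmt, bytes_ms, text_ms))
    return rows

if __name__ == "__main__":
    for stmt, bytes_ms, text_ms in compare():
        print("%-12s %9.3f %9.3f %7.1f"
              % (stmt, bytes_ms, text_ms, ratio_percent(bytes_ms, text_ms)))
```

Numbers from a sketch like this depend heavily on the machine, the Python version and the input, so treat them as a sanity check and not as a replacement for the stringbench totals.
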
  • Patch 1335972 was a combination bugfix+speedup for string->int conversion. These are the speedups measured on my Windows box for decimal strings of various lengths. Note that the difference between 9 and 10 is the difference between short and long Python ints on a 32-bit box. The patch doesn't actually do anything to speed conversion to long directly; the speedup in those cases is solely due to detecting "unsigned long" overflow more quickly:

        length speedup
        ------ -------
         1       12.4%
         2       15.7%
         3       20.6%
         4       28.1%
         5       33.2%
         6       37.5%
         7       41.9%
         8       46.3%
         9       51.2%
        10       19.5%
        11       19.9%
        12       23.9%
        13       23.7%
        14       23.3%
        15       24.9%
        16       25.3%
        17       28.3%
        18       27.9%
        19       35.7%
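
A rough way to reproduce a per-length timing like the one above, assuming a current Python 3. The inputs (strings of nines) and the names `digits`, `fits_in_int32` and `ns_per_conversion` are hypothetical; the original measurement script is not shown here. Python 3 has a single int type, so the jump between 9 and 10 digits will not look the same as on a 32-bit Python 2 box. The `fits_in_int32` helper is only there to show why that boundary sits where it does: every 9-digit decimal string fits in a signed 32-bit value, while 10-digit strings can overflow it.

```python
import timeit

INT32_MAX = 2**31 - 1  # largest signed 32-bit value: 2147483647, ten digits

def digits(length):
    """Hypothetical test input: a decimal string of `length` nines."""
    return "9" * length

def fits_in_int32(text):
    """True if the decimal string converts to a value within a signed 32-bit int."""
    return -INT32_MAX - 1 <= int(text) <= INT32_MAX

def ns_per_conversion(length, number=20000):
    """Best-of-three cost of int(text) in nanoseconds for a string of this length."""
    text = digits(length)
    best = min(timeit.repeat("int(text)", globals={"text": text},
                             repeat=3, number=number))
    return best / number * 1e9

if __name__ == "__main__":
    for length in range(1, 20):
        print("%2d  %8.1f ns  fits_in_int32=%s"
              % (length, ns_per_conversion(length), fits_in_int32(digits(length))))
```

To get a speedup column you would run the same script under the old and the new interpreter and compare the two timings for each length.
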

  • The struct module has been rewritten to pre-compile struct descriptors (similar to the RE module). This gives a 20% speedup, on average, for the test suite (BobIppolito). Taking advantage of the new ability to "compile" a struct pattern (similar to compiling regexps) can be much faster still.
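
A small sketch of what "compiling" a struct pattern looks like in practice, using the `struct.Struct` class. The record layout and the helper names `pack_with_module` and `pack_precompiled` are made up for illustration, and the timing loop assumes a current Python 3.

```python
import struct
import timeit

# Hypothetical layout: one 4-byte and two 2-byte unsigned ints, little-endian.
RECORD_FORMAT = "<IHH"

# Parse the format string once and reuse the compiled object.
record = struct.Struct(RECORD_FORMAT)

def pack_with_module(a, b, c):
    """Pack via the module-level function, passing the format string each call."""
    return struct.pack(RECORD_FORMAT, a, b, c)

def pack_precompiled(a, b, c):
    """Pack via the precompiled Struct object."""
    return record.pack(a, b, c)

if __name__ == "__main__":
    number = 200000
    t_module = timeit.timeit(lambda: pack_with_module(1, 2, 3), number=number)
    t_compiled = timeit.timeit(lambda: pack_precompiled(1, 2, 3), number=number)
    print("struct.pack:   %.3f s" % t_module)
    print("Struct.pack:   %.3f s" % t_compiled)
```

Both paths produce the same bytes; how much the precompiled object saves depends on the format and the interpreter version, so measure before relying on it.
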

  • Worked on using profile-guided optimizations in Visual Studio 8 (KristjanJonsson, RichardMTew). This appears to give roughly a 15% speed improvement in the pybench test suite. A new PCBuild8 directory will be added with automated mechanisms for doing this.

NeedForSpeed/Successes (last edited 2008-11-15 14:00:10 by localhost)
