Diff for "NeedForSpeed/Successes"

Differences between revisions 1 and 26 (spanning 25 versions)

Things we think the Python community will like.

Frame optimizations: once a function is called, it retains the allocated frame for use in future calls, avoiding allocation and initialization overhead. Frame size has also been slightly reduced.
The PyStone score is over 10% higher than with Python 2.4 on RichardJones' test machine (from 20242 to 22935). PyBench reports an overall slowdown, attributed to a 150% slowdown in a piece of code that wasn't changed. Sigh.
Made Gzip readline 30-40% faster (BobIppolito).
Sped up Unicode operations (AndrewDalke, FredrikLundh). Most notably, repeat is much faster, and most search operations (find, index, count, in) are a LOT faster (20x for the related stringbench tests). Also, rsplit is now as fast as split, and splitlines is nearly as fast as a plain split("\n"). Current stringbench results:

        str(ms)   uni(ms)   str/uni(%)   comment
        -----------------------------------------------------
        2271.31   3608.32    62.9        TOTAL 2.5a2
        2261.85   1187.84   190.4        TOTAL tuesday
        2247.84    875.13   256.9        TOTAL wednesday

(Yes, on this set of tests the Unicode string type is now more than twice as fast as the 8-bit str type, and over 4 times faster than when we started. YMMV.)
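The operations named above can be timed with a small script along these lines. This is only a rough sketch, not stringbench itself: the sample text, the `run_benchmarks` helper, and the choice of operations are all assumptions made for illustration. It is written for Python 3, where `str` is the Unicode type, so it will not reproduce the str-versus-Unicode comparison in the table.

```python
import timeit


def run_benchmarks(text, number=200):
    """Time a few string operations; returns {label: milliseconds per call}."""
    operations = {
        "find (miss)": lambda: text.find("no such needle"),
        "count": lambda: text.count("eggs"),
        "in (miss)": lambda: "no such needle" in text,
        "split": lambda: text.split("\n"),
        "rsplit": lambda: text.rsplit("\n"),
        "splitlines": lambda: text.splitlines(),
        "repeat": lambda: text * 10,
    }
    return {
        label: timeit.timeit(func, number=number) * 1000.0 / number
        for label, func in operations.items()
    }


# Hypothetical sample data, not taken from stringbench.
sample = "spam and eggs\n" * 1000

for label, ms in run_benchmarks(sample).items():
    print("%-12s %8.4f ms" % (label, ms))
```

Absolute numbers depend heavily on the machine and the interpreter build, so only the relative ordering of operations within one run is worth reading.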

Patch 1335972 was a combination bugfix+speedup for string->int conversion. These are the speedups measured on my Windows box for decimal strings of various lengths. Note that the boundary between lengths 9 and 10 is the boundary between short and long Python ints on a 32-bit box. The patch doesn't actually do anything to speed up conversion to long directly; the speedup in those cases is solely due to detecting "unsigned long" overflow more quickly:

        length speedup
        ------ -------
         1       12.4%
         2       15.7%
         3       20.6%
         4       28.1%
         5       33.2%
         6       37.5%
         7       41.9%
         8       46.3%
         9       51.2%
        10       19.5%
        11       19.9%
        12       23.9%
        13       23.7%
        14       23.3%
        15       24.9%
        16       25.3%
        17       28.3%
        18       27.9%
        19       35.7%
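A measurement of this kind could be sketched as follows. The source does not say how the speedup percentages were computed, so `speedup_percent` is an assumed definition, and `time_int_conversion` is a hypothetical helper. The two constants show why the break falls between 9 and 10 digits on a 32-bit box. Modern Python 3 has a single int type, so the short/long boundary will not show up in its timings.

```python
import timeit

INT32_MAX = 2**31 - 1   # largest value of a signed 32-bit C long
UINT32_MAX = 2**32 - 1  # largest value of an unsigned 32-bit C long


def time_int_conversion(length, number=20000):
    """Seconds taken to convert a `length`-digit decimal string `number` times."""
    digits = "9" * length
    return timeit.timeit(lambda: int(digits), number=number)


def speedup_percent(old_seconds, new_seconds):
    """Assumed definition: extra conversions per unit time, as a percentage."""
    return (old_seconds / new_seconds - 1.0) * 100.0


# Every 9-digit decimal fits in a signed 32-bit long;
# "9" * 10 overflows even the unsigned 32-bit range.
print(int("9" * 9) <= INT32_MAX)
print(int("9" * 10) > UINT32_MAX)

for length in range(1, 20):
    print("%2d digits: %.4f s" % (length, time_int_conversion(length)))
```

To get percentages comparable to the table, the same loop would have to be run on builds with and without the patch, and the two timings for each length passed to `speedup_percent`.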

The struct module has been rewritten to pre-compile struct descriptors (similar to the re module). This gives a 20% speedup, on average, for the test suite (BobIppolito). Taking advantage of the new ability to "compile" a struct pattern (similar to compiling regexps) can be much faster still.
Worked on using profile-guided optimizations in Visual Studio 8 (KristjanJonsson, RichardMTew). This appears to give a speed improvement on the order of 15% in the pybench test suite. A new PCBuild8 directory will be added with automated mechanisms for doing this.
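For the struct rewrite noted above, "compiling" a pattern looks roughly like this with the `struct.Struct` class. The record layout and the values are hypothetical, chosen only for illustration, and the bytes literals assume Python 3.

```python
import struct

# Hypothetical record layout: little-endian unsigned short, signed int, 4 raw bytes.
RECORD_FORMAT = "<Hi4s"

# Parse the format string once, then reuse the compiled object.
record = struct.Struct(RECORD_FORMAT)

packed = record.pack(7, -42, b"abcd")
values = record.unpack(packed)

print(record.size)  # number of bytes in one packed record
print(values)

# The module-level functions take the format string on every call
# and produce the same bytes.
same = struct.pack(RECORD_FORMAT, 7, -42, b"abcd")
print(same == packed)
```

The compiled object pays off mainly in loops that pack or unpack many records with the same format, in the same way a compiled regular expression does.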

- Revision 1 as of 2006-05-22 16:51:26 (Size: 49, Editor: SteveHolden)
+ Revision 26 as of 2006-05-24 21:36:04 (Size: 2803, Editor: TimPeters)