Planned and implemented speedups for PyByteCode interpreter

Issue

Currently there is a huge switch-case statement in the interpreter of PyByteCode. Since there are a lot of cases (upwards of 150), the compiler cannot inline any operations in the case statements. The idea is to convert the switch-case statement into an enum-based dispatch.

Background

PyTableCode allows us to associate code objects with methods, which minimizes the permgen overhead compared with wrapping each code object in its own class. (This might have changed with the recent support in Java 6 for anonymous classloaders, however.) The association is through an index, which is then switched on to reach the correct method. Table switches like this basically don't get inlined, so that's a real performance limitation. PyTableCode objects are loaded by a custom classloader and are compiled by CodeCompiler. You will definitely want to look at CodeCompiler and eventually the ScopesCompiler. We wrap the ASM bytecode library with our own API; most of that is in org.python.compiler.Code.

PyBytecode implements the Python bytecode VM. It's a direct translation of ceval.c from CPython. Please note there were likely some changes from 2.5 (which is what I translated) to 2.6, and then to 2.7. Really the only way to know is to do a diff; this is definitely a part of Python that's documented only in the code. But that's getting ahead of where we need to be.

The marshal module (org.python.modules._marshal, imported via Lib/marshal.py, which is the usual pattern) implements the logic for marshaling code objects and their associated constants to and from their serialized form.
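To make the round trip concrete, the following sketch serializes a code object with the standard marshal API and then restores and runs it. It is written against CPython's `marshal.dumps` and `marshal.loads`; that the Jython module behaves identically for this input is an assumption, and the source string and the `<marshal-demo>` filename are made up for the example.

```python
import marshal

# Compile a tiny module body to a code object.
source = "x = 6 * 7\n"
code = compile(source, "<marshal-demo>", "exec")

# Serialize the code object (and its constants) to bytes, then restore it.
blob = marshal.dumps(code)
restored = marshal.loads(blob)

# Run the restored code object in a fresh namespace.
namespace = {}
exec(restored, namespace)
```

Note that the marshal format is specific to the Python version that produced it, so bytes written by one version are not guaranteed to load in another.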

The pycimport module (Lib/pycimport.py) allows you to import Python bytecode objects, in the form of pyc files, when available. It's pretty short, but the underlying PEP 302 support for meta importers is not exactly well documented. You can see how it's tested, in a really limited way, in Lib/test/test_pbcvm.py. Ideally it would run the entire regrtest, or at least a more substantial fraction. The subprocess TODO is what's necessary to avoid collisions caused by running the same imports repeatedly.
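Since the meta importer mechanism is thinly documented, here is a minimal sketch of one. It is not pycimport's code: it uses the modern importlib form of the protocol (`find_spec` and `exec_module`) rather than the original PEP 302 methods, and it serves modules from an in-memory dict of marshaled code objects instead of pyc files. The class name `MarshalledCodeFinder` and the module name `demo_pbc_module` are hypothetical.

```python
import importlib.abc
import importlib.util
import marshal
import sys


class MarshalledCodeFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Hypothetical meta importer: finder and loader in a single object."""

    def __init__(self, blobs):
        # Maps module name -> marshaled code object (bytes).
        self._blobs = blobs

    def find_spec(self, fullname, path=None, target=None):
        if fullname not in self._blobs:
            return None  # Let the next finder on sys.meta_path try.
        return importlib.util.spec_from_loader(fullname, self)

    def create_module(self, spec):
        return None  # Use the default module creation.

    def exec_module(self, module):
        code = marshal.loads(self._blobs[module.__name__])
        exec(code, module.__dict__)


# Build a marshaled code object for a made-up module.
blob = marshal.dumps(compile("ANSWER = 6 * 7\n", "<demo_pbc_module>", "exec"))
finder = MarshalledCodeFinder({"demo_pbc_module": blob})

# Install the finder only for the duration of the import.
sys.meta_path.insert(0, finder)
try:
    import demo_pbc_module
finally:
    sys.meta_path.remove(finder)
```

Removing the finder afterwards keeps the demo from affecting later imports; the imported module stays cached in sys.modules, which is the same caching that makes repeated in-process test imports collide.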

Lastly, you can use the compileall module to compile Python code ahead of time (henceforth AOT), either in CPython or Jython, to pyc or $py.class representations respectively. This module is used by distutils, as seen in setup.py package setup scripts.
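A minimal sketch of AOT compilation with compileall, run here under a current CPython: it writes a throwaway source file into a temporary directory, compiles the directory, and checks that the cached bytecode file appeared. The helper name `compile_tree_ahead_of_time` and the file name `hello_aot.py` are made up for the example, and the output location shown is CPython's; Jython's output files differ as described above.

```python
import compileall
import importlib.util
import os
import tempfile


def compile_tree_ahead_of_time(source_text):
    """Compile one throwaway module AOT; return (succeeded, pyc_written)."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "hello_aot.py")
        with open(src, "w") as handle:
            handle.write(source_text)

        # compile_dir walks the directory and byte-compiles every .py file;
        # quiet=1 suppresses the per-file listing but still reports errors.
        succeeded = bool(compileall.compile_dir(tmp, quiet=1))

        # Ask importlib where this interpreter caches bytecode for src.
        pyc_path = importlib.util.cache_from_source(src)
        return succeeded, os.path.exists(pyc_path)


succeeded, pyc_written = compile_tree_ahead_of_time("print('hello')\n")
```

The same thing is available from the command line as `python -m compileall <directory>`.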

CodeSpeedupExperiments/PyByteCode (last edited 2011-01-09 08:16:49 by c-98-245-87-8)