Information that might be useful while implementing or maintaining a type - some notes for the Jython Wiki, based on things I discovered while trying to implement bytearray.

Types in General

A Hook for Bootstrapping

__builtin__.java is used to load types and functions into the globals table of the interpreter. For a new type to be visible, there must be a statement in the body of org.python.core.__builtin__.fillWithBuiltins() that names it.

Deriving and Exposing

A Jython type is defined by a Java class, say Piranha.java. The code you write in that class has to be transformed in two ways. First, the source of a second class, PiranhaDerived.java, is generated from it (see GeneratedDerivedClasses). Second, the compiled version of your class, Piranha.class, is transformed by the type exposer utility (see PythonTypesInJava).

Naming Exposed Methods

A typical Jython type contains a dozen or more methods that will be exposed as methods in Python: there is a rich array of standard methods like remove, __add__ and __hash__, depending on what the type does. Each Python method is exposed under its standard name, but (by convention) the name of the Java method that implements it is formed from the Python name of the type, an underscore and the exposed method name. So remove is implemented by piranha_remove, and __add__ is implemented by piranha___add__. (That was 3 underscores and 2 underscores.) It is usually not necessary to give the exposed name of the method in the @ExposedMethod annotation, because the exposer understands this convention. (It takes the Python name of the type from the @ExposedType annotation on the class.)
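The naming convention can be expressed as a pair of small helpers. This is only an illustration of the rule described above, written in Python for brevity; the function names implementation_name and exposed_name are hypothetical and are not part of Jython or of the exposer.

```python
def implementation_name(type_name, exposed_name):
    """Return the conventional Java method name for an exposed Python method.

    type_name is the Python name of the type (as given to @ExposedType),
    exposed_name is the name Python code will see.
    """
    return type_name + "_" + exposed_name


def exposed_name(type_name, java_method_name):
    """Recover the Python-visible name from a conventionally named Java method.

    Assumption: a name that does not start with '<type_name>_' is returned
    unchanged. This mirrors the convention only, not the exposer's real code.
    """
    prefix = type_name + "_"
    if java_method_name.startswith(prefix):
        return java_method_name[len(prefix):]
    return java_method_name


print(implementation_name("piranha", "remove"))    # piranha_remove
print(implementation_name("piranha", "__add__"))   # piranha___add__
print(exposed_name("piranha", "piranha___add__"))  # __add__
```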

The exposed implementation methods should be final and are usually synchronized.

Documentation Strings

The documentation string (__doc__) for each type and method could be given as a string literal argument to the @ExposedType or @ExposedMethod annotation, but for Python language types there is a cleverer way. The annotations of such types refer to constant strings defined in BuiltinDocs.java. By convention, the documentation string for a method has a name formed from the Python name of the type, an underscore, the exposed method name, and "_doc". So list.remove is documented by list_remove_doc, and list.__add__ is documented by list___add___doc. (That was 3 underscores each time.) A script, Misc/make_pydocs.py, generates BuiltinDocs.java when run under a working CPython implementation of the correct vintage. The script contains an easily identified table of the types and built-in function objects for which it is to do this. Thus, to make all the documentation strings for Jython 2.6, one updates the table with any missing types and functions, and runs the script under (say) CPython 2.6.6 (the latest at the time of writing).
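To make the naming rule for documentation constants concrete, here is a minimal sketch of how the names and strings could be collected from a running CPython. It is not the code of Misc/make_pydocs.py; the helper names doc_constant_name and collect_docs are hypothetical, and the decision to skip attributes without a string docstring is an assumption made for the example.

```python
def doc_constant_name(type_name, method_name):
    """Name of the constant documenting type_name.method_name, by convention."""
    return "%s_%s_doc" % (type_name, method_name)


def collect_docs(py_type):
    """Map conventional constant names to the docstrings found on py_type.

    Assumption: attributes whose __doc__ is missing or not a string are skipped.
    """
    docs = {}
    type_name = py_type.__name__
    for attr_name in dir(py_type):
        doc = getattr(getattr(py_type, attr_name), "__doc__", None)
        if isinstance(doc, str):
            docs[doc_constant_name(type_name, attr_name)] = doc
    return docs


docs = collect_docs(list)
print(doc_constant_name("list", "remove"))   # list_remove_doc
print(doc_constant_name("list", "__add__"))  # list___add___doc
print("list_remove_doc" in docs)             # True
```

The docstring text itself depends on the CPython version used, which is why the real script has to be run under an interpreter of the matching vintage.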

Java API Methods

Each exposed method has a twin in the Java API for the class. (Typically only the Java API has Javadoc comments.) This twin is named the same as the method as Python sees it, and delegates to the exposed method. Thus __add__ simply calls piranha___add__.

On further investigation, it turns out that the code generated by compiling Python (to Java bytecode, obviously) also calls __add__ directly: this is visible in the Java stack. This invites the speculation that it might be slightly more efficient to have piranha___add__ call __add__.

Implementing Comparison Operations

When Python code needs to compare two objects, the interpreter disappears into a maze of twisty passages, all alike. This is partly a flaw in the design of Python 2 that is rectified in version 3. There are two comparison ecosystems at work: one based on __cmp__, which is very similar to Java's Comparable<T>.compareTo and is intended for sort ordering, and one based on the six "rich comparison" operations __eq__, __ne__, __lt__, __le__, __ge__, __gt__, which support the operators ==, !=, <, <=, >=, >. It is possible to define __cmp__ and the rich comparison operations inconsistently. In Python 3, __cmp__ has gone, sorting is based on __lt__, and the six rich comparison operations are the only comparison mechanism.

It is possible to compare objects of dissimilar type. Given a<b, Python first tries a.__lt__(b), but if this produces no result (null in the Java code), Python will try b.__gt__(a).
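The fallback to the reflected operation can be observed with two small classes. This is a sketch: Metres and Feet are hypothetical types invented for the example, and the calls list exists only to record the order in which the methods are tried.

```python
calls = []  # records the order in which comparison methods are tried


class Metres(object):
    """Hypothetical type that only knows how to compare with other Metres."""

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        calls.append("Metres.__lt__")
        if isinstance(other, Metres):
            return self.value < other.value
        return NotImplemented  # "no result": let Python try the other operand


class Feet(object):
    """Hypothetical type that knows how to compare itself with Metres."""

    def __init__(self, value):
        self.value = value

    def __gt__(self, other):
        calls.append("Feet.__gt__")
        if isinstance(other, Metres):
            return self.value * 0.3048 > other.value
        return NotImplemented


result = Metres(1) < Feet(10)
print(result)  # True
print(calls)   # ['Metres.__lt__', 'Feet.__gt__']
```

Metres.__lt__ is asked first and declines, so the interpreter evaluates the comparison through Feet.__gt__ with the operands swapped.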

In a further nuance, a<b and a.__lt__(b) behave differently when a.__lt__(b) produces no result. (A corresponding rule applies to the other 5 operators.) When an object a cannot compare itself to b, the Python expression a.__lt__(b) evaluates to the special object NotImplemented. The expression a<b still evaluates to a bool: if b.__gt__(a) gives no result either, Python 2 falls back on a default ordering, which yields False in this example:

>>> "oops".__lt__(5)
NotImplemented
>>> "oops"<5
False

In a last touch, if you define only some of the operations, Python seems to synthesize missing ones, e.g. __ge__ from the opposite of __lt__.

ImplementNewType (last edited 2012-05-07 17:23:01 by JeffAllen)