Differences between revisions 9 and 10
Revision 9 as of 2005-02-13 13:35:10
Size: 4284
Editor: BrianZimmer
Comment:
Revision 10 as of 2005-03-02 06:03:56
Size: 8502
Editor: ClarkUpdike
Comment:

Background

This page considers the integration of java.util collections interfaces into core jython objects in going from Jython 2.1 to 2.2:

Jython 2.1 support for Java Collections integration is a one-way street. It's possible to make a Collections object act as a PyObject, but it's not possible to make a PyObject act as a Collection.

The integration of Collections into Jython happens through the CollectionProxy and CollectionProxy2 classes. They wrap the Collection instance with the appropriate proxy and delegate Jython calls to the Collection instance.

Going the other way fails. Take for example:

  >>> from java.util import ArrayList
  >>> a = ArrayList([1,2,3])
  Traceback (innermost last):
    File "<console>", line 1, in ?
  TypeError: java.util.ArrayList(): 1st arg can't be coerced to java.util.Collection or int

In this example the ArrayList constructor expects a java.util.Collection instance, but since PyList does not implement this interface, a TypeError is raised. Because the Collections framework has been fundamental to Java since 1.2, the JythonDevelopmentTeam will address this issue. The implementation is currently being written by ClarkUpdike.

Design

There are two different approaches:

  1. Subclass the Abstract classes available in java.util for the Collection framework.
  2. Continue subclassing PyObject, with the additional work of implementing the appropriate interface.

Approach 2 offers the best integration options. Jython is primarily an implementation of Python, so implementing the data structures as they are in Python takes priority over the Java implementations. In addition, the keyword and index arguments for method calls are already handled. The implementation of the interfaces need only delegate to the appropriate PyObject instance's method for the same functionality.

||'''Jython Class'''||'''Extends'''||'''Implements'''||
||PySequence||PyObject||List (Collection)||
||PyArray||PySequence|| ||
||PyList||PySequence|| ||
||PyString||PySequence|| ||
||PyTuple||PySequence|| ||
||PyXRange||PySequence|| ||
||PySet||PyObject||Set||
||PyDictionary||PyObject||Map||
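To make Approach 2 concrete, here is a minimal sketch of a set type that keeps its PyObject superclass and implements `java.util.Set` by forwarding to Python-style methods. Everything named here (`PyObjectSketch`, `PySetSketch`, `py_add`, `py_discard`, `py_clear`) is a hypothetical stand-in for illustration, not Jython's actual code, and it uses Java 5 generics for brevity.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for org.python.core.PyObject (the real class is far larger).
class PyObjectSketch {}

// Approach 2 in miniature: keep the PyObject superclass, add the java.util interface.
// equals/hashCode per the Set contract are omitted for brevity.
class PySetSketch extends PyObjectSketch implements Set<Object> {
    private final HashSet<Object> _set = new HashSet<Object>();

    // Python-facing methods. The names are illustrative, not Jython's actual API.
    int __len__() { return _set.size(); }
    boolean __contains__(Object o) { return _set.contains(o); }
    Iterator<Object> __iter__() { return _set.iterator(); }
    void py_add(Object o) { _set.add(o); }
    void py_discard(Object o) { _set.remove(o); }
    void py_clear() { _set.clear(); }

    // java.util.Set methods, each forwarding to a Python-facing method.
    public int size() { return __len__(); }
    public boolean isEmpty() { return __len__() == 0; }
    public boolean contains(Object o) { return __contains__(o); }
    public Iterator<Object> iterator() { return __iter__(); }
    public void clear() { py_clear(); }

    public boolean add(Object o) {
        boolean isNew = !__contains__(o);
        py_add(o);
        return isNew;
    }

    public boolean remove(Object o) {
        boolean present = __contains__(o);
        py_discard(o);
        return present;
    }

    public boolean containsAll(Collection<?> c) {
        for (Object o : c) {
            if (!__contains__(o)) return false;
        }
        return true;
    }

    public boolean addAll(Collection<? extends Object> c) {
        boolean changed = false;
        for (Object o : c) changed |= add(o);
        return changed;
    }

    public boolean removeAll(Collection<?> c) {
        boolean changed = false;
        for (Object o : c) changed |= remove(o);
        return changed;
    }

    public boolean retainAll(Collection<?> c) {
        boolean changed = false;
        for (Iterator<Object> it = __iter__(); it.hasNext();) {
            if (!c.contains(it.next())) {
                it.remove();
                changed = true;
            }
        }
        return changed;
    }

    // Array conversion reads the backing store directly.
    public Object[] toArray() { return _set.toArray(); }
    public <T> T[] toArray(T[] a) { return _set.toArray(a); }
}

class PySetDemo {
    public static void main(String[] args) {
        PySetSketch s = new PySetSketch();
        s.py_add("spam");   // added from the "Python side"
        s.add("eggs");      // added through the java.util.Set interface
        // A java.util consumer accepts the object directly because it is a Collection.
        List<Object> copy = new ArrayList<Object>(s);
        System.out.println(copy.size()); // prints 2
    }
}
```

With an object like this, a constructor such as `ArrayList(Collection)` has something it can accept, which is the direction that fails in the Background example.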

Discussion

ClarkUpdike Mar 2 2005

This is in response to <waiting for post to show up on sourceforge>. It is a work in progress--but feel free to comment. I've concentrated on the impact of implementing the List interface:

 `List`:: `java.util.List`
 `list`:: python `list` type (nice to be able to say 'type', eh?)

I've been thinking on how to accomplish the "Approach 2" design, which is basically a delegation model, in the sense that the subject classes will continue to subclass PyObject. PySequence does not have an "element data" field--which leaves the concrete classes to handle that. Here are some observations on the current 2.1 design:

||'''Jython Class'''||'''Extends'''||'''current element data field'''||'''Notes'''||
||PySequence||PyObject||N/A|| ||
||PyArray||PySequence||`Object data;`||`data` is set to primitive array or array of arbitrary class||
||PyList||PySequence||`protected PyObject[] list;`|| ||
||PyString||PySequence||`private String string;`||jython depends heavily on interning of String||
||PyTuple||PySequence||*`public PyObject[] list;`||`list` field is referenced directly by 8 classes in 14 methods||
||PyXRange||PySequence||N/A||int attributes start, stop, step (useless without `PySequence.__iter__()`)||
||PySet||PyObject||`protected HashSet _set;`||based on BrianZimmer's SetsModule||
||PyDictionary||PyObject||`protected Hashtable table;`|| ||

My current thinking is that we should add a new branch to PySequence--let's call it PySequenceList. PySequence would remain as is, and PySequenceList would subclass PySequence and implement the java.util.List interface. PyString and PyXRange would subclass PySequence, while PyArray, PyList, and PyTuple would subclass PySequenceList. This seems appropriate because, although PyXRange and PyString technically fall under the description of List, their practicality as a List is nil. Am I missing something obvious here? Would anyone ever use PyXRange in Java? And PyString is auto-converted to a java.lang.String.

PySequence <--+-- PySequenceList <---- (PyArray, PyList, PyTuple)
              |
              +-- (PyString, PyXRange)

Not sure about PyList and PyTuple sharing an additional base class with a predefined element data field. Could do it that way, or could use delegation on a specialized element data class, or could leave it as-is (with copied code for certain methods, although the amount of copied code will increase).

One key decision is about the element data field in PyList and PyTuple. Currently, it's PyObject[]. This is efficient because it eliminates casting, but keeping it makes the List implementation difficult (take a look at the source for java.util.AbstractList). Switching to an ArrayList would buy us easier collection integration but would cost performance (how much is unknown). There's also a "middle-of-the-road" approach: use a specialized class that wraps a PyObject array and provides List-like behavior (I have some experience with doing this). This approach might also be used with PyArray.
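One way to picture the "middle-of-the-road" option is a small helper class that owns a typed array and exposes it as a `List`. The sketch below is an assumption about what such a class could look like, not the design that was adopted: `PyObjectStub` and `PyObjectArraySketch` are hypothetical names, and the helper (not the PyObject subclass itself) extends `java.util.AbstractList` so that only five methods need to be written.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for org.python.core.PyObject.
class PyObjectStub {
    final String repr;
    PyObjectStub(String repr) { this.repr = repr; }
}

// Hypothetical helper: a growable PyObjectStub[] that also behaves as a java.util.List.
class PyObjectArraySketch extends AbstractList<PyObjectStub> {
    private PyObjectStub[] data = new PyObjectStub[8];
    private int size;

    // Typed accessor for Jython-side code: reads the typed array directly.
    PyObjectStub pyget(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return data[index];
    }

    @Override public PyObjectStub get(int index) { return pyget(index); }
    @Override public int size() { return size; }

    @Override public PyObjectStub set(int index, PyObjectStub value) {
        PyObjectStub old = pyget(index);
        data[index] = value;
        return old;
    }

    @Override public void add(int index, PyObjectStub value) {
        if (index < 0 || index > size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        if (size == data.length) {
            data = Arrays.copyOf(data, size * 2); // grow the backing array
        }
        System.arraycopy(data, index, data, index + 1, size - index);
        data[index] = value;
        size++;
        modCount++; // lets AbstractList iterators detect concurrent modification
    }

    @Override public PyObjectStub remove(int index) {
        PyObjectStub old = pyget(index);
        System.arraycopy(data, index + 1, data, index, size - index - 1);
        data[--size] = null; // drop the stale reference
        modCount++;
        return old;
    }
}

class PyObjectArrayDemo {
    public static void main(String[] args) {
        PyObjectArraySketch items = new PyObjectArraySketch();
        for (int i = 0; i < 10; i++) {
            items.add(new PyObjectStub("item" + i)); // AbstractList.add(E) appends via add(int, E)
        }
        items.remove(0);
        List<PyObjectStub> copy = new ArrayList<PyObjectStub>(items);
        System.out.println(copy.size() + " " + copy.get(0).repr); // prints 9 item1
    }
}
```

A PyList-like class could then hold one of these and forward both its Python methods and its `List` methods to it, which keeps the array storage while avoiding a hand-written copy of everything in `AbstractList`.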

PyArray also requires some important decisions. The collections methods are all Object based, so anything coming or going through these interfaces will require wrapping/boxing. If PyArray were to use an ArrayList and fully box everything from both Java and Jython, its performance (its main reason for existing?) would take a major hit. My thinking is that this is not a viable option. This means there could be some difficult code to write to implement List, unless we use the specialized class mentioned above (but typed to the particular primitive array types).
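As a rough illustration of a specialized class typed to one primitive array type, the sketch below keeps the data in an `int[]` and boxes only when a value crosses the `List` interface. `IntArrayListView`, `getInt`, and `setInt` are hypothetical names, and the view is fixed-size for brevity (the inherited `add` and `remove` throw `UnsupportedOperationException`).

```java
import java.util.AbstractList;
import java.util.Arrays;
import java.util.Collections;

// Hypothetical List view over an int[]; storage stays primitive.
class IntArrayListView extends AbstractList<Integer> {
    private final int[] data;

    IntArrayListView(int[] data) { this.data = data; }

    // Primitive fast path for Jython-side code: no boxing involved.
    int getInt(int index) { return data[index]; }
    void setInt(int index, int value) { data[index] = value; }

    // java.util.List methods: box and unbox only at this boundary.
    @Override public Integer get(int index) { return Integer.valueOf(data[index]); }

    @Override public Integer set(int index, Integer value) {
        Integer old = Integer.valueOf(data[index]);
        data[index] = value.intValue();
        return old;
    }

    @Override public int size() { return data.length; }
}

class IntArrayListViewDemo {
    public static void main(String[] args) {
        int[] raw = {3, 1, 2};
        IntArrayListView view = new IntArrayListView(raw);
        // Java code can use ordinary Collections utilities on the view...
        Collections.sort(view);
        // ...and the change lands in the primitive array itself.
        System.out.println(Arrays.toString(raw)); // prints [1, 2, 3]
    }
}
```

One such class per primitive type would be needed, which is the "difficult code" cost mentioned above, but the array itself never holds boxed values.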

Other general improvements:

  • Eliminate deprecated methods?
  • Replace collection implementation types in signatures (Vector, HashSet, Hashtable) with interfaces (List, Set, Map)
  • Clean up the PyXRange copies field and deprecate repeat(), getSlice()


  • ClarkUpdike Feb 12 2005

    • [http://java.sun.com/j2se/1.3/docs/api/java/util/Collections.html java.util.Collections]: Is there a need to provide a jython version of this class (especially the synchronized and unmodifiable wrapper methods)? If it is not implemented and the java.util.Collections wrappers are used on new collection objects, the returned objects will only proxy for the java.util interface and will be broken in jython. There is also a PyImmutableSet in BrianZimmer's SetsModule, but I don't think there is an immutable dictionary equivalent.

    BrianZimmer Feb 13 2005

    • I'm inclined to say no. Consider this code:

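          import java.util.ArrayList;
          import java.util.Collections;
          import java.util.List;

          List<Integer> a = new ArrayList<Integer>();
          a.add(1);
          a.add(2);
          // b is a read-only view that delegates to a; it is not a copy.
          List<Integer> b = Collections.unmodifiableList(a);
          try {
            b.add(3);  // rejected: the view does not allow mutation
          } catch (UnsupportedOperationException e) {
            e.printStackTrace();
          }
          a.add(3);  // succeeds, and the change is visible through b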
      Without making a copy of the collection it is always possible to mutate the delegate. The Collections methods only wrap and delegate; they do not instruct the source to be immutable. In Python, this is handled through the use of two different classes: list and tuple. Consider this code:
          a = [1,2,3,4]
          b = tuple(a)   # copies the elements; b does not delegate to a
          a.append(5)
          print a, b     # [1, 2, 3, 4, 5] (1, 2, 3, 4)
      Since the tuple makes a copy of the list and does not delegate, the immutability is preserved. If an immutable Collection is really desired, copy it and make it immutable (this is really the true way in Java as well). This relieves Jython collections from concerning themselves with immutability because they will gladly offer up their contents through the standard Collection interface for the creation of a new, all Java Collection which can be used as the source.

      In the case of PyImmutableSet, it's part of the Python sets module, so it must be implemented. The relationship between mutable and immutable collections in Jython:

      ||'''Mutable'''||'''Immutable'''||
      ||PyList||PyTuple||
      ||PySet||PyImmutableSet||
      ||PyDictionary||none||

      Synchronization should follow the same approach.
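The "copy it and make it immutable" advice above can be sketched on the Java side as follows. `ImmutableCopyDemo` and `immutableCopy` are hypothetical names; the only library calls used are `Collections.unmodifiableList` and the `ArrayList` copy constructor.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ImmutableCopyDemo {
    // Copy first, then wrap: the result shares no storage with the source.
    static <T> List<T> immutableCopy(List<? extends T> source) {
        return Collections.unmodifiableList(new ArrayList<T>(source));
    }

    public static void main(String[] args) {
        List<Integer> a = new ArrayList<Integer>();
        a.add(1);
        a.add(2);

        List<Integer> view = Collections.unmodifiableList(a); // wraps a, no copy
        List<Integer> copy = immutableCopy(a);                // independent snapshot

        a.add(3); // mutate the original after both wrappers exist

        System.out.println(view); // prints [1, 2, 3]
        System.out.println(copy); // prints [1, 2]
    }
}
```

The same pattern works for any source that exposes the standard Collection interface, which is all a Jython collection would need to provide.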

CollectionsIntegration (last edited 2008-11-15 09:15:59 by localhost)