These are the Google "Summer of Code" projects involving Python that were funded in 2006.
Note: if a project is listed as having two mentors, the first mentor listed is the primary mentor, and the second one is the back-up mentor.
Improving the Python debugger
Student: Matthew J Fleming
Mentor: Rocky Bernstein
Many people have voiced their concern that the Python debugger could be improved in a number of ways. This project aims to incorporate these suggestions into an improved version of Pdb. These improvements are to allow,
- debugging from another process
- debugging remotely
- debugging a threaded application
The class that overrides Pdb and provides our enhancements is 'MPdb'.
None of the information below is set in stone and may change based on mine or Rocky's ideas and public feedback as we feel it is better to make some of the decisions (communication mechansim) open to change in the future or at least being able to make a decision later rather than sooner.
The design of these improvements has been of great importance and we have specifically concentrated on allowing future programmers to easily add to our work. For example, up until recently in the Python trunk, to get a Pdb session to send program output to some place other than stdout one had to override all of the methods in Pdb that used a 'print' statement. However, now, the version of Pdb in the trunk allows a programmer to specify both a stdin and a stdout stream. We will follow this example. The advantage of our design is that it is easily extensible and it aims to follow the gdb way to doing things.
Debugging communication techniques and ideas
We have discussed various communication mechanisms including sockets and serial communication. For debugging from another process it became clear (after looking at other debuggers) that most people choose to go with a 'socket-based' approach. The advantages of using this approach is that it is a tried and tested method that many debuggers use. One alternative to the socket approach is to use xmlrpc which allows the passing around of requests in a structured format, which may come in most handy when the debugger server is sending information to a debugger console. For instance, if someone were to write an IDE for MPdb, it would be trivial when using xmlrpc to allow the debugging server class to return a tagged message to the console which would indicate,
- Where in the source we currently were
- The state of some configuration variable on the server side
It became clear that the method of communication that is available to a programmer when debugging an application will change between projects. An embedded devices programmer working with python may not necessarily have access to a network card and thus, no access to network sockets. However, they may have access to a serial port on the embedded device and whilst this project may not provide code to allow communication over a serial line, it should be easy for a programmer to subclass MPdb and write one themselves, without having to undo or 'work around' any of the features or specifications or MPdb.
Base multidimensional array type for Python core
Mentor: Travis E. Oliphant
More information: http://scipy.org/BaseArray
The goal is to prepare a simple, generic multidimensional array interface that can be readily placed in the Python core as a new built-in base type (called, for instance, 'dimarray'), and included in a future Python distribution (possibly 2.6?). This new base type will have the same C-structure as the current array implementation in numpy and will be based on an interface recently formulated by Travis Oliphant within a Python Enhancement Proposal (http://svn.scipy.org/svn/PEP/). After preparing a 'ready to insert' version of the array interface, it will be applied to numpy and several other packages that work with multidimensional data, and possibly modified in order to work out an optimal scope.
Logging Usage in the Standard Library (PEP 337)
Student: Jackilyn Hoxworth
Mentor: James Joseph Jewett
The purpose of this project is to create a standard for using the logging system in the standard library. This will simplify development of daemon applications. This requires slight modifications to a large number of standard modules, which will be made in a back-portable way. After the completion of this project one can use the following filtering scheme: logging.getLogger('py.<module>').setLevel(logging.<LEVEL>).
Decimal module in C.
Student: Mateusz Rukowicz
Mentor: Facundo Batista
Adding C implementation of decimal module, which eventually would replace current implementation, with no side effects to applications using this module.
Rewrite of the zipfile module.
Student: Nilton Volpato
Mentor: Ilya Etingof
The project is to write a new and updated version of python's zipfile module for dealing with ZIP files with much more features than those currently supported by zipfile module.
These changes are intended to overcome the current limitations of the module, and include, but are not limited to:
- Add support for bzip2 (de)compression
- Add support for removing files stored in zip files
- Add file-like object API to ZIP archive members
- Add support for the traditional PKWARE encryption
Student: Maciej Fijalkowski
Mentor: Eric van Riet Paap
Complete gencli, the PyPy CLI backend
Student: Antonio Cuni
Mentor: Armin Rigo
gencli is the PyPy backend targeting the Common Language Infrastructure (CLI) virtual machine, i.e. .NET.
Once gencli will be complete it will be possible to translate the Standard Interpreter in order to produce a .NET executable of the Python interpreter (we could call it PyPy.NET).
The result will be in some way similar to the existing IronPython, with the difference that IronPython is a compiler (though most of the job is done by the runtime environment), while PyPy.NET will be an interpreter.
PyPy Proposal - Write and port modules from CPython with ctypes
Student: Lawrence Oluyede
Mentor: Anders Chrigstroem
I'd like to make PyPy ready for encrypted communications adding SSL support in the ongoing and in-development socket module. The other modules I'll port are: bz2, fcntl, mmap and time (very fundamental). If there's time I'd like to start working on 'os' and 'select' modules also.
Neural Nets in SciPy
Student: Frederic Mailhot
Mentor: Robert Kern
The goal of this project is to extend SciPy's functionality by adding modules for the design, training and use of a variety of neural network architectures, including standard feedforward and recurrent networks, among others. As a guide I intend to work from the modules in Matlab's Netlab toolkit, as well as from my own experience implementing recurrent networks.
Enhancements to mathtext (part of matplotlib) - a Python package for typesetting
Student: Edin Salković
Mentor: John D Hunter
- Allow replacing of the existing bakoma truetype fonts with a set of good, comprehensive, unicode math fonts. This will allow pluging in the STIX fonts when (if) they are finaly out. Currently mathtext hardcodes the mapping from TeX symbol name to a (font_file, glyph_index) tuple, which ties mathtext to a given set of fonts (eg, the Bakoma fonts).
- incorporate some of Knuth's layout algorithms into the mathtext layout engine.
- refactor mathtext into a stand-alone module
- add support for kerning - the current bakoma fonts do not have kerning info in them.
- improve the parser to handle more TeX
- add support for fractions (\frac), arrays etc.
Soya3D Collision API : Improving ODE integration in the core
Student: David Pierre-Yves
Soya lacks some very useful tools such as a collision API or a properly integrated physics engine.
I am applying to Summer of Code 2006 in order to fix these deficiencies by improving support for the Open Dynamics Engine, which is barely working and too low-level in the last release of Soya, and provide a similar collision detection system to all objects whether ODE is in use or not.
PySoy & Cal3D exporting/importing tools for Blender
Student: Palle Raabjerg
Mentor: Buddha Michael Dylan Buck
Browse Source: http://www.pysoy.org/browser/trunk/blender-soy
Currently, no PySoy importers exist for Blender. This means you cannot import any Soy or Cal3D models into Blender. This can be inconvenient for both the game developer, as he needs to publish the .blend files to allow people to modify models, and for a gamer community surrounding a project, as they can't modify models unless the developer publishes the .blend files. (Note: This might be a perfectly okay and even preferable situation for some proprietary game developers, but PySoy is built on a philosophy of openness and freedom, making this a more important feature.)
The PySoy exporter tools basically consists of the blender2soya and blender2cal3d scripts. blender2soya is currently used for static (non-animated) models, and blender2cal3d is used for exporting animated skeletal models to the Cal3D format, which can also be used with PySoy. These two scripts lack many features, however, and still contain known flaws.
There are talks about a redesign of the format Soya3D uses for models. In this case, I would be involved in this redesign, and the blender2soya script would have to be rewritten.
If I am assigned to this project, I will create PySoy and Cal3D importers for Blender, and I will work on both exporter scripts
Due to an inconviently-timed project fork, this project has been moved to PySoy (was formerly for Soya3D). Beside the name of the project, all aspects remain the same and the code will be available for the Soya3d project and any other project which needs it.
Pygame on ctypes
Student: Alex Holkner
Mentor: Pete Shinners
My project proposal is to rewrite Pygame to use ctypes. The current implementation is written as a C module that links to SDL. The proposed addition of ctypes to Python 2.5 is a great catalyst for using it to wrap SDL and reimplementing Pygame in pure Python. This would allow developers to extend Pygame with much more ease than is currently possible, and to make use of SDL features not exported by Pygame, and to give PyPy development another library.
SQLAlchemy Schema Migration
Student: Evan Pierce Rosson
Mentor: Jonathan LaCour
SQLAlchemy is an excellent object-relational database mapper for Python projects. Currently, it does a fine job of creating a database from scratch, but provides no tool to assist the user in modifying an existing database. This project aims to provide such a tool.
Drop-in WSGI support for Commodity Hosting
Student: Jonathan Rosebaugh
Mentor: Benjamin C. Bangert
Tentative Pythonesque Project Name: Holy Grail
I intend to take advantage of existing developments in WSGI such as Paste, flup, and various WSGI-compatible frameworks in order to develop a simple drop-in method for hosting WSGI webapps in commodity hosting.
Project Goals in rough order of priority
WSGI Admin Panel - A self-contained WSGI app which allows the user to attach various WSGI apps to different urls. At its most basic it will be a simple wrapper around paste.urlmap, but it will be able to handle more complex situations, involving such things as paste.cascade, paste.urlparser, various middleware, and other complex situations. When I can use PasteDeploy I will, but there will have to be support for manual importing of apps from modules.
- FastCGI made easy - flup has taken us most of the way, but it needs to be friendlier. Some hosting providers want the FastCGI process to be running on a specified port; some providers want the web server to spawn it. Either option should be easy to do, with robust process management to make sure the FastCGI process stays up.
- Ease of installation for webhosts - Ideally, a commodity webhost should be able to provide this package for their customers; just click a button in the signup and the WSGI admin panel is deployed to the user's account, creating an isolated user environment using workingenv and setuptools.
- Ease of installation by end users - However, until this has widespread support by webhosts, the user needs a quick and simple install process. We'll have a simple installer--command line for those who have shell access, python cgi, and maybe even a php-based installer--that lets the user set up their WSGI package using workingenv and setuptools.
PyDev - Python for Eclipse
Student: Sean Handley
Mentor: Fabio Zadrozny
Project Blog: http://www.planet-soc.com/?q=blog/114
My objectives for PyDev are to implement as many of the most popular requested features as I can, bringing the power of Python development in Eclipse up to the same standards as Java. Specifically, I will be adding new features to the debugger and code editor.
Coder: An extensible web-based programming tutor
Student: Johannes Woolard
Mentor: André Roberge
I propose a system that serves web-pages locally over the loopback address – this could maybe be extended later to work remotely if “restricted python” is revived.
Tutorials are really the most important part of this application: not only must they be simple to create, they must also be fun for the students. I am proposing a system whereby tutorials are encapsulated in XML and contain not only the raw tutorial (which will be pretty-printed to XHTML using XSLT, which in turn could be personalised with CSS), but also unit tests written in python that must be satisfied for the student to advance to the next stage. I will base my work on Andre Roberge's Crunchy Frog.
Improving Mailman's User Experience
Student: Ethan Fremen
Mentor: Barry Warsaw
I propose to port Mailman's web interface to one implemented with Kid in order to improve the UI and enable better handling of i18n.
Memory Optimization in Shed Skin
Student: Zheng Siyao
Mentor: Mark Dufour
Shed Skin is an optimizing Python-to-C++ compiler, that accepts pure, but implicitly statically typed Python programs. It already converts many 'smallish' programs, such as a ray tracer, chess player, several sudoku solvers, etc., with an average relative speedup of about 12 over Psyco, and about 45 over CPython. However, Shed Skin is still at an early release, and there are several integration issues that need to be solved. For example, programs can currently only use several supported standard library functions. See http://mark.dufour.googlepages.com for more details.
Shed Skin currently only performs a type inference analysis, prior to generating code. But there are several interesting optimizations that can be performed using the resulting type information. A very promising category is memory optimizations. The student will implement two such techniques. One is to transform as much heap allocation as possible into stack allocation. The other is to transform as much heap allocation as possible into static preallocation (and possibly to share preallocated memory if possible).
The deliverables are stack and static preallocation techniques with performance at least matching that of earlier prototype techniques for the set of benchmarks described in Dufour's thesis, and with superior performance for other/more extreme demonstration programs.
YAML parser and emitter
Student: Kyrylo Simonov
Mentor: Clark C. Evans
I have developed a pure Python YAML parser, which is, as for now, the only existing YAML parser that fully adheres the current YAML 1.1 specification. I would like to apply to the Google Summer of Code program in order to continue my work on implementing a YAML parser and emitter.
My proposal is to write a C library for processing YAML and Python bindings to it. While existing PyYAML parser is stable and robust, being a pure Python code, it is slow and not suitable for processing large amounts of data. Additionally, having a C library will allow bindings for other languages as well, which will increase compatibility between various implementations and serve the goal of YAML: to be a language-independent format for serializing data.
I also plan to progress on another YAML-related project: Writing a reference YAML parser and emitter, alternative grammar rules for YAML that matches the way the parser works and a set of guidelines for parser writers. The language of implementation of the reference parser should be Python without using its advanced language-specific features. This will ensure that a proficient programmer can write a YAML parser and emitter in her language of choice and will greatly increase the adoption of YAML. I do not expect to complete the reference parser during the period of SoC though.
Web-based administration interface for DrProject
Student: Gregory Lapouchnian
Mentor: Greg Wilson
Trac / Repository: http://pyre.third-bit.com/drproject/drproject-dev/
Currently the administration of a DrProject installation has to be done through a command line script. This works well in certain situations where a professor is able (or willing) to write scripts to make use DrProject's API in order to execute batch operations. However a web-based administration interface would greatly simplify course management by providing a central location for common tasks such as removing a student from a course or moving them from one project to another. A web-based console will also make it possible to present to the instructors some general statistics for the groups and students in their course in a convenient format.
Cheesecake enhancements and its integration with PyPI.
Student: Michał Kwiatkowski
Mentor: Grig Gheorghiu
Project timeline and ideas: Cheesecake - feel free to add your comments about the project there!
Cheesecake is an application designed to evaluate and estimate the overall quality (or so called 'kwalitee') of a given software package written in Python. It emphasizes a need for well-written documentation and unit tests, encouraging good programming practices and penalizing sloppy design and careless distribution. Using Cheesecake to check your code gives you confidence that your software doesn't merely run, but is usable and easy to test and modify as well.
Because Python is very easy to learn and use there exists a vast variety of software written in it, most of which was scattered until PyPI was created. Now, when new packages are being indexed on Cheese Shop every day, an effort can be made to spread the spirit of good software design and code reuse among the Python community. This can be achieved by combining the power of Cheesecake and Cheese Shop. Everytime a new version of a package would be uploaded to Cheese Shop, its cheesecake index will be calculated and published on web. Having a way to measure a quality of a package with accordance to other existing packages will be of invaluable help for all developers. It will promote well built packages and in the long run raise the overall quality of Python software.
Adding Cheesecake functionality to PyPI has been already mentioned by Phillip J. Eby on the catalog-sig mailing list. Together with Cheesecake maintainer Grig Gheorghiu we've discussed modifications needed to be done to Cheesecake code to be reliable enough so it could be incorporated into PyPI service. A working copy of our ideas is accessible on the project wiki. It includes enhancing Cheesecake code scoring techniques to take into account unit tests of a package, running tests in secure environment, extending supported archive formats and fixing all known bugs. Development of Cheesecake will adhere to best practices such as unit testing, continuous integration (via buildbot), pylint verification, etc.
The next part of this project will include collaboration with Richard Jones, PyPI maintainer, and merging Cheesecake into PyPI service. Upon completion all PyPI uploads will be automatically scored by Cheesecake. It will be possible to browse packages archive by cheesecake index, sorting results by installability, documentation and code kwalitee index. Statistics in numeric and graphical form will also be made available. This part of a project will involve writing server-side code, with emphasis on security and robustness.
The remaining time will be spent on resolving all problems that would occur during usage of Cheesecake and PyPI. Along with fixing bugs, I will develop a simple Hello world package that can be taken as an example of good development practices for all Python developers. It should also score 100% in the Cheesecake test of course. It will be what hello is for GNU Project.
PyCells: Port of CLOS Cells extension
Student: Ryan Forsythe
Mentor: James Tauber
Summary: Python port of Kenny Tilton's Cells extension to CLOS
Cells is a framework which, rather than defining a class as a set of methods and attributes, instead defines it as a set of interdependent slots, each of whose values are determined by formulas. It is analogous to a spreadsheet. While it is similar to Python's idea of properties, elements of a class which act like values but which are generated by a method, cells build in an automemoization function so their results are recalculated only when needed. My task for Summer of Code would be to build a package which implemented Cells in Python, built regression testing for the package, and translate CLOS Cell's demo applications into PyCell.
Python interface for writing Mozilla (NPAPI) plugins
Student: Marcin Pertek
Mentor: Mark Hammond
Source Browser: http://www.pysoy.org/browser/trunk/pymplug/
Writing plugins with the NPAPI requires using the cumbersome and error-prone C or C++. I propose a thin binding of the interface to Python, by the means of a plugin embedding a Python interpreter and exposing inside it the NPAPI. It does not imply executing Python code from webpages, simply a way to write plugins in Python instead of C.
In addition to ease the creation of plugins, I would make bindings to readily enable the use of wxPython, PyGTK, PyGame and Soya3D. These bindings would enable those libraries to handle the in-browser window and associated events.
Support Vector Machines for SciPy
Student: Albert Strasheim
Mentor: Dave Kammeyer
Integrating libsvm (and possibly other SVM libraries) with SciPy using ctypes and providing various tools for manipulating datasets for use with SVMs.