These are straightforward, finite-effort coding projects.
The Py_VISIT() macro in objimpl.h was introduced to make coding of most tp_traverse slots straightforward, uniform, and obviously correct. For example, see cycle_traverse() in itertoolsmodule.c. Most older modules that define tp_traverse copy/paste/edit the tedious callback dance by hand, and several even define their own work-alike macros. These should be rewritten to use the standard Py_VISIT macro.
The Py_CLEAR macro in object.h was introduced to make coding of "decref and NULL out a containee pointer" operations safe. The reasons this is important but tricky to achieve are explained in a comment block before that macro in the current trunk source. tp_clear and even tp_dealloc slot implementations should generally use Py_CLEAR now.
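As a starting point for finding code to convert, here is a rough sketch of a scanner that flags hand-written visit(..., arg) calls (candidates for Py_VISIT) and a decref immediately followed by NULLing the same expression (candidates for Py_CLEAR). The function name find_macro_candidates and both regular expressions are illustrative assumptions, not part of any existing tool; the patterns are heuristics and will miss or over-report some cases, so every hit still needs a manual look.

```python
import re

# Heuristic: a hand-written traversal call such as
#   err = visit(self->item, arg);
# optionally with a (PyObject *) cast on the first argument.
MANUAL_VISIT = re.compile(
    r"\bvisit\s*\(\s*(?:\(\s*PyObject\s*\*\s*\)\s*)?"
    r"(?P<expr>[^,()]+?)\s*,\s*arg\s*\)"
)

# Heuristic: Py_DECREF/Py_XDECREF of an expression, immediately followed
# by assigning NULL to that same expression.
DECREF_THEN_NULL = re.compile(
    r"Py_X?DECREF\(\s*(?P<expr>[^()]+?)\s*\)\s*;\s*(?P=expr)\s*=\s*NULL\s*;"
)


def find_macro_candidates(source):
    """Return sorted (line_number, suggested_macro, expression) tuples."""
    hits = []
    patterns = (("Py_VISIT", MANUAL_VISIT), ("Py_CLEAR", DECREF_THEN_NULL))
    for macro, pattern in patterns:
        for match in pattern.finditer(source):
            # Line numbers are 1-based, counted from the start of the text.
            line_number = source.count("\n", 0, match.start()) + 1
            hits.append((line_number, macro, match.group("expr")))
    return sorted(hits)
```

Run it over the text of one C file at a time, for example find_macro_candidates(open(path).read()), and review each reported line by hand before rewriting it.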
Clean up compiler warnings with icc
There are many 64-bit warnings produced by icc when using the -Wp64 flag. icc is freely available with a one-month trial license. NealNorwitz has a license and can make the warnings available on the web if anyone is interested in getting rid of these warnings.
Make modules Py_ssize_t clean
Some modules don't use Py_ssize_t. They need a code review. Any module which declares an int (rather than a Py_ssize_t) for size and passes its address to PyArg_ParseTuple(args, "s#", &str, &size) needs to be updated. An example of a module that has already been updated is Modules/_codecsmodule.c.
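To help with the review, the check described above can be roughly automated. The sketch below is an assumption-laden helper, not an existing tool: the name find_int_lengths and both patterns are made up for illustration. It collects every variable declared as a plain int anywhere in the file, then reports PyArg_ParseTuple calls with a '#' in the format string that take the address of one of those variables. It does not track function scope, so expect some false positives.

```python
import re

# A PyArg_ParseTuple (or ...AndKeywords) statement whose format string
# contains a '#' length code. "rest" is everything after the format string.
PARSE_CALL = re.compile(
    r'PyArg_ParseTuple(?:AndKeywords)?\s*\([^;]*?'
    r'"(?P<fmt>[^"]*#[^"]*)"(?P<rest>[^;]*);'
)

# A simple declaration such as "int size;" or "int size = 0, n;".
INT_DECL = re.compile(r"\bint\s+(?P<names>[^;()]+);")
IDENTIFIER = re.compile(r"[A-Za-z_]\w*")


def find_int_lengths(source):
    """Return (line_number, name) for int variables passed to a '#' format."""
    int_names = set()
    for decl in INT_DECL.finditer(source):
        for part in decl.group("names").split(","):
            name = part.split("=")[0].strip()
            # Skips pointers and arrays, which are not plain identifiers.
            if IDENTIFIER.fullmatch(name):
                int_names.add(name)

    flagged = []
    for call in PARSE_CALL.finditer(source):
        line_number = source.count("\n", 0, call.start()) + 1
        for name in re.findall(r"&\s*([A-Za-z_]\w*)", call.group("rest")):
            if name in int_names:
                flagged.append((line_number, name))
    return flagged
```

Each reported name is a candidate for being redeclared as Py_ssize_t; whether the change is correct still depends on how the variable is used afterwards.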
Check for consistent memory API usage
Verify that if PyMem_* APIs are used to (re)allocate memory, PyMem_* APIs are also used to free it. Same deal with PyObject_* APIs. (I.e., ensure that PyMem_* and PyObject_* memory APIs aren't mixed.)
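A crude first pass at this check is to count allocation and release calls per API family in each file and flag any family that has one without the other. The sketch below assumes that per-file counting is a useful approximation; the function name suspicious_families is hypothetical, and only the raw-memory functions are covered (not the PyObject_New and PyObject_Del object allocators). It cannot prove that a particular pointer is freed with the wrong API, it only points at files worth reading.

```python
import re

# Raw-memory allocation calls, grouped by API family.
ALLOCATORS = {
    "PyMem": re.compile(
        r"\bPyMem_(?:Malloc|Realloc|MALLOC|REALLOC|New|NEW|Resize|RESIZE)\b"
    ),
    "PyObject": re.compile(r"\bPyObject_(?:Malloc|Realloc|MALLOC|REALLOC)\b"),
}

# Matching release calls for each family.
RELEASERS = {
    "PyMem": re.compile(r"\bPyMem_(?:Free|FREE|Del|DEL)\b"),
    "PyObject": re.compile(r"\bPyObject_(?:Free|FREE)\b"),
}


def suspicious_families(source):
    """Map family name to (alloc_count, free_count) when only one side appears."""
    report = {}
    for family in ALLOCATORS:
        allocs = len(ALLOCATORS[family].findall(source))
        frees = len(RELEASERS[family].findall(source))
        # Flag a family that is allocated from but never freed, or the reverse.
        if (allocs == 0) != (frees == 0):
            report[family] = (allocs, frees)
    return report
```

A file that allocates with PyMem_Malloc and releases with PyObject_Free shows up with both families flagged, which is the mix this task is meant to catch. Memory handed off to or received from another file will also be flagged and has to be checked by hand.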
Verify all int/long C APIs are correct
With the conversion to use Py_ssize_t, it's important that we didn't miss any APIs. There are very few APIs which take (or return) a long. But there are still quite a few that take/return an int. All of these are believed to be correct, but more reviewers could help. Any API which returns an int is fine if the value is known to fit in 32 bits (like APIs that return a value between -2 and 2).
Update Demo/ and Tools/ directories
PEP 356 says "Check the various bits of code in Demo/ all still work, update or remove the ones that don't."
Directories to check:
cgi classes comparisons curses embed imputil md5test metaclasses newmetaclasses parser pdist pysvr rpc scripts sockets threads tix tix/bitmaps tix/samples tkinter tkinter/guido tkinter/matt xml xmlrpc zlib
Create tests for sequences over 2GB
It would be great if someone could try to put together some tests for bigmem machines. The tests should be grouped by memory requirement: those that require 2+ GB, those that require 4+ GB, etc. Many people won't have boxes with that much memory. Ultimately, the tests should be incorporated into the test suite with a special designation similar to -u all. I'm thinking -G # where # is the size of RAM in GB.
The test cases should test all methods (don't forget slicing operations) at boundary points, particularly just before and after 2GB. Strings are probably the easiest. There's unicode too. Lists and dicts are good candidates as well, but they will take more than 16 GB of RAM, so those can be pushed out some.
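A minimal sketch of what a boundary test for strings might look like, assuming nothing about the eventual test-suite integration. The names boundary_sizes, check_string_at and run_boundary_checks are hypothetical, the available_gb parameter only mimics the "-G #" idea above, and the factor of two in the memory check is a rough guess at the cost of temporaries, not a measured figure. The limit parameter exists so the same checks can be tried at a small size on ordinary machines.

```python
TWO_GB = 2 ** 31  # the 2GB boundary discussed above, in bytes


def boundary_sizes(limit=TWO_GB):
    """Sizes just before, at, and just after the limit."""
    return [limit - 1, limit, limit + 1]


def check_string_at(size):
    """Exercise a few operations on a string of exactly `size` characters."""
    s = "a" * (size - 1) + "b"
    assert len(s) == size
    # Indexing from both ends must agree at the far end of the string.
    assert s[-1] == "b" and s[size - 1] == "b"
    # Slicing near the end and from the front.
    assert s[size - 2:] == ("ab" if size >= 2 else "b")
    assert len(s[1:]) == size - 1
    # Methods that return positions or counts close to `size`.
    assert s.find("b") == size - 1
    assert s.count("a") == size - 1


def run_boundary_checks(available_gb, limit=TWO_GB):
    """Run the checks for every size that fits the budget; return sizes run."""
    budget = available_gb * 2 ** 30
    tested = []
    for size in boundary_sizes(limit):
        # Assumed allowance: the string plus one temporary of similar size.
        if size * 2 <= budget:
            check_string_at(size)
            tested.append(size)
    return tested
```

For example, run_boundary_checks(8) would attempt all three sizes around 2GB on a machine with 8 GB to spare, while run_boundary_checks(1) skips them all. The same shape could be repeated for unicode, and later for lists and dicts with a larger memory budget.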