Differences between revisions 1 and 2
Revision 1 as of 2006-01-11 02:02:42
Size: 2651
Comment:
Revision 2 as of 2006-01-11 04:02:15
Size: 4921
Editor: dsl092-068-248
Comment: Add several more existing problems

While a lot of effort has gone into making it difficult or impossible to crash the Python interpreter in normal usage, there are still plenty of fairly easy ways to crash it. The BDFL pronounced recently on the python-dev mailing list:

    I'm not saying it's uncrashable. I'm saying that if you crash it, it's a
    bug unless proven harebrained.

I thought it might be worthwhile to document some ways the interpreter can be crashed so that people can learn where they need to tread lightly.

Bogus Input

Through Python 2.4 you could crash the interpreter by redirecting stdin from a directory:

    % python2.4 -c 'import sys ; print sys.version'
    2.4.1 (#3, Jul 28 2005, 22:08:40) 
    [GCC 3.3 20030304 (Apple Computer, Inc. build 1671)]
    % python2.4 < .
    Bus error

Starting with 2.5 the interpreter notices and aborts:

    % python2.5 -c 'import sys ; print sys.version'
    2.5a0 (41847M, Dec 29 2005, 22:21:03) 
    [GCC 3.3 20030304 (Apple Computer, Inc. build 1671)]
    % python2.5 < .
    Fatal Python error: <stdin> is a directory
    Abort trap
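
A rough sketch of this kind of check, written from Python rather than taken from the interpreter's C source: os.fstat reports what a file descriptor is attached to, so a program can refuse a directory before trying to read from it. The helper names fd_is_directory and refuse_directory_stdin are made up for illustration, and the sketch assumes Python 3 on a POSIX system.

```python
import os
import stat
import sys


def fd_is_directory(fd):
    """Return True if the open file descriptor refers to a directory."""
    return stat.S_ISDIR(os.fstat(fd).st_mode)


def refuse_directory_stdin():
    """Exit with a message instead of reading from a directory on stdin."""
    if fd_is_directory(sys.stdin.fileno()):
        sys.exit("stdin is a directory")
```

Calling refuse_directory_stdin() at the top of a script would turn the "< ." case into an ordinary error exit.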

Some modules also do not rigorously check the data they operate on. The marshal module can crash the interpreter when given certain strings:

    $ python
    Python 2.4.2 (#2, Sep 30 2005, 21:19:01) 
    [GCC 4.0.2 20050808 (prerelease) (Ubuntu 4.0.1-4ubuntu8)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import os, marshal
    >>> while 1:
    ...     try:
    ...         marshal.loads(os.urandom(16))
    ...     except:
    ...         pass
    ... 
    Segmentation fault
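
One way to tread lightly around input like this is to run the risky call in a child interpreter, so that a crash takes down only the child. The following is a minimal sketch under assumptions: it targets Python 3, the names run_isolated and FUZZ_SOURCE are hypothetical, and the iteration count is arbitrary.

```python
import subprocess
import sys


def run_isolated(source, timeout=30):
    """Run Python source in a child interpreter and return its exit status.

    On POSIX, a negative status -N means the child was killed by signal N
    (for example -signal.SIGSEGV after a segmentation fault).
    """
    completed = subprocess.run(
        [sys.executable, "-c", source],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        timeout=timeout,
    )
    return completed.returncode


# Bounded variant of the fuzzing loop above (hypothetical iteration count).
FUZZ_SOURCE = """
import os, marshal
for _ in range(10000):
    try:
        marshal.loads(os.urandom(16))
    except Exception:
        pass
"""
```

run_isolated(FUZZ_SOURCE) returns 0 if the child survives the loop; a negative value suggests the child died from a signal while the parent kept running.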

Exhausting Resources

There are a number of areas where resource exhaustion can crash the interpreter. Here is one that is fairly easy to demonstrate:

    % python
    Python 2.5a0 (41847M, Dec 29 2005, 22:21:03) 
    [GCC 3.3 20030304 (Apple Computer, Inc. build 1671)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import sys
    >>> sys.setrecursionlimit(1<<30)
    >>> f = lambda f:f(f)
    >>> f(f)
    Segmentation fault
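
For contrast, here is a sketch of what happens when the recursion limit is left at its default. It assumes Python 3, where the overflow surfaces as RecursionError (a subclass of RuntimeError); the function name runaway_recursion_outcome is hypothetical.

```python
def runaway_recursion_outcome():
    """Run the self-calling lambda without raising the recursion limit.

    Returns the name of the exception that stopped the recursion.
    """
    f = lambda f: f(f)  # calls itself until something stops it
    try:
        f(f)
    except RuntimeError as exc:  # RecursionError subclasses RuntimeError
        return type(exc).__name__
    return None
```

The point of the sketch is that the limit converts runaway recursion into a catchable exception; raising it to 1<<30, as in the session above, removes that protection.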

A slightly subtler example gets the interpreter to exhaust an internal resource during a single operation. Here the crash comes at del f, most likely because tearing down a million nested wrapper objects recurses too deeply in C:

    $ python
    Python 2.4.2 (#2, Sep 30 2005, 21:19:01) 
    [GCC 4.0.2 20050808 (prerelease) (Ubuntu 4.0.1-4ubuntu8)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> f = lambda: None
    >>> for i in xrange(1000000):
    ...     f = f.__call__
    ... 
    >>> del f
    Segmentation fault
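
To see the structure that loop builds, the sketch below repeats it with a deliberately small count and walks the result back to the original function. It assumes Python 3, where each f.__call__ is a wrapper object exposing the wrapped object as __self__; the helper name wrapper_chain_depth is hypothetical.

```python
import types


def wrapper_chain_depth(obj):
    """Count how many __call__ wrappers sit between obj and a plain function."""
    depth = 0
    while not isinstance(obj, types.FunctionType):
        obj = obj.__self__  # the object this wrapper was taken from
        depth += 1
    return depth


f = lambda: None
for _ in range(100):  # deliberately small; the session above used 1000000
    f = f.__call__
```

Each wrapper keeps the previous one alive, so the chain is as deep as the loop count, and releasing the outermost wrapper has to release every layer beneath it.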

GC/weakref interaction

Interaction between the garbage collector and weak references has historically been a sticking point for CPython. At least one problem remains in Python 2.4.2:

    $ python
    Python 2.4.2 (#2, Sep 30 2005, 21:19:01) 
    [GCC 4.0.2 20050808 (prerelease) (Ubuntu 4.0.1-4ubuntu8)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import weakref
    >>> ref = None
    >>> class Target:
    ...   def __del__(self):
    ...       global ref
    ...       ref = weakref.ref(self)
    ... 
    >>> def g():
    ...   w = Target()
    ...   w = None
    ...   print ref()
    ... 
    >>> g()
    Segmentation fault
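
For comparison, this is the behaviour the two systems are meant to provide when no __del__ method interferes: a weak reference yields its referent while the object is alive and None once the collector has reclaimed it, even when the object was only reachable through a reference cycle. A minimal sketch assuming CPython 3; the Node class and function name are hypothetical.

```python
import gc
import weakref


class Node(object):
    """Hypothetical class that can take part in a reference cycle."""

    def __init__(self):
        self.partner = None


def cycle_is_cleared():
    a, b = Node(), Node()
    a.partner, b.partner = b, a  # reference cycle: refcounts never reach zero
    probe = weakref.ref(a)
    alive_before = probe() is a
    del a, b
    gc.collect()  # the cycle collector reclaims both nodes
    return alive_before, probe() is None
```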

Dangerous Modules

Some modules are designed to allow programmers access to the guts of things. Naturally, they also give programmers the opportunity to shoot themselves in the foot. Here are a few.

  • The new module allows you to construct various types of objects that normally can't be explicitly created from the interpreter. You can, for example, create code objects and give them arbitrary strings as their "bytecode". There's no telling how successfully the interpreter will handle such abuses.

  • The dl module is available on many Unix systems. It provides an interpreter-level interface to the dlopen() function, giving you dynamic access to the functions in arbitrary shared libraries. No checks are performed on the arguments to the functions you call. Hilarity can thus ensue. (The ctypes module, under consideration for inclusion in Python 2.5, provides similar functionality.)

  • The buffer builtin can also be dangerous, since it notionally claims a reference to a range of memory, but does so without going through a Python object or using the standard Python refcount system. This is visible when, for example, constructing a buffer from an array.array, then resizing the array such that it internally realloc()s its storage, moving the memory in the process. The buffer will now refer to an invalid pointer.
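
The buffer hazard in the last item can be contrasted with Python 3, which has no buffer builtin. There, a memoryview over an array.array counts as an export of the array's storage, and the array refuses to resize while an export is outstanding. The sketch below illustrates that later behaviour, not the 2.x buffer object; the function name is hypothetical.

```python
import array


def resize_while_exported():
    data = array.array("b", [1, 2, 3])
    view = memoryview(data)  # exports the array's buffer
    try:
        data.append(4)  # would need to resize the storage
    except BufferError:
        blocked = True
    else:
        blocked = False
    view.release()  # drop the export
    data.append(4)  # resizing is allowed again
    return blocked, len(data)
```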

CrashingPython (last edited 2014-02-10 15:07:35 by JoakimEk)
