Differences between revisions 15 and 41 (spanning 26 versions)
Revision 15 as of 2008-03-08 16:56:58
Size: 3798
Editor: dslb-084-056-013-026
Comment: Link fix
Revision 41 as of 2012-01-11 01:11:25
Size: 5829
Editor: c-66-41-60-82
Comment: Add link to PyXB
Deletions are marked like this. Additions are marked like this.
Line 3: Line 3:
A variety of XML processing solutions are available for Python. This page attempts, at the very least, to list the most actively developed or most easily available. A variety of XML processing solutions are available for Python. This page attempts to list the major tools.

== Packages in the
standard library ==
Line 7: Line 9:
 * a pythonesque, simple-to-use and very fast XML tree library:
   * ElementTree - the [[http://docs.python.org/library/xml.etree.elementtree.html | xml.etree]] package (new in Python 2.5 but available for older versions, also see the fast xml.etree.cElementTree and the independent implementation [[http://lxml.de/|lxml]])
 * event-driven XML parsers:
   * ElementTree's [[http://docs.python.org/library/xml.etree.elementtree.html#xml.etree.ElementTree.iterparse | iterparse()]] - a fast and easy-to-use event-driven parser with a high-level XML tree interface
   * pyexpat - a fast, low-level XML parser with an event-based callback interface
   * [[Sax]] - the xml.sax package, a Python implementation of the well-known low-level SAX API
Line 10: Line 18:
 * an event-driven XML parser compatible with the W3C SAX standard:
   * ["Sax"] - the xml.sax package
 * a pythonesque XML tree library:
   * ElementTree - the xml.etree package (new in Python 2.5)
Line 15: Line 19:
The DOM and SAX packages have the advantage of being compatible to W3C standard APIs, so users who are already familiar with these APIs can use them without learning too many new things. Everyone else should start with the more pythonic ElementTree library, which is very well integrated into the Python language, and therefore very easy to learn and use. The DOM and SAX packages have the advantage of being compatible with standard or de facto standard APIs, so users who are already familiar with these APIs can use them without learning too many new things. Everyone else should start with the faster and more pythonic ElementTree library, which is very well integrated into the Python language, and therefore very easy to learn and use.
Line 19: Line 23:
A long list of special purpose and general purpose Python XML packages is available from [http://pypi.python.org/pypi?:action=browse&show=all&c=500 PyPI]. The following is a choice of major tools that support a broader set of XML features. A long list of special purpose and general purpose Python XML packages is available from [[http://pypi.python.org/pypi?:action=browse&show=all&c=500|PyPI]]. The following is a choice of major tools that support a broader set of XML features.
Line 23: Line 27:
 * [http://uche.ogbuji.net/tech/4suite/amara/manual Amara] - Amara provides tools you can trust to conform with XML standards without losing the familiar Python feel
 * [http://codespeak.net/lxml/ lxml] - a pythonic, ElementTree-compatible binding for the libxml2 and libxslt libraries that comes with all sorts of powerful XML tools, well integrated into an easy-to-use Python API
 * [http://codespeak.net/lxml/objectify.html lxml.objectify] - a Python object API for XML based on lxml
 * [[http://lxml.de/|lxml]] - a pythonic, ElementTree-compatible binding for the [[http://xmlsoft.org/|libxml2]] and [[http://xmlsoft.org/XSLT/|libxslt]] libraries that comes with all sorts of powerful XML (and HTML) tools, well integrated into an easy-to-use Python API
 * [[http://lxml.de/objectify.html|lxml.objectify]] - a Python object API for XML based on lxml
 * [[http://pyxb.sourceforge.net/|PyXB]] - generates Python classes/modules that correspond to data structures/namespaces defined by XMLSchema, with validation
 * [[http://pyxsd.org/|PyXSD]] - an XML Schema mapping too (somewhat dated, last released in 2006)
 * [[http://www.rexx.com/~dkuhlman/generateDS.html|generateDS]] - generates Python data structures (for example, class definitions) from an XML Schema document
 * [[http://xml3k.org/Amara|Amara 2.x]] - Amara provides tools you can trust to conform with XML standards without losing the familiar Python feel
Line 29: Line 36:
 * [http://4suite.org/ 4Suite] - a framework for XML (and RDF) processing
 * [http://www.ikaaro.org/itools/ itools.xml] - itools provides XML processing support in a fashion similar to that of PullDom
 * [http://www.python.org/pypi/libxml2dom libxml2dom] - PyXML-style API for the libxml2 Python bindings
 * [http://pyxml.sourceforge.net/ PyXML] - the semi-official Python distribution for XML, now without a maintainer!
 * [http://www.python.org/pypi/qtxmldom qtxmldom] - PyXML-style API for the qtxml Python bindings
 * [[http://sourceforge.net/projects/pyxml/|PyXML]] - external add-on to Python's original XML support - (Warning: no longer maintained, does not work with recent Python versions)
 * [[http://www.ikaaro.org/itools/|itools.xml]] - itools provides XML processing support in a fashion similar to that of PullDom
 * [[http://www.python.org/pypi/libxml2dom|libxml2dom]] - PyXML-style API for the libxml2 Python bindings
 * [[http://www.python.org/pypi/qtxmldom|qtxmldom]] - PyXML-style API for the qtxml Python bindings
 * [[http://4suite.org/|4Suite]] - a framework for XML (and RDF) processing

=== XPath Support ===

 * [[http://code.google.com/p/py-dom-xpath/|py-dom-xpath]] - pure Python XPath implementation for use with DOM libraries
 * [[http://lxml.de/|lxml]] - lxml has standards compliant XPath 1.0 support based on libxml2. It also supports the [[http://www.exslt.org/|EXSLT]] extensions (including Python regular expressions) and allows calling Python functions from within XPath expressions.
 * [[http://xml3k.org/Amara2|Amara 2.x]] - Amara exposes an API to fully-compliant XPath (including [[http://www.exslt.org/|EXSLT]])
Line 37: Line 50:
 * [http://uche.ogbuji.net/tech/akara/nodes/2003-01-01/python-xslt XSLT tools for Python] - an (older) collection of examples and links by Uche Ogbuji
 * [http://codespeak.net/lxml/ lxml] has excellent support for XSLT that is based on libxslt
 * [http://www.python.org/pypi/XSLTools XSLTools] - XSL transformations on top of libxslt and [http://www.python.org/pypi/libxml2dom libxml2dom], with added Web development support
 * Some tools linked from the [http://www.w3.org/XML/Query/#implementations XQuery homepage] provide Python bindings for their XSLT2 and XPath2 implementations
If not mentioned otherwise, this means XSLT 1.0, not XSLT 2.0.

 * [[http://lxml.de/|lxml]] has excellent (and easy-to-use) XSLT support that is based on [[http://xmlsoft.org/XSLT/|libxslt]]. It also supports calling into Python code from XSL transformations through both XPath and XSLT extensions.
 * [[http://www.python.org/pypi/XSLTools|XSLTools]] - XSL transformations on top of [[http://xmlsoft.org/XSLT/|libxslt]] and [[http://www.python.org/pypi/libxml2dom|libxml2dom]], with added Web development support
 * Some tools linked from the [[http://www.w3.org/XML/Query/#implementations|XQuery homepage]] provide Python bindings for their XSLT2 and XPath2 implementations
 * [[http://xml3k.org/Amara2|Amara 2.x]] - Amara exposes an API to fully-compliant XSLT (including [[http://www.exslt.org/|EXSLT]])
Line 44: Line 59:
 * [http://pyxmpp.jajcus.net/ PyXMPP] - a Python XMPP (RFC 3920,3921) and [http://www.jabber.org/protocol/ Jabber] implementation
 * [http://jabberpy.sourceforge.net/ jabber.py] - a Python module for the jabber instant messaging protocol
 * [http://xmpppy.sourceforge.net/ xmpppy] - a Python library that is targete
d to provide easy scripting with Jabber
 * [[http://jabberpy.sourceforge.net/|jabber.py]] - a Python module for the jabber instant messaging protocol
 * [[http://
pyxmpp.jajcus.net/|PyXMPP]] - a Python XMPP (RFC 3920,3921) and [[http://www.jabber.org/protocol/|Jabber]] implementation
 * [[http://xmpppy.sourceforge.net/|xmpppy]] - a Python library that is targeted to provide easy scripting with Jabber
Line 52: Line 67:
 * PythonSoap  * see [[WebServices]]

=== Object Serialization in XML ===

 * [[http://coder.cl/software/pyxser/|pyxser]] - a Python extension to serialize/deserialize Python objects into XML
Line 57: Line 76:
 * ["Tutorials on XML processing with Python"]  * [[Tutorials on XML processing with Python]]

Python and XML

A variety of XML processing solutions are available for Python. This page attempts to list the major tools.

Packages in the standard library

The standard library has a number of tools available, which fall into mainly three categories:

  • a pythonesque, simple-to-use and very fast XML tree library:
    • ElementTree - the xml.etree package (new in Python 2.5 but available for older versions, also see the fast xml.etree.cElementTree and the independent implementation lxml)

  • event-driven XML parsers:
    • ElementTree's iterparse() - a fast and easy-to-use event-driven parser with a high-level XML tree interface

    • pyexpat - a fast, low-level XML parser with an event-based callback interface
    • Sax - the xml.sax package, a Python implementation of the well-known low-level SAX API

  • XML tree libraries that adhere to the W3C DOM standard:
    • MiniDom - the xml.dom.minidom package

    • PullDom - the xml.dom.pulldom package

The DOM and SAX packages have the advantage of being compatible with standard or de facto standard APIs, so users who are already familiar with these APIs can use them without learning too many new things. Everyone else should start with the faster and more pythonic ElementTree library, which is very well integrated into the Python language, and therefore very easy to learn and use.

External packages

A long list of special purpose and general purpose Python XML packages is available from PyPI. The following is a choice of major tools that support a broader set of XML features.

Pythonic tools

  • lxml - a pythonic, ElementTree-compatible binding for the libxml2 and libxslt libraries that comes with all sorts of powerful XML (and HTML) tools, well integrated into an easy-to-use Python API

  • lxml.objectify - a Python object API for XML based on lxml

  • PyXB - generates Python classes/modules that correspond to data structures/namespaces defined by XMLSchema, with validation

  • PyXSD - an XML Schema mapping too (somewhat dated, last released in 2006)

  • generateDS - generates Python data structures (for example, class definitions) from an XML Schema document

  • Amara 2.x - Amara provides tools you can trust to conform with XML standards without losing the familiar Python feel

W3C DOM-like libraries

  • PyXML - external add-on to Python's original XML support - (Warning: no longer maintained, does not work with recent Python versions)

  • itools.xml - itools provides XML processing support in a fashion similar to that of PullDom

  • libxml2dom - PyXML-style API for the libxml2 Python bindings

  • qtxmldom - PyXML-style API for the qtxml Python bindings

  • 4Suite - a framework for XML (and RDF) processing

XPath Support

  • py-dom-xpath - pure Python XPath implementation for use with DOM libraries

  • lxml - lxml has standards compliant XPath 1.0 support based on libxml2. It also supports the EXSLT extensions (including Python regular expressions) and allows calling Python functions from within XPath expressions.

  • Amara 2.x - Amara exposes an API to fully-compliant XPath (including EXSLT)

XSLT Support

If not mentioned otherwise, this means XSLT 1.0, not XSLT 2.0.

  • lxml has excellent (and easy-to-use) XSLT support that is based on libxslt. It also supports calling into Python code from XSL transformations through both XPath and XSLT extensions.

  • XSLTools - XSL transformations on top of libxslt and libxml2dom, with added Web development support

  • Some tools linked from the XQuery homepage provide Python bindings for their XSLT2 and XPath2 implementations

  • Amara 2.x - Amara exposes an API to fully-compliant XSLT (including EXSLT)

XML-based Communications

  • jabber.py - a Python module for the jabber instant messaging protocol

  • PyXMPP - a Python XMPP (RFC 3920,3921) and Jabber implementation

  • xmpppy - a Python library that is targeted to provide easy scripting with Jabber

Web Services

Object Serialization in XML

  • pyxser - a Python extension to serialize/deserialize Python objects into XML

Books and Articles

SIG

Editorial Notes

The above lists should be arranged in ascending alphabetical order - please respect this when adding new entries. When specifying release dates please use the format YYYY-MM-DD.

PythonXml (last edited 2012-01-11 01:11:25 by c-66-41-60-82)

Unable to edit the page? See the FrontPage for instructions.