Differences between revisions 4 and 6 (spanning 2 versions)
Revision 4 as of 2008-07-14 09:44:20
Size: 1233
Editor: p5B2DB4B3
Comment: link not working, dont know how to comment out...
Revision 6 as of 2011-02-26 09:37:21
Size: 1501
Editor: StefanBehnel
Comment:

'''NOTE''': A faster and much simpler way to extract information from an XML document in an event-driven, memory efficient fashion is [[http://docs.python.org/library/xml.etree.elementtree.html#xml.etree.ElementTree.iterparse | ElementTree.iterparse()]].
 * [[http://docs.python.org/lib/module-xml.sax.html|Python Library Reference, xml.sax]] -- API documentation
 * [[http://www.rexx.com/~dkuhlman/pyxmlfaq.html|Python XML FAQ and How-to]] -- describes SAX and minidom
 * [[http://pyxml.sourceforge.net/topics/howto/section-SAX.html|SAX: The Simple API for XML]] -- wordy tutorial
 * [[http://www-106.ibm.com/developerworks/linux/library/l-pxml.html|Charming Python: Revisiting XML tools for Python]] -- kind of old
 * [[http://www.xml.com/pub/a/2003/03/12/py-xml.html|Using SAX for Proper XML Output]]

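SAX (the Simple API for XML) is an event-driven XML parsing interface. A SAX parser reads a document sequentially and reports each element to your code as it encounters it, instead of building the whole document in memory first.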

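minidom reads an entire XML file, holds it in memory as a tree, and lets you work with it. SAX, on the other hand, emits events (an element starts, an element ends, character data arrives) to a handler object as it steps through the file, so memory use stays small even for large documents.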
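To make the event model concrete, here is a minimal sketch that records the events SAX reports for a tiny document. The class name `EventRecorder`, the helper `record_events`, and the sample XML are made up for illustration; only `xml.sax.ContentHandler` and `xml.sax.parseString` come from the standard library.

```python
import xml.sax

class EventRecorder(xml.sax.ContentHandler):
    """Hypothetical handler: stores each event in the order SAX reports it."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        # attrs behaves like a read-only mapping of attribute names to values.
        self.events.append(("start", name, dict(attrs.items())))

    def characters(self, content):
        # Text may arrive in several chunks; whitespace-only chunks are skipped here.
        if content.strip():
            self.events.append(("text", content.strip()))

    def endElement(self, name):
        self.events.append(("end", name))

def record_events(xml_bytes):
    """Parse a bytes object and return the list of recorded events."""
    handler = EventRecorder()
    xml.sax.parseString(xml_bytes, handler)
    return handler.events

events = record_events(b'<svg width="10"><title>demo</title></svg>')
for event in events:
    print(event)
```

Nothing is kept unless the handler stores it, which is why SAX suits documents too large to load whole.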

NOTE: A faster and much simpler way to extract information from an XML document in an event-driven, memory-efficient fashion is ElementTree.iterparse(). It yields elements as they are parsed, so no handler class is needed.
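As a rough sketch of the iterparse() approach, the following reads the attributes of the first svg element. The function name `svg_attributes` and the in-memory sample document are assumptions for illustration. Unlike the plain SAX handler, ElementTree puts the namespace into the tag name, so the sketch strips any `{namespace}` prefix before comparing.

```python
import io
import xml.etree.ElementTree as ET

def svg_attributes(source):
    """Return the attributes of the first svg element found, or {} if none.

    `source` may be a filename or a binary file object.
    """
    # Ask only for "start" events: the tag and attributes are already
    # available when an element opens, so we can stop reading early.
    for event, elem in ET.iterparse(source, events=("start",)):
        # Namespaced tags look like "{http://www.w3.org/2000/svg}svg".
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "svg":
            return dict(elem.attrib)
    return {}

# Hypothetical sample data standing in for a real file.
sample = io.BytesIO(b'<svg width="10" height="20"><g/></svg>')
attrs = svg_attributes(sample)
for key, value in attrs.items():
    print(key + " " + value)
```

When processing a whole large file this way, call `elem.clear()` on elements you have finished with to keep memory use low.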

Example

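    import xml.sax

    class InkscapeSvgHandler(xml.sax.ContentHandler):
        """Print every attribute of each <svg> element."""

        def startElement(self, name, attrs):
            # Called once per opening tag; attrs maps attribute names to values.
            if name == "svg":
                for k, v in attrs.items():
                    print(k + " " + v)

    parser = xml.sax.make_parser()
    parser.setContentHandler(InkscapeSvgHandler())
    # Open in binary mode so the parser can detect the encoding itself.
    with open("svg.xml", "rb") as f:
        parser.parse(f)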

Sax (last edited 2011-02-26 09:38:25 by StefanBehnel)
