Aug 222011
 

The builtin Python modules to work with markup languages can be found on http://docs.python.org/library/markup.html. For XML these are mainly DOM (incl. minidom), SAX and ElementTree.

A comparison of minidom and ElementTree including good examples can be found on http://mike.hostetlerhome.com/present_files/pyxml.html.

Other than the default Python modules there is also a very Pythonic module called lxml which behaves similar as ElementTree and is based on Gnome’s libxml2.

MiniDom

ElementTree

Here is a small example:

import elementtree.ElementTree as ET
tree = ET.parse("page.xhtml")
# the tree root is the toplevel html element
print tree.findtext("head/title")
# if you need the root element, use getroot
root = tree.getroot()

SAX

lxml

More demanding XML applications including schemes and namespaces should probably use lxml XML toolkit, which is the Pythonic binding for the C libraries libxml2 and libxslt. The most Pythonic way of using it is to make use of lxml.objectify.

Resources

VN:F [1.9.22_1171]
Rating: 0.0/10 (0 votes cast)

Related Posts:

 Leave a Reply

(required)

(required)

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>