Philipp's Computing Blog

Success is about speed and efficiency

Easy Usage of XML with Python

The built-in Python modules for working with markup languages are documented at http://docs.python.org/library/markup.html. For XML, the main ones are DOM (including minidom), SAX, and ElementTree.

A comparison of minidom and ElementTree, including good examples, is available at http://mike.hostetlerhome.com/present_files/pyxml.html.

Besides the standard library modules, there is also a very Pythonic third-party module called lxml, which behaves similarly to ElementTree and is based on GNOME's libxml2.

MiniDom
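
As a rough illustration of the DOM style, here is a minimal minidom sketch. The XML snippet and the element names (library, book, title) are made up for this example and are not taken from any real file.

```python
from xml.dom import minidom

# Hypothetical sample document, inlined so the example is self-contained
XML_DATA = (
    "<library>"
    "<book id='1'><title>First</title></book>"
    "<book id='2'><title>Second</title></book>"
    "</library>"
)

doc = minidom.parseString(XML_DATA)

titles = []
for book in doc.getElementsByTagName("book"):
    # getElementsByTagName returns a list-like NodeList; take the first title
    title_node = book.getElementsByTagName("title")[0]
    # the text lives in a child text node, not on the element itself
    titles.append((book.getAttribute("id"), title_node.firstChild.data))

print(titles)
```

Note how the text has to be pulled from a child text node; this verbosity is the main reason many people prefer ElementTree for everyday work.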

ElementTree

Here is a small example:

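# ElementTree has been part of the standard library since Python 2.5
import xml.etree.ElementTree as ET
tree = ET.parse("page.xhtml")
# the tree root is the top-level html element
# (the path below only matches if the document declares no default namespace)
print(tree.findtext("head/title"))
# if you need the root element, use getroot
root = tree.getroot()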
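
XHTML documents usually declare a default namespace, and in that case a plain path such as head/title finds nothing. The following is a minimal sketch of a namespace-aware lookup; the inline document and the prefix "x" are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML page, inlined so the example is self-contained
XHTML = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example page</title></head>
  <body><p>Hello</p><p>World</p></body>
</html>"""

root = ET.fromstring(XHTML)

# Map an arbitrary prefix to the XHTML namespace URI for use in paths
ns = {"x": "http://www.w3.org/1999/xhtml"}

title = root.findtext("x:head/x:title", namespaces=ns)
paragraphs = [p.text for p in root.iterfind("x:body/x:p", namespaces=ns)]

print(title)
print(paragraphs)
```

The prefix in the mapping does not have to match anything in the document; only the namespace URI matters.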

SAX
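
SAX is event-driven: the parser calls back into a handler as it reads the document, so nothing is held in memory unless you store it yourself. Here is a minimal sketch; the TitleHandler class and the sample XML are hypothetical names invented for this example.

```python
import xml.sax


class TitleHandler(xml.sax.ContentHandler):
    """Collects the text of every <title> element (hypothetical example)."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # characters() may be called several times for one text node
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._in_title = False


XML_DATA = (
    b"<library>"
    b"<book><title>First</title></book>"
    b"<book><title>Second</title></book>"
    b"</library>"
)

handler = TitleHandler()
xml.sax.parseString(XML_DATA, handler)
print(handler.titles)
```

This is more code than the DOM or ElementTree versions, but it scales to documents that are too large to load as a tree.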

lxml

More demanding XML applications, including those that need schemas and namespaces, should probably use the lxml XML toolkit, which is the Pythonic binding for the C libraries libxml2 and libxslt. The most Pythonic way of using it is through lxml.objectify.
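
To give an idea of what lxml.objectify feels like, here is a minimal sketch. It assumes lxml is installed (it is not part of the standard library), and the sample document with its library, book, title and price elements is invented for illustration.

```python
from lxml import objectify

# Hypothetical sample document, inlined so the example is self-contained
XML_DATA = b"""<library>
  <book id="1"><title>First</title><price>9.5</price></book>
  <book id="2"><title>Second</title><price>12</price></book>
</library>"""

root = objectify.fromstring(XML_DATA)

# Child elements are reachable as plain attributes
first_title = root.book.title.text

# Iterating over root.book walks all sibling <book> elements
all_titles = [book.title.text for book in root.book]

# Attributes are read with get(), as in ElementTree
second_id = root.book[1].get("id")

# pyval gives the leaf value converted to a Python type (float or int here)
total_price = sum(book.price.pyval for book in root.book)

print(first_title, all_titles, second_id, total_price)
```

The attribute-style access is what makes objectify read like ordinary Python objects rather than a tree API.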

Resources