Aug 22, 2011

Python's built-in modules for working with markup languages are part of the standard library. For XML these are mainly DOM (including minidom), SAX, and ElementTree.

A comparison of minidom and ElementTree including good examples can be found on
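
As a rough illustration of how the two APIs differ, the sketch below extracts the same data once with minidom and once with ElementTree. The sample document and the function names are made up for this example; only the standard library is assumed.

```python
import xml.dom.minidom as minidom
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
XML = (
    "<library>"
    "<book id='1'><title>Dive Into Python</title></book>"
    "<book id='2'><title>Learning Python</title></book>"
    "</library>"
)


def titles_minidom(xml_text):
    """Collect book titles via the DOM API (node lists and text nodes)."""
    dom = minidom.parseString(xml_text)
    # Each <title> element holds its text in a child text node.
    return [node.firstChild.data for node in dom.getElementsByTagName("title")]


def titles_elementtree(xml_text):
    """Collect the same titles via ElementTree (path expressions and .text)."""
    root = ET.fromstring(xml_text)
    return [title.text for title in root.findall("book/title")]


print(titles_minidom(XML))
print(titles_elementtree(XML))
```

Both functions return the same list; the ElementTree version tends to be shorter because text content is exposed directly as an attribute of the element.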

Besides the standard library modules, there is also a very Pythonic third-party module called lxml, which behaves similarly to ElementTree and is based on GNOME's libxml2.



Here is a small example:

# ElementTree has been part of the standard library since Python 2.5
import xml.etree.ElementTree as ET

tree = ET.parse("page.xhtml")
# The tree root is the top-level html element
print(tree.findtext("head/title"))
# If you need the root element, use getroot()
root = tree.getroot()
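
One caveat worth sketching: real XHTML files usually declare the XHTML default namespace, and in that case a plain path such as "head/title" finds nothing. The snippet below is a minimal sketch of a namespace-aware lookup with the standard library; the inline page and the "x" prefix are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical page content, inlined so the example is self-contained.
PAGE = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>Example page</title></head>"
    "<body/>"
    "</html>"
)

root = ET.fromstring(PAGE)

# Without a namespace the lookup fails, because every element's tag
# is stored internally as "{namespace}localname".
plain = root.findtext("head/title")

# Map an arbitrary prefix to the namespace and use it in the path.
ns = {"x": XHTML_NS}
title = root.findtext("x:head/x:title", namespaces=ns)

print(root.tag)
print(plain)
print(title)
```

The prefix used in the path only has to match the key in the mapping; it does not need to match any prefix used in the document itself.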



More demanding XML applications, such as those involving schemas and namespaces, should probably use the lxml XML toolkit, which is the Pythonic binding for the C libraries libxml2 and libxslt. The most Pythonic way to use it is through lxml.objectify.
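
To give a feel for lxml.objectify, here is a minimal sketch. It assumes lxml is installed (it is a third-party package, not part of the standard library), and the sample document is made up for illustration.

```python
from lxml import objectify  # third-party package, assumed to be installed

# Hypothetical sample document, used only for illustration.
XML = (
    b"<library>"
    b"<book id='1'><title>Dive Into Python</title><price>29.5</price></book>"
    b"<book id='2'><title>Learning Python</title><price>35</price></book>"
    b"</library>"
)

root = objectify.fromstring(XML)

# Child elements are reachable as plain attributes.
first_title = root.book.title.text

# Repeated tags behave like a sequence of siblings.
book_count = len(root.book)
second_id = root.book[1].get("id")

# Leaf values are converted to Python types where possible (.pyval).
total_price = sum(book.price.pyval for book in root.book)

print(first_title, book_count, second_id, total_price)
```

Attribute-style access keeps the code close to the shape of the document, which is what makes this API feel Pythonic compared with explicit path lookups.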

