Wednesday, May 2, 2012

Parsing HTML output in Plone functional doctests with lxml

When writing functional doctests, I used to fumble a bit when inspecting the HTML output. Today I looked into lxml, which makes this a lot easier to test; XPath in particular makes for very readable tests.

For example, to test that a certain text appears in a viewlet, but not in the page itself, parsing the tree of the document is convenient. (Use case: A viewlet that displays "Other Items".)

This snippet tests our viewlet, which at that point in the test should show exactly one item:
    >>> from lxml import etree
    >>> html = etree.HTML(browser.contents)
    >>> len(html.xpath('//*[@id="other-advertorial-texts"]/div[@class="box"]'))
    1
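
The same idea can be tried outside a Plone test run. The following is only a minimal sketch: the SAMPLE_PAGE markup, the "content" id and the text_in helper are made up for illustration and stand in for browser.contents and the real page structure. It checks that a text shows up inside the viewlet but not in the main content area.

```python
from lxml import etree

# Hypothetical markup standing in for browser.contents.
SAMPLE_PAGE = """
<html>
  <body>
    <div id="content"><p>Main article text.</p></div>
    <div id="other-advertorial-texts">
      <div class="box"><a href="/item-1">Other Item One</a></div>
    </div>
  </body>
</html>
"""


def text_in(tree, xpath):
    """Return the joined text of every element matched by the XPath."""
    return " ".join("".join(node.itertext()) for node in tree.xpath(xpath))


# etree.HTML uses lxml's lenient HTML parser, so the input does not
# have to be well-formed XML.
tree = etree.HTML(SAMPLE_PAGE)

# Same expression as in the doctest: direct div children with class "box".
boxes = tree.xpath('//*[@id="other-advertorial-texts"]/div[@class="box"]')

viewlet_text = text_in(tree, '//*[@id="other-advertorial-texts"]')
content_text = text_in(tree, '//*[@id="content"]')

print(len(boxes))
print("Other Item One" in viewlet_text)
print("Other Item One" in content_text)
```

Note that @class="box" only matches elements whose class attribute is exactly "box"; an element with several classes would need something like contains(@class, "box") instead.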


Gilles Lenfant said...

That's how I test complex HTML output, since the out-of-the-box XPath support in xml.etree.ElementTree is poor and the HTML output is not necessarily well-formed XML.

icemac said...

You might have a look at z3c.etestbrowser. This is a zope.testbrowser with integrated lxml support.