Why libxml2 takes forever to transform XHTML files

For my Online XML Course, I process lots and lots of XHTML content using XSLT. Up until now, I avoided the libxml2 XSLT processor (xsltproc) because it was unacceptably slow.

Today, I found out what is happening: it loads all of the XHTML DTD files from the W3C each and every time it processes an XHTML file. To get around the problem, pass the --novalid flag when invoking xsltproc on the command line, which skips the DTD loading phase. You might then get warnings about undefined entities, so I suggest you declare the entities you use directly in the DOCTYPE declaration, like the following:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd" [
<!ENTITY laquo "&#171;">
<!ENTITY raquo "&#187;">
<!ENTITY oelig "&#339;">
<!ENTITY nbsp "&#160;">
]>
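As a quick sanity check, here is a minimal Python sketch that builds such a DOCTYPE and confirms the declared entities resolve without the external DTD being fetched. It uses Python's standard-library parser (expat), not libxml2, so it only illustrates the idea; the `build_doctype` helper, the `ENTITIES` table, and the sample paragraph are my own assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Entities named in the post, mapped to their Unicode code points.
ENTITIES = {"laquo": 171, "raquo": 187, "oelig": 339, "nbsp": 160}

XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_doctype(entities):
    """Return an XHTML 1.0 Strict DOCTYPE with an internal entity subset.

    Hypothetical helper: it just formats one <!ENTITY ...> line per entry.
    """
    decls = "\n".join(
        f'<!ENTITY {name} "&#{cp};">' for name, cp in entities.items()
    )
    return (
        '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"\n'
        '  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd" [\n'
        f"{decls}\n]>"
    )


# A made-up sample document that uses all four entities.
SAMPLE = build_doctype(ENTITIES) + (
    f'\n<html xmlns="{XHTML_NS}">'
    "<body><p>&laquo;&nbsp;c&oelig;ur&nbsp;&raquo;</p></body></html>"
)

# The standard-library parser does not load the external DTD; the entities
# are resolved from the internal subset alone.
root = ET.fromstring(SAMPLE)
paragraph = root.find(f".//{{{XHTML_NS}}}p")
print(repr(paragraph.text))
```

If an entity used in the document is missing from the table, the parse fails with an undefined-entity error, which is a cheap way to find out which declarations a given file still needs.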

Published by

Daniel Lemire

A computer science professor at the University of Quebec (TELUQ).
