I’ve been working hard at an XML course for the last few months. While I’ve done a lot of e-business-related work in recent years, I didn’t consider myself an XML expert.
Still, I was one of the early adopters of XML, starting out in 1997-1998 when it was still a cowboyland. I kept hacking away at XML until about 2001, and then I went away and did other things. I actually did a commercial project (as a technology architect) in 2001, but it was one of my last projects before going back to academia (Acadia University).
What has happened between 2001 and now? Well, Mozilla, for one thing, or rather true standards-compliant XML support in widely available browsers. Also, a lot, and I mean a lot, of new “standards” have come along, things like XHTML and so on.
In truth, I don’t think much has changed since 2001. Not as much as I thought.
After carefully studying what’s out there, I came away with the following conclusions:
- Internet Explorer doesn’t support basic things like XHTML, and its general support for XML is quite lacking. It is simply not a good XML tool. Mozilla (including Firefox) is pretty good, but there are a few gotchas: Mozilla just ignores DTDs (it does not validate), which brings about many problems (like missing entities), and you can’t save or source-view the output of an XSLT transformation. Still, Mozilla is good enough. I don’t know about Opera, but I’ve heard good things.
- DTDs are just fine, and even they are more often than not overkill. XML Schema and other formal ways to specify XML applications get little support in actual software and are just not that useful.
- Namespaces are a mess: they complicate things, they are incompatible with DTDs, and using URIs as identifiers is a confusing idea. Yet they work well enough and are usable.
- XSLT 1.0 is truly powerful and very convenient. Couple XSLT with the EXSLT extensions and you really can do pretty much anything you want. Exporting XML to HTML or to LaTeX is really easy. However, some things are tricky, like grouping. The best and fastest free XSLT engine I could find is 4Suite. XSLT 2.0 is still pretty much unsupported. Either way, whether you use the EXSLT extensions or XSLT 2.0, you need things like regular expressions.
- Current RDF/XML is a pain, period. RDF itself is sane.
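The XML-to-HTML export mentioned in the XSLT point can be sketched with the JAXP transformer bundled with the JDK, which is an XSLT 1.0 processor. The stylesheet and the element names (`list`, `item`) here are made up purely for illustration:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XsltToHtml {
    // Hypothetical minimal stylesheet: turn each <item> into an HTML <li>.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' "
      + "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='html' indent='no' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/list'>"
      + "<ul><xsl:for-each select='item'>"
      + "<li><xsl:value-of select='.'/></li>"
      + "</xsl:for-each></ul>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String transform(String xml) throws Exception {
        Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)),
                new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Produces a <ul> listing, e.g. <ul><li>a</li><li>b</li></ul>
        System.out.println(transform("<list><item>a</item><item>b</item></list>"));
    }
}
```

The same `transform` wiring works with any stylesheet, which is what makes XSLT so handy for exports; for LaTeX output you would use `method='text'` instead of `method='html'`.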
So, I think that a good XML project probably uses XSLT, maybe DTDs, a lot of XPath, but as little of the rest as possible.
2 thoughts on “Current state of affairs in the XML world (according to me)”
Difficult to tell what your problem is… you tried to post XML but, of course, the < and > tags disappeared.
I suspect you’ve got a namespace issue: the XPath expression /html/head/title doesn’t work if html, head and title are in a namespace.
You need to define a namespace and modify your XPath expression accordingly. It is a pain, but that’s the name of the game.
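To make the advice concrete, here is a minimal sketch of binding a prefix to the XHTML namespace with javax.xml.xpath; the class name, the helper method and the prefix “xh” are arbitrary choices, not part of any API:

```java
import java.io.StringReader;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class XhtmlXPath {
    static String xhtmlTitle(String xhtml) throws Exception {
        XPath xPath = XPathFactory.newInstance().newXPath();
        // Bind an arbitrary prefix ("xh") to the XHTML namespace URI.
        xPath.setNamespaceContext(new NamespaceContext() {
            public String getNamespaceURI(String prefix) {
                return "xh".equals(prefix)
                        ? "http://www.w3.org/1999/xhtml"
                        : XMLConstants.NULL_NS_URI;
            }
            public String getPrefix(String uri) { return null; }
            public Iterator<String> getPrefixes(String uri) { return null; }
        });
        // Every step must carry the prefix: in XPath 1.0 an unprefixed name
        // matches only elements in *no* namespace, which is why
        // /html/head/title silently returns nothing on an XHTML document.
        return xPath.evaluate("/xh:html/xh:head/xh:title",
                new InputSource(new StringReader(xhtml)));
    }

    public static void main(String[] args) throws Exception {
        String doc = "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
                + "<head><title>Virtual Library</title></head>"
                + "<body><p>Moved to vlib.org.</p></body></html>";
        System.out.println(xhtmlTitle(doc)); // prints: Virtual Library
    }
}
```

Note that the prefix in the XPath expression does not have to match whatever prefix (if any) the document itself uses; only the namespace URI has to agree.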
Glad to hear some honest comments on the state of XML. I’ve been having a devil of a time trying to use Java 5.0’s new XPath functionality with XHTML. For example, javax.xml.xpath handles the following sample XML document just fine:
<?xml version="1.0" encoding="UTF-8"?>
<html>
<head><title>Virtual Library</title></head>
<body><p>Moved to vlib.org.</p></body>
</html>
However, when you add the extra data to make it an XHTML document, like this (taken from http://www.w3.org/TR/xhtml11/conformance.html), you can no longer get even the title from the document:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
    "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
  <head>
    <title>Virtual Library</title>
  </head>
  <body>
    <p>Moved to <a href="http://vlib.org/">vlib.org</a>.</p>
  </body>
</html>
It doesn’t throw an exception, it just returns null objects or empty strings. Here’s the code I ran against both documents:
XPathFactory factory = XPathFactory.newInstance();
XPath xPath = factory.newXPath();
File xmlDocument = new File("c:\\tmp\\test.htm");
InputSource inputSource = new InputSource(new FileInputStream(xmlDocument));
XPathExpression expression = xPath.compile("/html/head/title");
String title = expression.evaluate(inputSource);
Have you encountered this problem with XHTML?