The Web, and the principles behind it, is mostly asemantic from a Computer Science point of view. I claim that most of the semantics is, in fact, highly local, at the document level.

First, observe that cross-document semantics is almost absent.

  • The only semantics in hyperlinks is “this resource points to this other resource which may or may not exist, which may or may not change over time”.
  • Resource Identifiers or URLs or URIs are a way to name each resource. No particular semantics there.
  • HTTP (and related protocols) has some explicit semantics, but they are very simple (GET, POST, …).

However, HTML and related markup languages are where most of the semantics lie. You can get the title of the page, for example, from any well-formed HTML page. It is still fairly primitive, and in practice there might be hardly any semantics present at all, but if there is any on the Web, that’s where it is.
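To make the point about document-level semantics concrete, here is a minimal sketch in Python of pulling the title out of a page using only the standard library. The names TitleParser and extract_title are mine, made up for illustration, and the code assumes the page is reasonably well formed.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text found inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False  # True while we are between <title> and </title>
        self._done = False      # True once the first title has been closed
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self._parts.append(data)

    @property
    def title(self):
        return "".join(self._parts).strip()


def extract_title(html_text):
    """Hypothetical helper: return the page title, or "" if there is none."""
    parser = TitleParser()
    parser.feed(html_text)
    parser.close()
    return parser.title
```

Everything needed to answer the question "what is this page called?" sits inside the document itself. No other resource has to be consulted, which is what I mean by local semantics.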

Why? As I said before, semantics is complexity and complexity is hard. But local complexity, and thus local semantics, is easier to manage. You can have a very complex algorithm that is still practical in daily use. Consider image compression software. But the complexity needs to be localized.

Any attempt at making the Web semantically richer in a cross-document way is bound to fail. (I claim.) Feel free to build complex documents on your own. My blog is certainly a complex beast, and I have crazy semantics going on under the hood. But to export complexity on a global scale is silly. (I claim.)

Do not worry, this blog will not turn into an HTML/CSS blog, but here is a nice trick to select all hyperlinks with absolute URIs:


/* Match links whose href attribute starts with "http" */
a[href^="http"] {
  background: yellow;
}

This will, naturally, probably never work with Microsoft browsers, but it works with Firefox 2.0.
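If you want to check what that selector would match without opening a browser, the same prefix test can be sketched in Python. This is only an approximation of the CSS rule, not how any browser implements it, and the names HttpLinkCollector and http_links are hypothetical.

```python
from html.parser import HTMLParser


class HttpLinkCollector(HTMLParser):
    """Collects href values of <a> elements that start with "http"."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        # attrs is a list of (name, value) pairs; value may be None
        href = dict(attrs).get("href")
        if href is not None and href.startswith("http"):
            self.links.append(href)


def http_links(html_text):
    """Hypothetical helper: list the hrefs the prefix test would select."""
    collector = HttpLinkCollector()
    collector.feed(html_text)
    collector.close()
    return collector.links
```

Note that this is a plain prefix test, like the selector itself. It catches both http and https links, but it would also catch a relative link such as "http-notes.html", and it misses absolute URIs with other schemes such as ftp or mailto.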

Tim has been told that all this travel he does is not environmentally friendly. That’s right. Planes kill the planet. A single Montreal-Los Angeles round trip emits 1 ton of CO2. Compare this with the fact that the average Canadian emits 18 tons of carbon dioxide.

Anyone who flies more than 30 times a year is very likely an evil, tree-hating child molester.

This includes a large fraction of the research community.

Which is exactly why I never bring my kids to the office.

This is great fun. Taporware: prototype of text analysis tools. Their “about” page is probably slightly obsolete, but the gist of it is there:

TAPoRware is a set of text analysis tools that enables users to perform text analysis on HTML, XML and plain text files, using documents from the users’ machine or on the web. The TAPoRware tools were developed with support from the Canada Foundation for Innovation and the McMaster University Faculty of Humanities. These tools are being developed by Geoffrey Rockwell, Lian Yan, Andrew Macdonald and Matt Patey of the TAPoR Project for a TAPoR Portal which we expect to open in 2005.
