If you think JavaScript (errr… ECMAScript) is uninteresting, think again! This Yahoo! talk ought to change your mind (part 1 of 3):

In the most recent Communications of the ACM (February 2007), Joshua Goodman and his coauthors tell us, in Spam and the Ongoing Battle for the Inbox¹, that it is very difficult to build reliable CAPTCHAs, or (reverse) Turing tests, to differentiate machines from human beings. In the most reliable tests, machines had a success rate of 5%, whereas in other cases they had a success rate of 67%. A 95% failure rate may seem high, but it only means that the machine needs to try 20 times on average to succeed once. So you slow down the machine by a factor of 20 (in the best of cases), and since machines are thousands of times faster than human beings, you have achieved very little. They do not report human error rates, but I know that I fail Blogger’s tests routinely and I’m not an idiot (though you may think otherwise if you wish), not blind, and so on.
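To make the arithmetic concrete, here is a back-of-the-envelope sketch. It assumes that each attempt is independent and succeeds with the same probability, so the number of tries follows a geometric distribution with mean 1/p. The function name is mine, not something from the article.

```python
def expected_attempts(success_rate):
    """Average number of tries before the first success, assuming
    independent attempts with a fixed success rate (geometric distribution)."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return 1 / success_rate


# The two success rates quoted from the article.
hardest_captcha = expected_attempts(0.05)
easiest_captcha = expected_attempts(0.67)

print(f"5% success rate:  about {hardest_captcha:.1f} tries per solved CAPTCHA")
print(f"67% success rate: about {easiest_captcha:.1f} tries per solved CAPTCHA")
```

Even against the hardest test, the bot pays only a constant factor of about 20, which is nothing for a program that can make thousands of attempts in the time it takes me to squint at one image.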

This is not just a theoretical concern. I have used visual CAPTCHAs before on my blog and they failed me. I still got spammed. The solution I now use is to apply a very simple CAPTCHA, but one that is unique to my blog. Since I am not a very popular blogger, I hope that spammers will not bother breaking my CAPTCHAs. If I ever, by some strange turn of events, became a popular blogger, my solution would be to routinely craft new CAPTCHAs.

This means that there are AI bots out there at war with legitimate bloggers.

To those who doubt AI can be used for evil purposes, well, there you go. There are people out there purposely designing AIs for evil (spamming is certainly unethical). We are not talking about the military. We are not talking about crazy scientists. We are talking about the worst kind of evil masterminds: greedy unethical capitalists.

1- They cite Using Machine Learning to Break Visual Human Interaction Proofs by Chellapilla and Simard.

What the current SOAP fad has done is to make us forget how to build and deploy applications on the Web according to the true HTTP specification. Even Wikipedia is incredibly confused and confusing with respect to HTTP. HTTP is ridiculously simple, but too often ignored and misrepresented.

GET

Get some resource identified by a URI. This request should not change the state of the resource. The resource itself may change over time, however.

POST

Add a new resource (post a new message, a new comment, a new post, a new file) or modify an existing resource. The provided URI is not the URI of the new resource, but rather the URI of a related resource (for example, the URI of the blog or posting board).

PUT

Create or replace a resource having the given URI. This method is idempotent!

DELETE

Delete a resource.

What does this mean?

  • A POST form should never replace a resource. A POST form cannot be used to edit a post and is safe.
  • GET queries are stateless. No matter who does the GET, the same result should come out. If I copy and paste a URL in my browser and pass it to someone else, they should end up with the same resource. A GET query cannot create, change or delete a resource. GETs are safe. I should always be able to follow a link without fear of deleting or buying something.
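To illustrate the four methods, here is a toy in-memory model. It is only a sketch of the semantics described above, not a real HTTP server, and the class name, method names and URIs are all made up for the example.

```python
import itertools


class ResourceStore:
    """Toy model of GET, POST, PUT and DELETE over a dictionary of URIs."""

    def __init__(self):
        self.resources = {}              # URI -> representation
        self._ids = itertools.count(1)   # used to mint URIs for POSTed items

    def get(self, uri):
        # Safe: only reads, never changes the stored state.
        return self.resources.get(uri)

    def post(self, related_uri, body):
        # The given URI is the blog or board, not the new resource;
        # the store picks the URI of the new resource itself.
        new_uri = f"{related_uri}/{next(self._ids)}"
        self.resources[new_uri] = body
        return new_uri

    def put(self, uri, body):
        # Create or replace the resource at exactly this URI.
        self.resources[uri] = body

    def delete(self, uri):
        self.resources.pop(uri, None)


store = ResourceStore()

# Posting the same comment twice adds two resources.
first = store.post("/blog/comments", "Nice post!")
second = store.post("/blog/comments", "Nice post!")

# Putting the same page twice leaves a single resource: PUT is idempotent.
store.put("/blog/about", "About me")
store.put("/blog/about", "About me")

# Reading changes nothing, no matter how often it is done.
snapshot = dict(store.resources)
store.get("/blog/about")
store.get("/blog/about")
unchanged_after_get = snapshot == store.resources

store.delete(second)
```

The point of the sketch is the asymmetry: repeating a PUT or a GET is harmless, while repeating a POST is not.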

As to why this might not work, see what Parand had to say about it.

Some people will love this. I prepared a mock exam for my INF 6450 students. See if you can pass it (in French, but you can probably grok most of it if only you know the basic XML vocabulary). I’m generally impressed by how well my students get by in this course. The full XML course is online, but requires you to have Firefox (warning: sometimes my server is slow).

According to “highly reputable” (well…) people, this is a Mickey Mouse course. But do not take their word for it, go see for yourself (with Firefox 2.0 or better). Indeed, there is no Software Engineering. No real Computer Science (as in, algorithms, data structures, and so on). Well, I do offer a real Computer Science course, but I still think that teaching XML is way cool and fully justified. It is a programming and IT course. Programming is fun. Getting by with crazy declarative languages like XSLT is hilarious. Figuring out how to do aggregations in XSLT is really a nasty problem (with several elegant and simple solutions). Figuring out how to intersect sets in XSLT, given that all you have is a union operator, is really fun too. And you never have a student ask you why he needs to learn this. Students see immediately why this is required to be a top-notch Web developer.
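For the curious, the intersection puzzle has a well-known answer in XPath 1.0, usually written as $a[count(. | $b) = count($b)]: a node of $a is also in $b exactly when adding it to $b does not make $b any bigger. Here is a rough Python analogue of that trick, using only union and counting. The function name and the sample data are invented for illustration.

```python
def intersect_with_union_only(a, b):
    """Keep the items of a that are also in b, using only set union
    and size comparisons (no intersection operator, no membership test)."""
    b = set(b)
    # x is already in b exactly when the union {x} | b is no larger than b.
    return [x for x in a if len({x} | b) == len(b)]


common = intersect_with_union_only(["a", "b", "c", "d"], ["c", "d", "e"])
print(common)
```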

I still do not cover XQuery or XSLT 2.0 very well. I’m starting to cover CSS 3.0, but barely. MathML is poorly supported so I do not go far in it.

XQuery seemed nice, but I’m still waiting for the real cool applications. So far, XQuery is still, to me, a poor man’s XSLT.

XSLT 2.0 looks good, but support for it is still rare and I still do not have a good use case. Certainly, XSLT 2.0 cleaned up a few things, but I was careful not to introduce my students to the nasty parts of XSLT 1.0, which is good since they go away now. Regular expressions in XSLT 2.0 are a nice feature, but it almost seems like this requires no special introduction: if you know both regular expressions and XSLT, then there is nothing special happening. Being able to generate several documents might be nice, but I still do not see the use case and it seems a trivial addition anyhow.

XLink? Badly supported, not exciting. Still useful in, say, SVG, but trivially so.

SVG? Might be nice, but it is painful to do by hand. In theory, you could have data being transformed to SVG through XSLT, but do people really do that?

XSLFO? No use case. DocBook does fine if you need to generate PDF technical reports. Want to generate bills in PDF? I cannot imagine doing it in XSLFO. Do people really do that?

AJAX is nice and it is a great DOM API use case. But the cross-browser issues are so terrible that you can only go so far.

Java-wise, I now try to show that there are several ways to tackle XML. For example, the iterative approach is rarely included and I think it is very nice.

J2EE, web services? I cover REST (quickly done), I cover some SOA. For the rest, my course is already packed and I do not want to get into enterprise computing, which I think is boring and totally lacking in real innovation.

Ontologies? I cover RDF, which is barely useful but still has good use cases (Dublin Core and 3 or 4 others). Anything beyond that is probably a waste of my (undergraduate) students’ time.

DTD, Relax NG… these are the good guys, but they are barely useful. XML is at its best as an extensible language, which is not a very schema-friendly concept. Very, very few people need to write DTDs or Relax NG schemas. You sometimes need to read them to figure out what you have to output, and it is useful to check that you are producing good XML, but validating is usually a waste of time unless you have problems. XML Schema? Please! Let us not waste time with this pitiful excuse for a spec.

(Disagree with my statements? Please comment!)

The Web is not virtual. Amazon.com is an actual store. An online course is an actual course. Email is not virtual communication. Communities on the Web are not virtual.

Something is virtual if it is a mere representation of what is. My blog is a virtual notebook: it is not a notebook. But my blog is not virtual! It is a real blog! My identity on the Web is not virtual, but an avatar in a video game is a virtual me. A virtual community would be a representation of a community, so real people and real communication between these people would not occur. Virtual memory is virtual because we make software believe that there is memory, when really, there is none. We commonly work with virtual hardware: you make the operating system believe that it runs directly on a machine, when, in fact, it runs inside a software box emulating a machine.

Something virtual is not real. If it is real, it cannot be virtual.

The word virtual is a dangerous one used by reactionary folks who like to dismiss anything electronic as not being quite real. It is deeply rooted in reactionary thinking. For example, they sometimes suggest that electronic meetings have to happen in virtual rooms with virtual chairs. (Think Second Life.) Experience shows that in the electronic reality, it is often better not to virtualize the real world, for the very simple reason that we can do better than set up a virtual reality: we can create an actual one that works better. The Web has no virtual chair, virtual corridor, and so on. The Internet has real Web pages, real blogs, real instant messaging, and so on. Virtual representations of people do not work well; it is better to build real Web identities.
