JAWS Screenreader Adaptation for Mozilla Firefox

From Catherine Roy, I learned that there is now a screenreader adaptation for Mozilla Firefox. This is an essential tool for visually impaired Web surfers. The Firefox adaptation is under the GPL, but JAWS itself is a commercial (Windows-only?) tool.

What do visually impaired Linux users do? I know KDE has an accessibility initiative, but how does it compare with the Windows or Mac universe? Are there screenreaders for Firefox under Linux? I suspect that Linux (or even the Mac) is behind in this respect.

Update: it looks like Fire Vox could be a better alternative.

Implementing a Rating-Based Item-to-Item Recommender System in PHP/SQL

Following some requests I got about the paper Slope One Predictors for Online Rating-Based Collaborative Filtering, I decided to make available a technical report which actually gives some SQL and PHP code: Implementing a Rating-Based Item-to-Item Recommender System in PHP/SQL.

Useful JavaScript documentation

I can’t find these on my blog anymore, so I’m reposting them. I complained earlier that JavaScript is poorly documented. My friend Scott Flinn gave me some useful links that are hard to come by (Google doesn’t find them quickly for me):

If you need DOM documentation for XML (XHTML in Gecko
is treated as pure XML), you will find it here:

A few other useful links are here:

However, I found another good site: http://www.xulplanet.com/references/objref/

Working upwind

Paul Graham has another beautiful essay where he gives lifelong advice:

Instead of working back from a goal, work forward from promising situations. This is what most successful people actually do anyway.

In the graduation-speech approach, you decide where you want to be in twenty years, and then ask: what should I do now to get there? I propose instead that you don’t commit to anything in the future, but just look at the options available now, and choose those that will give you the most promising range of options afterward.

It’s not so important what you work on, so long as you’re not wasting your time. Work on things that interest you and increase your options, and worry later about which you’ll take.

That’s interesting. Notice that he is actually saying not to set long-term goals. Well, I gave up on those a long time ago since I can’t seem to stay on one straight line for more than a year anyhow.

He is saying to just look at what is in front of you, and work on what has the most potential.

Then, when people ask you where you want to be in 5 years, you just make up a story or do as I do: just say you have no idea!

In the electronic world, less structure is better

Here’s a quote touching on something very important for me: people tend to try to reproduce structures they know to work in the “real world” into the eWorld. So, they create electronic management systems that are like real management: hierarchical, centralized and rigid. No, no and no! When will they learn that such things never win in the end?

What worked? Please look at what worked! Email and the Web. These are the things that worked… look at them!!! Both have very minimal models of how people work. Look at Google: it doesn’t assume much at all about your work process. Look at Amazon: their main goals are fewer clicks and less structure.

I don’t want an electronic management system to tell me how to work. In fact, just stay away from monolithic tools: give me some simplistic tools (like a hammer) and let me work!

The general theme is that less management is better, and that individual learners could write all of their posts, assignments and papers from their own site, and these could be directed to each class as web feeds. The classes would aggregate the feeds from all the students and instructors. The beauty of this kind of system is that each student keeps all of his/her content, and it does not get locked away in an inaccessible archive of a centrally controlled LMS.

From a post by Harold, who was writing about a post by James Farmer.

Piled Higher and Deeper

Thanks to geomblog I found out there is such a thing as a daily comic about working on a Ph.D. It is pretty funny, though I was among the lucky ones when I wrote my Ph.D.: I was very naïve.

What I want to see is a follow-up where the Ph.D. student actually gets a job!

I read somewhere last night that, according to a study, only 15% of science Ph.D.s working in Québec (Canada) hold a professorship. It can be either a good or a bad thing. As for myself, after I got my Ph.D., I never could find a decent job offer in Québec that wasn’t a professorship. I know of a few quite good jobs outside academia, but I certainly don’t know many. Where are all those Ph.D.s and are they happy?

Semantic Web Ontologies: What Works and What Doesn’t

Here’s a beautiful paper on Semantic Web ontologies. The author makes very well the point that most people have gotten by now: ontologies can only have a very limited appeal outside laboratories. Once you try to include concepts like marriage or terrorist in an ontology, you can’t really do very much outside a very limited scope.

In fact, writing ontologies for other people is very much like controlling language as in the famous novel 1984, because unlike natural language, ontologies are semantically very limited.

Are the people I’m talking about terrorists or freedom fighters? What’s the definition of patriot? What’s the definition of marriage? Just defining these kinds of ontologies when you’re talking about these kinds of political questions rather than about part numbers; this becomes a political statement. People get killed over less than this. These are places where ontologies are not going to work.

NSERC – Policy on Intellectual Property

NSERC is the main funding body for research in science and engineering in Canada. It has an interesting policy on IP:

NSERC expects that any IP resulting from research it funds wholly or in part will be owned by the university or the inventor, according to university policy. Access to IP should be accorded to other sponsors in recognition of, and in proportion to, the sponsor’s contribution to the collaboration.

Alas, I must say that I violated this rule, against my will, in the past, but I will try harder to stick by it from now on.

The interesting question here is whether things like assigning copyright to a publisher violate the funding body’s rules. Probably.

Academia really needs to get its act together with respect to IP as I’m not the only one who plays with grey areas…

Online courses force a deeper understanding

From eLearn Magazine, I got this quote in a paper by George P. Schell:

Online courses force a deeper understanding of information technology simply because they require immersion in the technology that supports the subject being taught. If students fail to master the technology skills required by the course they ultimately fail the course itself. We’ve long understood that immersion, such as learning a foreign language by living where the language is spoken, is a very effective method for quickly and deeply learning a subject.

Does JavaScript scale?

This post talks about how hard it is to debug JavaScript.

In general, pushing the UI to Javascript makes it hard to develop and debug. There are limited tools, the language is too lenient (no objects, weak typing), and testing involves cycling through webpages over and over again.

Obviously, this statement is false: there are objects in JavaScript, along with some nice features… And JavaScript is not too lenient: there are many solid languages without strong typing and they work just fine. But I must say that, indeed, it is quite hard to debug JavaScript. Incredibly so.
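To back up the first point: JavaScript has had objects and prototype-based inheritance from the start. Here is a quick sketch (the `Point` type and its method are my own illustration, not from the post being discussed):

```javascript
// JavaScript does have objects: a small prototype-based example.
function Point(x, y) {
  this.x = x;
  this.y = y;
}

// Methods attached to the prototype are shared by all instances,
// rather than copied into each one.
Point.prototype.norm = function () {
  return Math.sqrt(this.x * this.x + this.y * this.y);
};

const p = new Point(3, 4);
// p.norm() computes the Euclidean norm, 5 for a (3, 4) point.
```

No classes in the Java sense, certainly, but "no objects" is simply wrong.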

My friend Scott Flinn would say “it’s the browser, stupid”… since he thinks that JavaScript is fine, but that JavaScript in the browser is bad.

That’s why, if I ever attempt to do non-trivial JavaScript, I will try to use a command-line interpreter. The command line is a powerful programming tool, despite what Microsoft and Borland think.

Paquets… ou Seb en français

Seb is now available in French through Paquets… de quoi? Multi-language blogging is an interesting topic I covered elsewhere in my blog. I will eventually open a blog in French, but I’m worried that having too many blogs will kill the fun. I like having one spot to call mine… if I have to run around all over the place, it might get tiring…

My experience so far with Google ads

This is depressing. My blog gets millions of page loads per day (not really). So, being greedy (not really), I decided to put some ads on it. Hence, I put some Google ads, following Yuhong’s footsteps. Well, so far, not a single click. Not one of you guys clicked on one of the ads.

I never thought I would make any money, but I still expected a few clicks a week.

To be fair, I think these ads are fairly useless. Right now, I see ads about blogging software. Maybe I write too much about blogging?

Why encyclopaedic row speaks volumes about the old guard

John Naughton, writing about people who doubt Wikipedia, offered this well-phrased bit:

we have become so imbued by the conventional wisdom of managerial capitalism that we think the only way to do things is via hierarchical, top-down, tightly controlled organisations

I certainly can see this phenomenon among many researchers. For them, research is about specifying what ought to be in a top-down approach.

Planning is important: for things you can plan. You cannot plan an encyclopedia. You cannot design an encyclopedia using a top-down approach. You cannot design most software using a top-down approach (you can, but your project will fail when you face what you didn’t plan for). You can’t do research in a top-down approach (though you can if you want to build a particle accelerator).

Tim Berners-Lee’s first executive summary of the World Wide Web

I copy this here for historical reasons. Notice how Tim didn’t simply point to a specification: he actually pointed to a working demo of what the Web could be. (The complete version can be found on the W3C Web site.)

From :Tim Berners-Lee (timbl@info_.cern.ch)
Subject :WorldWideWeb: Summary
alt.hypertext
Date :1991-08-06 13:37:40 PST
(...)
     Information provider view
The WWW browsers can access many existing data systems via existing protocols  
(FTP, NNTP) or via HTTP and a gateway. In this way, the critical mass of data  
is quickly exceeded, and the increasing use of the system by readers and  
information suppliers encourage each other.
Making a web is as simple as writing a few SGML files which point to your  
existing data. Making it public involves running the FTP or HTTP daemon, and  
making at least one link into your web from another. In fact,  any file  
available by anonymous FTP can be immediately linked into a web. The very small  
start-up effort is designed to allow small contributions.  At the other end of  
the scale, large information providers may provide an HTTP server with full  
text or keyword indexing.
The WWW model gets over the frustrating incompatibilities of data format  
between suppliers and reader by allowing negotiation of format between a smart  
browser and a smart server. This should provide a basis for extension into  
multimedia, and allow those who share application standards to make full use of  
them across the web.
This summary does not describe the many exciting possibilities opened up by the  
WWW project, such as efficient document caching. the reduction of redundant  
out-of-date copies, and the use of knowledge daemons.  There is more  
information in the online project documentation, including some background on  
hypertext and many technical notes. (...)

You can also check out Linus’ first email presenting Linux.

Slope One Predictors for Online Rating-Based Collaborative Filtering (SDM’05 / April 20-23, 2005)

I’m very proud of this little paper called Slope One Predictors for Online Rating-Based Collaborative Filtering. The paper reports on some of the core collaborative filtering research leading to the inDiscover web site. I’ll be presenting it at SIAM Data Mining 2005 in April (Newport Beach, California).

This is a case where, with Anna Maclachlan, we did something that few researchers do these days: we looked for something simpler. The main result of the paper is that you can use extremely simple, easy-to-implement algorithms and get very competitive results.
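To give a flavour of how simple it is, here is a minimal sketch of the weighted Slope One scheme, assuming ratings come in as nested objects of the form { user: { item: rating } }. The function and variable names are my own illustration; the real paper and the PHP/SQL report spell out the details.

```javascript
// Weighted Slope One: predict from average pairwise rating differences.
function slopeOne(ratings) {
  // devs[i][j] = average of (rating_i - rating_j) over users who rated both;
  // counts[i][j] = how many users co-rated items i and j.
  const devs = {}, counts = {};
  for (const user of Object.values(ratings)) {
    for (const i of Object.keys(user)) {
      devs[i] = devs[i] || {};
      counts[i] = counts[i] || {};
      for (const j of Object.keys(user)) {
        if (i === j) continue;
        devs[i][j] = (devs[i][j] || 0) + user[i] - user[j];
        counts[i][j] = (counts[i][j] || 0) + 1;
      }
    }
  }
  for (const i in devs)
    for (const j in devs[i]) devs[i][j] /= counts[i][j];

  // Prediction: average of (user's rating of j + deviation of item over j),
  // weighted by the number of co-raters (this is the "weighted" variant).
  return function predict(user, item) {
    let num = 0, den = 0;
    for (const j of Object.keys(user)) {
      if (j === item || !(devs[item] && j in devs[item])) continue;
      num += (user[j] + devs[item][j]) * counts[item][j];
      den += counts[item][j];
    }
    return den ? num / den : null; // null when nothing is known about item
  };
}
```

One pass to build a table of average differences, one weighted average to predict: that really is the whole algorithm.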

The current trend in academia is to develop crazy algorithms that require not 10 lines of code, not 100 lines of code, but several thousand. I think the same is true in some industries: think of Web Services or Java (with its infinite number of new acronyms).

Well, I like complex algorithms and, as a math guy, I like a challenge, but once in a while, I think it pays to stop and ask: “wait! what if the average Joe wants to implement this?”

So, if you write real code and are interested in collaborative filtering, go check out this paper.