How Google is just plain better

What is one of the most visited pages on all of my sites? It is “Jolie, petite coquine”, the Web page of our cat. The page was originally designed by my wife back when she had a Web site of her own. The page ranks high on Voila for some sex-related keyword searches, and people arrive at my cat’s page because they are horny. Now, that’s at least 10 hits per day. How come Google doesn’t fall short the same way? I don’t know, but somehow, Google knows my cat isn’t a sex object…

Or is she? Well, check it out for yourself!

Note: in doing research for this post, I found out that dmoz.org actually indexes various sex sites in a taxonomy. But my cat is nowhere to be found.

Living with the fear of failure

Before you start wondering: no, I did not fail at anything today. In fact, my life is going rather smoothly, and while you always get good, bad, and in-between reviews from time to time, all my projects are moving forward better than I had a right to expect.

But like so many people, I’m haunted by the constant fear that I may fail. I was reminded of how hard it is by the pressure some Canadian athletes have reported feeling at the Olympics these days. Constant fear of failure is hard because even if your life is beautiful and you succeed in everything, you are still focused on possible failures. Ok. I’ll admit. I’m a pessimist. Or rather, a realist living in a bleak world.

Why do I fear failure so much? Failure is a neutral or even positive force. In fact, many times when I failed, I’ve actually been glad of the failure and found positive things in it… I don’t know… You might not get into the school you want, but you end up getting into an even better school. You do not get to see the movie you wanted to see, but you get to see an even better movie.

I suspect that there is a little cave man in me who fears he’ll get eaten by a dinosaur (yes, I know, I’ve watched too many Flintstones). Failure might be really bad… like having your feet in a dinosaur’s mouth and expecting the dinosaur to start eating you up.

What I know for certain is that fear of failure is a negative force in most of my life. It distracts me. Pulls me away from my family. Makes me dumber. Takes my eyes off the road and onto the ravine where my car will end up.

If you attend all classes, you pass…

Two profs allegedly got fired because they refused to grade students based on “effort” instead of results. Not that I think that recognizing effort in the grading is such an evil thing… and maybe the policy was even acceptable… Saying that students attending all lectures will pass the course might have its advantages… but the fact that the fellows were fired tells us something about the state of education in North America right now… I think there is clearly a downward spiral as far as the academic level goes. Not that I think it is necessarily bad.

It is a bit troubling in the following way, however. If the Internet is making information more widely available than before, and the university is no longer the holder (and certainly not the creator) of knowledge… I was thinking that universities could still authenticate knowledge: provide proof to someone that you do, in fact, know about archeology. But I forgot that academic levels have been going down in the last 20 years or so. So what will remain?

Someone commented in one of my earlier posts that universities are good at organizing knowledge. Knowledge might be readily available through Google, but it isn’t validated or organized very well. I guess this is true: university professors are pretty good at determining what is sensible knowledge, with the unavoidable mistakes and bias. We are also pretty good at organizing it in a sensible fashion. However, time and time again, studies show that students overwhelmingly enrol in courses and degrees, not to learn, but for the recognition they get. They don’t care so much about the work professors do to organize and validate knowledge. If we lower the academic levels further, could it be that students will just leave universities? I think that if we ever reach the tipping point where corporations lose confidence in the training students receive, and this day is around the corner, we’ll be in trouble.

Most amazing Cringely article ever…

Cringely published an amazing paper on crime in the USA. Turns out that in 1982, a study was paid for by the American Department of Justice. Three people were involved: Michael Block, Fred Nold, and Sandy Lerner. Cringely believes their study showed that the current sentencing guidelines would lead to a poorer, more crime-ridden USA (and they did). The study was “hidden away”. Turns out that Nold killed himself in 1983. Block became a law professor and won’t comment to Cringely about the study. Sandy Lerner went on to found Cisco.

A few things are amazing. The first is the suicide of a researcher who possibly felt like a loser. It reminds me of Wallace Carothers, who invented Nylon. It is unclear to me how you can feel like a loser after inventing Nylon, but apparently someone did. The second is that the USA knows and knew that they were headed for a crime-ridden society and they went ahead anyhow. Why? I can’t figure it out. Lastly, there is the little detail that the statistician on the study, Sandy Lerner, founded Cisco. This is an interesting contrast with the other fellow who killed himself.

A Theory of Strongly Semantic Information

Thanks to my colleague Jean Robillard, I found out that philosophers do Knowledge Management too! Following a request I made, Jean suggested I read “Outline of a Theory of Strongly Semantic Information” by L. Floridi.

Of course, I’m a naïve reader, but still. I think I grasped some very important things.

He starts out by asking: how much information is there in a statement? Well, in a finite discrete world (the realm where Floridi appears to live), you can reasonably define “information content” in terms of how many possibilities the statement rules out. For example, suppose my world is made of two balls, each of which can be either red or blue, so my world has 4 possible states. If I say that “ball 1 is blue”, there are only 2 possibilities left (ball 2 is either red or blue), so I could say that I’ve ruled out 2 possibilities and so my information content is 2. If I say “both balls are blue”, only 1 possibility is left, so my information content is 3. You can see right away that a self-contradictory statement (“ball 1 is blue, both balls are red”) rules out all 4 possibilities, so it has maximal information content. A tautology (“ball 1 is either blue or red”) has 0 information content. Floridi is annoyed by the fact that a self-contradictory statement has maximal information content.
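The counting argument is easy to check by brute force. Here is a minimal sketch of it in Python, assuming the two-ball world described above; the names `WORLDS` and `information_content` are mine, not Floridi’s, and each statement is modelled as a plain function that says whether it holds in a given world.

```python
from itertools import product

COLORS = ("red", "blue")
# Every possible world is a pair: (color of ball 1, color of ball 2).
WORLDS = list(product(COLORS, repeat=2))


def information_content(statement):
    """Count the worlds that the statement rules out (where it is false)."""
    return sum(1 for world in WORLDS if not statement(world))


# Statements from the post, written as predicates over a world.
ball1_blue = lambda w: w[0] == "blue"
both_blue = lambda w: w == ("blue", "blue")
contradiction = lambda w: w[0] == "blue" and w == ("red", "red")
tautology = lambda w: w[0] in COLORS

for name, stmt in [
    ("ball 1 is blue", ball1_blue),
    ("both balls are blue", both_blue),
    ("ball 1 is blue, both balls are red", contradiction),
    ("ball 1 is either blue or red", tautology),
]:
    print(f"{name!r} rules out {information_content(stmt)} of {len(WORLDS)} worlds")
```

Enumerating the worlds this way makes the oddity plain: the contradiction is the only statement that rules out every world, so under this naive measure it scores highest.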

In section 5, he points out that statements are not only either true or false, but they have a degree of discrepancy. So, for example, I can say that I have some balls. This is a true statement, but with high discrepancy. However, I can say that I have 3 balls when in fact I have 2 balls and while false, this is a statement with lower discrepancy, and maybe a more useful statement. Apparently, he borrows this idea from Popper, but no doubt this is not a new idea.

He comes up with conditions on a possible measure of discrepancy between -1 and 1. -1 means that the statement is totally false and matches no possible situation (“I have 2 and 3 balls”), 0 means that you have a very precise and true statement (“I have 2 balls”), and 1 means that you have a true, but maximally vague statement (“I have some number of balls”). What he is getting at is that both extremes (-1 and 1) are equally useless, but that things near zero are equally useful (either false or true). Let’s call this value upsilon.

Then, he defines the degree of informativeness as 1-upsilon^2.

This solves the problem we had before. The statement “ball 1 is blue, both balls are red” will now have an upsilon value somewhere between -1 and 0, so it will have some degree of informativeness, but nothing close to the maximal. The statement “ball 2 is either red or blue” will have upsilon = 1 and so will have a degree of informativeness of 0. Finally, “ball 1 is blue” will have upsilon positive but less than 1, and possibly close to 0, so that it will have a good degree of informativeness.
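To see the shape of the measure, here is a small sketch of the formula 1-upsilon^2 in Python. The function name `informativeness` is my own, and the upsilon values fed to it are purely illustrative: I do not reproduce how Floridi actually computes upsilon for a given statement.

```python
def informativeness(upsilon):
    """Degree of informativeness, 1 - upsilon^2, for upsilon in [-1, 1]."""
    if not -1.0 <= upsilon <= 1.0:
        raise ValueError("upsilon must lie between -1 and 1")
    return 1.0 - upsilon ** 2


# Illustrative upsilon values only (assumptions, not taken from the paper):
# -1 = totally false, 0 = precise and true, 1 = true but maximally vague.
for upsilon in (-1.0, -0.5, 0.0, 0.5, 1.0):
    print(f"upsilon = {upsilon:+.1f} -> informativeness = {informativeness(upsilon):.2f}")
```

Because upsilon is squared, a slightly false statement and a slightly vague one at the same distance from zero get the same score, and both extremes drop to zero.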

That’s what I got out of it for now.

Journal of Algorithms is no longer accepting submissions

We just submitted an article to the Journal of Algorithms and we were told that starting in 2003, the editors have stopped accepting papers. One alternative appears to be ACM Transactions on Algorithms.

It seems like the entire board of the Journal of Algorithms had resigned some time ago. I had no idea that Elsevier and other big publishers were in such troubles. I had heard about the Journal of Machine Learning

It feels like soon, all the big journals will have moved to an open or semi-open setup. Very scary for big publishers. Very scary. Yes, they’ve been making ever larger profits, but it may all come down to a stop really soon. Tipping point coming!

Anonymous Academic Bloggers

Ernie’s 3D Pancakes has a post on anonymous academic bloggers. To me, this is an interesting question. I use my own name everywhere on this blog. You can easily figure out where I work, what I teach and to whom, where I publish and so on. You can even find who my son is and so on. I think that Jeff correctly points out that feeling you need to be anonymous is probably misguided. The likelihood that a colleague is going to come to my blog, read it, be insulted, and try to hurt me on the job, is very, very slim. One reason for that is that I would never bad-mouth a colleague on my blog: it just wouldn’t be fun and interesting for my target audience. The likelihood that a reviewer of a paper I submitted would come to my blog and be insulted and reject my paper is also very slim. However, reviewers have many more reasons to wrongly reject a paper and if you start worrying about this sort of thing, you are not out of the woods!

So, I use my own name. There.

23% Fewer Computer Science Majors This Year!

Slashdot reports on a USA Today article saying that there are fewer Computer Science majors. They cite a 23% decline in enrollment in North America. Here’s one comment about the article:

Most engineering schools are reporting declines in enrollment. This is hardly surprising since most engineering curriculums, including CS, are difficult compared to other fields of study. Without the prospect of a good job waiting for them, many college students are veering away from these majors.

Update: Yuhong correctly points out that this is mostly at the undergraduate level. Graduate schools are finding enough students, at least according to Yuhong. I think this is expected: if job prospects are bad, people won’t enter the system, but once they’ve entered it, they will stay in it longer if jobs are scarce.

Cool RDF tools

RDF is everywhere it seems: from Dublin Core to RSS, all the way to FOAF… However, it can be quite painful to parse. Cool tools are starting to emerge, but Google is not yet very good at finding them.

Suppose you have an RDF/XML representation and you want the triples… go to W3C RDF Validation Service and it will do it nicely for you.

On the other hand, the form on this page allows you to go from N3 (the user friendly RDF syntax) to RDF/XML.
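To give a feel for what “getting the triples” out of RDF/XML means, here is a rough sketch using only Python’s standard library. It is not how the W3C service works and it is nowhere near a real RDF parser: it only handles flat `rdf:Description` elements with an `rdf:about` attribute, and the function name `simple_triples`, the sample document, and the example.org URIs are all made up for illustration.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical sample document (example.org URIs are placeholders).
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://example.org/post">
    <dc:title>Cool RDF tools</dc:title>
    <dc:relation rdf:resource="http://example.org/other"/>
  </rdf:Description>
</rdf:RDF>"""


def simple_triples(rdf_xml):
    """Extract (subject, predicate, object) from flat rdf:Description blocks.

    Assumption: every subject uses rdf:about and every property is either
    a plain literal or an rdf:resource reference. Anything fancier (blank
    nodes, typed nodes, nesting, containers) is ignored by this sketch.
    """
    root = ET.fromstring(rdf_xml)
    triples = []
    for node in root.findall(f"{{{RDF_NS}}}Description"):
        subject = node.get(f"{{{RDF_NS}}}about")
        for prop in node:
            # ElementTree writes tags as "{namespace}local"; gluing the two
            # parts together gives the predicate URI.
            predicate = prop.tag[1:].replace("}", "", 1)
            obj = prop.get(f"{{{RDF_NS}}}resource")
            if obj is None:
                obj = (prop.text or "").strip()
            triples.append((subject, predicate, obj))
    return triples


for triple in simple_triples(SAMPLE):
    print(triple)
```

Even this toy version shows why RDF/XML is painful: the same triples can be written in many different XML shapes, and this sketch only understands one of them.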