A Theory of Strongly Semantic Information

Thanks to my colleague Jean Robillard, I found out that philosophers do Knowledge Management too! Following a request I made, Jean suggested I read an Outline of a Theory of Strongly Semantic Information by L. Floridi.

He starts out by asking: how much information is there in a statement? Well, in a finite discrete world (the realm where Floridi appears to live), you can reasonably define “information content” in terms of how many possibilities the statement rules out. For example, suppose my world is made of two balls, each of which can be either red or blue, so my world has 4 possible states. If I say that “ball 1 is blue”, there are only 2 possibilities left (ball 2 is either red or blue), so I could say that I’ve ruled out 2 possibilities and my information content is 2. If I say “both balls are blue”, only 1 possibility is left, so my information content is 3. You can see right away that a self-contradictory statement (“ball 1 is blue, both balls are red”) rules out all 4 possibilities, so it has maximal information content. A tautology (“ball 1 is either blue or red”) has 0 information content. Floridi is annoyed by the fact that a self-contradictory statement has maximal information content.
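Here is a minimal sketch of that counting argument, assuming the two-ball world described above. The names `WORLDS` and `information_content` are mine, not Floridi’s; the code just enumerates the 4 states and counts how many each statement rules out. Note that “both balls are blue” leaves exactly one world standing, so it rules out 3.

```python
from itertools import product

COLOURS = ("red", "blue")

# Every possible world: a (ball 1, ball 2) colour pair, 4 states in total.
WORLDS = list(product(COLOURS, repeat=2))


def information_content(statement):
    """Count the worlds that the statement rules out (hypothetical helper)."""
    return sum(1 for world in WORLDS if not statement(world))


def ball1_blue(world):
    return world[0] == "blue"


def both_blue(world):
    return world[0] == "blue" and world[1] == "blue"


def contradiction(world):
    # "ball 1 is blue, both balls are red": no world can satisfy this.
    return world[0] == "blue" and world[0] == "red" and world[1] == "red"


def tautology(world):
    # "ball 1 is either blue or red": every world satisfies this.
    return world[0] in COLOURS


print(information_content(ball1_blue))     # 2
print(information_content(both_blue))      # 3
print(information_content(contradiction))  # 4
print(information_content(tautology))      # 0
```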

In section 5, he points out that statements are not only either true or false, but they have a degree of discrepancy. So, for example, I can say that I have some balls. This is a true statement, but with high discrepancy. However, I can say that I have 3 balls when in fact I have 2 balls and while false, this is a statement with lower discrepancy, and maybe a more useful statement. Apparently, he borrows this idea from Popper, but no doubt this is not a new idea.

He comes up with conditions on a possible measure of discrepancy between -1 and 1. -1 means that the statement is totally false and matches no possible situation (“I have 2 and 3 balls”), 0 means that you have a very precise and true statement (“I have 2 balls”), and 1 means that you have a true, but maximally vague statement (“I have some number of balls”). What he is getting at is that both extremes (-1 and 1) are equally useless, but that statements near zero are about equally useful, whether they are false or true. Let’s call this value upsilon.

Then, he defines the degree of informativeness as 1-upsilon^2.

This solves the problem we had before. The statement “ball 1 is blue, both balls are red” will now have an upsilon value somewhere between -1 and 0, so it will have some degree of informativeness, but nothing close to the maximum. The statement “ball 2 is either red or blue” will have upsilon = 1 and so will have a degree of informativeness of 0. Finally, “ball 1 is blue” will have upsilon positive but less than 1, and possibly close to 0, so that it will have a good degree of informativeness.
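To see how the formula behaves, here is a small sketch of 1-upsilon^2. The function name `informativeness` and the sample upsilon values are my own illustration; the paper’s conditions constrain upsilon but I am not reproducing how it is actually assigned to a statement.

```python
def informativeness(upsilon):
    """Degree of informativeness, 1 - upsilon^2, for upsilon in [-1, 1]."""
    if not -1.0 <= upsilon <= 1.0:
        raise ValueError("upsilon must lie between -1 and 1")
    return 1.0 - upsilon ** 2


# Illustrative values only: the two extremes, the precise truth, and
# a mildly vague truth and a mildly false statement.
for upsilon in (-1.0, -0.5, 0.0, 0.5, 1.0):
    print(upsilon, informativeness(upsilon))
```

Both extremes give 0, the precise true statement (upsilon = 0) gives 1, and the measure is symmetric, so a slightly false statement and a slightly vague one score the same.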

Journal of Algorithms is no longer accepting submissions

We just submitted an article to the Journal of Algorithms and we were told that starting in 2003, the editors have stopped accepting papers. One alternative appears to be ACM Transactions on Algorithms.

It seems that the entire board of the Journal of Algorithms resigned some time ago. I had no idea that Elsevier and other big publishers were in such trouble. I had heard about the Journal of Machine Learning

It feels like soon, all the big journals will have moved to an open or semi-open setup. Very scary for big publishers. Very scary. Yes, they’ve been making ever larger profits, but it may all come to a stop really soon. Tipping point coming!

Anonymous Academic Bloggers

Ernie’s 3D Pancakes has a post on anonymous academic bloggers. To me, this is an interesting question. I use my own name everywhere on this blog. You can easily figure out where I work, what I teach and to whom, where I publish and so on. You can even find who my son is and so on. I think that Jeff correctly points out that feeling you need to be anonymous is probably misguided. The likelihood that a colleague is going to come to my blog, read it, be insulted, and try to hurt me on the job, is very, very slim. One reason for that is that I would never bad mouth a colleague on my blog: it just wouldn’t be fun and interesting for my target audience. The likelihood that a reviewer of a paper I submitted would come on my blog and be insulted and reject my paper is also very slim. However, reviewers have many more reasons to wrongly reject a paper and if you start worrying about this sort of thing, you are not out of the woods!

So, I use my own name. There.

23% Fewer Computer Science Majors This Year!

Slashdot reports on a USA Today article saying that there are fewer Computer Science Majors. They cite a 23% decline in enrollment in North America. Here’s one comment about the article:

Most engineering schools are reporting declines in enrollment. This is hardly surprising since most engineering curriculums, including CS, are difficult compared to other fields of study. Without the prospect of a good job waiting for them, many college students are veering away from these majors.

Update: Yuhong correctly points out that this is mostly at the undergraduate level. Graduate schools are finding enough students, at least according to Yuhong. I think this is expected: if job prospects are bad, people won’t enter the system, but once they’ve entered it, they will stay in it longer if jobs are scarce.

Cool RDF tools

RDF is everywhere it seems: from Dublin Core to RSS, all the way to FOAF… However, it can be quite painful to parse. Cool tools are starting to emerge, but Google is not yet very good at finding them.

Suppose you have an RDF/XML representation and you want the triples… go to W3C RDF Validation Service and it will do it nicely for you.

On the other hand, the form on this page allows you to go from N3 (the user friendly RDF syntax) to RDF/XML.
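To give a feel for what “getting the triples” out of RDF/XML means, here is a rough sketch using only Python’s standard XML parser. It is not a real RDF parser: it assumes the simplest possible layout (top-level rdf:Description elements with an rdf:about attribute, whose children are either literals or rdf:resource links), and the sample document and the `simple_triples` name are made up for illustration. Anything fancier (blank nodes, nested descriptions, typed nodes) needs a proper tool such as the validation service.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# A made-up sample document using Dublin Core properties.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://example.org/post">
    <dc:title>Cool RDF tools</dc:title>
    <dc:creator rdf:resource="http://example.org/author"/>
  </rdf:Description>
</rdf:RDF>"""


def simple_triples(rdf_xml):
    """Extract (subject, predicate, object) from the simplest RDF/XML layout."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for node in root.findall(f"{{{RDF_NS}}}Description"):
        subject = node.get(f"{{{RDF_NS}}}about")
        for prop in node:
            # ElementTree writes tags as "{namespace}local"; join the two
            # parts to get the full predicate URI.
            predicate = prop.tag[1:].replace("}", "")
            obj = prop.get(f"{{{RDF_NS}}}resource")
            if obj is None:
                obj = (prop.text or "").strip()
            triples.append((subject, predicate, obj))
    return triples


for triple in simple_triples(SAMPLE):
    print(triple)
```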

How to be creative

Through Downes’, I found this great post about how to be creative. HOWTOs are always interesting and sell magazines, but they are somewhat more interesting in the blogosphere because someone you can get to know put his heart into it.

  • Ignore everybody
  • Creativity is its own reward
  • Put the hours in
  • If your biz plan depends on you suddenly being “discovered” by some big shot, your plan will probably fail
  • You are responsible for your own experience
  • Everyone is born creative; everyone is given a box of crayons in kindergarten
  • Keep your day job
  • Companies that squelch creativity can no longer compete with companies that champion creativity
  • Everybody has their own private Mount Everest they were put on this earth to climb
  • The more talented somebody is, the less they need the props
  • Don’t try to stand out from the crowd; avoid crowds altogether
  • If you accept the pain, it cannot hurt you
  • Never compare your inside with somebody else’s outside

One of the most interesting ones is number 5: Nobody can tell you if what you’re doing is good, meaningful or worthwhile. The more compelling the path, the more lonely it is.

Of course, I don’t buy all of it. Being extremely lonely is no way to be creative I think. Nobody gets awfully creative at the bottom of a cave. I do think you have to look for others. The strength of your network is key because it multiplies your own brain power. I guess we go back to Emerson’s independence of solitude. Be in a network, be in a crowd, but do not be a mere node in the crowd, be your own node. It does require courage though, and you have to expect to fail, fail badly even.

Great Hackers

Paul Graham wrote an essay called ‘Great Hackers’. I’m pretentious enough to call myself a hacker (though I do not claim to be great), so I had to jump on it!

Here are some juicy quotes…

Good hackers find it unbearable to use bad tools. They’ll simply refuse to work on projects with the wrong infrastructure.

Great hackers also generally insist on using open source software. Not just because it’s better, but because it gives them more control.

They [great hackers] work in cosy, neighborhoody places with people around and somewhere to walk when they need to mull something over, instead of in glass boxes set in acres of parking lots.

There’s no way around it: you can’t manage a process intended to produce beautiful things without knowing what beautiful is.

And this is the reason that high-tech areas only happen around universities. The active ingredient here is not so much the professors as the students. Startups grow up around universities because universities bring together promising young people and make them work on the same projects. The smart ones learn who the other smart ones are, and together they cook up new projects of their own.

If you’re worried that your current job is rotting your brain, it probably is.

A megabyte is a mebibyte, and a kilobyte is a kibibyte

If you’ve been annoyed about the fact that a kilobyte has 1024 bytes and not 1000 bytes, well, you were right all along! What people call a kilobyte is really a kibibyte. (Thanks to Owen for pointing it out to me!)

Examples and comparisons with SI prefixes
one kibibit  1 Kibit = 2^10 bit = 1024 bit
one kilobit  1 kbit = 10^3 bit = 1000 bit
one mebibyte  1 MiB = 2^20 B = 1 048 576 B
one megabyte  1 MB = 10^6 B = 1 000 000 B
one gibibyte  1 GiB = 2^30 B = 1 073 741 824 B
one gigabyte  1 GB = 10^9 B = 1 000 000 000 B

Source: Definitions of the SI units: The binary prefixes
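If you want to check the numbers yourself, here is a tiny sketch. The dictionaries and the `to_bytes` helper are just my own illustration of the two families of prefixes, not anything from the source above.

```python
# Binary (IEC) prefixes are powers of 2; decimal (SI) prefixes are powers of 10.
BINARY = {"KiB": 2**10, "MiB": 2**20, "GiB": 2**30}
DECIMAL = {"kB": 10**3, "MB": 10**6, "GB": 10**9}


def to_bytes(value, unit):
    """Convert a quantity in the given unit to bytes (hypothetical helper)."""
    factors = {**BINARY, **DECIMAL}
    return value * factors[unit]


print(to_bytes(1, "MiB"))                      # 1048576
print(to_bytes(1, "MB"))                       # 1000000
print(to_bytes(1, "GiB") - to_bytes(1, "GB"))  # 73741824
```

The gap grows with the prefix: a gibibyte is already about 7% bigger than a gigabyte.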

Michael Nielsen: Principles of Effective Research

Michael just finished his essay: Principles of Effective Research. I think it is a must read for all Ph.D. students, young researchers, and even idiots like me who always get it wrong. Michael takes a very refreshing view of what research is all about. He is not cynical yet he is true to what research really is. You may never win the Nobel prize if you follow his guidelines, you may never be a guru researcher, but I think you’ll be a good or even excellent researcher. As he explains, being an influential researcher is not a subset of being a good researcher, and that’s a very important statement. In any case, Michael did all of us a favor and I hope that his essay is read by a lot of people. (Power of the network?) I implore you all: link to his essay!!!

Collaborative Filtering Java Learning Objects

Through Downes’, I found an interesting paper on the application of collaborative filtering to e-Learning in ITDL (by Jinan A. W. Fiaidhi).

It makes the point quite well that we must differentiate heterogeneous settings from sane laboratory conditions:

Searching for LOs within heterogeneous repositories as well as within collaborative repositories is far more complicated problem. In searching for such LOs we must first decide on appropriate metadata schema, but which one!

The Three Dijkstra Rules for Successful Scientific Research

Through Didier and Nielsen, I found a list of Golden Rules for Successful Scientific Research attributed to Dijkstra.

  • “Raise your quality standards as high as you can live with, avoid wasting your time on routine problems, and always try to work as closely as possible at the boundary of your abilities. Do this, because it is the only way of discovering how that boundary should be moved forward.”
  • “We all like our work to be socially relevant and scientifically sound. If we can find a topic satisfying both desires, we are lucky; if the two targets are in conflict with each other, let the requirement of scientific soundness prevail.”
  • “Never tackle a problem of which you can be pretty sure that (now or in the near future) it will be tackled by others who are, in relation to that problem, at least as competent and well-equipped as you.”

Of the three rules, only the last one seems important. The second one appears self-evident: you want to be socially relevant, but not to the point of producing low quality work. This being said, most researchers go to the other extreme and ignore social relevance, and their work loses its motivation. If you tackle a problem that only you care about, don’t expect much recognition. I actually disagree with the first rule: small problems and technical issues often hide interesting problems. Always focusing on the management and top level issues is a bad idea I think. Michelangelo was painting a church! In research, do not be so quick to think that there are noble and not-so-noble problems. All problems can be interesting and knowledge of technical issues can bring much insight.

Nielsen’s Extreme Thinking

Blogging is a fascinating pastime. Who would have thought? I just read bits and pieces of an essay on Extreme Thinking.

Here’s a fascinating quote:

The key to keeping this independence of solitude is to develop a long-term vision so compelling and well-internalized, that it can override behaviours for which the short-term rewards are significant, but which may be damaging in the long run.

Update: Independence of solitude: I didn’t know this expression. Found 600 or so hits on Google. Seems that maybe the expression comes from Ralph Waldo Emerson.

What I must do is all that concerns me, not what the people think. This rule, equally arduous in actual and intellectual life, may serve for the whole distinction between greatness and meanness. It is the harder, because you will always find those who think they know what is your duty better than you know it. It is easy in the world to live after the world’s opinion; it is easy in solitude to live after our own; but the great person is one who in the midst of the crowd keeps with perfect sweetness the independence of solitude.

Michael Nielsen: Principles of Effective Research: Part VII

Didier reminded me to check Nielsen’s last post on Principles of Effective Research. I take a quote out of it…

The foundation is a plan for the development of research strengths. What are you interested in? Given your interests, what are you going to try to learn? The plan needs to be driven by your research goals, but should balance short-term and long-term considerations. Some time should be spent on things that appear very likely to lead to short-term research payoff. Equally well, some time needs to be allocated to the development of strengths that may not have much immediate pay-off, but over the longer-term will have a considerable payoff.

This is a refreshing view.

Freedom in networked research: what does it mean?

When I started out as a researcher, as a young Ph.D. student, I thought research was about “having ideas”. Then, it occurred to me that it was about “having ideas and ‘selling’ them” because “having ideas” is easy and too many people have too many ideas already. But marketing experts sell ideas all the time… surely, they don’t do “research”. Then, I changed my mind and decided research was about “taking ideas, validating them, putting them in practice, and building tools out of them” where “tools” is to be interpreted in a very wide sense. Turns out it is not a bad definition of what research is. But the part about “taking ideas and validating them” is a networking problem. Where do your ideas come from, how do you know how good they are? Ultimately, “validating” an idea means putting it in front of a community and getting the community to say “this is a good idea”. “Validating” is not the same as selling, though it might be hard to tell what a person is really trying to do.

But to be blunt, I don’t yet have a satisfying definition of what “research” is and I’m not looking very hard… though, networking is a necessary condition for sure. Scientists on desert islands without telecommunication can’t do research. That’s the part that I did not understand until a few years after my Ph.D. Well, maybe I’m hard on myself, maybe I understood it on the surface, but I didn’t internalize it until much later.

Michael Nielsen pointed me to an interesting Web page very useful for Ph.D. students and novice researchers: Networking on the Network.

In Networking on the Network, Philip E. Agre accurately describes the world of research as a network. A network isn’t good or bad… so, some nodes will suck energy out of the network, and others will contribute much to it. The network is somewhat self-regulating, but it is possible, nevertheless, for bad leaders to emerge… He has this to say about the relationship between students and supervisor which I find rings very true:

It is good to be powerful, but only in the correct sense of the term. People with the right kind of power, in my view, do not need to manipulate or control others. To the contrary, they are (sic) know that they are well-served when others grow and find their own directions, so they happily support everyone in their growth. They don’t take responsibility for others’ growth, which is a different question. They speak to the healthy part of a person, and they are concerned to draw out and articulate the brilliant ideas and worthy vision that lie beneath the surface of whatever anyone is saying. For example, they don’t try to enroll students as acolytes in their empire-building strategies, but honestly ask what’s best for each student’s own development, confident that their knowledge, vision, and connections will have an important influence on the student’s development in any case.

As you can see, he talks a lot about “Empire building”. Indeed, because research is all about networking, to a large extent, one can build an empire out of thin air, with no substance.

It seems you can either build an empire for the purpose of building an empire, because that’s your definition of success, or else, you can aim to remain “free”. That’s a very powerful idea:

You build networks around the issues you care about, you grow and change through the relationships that result, you articulate the themes that are emerging in the community’s work, and through community-building and leadership you get the resources to do the things that you most care about doing. It’s true that this method will never give you arbitrary power. But the desire for arbitrary power is not freedom — it is a particularly abject form of slavery. If you can let go of preconceived plans then you are free: you can choose whom to associate with, and as you build your network you multiply the further directions that you can choose to go. You also multiply the unexpected opportunities that open up, the places you can turn for assistance with your projects, the flows of useful information that keep you in contact with reality, the surveillance of the horizon that keeps you from getting cornered by unanticipated developments, and the public persona that ensures that people keep coming to you with offers that you can take or leave. That is what freedom is, and it is yours if you will do the work.

I give Agre a lot of credit for bringing in the concept of “freedom” in research. University professors will often talk about “academic freedom”. I think that freedom in research is a stronger form of freedom. You can have “academic freedom” but be a slave to the “publish-or-perish” paradigm for the power it brings you. Or else, you can “do the work”, that is, do your research as a network node, and leverage the strength of the network to make the research you want to do anyhow, much better, much stronger.

Michael Nielsen: Principles of Effective Research: Part IV

I’ve been reading Michael Nielsen’s Principles of Effective Research; he is up to Part IV now.

He makes a very important point about research. When I started out doing research, I thought that research was about sitting in your office thinking up new ideas. God! Was I wrong!

Now, don’t get me wrong, research is not about having meetings with other researchers or spending time chatting, or drawing UML diagrams of what is to be done, or spending weeks on funding proposals. We might do these things, but they don’t make us good researchers. But neither will sitting in your office thinking new ideas. That’s not effective research.

On quasi-desert islands with no telecommunications, you’ll find very few great researchers. The social network doesn’t need to be immediate: I think you can be a great researcher even in a tiny school. And I don’t think your network should be made of students mostly, especially not your own students.

I believe the secret to being a good researcher is to belong to a tightly knit group of solid researchers. Research is about networking. By tightly knit, I don’t necessarily mean “military-like”: I mean that you feel peer pressure all the time to do good research. This can be achieved through emails, blogging, phone… whatever the means…

A must read paper in the Chronicle

A must read paper in the Chronicle: Is There a Science Crisis? Maybe Not. The paper is about the oversupply of graduate students in science, which is brought about by universities who have a vested interest in producing more and more science Ph.D.s but don’t necessarily need to adjust to the job market.

It brings back memories. At the end of the eighties, they were predicting a severe shortage of science Ph.D.s. As it turns out, it was totally false, and the paper documents very well the fact that life after a science Ph.D. has gotten tremendously worse and that there is clearly an ever increasing number of science Ph.D.s with fewer and fewer jobs.

The truth is that universities are being irresponsible (and so are professors). Training highly specialized students who know how to solve one type of technical problem has no value for society. Whatever you do, train students to have a wide range of skills. This means that we need to reduce drastically the number of science Ph.D.s and focus on well-rounded students.

I’m convinced governments will soon wake up and stop listening to universities. They’ll be forced soon to look at the numbers and figure out that generously paying universities to produce more science Ph.D.s is a waste of taxpayer money.

Some beautiful quotes:

An editorial in Science this year argued: “We’ve arranged to produce more knowledge workers than we can employ, creating a labor-excess economy that keeps labor costs down and productivity high. Maybe we keep doing this because in our heart of hearts, we really prefer it this way.”

Mr. Freeman, like other economists, looks to dollars to make sense of the trends among graduate students. “They’re not studying science,” he says, “because they look and say, ‘Do I want to be a postdoc paid $35,000 or $40,000 at age 35, with extreme uncertainty working in somebody else’s lab, and maybe getting credit for my work and maybe not getting full credit? Or would I rather be an M.B.A. and making $150,000 and hiring Ph.D.’s?’”

With wages stagnant and too few jobs for engineers, adding to the work force will only make those careers less attractive, says one of the authors, George F. McClure, a retired aerospace engineer who studies employment issues for the Institute of Electrical and Electronics Engineers. “The problem is that everybody has focused on the supply side, and very few have focused on the demand side,” he says. “People in colleges and universities are concerned with maintaining the pipeline and throughput.”

In a case study, Ms. Stephan, the Georgia State economist, has analyzed the growth of the bioinformatics field, generally regarded as one of the hottest areas in science. The number of degree programs blossomed from 21 in 1999 to 74 in 2003. “There’s been a tremendous increase in the number of students in these programs,” she says. But, she adds, “we also track job announcements in bioinformatics, and they’ve been declining.” She sees parallels to other leading fields. “Everybody is talking right now that there’ll be lots and lots of jobs in nanotechnology,” she says. “I’ve not seen a convincing case that that is happening, or that it will happen.”

Open Learning Initiative at Carnegie Mellon

The people at Carnegie Mellon seem to have the right idea: their Open Learning Initiative lets you browse the content of their courses right here and now. It seems to be a driving point: they want to make courses free for individuals, and low cost for institutions. By building communities around their courses, I guess they hope to make these same courses the de facto standards. It could be very powerful and could make some other online schools obsolete.

A primary objective of the project is to build a community of use for the courses that will play an important role in ongoing course development and improvement. The courses are developed in a modular fashion to allow faculty at a variety of institutions to either deliver the courses as designed or to modify the content and sequence to fit the needs of their students and/or their curricular and course goals. These courses will be broadly disseminated at no cost to individual students and at low cost to institutions.

Students as Colleagues for Professors

Stephen gave an interview to Clientology about eLearning.

Some of his statements should be scaring the h*ll out of some people in universities and elsewhere. First, he points out that gatekeepers are slowly losing their power:

It is important to recall how much of our culture – including political culture, economic culture, educational culture — has been shaped by ‘gatekeepers’, elites who, because of their knowledge and position, are the sole arbiters of what we will read, buy or learn. This gatekeeping function has already been disintermediated; new people — what Robin Good calls the ‘newsmaster’ are taking their place, and the result is a much more balanced exchange.

This is so obvious in my daily life and was brought on by the Internet and excellent tools like Google. It actually links with a previous post I wrote this week: non-tech natives are the gatekeeper generation. The new tech natives won’t see a purpose for these gatekeepers who kept their knowledge close and their power even closer. Information and knowledge are always changing and flowing so that controlling knowledge doesn’t make sense anymore. In a very deep way, we’ve become a dynamic society. This is not just class mobility, that is, the ability for a large segment of the population to acquire some key knowledge and then have a shot at becoming a gatekeeper. No. I think that the notion of gatekeeper itself is failing.

This has very concrete consequences in universities:

In education, the result is the gradual erosion of the power relationship that existed between student and professor. In some senses, we see this already by the designation by many of the student as ‘customer’ rather than, say, apprentice. But it’s deeper than that, and we will see eventually the designation of student as ‘colleague’ — and in an important sense, it will not be possible to distinguish between student and professor online.

I couldn’t agree more. Students do not depend on their professors the same way they used to. Professors won’t be able to hold their status as gatekeepers for very long now. Not when anyone on the planet can quickly become an expert in almost any field just by reading up on the Internet.

I think university professors will remain marketing tools. First, they will serve to sell training, which they have always done anyhow, and second, they will be used to authenticate knowledge, which will become an increasingly important task. Students won’t look so much for knowledge as for recognition, and professors who can bring students some recognition will be in high demand.

Eating poutine in Montréal

Through Seb, I found Idle Words. The guy is moving to Montreal and just discovered Poutine. If you know what poutine is, you’ve got to go read his post on his first Poutine experience: hilarious.

Actually, the last poutine I ate was one I made myself. Well, at least, I made the French fries myself. It was pretty good, but it is a bad meal if you plan to do any work in the 2-3 days after eating it and you are past thirty.

Here’s my recipe for French fries. Buy lots of olive oil. Cut some Yukon gold potatoes (the yellow type) and let the cut potatoes soak in water for a few minutes, then dry them. Heat up the oil over the stove, make sure the oil is very hot, but also make sure you don’t overheat it (if there is smoke, it has become too hot). Then, using only a small amount of potatoes each time, drop the potatoes in the oil. Be careful not to burn yourself! You must make sure you don’t put in all of the potatoes at once, otherwise you will drop the temperature of the oil too low and you’ll produce greasy French fries. That’s essentially it.

You can use these hand-made French fries to seduce a girl (or a guy, I suppose).