There is a special issue of JIME on the Semantic Web for Education (as in “Learning Objects”). I picked it up from one of Downes’ recent posts.

Not only is the issue interesting, from what I could tell, it is a journal that gets it. First of all, reviews are on-line, for all to see. Don’t get me wrong: I don’t mind peer review. I cherish it. But I’ve gotten too many poorly written, poorly prepared reviews. I really wish reviews would go on-line when the paper is published. This way, it might give an incentive to the reviewer.

Plus, if a poor paper gets accepted, you can trace back the reasons why it was accepted…

It still doesn’t help if your paper gets rejected for bad reasons, but in such instances, you can go elsewhere with your paper. At the very least, you know that if the paper makes it, you’ll have an exciting review to go with it. Useful for you and the reader.

Update: other blogs have picked up this issue of JIME.

I’ve wasted a considerable amount of time in the last two days upgrading my RSS aggregator so that it will have better support for Atom feeds. I use the feedparser library.

One thing that gets to me is how unintuitive unicode is under Python. For example, the following is a string…

t="éee"

Just copy this into your Python interpreter, and it will work nicely. For example,


>>> t='éee'
>>> print t
éee

However, for some reason, if I just type “t”, then it can’t print it properly…

>>> t
'\xe9ee'

See how it is already confusing? (And we haven’t used unicode yet!)

Next, we can map this string to unicode…

r=unicode(t)

which has the following result…

>>> r=unicode(t)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe9 in position 0: ordinal not in range(128)

Ah… so it tries to interpret t as ascii… fair enough, we know it is “latin-1” or “iso8859-1”. It is already quite strange that “print” knows what to do with my string, but nothing else in Python seems to know… so we do


>>> r=unicode(t,'latin-1')
>>> r
u'\xe9ee'
>>> print r
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 0: ordinal not in range(128)

because, see, you can’t print unicode to the screen… but you can do the following…


>>> print r.encode('latin-1')
éee
>>> print r.encode('iso-8859-1')
éee

but also


>>> r.encode('latin-1')
'\xe9ee'
>>> r.encode('iso-8859-1')
'\xe9ee'
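
To keep encode and decode straight, it may help to see the whole round trip in one place. What follows is only a minimal sketch and not part of the session above: the names raw, text, back and utf8 are made up for illustration, and it assumes an interpreter that accepts the b'...' and u'...' literal prefixes, so that the bytes and the unicode text are always told apart explicitly.

```python
# Round trip between bytes and unicode, with the encoding always explicit.
raw = b'\xe9ee'                   # three bytes: e-acute, e, e in latin-1

# decode: bytes -> unicode (you say which encoding the bytes are in)
text = raw.decode('latin-1')

# encode: unicode -> bytes (you say which encoding you want out)
back = text.encode('latin-1')

# Decoding as ascii fails, because byte 0xe9 is outside range(128).
try:
    raw.decode('ascii')
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# The same text needs one more byte when encoded as UTF-8.
utf8 = text.encode('utf-8')
```

The rule of thumb this suggests: decode goes from bytes to unicode, encode goes from unicode to bytes, and both need the encoding spelled out.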

What is my beef?

  • If ‘print’ assumes ‘latin-1’ then shouldn’t everything else? Why is this not consistent? If it is unsafe to assume ‘latin-1’, then why does print do it?
  • The encode, decode thing is a mess. We had a perfectly valid construct for converting things to strings, and that’s ‘str’. Now, we have a new one called ‘encode’. So that, given some unicode, I can do either t.encode(‘ascii’) or str(t) for the same result. Bad. Now, I’m stuck forever in a world where I have to figure out whether I encode or decode a string, and which is which. This is hard. This is confusing.
  • A string object should know its encoding so I don’t have to. What happens if I receive a string from some library and I need to convert it to unicode? How am I supposed to know what the encoding of the string is? There is no sensible way to communicate this right now, which makes debugging a pain. The only excuse I see is that sometimes it is impossible for Python to know the encoding… well, then it should just fail and require the programmer to specify the encoding. There are way too many things that can go wrong when you expect the programmer to keep track of his strings and which is encoded how…
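
The last point can be sketched in a few lines. This is only an illustration of the idea, not something Python or feedparser provides: the class name TaggedBytes and its to_unicode method are hypothetical. The object carries its encoding along with the bytes, and it refuses to guess when the encoding is unknown.

```python
class TaggedBytes(object):
    """Hypothetical wrapper: a byte string that remembers its encoding."""

    def __init__(self, data, encoding=None):
        self.data = data            # the raw bytes
        self.encoding = encoding    # e.g. 'latin-1', or None if unknown

    def to_unicode(self):
        # Fail loudly instead of silently assuming ascii or latin-1.
        if self.encoding is None:
            raise ValueError("encoding unknown: the caller must specify it")
        return self.data.decode(self.encoding)


known = TaggedBytes(b'\xe9ee', 'latin-1')
text = known.to_unicode()           # decodes without any guessing

unknown = TaggedBytes(b'\xe9ee')    # no encoding was recorded
try:
    unknown.to_unicode()
    failed = False
except ValueError:
    failed = True
```

With something like this, a library handing me a string would also hand me its encoding, and a missing encoding would show up as an error at the point of conversion rather than as garbage later on.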

Two links that are invaluable to researchers who want to know how to succeed…

The second one was found by Seb.

Offline someone commented that more than half of the Ph.D. students are foreigners and that the Ph.D. is serving as a funding source. True. But that’s somewhat of a cynical view if you ask me.

In any case, you are a young student, and despite reading my blog, you still want to do a Ph.D. Maybe because you come from a Third World country and getting some cash to study is a compelling idea on its own. But whatever your reasons, here’s what you should be looking for, I humbly suggest…

  • Look at the past projects and students. Did all the students this prof. supervised end up on welfare? Or can you google them as Harvard professors now? Can you find traces of the past projects this prof. was involved with or did they all fail? Look beyond the fanfare: look for evidence the prof. can’t control easily. Google past students.
  • Is the prof. aware of what the world is like, right now? Does he know the employment rate and career possibilities for young Ph.D.s or does he just pretend he knows? Where does he get his facts from, if he has any?
  • Does he give you the full story, with the pros and cons of doing a Ph.D. with him? Pros and cons of the research life?
  • Does he need to consume graduate students to get his research going or is the training of students only tangential to his research? In other words, can the guy still do research without students or are students cheap labour?

Nice post on Critical Mass today about a researcher who sold his services as co-author on eBay and actually got 50 bids and many phone calls. It would appear that many people, from industry to students, are willing to pay so they can produce high quality scientific content with their names on it.

I’m not sure it is an interesting line of work though. Most people don’t realize how expensive it is to write a good scientific journal article. Probably upward of $50K. It’d be difficult to sell 10 pages for $50K. Of course, there are other types of “research”, like journalistic research, where you get 10 pages for much less… but real science is awfully expensive.

What’s more interesting to me is the reasons why people were interested: “There’s this whole constellation of things they could get from it. They could get credentials. They would get the ability to have their questions actually answered.”

Why do I pick on this bit of news? Because I was actually offered jobs like this, and I always turned them down. I was offered money to write journal articles at least twice by totally different people. It was meant to promote a product or a service, in the end, or rather, give the product or service some credibility. I think this is misguided since there is an actual proper form for such publications: patents, technical reports or white papers.

In any case, it would actually be doable: sell your services as a scientist who publishes papers to give credibility to products and services. It would be similar to being a patent consultant, I guess, except that the law is not so involved anymore. I have found that a lot of people everywhere think they have unique ideas. They’d love to have them validated and pushed in a very prestigious publication, just like having patents.

Writing papers is like taking pictures for Playboy. You look at beauty most of the time, and you have to capture the beauty… you have to make sure enough is being shown, but not too much. It is seen as a very romantic job where you are living a dream, but you are, in fact, just doing your job. The only difference is that few people write papers attracting as many eyeballs as Playboy pictures, and most earn less money too.
