# The fallacy of absolute numbers

I often come across arguments of the following type in research papers:

• You could save 3 bits of storage for every value in your database. Surely that’s irrelevant. Nobody cares about saving 3 bits!
• You can sort arrays in 10 ms. Surely, that cannot be improved upon? You are already down to 10 ms and nobody cares about such small delays.

I hope you can see what is wrong with these statements.

I call it the fallacy of absolute numbers: you express a measure or a gain as an absolute value, and then conclude that the result is optimal or nearly optimal because the number appears small (or large).

Remember: Saving 3 bits of storage out of 6 bits is a 2:1 compression ratio. Sorting in 5 ms instead of 10 ms doubles the speed.
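
To make the arithmetic concrete, here is a minimal sketch in Python. The function name `relative_gain` is my own illustration, not something from a library; it simply puts the absolute number next to its baseline.

```python
def relative_gain(before: float, after: float) -> tuple[float, float]:
    """Return (ratio, fraction_saved) when a cost drops from `before` to `after`."""
    if before <= 0 or after <= 0:
        raise ValueError("costs must be positive")
    ratio = before / after                      # e.g. compression ratio or speedup
    fraction_saved = (before - after) / before  # share of the original cost removed
    return ratio, fraction_saved


# Saving 3 bits out of 6 leaves 3 bits per value.
print(relative_gain(6, 3))   # (2.0, 0.5)

# Sorting in 5 ms instead of 10 ms.
print(relative_gain(10, 5))  # (2.0, 0.5)
```

Both cases come out the same: a factor of two, even though "3 bits" and "5 ms" sound negligible on their own.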

Note: I am sure that someone else has documented this fallacy, but I could not find any reference to it.

1. Frederick Mosteller coined the term numerator-only data for things like this.

Comment by John — 18/6/2010 @ 13:16

2. You’ve got to love blogging! Thanks!

I did read your blog post back then, I’m sure, but I never connected it with what I see in research papers.

Comment by Daniel Lemire — 18/6/2010 @ 14:06

3. Hi Daniel, love your blog.
I just did a Google search for ‘fish’, the results… “About 359,000,000 results (0.17 seconds) ”

Suppose Google told me that they could make it 100 times faster, just 0.0017 seconds!

I really would not care. For me, in this context, there is no difference between 0.17 seconds and even 0.00000000000017 seconds.

Of course, you might argue that if I build a crawler that calls Google a million times, then I would care. This is true, but there really are papers that make similar claims in domains for which we just don’t need a speedup.

One example is a paper on a faster way to do a calculation on human ancestor remains. They had a speed-up of a factor of two. However, all the prehistoric human ancestor remains we have could comfortably be placed in a small suitcase. Making the algorithm faster was polishing the wrong apple; we just don’t need to speed up that problem.

Comment by Anonymous — 20/6/2010 @ 20:34

4. When people talk about improving or comparing algorithms, the only meaningful way to present the results is the Pareto front. I learned about it too late; I should possibly write a blog post about it.

Comment by Alex Mikhalev — 21/6/2010 @ 6:13

5. On the other hand, beware of the fallacy of relative numbers: if my web site has the fastest growth in access, that may be because it went up from 1 access (myself) to, say, 100. This is usually clear with the rate of adoption of, say, new browsers.

Comment by Muhammad Alkarouri — 29/6/2010 @ 20:33
