A bitmap is an efficient array of boolean values. Bitmaps are commonly used in bitmap indexes. The Java language has a bitmap class: BitSet.
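Unfortunately, the Java BitSet class does not scale to large sparse bitmaps: the Sun implementation does not use compression, so its memory use grows with the position of the highest set bit rather than with the number of set bits.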
I published a new free alternative: JavaEWAH.
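To make the scaling problem with the uncompressed BitSet concrete, here is a minimal sketch using only java.util.BitSet. The class name SparseBitSetDemo and the helper bitsAllocated are hypothetical names chosen for this illustration. It sets a single bit at a high position and reports how much storage the BitSet holds as a result.

```java
import java.util.BitSet;

// Hypothetical demo class: shows that an uncompressed BitSet pays for
// every position up to the highest set bit, even if only one bit is set.
final class SparseBitSetDemo {

    /** Sets one bit at the given position and returns the bits of storage in use. */
    static int bitsAllocated(int position) {
        BitSet bits = new BitSet();
        bits.set(position);
        // BitSet.size() reports the number of bits of space actually in use.
        return bits.size();
    }

    public static void main(String[] args) {
        int position = 100_000_000;
        int allocated = bitsAllocated(position);
        System.out.println("bits set: 1");
        // At least 12.5 million bytes are held to represent a single set bit.
        System.out.println("bytes allocated: " + (allocated / 8));
    }
}
```

A compressed format can instead represent the long run of zeros before the set bit in a few words, which is the case a library such as JavaEWAH targets.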
References:
- The library contains the 64-bit EWAH algorithm described—among other places—in one of my recent research papers.
- It is a Java port of part of my C++ Bitmap Index library.
Credit:
- Glen Newton gave me the idea for this project.
@Anonymous
My statement is very clear: to my knowledge, there is no patent violation. However, I am a researcher, not a patent lawyer.
I have just downloaded the software. I have just one preliminary question: how are you sure that EWAH is not patented? In other words, why do you think that a patent on WAH does not cover EWAH as well?
Hi,
Great paper, great work, nothing to add… I’m currently in Dublin, Ireland, and it is amazing that I can download your paper and enjoy your work. Thanks a lot!
No. It is a matter of trade-off: if you want fast random access, a different (less aggressive) form of compression is required. Thanks for the question.
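To illustrate the trade-off, here is a toy run-length-encoded bitmap. It is not the EWAH format and not JavaEWAH code. The class RunLengthBitmap and its methods are hypothetical names for this sketch. Long runs of identical bits cost one entry each, but testing a single position means walking the runs from the start, so the lookup is linear in the number of runs rather than constant time.

```java
import java.util.ArrayList;
import java.util.List;

// Toy run-length bitmap (an assumption for illustration, NOT EWAH):
// runs alternate between 0s and 1s, starting with a (possibly empty) run of 0s.
final class RunLengthBitmap {
    private final List<Long> runs = new ArrayList<>();
    private long length = 0;

    /** Appends count copies of bit at the end of the bitmap. */
    void append(boolean bit, long count) {
        if (count <= 0) return;
        if (runs.isEmpty()) runs.add(0L); // leading run of zeros, maybe empty
        // Even index = run of 0s, odd index = run of 1s.
        boolean lastBit = (runs.size() % 2 == 0);
        if (lastBit == bit) {
            int last = runs.size() - 1;
            runs.set(last, runs.get(last) + count); // extend the current run
        } else {
            runs.add(count); // start a new run of the other value
        }
        length += count;
    }

    /** Returns the bit at position i by scanning the runs: O(number of runs). */
    boolean get(long i) {
        if (i < 0 || i >= length) return false;
        long end = 0;
        for (int k = 0; k < runs.size(); k++) {
            end += runs.get(k);
            if (i < end) return k % 2 == 1;
        }
        return false;
    }

    /** Number of stored runs, a rough proxy for the compressed size. */
    int runCount() {
        return runs.size();
    }

    public static void main(String[] args) {
        RunLengthBitmap bitmap = new RunLengthBitmap();
        bitmap.append(false, 1_000_000);
        bitmap.append(true, 3);
        bitmap.append(false, 1_000_000);
        System.out.println("runs stored: " + bitmap.runCount());
        System.out.println("bit 1000001 set: " + bitmap.get(1_000_001));
    }
}
```

An uncompressed bitmap answers the same query with one array access, at the cost of storing every word explicitly.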
Daniel,
Is there a way to check whether the bit at a given position is set? In constant time, of course, not using the iterator. Thanks in advance.
What are your thoughts on a JavaScript implementation of EWAH?
I’m curious if the benefits of cache coherency would outweigh the lack of 64-bit integer support and other JS inefficiencies.
I have a fast implementation of regular/uncompressed bitsets in JavaScript and it is mostly decent. I suspect you could make a 32-bit EWAH work.