External-Memory Sorting in Java : the First Release

In my previous post, I invited you to help with a reference implementation of external sorting in Java. Several people tested and improved the code. I like the result.

  • I posted the code on Google Code. All contributors are owners of the project. The source code is under Subversion.
  • I have added a link to it from the Wikipedia page.

What is left to do?

  • The code remains largely untested. Please run your benchmarks! Find bugs!
  • Please contribute unit tests.
  • Can you write a tutorial on how to use the code?
  • Can you simplify the code further while making it faster and more robust?

Caveat: My intent was for the code to be in the public domain (nobody should own reference implementations), but Google Code would not allow it. I selected the Lesser GPL (LGPL) instead, for now.

Reference: There is a fast external sorting implementation in Java by the Yahoo! people. (Thanks to Thierry Faure for pointing it out.) I have not looked at it.

Published by

Daniel Lemire

A computer science professor at the University of Quebec (TELUQ).

4 thoughts on “External-Memory Sorting in Java : the First Release”

  1. Nowadays people expect to be able to build a project and run tests with a build tool. 🙂 I could add this. I just wanted to make sure that there are not 3 builds in place.

  2. @Vellino I didn’t know what to do license-wise. The code itself says it is in the public domain.

    Hopefully, it won’t matter. Nobody will hire lawyers to resolve conflicts around this piece of code. I really, really hope so.
