Number Parsing at a Gigabyte per Second

Computers typically rely on binary floating-point numbers. Most often they span 64 bits or 32 bits. Many programming languages call them double and float. JavaScript represents all its numbers, by default, with a 64-bit binary floating-point number type.

Human beings most of often represent numbers in decimal notation, such as 0.1 or 1e-1. Thus many systems store numbers in decimal notation using ASCII text. The software must go from binary floating-point numbers to ASCII and back. There has been much work done on the serialization (from binary floating-point numbers to ASCII) but comparatively less work on the deserialization (from ASCII to binary floating-point numbers).

Typically, reading decimal numbers and converting them to binary floating-point numbers is slow. How slow? Often on the order of 200 MB/s. So much slower than your disk, if you have a fast disk. A PlayStation 5 has a disk capable of over 5 GB/s in bandwidth.

You can do much better. I finally published a manuscript that explains a better approach: Number Parsing at a Gigabyte per Second. Do not miss the acknowledgements section of the paper: this was joint work with really smart people.

The benchmarks in the paper are mostly based on the C++ library fast_float. The library requires a C++11 standard compliant compiler. It provides functions that closely emulate the standard C++ from_chars functions for float and double types. It is used by Apache Arrow and Yandex ClickHouse. It is also part of the fastest Yaml library in the world. These from_char functions are part of the C++17 standard. To my knowledge, only microsoft implemented it at this point: they are not available in GNU GCC.

On my Apple M1 MacBook, using a realistic data file (canada), we get that fast_float can far exceeds a gigabyte per second, and get close to 1.5 GB/s. The conventional C function (strtod) provided by the default Apple standard library does quite poorly on this benchmark.

What about other programming languages?

A simplified version of the approach is now part of the Go standard library, thanks to Nigel Tao and other great engineers. It accelerated Go processing while helping to provide exact parsing. Nigel Tao has a nice post entitled The Eisel-Lemire ParseNumberF64 Algorithm.

What about Rust? There is a Rust port. Unsurprisingly, the Rust version is a match for the C++ version, speed-wise. Here are the results using the same file and the same processor (Apple M1):

from_str (standard) 130 MB/s
lexical (popular lib.) 370 MB/s
fast-float 1200 MB/s

There is an R binding as well, with the same great results:

On our machine, fast_float comes out as just over 3 times as fast as the next best alternative (and this counts the function calls and all, so pure parsing speed is still a little bettter).

A C# port is in progress and preliminary results suggest we can beat the standard library by a healthy margin. I am hoping to get a Swift and Java port going this year (help and initiative are invited).

Video. Last year, I gave a talk at Go Systems Conf SF 2020 entitled Floating-point Number Parsing w/Perfect Accuracy at GB/sec. It is on YouTube.

Further reading. See my earlier posts… Fast float parsing in practice (March 2020 blog post) and Multiplying backward for profit (April 2020 blog post).

Published by

Daniel Lemire

A computer science professor at the University of Quebec (TELUQ).

15 thoughts on “Number Parsing at a Gigabyte per Second”

    1. @Idiot

      My statement is…

      JavaScript represents all its numbers, by default, with a 64-bit binary floating-point number type.

      The link you offer supports this statement because it says that we can represent integers exactly up to 2^53, which is what happens under the IEEE binary64 type.

  1. Is there any work on replacing the functions for this in the language’s respective standard libraries? The Rust implementation seems like it would be a good fit for the Rust standard library, with it having no dependencies and no_std support.

  2. I would like to help out on the Java port but I my requirement is that I can go round trip without having different cache table for float/double to string and string to float/double.

    At the moment I am trying to port a DragonBox version (,, but I am interested in this if it can outperform DragonBox and can go round trip (float/double to string, string to float/double).

  3. But, when I have to store e.g. a big matrix of floating point numbers, I would do a copy of that contiguous chunk of memory to disk, and vice-versa, possibly throwing in mmap – precisely to avoid parsing from text?

    1. Right. If you serialize your numbers in binary form, you obviously have no parsing difficulty. In the paper, I also allude to another possibility: you can use hexadecimal floating-point numbers.

  4. What do you feel accounts for the great differences in bandwidth used by each approach? It would be interesting to test these same computations on multiple different CPUs.

  5. hello, nice work, but I’m unsure about one point:
    the video title reads ‘w/Perfect Accuracy’, but in the end you state:
    ‘can do exact computation 99,99% of the time’, doe’s that mean:
    ‘in 0.01% of time ( cases ) you see an error and can fall back to other algorithm’, or
    ‘in 0.01% of cases you get a slightly wrong result, learn to live with it’?

Leave a Reply

Your email address will not be published.

You may subscribe to this blog by email.