Parsing numbers in C++: streams, strtod, from_chars

When programming, we often want to convert strings (e.g., “1.0e2”) into numbers (e.g., 100). In C++, we have many options. In a previous post, I reported that it is an expensive process when using the standard approach (streams).
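For reference, the stream approach might look like the following minimal sketch. The function name sum_with_stream is hypothetical and is not taken from the benchmark. It assumes the numbers are separated by whitespace.

```cpp
#include <sstream>
#include <string>

// Sum all whitespace-separated floating-point numbers in `text`
// using the stream approach (operator>>).
// Hypothetical helper for illustration, not the benchmarked code.
double sum_with_stream(const std::string& text) {
    std::istringstream in(text);
    double number = 0.0;
    double sum = 0.0;
    // operator>> skips leading whitespace and stops at the first
    // token that is not a valid number, or at the end of the input.
    while (in >> number) {
        sum += number;
    }
    return sum;
}
```

Calling sum_with_stream("1.0e2 2.5 -0.5") adds up 100, 2.5 and -0.5.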

Many people pointed out to me that there are faster alternatives. In C++, we can use the C approach (e.g., strtod). We can also use the from_chars function. The net result is slightly more complicated code.

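// s points to the start of the input, theend to one past its last character
do {
    number = strtod(s, &end); // parse one number; end is set just past it
    if(end == s) break;       // nothing was parsed: stop
    sum += number;
    s = end;                  // resume right after the parsed number
} while (s < theend);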
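A comparable loop for integers with from_chars could be sketched as follows. This is an illustration rather than the benchmarked code: the function name sum_with_from_chars is hypothetical, and it assumes the numbers are separated by spaces. Unlike strtod, from_chars does not skip leading whitespace, so the sketch skips it by hand.

```cpp
#include <charconv>
#include <cstdint>
#include <string_view>
#include <system_error>

// Sum space-separated decimal integers in `text` using std::from_chars (C++17).
// Hypothetical helper for illustration; assumes spaces are the only separator.
std::int64_t sum_with_from_chars(std::string_view text) {
    const char* s = text.data();
    const char* const theend = s + text.size();
    std::int64_t sum = 0;
    while (s < theend) {
        // from_chars does not skip leading whitespace, so do it manually.
        while (s < theend && *s == ' ') ++s;
        if (s == theend) break;
        std::int64_t number = 0;
        auto [ptr, ec] = std::from_chars(s, theend, number);
        if (ec != std::errc()) break; // no digits found, or value out of range
        sum += number;
        s = ptr; // resume right after the parsed number
    }
    return sum;
}
```

Because from_chars takes an explicit end pointer, the input does not need to be null-terminated, which strtod requires.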

I use long strings (100,000 numbers), and the GNU GCC 8.3 compiler on an Intel Skylake processor.

integers (stream) 200 MB/s
integers (from_chars) 750 MB/s
floats (stream) 50 MB/s
floats (strtod) 90 MB/s

We see that for integers, the from_chars function is almost four times faster than the stream approach. Unfortunately, my compiler does not support the from_chars function when parsing floating-point numbers. However, I can rely on the similar C function (strtod). It is nearly twice as fast as the stream approach for floating-point numbers. Even so, it still costs nearly 38 cycles per byte to parse floating-point numbers.
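The relation between throughput and cycles per byte can be checked with a small calculation. The sketch below assumes that 1 MB is 10^6 bytes and uses a clock of about 3.4 GHz. The post does not state the clock frequency; 3.4 GHz is only what 38 cycles per byte at 90 MB/s would imply. The function name cycles_per_byte is hypothetical.

```cpp
// cycles per byte = clock frequency (cycles/s) / throughput (bytes/s)
// Hypothetical helper; the clock frequency passed in is an assumption,
// since the post does not state it.
constexpr double cycles_per_byte(double clock_hz, double bytes_per_second) {
    return clock_hz / bytes_per_second;
}
```

With these assumptions, cycles_per_byte(3.4e9, 90e6) is a little under 38, consistent with the strtod figure above.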

For each floating-point number, there are almost 10 branch misses in my tests, even though I generate numbers using a fixed format. The number of branch misses is nearly the same whether we use a C++ stream or the C function.

My source code is available.

Published by

Daniel Lemire

A computer science professor at the University of Quebec (TELUQ).

7 thoughts on “Parsing numbers in C++: streams, strtod, from_chars”

  1. Unfortunately my compiler does not support the from_chars function when parsing floating-point numbers.

    Don’t post this article then, or upgrade, we’re gcc-10 if I understand things correctly.

    If I am to believe STL (the lead STL dev), on Windows (the VC-STL, not some surrogate) this function (from_chars) beats anything.

    1. Don’t post this article then, or upgrade, we’re gcc-10 if I understand things correctly.

      As of November 2019, looking at the source code in the GNU GCC repo, I do not see support for floats in from_chars.

      Are you saying that it is available in GNU GCC 10? When was that released?

        1. The world does not start and end with GNU GCC

          It does not look like it is available with LLVM at this time.

          If someone can help me port my code to Visual Studio, while preserving the performance counters, I will gladly run the tests.

            1. Good one! Have been looking for a solution to the apparently non-trivial problem of FAST float parsing. Clang and GCC not making any moves to support this part of C++17… 🙁

              Building abseil::string now!

            2. bad news…

              i haven’t done very scientific testing, but it looks like absl::from_chars is dead slow for doubles

              I could do a

              std::string field(start_ptr, end_ptr); // start_ptr and end_ptr are const char*; AND

              std::stod(field);

              and still be twice as fast as char_conv, which can do that in one step.

              (I am iterating through a 190MB mmap’ed file of floats).
