When programming, we often want to convert strings (e.g., “1.0e2”) into numbers (e.g., 100). In C++, we have many options. In a previous post, I reported that it is an expensive process when using the standard approach (streams).
Many people pointed out to me that there are faster alternatives. In C++, we can use the C approach (e.g., strtod). We can also use the from_chars function. The net result is slightly more complicated code.
do {
  number = strtod(s, &end);
  if (end == s) break;
  sum += number;
  s = end;
} while (s < theend);
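For readers who want to try this, here is a minimal self-contained sketch of the same strtod loop. The function name sum_with_strtod and the use of std::string as the input type are my assumptions for illustration; the benchmark code itself may differ.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: sums all whitespace-separated floating-point
// numbers found in `input`, stopping at the first unparsable token.
double sum_with_strtod(const std::string& input) {
    const char* s = input.c_str();            // null-terminated, as strtod requires
    const char* theend = s + input.size();
    double sum = 0;
    while (s < theend) {
        char* end = nullptr;
        double number = std::strtod(s, &end); // skips leading whitespace itself
        if (end == s) break;                  // no characters consumed: nothing left to parse
        sum += number;
        s = end;                              // continue right after the parsed number
    }
    return sum;
}
```

Note that strtod relies on a null-terminated buffer and on the current C locale, which is part of why the code is more delicate than the stream version.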
I use long strings (100,000 numbers), and the GNU GCC 8.3 compiler on an Intel Skylake processor.
integers (stream) | 200 MB/s |
integers (from_chars) | 750 MB/s |
floats (stream) | 50 MB/s |
floats (strtod) | 90 MB/s |
We see that for integers, the from_chars function is almost four times faster than the stream approach. Unfortunately, my compiler does not support the from_chars function when parsing floating-point numbers. However, I can rely on the similar C function (strtod). It is nearly twice as fast as the stream approach for floating-point numbers. Even so, it still costs nearly 38 cycles per byte to parse floating-point numbers.
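The integer case with from_chars might look like the following sketch. It assumes space-separated integers and a C++17 compiler; the name sum_with_from_chars and the choice of std::int64_t are mine, not necessarily what the benchmark uses.

```cpp
#include <charconv>
#include <cstdint>
#include <string_view>
#include <system_error>

// Hypothetical helper: sums space-separated integers in `input`,
// stopping at the first token that fails to parse.
std::int64_t sum_with_from_chars(std::string_view input) {
    const char* s = input.data();
    const char* theend = s + input.size();
    std::int64_t sum = 0;
    while (s < theend) {
        // Unlike strtod, from_chars does not skip leading whitespace.
        while (s < theend && *s == ' ') ++s;
        std::int64_t number = 0;
        auto result = std::from_chars(s, theend, number);
        if (result.ec != std::errc()) break;  // invalid input or out of range
        sum += number;
        s = result.ptr;                       // first character not consumed
    }
    return sum;
}
```

Unlike strtod, from_chars takes an explicit end pointer, ignores the locale, and reports errors through the returned structure rather than through errno.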
For each floating-point number, there are almost 10 branch misses in my tests, even though I generate numbers using a fixed format. The number of branch misses is nearly the same whether we use a C++ stream or the C function.
Don’t post this article then, or upgrade; we’re at gcc-10 if I understand things correctly.
If I am to believe STL (the lead STL dev), on Windows (the VC-STL, not some surrogate) this function (from_chars) beats anything.
As of November 2019, looking at the source code in the GNU GCC repo, I do not see support for floats in from_chars.
Are you saying that it is available in GNU GCC 10? When was that released?
The world does not start and end with GNU GCC [and living on the edge 😉 is more fun].
It does not look like it is available with LLVM at this time.
If someone can help me port my code to Visual Studio, while preserving the performance counters, I will gladly run the tests.
You can also try the abseil implementation of from_chars (https://github.com/abseil/abseil-cpp/blob/master/absl/strings/charconv.h )
Good one! Have been looking for a solution to the apparently non-trivial problem of FAST float parsing. Clang and GCC not making any moves to support this part of C++17… 🙁
Building abseil::string now!
bad news…
i haven’t done very scientific testing, but it looks like absl::from_chars is dead slow for doubles
I could do a
std::string field(start_ptr, end_ptr); // AND
std::stod(field);
and still be twice as fast as char_conv, which can do that in one step.
(I am iterating through a 190MB mmap’ed file of floats).