For processing strings, streams in C++ can be slow

The C++ library has long been organized around stream classes, at least when it comes to reading and parsing strings. But streams can be surprisingly slow. For example, if you want to parse numbers, then this C++ routine is close to being the worst possible choice for performance:

std::stringstream in(mystring);
while(in >> x) {
   sum += x;
}
return sum;

I recently learned that some Node.js engineers prefer stream classes when building strings, for performance reasons. I am skeptical.

Let us run an experiment. We shall take strings containing the ‘%’ character and we build new strings where the ‘%’ character is replaced by ‘%25’ but the rest of the string is otherwise unchanged.

A straight-forward string construction is as follows:

std::string string_escape(const std::string_view file_path) {
  std::string escaped_file_path;
  for (size_t i = 0; i < file_path.length(); ++i) {
    escaped_file_path += file_path[i];
    if (file_path[i] == '%')
      escaped_file_path += "25";
  }
  return escaped_file_path;
}

An optimized version using streams is as follows:

std::string stream_escape(const std::string_view file_path) {
  std::ostringstream escaped_file_path;
  for (size_t i = 0; i < file_path.length(); ++i) {
    escaped_file_path << file_path[i];
    if (file_path[i] == '%')
      escaped_file_path << "25";
  }
  return escaped_file_path.str();
}

I envision using these functions over strings that contain few ‘%’ characters. It is possible that most of the strings do not contain the ‘%’. In such cases, I can just search for the character and only do non-trivial work when one is found. The following code should do:

std::string find_string_escape(std::string_view file_path) {
  // Avoid unnecessary allocations.
  size_t pos = file_path.empty() ? std::string_view::npos :
    file_path.find('%');
  if (pos == std::string_view::npos) {
   return std::string(file_path);
  }
  // Escape '%' characters to a temporary string.
  std::string escaped_file_path;
  do {
    escaped_file_path += file_path.substr(0, pos + 1);
    escaped_file_path += "25";
    file_path = file_path.substr(pos + 1);
    pos = file_path.empty() ? std::string_view::npos :
      file_path.find('%');
  } while (pos != std::string_view::npos);
  escaped_file_path += file_path;
  return escaped_file_path;
}

I wrote a benchmark that uses a large collection of actual file URLs as a data source. The benchmark runs under macOS and Linux. I use Linux, a recent Intel server and GCC 12:

naive strings 260 ns/string 0.45 GB/s
stream 1000 ns/string 0.12 GB/s
find 33 ns/string 3.49 GB/s

At least in this case, I find that the stream version is four times slower than  naive string processing, and it is 30 times slower than the optimized ‘find’ approach.

Your results will vary depending on your system, but I generally consider the use of streams in C++ as a hint that there might be poor performance.

Further reading: I turned this blog post into a pull request to Node.js.

Published by

Daniel Lemire

A computer science professor at the University of Quebec (TELUQ).

14 thoughts on “For processing strings, streams in C++ can be slow”

  1. TBH I’m not sure if C++ streams are good at anything 🙂 They’re the default choice when you don’t care strongly about performance or ergonomics, but that’s all.

    It’s also possible that old std::string implementations weren’t good at mutation and appending.

    1. Some of the reasins are:

      Streams are specified as an inheritance hierarchy with virtual methods, so stream operations incur virtual calls.
      Streams are locale-aware and access locale on every operation, which is a mutex-protected global.
      stringstreams make a copy of the underlying string (ability to move the string has been added in one of the latest standards)
      For cin , cout, and cerr, (not the topic of this post) there is also sycronization with the libc functions (can be switched off using sync_with_stdio(false) and tie(nullptr))

  2. I played it a little bit with them. My stream_escape2 faster I just add a memory preallocation. I did measurement on desktop machine Windows vs2022 compiler.

    loaded 6207 URLs
    volume 725997 bytes
    find_string_escape : 1.94 GB/s 16.6 Ma/s 60.34 ns/d
    string_escape : 0.21 GB/s 1.8 Ma/s 566.31 ns/d
    stream_escape : 0.05 GB/s 0.4 Ma/s 2461.41 ns/d
    stream_escape2 : 0.69 GB/s 5.9 Ma/s 168.68 ns/d
    find_string_escape2 : 1.97 GB/s 16.9 Ma/s 59.32 ns/d

    Code is here:

    std::string stream_escape2(const std::string_view file_path) {
    std::string escaped_file_path;
    escaped_file_path.reserve(file_path.length() * 2);

    for (char c : file_path) {
    if (c == '%') {
    escaped_file_path += "25";
    }
    else {
    escaped_file_path.push_back(c);
    }
    }

    return escaped_file_path;
    }

    Similar did some change in find_string_escape function and after remove unnecessary string copying performance improved:

    std::string find_string_escape2(std::string_view str) {
    std::string escaped_file_path;
    size_t pos = 0;
    size_t start = 0;

    while ((pos = str.find('%', start)) != std::string_view::npos) {
    escaped_file_path += str.substr(start, pos - start);
    escaped_file_path += "25";
    start = pos + 1;
    }

    escaped_file_path += str.substr(start);
    return escaped_file_path;
    }

    I tested on your adb92cefbd3fbd97a4055734681dee20a1936f26 commit.

  3. Aside from the question of reserving memory, noted above, you are somewhat missing the point about streams. Streams are a general-purpose formatting mechanism, not a narrow string appending mechanism. Of course an exercise involving simple string concatenation will perform somewhat better using string::append() or operator +=(), as the extra overhead of the stream will have some cost. (Not to mention that going character-by-character will always generate a pessimal outcome.)

    For a fairer comparison, which uses streams for the formatting purposes they were intended for, instead of always replacing % with %25, use an integral counter and replace % with “%” and the counter value appended. For your non-stream version you will probably want to use sprintf; for the stream version the insertion operator will suffice.

    1. For your non-stream version you will probably want to use sprintf; for the stream version the insertion operator will suffice.

      Why would I use sprintf?

      If I am unconcerned with performance, I would simply do:

      str += std::to_string(counter);
      

      If I care about performance, I would probably use std::to_chars to get a boost (at the expense of some complexity).

      Streams would be last on my list because they would assuredly be slower and not any more convenient.

      1. I was thinking more generically. Granted that as of C++17 to_chars is more efficient for floating point and integral conversions (though underneath it may not be much more efficient than sprintf if much formatting is involved). Some of us are stuck in older environments.

        Nor would I use even to_chars in a fully up-to-date environment: I’d use the C++20 formatting library, which was introduced precisely to address the weaknesses in streams, in this case using std::format_to_n().

        Streams are meant to replace the C style formatting functions (as is the new formatting library), not to replace strcat, and strcpy, which is effectively what you are doing. The sstream version was itself an afterthought, replacing the old strstream library, and was certainly not introduced as an aid to efficiency. It is a huge benefit if you have functions you want to unit test which normally use fstreams for writing to disk and you want to test with a generic iostream interface.

        In production, outside of a tight inner loop, I wouldn’t use any of this. I’d use boost’s replace_all_copy in their string algorithms library as being far more expressive (and probably more efficient) than a hand-rolled loop for your original example.

  4. I find the conclusion a bit misleading.
    The main difference isn’t string vs string_view vs stream; it’s early exit or not.
    The huge majority of your test strings do not contain any ‘%’; if you write some “find_stream” just prepending

    if (file_path.find('%') == std::string::npos)
    return std::string(file_path);

    I agree streams are pretty bad, but this is not the culprit here.

    FWIW, here’s my take at the exercise

    std::string memchr_escape(const std::string_view file_path) {
    const char *found = (const char *)memchr(file_path.begin(), '%', file_path.size());
    if (found == nullptr)
    return std::string(file_path);

    size_t n_percent = (size_t)std::count(file_path.begin(), file_path.end(), '%');
    size_t escaped_size = file_path.size() + 2 * n_percent;

    auto copy_and_replace = [&](char *escaped, size_t sz) {
    const char *start = file_path.begin();
    const auto end = file_path.end();
    do {
    size_t len = 1 + (size_t)std::distance(start, found);
    memcpy(escaped, start, len);
    memcpy(escaped + len, "25", 2);
    escaped += len + 2;
    start += len;
    found = (const char *)memchr(start, '%', (size_t)std::distance(start, end));
    } while(found != nullptr);
    memcpy(escaped, start, (size_t)std::distance(start, end));
    return sz;
    };

    #if __cpp_lib_string_resize_and_overwrite
    std::string escaped_file_path;
    escaped_file_path.resize_and_overwrite(escaped_size, copy_and_replace);
    #else
    std::string escaped_file_path(escaped_size, '\0');
    copy_and_replace(escaped_file_path.data(), escaped_size);
    #endif

    return std::string(escaped_file_path);
    }

    less maintainable I admit, but noticeably faster if you use top100.txt for the benchmark.

  5. I can’t help but think that perhaps part of the difference is that you’re comparing find_string_escape() to stream_escape(), rather than to a comparable find_stream_escape(); your string version has a significantly more efficient algorithm than your stream version, so it’s only natural that there’ll be a significant performance difference. (Notably, you provided no version of stream_escape() which short-circuits on empty input, which is a major difference in and of itself.)

    If you want a more even test, perhaps a find_stream_escape() would be useful? Even a relatively unoptimised version using std::getline() produces a noticeable difference; my quick testing puts it at roughly a 33% improvement. Which is admittedly still worse than raw strings, and by a pretty wide margin at that, but it does at least put the two on variants of the same algorithm, and thus at similar Big O complexities.

    std::string find_stream_escape(std::string_view file_path) {
    // Short-circuit for empty strings.
    if (file_path.empty()) { return std::string(file_path); }

    // ---

    std::istringstream path_finder((std::string(file_path)));
    std::ostringstream retter;

    // Loop over string.
    std::string temp;
    while (std::getline(path_finder, temp, '%')) {
    retter << temp;

    // Check last character.
    if (path_finder.unget().get() == '%') { retter << "%25"; }
    }

    return retter.str();
    }

Leave a Reply

Your email address will not be published.

To create code blocks or other preformatted text, indent by four spaces:

    This will be displayed in a monospaced font. The first four 
    spaces will be stripped off, but all other whitespace
    will be preserved.
    
    Markdown is turned off in code blocks:
     [This is not a link](http://example.com)

To create not a block, but an inline code span, use backticks:

Here is some inline `code`.

For more help see http://daringfireball.net/projects/markdown/syntax

You may subscribe to this blog by email.