Avoid exception throwing in performance-sensitive code

There are various ways in software to handle error conditions. In C or Go, one returns error code. Other programming languages like C++ or Java prefer to throw exceptions. One benefit of using exceptions is that it keeps your code mostly clean since the error-handling code is often separate.

It is debatable whether handling exceptions is better than dealing with error codes. I will happily use one or the other.

What I will object to, however, is the use of exceptions for control flow. It is fine to throw an exception when a file cannot be opened, unexpectedly. But you should not use exceptions to branch on the type of a value.

Let me illustrate.

Suppose that my code expects integers to be always positive. I might then have a function that checks such a condition:

int get_positive_value(int x) {
    if(x < 0) { throw std::runtime_error("it is not positive!"); }
    return x;

So far, so good. I am assuming that the exception is normally never thrown. It gets thrown if I have some kind of error.

If I want to sum the absolute values of the integers contained in an array, the following branching code is fine:

    int sum = 0;
    for (int x : a) {
        if(x < 0) {
            sum += -x;
        } else {
            sum += x;

Unfortunately, I often see solutions abusing exceptions:

    int sum = 0;
    for (int x : a) {
        try {
            sum += get_positive_value(x);
        } catch (...) {
            sum += -x;

The latter is obviously ugly and hard-to-maintain code. But what is more, it can be highly inefficient. To illustrate, I wrote a small benchmark over random arrays containing a few thousand elements. I use the LLVM clang 12 compiler on a skylake processor. The normal code is 10000 times faster in my tests!

normal code 0.05 ns/value
exception 500 ns/value

Your results will differ but it is generally the case that using exceptions for control flow leads to suboptimal performance. And it is ugly too!

Published by

Daniel Lemire

A computer science professor at the University of Quebec (TELUQ).

18 thoughts on “Avoid exception throwing in performance-sensitive code”

  1. To make the cost of exceptions more explicit, consider Java. When an exception is thrown, a full backtrace is stored. This involves quite some memory allocation.

    Now if you are in a language with elegant and efficient optionals such as Rust, you can get some very nice code to handle nonstandard cases.
    But things get more tricky when you need to, e.g., parse numbers from a stream. What if you expect a number, but it’s syntax is invalid? In Java, the method then no longer can return a native int, but you would need to trust the hotspot compilers escape analysis to eliminate all overhead from returning a double boxed Optional.
    Some tools (e.g., ELKI https://github.com/elki-project/elki/blob/146bcb9fbc428e9c4bccdfde3e6f17ae18a38ebd/elki-core-util/src/main/java/elki/utilities/io/ParseUtil.java#L162 ) use preallocated exceptions without a stack trace in performance critical paths, where exceptions are somewhat needed to not pay other overheads (boxing) and a lot of boilerplate code to unbox and handle the exceptions yourself.
    I don’t know if something similar would be possible in C?
    The Rust solution here seems very nice and elegant to me.

    1. do NOT use Java or any other interpreted languages.
      do NOT implement on any architecture which does not hardware support for integer underflows or overflows.
      do NOT use IEEE floating point EVER, when the number can underflow or overflow the IEEE values unless you REALLY, REALLY (really) understand the implications. 99% of folks do not understand said implications

  2. I think that optimizing for the most common path and then adjusting for the extraordinary path is the best way to do things.

    Exceptions should be rare, building the context for an exception is done dynamically as far as I can remember. There is a fair amount of discussion about exactly this topic here on StackOverflow: https://stackoverflow.com/questions/13835817/are-exceptions-in-c-really-slow

    To build onto the comment of E.S., here is more historical context as to why C++ does not store the context except precisely where an exception occurs: https://stackoverflow.com/questions/3311641/c-exception-throw-catch-optimizations

    The current accepted answer as of May 16, 2022 links to the clang-dev mailing list describing why exceptions are done the way they are: https://lists.llvm.org/pipermail/cfe-dev/2015-March/042035.html

    Lemire does make an excellent point and I wholly agree.

    I have rather harsh opinion on error codes, mostly for the meat space aspect where context may not easily be understood or there is cascade of missing context(s) on the scheme of returning codes up the stack of a call that goes deep. But that is probably a discussion for a later time.

  3. The result of “0.05 ns/value” is highly dubious. If your test machine has 5GHz and you manage to execute 4 instructions per cycle, you still consume 0.05 ns/instruction. The code looks like it would take more than 1 instruction per value.

    You use the result, so the compiler probably doesn’t remove the entire loop. But the core looks simple enough for auto-vectorization. Depending on your view, that may make the comparison unfair.

    Then again, even translated to scalar code I would expect throughput around 1ns/value, still 500x faster than the exception variant.

    1. I am not sure why you are saying that the result is dubious. You would expect most compilers to autovectorize the exception-free routine. That’s the desired output.

      1. I understand his argument that you end up measuring two effects at the same time: the (in-)ability to auto-vectorize this simple code and the overhead of exception handling. And that it would be better to test an example where no auto-vectorization happens when you’re interested in the overhead of using exception traps itself.

          1. I am not an expect on microbenchmarking C.
            Maybe some compiler hint to prevent auto vectorization,
            and a more expensive computation inside (say, sqrt or cos) helps. And a third version with no if statement, so we can measure the overhead of if/exception vs. branchless. Though I would expect branchless to not make much of a difference because of CPU pipelining. It if we get the 500 ns/value again, we know it’s reliable. Then try with positive/negative input data only to see if the cost is asymmetric. Because if it’s highly asymmetric, it may still be fine to use for low frequency case handling.

      2. Maybe “surprising” would have been a better term. Typically when I see performance numbers far outside my expectation, there is a bug somewhere. In this case the answer is auto-vectorization, so not a bug.

        Whether it is a fair comparison is a matter of debate. Both answers seem defensible. Anyway, thank you for the interesting benchmark.

  4. I randomly stumbled across this article and decided to measure it using Google’s benchmark library. I did it locally and on Quickbench and did not observe the same results as you. You can take a look here:


    Click on the “Quick-bench” link above the source code, and then “Run benchmark” on quick bench website. Alternatively you could build this locally and test, though you do need to ensure you turn of cpu throttling to get accurate numbers.

    If you haven’t already, Niall Douglas’ outcome and status-code are worth checking out (as well as std::expected), Niall has done some interesting benchmarks:



    1. Your benchmark measures the time needed to sum empty arrays. Unsurprisingly, it is a flat tiny cost.

      Try with the following routine.

      for (size_t i = 0; i != 10000; i++) {
      data.push_back((rand() % 1000) - 500);

      Google Benchmark is convenient, but it does not do anything magical. This being said, tools like Google Benchmark do report the effective CPU frequency when they can so any throttling would be detected. In this instance, it is highly unlikely you will experience throttling on a reasonable platform. In any case, the result is so plain that no sophistication is needed to measure it.

      Thanks for the links.

      1. Oops! You are correct, I’ve updated the example here:


        On this updated benchmark on Quickbench, the with exception version is 450 times slower than the without exception version.

        I agree that Google benchmark doesn’t do anything magical, but it does make sharing benchmarks much more concise and does offer quite a few nice utility functions/macros that are annoying to re-implement.

  5. Some time ago, I optimized a very performance sensitive data validation routine. While the original code was good, it didn’t pay a lot of attention to optimizing the common path and separating the more difficult cases. In addition there were some places where we could replace branchy code by small table lookups. I expected gains there.

    The code also returned a status that was either OK or some error code. Mostly for code readability and I-cache footprint, I decided to remove the explicit status handling in favor of throwing exceptions on (very rare) exception cases where the input fails validation. To my surprise, the performance benefits were large (like 2x), even for the Google benchmark loops that have great cache locality. The knock-on benefits of telling the compiler (“this will not actually happen”, which implicitly confers a will-not-return attribute and puts all code dominating the basic block in question in a cold segment) are tremendous. If gives the compiler information that it can propagate forward and backward through the code: it serves either as a challenge to prove an assertion (and then use that for optimization purposes), or leave a dynamic check with a “known” outcome, resulting in great code on the happy path. In all, validation speed improved by about an order of magnitude, something inconceivable without taking advantage of never-taken exception paths.

    (The code in question is in the validateBSON routine of the MongoDB server.)

  6. To add to my previous comment, a corollary to your post would be: “Avoid returning error codes from performance sensitive code.” There’s great benefit in focusing on the happy no-error path, and entirely separating out error-handling code.

    For the input validation part, we expect it never to fail in production code. However, we can’t afford to omit validation. While true zero-cost error handling may be a holy grail that we’ll never find, current C++ exception handling comes pretty close. Without a doubt, it’s a significant improvement over explicit error code handling, both for performance as well as code clarity.

Leave a Reply

Your email address will not be published.

To create code blocks or other preformatted text, indent by four spaces:

    This will be displayed in a monospaced font. The first four 
    spaces will be stripped off, but all other whitespace
    will be preserved.
    Markdown is turned off in code blocks:
     [This is not a link](http://example.com)

To create not a block, but an inline code span, use backticks:

Here is some inline `code`.

For more help see http://daringfireball.net/projects/markdown/syntax

You may subscribe to this blog by email.