Fast midpoint between two integers without overflow

Suppose I ask you to find the number I am thinking of between -1000 and 1000 by repeatedly guessing a number. With each guess, I tell you whether your guess is correct, smaller than my number, or larger than it. A binary search algorithm finds a value in an interval by repeatedly computing the midpoint of smaller and smaller intervals. You might start with 0, then use either -500 or 500, and so forth.
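
To make the guessing game concrete, here is a minimal sketch in C. The names guess_number, compare_to_secret and secret are hypothetical, chosen only for illustration. The midpoint is computed as lo + (hi - lo) / 2, which is fine for the range -1000 to 1000 but is not a general fix: hi - lo can itself overflow when lo is very negative and hi is very large.

```c
#include <assert.h>

/* Hypothetical secret number for the guessing game. */
static int secret = 237;

/* Hypothetical oracle: negative if the guess is too small,
   positive if it is too large, zero if it is correct. */
int compare_to_secret(int guess) {
  if (guess < secret) return -1;
  if (guess > secret) return 1;
  return 0;
}

/* Binary search over [lo, hi]. Returns 1 and stores the number in
   *result when found, returns 0 otherwise.
   Assumption: hi - lo does not overflow. */
int guess_number(int lo, int hi, int (*compare)(int), int *result) {
  assert(lo <= hi);
  while (lo <= hi) {
    int mid = lo + (hi - lo) / 2; /* first guess is 0 for [-1000, 1000] */
    int c = compare(mid);
    if (c == 0) {
      *result = mid;
      return 1;
    }
    if (c < 0) {
      lo = mid + 1; /* guess too small: keep the upper half */
    } else {
      hi = mid - 1; /* guess too large: keep the lower half */
    }
  }
  return 0;
}
```

Calling guess_number(-1000, 1000, compare_to_secret, &r) guesses 0 first, then 500, and keeps halving the interval until it reaches the secret.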

Thus we sometimes need a fast algorithm to find the midpoint of an interval of integers. The following simple routine to find the midpoint is incorrect:

int f(int x, int y) {
  return (x + y)/2;
}

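If the integers use a 64-bit two’s complement representation, we could pick 1 for x and 9223372036854775807 for y. The sum x + y then exceeds the largest representable value and overflows, so the result of the function could be a large negative value. (In C and C++, signed integer overflow is undefined behavior, so even that outcome is not guaranteed.)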
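
The failure can be reproduced without relying on undefined behavior. The sketch below is a 32-bit analogue of the example above, not the original 64-bit case. It emulates a wrapping addition through unsigned arithmetic and compares it with a reference computed in 64 bits. The function names are hypothetical, and the conversion from uint32_t back to int32_t is implementation-defined; the sketch assumes it wraps, as it does on the usual two’s complement compilers.

```c
#include <assert.h>
#include <stdint.h>

/* Emulates (x + y) / 2 with a wrapping 32-bit addition. Unsigned addition
   wraps modulo 2^32, so the sum itself is well defined.
   Assumption: converting the sum back to int32_t wraps as well. */
int32_t naive_midpoint_wrapping(int32_t x, int32_t y) {
  uint32_t sum = (uint32_t)x + (uint32_t)y;
  return (int32_t)sum / 2;
}

/* Reference: the sum of two 32-bit values always fits in 64 bits.
   Rounds toward zero, like (x + y) / 2. */
int32_t reference_midpoint(int32_t x, int32_t y) {
  return (int32_t)(((int64_t)x + (int64_t)y) / 2);
}

/* The pair (1, INT32_MAX) plays the role of (1, 9223372036854775807). */
void check_overflow_example(void) {
  assert(naive_midpoint_wrapping(1, INT32_MAX) < 0);
  assert(reference_midpoint(1, INT32_MAX) == 1073741824);
}
```

Under the stated assumption, the wrapped sum is INT32_MIN, so the naive routine returns -1073741824 where the true midpoint is 1073741824.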

Efficient solutions are provided by Warren in Hacker’s Delight (section 2.5):

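int f(int x, int y) {
  // rounds up: smallest value no smaller than (x+y)/2
  return (x|y) - ((x^y)>>1);
}

int f(int x, int y) {
  // rounds down: largest value no larger than (x+y)/2
  return ((x^y)>>1) + (x&y);
}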

They provide respectively the smallest value no smaller than (x+y)/2 and the largest value no larger than (x+y)/2. The difference between the two values is (x ^ y) & 1 (credit: Harold Aptroot).
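
These rounding claims can be checked against a wider type. The following is a sketch using 32-bit inputs and a 64-bit reference; the names midpoint_ceil, midpoint_floor and check_midpoints are hypothetical. It assumes that right-shifting a negative signed integer is an arithmetic shift, which is implementation-defined in C but is what common compilers do.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest value no smaller than (x+y)/2. */
int32_t midpoint_ceil(int32_t x, int32_t y) {
  return (x | y) - ((x ^ y) >> 1);
}

/* Largest value no larger than (x+y)/2. */
int32_t midpoint_floor(int32_t x, int32_t y) {
  return ((x ^ y) >> 1) + (x & y);
}

/* Compares both routines with a reference computed in 64 bits. */
void check_midpoints(int32_t x, int32_t y) {
  int64_t sum = (int64_t)x + (int64_t)y;
  /* sum / 2 truncates toward zero, so adjust odd sums by hand. */
  int64_t floor_ref = sum / 2 - (sum % 2 < 0);
  int64_t ceil_ref = sum / 2 + (sum % 2 > 0);
  assert(midpoint_floor(x, y) == floor_ref);
  assert(midpoint_ceil(x, y) == ceil_ref);
  /* The two results differ by (x ^ y) & 1. */
  assert(midpoint_ceil(x, y) - midpoint_floor(x, y) == ((x ^ y) & 1));
}
```

For instance, with x = -3 and y = 0 the exact midpoint is -1.5: midpoint_floor returns -2 and midpoint_ceil returns -1.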

They follow from the following identities: x+y=(x^y)+2*(x&y) and x+y=2*(x|y)-(x^y).
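
One way to read the first identity: x^y is the sum of the bits without carries, and x&y marks the positions that produce a carry, each worth twice its position. The second identity follows because x|y = (x^y) + (x&y). As a sanity check rather than a proof, the sketch below verifies both identities exhaustively over a small range, where nothing can overflow. The function name check_identities is hypothetical.

```c
#include <assert.h>

/* Checks both identities for every pair (x, y) in [-128, 127] and
   returns the number of pairs checked. The values are small enough
   that no intermediate result can overflow an int. */
int check_identities(void) {
  int count = 0;
  for (int x = -128; x <= 127; x++) {
    for (int y = -128; y <= 127; y++) {
      assert(x + y == (x ^ y) + 2 * (x & y));
      assert(x + y == 2 * (x | y) - (x ^ y));
      count++;
    }
  }
  return count;
}
```

Halving each identity, with the shift applied to x^y only, gives the two overflow-free expressions above.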

Update: Reader BartekF observes that C++20 added a dedicated function for this problem: std::midpoint.

Published by

Daniel Lemire

A computer science professor at the University of Quebec (TELUQ).

18 thoughts on “Fast midpoint between two integers without overflow”

  1. ((x^y)>>1) + (x&y) is 4 operations. Whereas x + (y-x)>>1 is only 3 operations (and has the same span). Am I missing something?

        1. Update: after reading cppreference now I see why: unlike optimized versions, std::midpoint provides very specific requirements for rounding, which are not supported by optimized versions.

  2. EVERY properly written optimising compiler SHOULD emit code like

    ADD RDI, RSI
    RCR RDI, 1
    MOV RAX, RDI

    for this function.
    If the target processor lacks the equivalent of the RCR (Rotate through carry right) instruction, but has a ROR (Rotate right) instruction, it can emit

    ADD RDI, RSI
    ADC RDI, 0
    ROR RDI, 1
    MOV RAX, RDI

    instead.

      1. That’s the other reason why I mentioned to substitute it by ADC/ROR
        ALSO: RCR is (if available) ALWAYS less expensive than the “pure” C formula/expression.

  3. I think the example with 1 and 9223372036854775807 doesn’t demonstrate the problem: the problem is negative numbers, otherwise one can always do (uint) (x + y) >> 1.

    1. If you only have positive integers, and you are using a two’s complement signed type, then I agree that you can always work around overflows with relative ease. I did not make this assumption.

      1. It doesn’t matter, actually. The addition works exactly the same for signed and unsigned types in two’s complement representation. The cast is needed to perform the unsigned bit shift which doesn’t preserve the sign (unlike the signed shift). In some sense the signed addition of non-negative values overflows to the sign bit, then we interpret the result as unsigned and do the unsigned division by 2.

        1. That’s still an overflow though.

          $ swift repl
          Welcome to Apple Swift version 5.7.2 (swiftlang-5.7.2.135.5 clang-1400.0.29.51).
          Type :help for assistance.
            1> 9223372036854775807+1
          expression failed to parse:
          error: repl.swift:1:20: error: arithmetic operation '9223372036854775807 + 1' (on type 'Int') results in an overflow
          9223372036854775807+1
          ~~~~~~~~~~~~~~~~~~~^~
          

          You may say that the overflow, if it is not trapped, may be ignored, and you will be right because modern C/C++ and most other systems rely on two’s complement. However, it is still, by definition, an overflow.

          1. You are right guys. Today I learned that signed integer overflow behavior is undefined in C(++). Sorry for inconvenience.
