What the heck is the value of “-n % n” in programming languages?

When coding efficient algorithms having to do with hashing, random number generations or even cryptography, a common construction is the expression “-n%n“. My experience has been that it confuses many programmers, so let us examine it further.

To illustrate, let us look at the implementation of std::uniform_int_distribution found in the GNU C++ library (Linux) and clean up the line in question:

threshold = -range % range;

The percent sign (%) in this expression refers to the modulo operation. It returns the remainder of the integer division. To simplify the discussion, let us assume that range is strictly positive since dividing by zero causes problems.

We should pay attention to the leading minus sign (). It is the unary operator that negates a value, and not the subtraction sign. There is a difference between “-range % range" and “0-range % range". They are not at all equivalent. They will actually give you different values; the latter expression is always zero. And that is because of the priority of operation. The negation operation has precedence on the modulo operation which has precedence on the subtraction operation. Thus you can rewrite “-range % range" as “(-range) % range". And you can write “0-range % range" as “0- (range % range)“.

When the variable range is a signed integer, then the expression -range % range is zero. In a programming language with only signed integers, like Java, this expression is always zero.

So let us assume that the variable range is an unsigned type, as it is meant to be. In such cases, the expression is generally non-zero.

We need to compute -range. What does it mean to negate an unsigned value?

When the variable range is an unsigned type, Visual Studio is likely to be unhappy at the expression -range. A recent Visual Studio returns the following warning:

warning C4146: unary minus operator applied to unsigned type, result still unsigned

Nevertheless, I believe that it is a well defined operation in C++, Go and many other programming languages. Jonathan Adamczewski has a whole blog post on the topic which suggests that the Visual Studio warning is best explained by a historical deviations from the C++ standard from the Microsoft Visual Studio team. (Note that the current Visual Studio team seems committed to the standards going forward.)

My favorite definition is that –range is defined by range + (-range) = 0. That is, it is the value such that when you add it to range, you get zero. Mathematicians would say that it is the “additive inverse”. In programming languages (like Go and C++) where unsigned integers wrap around, then there is always one, and only one, additive inverse to every integer value.

You can define this additive inverse without the unary negation: if max is the maximum value that you can represent, then you can replace –range by maximumrange + 1. Or, maybe more simply, as (0-range). And indeed, in the Swift programming language, this particular line was represented as follow:

      let threshold = (0 &- range) % range

The Swift language has two subtraction operations, one that is not allowed to overflow (the usual ‘-‘), and one that is allowed to overflow (‘&-‘). It is somewhat inconvenient that Swift forces us to write so much code, but we must admit that the result is probably less likely to confuse a good programmer.

In C#, the system will not let you negate an unsigned integer and will instead cast it as a signed integer, so you have to go the long way around if you want to remain in unsigned mode, like so…

threshold = (uint.MaxValue - scale + 1) % scale

This expression is unfortunately type specific (here uint).

To conclude: you can learn a lot just by examining one line of code. To put it another way, programming is a much deeper and complex practice than it seems at first. As I was telling a student of mine yesterday: you are not supposed to read new code and understand it right away all of the time.

Published by

Daniel Lemire

A computer science professor at the University of Quebec (TELUQ).

18 thoughts on “What the heck is the value of “-n % n” in programming languages?”

  1. Sorry, but I miss the point of such code, then. If the value of -range % range is always 0, what’s the intent?

  2. As I was telling a student of mine yesterday: you are not supposed to read new code and understand it right away all of the time.

    I reckon a whole series of posts could be inspired by that statement.

    1. That resonated with me, it’s the opposite of “software should be readable like proze” which always sounded odd to me.

  3. This is a neat trick! But I found something surprising with unsigned short: https://gcc.godbolt.org/z/4xWo1G. It looks like -n is type-promoted (probably to a signed integer? I haven’t checked the standard) so unless you re-cast it (I’m not sure that cast isn’t UB…) you don’t get the expected result.

  4. You do an excellent job of explaining what this seemingly odd construction does, but it might be good to mention in the intro why “-n %n” is a common construction in the first place. Stealing from an HN comment making the same suggestions, “It computes (2^b) % n, assuming n is an unsigned b-bit integer. You can’t do this directly, since 2^b itself doesn’t fit into a b-bit integer.”

  5. I dont understand. I tried it in C# and printed out the result of -8%8 and i got 0 so i dont get what is supposed to be happening

  6. I noticed unexpected results with unsigned short and unsigned char. It’s worth noting that -n will be converted to a signed int if n is a narrower type.

  7. I’m finding out the hard way that gcc 10 seems to optimize this expression to 0 for unsigned values. I’m having to cast to signed to get the unary minus to take effect, and then cast back to unsigned for the modulo to work the way it should. YMMV.

  8. In programming languages (like Go and C++) where unsigned integers
    cannot overflow, then there is always one, and only one, additive
    inverse to every integer value.

    This sentence is a bit confusing to read. If unsigned integers cannot overflow (i.e. unsigned overflow doesn’t exist), then there can be no (unsigned) additive inverse.

    By the way, I expected this post to discuss something else entirely. In Python, you have the following:

    >>> (-3) % 5
    2
    >>> (-3) % (-5)
    -3
    >>> 3 % (-5)
    -2

    But in C (and presumably other C-like languages), the results are -3, -3, 3 respectively.

    1. By the way, I expected this post to discuss something else entirely. In Python, you have the following: (…) But in C (and presumably other C-like languages), the results are -3, -3, 3 respectively.

      The division and remainder (modulo) between signed integers are defined differently depending on the programming. E.g., we all agree that 4/3 is 1. But what is -4/3? It could be -2 or -1. You can round toward zero or toward minus infinity.

Leave a Reply

Your email address will not be published.

To create code blocks or other preformatted text, indent by four spaces:

    This will be displayed in a monospaced font. The first four 
    spaces will be stripped off, but all other whitespace
    will be preserved.
    
    Markdown is turned off in code blocks:
     [This is not a link](http://example.com)

To create not a block, but an inline code span, use backticks:

Here is some inline `code`.

For more help see http://daringfireball.net/projects/markdown/syntax

You may subscribe to this blog by email.