In school, we learn that the division is the same as multiplying the inverse (or reciprocal), that is x / y = x * (1/y). In software, this does not work as-is when working with integers since 1/y is not an integer when y is an integer (except for y = 1 or y = -1). However, computers have floating-point numbers which are meant to emulate real numbers. It is reasonable to expect that x / y = x * (1/y) will work out with floating-point numbers. It does not.

I woke up this morning reading a blog post by Reynolds on how to compute the inverse quickly when the divisor is known ahead of time. Reynolds is concerned with performance, but his post reminded me of the fact that, in general, multiplying by the inverse is not the same as dividing.

For example, dividing by 3.1416 and multiplying by 1/3.1416 do not result in the same numbers, when doing the computation on a computer.

To prove it, let me pull out my node (JavaScript) console…

> x = 651370000000 > x / 3.1416 * 3.1416 651370000000 > invd = 1 / 3.1416 > x * invd * 3.1416 651370000000.0001

Maybe only JavaScript is broken? Let us try in Python…

>>> x = 651370000000 >>> x / 3.1416 * 3.1416 651370000000.0 >>> invd = 1 / 3.1416 >>> x * invd * 3.1416 651370000000.0001

Always keep in mind thatÂ floating-point numbers are different from real numbers… for example, half of all floating-point numbers are in the interval [-1,1].

**Further reading**: What every computer scientist should know about floating-point arithmetic and Faster Remainder by Direct Computation: Applications to Compilers and Software Libraries

I would hope that anyone with a passing familiarity with floating point numbers and numerical issues would not find this surprising!

As a rule of thumb, I assume that

anytransformation which changes the order of operations, or combines or splits operations into new ones, may have a different result as the original formulation.I’m curious about the counter-examples to this rule. I can think of only a few trivial transformations that can work, like

`a - b = a + (-b)`

because in that case the magnitude of everything is the same after the transformation, but for pretty much anything interesting it seems, naively, like the rule would apply.I would hope that anyone with a passing familiarity with floating point numbers and numerical issues would not find this surprising!Indeed, but it could be true (that x/y = x * (1/y)), in C/C++ if you use the “-ffast-math” flag on some compilers. When examining outputs from godbolt, I find that compilers do not seem to make use of the optimizations described in Reynold’s post. However, if you use the “-ffast-math” flag, they do turn the division into a multiplication, but without any apparent correction factor.

Yes, because that’s what basically what

`-ffast-math`

means: perform transformations on floating point “expressions” that do not necessarily preserve the exact C/C++-rules-following IEEE754 results of the literal expression in the source.This doesn’t always mean a

worseresult either: sometimes the result of`-ffast-math`

is closer to the true result than the version that follows the literal source (e.g., when using an FMA to replace discrete multiply and addition, since it preserves precision in the intermediate value).I see what you are saying though:

`-ffast-math`

may perform a transformation on what appears to be a single operation, transforming it to a reciprocal multiplication and giving a different result. So it’s not just multi-operation expressions that might be eligible for this type of transformation.I’m going to admit to not understanding what the “it” is in this sentence.

The “it” was a reference to “x/y = x * (1/y) is true”. Bad English on my part.

I see.

It ends up being true because both sides are calculating it the less precise way, not because both sides are calculating it the precise way!

Every day I count my blessings that I have don’t actually need to use floating point math in any significant capacity in my day-to-day activities :).

An interesting question on the subject: what floating point values of a satisfy the following for all x:

x / a == x * (1 / a)

Only trivial solutions I can think of are 0 or of the form 2^n (n being integer, of course), because any other inverse of a would result an infinite binary expansion. But floating point arithmetic is finite in precision, so are there non-trivial variants that work correctly (at least on some rounding mode)?

Quick empirical estimation suggests that no values which would avoid errors, apart from powers of two, really exist. There are constants for which rounding errors are rarer than for others, but even for the square root of two which is one of the better candidates almost 13% of results are wrong, and for the square root of three which is on the worse end around 50% of results are off.

On average, multiplication by floating point inverse seems to differ from IEEE division for about 27% of (uniformly distributed) values.

Yeah 1/y is only exact for POT. Reference support: https://hal.inria.fr/ensl-00409366

And the approximate number of exact is also tight. Reference: http://perso.ens-lyon.fr/jean-michel.muller/fpdiv.html (in the supplement)

The difference identified appears to be related to how round-off is applied to the last bit for floating point numbers.

-ffast-math also modifies how round-off is applied.

The other modifier is multi-threaded calculations, which frequently has a more significant effect on round-off, not only for the last bit but for more significant precision loss in sums, often with more significant effects.

When summing floating point values, any assessment of the significance of this last-bit error needs to consider the accuracy available from the input values, which should show the available accuracy is much less than this last-bit error. Variations in the last bit are rarely a significant problem.

I’m not quite following this. By multi-threaded I assume you mean breaking a serial computation into parts, calculating the parts and then recombining the result?

Although this leads to a

differentresult in general it is not obvious to me that it generally leads to a less precise result.Indeed, for basic examples like a sum of all elements in a vector, I would expect multiple accumulators needed for parallel sums to

reduceerror since magnitudes are smaller and split across more mantissa bits.I agree that in some cases summing the parts can reduce the apparent error.

When subtracting two near equal values (a-b), the accuracy is not due to the last bit adjustment, but an assessment of the available accuracy of a and b. So I often find that the maximum error estimate, based on the (storage) accuracy of each input value, can be more significant than the error due to using the inverse.

It is in sums, rather than products that accuracy needs to be reviewed.

The main question is how big the loss in accuracy is compared to the performance gains.

In your example, the error probably is about 1 ulp? That is usually well within tolerance.