This 15-digit accuracy fails for numbers that outside the valid range. For example, the number 1e500 is too large and cannot be directly represented using standard 64-bit floating-point numbers. Similarly, 1e-500 is too small and it can only be represented as zero.
The range of 64-bit floating-point number might be defined as going from 4.94e-324 to 1.8e308 and -1.8e308 to -4.94e-324, together with exactly 0. However, this range includes subnormal numbers where the relative accuracy can be small. For example, the number 5.00000000000000e-324 is best represented as 4.94065645841247e-324, meaning that we have zero-digit accuracy.
For the 15-digit accuracy rule to work, you might remain in the normal range, e.g., from 2.225e−308 to 1.8e308 and -1.8e308 to -2.225e−308. There are other good reasons to remain in the normal range, such as poor performance and low accuracy in the subnormal range.
To summarize, standard floating-point numbers have excellent accuracy (at least 15 digits) as long you remain in their normal range which is between 2.225e−308 to 1.8e308 for positive numbers.
4 thoughts on “Floats have 15-digit accuracy in their normal range”
Some libraries such as Facebook AI FAISS use single precision, so they only have 7 digits.
When working with squared values, e.g., when computing variance, covariance, PCA, or kmeans, going back from squares to linear values tends to cut this precision in half. So 4 digits for single precision, 8 for double.
Try computing the variance of 10000, 10002 in some of these tools… or of 1000000 and 1000000.02 in some SQL databses.
The variance of 100000000 and 100000002 is a problem with 32-bit floating-point numbers, but I think that the variance of 10000 and 10002 is fine.
Can you elaborate ?
Actually, squared values are an exponent problem, not a mantissa problem. Squaring can be thought of (very approximately) as moving the msbit of the mantissa to the lsbit of the exponent.
So any two distinct floats have distinct squares, if there is no over/underflow This is not true in reverse; on average each float has one other which shares the same square root.
So this mean that an algorithm which computes an accurate variance loses one additional lsbit when converting to standard deviation, but that’s not horrible.
The problem with variance is cancellation; the naive algorithms require a large amount of excess precision. But if you have a good estimate of the mean (e.g. using a first pass), then the two-pass compensated algorithm produces an excellent variance without needing extra-precision temporaries.
Several libraries still use the naive approach when computing such values!
E.g. FAISS for PCA (and PCA based indexing…)
E.g., sklearn when computing Euclidean distance using sqrt(a²+b²-2ab) – but they always use double precision here to lessen the effects for single precision input data.
You may subscribe to this blog by email.