A simple algorithm to compute the square root of an integer, byte by byte

A reader asked me for some help in computing (1 – sqrt(0.5)) to an arbitrary precision, from scratch. A simpler but equivalent problem is to compute the square root of an integer (e.g., 2). There are many sophisticated algorithms for such problems, but we want something relatively simple. We’d like to compute the square root bit by bit…

For example, the square root of two is…

  1. 5 / 4
  2. 11 / 8
  3. 22 / 16
  4. 45 / 32
  5. 90 / 64
  6. 181 / 128

More practically, 8-bit by 8-bit, we may want to compute it byte by byte…

  1. 362 / 256
  2. 92681 / 65536
  3. 23726566 / 16777216

How can we do so?

Intuitively, you could compute the integer part of the answer by starting with 0 and incrementing a counter like so:

x1 = 0
while (x1+1)**2 <= M:
  x1 += 1

Indeed, the square of the integer part cannot be larger than the desired power.

You can repeat the same idea with the fractional part… writing the answer as x1+x2/B+... smaller terms.

x2 = 0
while (x1*B + x2 + 1)**2 <= M*B**2:
  x2 += 1

It will work, but it involves squaring ever larger numbers. That is inefficient.

We don’t actually need to compute powers when iterating. If you need to compute x**2, (x+1)**2, (x+2)**2, etc. You can instead use a recursion: if you have computed (x+n)**2 and you need the next power, you just need to add 2(x+n) + 1 because that’s the value of (x+n+1)**2 (x+n)**2.

Finally, we get the following routine (written in Python). I left the asserts in place to make the code easier to understand:

B = 2**8 # or any other basis like 2 or 10
x = 0
power = 0
limit = M
for i in range(10): # 10 is the number of digits you want
  limit *= B**2
  power *= B**2
  while power + 2*x + 1 <= limit:
    power += 2*x + 1
    x += 1
    assert(x**2 == power)
    assert(x**2 <= limit)
# x/B**10 is the desired root 

You can simplify the code further by not turning the power variable into a local variable within the loop. We subtract it from the power variable.

B = 2**8
x = 0
limit = M
for i in range(10):
  limit *= B**2
  power = 0
  while power + 2*x + 1 <= limit:
    power += 2*x + 1
    x += 1
  limit -= power
# x/B**10 is the desired root 

The algorithm could be further optimized if you needed more efficiency. Importantly, it is assumed that the basis is not too large otherwise another type of algorithm would be preferable. Using 256 is fine, however.

Obviously, one can design a faster algorithm, but this one has the advantage of being nearly trivial.

Further reading: A Spigot-Algorithm for Square-Roots: Explained and Extended by Mayer Goldberg

Credit: Thanks to David Smith for inspiring this blog post.

Daniel Lemire, "A simple algorithm to compute the square root of an integer, byte by byte," in Daniel Lemire's blog, April 11, 2024.

Published by

Daniel Lemire

A computer science professor at the University of Quebec (TELUQ).

Leave a Reply

Your email address will not be published.

You may subscribe to this blog by email.