Shortcut to finding the square-root of a perfect-square?
I've been trying to speed up an algorithm whose most expensive operation is the square root. However, I can often guarantee that the input value is a perfect square. I'm curious to know whether there are any algorithms that will find the square root faster (constant time?) when the input is known to be a perfect square.

Thanks,
Ryan
Tags: square-numbers
asked yesterday by Ryan Pierce Williams
How large are your squares? 48 binary digits? 64? 1000? And since this is very applied numerical math: what machine are we running on? Or is this configurable hardware and we can implement our own logic?
– Marcus Müller, yesterday

And: do you need to make a decision after every square root calculation, or can you calculate multiple square roots in parallel (e.g. data parallelism)?
– Marcus Müller, yesterday

If you can run on a cluster, you have data parallelism. (Probably it's better to discard the Pi cluster idea: it's very little bang for the initial buck and little bang for the power-bill buck; embedded computers are not optimized for number crunching, much unlike modern desktop CPUs.)
– Marcus Müller, yesterday

Practical idea: depending on the use case, you may be able to compute your square root earlier and have it ready at the bottleneck, assuming you're doing some earlier computation to ensure this number is a square. Memoization is also an option, but perhaps that's too much memory. To elaborate: get n^2, then split a thread. One thread does your usual thing; the other computes the square root. The savings will depend on whether you are doing more than just finding root n.
– Artimis Fowl, yesterday

By the way, GMP comes with an integer sqrt function, and I don't think you'll be much faster than that on large integers.
– Marcus Müller, yesterday
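To illustrate that last comment, here is a small usage sketch of GMP's integer square root (my own example, not from the thread; compile with -lgmp). mpz_sqrt computes the truncated square root, which is exact when the input is a perfect square:

```cpp
#include <gmp.h>

int main(void) {
    mpz_t n, root;
    mpz_inits(n, root, NULL);
    // Build a guaranteed perfect square, then recover its root.
    mpz_set_str(root, "123456789012345678901234567890", 10);
    mpz_mul(n, root, root);        // n = root^2
    mpz_sqrt(root, n);             // truncated integer square root of n
    gmp_printf("%Zd\n", root);     // prints 123456789012345678901234567890
    mpz_clears(n, root, NULL);
    return 0;
}
```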
4 Answers
Answer by Tiwa Aina (12 votes):
The sum of the first $k$ odd numbers is $k^2$. Knowing this, you can calculate the square root by summing successive odd numbers (starting from one); once you reach the input value, return the number of summations you made.

For example, $16 = 1 + 3 + 5 + 7$; that's $4$ addends, so $\sqrt{16}=4$. This process will always work, since our input is guaranteed to be of the form $k^2$ with $k \in \mathbb{N}$.

I think this method would run in $O(\sqrt n)$.
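A minimal sketch of this method (my own illustration, assuming the input is a perfect square that fits in 64 bits):

```cpp
#include <cstdint>

// Since 1 + 3 + 5 + ... + (2k-1) = k^2, counting how many successive
// odd numbers it takes to reach n recovers k = sqrt(n).
uint64_t isqrt_by_odd_sums(uint64_t n) {
    uint64_t sum = 0, odd = 1, count = 0;
    while (sum < n) {
        sum += odd;  // running total 1 + 3 + 5 + ...
        odd += 2;
        ++count;     // number of addends so far
    }
    return count;    // equals sqrt(n) when n is a perfect square
}
```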
+1. I would edit your answer to make your use of $n$ consistent between the first and last sentences. (Perhaps prefer changing $n$ to $k$ in your first sentence to match the second paragraph.)
– Brian Tung, yesterday

Good answer :) I actually tried this myself. Unfortunately, as n gets larger this approach takes a lot more time than more general approaches like Newton's Method.
– Ryan Pierce Williams, yesterday

However, I wonder if this is actually faster than the existing sqrt implementation in most compiled languages.
– Brian Tung, yesterday

Newton's algorithm converges much faster (on average; worst case: your integer is the square of two large primes, then complexity approaches $\mathcal O(\sqrt n)$), I think.
– Marcus Müller, yesterday

Terrible. Remember that $n$ here is the number itself, which is of order $2^s$ where $s$ is the number of bits used to represent the number; so $\sqrt n = \sqrt{2^s} = (\sqrt 2)^s$, which is exponential in the input size. Newton's algorithm is polynomial in $s$.
– user202729, yesterday
Answer by Marcus Müller (6 votes):
With integers within sensible bounds compared to what your CPU can natively compute, it can be quite easy to restrict the range of numbers you have to binary search to find the square root of $x$.

0. Remove blocks of two trailing 0s from your binary number. Each block you remove contributes one factor of 2 to the result of the following steps. This can be done in constant time, if I'm not mistaken: observe the structure of "subtract 1 and XOR with the input" for numbers with $t$ trailing 0s, then use the POPCNT (Hamming weight) instruction of most serious CPUs. After removing these 0s, i.e. dividing by a power of $4$, you'll end up with an odd number; if you end up with an even number after removing the pairs (i.e. the trailing-zero count was odd), your number is not a perfect square.
1. Find $k=\lfloor\log_2 x\rfloor$; see https://graphics.stanford.edu/~seander/bithacks.html
2. Let $a=\lfloor k/2\rfloor$.
3. Thus, $2^a$ becomes a lower limit for $\sqrt x$ and $2^{a+1}$ an upper one. Both values can be found by bit-shifting 1.
4. From here, do a binary search¹.

I doubt you'd be much faster than converting to floating point and letting the FPU do it in hardware, giving you an approximate value, convertible back to integer, from which you only need to search a small range (namely, the lost precision) for the actual integer square root.

Note that in problems such as yours, algorithmic elegance often plays a minor role: it needs to be fast on actual hardware, so execution that avoids a lot of memory interaction is a good thing, and with SIMD instructions, doing four to sixteen operations of the same type takes about as long as doing one. So if you just need to test a few integers for their square, modifying your algorithm to try four in parallel is far more efficient than saving half of the operations.

You have a technological problem, not so much a numerical one.

¹ Binary search assumes that you can do one squaring and one comparison at a time; as hinted at before, you might very well be able to divide your interval into five search chunks by calculating four products at once and comparing four numbers at once using SIMD. This further hints that even if there is no constant-time algorithm (and I'm pretty sure there's none), you can do better than $\mathcal O(n^2\cdot\log_2 x)$; compare Fürer's algorithm.
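A scalar sketch of steps 0 to 4 (my own code, not the answerer's; __builtin_clzll is the GCC/Clang bit-scan builtin, and the input is assumed to be a positive perfect square below $2^{62}$ so the squaring below cannot overflow):

```cpp
#include <cstdint>

uint64_t isqrt_range_bisect(uint64_t x) {
    // Step 0: strip pairs of trailing zero bits; each pair stripped
    // contributes one factor of 2 to the final root.
    uint64_t factor = 1;
    while ((x & 3) == 0) { x >>= 2; factor <<= 1; }
    // Step 1: k = floor(log2 x), here via a compiler builtin rather
    // than the bithacks tricks linked above.
    int k = 63 - __builtin_clzll(x);
    // Steps 2-3: bounds 2^a <= sqrt(x) < 2^(a+1), found by shifting 1.
    uint64_t lo = 1ULL << (k / 2);
    uint64_t hi = lo << 1;
    // Step 4: binary search within [lo, hi].
    while (lo < hi) {
        uint64_t mid = lo + (hi - lo) / 2;
        if (mid * mid < x) lo = mid + 1; else hi = mid;
    }
    return lo * factor;
}
```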
Step 1 assumes the input is a machine-sized integer, which may not be the case.
– user202729, yesterday

Well, if it is some kind of big integer: whatever big-integer library you're using somehow has to keep track of the length in binary digits anyway.
– Marcus Müller, yesterday

(Of course it's impossible to do it in constant time.)
– user202729, yesterday

Did you try to time it? The binary search requires a lot of multiplications for large numbers, so my bet is that this is much slower than the built-in square root function (this is what I find with a simple test in C++ using long long integers, though always with the caveat that there may be a better implementation than what I did).
– Winther, yesterday

"Built-in square root function": of what, operating on what data types?
– Marcus Müller, yesterday
Answer by Aeryk (3 votes):
I think the only advantage gained by having a perfect square in analytic methods is that you know an iterative algorithm will actually terminate. So instead, here is a number-theoretic solution that'll work for numbers less than $2^{66}$.

Fact 1: If $p$ is a prime with $p \equiv 3 \pmod 4$ and $x$ is a perfect square $\bmod p$, then $$x \equiv \left(x^{(p+1)/4}\right)^2 \pmod p,$$ i.e. you can compute the modular square root by exponentiating by $(p+1)/4$. (See https://crypto.stackexchange.com/a/20994/18112)

Fact 2: The numbers $m_{17}=2^{17}-1$, $m_{19}=2^{19}-1$, and $m_{31}=2^{31}-1$ are (Mersenne) primes whose product is greater than $2^{66}$.

Method: Let $S$ be the square whose root $t$ you'd like to find. Compute the following:
$$t_{17} \equiv S^{2^{15}} \pmod{m_{17}}$$
$$t_{19} \equiv S^{2^{17}} \pmod{m_{19}}$$
$$t_{31} \equiv S^{2^{29}} \pmod{m_{31}}$$
Then the Chinese Remainder Theorem gives $$t \equiv \pm 31207\, t_{17} m_{19} m_{31} \pm 298611\, m_{17} t_{19} m_{31} \pm 413071270\, m_{17} m_{19} t_{31} \pmod{m_{17} m_{19} m_{31}},$$ and you check these 8 possibilities.

Remarks: I don't know how computationally efficient this is; it's more of a mathematical solution taking advantage of knowing that $S$ is a square. I would venture to guess it's about as "constant time" as you could get, as the number of steps is essentially fixed, but that constant may be larger than the $\sqrt n$ of other methods for this range of $n$.
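As a sketch of Fact 1 only (my own helper names; the CRT recombination across the three Mersenne primes is left out), the modular square root mod a prime $p \equiv 3 \pmod 4$ is one modular exponentiation:

```cpp
#include <cstdint>

// unsigned __int128 is a GCC/Clang extension, used here to keep the
// intermediate products from overflowing 64 bits.
static uint64_t mulmod(uint64_t a, uint64_t b, uint64_t m) {
    return (uint64_t)((unsigned __int128)a * b % m);
}

// Square-and-multiply modular exponentiation: base^exp mod m.
static uint64_t powmod(uint64_t base, uint64_t exp, uint64_t m) {
    uint64_t r = 1 % m;
    base %= m;
    while (exp > 0) {
        if (exp & 1) r = mulmod(r, base, m); // fold in the current bit
        base = mulmod(base, base, m);        // square for the next bit
        exp >>= 1;
    }
    return r;
}

// Fact 1: for a prime p = 3 (mod 4) and x a quadratic residue mod p,
// x^((p+1)/4) is a square root of x mod p (the other root is p minus it).
static uint64_t sqrt_mod(uint64_t x, uint64_t p) {
    return powmod(x % p, (p + 1) / 4, p);
}
```

For example, with $p = m_{31} = 2^{31}-1$, sqrt_mod(S, p) yields $t_{31}$ up to sign.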
(1) I've never heard of a practical square-root algorithm that may not terminate. (2) (As I pointed out in a comment under the question) it's impossible to do this in constant time, as storing a number $n$ takes at least $O(\log n)$ bits.
– user202729, yesterday

(3) Note that for $n$ in a constant range (in this case $[1 \dots 2^{66}]$), any (deterministic, correct) algorithm must run in constant asymptotic time complexity. (4) It's only necessary for $p$ to be larger than $2\sqrt n$ instead of $n$; the proof is left to the reader. ... and ...
– user202729, yesterday

@user202729 You're totally right about the 8-solution ambiguity. I made the correction in the exposition above. Thanks!
– Aeryk, 19 hours ago
Answer by alcana (1 vote):
I think a binary-search-type algorithm would be quite efficient for large input values if we know the input is a perfect square.

Let $n$ be the input value.

1. Begin with two integers $a$ and $b$ such that $a^2 < n$ and $b^2 > n$. We could use $a=0$ and $b=n$.
2. Find the midpoint of $a$ and $b$, rounding down to the nearest integer if necessary. Call this $m$.
3. Find $m^2$. If $m^2=n$, then $m$ is the square root. If $m^2>n$, then $m$ is too high, so we return to step 2 with $a$ and $m$ as our two integers. If $m^2<n$, then $m$ is too low, so we return to step 2 with $m$ and $b$ as our two integers. Repeat until the square root is found.

The squaring of $m$ may be what slows the algorithm down; however, I believe that multiplication algorithms are implemented in processor hardware and are therefore very efficient. In terms of the number of operations, I believe the binary search would run in logarithmic time and therefore be preferable to $O(\sqrt n)$ for large input values. However, I am no expert on algorithm efficiency...
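A direct sketch of these steps (my own code; it assumes $n$ is a perfect square with $4 \le n < 2^{32}$, so that $0^2 < n < n^2$ holds and $m^2$ fits in 64 bits):

```cpp
#include <cstdint>

uint64_t isqrt_bisect(uint64_t n) {
    uint64_t a = 0, b = n;            // step 1: a^2 < n and b^2 > n
    while (true) {
        uint64_t m = (a + b) / 2;     // step 2: midpoint, rounded down
        uint64_t sq = m * m;          // step 3: compare m^2 with n
        if (sq == n) return m;        // m is the square root
        if (sq > n)  b = m;           // m too high: continue with (a, m)
        else         a = m;           // m too low:  continue with (m, b)
    }
}
```

Because $n$ is a perfect square, the strict bounds $a^2 < n < b^2$ force $b \ge a+2$, so the midpoint always lies strictly between $a$ and $b$ and the loop terminates.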
Thanks for your answer, alcana :) This is a more general algorithm for the square root (it would work for non-perfect squares as well, with some minor modifications to the logic for determining when we are done). It doesn't take advantage of the special property of the input value: that it is a perfect square. It is faster than the previous answer as the input size increases, however :)
– Ryan Pierce Williams, yesterday

@RyanPierceWilliams - also note that on many current Intel processors, multiplies are optimized and only take 1 to 3 cycles, so for larger n, the binary search using multiply and compare should be faster.
– rcgldr, yesterday

Isn't this very similar to Newton's method, just less efficient because it doesn't use deltas to correct the prediction, just blind binary split testing?
– ktb, yesterday

Yes, this is less efficient than Newton's method.
– user202729, yesterday