-
Notifications
You must be signed in to change notification settings - Fork 12k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improved integer square root. #4403
Changes from 2 commits
171e5c2
beec057
a48fda8
b6c29b1
0d2bbb3
1c8e8ab
8e2ab78
77db52d
8b3f15f
dc098e8
bd9d969
dfe2fbc
a53d378
6c74012
fe8abc3
382ab6f
c60542b
ddea292
360f65d
4288c74
256189a
6d00bbd
6c8be26
67d0328
85a56d6
e748bfc
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -224,39 +224,31 @@ library Math { | |
|
||
/** | ||
* @dev Returns the square root of a number. If the number is not a perfect square, the value is rounded down. | ||
* | ||
* Inspired by Henry S. Warren, Jr.'s "Hacker's Delight" (Chapter 11). | ||
*/ | ||
function sqrt(uint256 a) internal pure returns (uint256) { | ||
if (a == 0) { | ||
return 0; | ||
} | ||
|
||
// For our first guess, we get the biggest power of 2 which is smaller than the square root of the target. | ||
// | ||
// We know that the "msb" (most significant bit) of our target number `a` is a power of 2 such that we have | ||
// `msb(a) <= a < 2*msb(a)`. This value can be written `msb(a)=2**k` with `k=log2(a)`. | ||
// | ||
// This can be rewritten `2**log2(a) <= a < 2**(log2(a) + 1)` | ||
// → `sqrt(2**k) <= sqrt(a) < sqrt(2**(k+1))` | ||
// → `2**(k/2) <= sqrt(a) < 2**((k+1)/2) <= 2**(k/2 + 1)` | ||
// | ||
// Consequently, `2**(log2(a) / 2)` is a good first approximation of `sqrt(a)` with at least 1 correct bit. | ||
uint256 result = 1 << (log2(a) >> 1); | ||
|
||
// At this point `result` is an estimation with one bit of precision. We know the true value is a uint128, | ||
// since it is the square root of a uint256. Newton's method converges quadratically (precision doubles at | ||
// every iteration). We thus need at most 7 iteration to turn our partial result with one bit of precision | ||
// into the expected uint128 result. | ||
unchecked { | ||
if (a <= 1) { return a; } | ||
if (a >= ((1 << 128) - 1)**2) { return (1 << 128) - 1; } | ||
uint256 aAux = a; | ||
uint256 result = 1; | ||
if (aAux >= (1 << 128)) { aAux >>= 128; result <<= 64; } | ||
if (aAux >= (1 << 64 )) { aAux >>= 64; result <<= 32; } | ||
if (aAux >= (1 << 32 )) { aAux >>= 32; result <<= 16; } | ||
if (aAux >= (1 << 16 )) { aAux >>= 16; result <<= 8; } | ||
if (aAux >= (1 << 8 )) { aAux >>= 8; result <<= 4; } | ||
if (aAux >= (1 << 4 )) { aAux >>= 4; result <<= 2; } | ||
if (aAux >= (1 << 2 )) { result <<= 1; } | ||
result += (result >> 1); | ||
result = (result + a / result) >> 1; | ||
result = (result + a / result) >> 1; | ||
result = (result + a / result) >> 1; | ||
result = (result + a / result) >> 1; | ||
result = (result + a / result) >> 1; | ||
result = (result + a / result) >> 1; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I only count 7 iterations, yet the PDF you linked shows that in some cases, 8 are required. I'm worried about that part. Do you have an example of value that needs 8 iterations that we could run both version of the code with ? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. First, Newton iteration refers to So, As to your comments about "8 cases are required", you are misunderstanding the comment and figure. If only Newton iterations are performed and the same initialization is used that is currently used by OZ, then it sometimes takes 8 Newton iterations to reach the correct value. Note that this does not apply either in the current OZ code nor the proposed change. That figure was primarily meant to show why a good initialization value is required for efficient computation. |
||
result = (result + a / result) >> 1; | ||
return min(result, a / result); | ||
if (result * result <= x) { | ||
chgorman marked this conversation as resolved.
Show resolved
Hide resolved
|
||
return result; | ||
} | ||
return result-1; | ||
} | ||
} | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a risk of overflow ? Can you document which part is succeptible ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using this specific method and return logic, it is possible to overflow when computing
result**2
during the checkresult**2 <= a
, because it may be thatresult == 2**128
; thus,result**2 == 0
because this isunchecked
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
interresting. I guess this is not necessary if you do
min(result, a/result)
so the current version is good in that regard.I'd rewrite that as:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if the cost of that test outweight the cost of the
min
at the end.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current version has no proof of correctness. This proposed change does. See Appendix B.4.3. Have you looked at that report?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code was updated with suggestion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For reference, the overflow check (
if (a >= uint256(type(uint128).max)**2) { return type(uint128).max; }
) costs 23 gas.