-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Thoughts on Exactness #181
Comments
I agree that the 0.1 + 0.2 === 0.3 example can be worked around. Some initial thoughts:
I agree that more education is needed. At a minimum, I hope this GitHub issue gets indexed and JS programmers discover this and can learn something here. There are a number of websites out there that explain what binary floats are and some tips & tricks, such as suitable use of |
Regarding adding toPrecision intermediate stages: The fundamental problem with this approach is multiple rounding. It's difficult even for experts to analyze code that does multiple stages of rounding. Here's an example:
OK, so you then try this in stages where in the first stage you do toPrecision(3) and in the next stage you do toPrecision(2). But that's still wrong:
so it seems at first glance to do round-ties-to-odd. But wait!
Such multiple rounding is a well-known antipattern among folks familiar with numerics. It's to be strongly avoided except in cases where either you prefer fast over correct arithmetic (in which case you're probably using Numbers anyway) or you can construct a proof that it works. If you're not using a decimal library, best practices are to scale the numbers into integral Numbers or BigInts and do arithmetic on those. |
The concept of "rounding errors" is misplaced. You aren't "rounding" your Number so long as toPrecision has more digits of precision than is intended to be represented. You're "rounding away" binary float artifacts. If done correctly, calling toPrecision at each step is totally safe and well defined. If in doubt, call it with an argument of 15. |
That response is factually incorrect. Try my examples replacing the 3 with 15 and you'll still get wrong answers. |
I'm not following what you're trying to say, then. In your post, all of these are identities:
You could have made the same point about how binary floats "sometimes round up, sometimes round down" by writing 1.05.toPrecision(2) returns "1.1"
1.15.toPrecision(2) returns "1.1"
1.25.toPrecision(2) returns "1.3"
1.35.toPrecision(2) returns "1.4" My point is that any |
I think the following ultra simple function works for typical use cases: Number.prototype.decimalAdd = function(operand) {
return Number((this + operand).toPrecision(15));
} Note: Despite calling a rounding function at each step, this code prevents the accumulation of "rounding errors" because the Observation: Cases such as "0.1 + 0.2" should always return either the expected decimal equivalent number or the number exactly one epsilon up or down. The binary answer should never differ from the decimal answer by more than one epsilon. I do not have a proof for this theorem. To-do: find something more clever than hard-coding |
What is an "epsilon"? It is the distance between a reference binary float and the binary float that is exactly 1 tick up or down the number line from the reference. For example, to get the next binary float up or down, something like this does the job almost all of the time (exception: when on the boundary between two powers of 2): Number.prototype.nextFloat = function() {
let buffer = new ArrayBuffer(8);
let floatArray = new Float64Array(buffer);
let intArray = new BigInt64Array(buffer);
floatArray[0] = this;
intArray[0] += 1n;
return floatArray[0];
} With that function, you can observe the following: 0.1 + 0.2 === 0.3.nextFloat() // true Another fun little function: count the number of discrete floats between two numbers (same constraint as above: don't cross a power of 2): function countFloatsBetween(a, b) {
let buffer = new ArrayBuffer(16);
let floatArray = new Float64Array(buffer);
let intArray = new BigInt64Array(buffer);
floatArray[0] = a;
floatArray[1] = b;
return intArray[1] - intArray[0];
} Example: take 0.3 and the next decimal number at 15 significant digits: countFloatsBetween(0.3, 0.300000000000001)
// 18n
// Observe how the number of epsilons is not constant:
[
countFloatsBetween(0.1, 0.100000000000001),
countFloatsBetween(0.2, 0.200000000000001),
countFloatsBetween(0.3, 0.300000000000001),
countFloatsBetween(0.4, 0.400000000000001),
countFloatsBetween(0.5, 0.500000000000001),
countFloatsBetween(0.6, 0.600000000000001),
countFloatsBetween(0.7, 0.700000000000001),
countFloatsBetween(0.8, 0.800000000000001),
countFloatsBetween(0.9, 0.900000000000001),
]
// [72n, 36n, 18n, 18n, 9n, 9n, 9n, 9n, 9n]
// It resets at powers of 10 because we move the rightmost digit to the left:
[
countFloatsBetween(1, 1.00000000000001),
countFloatsBetween(9, 9.00000000000001),
countFloatsBetween(10, 10.0000000000001),
countFloatsBetween(99, 99.0000000000001),
countFloatsBetween(100, 100.000000000001),
]
// [45n, 6n, 56n, 7n, 70n] My little You can also see how we could be more clever when picking the number of digits when performing the operation. It should be quick to determine whether 15, 16, or 17 digits should be used for a particular number. I picked 15 because I think it always works. This post should also serve to answer @jessealama's two questions: this approach generalizes up to the capacity of the binary float in decimal space, which is 15-17 significant digits; and yes, this generally works as the number and variety of operations increases. |
Actually no, it's not the same problem. Sorry. There is a Test262 which does the opposite of this. assert.sameValue(Math.sumPrecise([1e308, 1e308, 0.1, 0.1, 1e30, 0.1, -1e30, -1e308, -1e308]), 0.30000000000000004); So we would need a different algorithm for my |
@sffc The way you implement (1.551).decimalAdd(-1.55); // returns 0.00099999999999989 instead of 0.001 it is supposed to return |
The README throws around the word "exact". It discusses how 0.1 as a binary float is not actually 0.1 in memory. It states, "Statistically, most human-authored decimal numbers cannot be exactly represented as a binary floating-point number (AKA float)."
These are true statements.
However, floating point numbers do have a canonical decimal equivalent. This is easy to see, because all decimal numbers with less than approximately 15 significant digits have a unique, one-to-one mapping with a binary float, an equivalence relation. (For this post, I'll assume that f64 has enough significant digits to be sufficient for the use cases Decimal seeks to cover. This constraint can be relaxed, but it is not relevant to the main point of this thread.) This behavior is exposed already in JavaScript today via string conversion.
The algorithms for binary-to-decimal and decimal-to-binary are well researched, efficient, continue to evolve, and are already implemented in major browsers.
What does this mean?
The most common argument I see for the footgun of float arithmetic is this:
I understand that this is surprising behavior. However, there are ways to solve this problem without using decimals as the numeric type.
Rounding More Often
For most operations, it's totally fine and expected to round more often, which will essentially take the binary float and make it the expected decimal equivalent.
This is currently accessible in the language via
Number.prototype.toPrecision
, which returns the string form of the equivalence relation.More developers should do this, and we should improve documentation and introductory computer science classes.
We could even add a helper function that behaves like
toPrecision
but returns aNumber
.Number.prototype.decimalAdd
Alternatively, it is also possible to write a function that works like this:
or even invent a new operator to make it look nicer
The contract of
Number.prototype.decimalAdd
is: add the binary floats as if adding their decimal equivalent, and return the binary float corresponding to the result.I claim that this can be implemented efficiently. If the efficiency of this operation is the crux, I'm happy to go into this further.
My point is that the concept of "exactness" does not imply that Decimal128 is required as a numeric data type. Exactness can be achieved with Float64 in both representation (via the equivalence relation) and arithmetic (via either developer education or helper functions on Number).
The text was updated successfully, but these errors were encountered: