Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EPSILON is a bad error margin and should not be recommended [float_cmp] #6816

Closed
CAD97 opened this issue Mar 1, 2021 · 12 comments · Fixed by #13079 · May be fixed by #11948
Closed

EPSILON is a bad error margin and should not be recommended [float_cmp] #6816

CAD97 opened this issue Mar 1, 2021 · 12 comments · Fixed by #13079 · May be fixed by #11948
Labels
A-documentation Area: Adding or improving documentation C-bug Category: Clippy is not doing the correct thing good-first-issue These issues are a good way to get started with Clippy L-suggestion Lint: Improving, adding or fixing lint suggestions

Comments

@CAD97
Copy link
Contributor

CAD97 commented Mar 1, 2021

[f32|f64]::EPSILON are the machine epsilon of the type, or (as stated in the Rust docs), the distance between 1.0 and the next representable floating point number.

The page linked in the more info specifically says that abs( a - b ) < epsilon is wrong for any value of espilon. However, it's especially egregious with f__::EPSILON, because for floating point numbers outside the range -2..=2, floating point numbers cannot be f__::EPSILON close, so abs( a - b ) < f__::EPSILON is actually equivalent to a strict equality check.

There isn't a generally applicable solution to recommend. The most thorough resource I've found suggests comparison in ULPs for testing against a non-zero number, and testing against a fixed epsilon (but one bigger than f__::EPSILON.

At the least, we (and probably std) shouldn't be recommending comparing against f__::EPSILON, as it's basically as poor as bitwise equality and gives a false sense of handling the problem, when it isn't really handled.

@llogiq
Copy link
Contributor

llogiq commented Mar 1, 2021

When I was still writing float-heavy code, I always had a function to calculate a suitable epsilon-value based on the values to compare and the number of mantissa bits that I expected to be equal.

@camsteffen
Copy link
Contributor

So then should we lint against abs(a - b) < f__::EPSILON?

@camsteffen camsteffen added A-documentation Area: Adding or improving documentation C-bug Category: Clippy is not doing the correct thing good-first-issue These issues are a good way to get started with Clippy L-suggestion Lint: Improving, adding or fixing lint suggestions labels Mar 2, 2021
@camsteffen
Copy link
Contributor

This also has implications for float_equality_without_abs.

@CAD97
Copy link
Contributor Author

CAD97 commented Mar 2, 2021

So then should we lint against abs(a - b) < f__::EPSILON?

I'm not completely certain. In cases where a and b are expected to be "small" (for some value of "small" smaller than 1), this might be a reasonable comparison epsilon, because ULPs for small enough floats are really small, and the machine epsilon might be a reasonable bound for "small enough". At the same time, the article I linked recommends some small multiple of the machine epsilon for these cases.

That said, I do think that abs(a - b) < f__::EPSILON most likely came about by an unfortunately misguided attempt to address the "imprecise floats problem." As such, I think a lint against specifically < f__::EPSILON is fair, so long as it doesn't fire for small multiples of the machine epsilon, as that shows that some amount of thought went into picking the comparison epsilon.

To be extra pedantic, a pedantic lint against abs(a - b) < ANY_FLOAT_CONST where a and b are function arguments could be argued for, pointing out how a fixed epsilon is subtly wrong for arbitrary floating point due to its varrying precision.

I don't have any strong opinions on linting float comparisons, though, beyond not recommending comparison against machine epsilon.

@ghost
Copy link

ghost commented Feb 16, 2022

The old classic "What Every Computer Scientist Should Know About Floating-Point Arithmetic" says that when a real number is rounded to the closest floating point number it's relative error is bounded by machine epsilon. Note that is relative error not absolute.

So based on that I would expect the correct comparison to be abs( (a - b)/a ) <= epsilon for a != 0. When a is zero I guess you could use fxx::MIN_POSITIVE.

@ghost
Copy link

ghost commented Feb 16, 2022

Oh, it looks like fxx::MIN_POSITIVE is the smallest positive normal value. So that wouldn't be right.

@trentj
Copy link

trentj commented Feb 18, 2023

This bad suggestion came up on URLO.

@CAD97
Copy link
Contributor Author

CAD97 commented May 10, 2024

This came up on IRLO again and will likely continue to do so until this is somehow categorically fixed.

@curoli
Copy link

curoli commented May 10, 2024

Is it that hard? The test should be:

2.0*abs(a-b) <= eps*(abs(a) + abs(b))

where eps is the relative precision the user needs. If eps is around f__::EPSILON or smaller, it becomes strict equality, which most algorithms will fail to achieve, so you want eps to be larger than f__::EPSILON, probably at least twice as large.

But before we do any of the above, consider alternatives:

  • Use integers if possible. Some calculations, like taxes, wages, bank statements, or election results, should not be implemented using floating points, but fixed point numbers, i.e. integers.
  • If you are checking whether a number is zero to see if you can divide by it, or whether the determinant of a matrix is zero to see whether it can be inverted, keep in mind that you will often get inaccurate results if these values are close to zero. Consider alternative algorithms. Consider recognizing that your algorithm does not work in all cases.
  • If you have an iterative algorithm that converges to a solution, instead of checking if the solution is good, check whether iterations make it better or not. Even better if you can figure out in advance how many iteration you need.

@curoli
Copy link

curoli commented May 10, 2024

Also, I'd say use f64 instead of f32. Now that most architectures are 64 bit, I'm not sure using f32 buys you anything, and the added precision of f64 makes things way more robust.

@jedbrown
Copy link

A backward-stable algorithm evaluating a function $f$ with condition number $\kappa$ has a relative error bounded by $\kappa , \epsilon_{\text{machine}}$. In many cases for which $f(x) = 0$, the condition number blows up. For example, the algorithm |x| (1.0 + x) - 1.0 is unstable because its second operation has unbounded condition number as $x\to 0$, and indeed produces relative error of size 1. We can't give reliable advice for a (relative or absolute) tolerance without knowing the condition numbers involved. In the above example, the condition number of the entire function is 1 (great!) but the algorithm is unstable and thus violates any useful relative error bound. It would be nice if users always wrote backward-stable algorithms, but it's going to be really confusing if the lint is telling people to test with a tolerance that can only be achieved with by doing so. When the functions are differentiable, we could estimate a condition number and provide diagnostics about numerical stability using Enzyme (cf. rust-lang/rust#124509).

@sandersaares
Copy link
Contributor

sandersaares commented Jul 9, 2024

Submitted #13079 to fix this. As already described above in this thread, there are further implications for float_equality_without_abs and we should probably have an explicit "do not use plain < EPSILON" lint but these can be taken care of as separate items - focused on just the bad guidance right now.

bors added a commit that referenced this issue Jul 10, 2024
…=y21

Fix guidance of [`float_cmp`] and [`float_cmp_const`] to not incorrectly recommend `f__::EPSILON` as the error margin.

Using `f32::EPSILON` or `f64::EPSILON` as the floating-point equality comparison error margin is incorrect, yet `float_cmp` has until now recommended this be done. This change fixes the given guidance (both in docs and compiler hints) to not reference these unsuitable constants.

Instead, the guidance now clarifies that the scenarios in which an absolute error margin is usable, provides a sample implementation for using a user-defined absolute error margin (as an absolute error margin can only be used-defined and may be different for different comparisons) and references the floating point guide for a reference implementation of relative error based equality comparison for cases where absolute error margins cannot be identified.

changelog: [`float_cmp`] Fix guidance to not incorrectly recommend `f__::EPSILON` as the error margin.
changelog: [`float_cmp_const`] Fix guidance to not incorrectly recommend `f__::EPSILON` as the error margin.

Fixes #6816
@bors bors closed this as completed in a067cd2 Jul 10, 2024
bors added a commit that referenced this issue Jul 16, 2024
`float_cmp` changes

fixes #2834
fixes #6816

---

changelog: This:
* Deprecated `float_cmp_const` in favor of a config option on [`float_cmp`].
  [#11948](#11948)
* [`float_cmp`]: Don't lint literal and self comparisons.
  [#11948](#11948)
* [`float_cmp`] [#11948](#11948)
    * Add the [`float-cmp-ignore-named-constants`] configuration to ignore comparisons to named constants.
    * Add the [`float-cmp-ignore-change-detection`] configuration to ignore const-evaluatabled values.
    * Add the [`float-cmp-ignore-constant-comparisons`] configuration to ignore comparisons to the modification of an operand (e.g. `x == f(x)`).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-documentation Area: Adding or improving documentation C-bug Category: Clippy is not doing the correct thing good-first-issue These issues are a good way to get started with Clippy L-suggestion Lint: Improving, adding or fixing lint suggestions
Projects
None yet
7 participants