Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Option errorTolerance is misnamed and theoretically wrong #18

Open
cshaa opened this issue Sep 7, 2018 · 8 comments · May be fixed by #19
Open

Option errorTolerance is misnamed and theoretically wrong #18

cshaa opened this issue Sep 7, 2018 · 8 comments · May be fixed by #19

Comments

@cshaa
Copy link

cshaa commented Sep 7, 2018

As can be seen in index.js and errorCalculation.js, the option named errorTolerance is there to stop the program after the sum of absolute values of residuals drops below some critical value.

This, however, makes little sense, as 1) it's named error tolerance despite errors and residuals being two different concepts and 2) the algorithm minimizes χ², the sum of squares of residuals, not their absolute values. Furthermore I don't see why would anybody want to stop at a particular value of χ² (except for χₘᵢₙ² + 1 for which you still need to know the minimum value), as one usually has no idea how good the fit is going to be – that's the main reason they use this algorithm, isn't it?

I propose deprecating this option and replacing it with something like parameterEpsilon which would stop the loop when none of the parameters changed by more than this constant. (Which still can be tricky with some functions so I'm looking forward for your opinions.)

@cshaa
Copy link
Author

cshaa commented Sep 8, 2018

Okay, I checked Wikipedia and it says:

If either the length of the calculated step δ or the reduction of sum of squares from the latest parameter vector β + δ fall below predefined limits, iteration stops, and the last parameter vector β is considered to be the solution.

It seems that it really is standard to use sum of residuals, not the sum of their squares, contrary to my previous comment. But they use the reduction of that sum, not the sum itself. Therefore it seems that the current implementation was just an oversight and could be classified as a bug, and should be repaired even if it slightly changes the API.

The name, however, is quite unfortunate as the value has nothing to do with tolerance to errors. The same is true with result.parameterError which is again the sum of residuals, not the approximated error of parameters as one would expect. These names should probably change in the next version.

EDIT: It seems that the reduction of χ² gets really low during the computation. The algorithm often stops at iteraton 5 even for values as low as 1e-15. If it's not just some kind of trivial typo I made, I might need help with this issue 😞

@cshaa cshaa linked a pull request Sep 8, 2018 that will close this issue
7 tasks
@cshaa
Copy link
Author

cshaa commented Sep 8, 2018

Fine, the reason why the algorithm stops very early even for really small values of errorTolerance is that… it's really that slow! 😮 Somebody forgot to update the damping parameter in the step method, so now it's very far from the steepest descend method. I'm going to correct this in the PR.

@jobo322
Copy link
Member

jobo322 commented Sep 8, 2018

Hello m93a, I am working on this package, update the damping parameter is a good and necessary improvement for this package, may be this paper can help you.

@cshaa
Copy link
Author

cshaa commented Sep 9, 2018

Thanks for the paper, I'll take a look at that!

I've implemented the adaptive damping parameter in PR and the algorithm got significantly faster, often converging in under 20 steps. Sadly some tests began to fail after the upgrade. I believe that some of them were just too specific and were tuned to match the result of the algorithm and not the other way around. Not all of them, however. For example the now-failing sine wave fit has exact data evaluated from the same function, but the algorithm runs away towards infinite frequency. I'm not saying it's a bad fit – you can fit any data with a sine wave of infinite amplitude and frequency – but certainly it's not a useful one.

@jacobq
Copy link
Contributor

jacobq commented Oct 17, 2018

Wow, I didn't realize that the damping parameter wasn't being adjusted during the algorithm. That explains a lot! I would suggest that in order to avoid breaking the API (i.e. the algorithm may no longer converge given the same inputs) you also add an option to enable this (default off). For example,

const options = {
  // ...
  damping: 0.1,
  adjustDamping: true  // new option to enable new feature
};

@cshaa
Copy link
Author

cshaa commented Oct 17, 2018

@jacobq I'm not sure about that… Self-adjusting lambda parameter is the most distinctive feature of LM. I can imagine that many people who download the package will have no clue that this isn't enabled by default. After all, neither of the two of us knew that.

One possibility would be to deprecate the damping parameter (remove it from the README, document it as deprecated in jsDoc) and treat all calls with damping !== undefined as fixed-lambda. My thinking was that people who really cared about the exact result optimized the damping parameter to fit their needs, while those who only needed a result didn't care about damping and left it unset.

It's true, however, that this is still a breaking change in some cases. (If you're worried about the risk of falling into a different local minimum than before.) Another solution would be to mark the new version as 2.0 which doesn't need to be backwards-compatible according to semver.


If anything in my comment didn't seem to make sense to you, it's probably because it really doesn't. I'm really tired right now. 😀😴

@jacobq
Copy link
Contributor

jacobq commented Oct 17, 2018

Sure, I'm just saying that rather than (1) make a breaking change to 1.x or (2) bumping to 2.x, just add support to 1.x in a non-breaking way and document it, even if that means the API becomes a little awkward. Doubtless there will be other feature requests and more feedback collected as time goes on. Once it's clear what breaking changes are desirable, they can all be wrapped together into just one major version bump. This is nice for people upgrading as they don't need to change their code as often. In the same way, it's nice for the test suite as existing tests don't need to be revised as often.

@targos
Copy link
Member

targos commented Oct 18, 2018

I propose that we do as many breaking changes as we need to make a good version 2.0.
To gather some feedback, we can release pre-versions of the package based off master.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants