-
Thanks for your interest in BADS.
There are multiple reasons why BADS is (generally speaking) fairly robust to noise. Having a fallback strategy for when Bayesian optimization fails is definitely one of the main elements; on top of that, there is a number of additional heuristics.
Do you mean "if other algorithms are included in the poll step", replacing the mesh-based search? It strongly depends on the algorithm. Clearly, changing the base algorithm would put BADS outside of the MADS framework [1], so it would automatically lose a bunch of (perhaps not so important in practice) theoretical guarantees. But that's kind of irrelevant, since BADS is already outside the MADS framework for noisy targets. So yes, it is possible to do. In this case, you would definitely want to use a robust base algorithm (e.g., L-BFGS-B is very powerful but extremely brittle; differential evolution might be a better choice, but then it's a population algorithm).

If you meant keeping the POLL the same but changing the SEARCH: well, you can do pretty much whatever you want with the search. Of course, if you remove Bayesian optimization, you would get something very different. There is a large literature on what you could do in the SEARCH step, starting from the original MADS framework [1].

See also the PyBADS paper [2], maybe for a more gentle discussion (but probably not much more information than what you already have).

[1] Audet, C., & Dennis Jr, J. E. (2006). Mesh adaptive direct search algorithms for constrained optimization. SIAM Journal on Optimization, 17(1), 188-217. https://epubs.siam.org/doi/abs/10.1137/040603371
[2] Singh, G. S., & Acerbi, L. (2024). PyBADS: Fast and robust black-box optimization in Python. Journal of Open Source Software, 9(94), 5694. https://ui.adsabs.harvard.edu/abs/2024JOSS....9.5694S/abstract
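To make the POLL step concrete, here is a minimal sketch of a poll-only pattern search in Python on a noisy quadratic. This is a toy illustration under simplifying assumptions (a plain coordinate poll, fixed expand/shrink factors, an invented test function), not the actual BADS/PyBADS implementation: BADS layers a GP-based SEARCH step and many noise-handling heuristics on top of a MADS-style poll.

```python
import numpy as np

def poll_step(f, x, mesh_size):
    """Evaluate f at the 2*d coordinate poll points around x.

    Returns the best point found (including x itself, re-evaluated)
    and its noisy function value.
    """
    best_x, best_f = x, f(x)
    for i in range(len(x)):
        for sign in (+1.0, -1.0):
            candidate = x.copy()
            candidate[i] += sign * mesh_size
            fc = f(candidate)
            if fc < best_f:
                best_x, best_f = candidate, fc
    return best_x, best_f

def pattern_search(f, x0, mesh_size=1.0, tol=1e-6, max_iter=200):
    """Poll-only direct search: expand the mesh on a successful poll,
    shrink it on an unsuccessful one (the MADS-style mesh update)."""
    x = np.asarray(x0, dtype=float)
    fx = f(x)
    for _ in range(max_iter):
        if mesh_size < tol:
            break
        x_new, f_new = poll_step(f, x, mesh_size)
        if f_new < fx:          # successful poll: accept and expand
            x, fx = x_new, f_new
            mesh_size *= 2.0
        else:                   # unsuccessful poll: refine the mesh
            mesh_size *= 0.5
    return x, fx

rng = np.random.default_rng(0)

def noisy_sphere(x):
    # Hypothetical noisy target: smooth quadratic plus small noise.
    return float(np.sum((x - 1.0) ** 2) + 0.001 * rng.standard_normal())

x_best, f_best = pattern_search(noisy_sphere, np.array([3.0, -2.0]))
print(x_best)  # ends up close to the true optimum [1, 1] despite the noise
```

Note how even this bare poll step limits the damage noise can do: a spuriously "good" noisy evaluation can at worst move the incumbent one mesh step, and repeated unsuccessful polls keep shrinking the mesh, so the search stays localized rather than diverging the way a gradient-based method can on a noisy target.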
-
Dear BADS developers,
thanks for making this wonderful tool. As a psychology student, I have very little knowledge about optimization, but I cannot help being curious about what distinguishes BADS from other parameter-optimization functions commonly used in computational modeling of behavioral data, and why BADS does better on noisy, unstable loss functions. I have had a rough look at the BADS paper. It seems that, compared to standard Bayesian optimization, BADS has an extra grid-search poll step in case Bayesian optimization does not work. Is this poll step the key to BADS handling noisy loss functions well? If other commonly used optimization algorithms (like L-BFGS-B, simplex, or differential evolution) included the poll step, could they match BADS's performance on noisy loss functions? Thanks a lot.