
multi-objective optimization (e.g. qNEHVI) vs. scalarized objectives. Which to choose? #1210

Closed
sgbaird opened this issue Oct 15, 2022 · 6 comments

@sgbaird
Contributor

sgbaird commented Oct 15, 2022

I put together a tutorial illustrating the use of Ax's multi-objective optimization functionality and comparing this against scalarization approaches. When using a scalarized quantity to compare performance, it makes sense that the scalarized objectives do better than MOO. However, when looking at Pareto fronts and comparing them against a naive scalarization approach (sum the two objectives), I was surprised to see that, in general, the naive scalarization Pareto fronts seem better. This was on a straightforward, 3-parameter task with a single local maximum AFAIK. The task is meant as a teaching demo (see e.g. notebook tutorials). In particular, the notebook is 6.1-multi-objective.ipynb, linked above.

I noticed that I regularly got the following warning during MOO:

c:\Users\<USERNAME>\Miniconda3\envs\sdl-demo\lib\site-packages\ax\modelbridge\transforms\winsorize.py:240: UserWarning:

Automatic winsorization isn't supported for an objective in `MultiObjective` without objective thresholds. Specify the winsorization settings manually if you want to winsorize metric frechet.

c:\Users\<USERNAME>\Miniconda3\envs\sdl-demo\lib\site-packages\ax\modelbridge\transforms\winsorize.py:240: UserWarning:

Automatic winsorization isn't supported for an objective in `MultiObjective` without objective thresholds. Specify the winsorization settings manually if you want to winsorize metric luminous_intensity.

Out of the sklearn preprocessing scalers, winsorization seems most comparable to RobustScaler (interestingly, it was the third hit when searching for "winsorization sklearn"). There's also a winsorization function in SciPy (scipy.stats.mstats.winsorize). This is my attempt to frame it in terms of things I'm somewhat familiar with.
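To make the analogy concrete, here is a minimal sketch (an editor's illustration, not from the notebook) contrasting SciPy's winsorize with sklearn's RobustScaler on the same one-dimensional data: winsorization clips the tails, while RobustScaler keeps the outliers but centers and scales by the median and IQR.

    import numpy as np
    from scipy.stats.mstats import winsorize
    from sklearn.preprocessing import RobustScaler

    rng = np.random.default_rng(0)
    y = np.concatenate([rng.normal(0, 1, 100), [50.0, -40.0]])  # bulk data plus two outliers

    # Winsorization: clip the extreme 5% in each tail to the 5th/95th percentile values.
    y_winsorized = winsorize(y, limits=(0.05, 0.05))

    # RobustScaler: outliers are kept, but centering/scaling uses the median and IQR,
    # so the extremes barely influence how the bulk of the data is transformed.
    y_robust = RobustScaler().fit_transform(y.reshape(-1, 1)).ravel()

    print(y.max(), y_winsorized.max(), y_robust.max())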

  • Maybe I chose a poorly suited task to use for making this comparison.
  • Does anything seem amiss in the implementation?
  • Is part of the issue perhaps that I'm not specifying thresholds? (See the threshold sketch after this list.)
  • Open to any thoughts/feedback
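As a rough sketch of the thresholds question, objective thresholds could be specified up front with Ax's Service API, which should also address the automatic-winsorization warning above. The parameter ranges and threshold values below are placeholders, not the notebook's actual settings:

    from ax.service.ax_client import AxClient
    from ax.service.utils.instantiation import ObjectiveProperties

    ax_client = AxClient()
    ax_client.create_experiment(
        name="sdl_demo_moo",  # hypothetical name
        parameters=[
            {"name": f"x{i}", "type": "range", "bounds": [0.0, 1.0]} for i in range(3)
        ],
        objectives={
            # explicit minimize + threshold for each objective
            "frechet": ObjectiveProperties(minimize=True, threshold=10.0),
            "luminous_intensity": ObjectiveProperties(minimize=True, threshold=100.0),
        },
    )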

@sdaulton
Contributor

Thanks for documenting this. It looks like you have a cool use case! I took a look at your notebook. Just to confirm my understanding:

For multi-objective optimization (qNEHVI):

  • you have 8 objectives (delta_*)
  • frechet and luminosity are tracking metrics (not objectives that are optimized)

For optimizing a scalarized objective:

  • you are optimizing a scalarized objective of frechet+luminosity

If this is the case, it is not surprising that optimizing a scalarized objective of frechet+luminosity works well since you are looking at the Pareto frontier for those metrics (separately) and those metrics are not targeted by qNEHVI (based on the configured experiment). Furthermore, optimizing (learning the pareto frontier across) 8 objectives simultaneously is difficult. Why not optimize frechet and luminosity with qNEHVI rather than (delta_*) if you care about frechet and luminosity? (Note: I don't know what these metrics are)

A couple other notes:

  • is 5000 a good choice of threshold for the delta_* metrics?
  • Are your simulations noisy? If not, get_observed_pareto_frontiers would be an easy way of evaluating the Pareto frontier across the evaluated (in-sample) designs (a sketch follows below).
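A minimal sketch of that suggestion, assuming an already-run Ax `experiment` with noiseless evaluations:

    from ax.plot.pareto_frontier import plot_pareto_frontier
    from ax.plot.pareto_utils import get_observed_pareto_frontiers
    from ax.utils.notebook.plotting import render

    # Pareto frontiers over the in-sample (observed) designs; returns a list,
    # one frontier per pair of objectives configured on the experiment.
    frontiers = get_observed_pareto_frontiers(experiment=experiment)
    render(plot_pareto_frontier(frontiers[0]))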

@sgbaird
Contributor Author

sgbaird commented Oct 16, 2022

@sdaulton thanks for your response!

For multi-objective optimization (qNEHVI):

  • you have 8 objectives (delta_*)
  • frechet and luminosity are tracking metrics (not objectives that are optimized)

I had two sets of simulations in the order that I was exploring things, which probably made it confusing. In the first set of simulations, I had 8 objectives (delta_*), but in the second set of simulations, I defined Frechet distance of the currently observed discrete spectrum relative to the target spectrum as the first objective, and an approximate luminosity (i.e. the radiated power of the LEDs) as the second objective. I compared MOO with frechet and luminosity (set to minimize and no thresholds) to single objective optimization with scalarized_objective = frechet + luminosity.
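For reference, a hedged sketch of the scalarization described above, where the evaluation function returns a single summed metric so that Ax sees one objective (the helper functions and metric names here are placeholders, not the notebook's actual identifiers):

    def evaluate_scalarized(parameters: dict) -> dict:
        frechet = compute_frechet(parameters)                 # hypothetical helper
        luminous_intensity = compute_luminosity(parameters)   # hypothetical helper
        # Naive scalarization: unweighted sum of the two objectives (both minimized).
        return {"scalarized_objective": frechet + luminous_intensity}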

If this is the case, it is not surprising that optimizing a scalarized objective of frechet+luminosity works well since you are looking at the Pareto frontier for those metrics (separately) and those metrics are not targeted by qNEHVI (based on the configured experiment). Furthermore, optimizing (learning the pareto frontier across) 8 objectives simultaneously is difficult. Why not optimize frechet and luminosity with qNEHVI rather than (delta_*) if you care about frechet and luminosity? (Note: I don't know what these metrics are)

AFAIK, qNEHVI was operating directly on frechet and luminosity (two-objective optimization). I was surprised to see that the Pareto fronts seemed better with the scalarized objective than for qNEHVI with frechet and luminosity.

A couple other notes:

  • is 5000 a good choice of threshold for the delta_* metrics?

Basically it's something "low", where the max might be 50k or so.

  • Are your simulations noisy? If not, get_observed_pareto_frontiers would an easy way of evaluating the pareto frontier across the evaluated (in-sample) designs

The simulations aren't noisy. Thank you! I was wondering about that. I'll plan on running it again with get_observed_pareto_frontiers.

@sgbaird
Contributor Author

sgbaird commented Oct 16, 2022

@sdaulton I'm noticing that get_observed_pareto_frontiers takes different arguments than compute_posterior_pareto_frontier. In particular, the following are not present in the former:

    primary_objective=objectives[0].metric,
    secondary_objective=objectives[1].metric,
    absolute_metrics=[objectives[0].metric_names[0], objectives[1].metric_names[0]],

Should I refactor my hacky scalarized objective (where I sum the two objectives in the evaluate function) and use a proper ax.core.objective.ScalarizedObjective instead? #883

Right now, the scalarized kwargs to compute_posterior_pareto_frontier are:

    primary_objective=experiment.tracking_metrics[0],
    secondary_objective=experiment.tracking_metrics[1],
    absolute_metrics=[
        experiment.tracking_metrics[0].name,
        experiment.tracking_metrics[1].name,
        scalar_name,
    ],

Or is there some other workaround you'd suggest for comparing the two Pareto frontiers?
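For context, a hedged sketch of how those kwargs might assemble into a full call (the notebook's exact invocation may differ; `experiment` and `scalar_name` are assumed to exist as in the notebook, and `num_points` is a placeholder):

    from ax.plot.pareto_utils import compute_posterior_pareto_frontier

    frontier = compute_posterior_pareto_frontier(
        experiment=experiment,
        data=experiment.fetch_data(),
        primary_objective=experiment.tracking_metrics[0],
        secondary_objective=experiment.tracking_metrics[1],
        absolute_metrics=[
            experiment.tracking_metrics[0].name,
            experiment.tracking_metrics[1].name,
            scalar_name,
        ],
        num_points=20,
    )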

@sdaulton
Contributor

Ah, thanks for clarifying your setup. For the second MOO experiment on frechet and luminosity, what are the inferred objective thresholds? Also, it looks like those plots are gone from your notebook.

Should I refactor my hacky scalarized objective (where I sum the two objectives in the evaluate function) and use a proper ax.core.objective.ScalarizedObjective instead?

ScalarizedObjective will model the outcomes independently, whereas if you scalarize the metrics yourself and provide a single scalar metric to Ax, only the scalarized metric will be modeled. If the objectives are quite correlated, then modeling the scalarized metric will likely give better results.
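A minimal sketch of the ScalarizedObjective route (metric names and weights are placeholders): each metric is modeled separately, and the weighted sum is what gets optimized.

    from ax.core.metric import Metric
    from ax.core.objective import ScalarizedObjective
    from ax.core.optimization_config import OptimizationConfig

    scalarized = ScalarizedObjective(
        metrics=[Metric(name="frechet"), Metric(name="luminous_intensity")],
        weights=[1.0, 1.0],   # naive equal weighting, as in the summed-metric approach
        minimize=True,
    )
    optimization_config = OptimizationConfig(objective=scalarized)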

For plotting the observed metrics (including tracking metrics) for the evaluated designs (as in get_observed_pareto_frontiers), it might be easier to follow this example: https://ax.dev/tutorials/multiobjective_optimization.html#Plot-empirical-data.

This style of plot is also nice because it shows the observations collected over time, which might provide more insight into the behavior of the method during data collection.
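A rough sketch in the spirit of that tutorial section (the tutorial's exact code may differ): pull the observed values into a DataFrame and scatter the two metrics, colored by trial index to show how the collected observations evolve over time. `experiment` and the metric names are assumed from the notebook.

    import matplotlib.pyplot as plt

    df = experiment.fetch_data().df  # long format: one row per (arm, metric)
    wide = df.pivot(index="arm_name", columns="metric_name", values="mean")
    trial_index = df.groupby("arm_name")["trial_index"].first().reindex(wide.index)

    sc = plt.scatter(wide["frechet"], wide["luminous_intensity"], c=trial_index, cmap="viridis")
    plt.colorbar(sc, label="trial index")
    plt.xlabel("frechet")
    plt.ylabel("luminous_intensity")
    plt.show()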

@lena-kashtelyan
Contributor

@sgbaird, did you get full answers to your questions, or are there unresolved follow-ups?

@sgbaird
Contributor Author

sgbaird commented Oct 31, 2022

@sdaulton my bad, I thought I had responded to this already. I will need to go back and check what the inferred thresholds were. Thanks for the detailed response!

I plan to follow the example you linked and post the updated results here.

@lena-kashtelyan I think it's resolved to a good enough point. Will close for now! Thanks for checking in.
