Modify Trial when using a Sweeper #2293

michelkok · 2022-07-14T11:15:29Z

🚀 Feature Request

A way to modify a Trial object while using a sweeper.

Motivation

When using a sweeper, the Trial object is not accessible from Python code anywhere.

A potential use case is setting User Attributes before or after the Sweeper launches the Trial. For example, when doing hyperparameter optimization I use the Optuna sweeper plugin. The Study gives me back the best Trial with the optimal configuration. My model is stored in a separate database, and I do not want to retrain it with the optimal setup just to find it again. I'd rather extract the reference ID of this model from the Trial. User Attributes seem like a logical place to store such information. However, with the sweeper there is no way to set them, because the Trial object is not accessible.

I have tried asking on StackOverflow but no answer was given. Then I looked through the codebase and it just seems impossible at the moment (unless I am missing something).

Pitch

Describe the solution you'd like
Ideally, I'd like the Trial object to be available within the main function. Something along these lines:

@hydra.main(version_base=None, config_path="conf", config_name="sweep")
def sphere(cfg: TrainConfig, trial: Trial) -> float:
    model, loss = train_model(cfg)
    model_db_id = save_model(model)
    trial.set_user_attr("model_db_id", model_db_id)

    return loss, trial

Describe alternatives you've considered
An alternative could be to provide some callback function that can be used to modify the Trial object like custom_search_space. I like this less though, as there is no control over when to modify the Trial (e.g., the function below is called before actually computing anything - in my example, I'd like to use it after training my model).

I have even tried to implement this for my use case but this is really just a hack and not a solution as the function is clearly not intended to do the stuff I'm doing here.

hydra:
  sweeper:
    sampler:
      seed: 123
    direction: minimize
    study_name: train_model
    storage: sqlite:///trials.db
    n_trials: 20
    n_jobs: 1
    params:
      x: range(-5.5, 5.5, step=0.5)
      y: choice(-5, 0, 5)
      z: choice([0, 1], [2, 3], [2, 5])
    custom_search_space: package.run.configure

def configure(_, trial: Trial) -> None:
    trial.set_user_attr("model_db_id", 123456)

Are you willing to open a pull request? (See CONTRIBUTING)
Yes, but I am not familiar with most of the codebase. But for relatively simple changes, sure!


Jasha10 · 2022-07-14T18:22:36Z

For reference, there is some related discussion in #1710.

Jasha10 · 2022-10-24T23:00:57Z

there is no control over when to modify the Trial (e.g., the function below is called before actually computing anything - in my example, I'd like to use it after training my model)

@mkardas, what is the use-case for modifying the trial object after training the model?

mkardas · 2022-10-25T10:15:17Z

@mkardas, what is the use-case for modifying the trial object after training the model?

@Jasha10 did you mean @michelkok?

Jasha10 · 2022-10-25T17:58:45Z

Yes, I did mean @michelkok! Sorry about the ping Marcin!

michelkok · 2022-10-26T07:14:36Z

For example, to attach User Attributes.

omry · 2022-10-26T10:13:18Z

Since Hydra abstracts the underlying sweeper, it does not seem likely that this can be supported.
There are a few issues:

1. The abstraction is getting in the way. We can't easily expose objects from the underlying sweeper.
2. Launchers mean that the actual job process may be running somewhere else. There isn't a full-fledged communication channel between the launching/sweeping process and the child processes, which means you may not have access to the information you intend to attach in the sweeping process.

michelkok · 2022-12-02T08:55:50Z

Sorry, I'm only coming back to this now.
I understand that my proposed solution does not really fit with the layered design of Hydra.

Is there any other way to achieve what I want?

Again, the main thing is that I'm struggling to locate the model + parameters of the best run after a sweep.
The current workflow is like this:

1. Run a sweep (with the Optuna Sweeper plugin).
2. Load the sth.db study and find the best trial.
3. Find parameters, model weights and metrics of the trial.

Going from step 2 to 3 seems really hard since the trial object itself contains only a bit of information and it cannot be modified.

Another approach I have tried, for example, is using the experimental Callback. But with that I have to maintain a state with properties similar to those of the sth.db, which seems redundant (e.g., the best run and its configured output directory).
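
One workaround that does not need the Trial object at all could look roughly like the sketch below. It is not an existing Hydra or Optuna feature. Each job writes a small JSON record (parameters, loss, model reference) into its own output directory, and a separate step scans the sweep directory for the record with the lowest loss. The file name `result.json` and the helpers `record_result` and `find_best` are hypothetical names chosen for illustration, and the sketch assumes the default multirun layout where each job gets its own subdirectory directly under the sweep directory.

```python
import json
from pathlib import Path
from typing import Any, Dict, Optional

# Hypothetical file name; this is not a Hydra or Optuna convention.
RESULT_FILE = "result.json"


def record_result(
    job_dir: Path, params: Dict[str, Any], loss: float, model_db_id: int
) -> None:
    """Write one job's parameters, loss, and model reference into its output directory."""
    job_dir.mkdir(parents=True, exist_ok=True)
    payload = {"params": params, "loss": loss, "model_db_id": model_db_id}
    (job_dir / RESULT_FILE).write_text(json.dumps(payload))


def find_best(sweep_dir: Path) -> Optional[Dict[str, Any]]:
    """Return the record with the lowest loss among the job directories, or None if there are none."""
    # Assumes one subdirectory per job directly under the sweep directory.
    records = [
        json.loads(path.read_text())
        for path in sweep_dir.glob(f"*/{RESULT_FILE}")
    ]
    return min(records, key=lambda record: record["loss"], default=None)
```

Inside the task function, `record_result` would be called after training, with the job's output directory as `job_dir` (in recent Hydra versions this should be available as `HydraConfig.get().runtime.output_dir`, but please verify that against the version in use). This duplicates some of what the study database already stores, so it is a stopgap rather than a replacement for user attributes on the Trial.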

michelkok added the enhancement Enhanvement request label Jul 14, 2022

Jasha10 added optuna sweep labels Jul 14, 2022

domvwt mentioned this issue Jul 18, 2023

[Optuna Sweeper Plugin]: Add user attributes to Optuna trial #2718

