Modify Trial when using a Sweeper #2293

Open
michelkok opened this issue Jul 14, 2022 · 7 comments
@michelkok

michelkok commented Jul 14, 2022

🚀 Feature Request

A way to modify a Trial object while using a sweeper.

Motivation

When using a sweeper, the Trial object is not accessible anywhere from Python code.

A potential use case is setting User Attributes before or after the Sweeper launches the Trial. For example, when doing hyperparameter optimization I use the Optuna sweeper plugin. The Study gives me back the best Trial with the optimal configuration. To find my model (which is stored in a separate database), I do not want to retrain it with the optimal setup. I'd rather just extract the reference ID of this model. User Attributes seem like a logical place to store such information. However, when using the sweeper there is no way to set them, as the Trial object is not accessible.

I have tried asking on StackOverflow but no answer was given. Then I looked through the codebase and it just seems impossible at the moment (unless I am missing something).

Pitch

Describe the solution you'd like
Ideally, I'd like the Trial object to be available within the main function. Something along these lines:

@hydra.main(version_base=None, config_path="conf", config_name="sweep")
def sphere(cfg: TrainConfig, trial: Trial) -> float:
    model, loss = train_model(cfg)
    model_db_id = save_model(model)
    trial.set_user_attr("model_db_id", model_db_id)

    return loss, trial

Describe alternatives you've considered
An alternative could be to provide some callback function that can be used to modify the Trial object like custom_search_space. I like this less though, as there is no control over when to modify the Trial (e.g., the function below is called before actually computing anything - in my example, I'd like to use it after training my model).

I have even tried to implement this for my use case but this is really just a hack and not a solution as the function is clearly not intended to do the stuff I'm doing here.

hydra:
  sweeper:
    sampler:
      seed: 123
    direction: minimize
    study_name: train_model
    storage: sqlite:///trials.db
    n_trials: 20
    n_jobs: 1
    params:
      x: range(-5.5, 5.5, step=0.5)
      y: choice(-5, 0, 5)
      z: choice([0, 1], [2, 3], [2, 5])
    custom_search_space: package.run.configure

def configure(_, trial: Trial) -> None:
    trial.set_user_attr("model_db_id", 123456)

Are you willing to open a pull request? (See CONTRIBUTING)
Yes, but I am not familiar with most of the codebase. But for relatively simple changes, sure!

@michelkok michelkok added the enhancement Enhancement request label Jul 14, 2022
@Jasha10
Collaborator

Jasha10 commented Jul 14, 2022

For reference, there is some related discussion in #1710.

@Jasha10
Collaborator

Jasha10 commented Oct 24, 2022

there is no control over when to modify the Trial (e.g., the function below is called before actually computing anything - in my example, I'd like to use it after training my model)

@mkardas, what is the use-case for modifying the trial object after training the model?

@mkardas

mkardas commented Oct 25, 2022

@mkardas, what is the use-case for modifying the trial object after training the model?

@Jasha10 did you mean @michelkok?

@Jasha10
Collaborator

Jasha10 commented Oct 25, 2022

Yes, I did mean @michelkok! Sorry about the ping Marcin!

@michelkok
Author

For example, to attach User Attributes.

@omry
Collaborator

omry commented Oct 26, 2022

Since Hydra is abstracting the underlying sweeper, it does not seem likely that this can be supported.
There are a few issues:

  1. The abstraction is getting in the way. We can't easily expose objects from the underlying sweeper.
  2. Launchers mean that the actual job process may be running somewhere else. There isn't a full-fledged communication channel between the launching/sweeping process and the child processes, which means you may not have access, in the sweeping process, to the information you intend to attach.

@michelkok
Author

Sorry, I'm only coming back to this now.
I understand that my proposed solution does not really fit with the layered design of Hydra.

Is there any other way to achieve what I want?

Again, the main thing is that I'm struggling to recover the model + parameters of the best run after a sweep.
The current workflow is like this:

  1. Run a sweep (with the Optuna Sweeper plugin).
  2. Load the sth.db study and find the best trial.
  3. Find parameters, model weights and metrics of the trial.

Going from step 2 to 3 seems really hard since the trial object itself contains only a bit of information and it cannot be modified.

Another approach I have tried, for example, is using the experimental Callback. But with that I have to maintain a state with similar properties to the sth.db, which seems redundant (e.g., the best run and its configured output directory).
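
One possible workaround that stays outside the sweeper abstraction is to have each job write a small record file into its own output directory, and to match that record against the best trial's parameters after the sweep. The following is only a sketch under assumptions: `write_run_record`, `find_run_record` and the file name `run_record.json` are hypothetical helpers, not part of Hydra or Optuna, and it assumes the swept parameters are JSON-serializable and unique per trial.

```python
import json
from pathlib import Path
from typing import Any, Dict, Optional

# Hypothetical file name; this is not a Hydra or Optuna convention.
RECORD_NAME = "run_record.json"


def write_run_record(
    out_dir: Path, params: Dict[str, Any], model_db_id: int, loss: float
) -> Path:
    """Write one record per job; call it after the model has been saved."""
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / RECORD_NAME
    record = {"params": params, "model_db_id": model_db_id, "loss": loss}
    path.write_text(json.dumps(record))
    return path


def find_run_record(
    sweep_dir: Path, best_params: Dict[str, Any]
) -> Optional[Dict[str, Any]]:
    """Return the first record under sweep_dir whose params equal best_params."""
    for path in sorted(sweep_dir.rglob(RECORD_NAME)):
        record = json.loads(path.read_text())
        if record["params"] == best_params:
            return record
    return None
```

With this in place, the task function would call `write_run_record` with the job's output directory and swept values, and after the sweep `find_run_record(sweep_dir, study.best_trial.params)` would return the matching `model_db_id`, assuming the parameter names stored in the study match the keys written by the job. Since it relies only on the file system, it does not need a communication channel between the sweeping process and the job processes, as long as both can see the same directory.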
