Seeding in the AgentManager and additive fits #157

Closed
mmcenta opened this issue Mar 22, 2022 · 2 comments
mmcenta commented Mar 22, 2022

Hello,

I've been trying to make the StableBaselinesAgent (PR #148) compatible with additive fits, but I ran into some issues:

  1. In the _fit_worker auxiliary function, we reseed external libraries. I believe this is done to guarantee reproducibility when doing distributed training. However, calling .fit(X) twice will not give the same result as a single call to .fit(2X), because the seed is reset halfway through training (see the sketch after this list). Here is the code.
  2. In the load method of AgentHandlers, we reseed the environment after loading the agent, which causes similar issues. I also noticed that the handler's seed is used to reseed the environment, which is different from the seed that was originally used. Here is the code.
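
To make point 1 concrete, here is a minimal illustration in plain NumPy (not rlberry code) of why resetting the seed between two fits breaks additivity: after the reseed, the second fit replays the same random stream instead of continuing it.

```python
# Toy illustration (plain NumPy, not rlberry code): reseeding between two
# fits of X steps does not reproduce a single fit of 2X steps.
import numpy as np

def fit(rng, n_steps):
    # stand-in for a training loop that consumes the random stream
    return [rng.random() for _ in range(n_steps)]

seed = 0

# single fit of 2X steps
rng = np.random.default_rng(seed)
single = fit(rng, 4)

# two fits of X steps with a reseed in between
rng = np.random.default_rng(seed)
first = fit(rng, 2)
rng = np.random.default_rng(seed)   # the reseed resets the stream
second = fit(rng, 2)

assert second == first              # the second fit just repeats the first
assert first + second != single     # so two short fits differ from one long fit
```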

I would love to know your opinions on the matter!

@mmcenta mmcenta self-assigned this Mar 22, 2022
@mmcenta mmcenta added the bug and help wanted labels Mar 22, 2022
TimotheeMathieu commented Mar 23, 2022

Maybe we could modify the rlberry Seeder so that it accepts a pytorch generator as a seed_seq.
I looked into torch's RNGs and they really don't seem compatible with anything but themselves (they can't import a numpy RNG, for instance), so I don't think it is easy to reseed the torch generator in the manager; it would be better to wrap the torch generator as an rlberry Seeder.
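
For comparison, going the other way (seeding torch from a numpy SeedSequence) is straightforward; a minimal sketch with a hypothetical helper is below. It is the reverse direction, feeding a torch generator into numpy's seeding machinery, that doesn't seem supported, which is why wrapping the torch generator as a Seeder looks like the cleaner option.

```python
# Sketch (hypothetical helper, not rlberry's or torch's API): deriving a
# torch.Generator from a numpy SeedSequence.
import numpy as np
import torch

def torch_generator_from_seed_seq(seed_seq: np.random.SeedSequence) -> torch.Generator:
    # draw two 32-bit words from the SeedSequence and combine them into a 64-bit seed
    words = seed_seq.generate_state(2, dtype=np.uint32)
    seed = (int(words[0]) << 32) | int(words[1])
    gen = torch.Generator()
    gen.manual_seed(seed)
    return gen

child = np.random.SeedSequence(42).spawn(1)[0]
gen = torch_generator_from_seed_seq(child)
sample = torch.randn(3, generator=gen)  # reproducible given the same SeedSequence
```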

@omardrwch
Member

Regarding @mmcenta's point:

calling .fit(X) twice will not give the same result as a single call to .fit(2X), because the seed is reset halfway through training.

I think it's ok to have this behavior in AgentManager only, as long as the whole pipeline (parameters -> manager -> outputs) is reproducible. I believe it's important to enforce the additive property of fit() only at the Agent level, to make sure that the optimization done by AgentManager.optimize_hyperparams makes sense when fit_fraction < 1 (that is, when fit() is called several times to evaluate hyperparameters).
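
To illustrate what the additive property at the Agent level buys us, here is a toy sketch (hypothetical agent, not rlberry code): the RNG is seeded once in __init__ and never reset inside fit(), so splitting the budget across two fit() calls, as a fit_fraction < 1 evaluation would do, matches a single full-budget fit.

```python
# Toy sketch (hypothetical agent, not rlberry code) of an additive fit().
import numpy as np

class ToyAgent:
    def __init__(self, seed=0):
        self.rng = np.random.default_rng(seed)  # seeded once, never reset in fit()
        self.value = 0.0

    def fit(self, budget):
        for _ in range(budget):
            self.value += self.rng.normal()     # stand-in for a training update

    def eval(self):
        return self.value

# Evaluate a configuration in two stages (fit_fraction = 0.5):
agent = ToyAgent(seed=123)
agent.fit(50)                  # first half of the budget
intermediate = agent.eval()    # used to compare configurations early
agent.fit(50)                  # second half continues the same run

# Because fit() is additive, this matches a single fit(100) with the same seed.
reference = ToyAgent(seed=123)
reference.fit(100)
assert np.isclose(agent.eval(), reference.eval())
```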
