Seeding in the AgentManager and additive fits #157

mmcenta · 2022-03-22T16:25:57Z

Hello,

I've been trying to make the StableBaselinesAgent (PR #148) compatible with additive fits, but I ran into two issues:

1. In the _fit_worker auxiliary function, we reseed external libraries. I believe this is done to guarantee reproducibility when doing distributed training. However, calling .fit(X) twice will not give the same result as a single .fit(2X), because the seed is reset halfway through training. Here is the code.
2. In the load method of AgentHandlers, we reseed the environment after loading the agent, which causes similar issues. I also noticed that the environment is reseeded with the handler's seed, which is different from the seed that was originally used. Here is the code.
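
The first issue can be reproduced outside rlberry with a small sketch. Everything here is a stand-in for illustration: the `fit` function, the `SEED` value, and the budgets are hypothetical, and the standard-library `random.Random` plays the role of whatever generator the real agent uses. It only aims to show why reseeding before every fit breaks the additive property.

```python
import random


def fit(rng, budget, history):
    """Toy 'training': append `budget` random draws to `history`."""
    for _ in range(budget):
        history.append(rng.random())
    return history


SEED = 42  # arbitrary seed chosen for this illustration

# A single fit with budget 2X.
single = fit(random.Random(SEED), 10, [])

# Two fits with budget X that share one generator: the stream continues.
rng = random.Random(SEED)
additive = fit(rng, 5, [])
additive = fit(rng, 5, additive)

# Two fits with budget X, reseeding before each one (as a worker that
# reseeds on every call would): the second half replays the first half.
reseeded = fit(random.Random(SEED), 5, [])
reseeded = fit(random.Random(SEED), 5, reseeded)

fits_are_additive = additive == single
reseeding_breaks_additivity = reseeded != single
```

In this sketch `fits_are_additive` holds only because the generator state survives between calls; once the generator is recreated from the same seed, the second fit repeats the random stream of the first.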

I would love to know your opinions on the matter!


TimotheeMathieu · 2022-03-23T15:09:19Z

Maybe we could modify the rlberry Seeder so that it accepts a PyTorch generator as a seed_seq.
I looked into torch's RNGs, and they don't seem to be compatible with anything but themselves (they can't import a numpy RNG, for instance). So I don't think it is easy to reseed a torch generator in the manager; it would be better to import the torch generator as an rlberry Seeder.

omardrwch · 2022-04-19T19:46:31Z

Regarding @mmcenta's point:

"[...] .fit(X), the result will not be the same as doing a single .fit(2X) because the seed will be reset halfway throughout training."

I think it's ok to have this behavior in AgentManager only, as long as the whole pipeline (parameters -> manager -> outputs) is reproducible. I believe it's important to enforce the additive property of fit() only at the Agent level, to make sure that the optimization done by AgentManager.optimize_hyperparams makes sense when fit_fraction < 1 (that is, when fit() is called several times to evaluate hyperparameters).
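
As a rough illustration of this distinction, the sketch below separates the two levels. `ToyAgent` and `run_manager` are hypothetical names invented for the example and do not reflect rlberry's actual classes or seeding logic; the per-fit seed `seed + i` is also just an assumption to make the reseeding deterministic.

```python
import random


class ToyAgent:
    """Hypothetical agent whose generator persists across fit() calls."""

    def __init__(self, seed):
        self.rng = random.Random(seed)
        self.total = 0.0

    def fit(self, budget):
        # Accumulate random draws; the state carries over to the next call.
        for _ in range(budget):
            self.total += self.rng.random()

    def reseed(self, seed):
        self.rng = random.Random(seed)


def run_manager(seed, budget, n_fits):
    """Hypothetical manager that reseeds the agent before every fit."""
    agent = ToyAgent(seed)
    for i in range(n_fits):
        agent.reseed(seed + i)  # deterministic, but restarts the stream
        agent.fit(budget)
    return agent.total


# Agent level: two fits of 5 steps match one fit of 10 steps.
a, b = ToyAgent(0), ToyAgent(0)
a.fit(5)
a.fit(5)
b.fit(10)
agent_is_additive = a.total == b.total

# Manager level: not additive, yet the whole pipeline is reproducible.
manager_is_reproducible = run_manager(0, 5, 2) == run_manager(0, 5, 2)
manager_is_additive = run_manager(0, 5, 2) == run_manager(0, 10, 1)
```

Under these assumptions the agent stays additive, which is what repeated fit() calls during hyperparameter evaluation rely on, while the manager gives up additivity but still maps the same inputs to the same outputs.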

mmcenta self-assigned this Mar 22, 2022

mmcenta added the bug and help wanted labels Mar 22, 2022

mmcenta mentioned this issue Mar 22, 2022

[MRG] StableBaselines Agent #148

Merged

TimotheeMathieu mentioned this issue Mar 24, 2022

[WIP] (feat) Seeding torch & rlberry #158

Open

KohlerHECTOR closed this as completed Jul 13, 2023
