Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add optional normalization (cont.) #98

Merged
merged 7 commits into from
Nov 20, 2022
Merged

Conversation

maxreciprocate
Copy link
Collaborator

@maxreciprocate maxreciprocate commented Nov 17, 2022

  • change scaling reward to have three options: ref/running/none
  • add single thread test of running mean calculations

https://wandb.ai/sorry/public/runs/2x8totg2

@Dahoas
Copy link
Collaborator

Dahoas commented Nov 18, 2022

I just noticed all the unittests are no longer being run upon pr submission. Do we know why?

@Dahoas
Copy link
Collaborator

Dahoas commented Nov 18, 2022

Things look good. Can you just provide a way for users to optionally specify a ref_mean and ref_var in the ppo config? You can make them Optional[float] parameters defaulting to false in the PPOMethodConfig

@Dahoas
Copy link
Collaborator

Dahoas commented Nov 18, 2022

Also louis will want you to address all his comments from before

@maxreciprocate
Copy link
Collaborator Author

maxreciprocate commented Nov 18, 2022

I just noticed all the unittests are no longer being run upon pr submission. Do we know why?

#92 was the last pr on which unittests ran, I can fix it in this pr
it's about master->main change

branches: [ master ]

Also louis will want you to address all his comments from before

I think I have haven't I, i.e. running var calculation and storing ref_*

@maxreciprocate maxreciprocate mentioned this pull request Nov 18, 2022
@Dahoas
Copy link
Collaborator

Dahoas commented Nov 18, 2022

@LouisCastricato Are you happy with the comments?

Once the ppoconfig is modified to accept pre-computed ref_mean, ref_var I'm good to merge

@LouisCastricato
Copy link
Contributor

I think this is ok to merge

@LouisCastricato LouisCastricato merged commit 3db86ca into main Nov 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants