buffer length for history #6

Open
rose-jinyang opened this issue Nov 24, 2021 · 1 comment
rose-jinyang commented Nov 24, 2021

Hello,
Thanks for contributing this project.
I made my own AutoClipper class based on your code.

[image]

Please check whether there is any problem with it.
I have a question about the buffer length for the gradient history.
You only discussed the effect of the percentile value on training performance.
What about the effect of the history buffer length?
For example, what if we set the buffer length to the number of steps in one epoch?

Owner

pseeth commented Jan 4, 2022

Hi there! I think those are all fine ideas; I just didn't have time to explore all the variations when I was working on this. FWIW, I still use AutoClip every day in my own work. I think it's totally reasonable to only keep track of the past so-and-so iterations, but just be careful you're not clipping the gradient too much. Maybe 80 would be a reasonable default for p if you were keeping track of things in shorter histories.
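For anyone who wants to try the bounded-history idea discussed above, here is a minimal sketch of what it could look like. It is not the code from this repository: the class name `BoundedAutoClipper`, the `history_len` parameter, and the `percentile` helper are all hypothetical names chosen for illustration, and it uses only the standard library so the logic can be checked without a training framework.

```python
from collections import deque


def percentile(values, p):
    """Linear-interpolation percentile of a non-empty sequence, p in [0, 100]."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100.0
    lo = int(rank)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


class BoundedAutoClipper:
    """Hypothetical AutoClip variant that keeps only the last `history_len` norms."""

    def __init__(self, clip_percentile, history_len=None):
        self.clip_percentile = clip_percentile
        # maxlen=None keeps the full history; otherwise the oldest norms are dropped.
        self.history = deque(maxlen=history_len)

    def clip_coef(self, grad_norm):
        """Record grad_norm and return (threshold, factor to scale gradients by)."""
        self.history.append(float(grad_norm))
        threshold = percentile(self.history, self.clip_percentile)
        # Scale down only when the current norm exceeds the threshold.
        coef = min(1.0, threshold / (grad_norm + 1e-6))
        return threshold, coef


# Assumed usage: a short window (3 steps here; steps-per-epoch would be another choice).
clipper = BoundedAutoClipper(clip_percentile=80, history_len=3)
for norm in [1.0, 2.0, 3.0, 100.0]:
    threshold, coef = clipper.clip_coef(norm)
```

In a PyTorch training loop, the returned threshold could be passed as `max_norm` to `torch.nn.utils.clip_grad_norm_` after computing the total gradient norm. With a short window the threshold reacts quickly to recent norms, which is why a higher percentile may be needed to avoid clipping too aggressively.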
