'fc1' outgoing_weights are getting reset https://github.com/timoklein/redo/blob/main/src/redo.py#L120 #3
Comments
I'm not quite sure I understood you correctly, but let me try to explain:
The code on the new branch does that; please see Lines 109 to 137 in 6fbd0c3.
I have fixed everything except the moment resets on that branch; please refer to it.
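On the moment resets: here's a hedged sketch of what that fix could look like, assuming Adam is the optimizer (the layer sizes and the dormant mask below are made up for illustration). The idea is to zero the optimizer's first and second moment estimates for the recycled rows, so stale momentum doesn't immediately push the freshly reset weights back toward their old values:

```python
import torch
import torch.nn as nn

# Hypothetical small layer; in the repo this would be fc1 etc.
fc1 = nn.Linear(4, 3)
opt = torch.optim.Adam(fc1.parameters())

# One step so Adam's state (exp_avg, exp_avg_sq) exists.
fc1(torch.ones(2, 4)).sum().backward()
opt.step()

dormant = torch.tensor([False, True, False])  # pretend neuron 1 is dormant
state = opt.state[fc1.weight]
with torch.no_grad():
    # Zero the first and second moment estimates for the recycled rows.
    state["exp_avg"][dormant] = 0.0
    state["exp_avg_sq"][dormant] = 0.0
```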
I just rechecked. I believe the
No, this is actually correct. Good catch. I'll fix this and run some experiments to verify it. It must also be fixed for the outgoing weight bias resets here: Lines 163 to 172 in 7069e9f
It's quite interesting that the wrong resets are already improving performance substantially.
Closing this as it's implemented in #2.
fc1 has (3136, 512) params, and it seems like the current implementation always resets and sets 0 to the dead neuron for the outgoing layer (512). The implementation is supposed to reset the dead neurons of the incoming layer and set 0 the dead neurons of the outgoing layers.
https://github.com/timoklein/redo/blob/main/src/redo.py#L120
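For reference, here's a minimal sketch of the intended behavior (not the repo's actual code; the fc2 layer, sizes, and dormant mask below are made up): re-initialize a dormant neuron's incoming weights, i.e. its row of fc1.weight, and zero its outgoing weights, i.e. its column of the next layer's weight matrix:

```python
import math

import torch
import torch.nn as nn

# Hypothetical layers standing in for the network: fc1.weight has shape
# (512, 3136), so each of its 512 neurons owns one ROW of fc1.weight
# (incoming weights) and one COLUMN of fc2.weight (outgoing weights).
fc1 = nn.Linear(3136, 512)
fc2 = nn.Linear(512, 10)

dormant = torch.zeros(512, dtype=torch.bool)
dormant[7] = True  # pretend neuron 7 of fc1 is dormant

with torch.no_grad():
    # Reset the INCOMING weights: re-initialize the dormant rows of
    # fc1.weight (default nn.Linear init) and zero the matching bias.
    fresh = torch.empty_like(fc1.weight)
    nn.init.kaiming_uniform_(fresh, a=math.sqrt(5))
    fc1.weight[dormant] = fresh[dormant]
    fc1.bias[dormant] = 0.0
    # Zero the OUTGOING weights: the dormant COLUMNS of fc2.weight,
    # not its rows -- mixing these up resets the wrong parameters,
    # which is exactly the bug this issue describes.
    fc2.weight[:, dormant] = 0.0
```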