In the RNN classification example, which uses the characters of a name to predict the name's language, the train function re-zeros the hidden state (and the gradients) every epoch. I was wondering why this is done, instead of carrying over the final hidden state from the epoch before?
In this example, one "epoch" means a single pass through one name, not a pass through the whole dataset. Every call to train starts a new name, so the hidden state has to be reset to zeros before the first letter: the hidden state only encodes the characters seen so far within the current name, and different names are independent sequences. Carrying over the final hidden state of the previous name would leak information from an unrelated sequence into the new one.