ahban commented Feb 15, 2019

This commit is simple, but it is essential for anyone trying to understand the natural gradient method.


  Now, a note on what we do on time t = 0, i.e. for the first minibatch. We
- initialize X_0 to the top R eigenvectors of 1/N X_0 X_0^T, where N is the
+ initialize R_0 to the top R eigenvectors of 1/N X_0 X_0^T, where N is the

I think there might be another problem here. I believe X is (N by D) where N is the minibatch size. (Although elsewhere I seem to have used M for the minibatch size so that might be a better letter). We want the eigenvectors of a D x D matrix, so it should be X_0^T X_0. That appears in the lines immediately below, too.
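
The dimension argument in this review comment can be checked numerically. The following is a minimal sketch, not the project's implementation: it assumes NumPy, a random N-by-D matrix standing in for the first minibatch X_0, and a hypothetical helper `top_r_eigenvectors` that returns the eigenvectors as rows. It shows that (1/N) X_0^T X_0 is D x D and yields eigenvectors of length D, while X_0 X_0^T is N x N.

```python
import numpy as np

def top_r_eigenvectors(X, R):
    """Top-R eigenvectors (as rows) and eigenvalues of (1/N) X^T X, for X of shape (N, D).

    Hypothetical helper for illustration only; the row layout is an assumption.
    """
    N, D = X.shape
    S = (X.T @ X) / N                      # D x D symmetric matrix
    eigvals, eigvecs = np.linalg.eigh(S)   # eigenvalues ascending, eigenvectors in columns
    order = np.argsort(eigvals)[::-1][:R]  # indices of the R largest eigenvalues
    return eigvecs[:, order].T, eigvals[order]

rng = np.random.default_rng(0)
N, D, R = 8, 5, 2                          # minibatch size, feature dimension, rank (all illustrative)
X0 = rng.standard_normal((N, D))

R0, top_vals = top_r_eigenvectors(X0, R)
print(R0.shape)             # R x D: each eigenvector lives in feature space
print((X0.T @ X0).shape)    # D x D, the matrix the review comment asks for
print((X0 @ X0.T).shape)    # N x N, what the comment text currently describes
```

With X stored as N by D, only the X_0^T X_0 form produces eigenvectors whose length matches the feature dimension D.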

danpovey commented Mar 5, 2019

Closing this because I have some more extensive fixes to those comments that I want to merge.
