LDA implementation #1132
Codecov Report
@@            Coverage Diff             @@
##           master    #1132      +/-   ##
==========================================
+ Coverage   90.27%   90.56%   +0.29%
==========================================
  Files          93       95       +2
  Lines        7341     7518     +177
==========================================
+ Hits         6627     6809     +182
+ Misses        714      709       -5
LGTM after the last small change
def logistic_normal_approximation(
    alpha: torch.Tensor,
can you add a comment/docstring about what's going on here?
Also realized that the sigma here was the variance when I should be returning the std. I don't think it actually changes much, though.
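For readers following the thread, here is a minimal sketch of what a documented version of such a function could look like. It assumes the approximation is the Laplace approximation of the Dirichlet in the softmax basis, which is a common choice for autoencoding LDA variants; the thread does not confirm that the PR uses exactly this formula. The name `logistic_normal_approximation_sketch` is hypothetical, and the sketch uses plain Python floats instead of `torch.Tensor` so that it runs without dependencies. It returns the standard deviation rather than the variance, in line with the fix mentioned above.

```python
import math
from typing import List, Tuple


def logistic_normal_approximation_sketch(
    alpha: List[float],
) -> Tuple[List[float], List[float]]:
    """Approximate Dirichlet(alpha) by a diagonal logistic normal.

    Assumed formula (Laplace approximation in the softmax basis):
        mu_k    = log(alpha_k) - mean_i(log(alpha_i))
        var_k   = (1 / alpha_k) * (1 - 2 / K) + (1 / K**2) * sum_i(1 / alpha_i)
        sigma_k = sqrt(var_k)

    Returns (mu, sigma), where sigma is the standard deviation,
    not the variance.
    """
    k = len(alpha)
    if k < 2 or any(a <= 0 for a in alpha):
        raise ValueError("alpha needs at least two strictly positive entries")

    log_alpha = [math.log(a) for a in alpha]
    mean_log = sum(log_alpha) / k
    mu = [la - mean_log for la in log_alpha]

    inv_sum = sum(1.0 / a for a in alpha)
    var = [(1.0 / a) * (1.0 - 2.0 / k) + inv_sum / k**2 for a in alpha]

    # Take the square root so callers receive a scale, not a variance.
    sigma = [math.sqrt(v) for v in var]
    return mu, sigma
```

With a symmetric concentration such as `[1.0, 1.0, 1.0]`, the means are all zero and every component has the same scale, which is a quick sanity check that the variance/std mix-up is gone.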
Auto-Encoding Variational Bayes (AEVB) implementation of Latent Dirichlet Allocation as a PyroModule. It runs within the same order of magnitude of time as sklearn's implementation of LDA and achieves better perplexity on average. See the benchmark here: https://colab.research.google.com/drive/1Iq_drlBTLadM8KJtZwIs96RdzG8MmEFA?authuser=1#scrollTo=lINSWBVshwvr.
Note: to make this work stably with reparameterization, I used a logistic normal to approximate the Dirichlet distribution.
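To illustrate why a logistic normal is convenient here, the sketch below draws a topic-proportion vector by pushing Gaussian noise through a softmax. The noise is sampled independently of the parameters, so in an autograd framework gradients could flow through `mu` and `sigma`. The function name `sample_logistic_normal` and its plain-Python signature are assumptions made for illustration, not the PR's code, and this version performs no automatic differentiation.

```python
import math
import random
from typing import List, Optional


def sample_logistic_normal(
    mu: List[float],
    sigma: List[float],
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Draw one point on the simplex via theta = softmax(mu + sigma * eps)."""
    if rng is None:
        rng = random.Random()

    # eps ~ N(0, 1) does not depend on mu or sigma (the reparameterization).
    z = [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]

    # Numerically stable softmax: subtract the max before exponentiating.
    z_max = max(z)
    exp_z = [math.exp(v - z_max) for v in z]
    total = sum(exp_z)
    return [e / total for e in exp_z]


theta = sample_logistic_normal([0.0, 0.0, 0.0], [0.8, 0.8, 0.8], random.Random(0))
```

Each draw is strictly positive and sums to one, so it can stand in for a Dirichlet sample of document-topic proportions.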