Perplexity not monotonically decreasing for batch Latent Dirichlet Allocation #6777
Comments
Have you solved this problem? The literature states that the perplexity should decrease as the number of topics increases. I tried this both on my own dataset and on sklearn.datasets, but the perplexity didn't go down in either case. I also tried setting evaluate_every to a non-zero number, but it didn't print the perplexity over iterations. Did I make a mistake?
There might be an issue in the implementation, but I'm not sure :(
@kenanz0630 you should see the perplexity at each iteration, though. Are you doing batch or online learning?
Not sure if this is related to #7992
@kenanz0630 can you provide sample code? And can you try running the code from #7992? Because obviously I can't reproduce any more :(
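One likely reason the per-iteration perplexity was never printed: in scikit-learn, `evaluate_every` controls how often the bound is computed, but the value is only printed when `verbose` is also non-zero (this is an assumption based on common scikit-learn conventions; check the version you are running). A minimal sketch on synthetic count data:

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

# Synthetic document-term count matrix (illustrative only)
rng = np.random.RandomState(0)
X = rng.poisson(1.0, size=(100, 20))

# evaluate_every alone does not print anything; verbose > 0 is also needed
lda = LatentDirichletAllocation(
    n_components=3,
    learning_method="batch",
    max_iter=5,
    evaluate_every=1,  # compute the bound every iteration
    verbose=1,         # print the per-iteration perplexity
    random_state=0,
)
lda.fit(X)
```

With `verbose=1`, each evaluated iteration should print a line including the current perplexity, which makes it possible to watch whether it is actually non-increasing.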
Do we have a minimal failing example? It seems unclear if this issue remains. |
I tried a grid search over the number of topics for LDA recently, and the log likelihood scores for both the training and validation sets monotonically decrease as the number of topics increases.
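For reference, a grid search like the one described above can be sketched as follows. This is a minimal example on synthetic count data, assuming a recent scikit-learn; `GridSearchCV` uses the estimator's own `score` method (the approximate log likelihood) by default, so no custom scorer is needed:

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.model_selection import GridSearchCV

# Synthetic document-term count matrix (illustrative only)
rng = np.random.RandomState(0)
X = rng.poisson(1.0, size=(100, 50))

# Cross-validated search over n_components, scored by LDA's
# approximate log likelihood on the held-out fold
search = GridSearchCV(
    LatentDirichletAllocation(learning_method="batch", max_iter=10,
                              random_state=0),
    param_grid={"n_components": [2, 5, 10]},
    cv=2,
)
search.fit(X)
```

Inspecting `search.cv_results_["mean_test_score"]` then shows how the held-out log likelihood changes with the number of topics, which is what the comment above is reporting.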
I see the same behavior, that makes this implementation almost useless. I'm using sklearn 0.23.2 |
Hi, I am hitting the same bug with the latest sklearn version. Has anyone got any idea how to fix it, or what alternative routes to use to get perplexity scores? I think this would be very useful.
Hi, is there any news regarding this? It's been five years and this bug is still present. I think it would be wise to hide this feature entirely, or at the very least mention in the docs that it's broken.
Nobody has posted a minimal reproducing example yet as far as I can see. |
I totally forgot about this bug. I faced this issue when doing one of my projects. This seems to happen because of document lengths. |
I'm seeing this issue as well... |
Same here, guys. Any tips on how to solve it? |
When using the batch method, the perplexity in LDA should be non-increasing in every iteration, right?
I have cases where it does increase. If this is indeed a bug, I'll investigate.
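A minimal way to check the monotonicity claim above is to refit the same batch model with an increasing iteration budget and record the training perplexity each time. This is a sketch on synthetic count data (names and sizes are illustrative); it does not assert monotonicity, since whether the sequence is actually non-increasing is exactly what is in question:

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

# Synthetic document-term count matrix (illustrative only)
rng = np.random.RandomState(0)
X = rng.poisson(1.0, size=(200, 30))

# Refit from the same seed with a growing iteration budget and
# record the perplexity on the training data after each fit
perplexities = []
for n_iter in [1, 2, 5, 10]:
    lda = LatentDirichletAllocation(
        n_components=5,
        learning_method="batch",
        max_iter=n_iter,
        random_state=0,
    )
    lda.fit(X)
    perplexities.append(lda.perplexity(X))

print(perplexities)
```

If the bug described in this issue is present, the printed sequence will not be non-increasing even though batch variational inference should only improve the bound.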