Description
The perplexity method of the LatentDirichletAllocation class appears to have broken during the transition from scikit-learn 0.17.1 to 0.18.1. The values returned by the method are no longer consistent with the values printed during training iterations (verbose=1, evaluate_every=1).
Steps/Code to Reproduce
A gist with a reproducible example is available here: https://gist.github.com/garyForeman/321a10ebe29215a0c1acbcb4b320fb8e
Expected Results
Final perplexity printed during training should equal the value returned by the perplexity method when passed the training data.
Results when using 0.17.1:
iteration: 100, perplexity: 4044.2226
Train set perplexity: 4044.22258392
Actual Results
Results when using 0.18.1:
iteration: 100, perplexity: 4042.6522
Train set perplexity: 7592353.46945
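To make the mismatch easy to check mechanically, here is a minimal sketch that parses the verbose training line and compares it with the value returned by the perplexity method. The helper names (`logged_perplexity`, `is_consistent`) and the tolerance are my own assumptions for illustration, not part of scikit-learn; the tolerance only accounts for the log line being printed to four decimal places. The numbers are the ones quoted above.

```python
import math
import re

# Matches the verbose training line quoted in this report,
# e.g. "iteration: 100, perplexity: 4044.2226".
LOG_LINE = re.compile(r"iteration:\s*\d+,\s*perplexity:\s*([0-9.]+)")


def logged_perplexity(line):
    """Extract the perplexity value from one verbose training line (hypothetical helper)."""
    match = LOG_LINE.search(line)
    if match is None:
        raise ValueError("no perplexity found in line: %r" % line)
    return float(match.group(1))


def is_consistent(log_line, method_value, abs_tol=1e-4):
    """Return True if the logged value agrees with the method's value.

    abs_tol is an assumed tolerance that covers the 4-decimal rounding
    of the printed log line.
    """
    return math.isclose(
        logged_perplexity(log_line), method_value, rel_tol=0.0, abs_tol=abs_tol
    )


# Values reported for 0.17.1 (consistent) and 0.18.1 (inconsistent).
print(is_consistent("iteration: 100, perplexity: 4044.2226", 4044.22258392))  # True
print(is_consistent("iteration: 100, perplexity: 4042.6522", 7592353.46945))  # False
```

With the 0.17.1 numbers the two values agree to the printed precision; with the 0.18.1 numbers they differ by more than three orders of magnitude.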
Versions
Darwin-15.6.0-x86_64-i386-64bit
('Python', '2.7.12 |Anaconda custom (x86_64)| (default, Jul 2 2016, 17:43:17) \n[GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.11.00)]')
('NumPy', '1.11.2')
('SciPy', '0.18.1')
('Scikit-Learn', '0.18.1')