LatentDirichletAllocation perplexity method broken in version 0.18.1 · Issue #7954 · scikit-learn/scikit-learn · GitHub

LatentDirichletAllocation perplexity method broken in version 0.18.1 #7954

@garyForeman

Description

The perplexity method of the LatentDirichletAllocation class appears to have broken during the transition from scikit-learn 0.17.1 to 0.18.1. The values returned by the method are no longer consistent with the values printed during training iterations (verbose=1, evaluate_every=1).

Steps/Code to Reproduce

A gist with a reproducible example can be found here: https://gist.github.com/garyForeman/321a10ebe29215a0c1acbcb4b320fb8e
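
The gist's contents are not reproduced here. As a rough sketch of the same kind of check, the snippet below fits LatentDirichletAllocation on a small synthetic count matrix (an assumption, not the gist's corpus) with verbose=1 and evaluate_every=1, then calls the perplexity method on the training data. The matrix shape, max_iter, and random_state values are illustrative choices only. The number of topics is left at its default because the name of that parameter differs between scikit-learn versions.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

# Synthetic document-term count matrix: a hypothetical stand-in for the
# corpus used in the gist (60 documents, 40 vocabulary terms).
rng = np.random.RandomState(0)
X = rng.poisson(lam=1.0, size=(60, 40))

# verbose=1 together with evaluate_every=1 prints the perplexity after
# every training iteration.
lda = LatentDirichletAllocation(
    learning_method="batch",
    max_iter=5,
    evaluate_every=1,
    verbose=1,
    random_state=0,
)
lda.fit(X)

# Perplexity of the fitted model on the same training data.
train_perplexity = lda.perplexity(X)
print("Train set perplexity:", train_perplexity)
```

If the method behaves as expected, the value printed on the last line should be close to the perplexity reported for the final training iteration.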

Expected Results

Final perplexity printed during training should equal the value returned by the perplexity method when passed the training data.

Results when using 0.17.1:
iteration: 100, perplexity: 4044.2226
Train set perplexity: 4044.22258392

Actual Results

Results when using 0.18.1:
iteration: 100, perplexity: 4042.6522
Train set perplexity: 7592353.46945

Versions

Darwin-15.6.0-x86_64-i386-64bit
('Python', '2.7.12 |Anaconda custom (x86_64)| (default, Jul  2 2016, 17:43:17) \n[GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.11.00)]')
('NumPy', '1.11.2')
('SciPy', '0.18.1')
('Scikit-Learn', '0.18.1')
