Bad fp-comparison in check_priors (at least naive_bayes.py) · Issue #9633 · scikit-learn/scikit-learn · GitHub

Bad fp-comparison in check_priors (at least naive_bayes.py) #9633
Closed
@sschnug

Description

Reading this StackOverflow question led me to check the code in naive_bayes.py where the priors are checked.

I did not check the whole method and what is internally assumed about these priors, but:

if priors.sum() != 1.0:
    raise ValueError('The sum of the priors should be 1.')

is obviously asking for trouble, as the example from the SO post above shows.

import numpy as np
priors = np.array([0.08, 0.14, 0.03, 0.16, 0.11, 0.16, 0.07, 0.14, 0.11, 0.0])
my_sum = np.sum(priors)
print('my_sum: ', my_sum)
print('naive: ', my_sum == 1.0)
print('safe: ', np.isclose(my_sum, 1.0))

#('my_sum: ', 1.0000000000000002)
#('naive: ', False)
#('safe: ', True)
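The effect is not specific to these priors. As a minimal, NumPy-free sketch of the same problem (the variable names are only illustrative), adding 0.1 ten times in a plain loop does not give exactly 1.0, because 0.1 has no exact binary representation and each addition rounds:

```python
import math

# Ten copies of 0.1 "should" add up to 1.0.
tenths = [0.1] * 10

# Plain left-to-right accumulation; every addition rounds to the nearest double.
naive_total = 0.0
for value in tenths:
    naive_total += value

# math.fsum tracks intermediate partial sums to avoid accumulated rounding error.
accurate_total = math.fsum(tenths)

print('naive_total:', naive_total)
print('exact compare:', naive_total == 1.0)
print('tolerant compare:', math.isclose(naive_total, 1.0))
print('fsum total:', accurate_total)
```

An exact `==` test on a floating-point sum therefore rejects inputs that are correct to within rounding error, while a tolerance-based test accepts them.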

Steps/Code to Reproduce

Just take the official GaussianNB example and use the numbers above.

Expected Results

Safe fp-math comparison OR internal correction when input-sum is expected to be close to 1.

Using np.isclose() is a five-second change, but since I did not check the remaining code, I don't know whether it could cause errors at a later stage.
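To make the two options concrete, here is a rough sketch of a tolerant check followed by an optional renormalization. The helper name `check_priors_sum` is hypothetical and this is not scikit-learn's actual code; it only assumes NumPy's default `np.isclose` tolerances are acceptable for priors:

```python
import numpy as np


def check_priors_sum(priors):
    """Hypothetical helper: validate priors with a tolerance, then renormalize.

    This is an illustration only, not the implementation in naive_bayes.py.
    """
    priors = np.asarray(priors, dtype=float)
    total = priors.sum()
    # Tolerant comparison instead of `total != 1.0`.
    if not np.isclose(total, 1.0):
        raise ValueError('The sum of the priors should be 1.')
    # Optional "internal correction": rescale so later code sees a sum
    # that is as close to 1 as floating point allows.
    return priors / total


priors = [0.08, 0.14, 0.03, 0.16, 0.11, 0.16, 0.07, 0.14, 0.11, 0.0]
checked = check_priors_sum(priors)
print('accepted, sum =', checked.sum())
```

Whether the renormalization step is wanted depends on what the rest of the estimator assumes about the priors, which is exactly the part I have not checked.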

Actual Results

ValueError('The sum of the priors should be 1.')

Versions

Current master: d8c363f296948a9171ac8a5d69f79dcb56589335.

Further remarks:

numpy.random.sample() also takes the safer approach (though without using np.isclose()), as seen here.
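For comparison, a tolerance check without np.isclose() could look roughly like the sketch below. The function name `sums_to_one` and the choice of tolerance (the square root of double-precision machine epsilon) are assumptions made for illustration; this is not copied from NumPy's source:

```python
import math
import sys

# Assumed absolute tolerance: sqrt of machine epsilon, roughly 1.5e-8.
ATOL = math.sqrt(sys.float_info.epsilon)


def sums_to_one(probabilities, atol=ATOL):
    """Hypothetical check: is the sum within `atol` of 1.0?"""
    # math.fsum reduces accumulated rounding error in the summation itself.
    total = math.fsum(probabilities)
    return abs(total - 1.0) <= atol


priors = [0.08, 0.14, 0.03, 0.16, 0.11, 0.16, 0.07, 0.14, 0.11, 0.0]
print('priors accepted:', sums_to_one(priors))
print('bad input accepted:', sums_to_one([0.5, 0.6]))
```

The difference from np.isclose() is mainly that the tolerance here is purely absolute, which is reasonable when the target value is always 1.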

Labels: Bug, Easy, help wanted
