Bayesian priors in nearest neighbors classification/regression #399
Comments
I did not understand what you mean by a Bayesian prior, and where is the flat prior used in the neighbors module? Please pardon me, I am relatively new to machine learning and scikit-learn.
If you're doing KNN classification on two classes with unequal sizes (e.g. 1,000 foreground objects among 100,000 background objects), then the KNN classifier as written may classify everything as background simply because of the imbalance in numbers. A Bayesian prior would weight the classification based on the relative number of samples, and potentially lead to a better classification. It's similar to the …
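For illustration only (this is just a sketch, not scikit-learn API and not the fix discussed here): one way to express that idea is to rescale the neighbor-vote fractions returned by predict_proba with a prior of one's choosing, undoing the prior implied by the training-class frequencies. The helper predict_with_prior and the toy data below are assumptions made for this sketch.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def predict_with_prior(knn, X, target_prior, train_freq):
    """Hypothetical helper: replace the prior implied by the training-class
    frequencies with `target_prior`, renormalize, and take the argmax."""
    proba = knn.predict_proba(X)                    # neighbor-fraction estimates
    weighted = proba * (target_prior / train_freq)  # swap in the desired prior
    weighted /= weighted.sum(axis=1, keepdims=True)
    return knn.classes_[np.argmax(weighted, axis=1)]

rng = np.random.RandomState(0)
# Imbalanced toy data: 1,000 "foreground" points among 100,000 "background" points
X = np.vstack([rng.normal(0.0, 1.0, size=(100_000, 2)),
               rng.normal(1.5, 1.0, size=(1_000, 2))])
y = np.concatenate([np.zeros(100_000), np.ones(1_000)]).astype(int)

knn = KNeighborsClassifier(n_neighbors=15).fit(X, y)

train_freq = np.bincount(y) / len(y)   # prior implicit in the training data
target_prior = np.array([0.5, 0.5])    # e.g. treat both classes as equally likely a priori

print(predict_with_prior(knn, X[-5:], target_prior, train_freq))
```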
I'm working on this right now during the SciPy sprint.
There's some work going on for this here as well: #970
Ack, that looks all but fixed. I'm moving on.
Maybe #9597 fixed this? I am not sure.
If somebody wants to fix this, they can check #970 and continue that work. |
Currently the classification and regression algorithms in the neighbors module use a flat prior. They should be modified to compute a prior based on the training data, and to optionally accept a user-defined prior.
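A rough sketch of what such an interface could look like (the class name, the class_prior parameter, and the weighting scheme below are hypothetical assumptions, not the design being pursued in #970): class_prior=None would mean "estimate the prior from the training-class frequencies", while an explicit array would be a user-defined prior.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

class PriorKNeighborsClassifier(KNeighborsClassifier):
    """Hypothetical sketch, not scikit-learn API: a KNN classifier whose
    neighbor votes are reweighted by a class prior."""

    def __init__(self, n_neighbors=5, class_prior=None):
        super().__init__(n_neighbors=n_neighbors)
        self.class_prior = class_prior

    def fit(self, X, y):
        super().fit(X, y)
        if self.class_prior is None:
            # prior computed from the training data (class frequencies)
            _, counts = np.unique(y, return_counts=True)
            self.class_prior_ = counts / counts.sum()
        else:
            # user-defined prior, assumed to be ordered like self.classes_
            self.class_prior_ = np.asarray(self.class_prior, dtype=float)
        return self

    def predict_proba(self, X):
        proba = super().predict_proba(X) * self.class_prior_
        return proba / proba.sum(axis=1, keepdims=True)

    def predict(self, X):
        return self.classes_[np.argmax(self.predict_proba(X), axis=1)]

# Minimal usage example on toy data
X = np.array([[0.0], [0.1], [0.2], [1.0], [1.1]])
y = np.array([0, 0, 0, 1, 1])
clf = PriorKNeighborsClassifier(n_neighbors=3, class_prior=[0.5, 0.5]).fit(X, y)
print(clf.predict([[0.6]]))
```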