Suggestion to Have multiclass.py allow prediction over one sample only ! · Issue #5135 · scikit-learn/scikit-learn · GitHub

Closed
soufanom opened this issue Aug 18, 2015 · 5 comments
Labels: Bug, Easy

@soufanom

Greetings Guys,

I came across the contributed implementation of multiclass.py in scikit-learn and have a suggestion: please consider the case where only one test sample is passed to decision_function ("Decision function for the OneVsOneClassifier"). With the current implementation the output is undesirable, because n_samples = X.shape[0] is larger than one when X is a single 1d vector of feature values. I suggest either checking the shape of X before processing it, or updating the documentation to tell users how to get a prediction for a single test sample.

Admittedly, a test set usually contains many samples, but in my specific case it was preferable to predict sample by sample. I worked around this by using X[0:1,:] instead of X[0,:], where X is a test set of several samples.
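The difference between the two indexing forms shows up in the array shapes alone. The following is a minimal NumPy sketch; the array here is an arbitrary stand-in for a test set with four features (the same feature count as iris), not data from the report.

```python
import numpy as np

# Stand-in for a test set: 3 samples with 4 features each (values are arbitrary).
X = np.arange(12, dtype=float).reshape(3, 4)

row_1d = X[0, :]    # an integer index drops the sample axis
row_2d = X[0:1, :]  # a slice keeps the sample axis

print(row_1d.shape)  # (4,)
print(row_2d.shape)  # (1, 4)

# X.shape[0] therefore differs between the two forms,
# even though both hold the same four values.
print(row_1d.shape[0], row_2d.shape[0])  # 4 1
```

With the 1d form, code that reads n_samples from X.shape[0] sees 4, which is consistent with the four predictions shown in the example further down.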

The sklearn version I have installed is 0.16.1.

I did not get an error when passing a 1d X; what I get back is as many predictions as there are elements in the 1d array.

For example:

from sklearn import datasets
from sklearn.multiclass import OneVsOneClassifier
from sklearn.svm import LinearSVC
iris = datasets.load_iris()
X, y = iris.data, iris.target
OneVsOneClassifier(LinearSVC(random_state=0)).fit(X, y).predict(X[1,:])
Out[1]: array([0, 1, 1, 1])

Replacing X[1,:] with X[1:2,:], which holds the same values, gives:

OneVsOneClassifier(LinearSVC(random_state=0)).fit(X, y).predict(X[1:2,:])
Out[2]: array([0]) # Proper output
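A shape check of the kind suggested above could look roughly like the sketch below. The helper name as_single_sample is hypothetical and is not part of scikit-learn; it only illustrates one way a caller could guard against passing a 1d vector.

```python
import numpy as np


def as_single_sample(x):
    """Return x as a 2d array of shape (1, n_features).

    Hypothetical helper, not a scikit-learn function.
    """
    x = np.asarray(x)
    if x.ndim == 1:
        # Treat a 1d vector as one sample with len(x) features.
        return x.reshape(1, -1)
    if x.ndim == 2 and x.shape[0] == 1:
        # Already a single-row 2d array; nothing to do.
        return x
    raise ValueError(
        "expected one sample, got array with shape %r" % (x.shape,)
    )


sample = as_single_sample([5.1, 3.5, 1.4, 0.2])
print(sample.shape)  # (1, 4)
```

A caller would then write something like clf.predict(as_single_sample(X[1, :])), so that the classifier always receives a 2d array.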

Regards,
Othman

@amueller
Member

LinearSVC should really complain here :-/ I guess the decision function is called... hum... Not sure we can actually fix it without #4511.

@amueller
Member
amueller commented Sep 9, 2015

This is somewhat fixed in #5152. But the current behavior is still non-sensical. I think we should put a stop-gap in master for the next release.

amueller added the Bug, Easy, and Need Contributor labels on Sep 9, 2015
amueller added this to the 0.17 milestone on Sep 9, 2015

@ogrisel
Member
ogrisel commented Sep 10, 2015

> But the current behavior is still non-sensical. I think we should put a stop-gap in master for the next release.

What stopgap do you have in mind?

On the current master we now have:

>>> OneVsOneClassifier(LinearSVC(random_state=0)).fit(X, y).predict(X[1,:])
/volatile/ogrisel/code/scikit-learn/sklearn/utils/validation.py:372: DeprecationWarning: Passing 1d arrays as data is deprecated and will be removed in 0.18. Reshape your data either usingX.reshape(-1, 1) if your data has a single feature orX.reshape(1, -1) if it contains a single sample.
  DeprecationWarning)
/volatile/ogrisel/code/scikit-learn/sklearn/utils/validation.py:372: DeprecationWarning: Passing 1d arrays as data is deprecated and will be removed in 0.18. Reshape your data either usingX.reshape(-1, 1) if your data has a single feature orX.reshape(1, -1) if it contains a single sample.
  DeprecationWarning)
/volatile/ogrisel/code/scikit-learn/sklearn/utils/validation.py:372: DeprecationWarning: Passing 1d arrays as data is deprecated and will be removed in 0.18. Reshape your data either usingX.reshape(-1, 1) if your data has a single feature orX.reshape(1, -1) if it contains a single sample.
  DeprecationWarning)
/volatile/ogrisel/code/scikit-learn/sklearn/utils/validation.py:372: DeprecationWarning: Passing 1d arrays as data is deprecated and will be removed in 0.18. Reshape your data either usingX.reshape(-1, 1) if your data has a single feature orX.reshape(1, -1) if it contains a single sample.
  DeprecationWarning)
/volatile/ogrisel/code/scikit-learn/sklearn/utils/validation.py:372: DeprecationWarning: Passing 1d arrays as data is deprecated and will be removed in 0.18. Reshape your data either usingX.reshape(-1, 1) if your data has a single feature orX.reshape(1, -1) if it contains a single sample.
  DeprecationWarning)
/volatile/ogrisel/code/scikit-learn/sklearn/utils/validation.py:372: DeprecationWarning: Passing 1d arrays as data is deprecated and will be removed in 0.18. Reshape your data either usingX.reshape(-1, 1) if your data has a single feature orX.reshape(1, -1) if it contains a single sample.
  DeprecationWarning)
array([0])
>>> OneVsOneClassifier(LinearSVC(random_state=0)).fit(X, y).predict(X[1:2,:])
array([0])

This looks fine to me, right?
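The deprecation message names two reshapes, and they give very different results. The sketch below uses an illustrative four-value vector and shows only the resulting shapes; which one is right depends on whether the vector is meant as one sample or as one feature.

```python
import numpy as np

# Illustrative 1d vector of four feature values.
v = np.array([4.9, 3.0, 1.4, 0.2])

one_sample = v.reshape(1, -1)     # one sample with four features
single_feature = v.reshape(-1, 1)  # four samples with one feature each

print(one_sample.shape)      # (1, 4)
print(single_feature.shape)  # (4, 1)
```

For the case in this issue, a single test sample, reshape(1, -1) is the form that matches X[1:2,:].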

@ogrisel
Member
ogrisel commented Sep 10, 2015

BTW I fixed the deprecation message in 250507f.

@amueller
Member

It seems this got fixed somewhere between 0.16.1 and now. I don't know where, though, it is a bit odd. But let's close it.

I'm surprised by the number of deprecation warnings. I would have expected two, not six.
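One way to look into the warning count is to record the warnings raised during a single call and count them. The sketch below uses only the standard library; validate and noisy_predict are hypothetical stand-ins, not scikit-learn functions, and the number of checks is a parameter rather than a claim about what the library does internally.

```python
import warnings


def validate(x):
    """Hypothetical stand-in for an input check that warns on 1d data."""
    if x and not isinstance(x[0], (list, tuple)):
        warnings.warn("1d input is deprecated", DeprecationWarning)
    return x


def noisy_predict(x, n_checks):
    """Hypothetical stand-in for a predict call that validates x n_checks times."""
    for _ in range(n_checks):
        validate(x)
    return [0]


def count_deprecation_warnings(func, *args):
    """Call func(*args) and return how many DeprecationWarnings it raised."""
    with warnings.catch_warnings(record=True) as caught:
        # "always" disables de-duplication so repeated warnings are all recorded.
        warnings.simplefilter("always")
        func(*args)
    return sum(issubclass(w.category, DeprecationWarning) for w in caught)


print(count_deprecation_warnings(noisy_predict, [4.9, 3.0, 1.4, 0.2], 6))    # 6
print(count_deprecation_warnings(noisy_predict, [[4.9, 3.0, 1.4, 0.2]], 6))  # 0
```

Applied to a real predict call, the same counting function would show how many times the input passes through a validation step that warns.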
