Crash in univariate feature selection if no feature is selected. · Issue #4059 · scikit-learn/scikit-learn · GitHub

Closed
amueller opened this issue Jan 7, 2015 · 7 comments

@amueller
Member
amueller commented Jan 7, 2015

Univariate feature selection crashes if no feature is selected with an unhelpful message:

import numpy as np
from sklearn.feature_selection import SelectFdr

rng = np.random.RandomState(0)
X = rng.rand(40, 10) 
y = rng.randint(0, 4, size=40)

fdr = SelectFdr()
fdr.fit(X, y)
fdr.transform(X)
ValueError                                Traceback (most recent call last)
<ipython-input-7-5cd77e510247> in <module>()
----> 1 asdf.transform(X)

/home/andy/checkout/scikit-learn/sklearn/feature_selection/base.pyc in transform(self, X)
     73         """
     74         X = check_array(X, accept_sparse='csr')
---> 75         mask = self.get_support()
     76         if len(mask) != X.shape[1]:
     77             raise ValueError("X has a different shape than during fitting.")

/home/andy/checkout/scikit-learn/sklearn/feature_selection/base.pyc in get_support(self, indices)
     44             values are indices into the input feature vector.
     45         """
---> 46         mask = self._get_support_mask()
     47         return mask if not indices else np.where(mask)[0]
     48 

/home/andy/checkout/scikit-learn/sklearn/feature_selection/univariate_selection.pyc in _get_support_mask(self)
    488         alpha = self.alpha
    489         sv = np.sort(self.pvalues_)
--> 490         threshold = sv[sv < alpha * np.arange(len(self.pvalues_))].max()
    491         return self.pvalues_ <= threshold
    492 

/usr/lib/python2.7/dist-packages/numpy/core/_methods.pyc in _amax(a, axis, out, keepdims)
     15 def _amax(a, axis=None, out=None, keepdims=False):
     16     return um.maximum.reduce(a, axis=axis,
---> 17                             out=out, keepdims=keepdims)
     18 
     19 def _amin(a, axis=None, out=None, keepdims=False):

ValueError: zero-size array to reduction operation maximum which has no identity

amueller added the Bug label Jan 7, 2015

@jnothman
Member
jnothman commented Jan 7, 2015

I assume only for sparse input, and again a nice solution is only available in very recent scipy.

@jnothman
Member
jnothman commented Jan 7, 2015

Or perhaps not only for sparse input...

@amueller
Member Author
amueller commented Jan 7, 2015

No, the example above is not sparse. There is simply no max of an empty array.
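
To make that concrete, here is a minimal sketch that reproduces the failing expression from the traceback without scikit-learn. The p-values are made up for illustration; only the thresholding expression is taken from the traceback above.

```python
import numpy as np

# Made-up p-values that are all too large to pass the test (illustrative only).
pvalues = np.array([0.5, 0.2, 0.9])
alpha = 0.05

# Same expression as line 490 in the traceback, minus the final .max().
sv = np.sort(pvalues)
selected = sv[sv < alpha * np.arange(len(pvalues))]

# No p-value passes, so `selected` is empty and .max() has nothing to reduce.
try:
    selected.max()
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(selected.size, error_message)
```

The dense case is enough to trigger the error; sparsity plays no role.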

@amueller
Member Author
amueller commented Jan 7, 2015

Maybe bailing is the right thing to do here, but certainly not with that message. The other option would be to return an empty array, which is not super helpful.

@ogrisel
Member
ogrisel commented Feb 5, 2015

> Maybe bailing is the right thing to do here, but certainly not with that message.

I agree. Any suggestion for the class of exception we should raise for such a case? I don't think there is any standard library exception or standard numpy exception that matches that case well. ValueError would qualify as the least offending. It also has the advantage of not breaking backward compatibility for users who already catch it in their code.

> The other option would be to return an empty array, which is not super helpful.

-1 as well.
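
One way the "bail with a better message" option could look is sketched below. This is not the change made in #4206; `fdr_support_mask` and the wording of the message are hypothetical, and only the thresholding expression comes from the traceback above.

```python
import numpy as np


def fdr_support_mask(pvalues, alpha):
    """Hypothetical helper: boolean mask of features that pass the threshold.

    Raises ValueError with an explicit message when nothing passes, instead
    of letting numpy fail on the max of an empty array.
    """
    pvalues = np.asarray(pvalues, dtype=float)
    sv = np.sort(pvalues)
    # Same thresholding expression as in the traceback.
    candidates = sv[sv < alpha * np.arange(len(pvalues))]
    if candidates.size == 0:
        # Illustrative wording, not a message taken from scikit-learn.
        raise ValueError(
            "No features were selected: no p-value passed the test "
            "at alpha=%r." % alpha
        )
    return pvalues <= candidates.max()


# Two small p-values pass at alpha=0.05; the large one is rejected.
mask = fdr_support_mask([0.9, 0.001, 0.002], alpha=0.05)
print(mask)
```

Because the exception is still a ValueError, code that already catches it keeps working; only the message changes.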

@ogrisel
Member
ogrisel commented Feb 5, 2015

I am working on a fix.

@ogrisel
Member
ogrisel commented Feb 5, 2015

Please have a look at #4206 and let me know what you think of this solution.

ogrisel added a commit to ogrisel/scikit-learn that referenced this issue Feb 6, 2015
jnothman closed this as completed Feb 7, 2015
jnothman added a commit that referenced this issue Feb 7, 2015
[MRG] explicit warning message for strict selectors

Also fixes #4059