8000 univariate_selection -> feature_selection -> selectFDR is wrong ? · Issue #7474 · scikit-learn/scikit-learn · GitHub
[go: up one dir, main page]

Skip to content

univariate_selection -> feature_selection -> selectFDR is wrong ? #7474

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
mpjlu opened this issue Sep 23, 2016 · 6 comments · Fixed by #7490
Closed

univariate_selection -> feature_selection -> selectFDR is wrong ? #7474

mpjlu opened this issue Sep 23, 2016 · 6 comments · Fixed by #7490
Labels
Bug Easy Well-defined and straightforward way to resolve
Milestone

Comments

@mpjlu
Copy link
mpjlu commented Sep 23, 2016

Description

scikit-learn/sklearn/feature_selection/univariate_selection.py line 577:

def _get_support_mask(self):
    check_is_fitted(self, 'scores_')

    n_features = len(self.pvalues_)
    sv = np.sort(self.pvalues_)

    # here
    selected = sv[sv <= float(self.alpha) / n_features
                  * np.arange(n_features)]

    if selected.size == 0:
        return np.zeros_like(self.pvalues_, dtype=bool)
    return self.pvalues_ <= selected.max()

Should Be:

    selected = sv[sv <= float(self.alpha) / n_features
                  * (np.arange(n_features) + 1)]

Because np.arange is start from 0, here it should be start from 1

@agramfort
Copy link
Member

@bthirion any comment?

@jnothman
Copy link
Member

It does seem a bit of a harsh criterion!

@mpjlu
Copy link
Author
mpjlu commented Sep 25, 2016

consider the first element

@mpjlu
Copy link
Author
mpjlu commented Sep 25, 2016

hi @jnothman any more comments

@bthirion
Copy link
Member

I agree with the comment.

@jnothman
Copy link
Member

So I suppose we mark this as a (long-standing) bug and fix it without notice?

@jnothman jnothman added Bug Easy Well-defined and straightforward way to resolve Need Contributor labels Sep 26, 2016
@amueller amueller modified the milestone: 0.19 Sep 29, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Easy Well-defined and straightforward way to resolve
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants
0