-
-
Notifications
You must be signed in to change notification settings - Fork 25.9k
Minimum redundancy maximum relevance (mRMR) feature selection #2547
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
pep8 somehow does not identify this error, while Travis CI does
------- | ||
_compute_mRMR(X, y) | ||
Computes the minimal relevance maximal redundancy of each feature | ||
returning mask and score |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The docstring should never refer to private methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok
It would be cool to have an example that compares this method with univariate selection and RFE. |
This looks good, thanks. I am not totally convinced by the tests, though. I am not familiar with the method but it would be good if the expected result from the model could be computed in an easy way. Currently it looks like the scores are some magic numbers and I don't know what they mean. |
Nice suggestion! I will insert that kind of example in the documentation, as soon as I get familiar with sphinx. About the testing, I emulated what done in the tests for mutual_information (sklearn.metrics.cluster). Also in that case there are arbitrary numbers. I am not aware of a theoretical value of mRMR that can be used for this purpose. |
To create an example, you simply have to add a file under the |
Thanks for the clarification, I was thinking of adding it in the description of the method, inside feature_selection.rst By the way, once I have added the example in the examples folder, which rst file do I need to modify to make sure that it gets published in http://scikit-learn.org/stable/auto_examples/index.html? |
Are there any plans to end up implementing this? I'd love to see mrmr/Mutual Info feature selection actually decently implemented in python (without needing the C.exe) especially in scikit. |
Sorry this lay around for a while. We seem to be all pretty busy at the moment. I still think this is a cool addition. |
|
||
Attributes | ||
---------- | ||
k : int, default=2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be in a Parameters
section together with "rule".
|
||
assert_array_equal([2, 0], m.mask) | ||
|
||
assert_array_equal(0.6730116670092563, m.score[0]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be better to compare to results in the literature, which I presume are not reported to 16 decimal places ;)
Closed in favour of the other PR |
Hi!
I have created a new class computing the mRMR filtering feature selection.
pep8, pyflakes and nosetests run succesfully on the submitted code.
I am planning to create the documentation for this class as soon as I receive your approval.
Thanks!
Andrea