Minimum redundancy maximum relevance (mRMR) feature selection #2547

AndreaBravi · 2013-10-23T17:16:03Z

Hi!

I have created a new class computing the mRMR filtering feature selection.

pep8, pyflakes and nosetests run successfully on the submitted code.

I am planning to create the documentation for this class as soon as I receive your approval.

Thanks!

Andrea

coveralls · 2013-10-25T05:22:23Z

Coverage remained the same when pulling 62cf55b on AndreaBravi:mRMR into d43a767 on scikit-learn:master.

larsmans · 2013-10-26T14:27:53Z

sklearn/feature_selection/multivariate_filtering.py

+    -------
+    _compute_mRMR(X, y)
+        Computes the minimal relevance maximal redundancy of each feature
+        returning mask and score


The docstring should never refer to private methods.

amueller · 2013-10-27T04:15:23Z

It would be cool to have an example that compares this method with univariate selection and RFE.

amueller · 2013-10-27T04:20:46Z

This looks good, thanks. I am not totally convinced by the tests, though. I am not familiar with the method but it would be good if the expected result from the model could be computed in an easy way. Currently it looks like the scores are some magic numbers and I don't know what they mean.
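A sketch of what a hand-checkable expectation could look like, assuming the score is built from mutual information measured in nats. It uses `sklearn.metrics.mutual_info_score` on tiny discrete toy data. The arrays and the difference-form score (relevance minus redundancy) are illustrative assumptions, not the data or the formula used in this PR's tests.

```python
import numpy as np
from numpy.testing import assert_allclose
from sklearn.metrics import mutual_info_score

# Toy discrete data (assumed for illustration, not the PR's test data).
y = np.array([0, 0, 1, 1])
a = np.array([0, 0, 1, 1])  # identical to y
b = np.array([0, 1, 0, 1])  # independent of both y and a
c = a.copy()                # exact duplicate of a

# Relevance terms can be derived by hand:
# I(a; y) = H(y) = ln 2 for a balanced binary target, and I(b; y) = 0.
relevance_a = mutual_info_score(a, y)
relevance_b = mutual_info_score(b, y)
assert_allclose(relevance_a, np.log(2))
assert_allclose(relevance_b, 0.0, atol=1e-12)

# Assumed difference-form score for c once a is already selected:
# relevance I(c; y) minus redundancy I(c; a) = ln 2 - ln 2 = 0.
score_c_given_a = mutual_info_score(c, y) - mutual_info_score(c, a)
assert_allclose(score_c_given_a, 0.0, atol=1e-12)
```

With data like this every expected number in the test has a one-line derivation, so a reader can tell what the assertion is checking.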

AndreaBravi · 2013-10-29T17:22:20Z

Nice suggestion! I will insert that kind of example in the documentation, as soon as I get familiar with sphinx.

About the testing, I emulated what is done in the tests for mutual_information (sklearn.metrics.cluster). Those tests also use arbitrary numbers. I am not aware of a theoretical value of mRMR that can be used for this purpose.

coveralls · 2013-10-29T17:52:08Z

Coverage remained the same when pulling c2d1852 on AndreaBravi:mRMR into d43a767 on scikit-learn:master.

amueller · 2013-11-04T03:03:06Z

To create an example, you simply have to add a file under the examples folder that starts with plot_.
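A minimal sketch of what such a file could contain, assuming a hypothetical name like `examples/plot_feature_selection_comparison.py`. It only compares univariate selection with RFE, since the mRMR class from this PR is not importable from a released scikit-learn; a real gallery example would also add a plot.

```python
# Hypothetical file: examples/plot_feature_selection_comparison.py
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE, SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression

# Synthetic data; with shuffle=False the informative columns come first,
# followed by the redundant ones.
X, y = make_classification(
    n_samples=200,
    n_features=10,
    n_informative=3,
    n_redundant=2,
    shuffle=False,
    random_state=0,
)

# Univariate selection: rank each feature independently with an F-test.
univariate = SelectKBest(f_classif, k=3).fit(X, y)
univariate_idx = univariate.get_support(indices=True)

# Recursive feature elimination: repeatedly drop the weakest coefficient.
rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=3).fit(X, y)
rfe_idx = rfe.get_support(indices=True)

print("Univariate selection kept features:", univariate_idx)
print("RFE kept features:", rfe_idx)
```

The mRMR selector would slot in as a third estimator fitted on the same `X, y`, so the selected indices can be compared side by side.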

AndreaBravi · 2013-11-04T15:50:49Z

Thanks for the clarification, I was thinking of adding it in the description of the method, inside feature_selection.rst

By the way, once I have added the example in the examples folder, which rst file do I need to modify to make sure that it gets published in http://scikit-learn.org/stable/auto_examples/index.html?

coveralls · 2013-11-04T21:34:49Z

Coverage remained the same when pulling 9d1ea9a on AndreaBravi:mRMR into d43a767 on scikit-learn:master.

ddofer · 2014-11-19T08:09:15Z

Are there any plans to end up implementing this? I'd love to see mRMR/mutual information feature selection decently implemented in Python (without needing the C.exe), especially in scikit-learn.

amueller · 2015-01-09T20:28:13Z

Sorry this lay around for a while. We seem to be all pretty busy at the moment. I still think this is a cool addition.

jnothman · 2015-01-18T11:03:37Z

sklearn/feature_selection/multivariate_filtering.py

+
+    Attributes
+    ----------
+    k : int, default=2


This should be in a Parameters section together with "rule".
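A sketch of the suggested layout in numpydoc style. The class name, the meaning given to `k`, and the type and default of `rule` are assumptions for illustration; only `k : int, default=2` and the existence of a `rule` parameter come from this thread.

```python
class MinRedundancyMaxRelevance:
    """Select features with the mRMR criterion (docstring layout sketch).

    Parameters
    ----------
    k : int, default=2
        Number of features to select (assumed meaning).

    rule : object, default=None
        Placeholder entry; the accepted values of ``rule`` are not shown
        in this thread.
    """

    def __init__(self, k=2, rule=None):
        # Constructor arguments are stored unchanged, as scikit-learn
        # estimators conventionally do.
        self.k = k
        self.rule = rule


selector = MinRedundancyMaxRelevance()
print(selector.k)
```

Under the same convention, an Attributes section is reserved for values learned during `fit`, which are usually named with a trailing underscore.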

jnothman · 2015-01-18T11:20:02Z

sklearn/feature_selection/tests/test_multivariate_filtering.py

+
+    assert_array_equal([2, 0], m.mask)
+
+    assert_array_equal(0.6730116670092563, m.score[0])


It would be better to compare to results in the literature, which I presume are not reported to 16 decimal places ;)
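A sketch of a tolerance-based comparison using `numpy.testing.assert_allclose`. The reference value below is just the quoted score rounded to three decimals, used as a stand-in; it is not taken from the literature.

```python
from numpy.testing import assert_allclose

# Score quoted in the test above.
computed_score = 0.6730116670092563

# Stand-in reference: the quoted value rounded to 3 decimals.
# A real test would use a published number here instead.
reference_score = 0.673

# Passes as long as the two values agree to within the stated tolerance,
# which matches the precision a paper would typically report.
assert_allclose(computed_score, reference_score, atol=1e-3)
```

Unlike an exact equality check, this does not break when the last few floating point digits differ across platforms or library versions.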

MechCoder · 2015-11-05T00:00:24Z

Closed in favour of the other PR

AndreaBravi added 6 commits October 23, 2013 13:02

Minimum redundancy maximum relevance feature selection

3559544

Corrected docstring indentation error

9f47461

pep8 somehow does not identify this error, while Travis CI does

Corrected docstring 2

73d55aa

Checking for NaNs and Inf during fit()

14a7cf0

Substituted lambda functions because of pickle problem

ec17096

Solved pickle problem and added test

62cf55b

larsmans reviewed Oct 26, 2013

Implemented suggested corrections

c2d1852

AndreaBravi added 2 commits November 4, 2013 16:06

Example comparing mRMR with other selection algorithms

0f47cca

Merge branch 'mRMR' of https://github.com/AndreaBravi/scikit-learn in…

9d1ea9a

…to mRMR

mRMR rst description

4b97111

larsmans force-pushed the master branch from 58a55ad to 4b82379 Compare August 25, 2014 21:50

MechCoder force-pushed the master branch from 6deaea0 to 3f49cee Compare November 3, 2014 12:36

jnothman reviewed Jan 18, 2015

nmayorov mentioned this pull request Oct 8, 2015

[MRG+1] ENH: Feature selection based on mutual information #5372

Closed

MechCoder closed this Nov 5, 2015
