DOC Added meaning of default=None for n_components in MiniBatchSparsePCA and MiniBatchDictionaryLearning by alceballosa · Pull Request #21428 · scikit-learn/scikit-learn · GitHub

DOC Added meaning of default=None for n_components in MiniBatchSparsePCA and MiniBatchDictionaryLearning #21428


Merged

Conversation

alceballosa
Contributor

Reference Issues/PRs

Solves two sub-issues in #17295.

What does this implement/fix? Explain your changes.

The default case (None) for the n_components parameter in sklearn.decomposition.MiniBatchSparsePCA and sklearn.decomposition.MiniBatchDictionaryLearning was not described, so I added a short description. I also noticed that the sklearn.decomposition.dict_learning_online function has an n_components parameter as well; it defaults to 2 but can also take None, so I added a description for that case too.

Any other comments?

#DataUmbrella sprint
cc: @amueller @thomasjpfan
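
For reference, a minimal sketch of the behaviour this documents (illustrative only; assumes a reasonably recent scikit-learn, and the data shapes are arbitrary):

import numpy as np
from sklearn.decomposition import (MiniBatchDictionaryLearning,
                                   MiniBatchSparsePCA, dict_learning_online)

X = np.random.RandomState(0).rand(50, 8)  # 50 samples, 8 features

# With the default n_components=None, both estimators fall back to n_features components.
dico = MiniBatchDictionaryLearning(random_state=0).fit(X)
print(dico.components_.shape)  # (8, 8): n_features atoms, each of length n_features

spca = MiniBatchSparsePCA(random_state=0).fit(X)
print(spca.components_.shape)  # (8, 8)

# dict_learning_online defaults to n_components=2 but also accepts None, with the same fallback.
code, dictionary = dict_learning_online(X, n_components=None, random_state=0)
print(dictionary.shape)  # (8, 8)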

@@ -1336,7 +1337,8 @@ class DictionaryLearning(_BaseSparseCoding, BaseEstimator):
     Parameters
     ----------
     n_components : int, default=n_features
-        Number of dictionary elements to extract.
+        Number of dictionary elements to extract. If None, then ``n_components``
+        is set to ``n_features_in_``.
Member

I think we can keep it n_features, and in particular I think it's good if the docstring is the same for all the functions and methods. Sorry if there was some confusion about that.

Member

I agree with using n_features to be consistent with code everywhere else.

Contributor Author

Hi @amueller, @thomasjpfan, should I update my other PR (#21421) to use n_features instead of n_features_in_?

Member

Hm I think so?

Member

@alceballosa You can open another PR to make the adjustment.
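
(Side note on the distinction discussed above: ``n_features`` is the conventional docstring placeholder for the number of columns in ``X``, while ``n_features_in_`` is the attribute stored on a fitted estimator. A small illustration, with arbitrary shapes:)

import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

X = np.random.RandomState(0).rand(50, 8)
est = MiniBatchDictionaryLearning(random_state=0).fit(X)
print(est.n_features_in_)  # 8 -- set during fit; the docstrings refer to this quantity as n_features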

Updated references to the number of features to be ``n_features``.
Contributor Author
alceballosa commented Oct 23, 2021

Hi @amueller and @thomasjpfan, I just updated n_features_in_ to n_features where needed :)

Member
glemaitre left a comment

Otherwise LGTM

@@ -1336,7 +1337,8 @@ class DictionaryLearning(_BaseSparseCoding, BaseEstimator):
     Parameters
     ----------
     n_components : int, default=n_features
-        Number of dictionary elements to extract.
+        Number of dictionary elements to extract. If None, then ``n_components``
Member

Actually, the above line is wrong. It should be

n_components : int, default=None

@@ -763,7 +763,8 @@ def dict_learning_online(
         Data matrix.

     n_components : int, default=2
-        Number of dictionary atoms to extract.
+        Number of dictionary atoms to extract. If None, then ``n_components``
Member

Then we should also correct the line above

n_components : int or None, default=2

Contributor Author

Hi @glemaitre: quick question before editing this. Since n_components for class DictionaryLearning (line 1338) can take None too, shouldn't the dtype for line 765 be something along the lines of "int or None" too? Or is that unnecessary because the default value being None already states that fact implicitly?

Thank you!

Member

> Or is that unnecessary because the default value being None already states that fact implicitly

This is exactly the case: default=None already makes it implicit. We only need to make sure the docstring that follows spells out what None means (which is indeed what you are doing).

We do sometimes mention None both in the type and as the default (e.g. random_state, because the semantics of None are different in that case).
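
To make the convention concrete, a sketch of the two docstring patterns described above (illustrative wording, not the exact final docstrings):

n_components : int, default=None
    Number of dictionary elements to extract. If None, then ``n_components``
    is set to ``n_features``.

random_state : int, RandomState instance or None, default=None
    None is listed in the type here because it carries its own meaning
    (use the global random state) rather than only being the default.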

Contributor Author

Great, thank you! I just did the relevant edit. Let me know if anything else looks off.

Minor changes for consistency in n_components default value and its dtype.
@glemaitre glemaitre merged commit d742a58 into scikit-learn:main Oct 25, 2021
@glemaitre
Member

Thanks @alceballosa LGTM

ogrisel pushed a commit to ogrisel/scikit-learn that referenced this pull request Oct 28, 2021
DOC Added meaning of default=None for n_components in MiniBatchSparsePCA and MiniBatchDictionaryLearning (scikit-learn#21428)
samronsin pushed a commit to samronsin/scikit-learn that referenced this pull request Nov 30, 2021
glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Dec 24, 2021
glemaitre pushed a commit that referenced this pull request Dec 25, 2021