8000 DOC Document `config_context` and transform output by ravwojdyla · Pull Request #25289 · scikit-learn/scikit-learn · GitHub
[go: up one dir, main page]

Skip to content

DOC Document config_context and transform output #25289

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 5, 2023

Conversation

ravwojdyla
Copy link
Contributor

Reference Issues/PRs

Fixes: #25287

What does this implement/fix? Explain your changes.

Add documentation around config_context and transform output usage.

@betatim
Copy link
Member
betatim commented Jan 4, 2023

Thanks for taking care of this. I left two comments, mostly around rephrasing things. Let me know what you think of the suggestions.

@ravwojdyla ravwojdyla force-pushed the doc-config-cntx branch 2 times, most recently from eb55296 to 3eebf43 Compare January 4, 2023 16:06
@ravwojdyla
Copy link
Contributor Author

@betatim thanks for the review - updated.

Copy link
Member
@betatim betatim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@betatim betatim added the Quick Review For PRs that are quick to review label Jan 5, 2023
Comment on lines 129 to 134
with config_context(transform_output="pandas"):
# X_test_scaled_pd will contain a dataframe
X_test_scaled_pd = scaler.transform(X_test[num_cols])

# X_test_scaled_np will contain a numpy array
X_test_scaled_np = scaler.transform(X_test[num_cols])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would find it better to show the output directly:

Suggested change
with config_context(transform_output="pandas"):
# X_test_scaled_pd will contain a dataframe
X_test_scaled_pd = scaler.transform(X_test[num_cols])
# X_test_scaled_np will contain a numpy array
X_test_scaled_np = scaler.transform(X_test[num_cols])
# %%
with config_context(transform_output="pandas"):
scaler.transform(X_test[num_cols])
# %%
scaler.transform(X_test[num_cols])

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me, I want to point out that the suggested change:

  • will not print the pandas output, since it's inside the context manager
  • the numpy output is relatively verbose

See:

image

@glemaitre we could instead do a head() of the scaled values for the dataframe, and maybe slice for the numpy array? would that be good?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is how it would look like:

image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed this ^ change in the latest version 1f4b855

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea to show the content and the use of head(). I like it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes this would be nice. Thinking about it maybe reverting a small code comment to mention array/dataframe could be better.

Copy link
Member
@jjerphan jjerphan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM given that https://github.com/scikit-learn/scikit-learn/pull/25289/files#r1062343534 's remark is taken into account.

@jjerphan jjerphan removed the Quick Review For PRs that are quick to review label Jan 5, 2023
Copy link
Member
@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to modify the wording.

Copy link
Member
@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Enabling merging when CIs will be ready.

Copy link
Member
@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM as well, apparently the auto-merge thingy did not work. I will squash and merge manually.

@ogrisel ogrisel merged commit 3efdc89 into scikit-learn:main Jan 5 6D40 , 2023
@ravwojdyla ravwojdyla deleted the doc-config-cntx branch January 6, 2023 16:54
jjerphan pushed a commit to jjerphan/scikit-learn that referenced this pull request Jan 20, 2023
jjerphan pushed a commit to jjerphan/scikit-learn that referenced this pull request Jan 20, 2023
jjerphan pushed a commit to jjerphan/scikit-learn that referenced this pull request Jan 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

transform_output set in config_context not preserved in the Transformer object?
5 participants
0