-
-
Notifications
You must be signed in to change notification settings - Fork 25.9k
DOC Document config_context
and transform output
#25289
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
f4c9388
to
82fcc8a
Compare
Thanks for taking care of this. I left two comments, mostly around rephrasing things. Let me know what you think of the suggestions. |
eb55296
to
3eebf43
Compare
@betatim thanks for the review - updated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
with config_context(transform_output="pandas"): | ||
# X_test_scaled_pd will contain a dataframe | ||
X_test_scaled_pd = scaler.transform(X_test[num_cols]) | ||
|
||
# X_test_scaled_np will contain a numpy array | ||
X_test_scaled_np = scaler.transform(X_test[num_cols]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would find it better to show the output directly:
with config_context(transform_output="pandas"): | |
# X_test_scaled_pd will contain a dataframe | |
X_test_scaled_pd = scaler.transform(X_test[num_cols]) | |
# X_test_scaled_np will contain a numpy array | |
X_test_scaled_np = scaler.transform(X_test[num_cols]) | |
# %% | |
with config_context(transform_output="pandas"): | |
scaler.transform(X_test[num_cols]) | |
# %% | |
scaler.transform(X_test[num_cols]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good to me, I want to point out that the suggested change:
- will not print the pandas output, since it's inside the context manager
- the numpy output is relatively verbose
See:
@glemaitre we could instead do a head()
of the scaled values for the dataframe, and maybe slice for the numpy array? would that be good?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pushed this ^ change in the latest version 1f4b855
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea to show the content and the use of head()
. I like it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes this would be nice. Thinking about it maybe reverting a small code comment to mention array/dataframe could be better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM given that https://github.com/scikit-learn/scikit-learn/pull/25289/files#r1062343534 's remark is taken into account.
3eebf43
to
1f4b855
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel free to modify the wording.
1f4b855
to
088bcab
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Enabling merging when CIs will be ready.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM as well, apparently the auto-merge thingy did not work. I will squash and merge manually.
Reference Issues/PRs
Fixes: #25287
What does this implement/fix? Explain your changes.
Add documentation around config_context and transform output usage.