transform_output set in config_context not preserved in the Transformer object? #25287
Comments
I agree that this can be a bit confusing; however, I don't think this is a bug. A context manager is a tool to use if you want to set up some state of the world that only applies within that context. A different way of thinking of the config context is as:
If you want the configuration to persist beyond the context, [...] Context managers are like Las Vegas: what happens in Vegas, stays in Vegas ;)
Maybe something to do, though, is to improve the documentation to explain a bit more about the interplay of a config context and [...]
@betatim thanks for the prompt response, much appreciated :) For some more context on:
In our use case we have a number of separate Pipelines that we construct in one place and use in another, and we want all of them to use pandas output. My first intuition was to construct those Pipeline objects within the options context manager, expecting the setting to be preserved in the pipeline objects (thus this issue). I quite liked that syntax, btw. Instead we could:
#25288 implements the original idea, but I'm happy to adjust it to update the documentation instead.
The behaviour is the one expected when using a context manager:
In [5]: with open("xxx.file", "r") as f:
   ...:     f.read()
   ...: print(f.closed)
True
Here, the context manager is in charge of closing the file. Basically, you cannot expect it to do whatever on [...] In the problem above, you would use [...] So I will close #25288, but I am happy to see improved documentation.
Sounds good, thanks for the quick feedback. Pushed an update to the documentation at #25289, PTAL.
Describe the bug
This is related to #23734 (btw I love this enhancement!). When config_context is used, the Transformers created within the context do not register/memoize the transform output. This may be expected, though I could not find that stated explicitly in the documentation.
Could the solution be to add a default init to _SetOutputMixin to capture transform_output if set?
Steps/Code to Reproduce
This works as expected:
But:
So when fit_transform is not under config_context(transform_output="pandas"), the output defaults to a numpy array (the default output). The StandardScaler() constructor doesn't register the config at construction time.
This is slightly confusing because:
Expected Results
As a user I would expect that config_context(transform_output="pandas") is memoized/registered during transformer construction, similar to explicitly calling set_output on a transformer.
Actual Results
See above.
Versions