[MRG + 1] ENH: allow to pass callable as column specifier in ColumnTransformer by jorisvandenbossche · Pull Request #11592 · scikit-learn/scikit-learn · GitHub

[MRG + 1] ENH: allow to pass callable as column specifier in ColumnTransformer #11592


Merged

Conversation

jorisvandenbossche
Member

A partial takeover of #11301 that adds only the core functionality of being able to pass a function as the column specifier (and not the select_dtypes factory function provided there).

@amueller changed the title from "ENH: allow to pass callable as column specifier in ColumnTransformer" to "[MRG + 1] ENH: allow to pass callable as column specifier in ColumnTransformer" on Jul 17, 2018
Member
@amueller amueller left a comment


lgtm

@jorisvandenbossche
Member Author

(Travis is passing)

@jorisvandenbossche jorisvandenbossche added this to the 0.20 milestone Jul 17, 2018
Member
@GaelVaroquaux GaelVaroquaux left a comment


LGTM.

Merging

@GaelVaroquaux GaelVaroquaux merged commit 9caa982 into scikit-learn:master Jul 17, 2018
@jorisvandenbossche jorisvandenbossche deleted the column-selector-functions branch July 17, 2018 20:31
@amueller
Member

ohhh yeaahhh

Member
@jnothman jnothman left a comment


I have realised the behaviour here can be problematic if key(X) returns a different value at fit time and at transform time... This might not be as simple as we thought; it needs to resolve the set of column names/indices at fit time only, no?
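To make the concern concrete, here is a minimal pure-Python sketch, not scikit-learn code. It assumes a toy "table" represented as a dict of column name to list of values, and a hypothetical callable `numeric_columns` standing in for `key`. It only illustrates how a selector that inspects the data can pick different columns at fit and at transform time.

```python
# Toy illustration only: a "table" is a dict mapping column name -> list of values.
# `numeric_columns` is a hypothetical stand-in for the callable `key` discussed above.

def numeric_columns(X):
    """Return the names of columns whose values are all int or float."""
    return [
        name
        for name, values in X.items()
        if all(
            isinstance(v, (int, float)) and not isinstance(v, bool)
            for v in values
        )
    ]


# Data seen at fit time: "age" and "income" are numeric.
X_fit = {
    "age": [31, 45, 27],
    "city": ["Paris", "Oslo", "Lima"],
    "income": [40.0, 52.5, 38.0],
}

# Data seen at transform time: "age" now arrives as strings.
X_transform = {
    "age": ["31", "45"],
    "city": ["Rome", "Kyiv"],
    "income": [41.0, 50.0],
}

cols_at_fit = numeric_columns(X_fit)
cols_at_transform = numeric_columns(X_transform)

print(cols_at_fit)        # ['age', 'income']
print(cols_at_transform)  # ['income']
```

If the callable were re-evaluated on every call, a transformer fitted on two columns would receive only one column at transform time.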

@jorisvandenbossche
Member Author

It would not be that hard to convert the callable to a set of integer indices at fit time (such a conversion mechanism already exists for the remainder functionality). The main question is how to store the result cleanly, separate from self.transformers (although we could say that in the fitted self.transformers_ we store the converted column selector instead of the original one; that could be quite clean).

@jorisvandenbossche
Member Author

It would not be that hard to convert the callable to a set of integer indices at fit time

Or not even necessarily "integer indices", we could also simply store what key(X) returns.
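The idea of evaluating the callable once and keeping its result could be sketched as follows. This is a toy under stated assumptions, not the scikit-learn implementation: `FitTimeSelector` is a hypothetical class, tables are plain dicts of column name to values, and `columns_` plays the role that the fitted `self.transformers_` would play, while `key` stays untouched like `self.transformers`.

```python
class FitTimeSelector:
    """Hypothetical toy selector: resolves a callable column spec at fit time."""

    def __init__(self, key):
        # `key` is either a list of column names or a callable taking X.
        # It is kept as passed in, so the constructor argument never changes.
        self.key = key

    def fit(self, X):
        # Evaluate the callable exactly once and store what key(X) returns.
        if callable(self.key):
            self.columns_ = list(self.key(X))
        else:
            self.columns_ = list(self.key)
        return self

    def transform(self, X):
        # Reuse the columns resolved in fit; never call `key` again.
        return {name: X[name] for name in self.columns_}


def starts_with_num(X):
    """Hypothetical callable: pick columns whose name starts with 'num_'."""
    return [name for name in X if name.startswith("num_")]


X_fit = {"num_a": [1, 2], "num_b": [3, 4], "txt": ["x", "y"]}
# At transform time an extra matching column appears.
X_new = {"num_a": [5], "num_b": [6], "num_c": [7], "txt": ["z"]}

selector = FitTimeSelector(starts_with_num).fit(X_fit)
out = selector.transform(X_new)

print(selector.columns_)  # ['num_a', 'num_b']
print(out)                # {'num_a': [5], 'num_b': [6]}
```

Because the selection is frozen in `fit`, the extra `num_c` column at transform time is ignored, and the original callable remains available on `selector.key`.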

@jnothman
Member

we could say that in the fitted self.transformers_ we store the converted column selector, instead of the original one, that could be quite clean

Yes, I think so. A PR?
