add 'most_frequent' drop method to OneHotEncoder · Issue #18553 · scikit-learn/scikit-learn · GitHub

add 'most_frequent' drop method to OneHotEncoder #18553


Open
kylegilde opened this issue Oct 7, 2020 · 5 comments


@kylegilde
kylegilde commented Oct 7, 2020

I would like to propose adding a 'most_frequent' method as one of the drop parameter options in OneHotEncoder.

I find that using the most frequent value as the reference level aids with interpreting the newly created OHE features. The 'first' method is not very intuitive.

It would also be helpful if a dropped_levels_ attribute were included, so that the dropped levels do not have to be derived from the categories_ and drop_idx_ attributes. Thanks
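For reference, the requested behaviour can be approximated today by passing one category per feature to the existing `drop` parameter. The following is only a minimal sketch, assuming string-valued features given as a list of rows; the helper names `most_frequent_levels` and `fit_dropping_most_frequent` are hypothetical and not part of scikit-learn.

```python
from collections import Counter


def most_frequent_levels(rows):
    """Return the most frequent value of each column in a list of rows.

    Ties go to the value that appears first in the column.
    """
    columns = zip(*rows)  # transpose rows into columns
    return [Counter(col).most_common(1)[0][0] for col in columns]


def fit_dropping_most_frequent(rows):
    """Fit a OneHotEncoder that drops each column's most frequent level.

    Hypothetical helper: it only wraps the existing array form of `drop`.
    """
    # Imported here so most_frequent_levels has no scikit-learn dependency.
    from sklearn.preprocessing import OneHotEncoder

    encoder = OneHotEncoder(drop=most_frequent_levels(rows))
    encoder.fit(rows)

    # Derive the dropped level per feature from categories_ and drop_idx_,
    # which is what a dropped_levels_ attribute would expose directly.
    dropped = [
        None if idx is None else cats[idx]
        for cats, idx in zip(encoder.categories_, encoder.drop_idx_)
    ]
    return encoder, dropped
```

Calling `fit_dropping_most_frequent(rows)` returns the fitted encoder together with the list of dropped levels, so the reference level of each feature is explicit when reading the encoded columns.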

@kylegilde kylegilde changed the title add 'most_frequent' drop method to OneHotEncoder add 'most_frequent' drop method to OneHotEncoder Oct 7, 2020
@trewaite
Contributor

I would find this quite useful as well. I am willing to make a pull request if the core development team approves.

@jnothman
Member
jnothman commented Oct 20, 2020 via email

@trewaite
Contributor

take

@drshnchndr

Is anybody working on this? If not, we would be happy to take it up.

@ogrisel
Member
ogrisel commented Sep 9, 2022

Related discussion: #23436.

6 participants