[WIP] 'most_frequent' drop method for OneHotEncoder by trewaite · Pull Request #18678 · scikit-learn/scikit-learn

[WIP] 'most_frequent' drop method for OneHotEncoder #18678

Closed
wants to merge 1 commit into from

trewaite
Contributor

…tribute

Reference Issues/PRs

#18553

What does this implement/fix? Explain your changes.

Added a 'most_frequent' option to the drop argument. This drops the most frequent category in each feature. If all categories have an equal count, the first category is dropped. Added a new categories_count attribute to help accomplish this.
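
The selection rule described above can be sketched in plain Python. This is only an illustration, not the code in this PR: the function name `most_frequent_drop_indices` is hypothetical, and it assumes categories are kept in sorted order and that any tie (not only a full tie) is resolved in favor of the first category.

```python
from collections import Counter


def most_frequent_drop_indices(columns):
    """For each feature column, return (categories, counts, drop_idx).

    Hypothetical helper for illustration only; it is not part of
    scikit-learn and does not reproduce the PR's implementation.
    """
    results = []
    for values in columns:
        counts_by_category = Counter(values)
        # Assumption: categories are stored in sorted order.
        categories = sorted(counts_by_category)
        counts = [counts_by_category[c] for c in categories]
        # list.index returns the first maximum, so ties go to the
        # first category.
        drop_idx = counts.index(max(counts))
        results.append((categories, counts, drop_idx))
    return results


columns = [
    ["a", "b", "b", "c"],  # "b" is the most frequent category
    ["x", "y", "x", "y"],  # all counts equal, so the first category wins
]
result = most_frequent_drop_indices(columns)
```

Here the per-feature `counts` list plays the role that the description assigns to the categories_count attribute: the index to drop is derived from it alone.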

Any other comments?

In the original issue, it was mentioned that a dropped_levels_ attribute would be helpful. Let me know if you would like this added as well.
