ENH Adds infrequent categories to OneHotEncoder #16018
Merged
adrinjalali merged 125 commits into scikit-learn:main from thomasjpfan:infrequent_one_hot_encoder_rb on Mar 14, 2022
9c5dec4
ENH Completely adds infrequent categories
thomasjpfan 6613645
STY Linting
thomasjpfan 741bd10
STY Linting
thomasjpfan f1ba191
DOC Improves wording
thomasjpfan ae3f873
DOC Lint
thomasjpfan d7eb2b6
Merge remote-tracking branch 'upstream/master' into infrequent_one_ho…
thomasjpfan dc4249b
BUG Fixes
thomasjpfan 5941d97
Merge remote-tracking branch 'upstream/master' into infrequent_one_ho…
thomasjpfan d539245
Merge remote-tracking branch 'upstream/master' into infrequent_one_ho…
thomasjpfan c070f16
CLN Address comments
thomasjpfan 3400e07
CLN Address comments
thomasjpfan 5defa0b
DOC Uses math to description float min_frequency
thomasjpfan 35d2470
DOC Adds comment regarding drop
thomasjpfan f59f18f
Merge remote-tracking branch 'upstream/master' into infrequent_one_ho…
thomasjpfan aec1430
BUG Fixes method name
thomasjpfan a64ffdd
DOC Clearer docstring
thomasjpfan f445018
TST Adds more tests
thomasjpfan c7c2fa9
Merge remote-tracking branch 'upstream/master' into infrequent_one_ho…
thomasjpfan 0516482
Merge remote-tracking branch 'upstream/master' into infrequent_one_ho…
thomasjpfan cac9d00
Merge remote-tracking branch 'upstream/master' into infrequent_one_ho…
thomasjpfan 462b46c
FIX Fixes mege
thomasjpfan a920d37
CLN More pythonic
thomasjpfan 9398229
CLN Address comments
thomasjpfan 3a3eb5d
STY Flake8
thomasjpfan e5c4eef
CLN Address comments
thomasjpfan a249944
Merge remote-tracking branch 'upstream/master' into infrequent_one_ho…
thomasjpfan 78fa495
DOC Fix
thomasjpfan 0c431ed
MRG
thomasjpfan 8cf73fa
WIP
thomasjpfan ecf9e7b
ENH Address comments
thomasjpfan 9a40eb7
STY Fix
thomasjpfan 33a653f
Merge remote-tracking branch 'upstream/master' into infrequent_one_ho…
thomasjpfan eb8b501
ENH Use functiion call instead of property
thomasjpfan 56aec01
Merge remote-tracking branch 'upstream/master' into infrequent_one_ho…
thomasjpfan dadaac2
Merge remote-tracking branch 'upstream/master' into infrequent_one_ho…
thomasjpfan 8660704
ENH Adds counts feature
thomasjpfan b8a883f
CLN Rename variables
thomasjpfan 29005b1
DOC More details
thomasjpfan 03c8d4d
CLN Remove unneeded line
thomasjpfan f669c54
CLN Less lines is less complicated
thomasjpfan 23ba7fd
Merge remote-tracking branch 'upstream/master' into infrequent_one_ho…
thomasjpfan ffe2976
CLN Less diffs
thomasjpfan 8979f0b
CLN Improves readiabilty
thomasjpfan 530a3fe
Merge remote-tracking branch 'upstream/master' into infrequent_one_ho…
thomasjpfan a1ed299
Merge remote-tracking branch 'upstream/master' into infrequent_one_ho…
thomasjpfan 41a29b0
BUG Fix
thomasjpfan 0d58dc3
Merge remote-tracking branch 'upstream/master' into infrequent_one_ho…
thomasjpfan 5183d3c
Merge remote-tracking branch 'upstream/master' into infrequent_one_ho…
thomasjpfan ef2ebf6
Merge remote-tracking branch 'upstream/master' into infrequent_one_ho…
thomasjpfan a1cff1f
CLN Address comments
thomasjpfan e222452
TST Fix
thomasjpfan 0d5942c
Merge branch 'master' into infrequent_one_hot_encoder_rb
jnothman d5f85d4
Merge remote-tracking branch 'upstream/master' into infrequent_one_ho…
thomasjpfan db96b44
CLN Address comments
thomasjpfan dc73894
CLN Address comments
thomasjpfan 1a686b5
CLN Move docstring to userguide
thomasjpfan 853f54d
DOC Better wrapping
thomasjpfan 5ad5917
TST Adds test to handle_unknown='error'
thomasjpfan 7414e26
ENH Spelling error in docstring
thomasjpfan 99de0a6
Merge remote-tracking branch 'upstream/master' into infrequent_one_ho…
thomasjpfan 265d85e
BUG Fixes counter with nan values
thomasjpfan 090c594
BUG Removes unneeded test
thomasjpfan 998272d
Merge remote-tracking branch 'upstream/master' into infrequent_one_ho…
thomasjpfan 8411e3d
BUG Fixes issue
thomasjpfan 213e3c3
Merge remote-tracking branch 'upstream/master' into infrequent_one_ho…
thomasjpfan ec6e23f
ENH Sync with main
thomasjpfan a730bce
DOC Correct settings
thomasjpfan 97e9f7a
DOC Adds docstring
thomasjpfan e1f72d9
Merge remote-tracking branch 'upstream/main' into infrequent_one_hot_…
thomasjpfan 433ccd7
DOC Immprove user guide
thomasjpfan ecb82df
DOC Move to 1.0
thomasjpfan 35d0544
DOC Update docs
thomasjpfan 274c090
TST Remove test
thomasjpfan abc504e
DOC Update docstring
thomasjpfan 4df4b29
Merge remote-tracking branch 'upstream/main' into infrequent_one_hot_…
thomasjpfan 484070a
STY Linting
thomasjpfan 6088f9e
Merge remote-tracking branch 'upstream/main' into infrequent_one_hot_…
thomasjpfan c48ada2
DOC Address comments
thomasjpfan 1922b32
ENH Neater code
thomasjpfan 91fa58b
DOC Update explaination for auto
thomasjpfan a68ce31
Update sklearn/preprocessing/_encoders.py
thomasjpfan 6f0c542
Merge remote-tracking branch 'upstream/main' into infrequent_one_hot_…
thomasjpfan 3e305ef
TST Uses docstring instead of comments
thomasjpfan fec44b2
TST Remove call to fit
thomasjpfan e4ad665
TST Spelling error
thomasjpfan 10b8aec
ENH Adds support for drop + infrequent categories
thomasjpfan ef86eb1
ENH Adds infrequent_if_exist option
thomasjpfan ad639e9
Merge remote-tracking branch 'upstream/main' into infrequent_one_hot_…
thomasjpfan 61d1ddb
DOC Address comments for user guide
thomasjpfan 2493223
DOC Address comments for whats_new
thomasjpfan a9f643f
DOC Update docstring based on comments
thomasjpfan 1de557a
CLN Update test with suggestions
thomasjpfan 058112e
ENH Adds computed property infrequent_categories_
thomasjpfan 7ab2434
DOC Adds where the infrequent column is located
thomasjpfan aa7d5cf
TST Adds more test for infrequent_categories_
thomasjpfan 939123c
DOC Adds docstring for _compute_drop_idx
thomasjpfan 6a467ac
CLN Moves _convert_to_infrequent_idx into its own method
thomasjpfan f11ccff
TST Increases test coverage
thomasjpfan fac1f21
TST Adds failing test
thomasjpfan 87a06fb
CLN Careful consideration of dropped and inverse_transform
thomasjpfan 49aaa23
STY Linting
thomasjpfan cd3d29b
DOC Adds docstrinb about dropping infrequent
thomasjpfan 01bc992
Merge remote-tracking branch 'upstream/main' into infrequent_one_hot_…
thomasjpfan 06397b2
DOC Uses only
thomasjpfan 9af51ba
Merge remote-tracking branch 'upstream/main' into infrequent_one_hot_…
thomasjpfan f7c8839
Merge branch 'main' into infrequent_one_hot_encoder_rb
glemaitre 48a03ea
DOC Numpydoc
thomasjpfan 388e2f3
Merge remote-tracking branch 'upstream/main' into infrequent_one_hot_…
thomasjpfan e36ca57
TST Includes test for get_feature_names_out
thomasjpfan 6bbc6d4
DOC Move whats new
thomasjpfan 23ae2e8
DOC Address docstring comments
thomasjpfan 523625e
Merge remote-tracking branch 'upstream/main' into infrequent_one_hot_…
thomasjpfan 7980c6e
DOC Docstring changes
thomasjpfan 552c983
TST Better comments
thomasjpfan 4deb105
TST Adds check for handle_unknown='ignore' for infrequent
thomasjpfan ecb2a44
CLN Make _infrequent_indices private
thomasjpfan e7d8301
CLN Change min_frequency default to None
thomasjpfan 0bc1fee
DOC Adds comments
thomasjpfan c802291
ENH adds support for max_categories=1
thomasjpfan 10137a5
ENH Describe lexicon ordering for ties
thomasjpfan 0da2ee1
DOC Better docstring
thomasjpfan 07b38bd
STY Fix
thomasjpfan 2e28bb0
Merge remote-tracking branch 'upstream/main' into infrequent_one_hot_…
thomasjpfan cf73b27
CLN Error when explicity dropping an infrequent category
thomasjpfan 66306a4
STY Grammar
thomasjpfan