[MRG] Native support for missing values in GBDTs #13911
Merged: adrinjalali merged 103 commits into scikit-learn:master from NicolasHug:missing_value_gbdt on Aug 21, 2019
Commits
e279161 Added NaN support in mapper (NicolasHug)
91105a6 pep (NicolasHug)
000ab9a WIP (NicolasHug)
66c2502 some more (NicolasHug)
810b7b0 WIP (NicolasHug)
5fd59cb WIP (NicolasHug)
670566b bug fix (NicolasHug)
e338e0a basic tests (NicolasHug)
d288518 some doc (NicolasHug)
2d1659b avoid some interactions (NicolasHug)
f2a83a0 Added tag (NicolasHug)
cd1de3c better test (NicolasHug)
5cd8e59 decent test + fix bug (NicolasHug)
af1558a Merge branch 'master' of github.com:scikit-learn/scikit-learn into mi… (NicolasHug)
d6b73ed add missing_fraction param to benchmark (NicolasHug)
5e06fa7 bin training and validation data separately (NicolasHug)
1a34856 shorter test (NicolasHug)
aae10a2 Map missing values to first bin instead of last (NicolasHug)
35eda6e pep8 (NicolasHug)
1fa9b26 Merge branch 'master' of github.com:scikit-learn/scikit-learn into mi… (NicolasHug)
1f63282 Added whats new entry (NicolasHug)
e3d34a9 avoid some python interactions (NicolasHug)
542cb25 make predict_binned work (NicolasHug)
bf822b4 fixed bug due to offset in bin_thresholds_ attribute (NicolasHug)
112b400 more sensible binning strat (NicolasHug)
21a3ee3 typo (NicolasHug)
28c15b2 user name (NicolasHug)
5a5f39d Add small test (NicolasHug)
a4da8d0 Merge branch 'master' of github.com:scikit-learn/scikit-learn into mi… (NicolasHug)
b07fed9 convert to fortran array in tests (NicolasHug)
b78e96b some doc (NicolasHug)
a9f878c Added function test (NicolasHug)
71b64e8 pep8 (NicolasHug)
0af212f Merge branch 'master' of github.com:scikit-learn/scikit-learn into bi… (NicolasHug)
2c2373e Bin validation data using binmaper of training data (NicolasHug)
0e8edd1 Merge branch 'bin_train_val_separately' into missing_value_gbdt (NicolasHug)
deda348 Allocate first bin for missing entries based on the whole data, not just (NicolasHug)
e8fcc31 Merge branch 'master' of github.com:scikit-learn/scikit-learn into mi… (NicolasHug)
1a471ce Merge branch 'master' of github.com:scikit-learn/scikit-learn into mi… (NicolasHug)
1f78807 Merge branch 'master' of github.com:scikit-learn/scikit-learn into mi… (NicolasHug)
3fed0ab Addressed Thomas' comments (NicolasHug)
3e7bb7d Merge branch 'master' of github.com:scikit-learn/scikit-learn into mi… (NicolasHug)
7ad5bce Update sklearn/ensemble/_hist_gradient_boosting/tests/test_grower.py (NicolasHug)
4a9dc3a Merge branch 'master' of github.com:scikit-learn/scikit-learn into mi… (NicolasHug)
489b861 Merge branch 'missing_value_gbdt' of github.com:NicolasHug/scikit-lea… (NicolasHug)
e83b39e Addressed Guillaume's comments (NicolasHug)
c80b250 Merge branch 'master' of github.com:scikit-learn/scikit-learn into mi… (NicolasHug)
d2de00b always allocate first bin for missing values (NicolasHug)
26b66ab reduce diff (NicolasHug)
f370a71 minor more consistent test (NicolasHug)
ec57171 typo (NicolasHug)
beae859 Merge branch 'master' of github.com:scikit-learn/scikit-learn into mi… (NicolasHug)
e82a5f4 Merge branch 'master' of github.com:scikit-learn/scikit-learn into mi… (NicolasHug)
92f3e28 Merge branch 'master' of github.com:scikit-learn/scikit-learn into mi… (NicolasHug)
2dfaad8 WIP (NicolasHug)
457e720 some doc (NicolasHug)
af5ef38 Merge branch 'master' of github.com:scikit-learn/scikit-learn into mi… (NicolasHug)
45c5068 reduce diff (NicolasHug)
5a8fbe5 pep8 (NicolasHug)
bc9c0df Merge branch 'master' of github.com:scikit-learn/scikit-learn into mi… (NicolasHug)
889835a minor (NicolasHug)
8995db4 Merge branch 'master' of github.com:scikit-learn/scikit-learn into mi… (NicolasHug)
8d5e36e remove prints (NicolasHug)
d28ab14 towards nan only splits (adrinjalali)
48fa149 don't check right to left on split_on_nan (adrinjalali)
76e18f8 cleaups (adrinjalali)
eb0f7e6 format and comment (adrinjalali)
e0abc50 Fixed bug + added more tests (NicolasHug)
77846a3 refactor tests (NicolasHug)
14d444f put back n_threads to max value (NicolasHug)
8fb80fd minor changes (NicolasHug)
0440398 minor cleaning (NicolasHug)
0fde968 Support splitting on nans (NicolasHug)
5301c5a Merge branch 'master' of github.com:scikit-learn/scikit-learn into mi… (NicolasHug)
210f90f Merge branch 'missing_value_gbdt' of github.com:NicolasHug/scikit-lea… (NicolasHug)
4b0176a Add (failing) test that checks equivalence with min max imputation (ogrisel)
d38881c Decrease the likelihood of ties when training the trees (ogrisel)
f5e8e45 More robust test (ogrisel)
a0963fb Fix pytest parametrization (ogrisel)
d0be6cb Check bin thresholds in test (ogrisel)
9c9d7e5 Try to make the test even easier to see if the Linux 32bit build woul… (ogrisel)
191cfc6 Merge branch 'master' of github.com:scikit-learn/scikit-learn into mi… (NicolasHug)
5fc7453 Merge branch 'missing_value_gbdt' of github.com:NicolasHug/scikit-lea… (NicolasHug)
75dc126 Don't check last non-missing bin if there's no nan (NicolasHug)
3b2075c Improve min-max imputation test (ogrisel)
a66103c FIX: _find_best_bin_to_split_right_to_left is still required even whe… (ogrisel)
0bda5d1 Merge branch 'master' of github.com:scikit-learn/scikit-learn into mi… (NicolasHug)
49140c2 comments (NicolasHug)
e39f48e remove split_on_nan (NicolasHug)
f89c1c5 ooops deleted useless files (NicolasHug)
299d3e0 Got rid of individual checks in predictor code (NicolasHug)
9540f99 can also remove special case in binning code (NicolasHug)
cb3936d minor typos + more consistent test (NicolasHug)
c8f6409 renamed types -> common (NicolasHug)
6f0e191 1e300 -> almost inf (NicolasHug)
a56db0b Merge branch 'master' of github.com:scikit-learn/scikit-learn into mi… (NicolasHug)
c112335 added user guide section on missing values (NicolasHug)
0c5dc90 Merge branch 'master' into missing_value_gbdt (ogrisel)
ef5cce2 Merge branch 'master' of github.com:scikit-learn/scikit-learn into mi… (ogrisel)
3b0c2ba Addressed Olivier's comment + updated whatsnew (NicolasHug)
7c868ae addressed comments (NicolasHug)
876f538 Fix doctest formatting (ogrisel)
601dc22 Fix nan predictive doctest (ogrisel)