Infinite loop when running isotonic regression with some zero-valued weights · Issue #4297 · scikit-learn/scikit-learn · GitHub

Infinite loop when running isotonic regression with some zero-valued weights #4297


Closed
ogrisel opened this issue Feb 26, 2015 · 8 comments

@ogrisel
Member
ogrisel commented Feb 26, 2015

I extracted the following bug from the discussion in #2507 (comment):

import numpy as np
import sklearn.isotonic

regression = sklearn.isotonic.IsotonicRegression()
n_samples = 60

# Noisy increasing data with random positive weights
x = np.linspace(-3, 3, n_samples)
y = x + np.random.uniform(size=n_samples)
w = np.random.uniform(size=n_samples)
# Set a few weights to exactly zero; this is what triggers the infinite loop
w[5:8] = 0
regression.fit(x, y, sample_weight=w)

This bug alone should probably be considered a release critical bug for 0.16.

@ogrisel ogrisel added the Bug label Feb 26, 2015
@ogrisel ogrisel added this to the 0.16 milestone Feb 26, 2015
@ogrisel ogrisel changed the title Infinite loop when running isotonic regression Infinite loop when running isotonic regression with some zero-valued weights Feb 26, 2015
mjbommar added a commit to mjbommar/scikit-learn that referenced this issue Feb 27, 2015
mjbommar added a commit to mjbommar/scikit-learn that referenced this issue Feb 27, 2015
@mjbommar
Contributor

@ogrisel, I have a fix in my personal repo, but I want to wait until the work @amueller and I did in #4302 is done to minimize the mess.

@amueller
Member

I don't understand the fix. How does that work? sample_weight needs to have the same shape as X and y, right? I think after #4302, it is just a matter of dropping the points with zero weight in fit.
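
The idea mentioned above, dropping the points with zero weight before fitting, can be sketched roughly as follows. This is only an illustration of that suggestion, not the fix that was actually merged; the helper name `drop_zero_weight_points` and the fixed seed are assumptions made for the example.

```python
import numpy as np


def drop_zero_weight_points(x, y, sample_weight):
    """Hypothetical helper: keep only the points whose weight is positive."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    sample_weight = np.asarray(sample_weight, dtype=float)
    if not (x.shape == y.shape == sample_weight.shape):
        raise ValueError("x, y and sample_weight must have the same shape")
    keep = sample_weight > 0  # boolean mask of points that carry weight
    return x[keep], y[keep], sample_weight[keep]


# Same setup as the report above, with a fixed seed (an assumption)
# so that the run is repeatable.
rng = np.random.RandomState(42)
n_samples = 60
x = np.linspace(-3, 3, n_samples)
y = x + rng.uniform(size=n_samples)
w = rng.uniform(size=n_samples)
w[5:8] = 0

x_kept, y_kept, w_kept = drop_zero_weight_points(x, y, w)
print(x_kept.shape, y_kept.shape, w_kept.shape)
```

The filtered arrays could then be passed on as `regression.fit(x_kept, y_kept, sample_weight=w_kept)`, so the fitting routine never sees a zero weight.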

@amueller
Member

Never mind, I misread your fix, it is good. It only works after removing fit_transform in #4302, though.

@mjbommar
Contributor

@amueller, yup, which is why I wanted to wait to pull against master until after #4302 is merged :)

@ogrisel
Member Author
ogrisel commented Mar 5, 2015

Apart from the rng comment, your fix LGTM (once #4302 is merged ;)

@mjbommar
Contributor
mjbommar commented Mar 5, 2015

Thanks, good catch. Sorry for missing that. Done now:
mjbommar@6e9d254

amueller pushed a commit to amueller/scikit-learn that referenced this issue Mar 6, 2015
amueller added a commit that referenced this issue Mar 6, 2015
[MRG + 2] Adding fix for issue #4297, isotonic infinite loop
@mjbommar
Contributor
mjbommar commented Mar 6, 2015

We are good on this thanks to @amueller's work today. I believe it can be closed.

@GaelVaroquaux
Member

Thanks!

cemoody pushed a commit to cemoody/scikit-learn that referenced this issue Mar 7, 2015
rasbt pushed a commit to rasbt/scikit-learn that referenced this issue Apr 6, 2015
yarikoptic added a commit to yarikoptic/scikit-learn that referenced this issue Jul 11, 2015
* tag '0.16b1': (1589 commits)
  0.16.X branching, version 0.16b1
  Fix scikit-learn#4351. Rendering of docs in MinMaxScaler.
  Fix rebase conflict
  MAINT use canonical PEP-440 dev version consistently
  Adding fix for issue scikit-learn#4297, isotonic infinite loop
  DOC deprecate random_state for DBSCAN
  FIX/TST boundary cases in dbscan (closes scikit-learn#4073)
  Do not shuffle in DBSCAN (warn if `random_state` is used).
  Update docstring predict_proba()
  Update documentation of predict_proba in tree module
  add scipy2013 tutorial links to presentations on website.
  TST boundary handling in LSHForest.radius_neighbors
  ENH improve docstrings and test for radius_neighbors models
  use a pipeline for pre-processing feature selection, as per best practise
  DOC remove unnecessary backticks in CONTRIBUTING.
  ENH no need for tie breaking jitter in calibration
  Implement "secondary" tie strategy in isotonic.
  Adding unit test to cover ties/duplicate x values in Isotonic Regression re: issue scikit-learn#4184
  MAINT fix typo pyagm -> pygamg in SkipTest
  STYLE trailing spaces
  ...