Increase the max_iter for LabelPropagation. by musically-ut · Pull Request #9441 · scikit-learn/scikit-learn · GitHub

Increase the max_iter for LabelPropagation. #9441

Merged
merged 1 commit into from
Jul 27, 2017

Conversation

musically-ut
Contributor

In practice, LabelPropagation converges much more slowly than LabelSpreading. The default of max_iter=30 works well for LabelSpreading but not for LabelPropagation. This PR changes the default max_iter for LabelPropagation to 1000.

This was extracted from #5893.
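
For illustration only, here is a minimal sketch of how the iteration counts of the two estimators could be compared. The dataset (iris), the 70% unlabeled fraction, the random seed and the helper name `iterations_used` are assumptions made for this example and are not taken from the PR; only `max_iter`, `n_iter_` and the two estimator classes come from scikit-learn itself.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.semi_supervised import LabelPropagation, LabelSpreading

# Assumption: iris with roughly 70% of the labels hidden, purely for illustration.
X, y = load_iris(return_X_y=True)
rng = np.random.RandomState(0)
y_partial = y.copy()
y_partial[rng.rand(len(y)) < 0.7] = -1  # -1 marks a sample as unlabeled


def iterations_used(model):
    """Hypothetical helper: fit on the partially labeled data, return n_iter_."""
    model.fit(X, y_partial)
    return model.n_iter_


# max_iter is passed explicitly so the comparison does not depend on the
# defaults of the installed scikit-learn version.
lp_iters = iterations_used(LabelPropagation(kernel="rbf", max_iter=1000))
ls_iters = iterations_used(LabelSpreading(kernel="rbf", max_iter=30))

print("LabelPropagation iterations:", lp_iters)
print("LabelSpreading iterations:", ls_iters)
```

If `n_iter_` equals `max_iter`, the run most likely stopped at the iteration cap rather than at the tolerance, which is the situation this PR is meant to make less common for LabelPropagation.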

@jnothman
Member

Does this correspond, approximately, to 30 in LabelSpreading, or is it much more generous?

@jnothman jnothman added this to the 0.19 milestone Jul 24, 2017
@jnothman
Member

Btw, I think we should change this for the 0.19 release, seeing as we're basically overturning LabelPropagation there anyway.

@jnothman
Member

@musically-ut: If you can briefly illustrate the number of iterations needed to converge (n_iter_) with a couple of datasets, I'll be happy to merge this.

@ogrisel
Member
ogrisel commented Jul 27, 2017

I tried the scaled digits dataset with 90% unlabeled data and the RBF kernel. For good cross-validated values of gamma (e.g. gamma between 1 and 20), the effective number of iterations can vary between 2 and 3000 (I had to increase tol to 0.01).

So indeed max_iter=30 is much too low. I think max_iter=1000 is a reasonable default.
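
A rough sketch of how such a check could be reproduced is shown below. It is not the script used in this thread: the scaling (dividing pixel values by 16), the random seed, the way labels are hidden and the particular gamma values are all assumptions chosen for illustration, so the iteration counts it prints may differ from the numbers quoted above.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.semi_supervised import LabelPropagation

X, y = load_digits(return_X_y=True)
X = X / 16.0  # assumption: "scaled" is taken to mean pixel values mapped to [0, 1]

# Assumption: hide about 90% of the labels at random (-1 means unlabeled).
rng = np.random.RandomState(42)
y_train = y.copy()
y_train[rng.rand(len(y)) < 0.9] = -1

n_iters = {}
for gamma in (1, 5, 20):  # a few values from the range mentioned in the comment
    model = LabelPropagation(kernel="rbf", gamma=gamma, max_iter=1000, tol=0.01)
    model.fit(X, y_train)
    n_iters[gamma] = model.n_iter_  # iterations actually run for this gamma

for gamma, n in n_iters.items():
    print(f"gamma={gamma}: n_iter_={n}")
```

A value of `n_iter_` equal to `max_iter` suggests the cap was reached before the tolerance was met, in which case scikit-learn may also emit a ConvergenceWarning.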

@ogrisel ogrisel merged commit a4fe183 into scikit-learn:master Jul 27, 2017
ogrisel pushed a commit that referenced this pull request Jul 27, 2017
@ogrisel
Member
ogrisel commented Jul 27, 2017

Backported to the 0.19.X branch as c51aee8.

@musically-ut
Contributor Author
musically-ut commented Jul 27, 2017

Oh, cool. Thanks @ogrisel!

I too was composing a benchmark with the datasets; I noticed that we may have to change the example of digit learning in light of the changed underlying algorithm.

@ogrisel
Member
ogrisel commented Jul 27, 2017

The digit learning example seems to be fine on the dev branch:

http://scikit-learn.org/dev/auto_examples/semi_supervised/plot_label_propagation_digits_active_learning.html

@musically-ut
Contributor Author

Yes, the example works with LabelSpreading and the accuracy is still good with a large number of labels.

However, is it just me, or is the table much harder to read on the dev branch than on 0.18.2?

@ogrisel
Member
ogrisel commented Jul 27, 2017

This is a matplotlib issue; we should use interpolation='nearest'. But it's unrelated to the change in the algorithm.
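
As a small sketch of what that suggestion looks like in isolation: the matrix below is made-up data, and which plotting call in the example would need the change is not stated in this thread. Passing `interpolation="nearest"` to `imshow` draws each cell as a solid block instead of blending neighbouring cells.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

# Made-up 3x3 matrix, used only to demonstrate the keyword argument.
cm = np.array([[13, 0, 1],
               [0, 10, 2],
               [1, 3, 9]])

fig, ax = plt.subplots()
# interpolation="nearest" keeps cell boundaries sharp instead of smoothing them.
im = ax.imshow(cm, interpolation="nearest", cmap="Blues")
fig.colorbar(im, ax=ax)
fig.canvas.draw()  # render once; use fig.savefig(...) or plt.show() in practice
```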

@amueller
Member

@ogrisel I think I prefer backporting just before the release, because otherwise it's hard to keep track of what to backport.

jnothman pushed a commit to jnothman/scikit-learn that referenced this pull request Aug 6, 2017
dmohns pushed a commit to dmohns/scikit-learn that referenced this pull request Aug 7, 2017
dmohns pushed a commit to dmohns/scikit-learn that referenced this pull request Aug 7, 2017
paulha pushed a commit to paulha/scikit-learn that referenced this pull request Aug 19, 2017
AishwaryaRK pushed a commit to AishwaryaRK/scikit-learn that referenced this pull request Aug 29, 2017
maskani-moh pushed a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017
jwjohnson314 pushed a commit to jwjohnson314/scikit-learn that referenced this pull request Dec 18, 2017