Increase the max_iter for LabelPropagation. #9441

musically-ut · 2017-07-24T10:44:28Z

In practice, LabelPropagation converges much more slowly than LabelSpreading. The default of max_iter=30 works well for LabelSpreading but not for LabelPropagation. This PR changes the default max_iter for LabelPropagation to 1000.

This was extracted from #5893.

jnothman · 2017-07-24T10:49:20Z

Does this correspond, approximately, to 30 in LabelSpreading, or is it much more generous?

jnothman · 2017-07-24T10:50:09Z

Btw, I think we should change this for the 0.19 release, seeing as we're basically overturning LabelPropagation there anyway.

jnothman · 2017-07-24T10:51:52Z

@musically-ut: If you can briefly illustrate the n_iter_ values at convergence on a couple of datasets, I'll be happy to merge this.

ogrisel · 2017-07-27T13:45:50Z

I tried the scaled digits dataset with 90% unlabeled data and the RBF kernel. For values of gamma that cross-validate well (e.g. gamma between 1 and 20), the effective number of iterations can vary between 2 and 3000 (I had to increase tol to 0.01).

So indeed max_iter=30 is much too low. I think max_iter=1000 is a reasonable default.
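For readers who want to run a similar check, here is a minimal sketch. It is not the script behind the numbers above. The helper name `count_iterations`, the scaling by 16, the random masking and the seed are all assumptions made for illustration, so the iteration counts it reports need not match those quoted in this thread.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.semi_supervised import LabelPropagation, LabelSpreading


def count_iterations(gamma, max_iter=1000, tol=0.01,
                     unlabeled_fraction=0.9, seed=0):
    """Fit both estimators on partly unlabeled digits and report n_iter_.

    Hypothetical helper: the preprocessing and masking are assumptions,
    not the setup used in the discussion.
    """
    X, y = load_digits(return_X_y=True)
    X = X / 16.0  # assumed scaling: pixel values mapped to [0, 1]

    # Mark roughly `unlabeled_fraction` of the samples as unlabeled (-1).
    rng = np.random.RandomState(seed)
    y_train = y.copy()
    y_train[rng.rand(len(y)) < unlabeled_fraction] = -1

    results = {}
    for Model in (LabelPropagation, LabelSpreading):
        model = Model(kernel="rbf", gamma=gamma, max_iter=max_iter, tol=tol)
        model.fit(X, y_train)
        # n_iter_ is the number of iterations actually run by fit.
        results[Model.__name__] = model.n_iter_
    return results


if __name__ == "__main__":
    for gamma in (1, 5, 20):
        print(gamma, count_iterations(gamma))
```

If an estimator reports n_iter_ equal to max_iter, it most likely stopped at the iteration cap rather than at the tolerance, which is the situation this PR is trying to avoid for LabelPropagation.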

ogrisel · 2017-07-27T13:48:32Z

Backported to the 0.19.X branch as c51aee8.

musically-ut · 2017-07-27T13:50:14Z

Oh, cool. Thanks @ogrisel!

I too was composing a benchmark with the datasets; I noticed that we may have to change the example of digit learning in light of the changed underlying algorithm.

ogrisel · 2017-07-27T13:57:15Z

The digit learning example seems to be fine on the dev branch:

http://scikit-learn.org/dev/auto_examples/semi_supervised/plot_label_propagation_digits_active_learning.html

musically-ut · 2017-07-27T14:02:25Z

Yes, the example works with LabelSpreading and the accuracy is still good with a large number of labels.

However, is it only me or is the table much harder to read on the dev branch than on 0.18.2?

ogrisel · 2017-07-27T14:04:54Z

This is a matplotlib issue; we should use interpolation='nearest'. But it's unrelated to the change in the algorithm.
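As a rough illustration of that suggestion, the sketch below passes interpolation='nearest' explicitly to imshow. The function name `show_grid` and the 8x8 placeholder array are assumptions and are not taken from the gallery example. An explicit argument takes precedence over the `image.interpolation` default of the installed matplotlib.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np


def show_grid(values):
    """Draw a small 2D array with one flat colored cell per entry."""
    fig, ax = plt.subplots()
    # interpolation="nearest" disables smoothing between neighboring cells.
    image = ax.imshow(values, cmap=plt.cm.gray_r, interpolation="nearest")
    ax.set_axis_off()
    return fig, image


if __name__ == "__main__":
    # Placeholder data with the same 8x8 shape as a digits image.
    fig, _ = show_grid(np.arange(64).reshape(8, 8))
    fig.savefig("grid.png")
```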

amueller · 2017-07-27T16:02:16Z

@ogrisel I think I prefer backporting just before the release, because otherwise it's hard to keep track of what to backport.

Increase the max_iter for LabelPropagation.

002f37b

LabelPropagation converges much slower than LabelSpreading. The default of max_iter=30 works well for LabelSpreading but not for LabelPropagation. This was extracted from scikit-learn#5893.

jnothman added this to the 0.19 milestone Jul 24, 2017

jnothman added the Blocker label Jul 24, 2017

ogrisel merged commit a4fe183 into scikit-learn:master Jul 27, 2017
