Closes #3550, #5774, #3758, #6727.
Travis likes it.
Files for my dev environment with Docker
f37cff0
Fixing label clamping (alpha=0 for hard clamping)
f725281
Deprecating alpha, fixing its value to zero
f609105
Correct way to deprecate alpha for LabelPropagation
3c4f627
The previous way was breaking the test sklearn.tests.test_common.test_all_estimators
Detailed info for LabelSpreading's alpha parameter
2499098
Based on the original paper.
Minor changes in the deprecation message
2c0645b
Improving "deprecated" doc string and raising DeprecationWarning
606d65e
Using a local "alpha" in "fit" to deprecate LabelPropagation's alpha
7b267a8
This solution isn't great, but it sets the correct value for alpha without violating the restrictions imposed by the tests.
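For anyone reviewing, the pattern in question is roughly the following. This is a simplified sketch rather than the actual scikit-learn code (the class name and parameter handling are illustrative only): the public alpha stays in __init__ for backward compatibility, and fit warns and substitutes the fixed value locally.

```python
import warnings


class LabelPropagationSketch:
    """Hypothetical, simplified illustration of the deprecation approach."""

    def __init__(self, alpha=None):
        # kept only for backward compatibility; the value is ignored
        self.alpha = alpha

    def fit(self, X, y):
        if self.alpha is not None:
            warnings.warn(
                "alpha is deprecated for LabelPropagation and will be "
                "removed; hard clamping (alpha=0) is always used.",
                DeprecationWarning)
        alpha = 0  # local value actually used by the algorithm
        # ... run label propagation with hard clamping here ...
        return self
```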
Removal of my development files
bd1a06c
Using sphinx's "deprecated" tag (jnothman's suggestion)
2662196
Deprecation warning: stating that the alpha's value will be ignored
551feec
Use __init__ with alpha=None
91b7f9a
Merge branch 'master' into lpalpha
c5b515e
Update what's new
69b3e89
Merge pull request #2 from jnothman/lpalpha
297c16b
Changes to fix scikit-learn#5774 (label clamping)
Merge branch 'master' into issue-5774
95f73ef
Try fix RuntimeWarning in test_alpha_deprecation
10d82d5
@boechat107, it looks like kernel='knn' makes Travis happy. If you'd rather adopt this fix in your branch, that's fine. In either case, I hope we can get some quick reviews and merge soon.
DOC Indent deprecation details
70623f0
DOC wording
e778159
I've made another couple of small documentation fixes here.
Just a minor comment related to narrative docs.
I'm confused: the algorithm and the documentation say that alpha is the percentage of the initial class distribution that will be maintained. Quoting the documentation:
"The LabelPropagation algorithm performs hard clamping of input labels, which means \alpha=1."
In that case, why does this default to zero assuming the definition of alpha is the same?
Nvm, I just read the documentation below. Can you please change the narrative documentation so that there is no confusion?
Maybe I need to check a bit further that the semantics of this fix are right. I feel the what's new entry is understating the change, particularly because of the change from ...[unlabelled] to ...[~unlabelled].
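In case the difference isn't obvious at a glance, the two index expressions touch complementary sets of rows. A tiny, made-up numpy illustration (the array names are mine, not the library's):

```python
import numpy as np

y_static = np.ones((6, 2))
unlabeled = np.array([True, False, True, False, True, False])

a = y_static.copy()
a[unlabeled] = 0     # zeroes the unlabelled rows
b = y_static.copy()
b[~unlabeled] = 0    # zeroes the labelled rows instead
```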
Okay, so I understand why the test was failing: part of it was the implementation of the algorithm, and part of it was a misinterpretation of the clf._build_graph() function.
The test should have looked like the following:
```python
n_classes = 2
X, y = make_classification(n_classes=n_classes, n_samples=200, random_state=0)
y[::3] = -1
clf = SS.label_propagation.LabelSpreading().fit(X, y)
# adopting notation from Zhou et al:
# W = clf._build_graph()
# D = np.diag(W.sum(axis=1))
# Dinvroot = scipy.linalg.sqrtm(np.linalg.inv(D))
# S = np.dot(np.dot(Dinvroot, W), Dinvroot)
S = clf._build_graph()
Y = np.zeros((len(y), n_classes + 1))
Y[np.arange(len(y)), y] = 1
Y = Y[:, :-1]
for alpha in [0.1, 0.3, 0.5, 0.7, 0.9]:
    expected = np.dot(np.linalg.inv(np.eye(len(S)) - alpha * S), Y)
    expected /= expected.sum(axis=1)[:, np.newaxis]
    clf = SS.label_propagation.LabelSpreading(max_iter=10000, alpha=alpha)
    clf.fit(X, y)
    assert_array_almost_equal(expected, clf.label_distributions_, 4)
```
That is, the clf._build_graph() function directly returns S instead of returning W.
Then, the "actual" algorithm talked about in the paper is the following (our modifications are commented out):
```python
# clamp_weights = np.ones((n_samples, 1))
# clamp_weights[~unlabeled, 0] = alpha
# TODO TESTING
clamp_weights = alpha * np.ones((n_samples, 1))
# ...
if alpha > 0.:
    y_static *= 1 - alpha
# TODO TESTING
# y_static[unlabeled] = 0
```
I have this version implemented in this branch of my local fork; one can check out that branch and verify that the sanity-check test does succeed in that case.
I can sort of see that the modifications we have made to the algorithm make sense. Similar guarantees can probably be worked out for this modified version as well, but I would be far more comfortable just using the version from the paper, or finding a reference, rather than using unpublished variants.
What do you think?
So we could say:

```
alpha > 0:  F = alpha * F + (1 - alpha) * Y
alpha = 0:  F = {F[i] if i unlabelled else Y[i]}
```
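In numpy terms, I read that as something like the sketch below (assuming F is the current label distribution, Y the initial one-hot labels, and unlabelled a boolean mask; this is not the actual implementation):

```python
import numpy as np


def clamp(F, Y, unlabelled, alpha):
    """Illustrative clamping step, not the library code."""
    if alpha > 0:
        # soft clamping: blend the current estimate with the initial labels
        return alpha * F + (1 - alpha) * Y
    # hard clamping: keep F only on unlabelled points, reset the rest to Y
    out = Y.copy()
    out[unlabelled] = F[unlabelled]
    return out
```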
Err, I don't think the last case makes sense because the initial Y[i] are not supplied by the user (they are all 0s). It will be weird to return this without any hint of something having gone wrong.
Personally, throwing a ValueError makes more sense.
Update: No, it does make sense. Rethinking.
While we are at it, I also recommend adding a warning if the method did not converge after max_iterations, akin to this.
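Roughly what I have in mind is a toy version like the following; the real loop in label_propagation.py obviously looks different, and the final fix might use scikit-learn's ConvergenceWarning instead:

```python
import warnings

import numpy as np


def propagate(S, Y, alpha=0.2, max_iter=30, tol=1e-3):
    """Toy propagation loop with a non-convergence warning (illustrative only)."""
    F = Y.copy()
    for _ in range(max_iter):
        F_next = alpha * S.dot(F) + (1 - alpha) * Y
        if np.abs(F_next - F).sum() < tol:
            return F_next
        F = F_next
    warnings.warn("did not converge after %d iterations" % max_iter,
                  category=RuntimeWarning)
    return F
```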
Okay, my suggestion is that we disallow both of the extreme values of alpha, i.e. 0 and 1. The paper requires alpha to be in the open interval (0, 1) because in one case we are completely ignoring the transduction and in the other the input labels. Hence, I'll be happy to throw a ValueError in both cases for LabelSpreading.
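Concretely, something along these lines near the top of fit (a sketch only; the exact wording and placement are up for discussion):

```python
def _check_alpha(alpha):
    """Sketch of the proposed validation for LabelSpreading's alpha."""
    if alpha is None or not (0.0 < alpha < 1.0):
        raise ValueError("alpha=%s is invalid: it must be inside "
                         "the open interval (0, 1)" % alpha)
    return alpha
```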
For LabelPropagation, I'm working on handling the case alpha == None gracefully and writing a similar sanity check for it.
Actually, I can use some help; my numpy matrix-fu might be wrong somewhere.
I'm trying to replicate the calculations done in eqn. (12) in the reference for LabelPropagation.
Am I doing the correct things here?
```python
n_classes = 2
X, y = make_classification(n_classes=n_classes, n_samples=200, random_state=0)
y[::3] = -1
# Using Zhu's 2002 notation:
clf = label_propagation.LabelPropagation().fit(X, y)
T_bar = clf._build_graph()
Y = np.zeros((len(y), n_classes + 1))
Y[np.arange(len(y)), y] = 1
unlabelled_idx = Y[:, (-1,)].nonzero()[0]
labelled_idx = (Y[:, (-1,)] == 0).nonzero()[0]
Tuu = T_bar[np.meshgrid(unlabelled_idx, unlabelled_idx, indexing='ij')]
Tul = T_bar[np.meshgrid(unlabelled_idx, labelled_idx, indexing='ij')]
Y = Y[:, :-1]
Y_u = np.dot(np.dot(np.linalg.inv(np.eye(Tuu.shape[0]) - Tuu), Tul),
             Y[labelled_idx])
expected = Y.copy()
expected[unlabelled_idx, :] = Y_u
expected /= expected.sum(axis=1)[:, np.newaxis]
assert_array_almost_equal(expected, clf.label_distributions_, 4)
```
Feedback on making this more efficient/easier to read also welcome.
Okay, I've pushed the changes to a branch on my fork which branches on alpha is None to differentiate LabelSpreading and LabelPropagation. This is a leaky abstraction but something I can live with.
I've also added the tests and have changed the old test from using the knn kernel to the rbf kernel because of #8008. I've adjusted the gamma parameter such that exp underflow is not a problem.
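For context, the underflow is easy to reproduce: the rbf affinities are exp(-gamma * ||x - y||^2), so for well-separated points and a large gamma every weight can round to zero and the subsequent row normalisation can end up dividing by zero. A rough illustration (not the library code; the distances are made up):

```python
import numpy as np

d2 = np.array([50.0, 80.0, 120.0])   # squared distances between some points
for gamma in (0.1, 20.0):
    w = np.exp(-gamma * d2)          # rbf affinities
    print(gamma, w, w.sum())         # for gamma=20 every weight underflows to 0
```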
Curiously, adding a step to do the following:

> LabelPropagation row-normalizes Y to be a valid probability distribution, while LabelSpreading places no such constraint on the analogous F(T). I suppose we should change this behaviour to be true to the original algorithms, depending upon whether fit is called from LabelPropagation or LabelSpreading.

i.e.,
```python
# ...
self.label_distributions_ = safe_sparse_dot(
    graph_matrix, self.label_distributions_)

if alpha is None:  # LabelPropagation
    normalizer = np.sum(self.label_distributions_, axis=1)[:, np.newaxis]
    self.label_distributions_ /= normalizer

# clamp
# ...
```
did not change the outcome of the test, while increasing gamma to even 10 brought about numerical instability in the tests (perhaps in the algorithm as well?), making the results diverge and the test fail.
Things still left to do in this PR:

- alpha = 0
- alpha = 1

Maybe:
I'm not surprised that normalising or not doesn't change whether the test passes (the test itself normalises the final distribution, and everything else is affine, but I've not fully thought it through).
Using assert_array_almost_equal might be less robust to changes in gamma than assert_allclose which uses relative tolerance. I might be wrong, though.
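Concretely, decimal=4 in assert_array_almost_equal is an absolute criterion (roughly |a - b| < 1.5e-4), while assert_allclose defaults to a relative tolerance of 1e-7, so tiny values behave very differently under the two checks:

```python
import numpy as np
from numpy.testing import assert_array_almost_equal

a = np.array([1e-6, 1.0])
b = np.array([2e-6, 1.0])

# absolute check: |a - b| = 1e-6 is far below 1.5e-4, so this passes
assert_array_almost_equal(a, b, decimal=4)

# relative check: the first entries differ by 100% of their magnitude,
# so assert_allclose's default rtol=1e-7 would reject them
print(np.allclose(a, b, rtol=1e-7, atol=0))  # False
```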
I know there's been a lot of this, but I'd be happy for you to take over here if you wish, or to send me a PR.
I've added throwing of ValueError for invalid alpha (including tests) and the normalization step before clamping for LabelPropagation.
I'm sort of at a loss while designing tests, though. For example, I would love to have tests which:
I've tried using assert_allclose and it succeeds after playing with rtol a little for LabelPropagation. However, looking at the definition of convergence in the code:
```python
def _not_converged(y_truth, y_prediction, tol=1e-3):
    """basic convergence check"""
    return np.abs(y_truth - y_prediction).sum() > tol
```
I don't think we should be using relative tolerance.
re: this PR; do you mean to close this discussion and start a new one?
I think we might have to leave this broken for 0.19.0, and aim to merge it soon after the release.
testing for sparse should be easy. is it not already tested?
I don't think so. Does something like make_classification exist to generate sparse X?
Not ideal, but sounds reasonable.
I don't mind if we close and start anew [pull request].
I'd rather keep this context around somehow. I missed the discussion on this thread for quite a while (~ 1 week?) because I wasn't automatically subscribed to it.
Note to self: Explicitly ping everyone involved on the new PR which will eventually be created.
> Does something like make_classification exist to generate sparse X?
Why not just generate dense X and then convert to sparse?
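i.e., something like this (a sketch, reusing the setup from the tests above):

```python
import scipy.sparse as sp
from sklearn.datasets import make_classification

X, y = make_classification(n_classes=2, n_samples=200, random_state=0)
y[::3] = -1
X_sparse = sp.csr_matrix(X)  # dense features converted to a sparse matrix
# then fit the estimator on X_sparse and compare against the dense result
```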
Actually, the graph_matrix can only be sparse if the kernel returns a sparse matrix and only the kNN kernel returns that (which I am avoiding fixing in this PR).
Hence, we don't need to test the sparse implementation right away. However, tests for in-the-loop normalization and numerical stability (how?) would be nice to have.
I am fairly confident that the code implements the algorithm correctly and merging it as-is will move the implementation in the correct direction (i.e. from flat Earth -> spherical Earth). The tests would help move it to the 'oblate spheroid' realm in my head, but coming up with tests is ... difficult.
Thoughts?
I meant a new PR with your changes + the tweaks/tests we've developed over this thread, which are on my branch. I'll presently create a PR from it and link to it here.
@musically-ut can you please give us a summary of what remains to be done for this PR? Is this ready for final review? If so please update the title from [WIP] to [MRG].
Ah, actually I understand that this should be closed in favor of #9239.
@ogrisel Thanks, yes, #9239 supersedes this PR.
MechCoder approved these changes