-
-
Notifications
You must be signed in to change notification settings - Fork 25.9k
MRG Fit pairwise #803
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MRG Fit pairwise #803
Conversation
In retrospect, I am a bit worried by the fact that this will add 4 new methods (fit_pairwise, predict_pairwise, transform_pairwise and fit_transform_pairwise). Another way to solve the grid search problem would be to agree upon on a flag that must be set by the estimator and can be checked by the grid search. So, for example, if kernel="precomputed", we internally set In algorithms that use a similarity / affinity matrices in fit, we should add a new |
Yeah, four new methods are a lot. I didn't really see that until I tried to implement everything and adjust the tests. I would not call the option you suggest "metric" as I think it does not need to be one. This can probably be decided on an per-algorithm base and made as consistent as possible ;) |
I would not call the option you suggest "metric" as I think it does not
|
self.pairwise is a simple and tempting idea. I would however use a function like is_pairwise() as you probably don't want to have svc.kernel and svc.pairwise defined in the init (required by clone). A property would work too but a function is more explicit. |
If I'm not mistaken, this requirement only holds if the attribute is a constructor parameter, which wouldn't be the case here. |
I certainly agree but to rephrase, svc.pairwise to remain properly defined after a clone should either be an init param or a property. Otherwise we could have svc.is_pairwise() |
Ah ok, I got your point now. A 3rd way is to modify clone so as to check for the presence of |
why not if it's an exception for pairwise. I am sure @GaelVaroquaux |
I wouldn't be shocked to have it in clone since |
ping @mblondel @agramfort @GaelVaroquaux |
Maybe the affinity propagation still needs some love... though I'm not very familiar with that. |
Conflicts: sklearn/svm/base.py
One more comment on property vs function: getattr(est, '_pairwise', lambda: False)() Which looks a bit ugly to me. Therefor I'm +0 on property vs function. |
Any opinions on this at all? |
I would +1 reusing I don't see why you say that spectral clustering and affinity propagation use different styles of affinities: you can select it according to what make sense depending on the nature of the data and the assumption you make on the cluster structure. |
As for the |
@ogrisel thanks for your comment. Maybe I'm just not familiar enough with affinity propagation. Have you an example where it is used with something that is not the euclidean distance? |
2012/8/22 Andreas Mueller notifications@github.com
Olivier |
…ightly change the meaning of scalar parameter. Scaling the medium seems more intuitive that giving absolute values.
|
||
Parameters | ||
---------- | ||
|
||
S: array [n_points, n_points] | ||
X: array [n_points, n_points] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The shape should be [n_samples, n_samples]
if affinity == "precomputed"
and [n_samples, n_features]
otherwise.
…f a ``precomputed`` parameter, to support other affinities in the future.
…or consistency.
|
||
np.exp(-gamma * d(X,X) ** 2) | ||
|
||
or a k-nearest neighbors matrix. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
connectivity matrix
@mblondel Thanks for your comments. You gave |
…g that @mblondel pointed out.
I thought you wanted to rename |
It's nice that As I said on the ML, we could postpone the decision about the code factorization of affinity propagations to later. This way we can address the more cosmetic changes (like using a property or not) and merge soon. This PR has been pending for too long :) |
I think this looks ok now. It does not do a lot of code sharing, because of the inclusion loop problem (see ML). What this PR accomplishes now is:
This is important because it make these next steps possible:
@mblondel I fully agree :) |
+1 for postponing the affinity == neighbors refactoring and get this merged first. |
I think I am 👍 for merging if all the tests pass. Can you quickly check the test coverage report to check that the new code does not introduce new untested code blocks? Also can you check that the affinity / spectral clustering example still run and have their plots look good with your refactoring? BTW thanks for working on this. |
… (was a bit over-eager there)
Always a pleasure ;) |
hurray :) |
Good job!!! |
This is supposed to be (a draft of?) the
fit_pairwise
proposal.It createsfit_pairwise
andpredict_pairwise
functions for SVM, KernelPCA, SpectralClustering and AffinityPropagation.It also makes all these usable using GridSearchCV with a new
fit_pairwise
for the grid search.Now uses a property to check whether the estimator was fit using a "pairwise" X.