All kernel regression methods should accept precomputed Gram matrices · Issue #8445 · scikit-learn/scikit-learn · GitHub

All kernel regression methods should accept precomputed Gram matrices #8445


Open
chrishmorris opened this issue Feb 23, 2017 · 25 comments
Labels
Enhancement, help wanted, Moderate (Anything that requires some knowledge of conventions and best practices), module:gaussian_process

Comments

@chrishmorris

SVR has an option

    kernel='precomputed'

If this is chosen, then the X array passed to the fit method is the Gram matrix of the training examples. This option should also be available for GPR and KRR.

Here is some motivation. Many machine learning projects begin by defining a feature set, and many algorithms, e.g. linear regression, intrinsically require a vector of real numbers for each sample.

Kernel methods take a different approach: the modeller supplies a function, the "kernel", which measures the similarity between two samples. This need not be expressed in terms of features. For example, it is easy to say what it means for two DNA sequences to be similar, but hard to reduce a DNA sequence to a vector of features. So SVR correctly is willing to build a model with:

model = svm.SVR(kernel='precomputed')
model.fit(kernel(training.molecule), training.y.values)

(My current project is a cheminformatic one.)

It would be great if the other kernel methods supported this. At present, they require a kernel function, to which they pass the X values after checking that X is an array of floats. So some of the value of kernel methods is lost.
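For concreteness, here is a minimal runnable sketch of the workflow SVR already supports. The data are synthetic, and a linear Gram matrix stands in for a domain-specific similarity such as a molecular or DNA kernel:

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.RandomState(0)
X_train = rng.rand(20, 3)              # stand-in for 20 training samples
y_train = X_train.sum(axis=1)
X_test = rng.rand(5, 3)

# Any symmetric positive-semidefinite similarity works; a linear
# kernel is used here only to keep the sketch self-contained.
K_train = X_train @ X_train.T          # (n_train, n_train) Gram matrix
K_test = X_test @ X_train.T            # (n_test, n_train) cross-kernel

model = SVR(kernel='precomputed')
model.fit(K_train, y_train)
pred = model.predict(K_test)           # one prediction per test sample
```

Note that at predict time the matrix has one row per test sample and one column per training sample.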

@amueller
Member

They do support kernel="precomputed". If there's an issue with that, please provide a minimum example to reproduce.

@chrishmorris
Author

The documentation for GPR does not say it accepts a precomputed gram matrix:
http://scikit-learn.org/stable/modules/generated/sklearn.gaussian_process.GaussianProcessRegressor.html#sklearn.gaussian_process.GaussianProcessRegressor
and the fit method unconditionally executes:
K = self.kernel_(self.X_train_)

KRR does indeed accept a precomputed gram matrix, but the documentation does not say so:
http://scikit-learn.org/stable/modules/generated/sklearn.kernel_ridge.KernelRidge.html#sklearn.kernel_ridge.KernelRidge.fit
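For reference, a minimal sketch of the KRR usage that the documentation should describe (synthetic data and a linear Gram matrix, purely for illustration):

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.RandomState(0)
X = rng.rand(30, 4)
y = X @ rng.rand(4)

K = X @ X.T                            # precomputed (n_samples, n_samples) Gram matrix

krr = KernelRidge(kernel='precomputed', alpha=1e-3)
krr.fit(K, y)                          # the X argument is the Gram matrix itself
pred = krr.predict(K)                  # rows index test samples, columns training samples
```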

@agramfort
Member
agramfort commented May 23, 2018 via email

@amueller amueller reopened this May 24, 2018
@amueller
Member

Sorry for prematurely closing! I'm not actually sure if there is a way for GPR to use a precomputed gram matrix. Maybe @jmetzen can say? For KRR it's a doc fix.

@amueller amueller added the Easy (Well-defined and straightforward way to resolve), Documentation, good first issue (Easy with clear instructions to resolve), and help wanted labels May 24, 2018
@amueller
Member

(good first issue is the doc fix only, there might be another issue for the GP)

@wesbarnett
Contributor

I can do a PR for the doc fix for KRR precomputed kernels. I've used that option before (saw the option while examining the source). First sklearn issue for me.

@chrishmorris
Author

To do GPR with a precomputed matrix, I use the attached subclass.

Please feel free to use the code if you like it. It is directly determined by the nature of the problem to be solved, the conventions of the Python community, and the sklearn API, so I consider that I have no copyright in it.

GPR.py.txt

@amueller
Member

@wesbarnett go for it!

@amueller
Member

@chrishmorris I was hoping we could do it more easily than that, maybe by using a precomputed kernel class for the kernel.

@chrishmorris
Author

I'm not sure what that means.

There is one advantage in what I have done. Computing the Gram matrix is expensive. After doing so, you can fit to it multiple ways, using different models or cross-validating the parameters of one model.

Maybe your approach can do this if the class is memoized.
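The reuse pattern described above can be sketched with KernelRidge, which already accepts kernel='precomputed'. The RBF Gram matrix, data, and parameter values are illustrative:

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.RandomState(0)
X = rng.rand(40, 5)
y = np.sin(X.sum(axis=1))

K = rbf_kernel(X, gamma=0.5)           # the expensive similarity, computed once

# Reuse the same Gram matrix while varying the model's parameters.
scores = {}
for alpha in (0.01, 0.1, 1.0):
    model = KernelRidge(kernel='precomputed', alpha=alpha)
    model.fit(K, y)
    scores[alpha] = model.score(K, y)  # training R^2, for illustration only
```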

wesbarnett added a commit to wesbarnett/scikit-learn that referenced this issue May 24, 2018
wesbarnett added a commit to wesbarnett/scikit-learn that referenced this issue May 25, 2018
jnothman pushed a commit that referenced this issue May 26, 2018
* document "precomputed" kernel for KernelRidge

See #8445.

* fix doc for KernelRidge

use n_samples_fitted in score
@amueller amueller added the Enhancement and Moderate (Anything that requires some knowledge of conventions and best practices) labels and removed the Documentation, Easy (Well-defined and straightforward way to resolve), and good first issue (Easy with clear instructions to resolve) labels May 26, 2018
@amueller
Member

Documentation is done, still need to work on supporting it in GP.

@jmetzen
Member
jmetzen commented May 29, 2018

I would prefer it if we could support precomputed kernels without having to adapt the GP classes, by adding a Precomputed kernel to the kernels module.

@chrishmorris could you check if you could adapt your code in this way?

@chrishmorris
Author

I've been thinking about this, but I'm not sure what you are expecting.

Computing a gram matrix is expensive. When cross-validating, it is best to compute it once, then iterate through parameter values. So it seems to me right that at present fit and predict methods can accept a gram matrix.

Alternatively, these methods could accept a list of objects, and the kernel function could be memoized. A practical obstacle to this is that lists in Python are mutable. Maybe the lists (or series or numpy arrays) can be converted to tuples. Even then, the objects in the list may be mutable.

Are you suggesting a change of signature for fit() and predict()? If so, what to?
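A hypothetical sketch of the memoization idea: the helper make_memoized_kernel is invented for illustration, and samples are assumed to be representable as hashable tuples (precisely the mutability obstacle noted above):

```python
from functools import lru_cache

def make_memoized_kernel(pair_similarity):
    """Wrap a pairwise similarity so each pair is computed at most once."""
    @lru_cache(maxsize=None)
    def cached(a, b):                  # a and b must be hashable, e.g. tuples
        return pair_similarity(a, b)

    def kernel(A, B):
        # Build the (len(A), len(B)) Gram matrix from cached entries.
        return [[cached(tuple(x), tuple(y)) for y in B] for x in A]

    return kernel

# Example: a memoized linear similarity over tuple-encoded samples.
k = make_memoized_kernel(lambda a, b: sum(x * y for x, y in zip(a, b)))
K = k([(1.0, 2.0), (3.0, 4.0)], [(1.0, 2.0)])   # [[5.0], [11.0]]
```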


@lixinyuu

Any updates on this issue? I am also looking for a GPR that accepts a precomputed kernel.
I tried

cv = KFold(n_splits=5, shuffle=True, random_state=12354)
kernel = PairwiseKernel(metric='precomputed')
gp = GaussianProcessRegressor(kernel=kernel)
param_grid = {'alpha': [0.1, 0.2, 0.3]}
grid_search = GridSearchCV(gp, cv=cv, param_grid=param_grid, scoring='neg_mean_absolute_error')

But it doesn't work: the pairwise kernel wants X.shape[1] == y.shape[0], a requirement that cannot be met during grid-search cross-validation.

@jnothman
Member

The problem there is that the _pairwise attribute is not set on GaussianProcessRegressor. It could be set by recursively checking for the pairwiseness of the kernel...

@jnothman
Member

A fix for that would be welcome.

@sshojiro

version: 0.22.1

FYI, there has not been an example of PairwiseKernel with a user-defined kernel function, which can be passed and works correctly as follows:

# dist_metric is a user-defined kernel function; it must accept a gamma argument
PairwiseKernel(metric=lambda x, y, gamma: dist_metric(x, y, gamma))

As the above comments, [1] and [2], imply, the PairwiseKernel with various kernel functions is required.

If somebody tells me how to publish a sample code onto the sklearn doc, I'll be glad to follow the instruction. Thank you.
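Building on the constraint noted above (the callable must accept gamma), here is a sketch of passing a user-defined callable to PairwiseKernel; my_kernel, the synthetic data, and optimizer=None (which keeps the given gamma fixed instead of tuning it) are all illustrative:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import PairwiseKernel

def my_kernel(x, y, gamma):
    # x and y are single samples (1-D arrays); gamma is forwarded by PairwiseKernel
    return np.exp(-gamma * np.sum((x - y) ** 2))

rng = np.random.RandomState(0)
X = rng.rand(15, 2)
y = np.sin(X.sum(axis=1))

kernel = PairwiseKernel(metric=my_kernel, gamma=1.0)
# optimizer=None: fit once with the given gamma rather than optimizing it
gp = GaussianProcessRegressor(kernel=kernel, optimizer=None, alpha=1e-2)
gp.fit(X, y)
pred = gp.predict(X)                   # one prediction per sample
```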

@jnothman
Member
jnothman commented Jan 25, 2020 via email

@Andrew-S-Rosen
Andrew-S-Rosen commented Mar 1, 2020

Has the GPR class been updated to allow for a precomputed kernel matrix? This is quite clearly explained for KRR and SVR, but it's less obvious (if present) for GPR in the docs. Whereas in KRR and SVR, you can pass kernel='precomputed', that does not appear to be an option in GPR.

Do you have to specifically supply the kernel as an object, in contrast with KRR and SVR? Drawing parallels with KRR and SVR, I'd imagine X_train could have dimensions of (n_samples x n_samples) instead of (n_samples x n_features) in this scenario, but the docs say it has dimensions of (n_samples x n_features) "or a list of objects" (what objects, I'm not entirely sure).

@chrishmorris
Author

See my comment of 24 May 2018. The attachment provides a subclass of GPR that accepts kernel='precomputed'.

@Andrew-S-Rosen

Thanks @chrishmorris. Just saw that. Exactly what I was looking for. Thanks for sharing.

@Andrew-S-Rosen
Andrew-S-Rosen commented Mar 21, 2020

Just a heads-up: the script by @chrishmorris has a bug in line 67. If self.kernel == 'precomputed', then self.kernel_ is None, and then self.kernel_.n_dims cannot be evaluated. I assume this if statement should be skipped when self.kernel_ is None, which is an easy enough fix. In addition, line 8 should be uncommented to allow for from operator import itemgetter. I've attached a script that's up-to-date with the current version of scikit-learn: GPR.py.txt

Anyway, about this issue in general, I agree with @chrishmorris and do not understand why a solution like kernel='precomputed' (just like is currently implemented in KRR and SVR) would not be desirable. It seems like having a precomputed kernel class would be more complicated, not less.
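For what it's worth, a single fit/predict with PairwiseKernel(metric='precomputed') does appear to work outside of cross-validation (the slicing failure reported above is specific to GridSearchCV). A sketch, with a synthetic linear Gram matrix standing in for a real similarity:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import PairwiseKernel

rng = np.random.RandomState(0)
X_train = rng.rand(20, 3)
y_train = X_train.sum(axis=1)
X_test = rng.rand(5, 3)

K_train = X_train @ X_train.T          # (n_train, n_train) Gram matrix
K_test = X_test @ X_train.T            # (n_test, n_train) cross-kernel

kernel = PairwiseKernel(metric='precomputed')
# optimizer=None: there is nothing to optimize for a fixed precomputed matrix
gp = GaussianProcessRegressor(kernel=kernel, optimizer=None, alpha=1e-2)
gp.fit(K_train, y_train)               # X argument is the Gram matrix itself
pred = gp.predict(K_test)              # one prediction per test sample
```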

@FanwangM

If we can allow a precomputed kernel for Gaussian processes, that would be highly appreciated, and I believe it would be very useful for a lot of people.

@392781
392781 commented Feb 22, 2022

@chrishmorris Were you planning to do a PR on this or is there still discussion happening with @amueller concerning implementation?

If no one is working on this anymore, I'd be happy to look into it and work on a PR for this.

@chrishmorris
Author

Go ahead - I'm not planning to work on a PR. Thanks to arosen93 for bug fixes.
