[MRG+2] Gaussian process revisited by jmetzen · Pull Request #4270 · scikit-learn/scikit-learn · GitHub

[MRG+2] Gaussian process revisited #4270


Closed
wants to merge 165 commits into from

Changes from all commits (165 commits)
a6187a8
ADD New kernel module for GPs which supports kernel engineering
Feb 17, 2015
ca82e7b
ADD New implementation of GaussianProcessRegression
Feb 17, 2015
2de1d23
ADD Gaussian Process classification based on Laplace approximation
Feb 17, 2015
d8e34ad
TST Tests for GP regression and classification
Feb 17, 2015
4ac2f5b
ADD Examples for GP classification
Feb 17, 2015
a6be0ae
ADD Examples for GP regression
Feb 17, 2015
d55ccaf
FIX cross_correlation of anisotropic RBF kernel computed correctly
Feb 17, 2015
0765d09
FIX Fix some bugs in kernel module discovered during writing unit test
Feb 18, 2015
0240f4e
TST Test for GP kernels
Feb 18, 2015
ecb5f39
REFACTOR auto_correlation() -> auto(), cross_correlation() -> cross()
Feb 18, 2015
ebb95f5
DOC GP's kernel module documented
Feb 18, 2015
7024332
DOC Revised GP examples (added doc etc.)
Feb 18, 2015
c4b345b
ENH GPR.predict allows returning standard-deviations of predictions
Feb 19, 2015
9f8a90c
DOC Revised examples based on @eickenberg's comments
Feb 20, 2015
6681df0
MISC Revised gaussian_process package based on @eickenberg's suggestions
Feb 20, 2015
204ca41
TST Dealing with failing tests from test_common
Feb 20, 2015
dfdac51
REFACTOR Merging kernel methods auto() and cross() in __call__()
Feb 22, 2015
292a009
ENH Adding PairwiseKernel for GPs (wraps kernels from metrics.pairwise)
Feb 22, 2015
2be7733
REFACTOR GaussianProcessClassification/Regression -> ...Classifier/Re…
Feb 24, 2015
3a6afbc
ENH More efficient approximation of gradient in PairwiseKernel
Feb 24, 2015
38fe9cf
ENH Avoid creating square matrix W_sr in all but one place (TODO)
Feb 24, 2015
7899e58
ENH Adding RationalQuadratic and ExpSineSquared kernels
Feb 26, 2015
0f42572
FIX Dealing with singular kernel matrix (-inf log-marginal-likelihood)
Feb 28, 2015
0fac1ad
ENH Adding DotProduct kernel and fixing ExpSine2 kernel
Feb 28, 2015
c7c6734
DOC Initial version of Mauna Loa CO2 example
Feb 28, 2015
e087a22
REFACTOR Hyperparameter optimization in GPs enabled explicitly by opt…
Feb 28, 2015
540a009
DOC Revising CO2 example
Feb 28, 2015
8c99027
TST Cover more kernels in tests
Mar 1, 2015
264dd9e
ENH Analytic gradients for ExpSine2, RationalQuadratic, DotProduct ke…
Mar 1, 2015
7e8c4a7
DOC Extended documentation of GP kernels module
Mar 1, 2015
63e92de
MISC Reorganizing kernel module
Mar 1, 2015
6be29fa
ENH Adding method is_stationary to GP kernels
Mar 1, 2015
57a7b93
TST Testing kernel stationarity
Mar 1, 2015
ac5f635
DOC Illustrating different kernels in plot_gpr_prior_posterior.py
Mar 1, 2015
0a98dfc
FIX test_converged_to_local_maximum deals with convergence to boundary
Mar 1, 2015
508867c
ENH More efficient computation of lml-gradient in GPR using einsum
Mar 1, 2015
da8bdc4
FIX GPs do not modify kernel attribute but store optimized kernels in…
Mar 2, 2015
e34fa32
REFACTOR GP kernels use separate specification of bounds and params (…
Mar 7, 2015
5c99aad
REFACTOR Renamed kernel property params to theta
Mar 7, 2015
cd375d7
ENH Kernel implements equality check and get_params() method
Mar 8, 2015
04ebe40
TST Tests for cloning kernels with sklearn.base's clone function
Mar 8, 2015
ca6e6a2
REFACTOR Replacing all calls to deepcopy() by clone()
Mar 8, 2015
e7586b1
TST Replace call to deepcopy() by clone()
Mar 8, 2015
a1186bb
ENH Added Exponentiation kernel (DotProduct no longer has internal de…
Mar 8, 2015
f2ec6a2
TST test_kernels tests Exponentiation kernel
Mar 8, 2015
29c6d3b
FIX Using numpy-downward compatible ones_like() instead of full_like()
Mar 8, 2015
6a0b32e
DOC Reorganizing gaussian_process examples
Mar 8, 2015
e144924
FIX Minor bugfixes and downward-compatibility/python3 issue
Mar 8, 2015
6c45af4
REFACTOR Rename GPR method sample to sample_y
Mar 10, 2015
b1531eb
REFACTOR Compute standard deviation of predictive distribution at n p…
Mar 11, 2015
5293bbc
FIX GPC uses LabelEncoder in fit() and returns (n_samples, n_classes)…
Mar 13, 2015
d922f5d
FIX Adapted examples and tests to GPC.predict_proba() adapted return …
Mar 13, 2015
948949c
FIX GaussianProcessRegressor inherits from RegressorMixin
Mar 13, 2015
5b1189a
ENH Reducing memory-consumption of GPC.predict_proba()
Mar 13, 2015
1e7713b
FIX Numpy downward-compatible way of computing sum over two axes
Mar 13, 2015
6496f92
ENH GPC can use warm-starts in _posterior_mode()
Mar 14, 2015
eec5c50
ENH Adding diag(X) method to GP kernels.
Mar 14, 2015
08ead1a
DOC Adding GPC to plot_classifier_comparison.py script and enabling w…
Mar 14, 2015
4ac2b12
DOC gpr module fully documented
Mar 15, 2015
f5bb2a7
DOC gpc module fully documented
Mar 15, 2015
64ec6ae
REFACTOR Kernel hyperparameters are now set via their name
Mar 29, 2015
5ba8e71
TST Adapted tests to changes in kernels and added additional test for…
Mar 29, 2015
57ebe4f
DOC Adapted GP examples to changes in kernel module
Mar 29, 2015
9ffab4c
FIX Kernel property theta can deal with anisotropic RBF kernel
Mar 29, 2015
584b7b1
FIX Dealing with anisotropic length-scales correctly
Mar 31, 2015
78ff16c
DOC Revised docstrings of kernels.py
Mar 31, 2015
07b08bb
REFACTOR Removed literal form of specifying bounds for ConstantKernel
Mar 31, 2015
5f4038d
REFACTOR Changed order of l and alpha in RationalQuadratic kernel
Mar 31, 2015
be9d749
ENH GPR can use individual noise-levels for each datapoint
Apr 1, 2015
c7f9ff5
ENH GaussianProcessRegressor supports multiple restarts of the optimizer
Apr 2, 2015
90aceb8
TST An increasing number of random-starts of GPR only increases the L…
Apr 2, 2015
3d18f95
DOC Adapted old example of fitting with noisy targets to new GPR class
Apr 2, 2015
c29a2b2
ENH GaussianProcessClassifier supports multiple restarts of the optim…
Apr 2, 2015
b6681a3
TST An increasing number of random-starts of GPC only increases the L…
Apr 2, 2015
9947c25
FIX Fetching CO2 data from mldata
Apr 2, 2015
eaba93c
FIX GP kernels check for type equality before checking attribute equa…
Apr 3, 2015
63c931a
FIX cross_validation and pairwise_kernels treat GP kernels as special…
Apr 3, 2015
e0e0493
DOC Adding example comparing kernel ridge with GPR.
Apr 4, 2015
7b47208
DOC Revised plot_gpc example
Apr 4, 2015
57c4746
DOC Adding documentation for GP kernels
Apr 8, 2015
8135106
DOC Documentation of the CO2 example
Apr 10, 2015
2515a25
DOC Backported documentation to kernels module
Apr 10, 2015
eaf5602
DOC Extending further the GP documentation
Apr 15, 2015
1e8c719
ENH GPR supports normalization of target values
May 20, 2015
c3387ad
TST Testing normalization of output values in GPR
May 20, 2015
cf6ce0e
ENH GPR supports target value with more than one dimension
May 20, 2015
bbd5753
TST Testing multi-out GPR
May 20, 2015
f779bf1
FIX Not performing hyperparameter-tuning in GPR if there are no hyperp…
May 20, 2015
8ad4c51
ENH GPR can use an externally defined optimizer for hyperparameter tu…
May 20, 2015
103672e
TST Testing GPR with custom external optimizer
May 20, 2015
7c13efe
REFACTOR GP kernel hyperparameters theta and their bounds are log-tra…
May 21, 2015
910188a
TST Adapting GP tests to changed hyperparameter representation
May 21, 2015
947bc0d
DOC Adapting GP examples to changed hyperparameter representation
May 21, 2015
9607576
REFACTOR Remove theta_ attribute from GPR and GPC
May 21, 2015
0272c48
ENH GPC can use an externally defined optimizer for hyperparameter tu…
May 21, 2015
63660e2
TST Testing GPC with custom external optimizer
May 21, 2015
a3ae48b
REFACTOR normalize_y in GPR does not modify standard-deviation
May 21, 2015
60da933
FIX Sampling theta_initial uniformly from log-bounds in GPR
May 28, 2015
3f7672b
FIX Correctly composing theta vector in GP kernels
May 28, 2015
9af9c24
FIX Ensuring that predicted variances are never negative
Jun 25, 2015
b682370
ENH Adding Matern kernel for Gaussian Processes
Jul 7, 2015
537790a
DOC Adding documentation on Matern kernel
Jul 7, 2015
245b262
TST Added test checking that GPR can deal with different outputs for …
Jul 14, 2015
4aeef95
FIX Checking for correct dimensionality of anisotropic kernels
Jul 30, 2015
a4f5570
FIX RBF and Matern kernel handle 1d length 1 length-scales correctly
Jul 31, 2015
692ec66
DOC Added further doc for GPC
Apr 19, 2015
2396737
ADD CompoundKernel for multi-class GP classification
Aug 2, 2015
c86ed8e
FIX GPR's random_state is cloned correctly
Aug 2, 2015
3867429
ENH Support for multi-class GPC added based on OneVsRestClassifier
Aug 2, 2015
bbc6f24
DOC Adding GPC to plot_classification_probability.py example
Aug 2, 2015
4a79bfe
TST Testing GPC for multi-class classification problems
Aug 2, 2015
a5b8db4
FIX Several minor changes/fixes in gaussian_process module
Aug 3, 2015
f943b5d
FIX Using six.string_types instead of basestring (python3 compatibility)
Aug 3, 2015
6a93190
FIX Excluding WhiteKernel from one unit-test
Aug 3, 2015
1a42e76
TST Extending testing of multi-class GP
Aug 3, 2015
089aab4
DOC Updated documentation of Gaussian process classifier
Aug 3, 2015
b447dae
ADD Example illustrating Gaussian process classifier on iris dataset
Aug 3, 2015
9a9b1ce
TST Fixing some GP-related issues in estimator checks
Aug 3, 2015
ff0df69
TST Fix GPC and GPR such that all tests are ok
Aug 3, 2015
1085ff0
TST Further GP-related fixes
Aug 3, 2015
e27c23a
FIX Hiding method decision_function in GaussianProcessClassifier
Aug 4, 2015
71ac590
TST GaussianProcessRegressor added to MULTI_OUTPUT estimators
Aug 4, 2015
fea4735
PEP8 Fixing PEP8 issues in gaussian_process package
Aug 4, 2015
75f1053
ENH X_train_ can be copied if requested
Aug 4, 2015
65a1c65
ENH Adding max_iter parameter for Newton iteration in GPC
Aug 4, 2015
db57902
COS Removing keyword magic from PairwiseKernel
Aug 4, 2015
81580fd
FIX Added missing import of ConstantKernel in gpc.py
Aug 4, 2015
798ec3d
TST Excluded GPC from test_non_transformer_estimators_n_iter()
Aug 4, 2015
1ec07f7
MISC Deprecating legacy GP implementation
Aug 5, 2015
ee3e670
COS Making BinaryGaussianProcessClassifierLaplace private
Aug 5, 2015
2b43f62
DOC Cleaning up documentation of legacy GPs
Aug 5, 2015
2a0792b
DOC Updating GPC XOR example (showing also results for DotProduct ker…
Aug 5, 2015
47cc6a2
FIX Remove BinaryGaussianProcessClassifierLaplace from packages __all…
Aug 5, 2015
26b7ed7
FIX Undoing regression_model and correlation_model deprecation
Aug 5, 2015
c254688
TST Testing n_jobs option of GaussianProcessClassifier
Aug 6, 2015
c255326
REF Realizing multi-class GPC by delegating to OneVsRest or OneVsOne
Aug 6, 2015
f9ff45e
DOC Updating documentation of multi-class GPC
Aug 6, 2015
7f429cf
REF Several small changes based on @eickenberg's suggestions
Aug 10, 2015
434531f
ENH More efficient computation of K_gradient for anisotropic Matern k…
Aug 11, 2015
628aeb5
REF Renaming sigma_squared_n to alpha (consistency with Ridge)
Aug 11, 2015
05da11e
ENH Full support of (deep) get_params() and set_params() in GP kernels
Aug 12, 2015
963e968
DOC Revising GP documentation based on @kastnerkyle comments
Aug 15, 2015
8ced3a8
FIX theta_vars correctly supported in KernelOperators
Aug 17, 2015
fec159a
TST Adding two unittests for testing GP kernels
Aug 17, 2015
a7dfc3d
REF Removing theta_vars attribute from kernels and using Hyperparamet…
Aug 17, 2015
96ad112
DEL Removing bounds.setter in kernels.py
Aug 19, 2015
0581e36
REF Renaming max_iter to max_iter_predict in GaussianProcessClassifier
Aug 19, 2015
a4d81c5
DOC Documenting GP kernel API
Aug 19, 2015
7ff0f71
DOC Fixing some minor issues in narrative doc of GPs
Aug 19, 2015
63ca51c
FIX Fix failing tests (doctest, order of hyperparameters, pairwise_k…
Aug 19, 2015
523b439
FIX python3 compatibility in doctest
Aug 19, 2015
d2a3851
FIX Fixing test_kernel_theta
Aug 19, 2015
a032640
FIX Further python3 related fixes
Aug 19, 2015
bbce7d2
FIX Fixing a python2.6-related issue
Aug 19, 2015
e075585
FIX Changing bounds in GP doctest
Aug 20, 2015
52060d0
MISC copy_X_train defaults to True in GPR and GPC
Aug 20, 2015
2176839
FIX Fix message of ValueError for wrong shape of alpha in GPR
Aug 20, 2015
5164af3
PEP8 Making pep8 and pyflakes happy
Aug 30, 2015
0b6116a
REF More meaningful names for hyperparameters of GP kernels
Aug 31, 2015
5778a0a
MISC Changing default for copy_X_train to True in GaussianProcessClas…
Aug 31, 2015
42aa29e
FIX String comparison via equality and not identity in Hyperparameter…
Sep 7, 2015
0942e1f
ENH GPR.log_marginal_likelihood() returns the current log-likelihood …
Sep 7, 2015
68b654d
FIX Enforcing y to be numeric in GPR and fixing python3 issue with maps
Sep 7, 2015
bd69c0b
ADD Mixins for normalized and stationary kernels
Sep 7, 2015
709d43c
TST test_lml_precomputed() checks only for equality in first 7 digits
Sep 14, 2015
32 changes: 20 additions & 12 deletions doc/modules/classes.rst
@@ -490,22 +490,30 @@ From text
:toctree: generated/
:template: class.rst

gaussian_process.GaussianProcessRegressor
gaussian_process.GaussianProcessClassifier
gaussian_process.GaussianProcess

.. autosummary::
:toctree: generated
:template: function.rst
Kernels:

gaussian_process.correlation_models.absolute_exponential
gaussian_process.correlation_models.squared_exponential
gaussian_process.correlation_models.generalized_exponential
gaussian_process.correlation_models.pure_nugget
gaussian_process.correlation_models.cubic
gaussian_process.correlation_models.linear
gaussian_process.regression_models.constant
gaussian_process.regression_models.linear
gaussian_process.regression_models.quadratic
.. autosummary::
:toctree: generated/
:template: class.rst

gaussian_process.kernels.Kernel
gaussian_process.kernels.Sum
gaussian_process.kernels.Product
gaussian_process.kernels.Exponentiation
gaussian_process.kernels.ConstantKernel
gaussian_process.kernels.WhiteKernel
gaussian_process.kernels.RBF
gaussian_process.kernels.Matern
gaussian_process.kernels.RationalQuadratic
gaussian_process.kernels.ExpSineSquared
gaussian_process.kernels.DotProduct
gaussian_process.kernels.PairwiseKernel
gaussian_process.kernels.CompoundKernel
gaussian_process.kernels.Hyperparameter

.. _grid_search_ref:

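The kernel classes listed above compose via Python operators: + builds a Sum, * a Product, and ** an Exponentiation kernel. Below is a minimal standalone sketch of this usage; it is not part of the diff and only assumes the kernels API added in this PR.

from sklearn.gaussian_process.kernels import ConstantKernel, RBF, WhiteKernel

# Amplitude times an RBF, plus additive white noise; equivalent to
# Sum(Product(ConstantKernel(1.0), RBF(1.0)), WhiteKernel(0.1)).
kernel = ConstantKernel(1.0) * RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1)

print(kernel)          # composed kernel with its current hyperparameters
print(kernel.theta)    # log-transformed hyperparameter vector used by the optimizer
print(kernel.bounds)   # corresponding log-transformed bounds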
663 changes: 589 additions & 74 deletions doc/modules/gaussian_process.rst

Large diffs are not rendered by default.

10 changes: 7 additions & 3 deletions examples/classification/plot_classification_probability.py
@@ -6,7 +6,7 @@
Plot the classification probability for different classifiers. We use a 3
class dataset, and we classify it with a Support Vector classifier, L1
and L2 penalized logistic regression with either a One-Vs-Rest or multinomial
setting.
setting, and Gaussian process classification.

The logistic regression is not a multiclass classifier out of the box. As
a result it can identify only the first class.
@@ -21,6 +21,8 @@ class dataset, and we classify it with a Support Vector classifier, L1

from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF
from sklearn import datasets

iris = datasets.load_iris()
@@ -30,6 +32,7 @@ class dataset, and we classify it with a Support Vector classifier, L1
n_features = X.shape[1]

C = 1.0
kernel = 1.0 * RBF([1.0, 1.0]) # for GPC

# Create different classifiers. The logistic regression cannot do
# multiclass out of the box.
@@ -38,8 +41,9 @@ class dataset, and we classify it with a Support Vector classifier, L1
'Linear SVC': SVC(kernel='linear', C=C, probability=True,
random_state=0),
'L2 logistic (Multinomial)': LogisticRegression(
C=C, solver='lbfgs', multi_class='multinomial'
)}
C=C, solver='lbfgs', multi_class='multinomial'),
'GPC': GaussianProcessClassifier(kernel)
}

n_classifiers = len(classifiers)

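For reference, a minimal standalone sketch of the GPC entry added above (hypothetical snippet, not part of the diff): RBF([1.0, 1.0]) gives one length-scale per feature, and both length-scales are tuned by maximizing the log-marginal-likelihood during fit.

from sklearn import datasets
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF

iris = datasets.load_iris()
X = iris.data[:, :2]   # first two features, as in the example
y = iris.target

# Anisotropic RBF kernel: one length-scale per feature
gpc = GaussianProcessClassifier(kernel=1.0 * RBF([1.0, 1.0]))
gpc.fit(X, y)

print(gpc.kernel_)                      # kernel after hyperparameter optimization
print(gpc.predict_proba(X[:5]).shape)   # (5, 3): one column per class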
13 changes: 9 additions & 4 deletions examples/classification/plot_classifier_comparison.py
@@ -36,6 +36,8 @@
from sklearn.datasets import make_moons, make_circles, make_classification
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.naive_bayes import GaussianNB
@@ -44,12 +46,14 @@

h = .02 # step size in the mesh

names = ["Nearest Neighbors", "Linear SVM", "RBF SVM", "Decision Tree",
"Random Forest", "AdaBoost", "Naive Bayes", "LDA", "QDA"]
names = ["3 Near. Neighb.", "Linear SVM", "RBF SVM", "RBF GPC",
"Decision Tree", "Random Forest", "AdaBoost", "Naive Bayes", "LDA",
"QDA"]
classifiers = [
KNeighborsClassifier(3),
SVC(kernel="linear", C=0.025),
SVC(gamma=2, C=1),
GaussianProcessClassifier(1.0 * RBF(1.0), warm_start=True),
DecisionTreeClassifier(max_depth=5),
RandomForestClassifier(max_depth=5, n_estimators=10, max_features=1),
AdaBoostClassifier(),
@@ -75,7 +79,8 @@
# preprocess dataset, split into training and test part
X, y = ds
X = StandardScaler().fit_transform(X)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.4)
X_train, X_test, y_train, y_test = \
train_test_split(X, y, test_size=.4, random_state=42)

x_min, x_max = X[:, 0].min() - .5, X[:, 0].max() + .5
y_min, y_max = X[:, 1].min() - .5, X[:, 1].max() + .5
@@ -128,5 +133,5 @@
size=15, horizontalalignment='right')
i += 1

figure.subplots_adjust(left=.02, right=.98)
plt.tight_layout()
plt.show()
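As a standalone counterpart to the GPC entry added to the comparison above, a hedged sketch (not part of the diff) of fitting and scoring it on one of the synthetic datasets; the cross_validation import path matches this PR's era, newer releases use sklearn.model_selection instead.

from sklearn.datasets import make_moons
from sklearn.preprocessing import StandardScaler
from sklearn.cross_validation import train_test_split
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF

X, y = make_moons(noise=0.3, random_state=0)
X = StandardScaler().fit_transform(X)
X_train, X_test, y_train, y_test = \
    train_test_split(X, y, test_size=.4, random_state=42)

# warm_start=True reuses the last posterior mode to initialize the Newton
# iteration of the Laplace approximation on subsequent fits.
clf = GaussianProcessClassifier(1.0 * RBF(1.0), warm_start=True)
clf.fit(X_train, y_train)
print("GPC test accuracy: %.3f" % clf.score(X_test, y_test))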
51 changes: 0 additions & 51 deletions examples/gaussian_process/gp_diabetes_dataset.py

This file was deleted.

116 changes: 116 additions & 0 deletions examples/gaussian_process/plot_compare_gpr_krr.py
@@ -0,0 +1,116 @@
"""
==========================================================
Comparison of kernel ridge and Gaussian process regression
==========================================================

Both kernel ridge regression (KRR) and Gaussian process regression (GPR) learn
a target function by employing internally the "kernel trick". KRR learns a
linear function in the space induced by the respective kernel which corresponds
to a non-linear function in the original space. The linear function in the
kernel space is chosen based on the mean-squared error loss with
ridge regularization. GPR uses the kernel to define the covariance of
a prior distribution over the target functions and uses the observed training
data to define a likelihood function. Based on Bayes' theorem, a (Gaussian)
posterior distribution over target functions is defined, whose mean is used
for prediction.

A major difference is that GPR can choose the kernel's hyperparameters based
on gradient-ascent on the marginal likelihood function while KRR needs to
perform a grid search on a cross-validated loss function (mean-squared error
loss). A further difference is that GPR learns a generative, probabilistic
model of the target function and can thus provide meaningful confidence
intervals and posterior samples along with the predictions while KRR only
provides predictions.

This example illustrates both methods on an artificial dataset, which
consists of a sinusoidal target function and strong noise. The figure compares
the learned models of KRR and GPR based on an ExpSineSquared kernel, which is
suited for learning periodic functions. The kernel's hyperparameters control
the smoothness (l) and periodicity (p) of the kernel. Moreover, the noise level
of the data is learned explicitly by GPR via an additional WhiteKernel component
in the kernel, and by the regularization parameter alpha of KRR.

The figure shows that both methods learn reasonable models of the target
function. GPR correctly identifies the periodicity of the function to be
roughly 2*pi (6.28), while KRR chooses the doubled periodicity 4*pi. Besides
that, GPR provides reasonable confidence bounds on the prediction which are not
available for KRR. A major difference between the two methods is the time
required for fitting and predicting: while fitting KRR is fast in principle,
the grid search for hyperparameter optimization scales exponentially with the
number of hyperparameters ("curse of dimensionality"). The gradient-based
optimization of the parameters in GPR does not suffer from this exponential
scaling and is thus considerably faster on this example with its 3-dimensional
hyperparameter space. The time for predicting is similar; however, generating
the variance of the predictive distribution of GPR takes considerably longer
than just predicting the mean.
"""
print(__doc__)

# Authors: Jan Hendrik Metzen <jhm@informatik.uni-bremen.de>
# License: BSD 3 clause


import time

import numpy as np

import matplotlib.pyplot as plt

from sklearn.kernel_ridge import KernelRidge
from sklearn.grid_search import GridSearchCV
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import WhiteKernel, ExpSineSquared

rng = np.random.RandomState(0)

# Generate sample data
X = 15 * rng.rand(100, 1)
y = np.sin(X).ravel()
y += 3 * (0.5 - rng.rand(X.shape[0])) # add noise

# Fit KernelRidge with parameter selection based on 5-fold cross validation
param_grid = {"alpha": [1e0, 1e-1, 1e-2, 1e-3],
"kernel": [ExpSineSquared(l, p)
for l in np.logspace(-2, 2, 10)
for p in np.logspace(0, 2, 10)]}
kr = GridSearchCV(KernelRidge(), cv=5, param_grid=param_grid)
stime = time.time()
kr.fit(X, y)
print("Time for KRR fitting: %.3f" % (time.time() - stime))

gp_kernel = ExpSineSquared(1.0, 5.0, periodicity_bounds=(1e-2, 1e1)) \
+ WhiteKernel(1e-1)
gpr = GaussianProcessRegressor(kernel=gp_kernel)
stime = time.time()
gpr.fit(X, y)
print("Time for GPR fitting: %.3f" % (time.time() - stime))

# Predict using kernel ridge
X_plot = np.linspace(0, 20, 10000)[:, None]
stime = time.time()
y_kr = kr.predict(X_plot)
print("Time for KRR prediction: %.3f" % (time.time() - stime))

# Predict using gaussian process regressor
stime = time.time()
y_gpr = gpr.predict(X_plot, return_std=False)
print("Time for GPR prediction: %.3f" % (time.time() - stime))

stime = time.time()
y_gpr, y_std = gpr.predict(X_plot, return_std=True)
print("Time for GPR prediction with standard-deviation: %.3f"
% (time.time() - stime))

# Plot results
plt.scatter(X, y, c='k', label='data')
plt.plot(X_plot, np.sin(X_plot), c='k', label='True')
plt.plot(X_plot, y_kr, c='g', label='KRR (%s)' % kr.best_params_)
plt.plot(X_plot, y_gpr, c='r', label='GPR (%s)' % gpr.kernel_)
plt.fill_between(X_plot[:, 0], y_gpr - y_std, y_gpr + y_std, color='r',
alpha=0.2)
plt.xlabel('data')
plt.ylabel('target')
plt.xlim(0, 20)
plt.title('GPR versus Kernel Ridge')
plt.legend(loc="best", prop={'size': 10})
plt.show()
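The docstring above points out that GPR, unlike KRR, also yields posterior samples. A minimal sketch of that (reusing gpr and X_plot from the example, and assuming the sample_y method introduced by this PR):

# Draw a few functions from the GP posterior on the plotting grid
y_samples = gpr.sample_y(X_plot, n_samples=5, random_state=0)
for i in range(y_samples.shape[1]):
    plt.plot(X_plot, y_samples[:, i], lw=1, alpha=0.5)
plt.show()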
100 changes: 100 additions & 0 deletions examples/gaussian_process/plot_gpc.py
@@ -0,0 +1,100 @@
"""
====================================================================
Probabilistic predictions with Gaussian process classification (GPC)
====================================================================

This example illustrates the predicted probability of GPC for an RBF kernel
with different choices of the hyperparameters. The first figure shows the
predicted probability of GPC with arbitrarily chosen hyperparameters and with
the hyperparameters corresponding to the maximum log-marginal-likelihood (LML).

While the hyperparameters chosen by optimizing the LML have a considerably larger
LML, they perform slightly worse according to the log-loss on test data. The
figure shows that this is because they exhibit a steep change of the class
probabilities at the class boundaries (which is good) but have predicted
probabilities close to 0.5 far away from the class boundaries (which is bad).
This undesirable effect is caused by the Laplace approximation used
internally by GPC.

The second figure shows the log-marginal-likelihood for different choices of
the kernel's hyperparameters, highlighting the two choices of the
hyperparameters used in the first figure by black dots.
"""
print(__doc__)

# Authors: Jan Hendrik Metzen <jhm@informatik.uni-bremen.de>
#
# License: BSD 3 clause

import numpy as np

from matplotlib import pyplot as plt

from sklearn.metrics.classification import accuracy_score, log_loss
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF


# Generate data
train_size = 50
rng = np.random.RandomState(0)
X = rng.uniform(0, 5, 100)[:, np.newaxis]
y = np.array(X[:, 0] > 2.5, dtype=int)

# Specify Gaussian Processes with fixed and optimized hyperparameters
gp_fix = GaussianProcessClassifier(kernel=1.0 * RBF(length_scale=1.0),
optimizer=None)
gp_fix.fit(X[:train_size], y[:train_size])

gp_opt = GaussianProcessClassifier(kernel=1.0 * RBF(length_scale=1.0))
gp_opt.fit(X[:train_size], y[:train_size])

print("Log Marginal Likelihood (initial): %.3f"
% gp_fix.log_marginal_likelihood(gp_fix.kernel_.theta))
print("Log Marginal Likelihood (optimized): %.3f"
% gp_opt.log_marginal_likelihood(gp_opt.kernel_.theta))

print("Accuracy: %.3f (initial) %.3f (optimized)"
% (accuracy_score(y[:train_size], gp_fix.predict(X[:train_size])),
accuracy_score(y[:train_size], gp_opt.predict(X[:train_size]))))
print("Log-loss: %.3f (initial) %.3f (optimized)"
% (log_loss(y[:train_size], gp_fix.predict_proba(X[:train_size])[:, 1]),
log_loss(y[:train_size], gp_opt.predict_proba(X[:train_size])[:, 1])))


# Plot posteriors
plt.figure(0)
plt.scatter(X[:train_size, 0], y[:train_size], c='k', label="Train data")
plt.scatter(X[train_size:, 0], y[train_size:], c='g', label="Test data")
X_ = np.linspace(0, 5, 100)
plt.plot(X_, gp_fix.predict_proba(X_[:, np.newaxis])[:, 1], 'r',
label="Initial kernel: %s" % gp_fix.kernel_)
plt.plot(X_, gp_opt.predict_proba(X_[:, np.newaxis])[:, 1], 'b',
label="Optimized kernel: %s" % gp_opt.kernel_)
plt.xlabel("Feature")
plt.ylabel("Class 1 probability")
plt.xlim(0, 5)
plt.ylim(-0.25, 1.5)
plt.legend(loc="best")

# Plot LML landscape
plt.figure(1)
theta0 = np.logspace(0, 8, 30)
theta1 = np.logspace(-1, 1, 29)
Theta0, Theta1 = np.meshgrid(theta0, theta1)
LML = [[gp_opt.log_marginal_likelihood(np.log([Theta0[i, j], Theta1[i, j]]))
for i in range(Theta0.shape[0])] for j in range(Theta0.shape[1])]
LML = np.array(LML).T
plt.plot(np.exp(gp_fix.kernel_.theta)[0], np.exp(gp_fix.kernel_.theta)[1],
'ko', zorder=10)
plt.plot(np.exp(gp_opt.kernel_.theta)[0], np.exp(gp_opt.kernel_.theta)[1],
'ko', zorder=10)
plt.pcolor(Theta0, Theta1, LML)
plt.xscale("log")
plt.yscale("log")
plt.colorbar(label="Log-marginal Likelihood")
plt.xlabel("Magnitude")
plt.ylabel("Length-scale")
plt.title("Log-marginal-likelihood")

plt.show()
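The LML landscape in the second figure is built by calling log_marginal_likelihood on a grid; a single evaluation looks as follows (a sketch reusing gp_opt and np from the example; theta is passed in log-space, matching kernel_.theta):

theta = np.log([10.0, 1.0])   # [magnitude, length-scale] in log-space
print("LML at a user-chosen theta: %.3f"
      % gp_opt.log_marginal_likelihood(theta))
print("LML at the fitted theta:    %.3f"
      % gp_opt.log_marginal_likelihood(gp_opt.kernel_.theta))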