@@ -125,38 +125,59 @@ the kernel is known as the Gaussian kernel of variance :math:`\sigma^2`.
Matérn kernel
-------------
- The function :func:`matern_kernel` is a generalization of the RBF kernel. It
- has an additional parameter :math:`\nu` which controls the smoothness of the
- resulting function. The general functional form of a Matérn is given by:
+ The function :func:`matern_kernel` is a generalization of the RBF kernel. It has
+ an additional parameter :math:`\nu` (set via the keyword ``coef0``) which controls
+ the smoothness of the resulting function. The general functional form of the
+ Matérn kernel is given by:

.. math::

-    k(d) = \sigma^2 \frac{1}{\Gamma(\nu) 2^{\nu-1}} \Bigg(\sqrt{2\nu} \frac{d}{\rho}\Bigg)^\nu K_\nu \Bigg(\sqrt{2\nu} \frac{d}{\rho}\Bigg),
+    k(d) = \sigma^2 \frac{1}{\Gamma(\nu) 2^{\nu-1}} \Bigg(\gamma \sqrt{2\nu} d\Bigg)^\nu K_\nu \Bigg(\gamma \sqrt{2\nu} d\Bigg),

- where :math:`d = \|x-y\|^2` and ``x`` and ``y`` are the input vectors.
+ where :math:`d = \|x-y\|` and ``x`` and ``y`` are the input vectors.
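For concreteness, the general form can be evaluated directly with SciPy's
modified Bessel function of the second kind. The helper below is a hypothetical
illustration of the formula only; it is not the ``matern_kernel`` function
added by this patch::

    import numpy as np
    from scipy.special import gamma, kv  # kv is the modified Bessel function K_nu

    def matern_general(d, nu=1.5, gam=1.0, sigma2=1.0):
        # k(d) = sigma^2 * 2^(1 - nu) / Gamma(nu)
        #        * (gam * sqrt(2 nu) * d)^nu * K_nu(gam * sqrt(2 nu) * d)
        d = np.asarray(d, dtype=float)
        arg = gam * np.sqrt(2.0 * nu) * d
        with np.errstate(divide="ignore", invalid="ignore"):
            k = sigma2 * (2.0 ** (1.0 - nu) / gamma(nu)) * arg ** nu * kv(nu, arg)
        # K_nu diverges at 0, but k(d) -> sigma^2 in the limit d -> 0.
        return np.where(d == 0.0, sigma2, k)

    print(matern_general([0.0, 0.5, 1.0, 2.0], nu=1.5))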

As :math:`\nu \rightarrow \infty`, the Matérn kernel converges to the RBF kernel.
When :math:`\nu = 1/2`, the Matérn kernel becomes identical to the absolute
exponential kernel, i.e.,

.. math::

-    k(d) = \sigma^2 \exp \Bigg(-\frac{d}{\rho}\Bigg) \quad \quad \nu = \tfrac{1}{2}
+    k(d) = \sigma^2 \exp \Bigg(-\gamma d\Bigg) \quad \quad \nu = \tfrac{1}{2}

- See Rasmussen and Williams 2006, pp84 for further details regarding the
- different variants of the Matérn kernel. In particular, :math:`\nu = 3/2`:
+ In particular, :math:`\nu = 3/2`:

.. math::

-    k(d) = \sigma^2 \Bigg(1 + \frac{\sqrt{3} d}{\rho}\Bigg) \exp \Bigg(-\frac{\sqrt{3} d}{\rho}\Bigg) \quad \quad \nu = \tfrac{3}{2}
+    k(d) = \sigma^2 \Bigg(1 + \gamma \sqrt{3} d\Bigg) \exp \Bigg(-\gamma \sqrt{3} d\Bigg) \quad \quad \nu = \tfrac{3}{2}

and :math:`\nu = 5/2`:

.. math::

-    k(d) = \sigma^2 \Bigg(1 + \frac{\sqrt{5} d}{\rho} + \frac{5 d^2}{3 \rho^2}\Bigg) \exp \Bigg(-\frac{\sqrt{5} d}{\rho}\Bigg) \quad \quad \nu = \tfrac{5}{2}.
+    k(d) = \sigma^2 \Bigg(1 + \gamma \sqrt{5} d + \frac{5}{3} \gamma^2 d^2\Bigg) \exp \Bigg(-\gamma \sqrt{5} d\Bigg) \quad \quad \nu = \tfrac{5}{2}

are popular choices for learning functions that are not infinitely
differentiable (as assumed by the RBF kernel) but at least once (:math:`\nu =
3/2`) or twice differentiable (:math:`\nu = 5/2`).
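The closed forms above can be checked numerically against the general
expression; a minimal sketch, assuming only NumPy and SciPy::

    import numpy as np
    from scipy.special import gamma, kv

    d, gam = np.linspace(0.1, 3.0, 30), 1.3

    def general(nu):
        arg = gam * np.sqrt(2.0 * nu) * d
        return (2.0 ** (1.0 - nu) / gamma(nu)) * arg ** nu * kv(nu, arg)

    # nu = 1/2: absolute exponential kernel
    assert np.allclose(general(0.5), np.exp(-gam * d))
    # nu = 3/2: once-differentiable sample paths
    assert np.allclose(general(1.5), (1 + gam * np.sqrt(3) * d)
                       * np.exp(-gam * np.sqrt(3) * d))
    # nu = 5/2: twice-differentiable sample paths
    assert np.allclose(general(2.5),
                       (1 + gam * np.sqrt(5) * d + 5.0 / 3.0 * (gam * d) ** 2)
                       * np.exp(-gam * np.sqrt(5) * d))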

+ The following example illustrates how the Matérn kernel's covariance decreases
+ with increasing dissimilarity of the two inputs for different values of
+ ``coef0`` (the parameter :math:`\nu` of the Matérn kernel):
+
+ .. figure:: ../auto_examples/metrics/images/plot_matern_kernel_001.png
+    :target: ../auto_examples/metrics/plot_matern_kernel.html
+    :align: center
+
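A rough sketch in the spirit of this figure (not the actual example script)
plots the covariance against the distance for a few values of :math:`\nu`::

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.special import gamma, kv

    d = np.linspace(1e-3, 4.0, 200)
    for nu in [0.5, 1.5, 2.5, 10.0]:
        arg = np.sqrt(2.0 * nu) * d  # inverse length-scale gamma fixed to 1
        k = (2.0 ** (1.0 - nu) / gamma(nu)) * arg ** nu * kv(nu, arg)
        plt.plot(d, k, label=r"$\nu = %g$" % nu)
    plt.xlabel("distance d between the two inputs")
    plt.ylabel("covariance k(d)")
    plt.legend()
    plt.show()

Smaller values of :math:`\nu` make the covariance drop more sharply near
:math:`d = 0`, corresponding to rougher functions.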

+ The flexibility of controlling the smoothness of the learned function via
+ ``coef0`` allows adapting to the properties of the true underlying functional
+ relation. The following example shows that support vector regression with a
+ Matérn kernel with smaller values of ``coef0`` can better approximate a
+ discontinuous step function:
+
+ .. figure:: ../auto_examples/svm/images/plot_svm_matern_kernel_001.png
+    :target: ../auto_examples/svm/plot_svm_matern_kernel.html
+    :align: center
+
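As a minimal sketch of the idea behind this example (not the example script
itself), support vector regression can consume a Matérn Gram matrix through
the ``precomputed`` kernel option; here with the :math:`\nu = 1/2` closed
form, since the hypothetical helper below only stands in for ``matern_kernel``::

    import numpy as np
    from scipy.spatial.distance import cdist
    from sklearn.svm import SVR

    rng = np.random.RandomState(0)
    gam = 3.0

    def matern_half_gram(X, Y):
        # nu = 1/2 Matérn: k(d) = exp(-gamma * d) with d = ||x - y||
        return np.exp(-gam * cdist(X, Y))

    # Noisy samples of a discontinuous step function
    X_train = rng.uniform(-1.0, 1.0, size=(100, 1))
    y_train = (X_train[:, 0] > 0.0).astype(float) + 0.05 * rng.randn(100)
    X_test = np.linspace(-1.0, 1.0, 200)[:, np.newaxis]

    svr = SVR(kernel="precomputed", C=10.0)
    svr.fit(matern_half_gram(X_train, X_train), y_train)     # (n_train, n_train)
    y_pred = svr.predict(matern_half_gram(X_test, X_train))  # (n_test, n_train)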

+ See Rasmussen and Williams (2006), p. 84, for further details regarding the
+ different variants of the Matérn kernel.
+

Chi-squared kernel
------------------
@@ -207,3 +228,8 @@ The chi squared kernel is most commonly used on histograms (bags) of visual word
International Journal of Computer Vision 2007
http://eprints.pascal-network.org/archive/00002309/01/Zhang06-IJCV.pdf

+ * Rasmussen, C. E. and Williams, C. K. I.
+   Gaussian Processes for Machine Learning
+   The MIT Press, 2006
+   http://www.gaussianprocess.org/gpml/chapters/