Flexible Regression - Lecture 6
Marnie McLean
Room 344 Mathematics and Statistics Building,
marnie.mclean@glasgow.ac.uk
October 2018
Chapter 2 part 1- Smoothing Methods
Smoothing methods:
Used for description and/or estimation. Examples are:
loess
local linear regression
smoothing splines
regression splines
For now we are only interested in how the smoothing is done.
The selection of smoothing parameters will be considered in the
next section.
Introduction Smoothing Splines Basis functions 2/27
2.5 Splines
Splines
Represent the fit of f(x) as a piecewise polynomial.
Spline functions consist of polynomial segments which are
joined together smoothly at the boundaries of pre-defined
subintervals.
The points at which the joins occur are called breakpoints, or
knots, of the spline.
Figure: Spline with 6 interior knots
The function (and at least its first derivative) is constrained to
be continuous everywhere.
The aim of spline smoothing is to fit a smooth, flexible function
which minimizes the residual sum of squares.
Degree of the polynomial
Degree = 0 - a discontinuous step function.
Degree = 1 - a chain of line segments.
For larger values of the degree the spline is increasingly smooth.
A cubic spline (degree = 3) is a common choice in many
situations for practical and computational reasons.
Number and position of knots
In an (unpenalised) spline the number of knots determines the
level of smoothing.
The more knots used, the more flexible the regression function
can become.
The positioning of the knots can be important, especially when
the number of knots is small.
One approach is to use “too many” knots (one knot per observation
in the most extreme case) and a penalty term to control the
smoothness (section 2.5.1).
2.5.1 Smoothing Splines
For data $(x_i, y_i)$ and the model
\[
Y_i = f(x_i) + \varepsilon_i, \quad i = 1, \dots, n,
\]
the solution $\hat{f}(x_i) = y_i$ minimises $\sum_{i=1}^{n} (y_i - f(x_i))^2$.
This is a regression which interpolates the points.
Therefore, to avoid this, a second term is added to the
expression to give
\[
\sum_{i=1}^{n} (y_i - f(x_i))^2 + \lambda \int_a^b [f''(x)]^2 \, dx,
\]
where $\lambda$ is a fixed constant,
$a \le x_1 \le \dots \le x_n \le b$,
and we choose $\hat{f}$ to minimise this modified least squares
criterion.
The term $\lambda \int_a^b [f''(x)]^2 \, dx$ is referred to as a roughness penalty.
$\sum_{i=1}^{n} (y_i - f(x_i))^2$ measures closeness to the data.
$\lambda \int_a^b [f''(x)]^2 \, dx$ penalizes curvature in the function.
Other choices of roughness penalty can be considered with
penalties on higher order derivatives.
The function f̂ that minimises the penalised criterion is the
smoothing spline fit.
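To make the two terms of the criterion concrete, the following minimal sketch evaluates the residual sum of squares and the roughness penalty for two candidate functions whose second derivatives are known. The function names (`roughness_penalty`, `penalised_criterion`), the toy data and the trapezoidal approximation of the integral are assumptions made for illustration; they are not part of the lecture, and the sketch does not compute the minimiser itself.

```python
import math

def roughness_penalty(f2, a, b, m=2000):
    """Approximate the integral of [f''(x)]^2 over [a, b] by the trapezoidal rule."""
    h = (b - a) / m
    vals = [f2(a + i * h) ** 2 for i in range(m + 1)]
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

def penalised_criterion(f, f2, x, y, lam, a, b):
    """Residual sum of squares plus lam times the roughness penalty."""
    rss = sum((yi - f(xi)) ** 2 for xi, yi in zip(x, y))
    return rss + lam * roughness_penalty(f2, a, b)

# Toy data on [0, 1] (hypothetical values, chosen only for illustration).
x = [0.0, 0.25, 0.5, 0.75, 1.0]
y = [0.1, 0.9, 0.2, -0.8, 0.0]

# Candidate 1: a straight line, which has zero curvature and so zero penalty.
line = lambda t: 0.3 - 0.4 * t
line2 = lambda t: 0.0

# Candidate 2: a sine curve, which follows the data closely but is curved.
wave = lambda t: math.sin(2 * math.pi * t)
wave2 = lambda t: -(2 * math.pi) ** 2 * math.sin(2 * math.pi * t)

for lam in (0.0, 0.001, 1.0):
    c_line = penalised_criterion(line, line2, x, y, lam, 0.0, 1.0)
    c_wave = penalised_criterion(wave, wave2, x, y, lam, 0.0, 1.0)
    print(lam, round(c_line, 3), round(c_wave, 3))
```

With λ = 0 the sine curve gives the smaller criterion value because it is closer to the data. With λ = 1 its curvature penalty dominates and the straight line is preferred, which is the trade-off that λ controls.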
The constant λ > 0 is the smoothing parameter.
Increasing λ penalises fluctuations, and so produces a smoother
curve.
Hence, λ here plays a similar role to the standard deviation and
span used in earlier sections.
The knots are the observed unique x values and λ is used to
control the smoothing.
For this choice of roughness penalty, $\hat{f}$ is a natural cubic
spline.
This means that $\hat{f}$ is a cubic polynomial on each
interval $(x_i, x_{i+1})$ (for ordered $x_i$).
The functions $\hat{f}$, $\hat{f}'$ and $\hat{f}''$ are continuous.
Cubic smoothing splines are among the most commonly used.
Natural cubic splines
The second and third derivatives of f are both equal to zero at
the start and end points a and b.
This implies that the function is linear beyond the boundary
knots.
Cubic smoothing splines can be fitted using the
smooth.spline function in R.
Figure: Simulated data with a smoothing spline. The smoothing
parameter has been chosen automatically; the dashed line is the
underlying true curve.
Figure: Radiocarbon data with a smoothing spline. The smoothing
parameter has been chosen automatically.
What choices have to be made?
The main choice to be made is the size of the smoothing
parameter λ for the roughness penalty.
Drawbacks
Since there is a knot at every unique x value, there are as many
parameters as there are observations.
Fitting this many parameters can become computationally
inefficient, particularly if there are multiple covariates.
2.5 Splines
(Penalised) regression splines are often used instead, since they
keep the flexibility of splines while remaining computationally
efficient.
2.5.2 Basis functions
Regression splines are underpinned by a set of known functions
called basis functions.
Basis functions are another common way to build a smooth
function.
Smooth functions can be approximated using weighted sums of
the individual basis functions.
The choice of basis system is often dependent on the data (or
type of variable) to which the smooth functions are to be fitted.
For the general model
\[
Y_i = f(x_i) + \varepsilon_i, \quad i = 1, \dots, n,
\]
a curve estimate can be produced by fitting the regression
\[
Y_i = \beta_0 b_0(x_i) + \beta_1 b_1(x_i) + \beta_2 b_2(x_i) + \dots + \beta_p b_p(x_i) + \varepsilon_i,
\]
where the $b_j$ are referred to as basis functions.
Therefore,
\[
f(x) = \sum_{j=0}^{p} \beta_j b_j(x).
\]
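The weighted sum of basis functions can be sketched in a few lines. The helper name `basis_expansion`, the quadratic basis and the coefficient values below are hypothetical and chosen only to illustrate the formula.

```python
def basis_expansion(basis, beta):
    """Return the function f with f(x) = sum_j beta[j] * basis[j](x)."""
    if len(basis) != len(beta):
        raise ValueError("need one coefficient per basis function")

    def f(x):
        return sum(b_j * fn(x) for b_j, fn in zip(beta, basis))

    return f

# Hypothetical basis b_0(x) = 1, b_1(x) = x, b_2(x) = x^2 with illustrative weights.
basis = [lambda x: 1.0, lambda x: x, lambda x: x ** 2]
beta = [1.0, -2.0, 0.5]

f = basis_expansion(basis, beta)
print(f(2.0))  # 1 - 2*2 + 0.5*4 = -1.0
```

Changing the list `basis` while keeping the same code is what makes the approach general: the regression stays linear in the coefficients whatever the basis functions are.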
Examples of basis expansions for simple models:
Simple linear regression
\[
Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i,
\]
\[
\text{design matrix} = X = B(x) = B =
\begin{pmatrix}
1 & x_1 \\
1 & x_2 \\
\vdots & \vdots \\
1 & x_n
\end{pmatrix}
\]
basis functions are $b_0(x) = 1$, $b_1(x) = x$
More generally, for a polynomial of degree p:
\[
X = B(x) = B =
\begin{pmatrix}
1 & x_1 & \cdots & x_1^p \\
1 & x_2 & \cdots & x_2^p \\
\vdots & \vdots & \ddots & \vdots \\
1 & x_n & \cdots & x_n^p
\end{pmatrix}
\]
basis functions are $b_0(x) = 1$, $b_1(x) = x$, \dots, $b_p(x) = x^p$
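As a small illustration of the polynomial design matrix, the sketch below builds one row per observation and one column per basis function. The function name `polynomial_design_matrix` and the covariate values are assumptions made for this example.

```python
def polynomial_design_matrix(x, p):
    """Rows are (1, x_i, x_i^2, ..., x_i^p), one row per observation."""
    return [[xi ** j for j in range(p + 1)] for xi in x]

x = [0.0, 0.5, 1.0, 2.0]  # hypothetical covariate values
B = polynomial_design_matrix(x, p=2)
for row in B:
    print(row)
```

Setting `p=1` gives the design matrix of simple linear regression, with columns 1 and x.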
Figure: A simple linear regression example.
Figure: The basis functions for simple linear regression: 1, x.
Truncated power basis (Ruppert et al., 2003)
The following is a simple type of polynomial spline of degree p:
\[
\beta_0 + \beta_1 x + \dots + \beta_p x^p + \sum_{k=1}^{K} \beta_{pk} (x - \kappa_k)_+^p,
\]
where $p \ge 1$, $\kappa_1, \dots, \kappa_K$ are the knots and
$(u)_+^p = (\max\{u, 0\})^p$.
The model is constructed as a linear combination of basis
functions: $1, x, \dots, x^p, (x - \kappa_1)_+^p, \dots, (x - \kappa_K)_+^p$.
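The truncated power basis can be evaluated directly from its definition. In the sketch below the function names and the knot positions are hypothetical; the truncation is applied before raising to the power p, so that negative arguments give zero for every degree, including even p.

```python
def truncated_power(u, p):
    """(u)_+^p: u^p when u is positive, otherwise 0."""
    return u ** p if u > 0 else 0.0

def truncated_power_basis(x, knots, p):
    """Evaluate 1, x, ..., x^p, (x - k_1)_+^p, ..., (x - k_K)_+^p at a single x."""
    polynomial_part = [x ** j for j in range(p + 1)]
    truncated_part = [truncated_power(x - k, p) for k in knots]
    return polynomial_part + truncated_part

knots = [0.25, 0.5, 0.75]  # hypothetical knot positions
row = truncated_power_basis(0.6, knots, p=3)
print(len(row))  # p + 1 + K = 7 basis functions
print(row)
```

At x = 0.6 the last entry is zero because 0.6 lies to the left of the knot at 0.75; each truncated term only contributes to the right of its own knot.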
A simple case would be a linear spline basis with one knot
at x = 0.5, e.g.
\[
Y_i = \beta_0 + \beta_1 x_i + \beta_{11} (x_i - 0.5)_+ + \varepsilon_i, \quad i = 1, \dots, n,
\]
where $(x_i - 0.5)_+$ is the positive part of the function $x - 0.5$.
The + sets this function to zero for those values of x where
x − 0.5 is negative.
The basis functions are:
$1, \; x, \; (x - 0.5)_+$
This simple example has one knot at 0.5.
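The corresponding design matrix is
\[
X = B(x) = B =
\begin{pmatrix}
1 & x_1 & (x_1 - 0.5)_+ \\
1 & x_2 & (x_2 - 0.5)_+ \\
\vdots & \vdots & \vdots \\
1 & x_n & (x_n - 0.5)_+
\end{pmatrix}
\]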
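Once the design matrix is built, the linear spline is fitted by ordinary least squares like any other linear model. The following is a minimal sketch using only the Python standard library; the helper names, the toy data and the coefficient values are hypothetical, and solving the normal equations by Gaussian elimination is a choice made to keep the example self-contained rather than a recommended numerical method.

```python
def design_matrix(x, knot=0.5):
    """Rows (1, x_i, (x_i - knot)_+) of the linear spline basis."""
    return [[1.0, xi, max(xi - knot, 0.0)] for xi in x]

def solve(A, b):
    """Solve the square system A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - s) / M[r][r]
    return z

def least_squares(B, y):
    """Ordinary least squares via the normal equations (B'B) beta = B'y."""
    p = len(B[0])
    BtB = [[sum(row[j] * row[k] for row in B) for k in range(p)] for j in range(p)]
    Bty = [sum(row[j] * yi for row, yi in zip(B, y)) for j in range(p)]
    return solve(BtB, Bty)

# Noise-free toy data generated from a known linear spline (hypothetical
# coefficients), so the fit should recover them up to rounding error.
true_beta = [1.0, 2.0, -3.0]
x = [i / 10 for i in range(11)]
B = design_matrix(x)
y = [sum(b * v for b, v in zip(true_beta, row)) for row in B]

beta_hat = least_squares(B, y)
print([round(b, 6) for b in beta_hat])  # close to true_beta
```

The third coefficient is the change in slope at the knot: the fitted line has slope β1 to the left of 0.5 and slope β1 + β11 to the right.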
Figure: Basis functions for linear spline: 1, x, (x − 0.5)+
Figure: Linear spline fitted to simulated data