BASICS OF CALCULUS OF VARIATIONS
MARKUS GRASMAIR
1. Brachistochrone problem
The classical problem in the calculus of variations is the so-called brachistochrone
problem, posed (and solved) by Bernoulli in 1696. Given two points A and B,
find the path along which an object slides (disregarding any friction) in the
shortest possible time from A to B, if it starts at A at rest and is accelerated
only by gravity (see Figure 1).
[Figure 1: the setup of the brachistochrone problem, with the starting point A at the origin, the x-axis horizontal, the y-axis pointing downwards, and the end point B.]
We place A at the origin and write B = (a, b) with a > 0, where the y-axis points
downwards. Moreover, we restrict ourselves to paths that can be written as the
graph of a function y : (0, a) → R, that is, paths of the form x ↦ (x, y(x)). This
excludes some candidates (for instance, all paths crossing some vertical line more
than once), but it is unlikely that we have excluded the actual optimum.2 Thus we
have reduced the problem of finding an optimal path from A to B to the problem
of finding an optimal function y on (0, a) satisfying certain boundary conditions.
Note also that the optimal solution will satisfy the inequality y(x) ≥ 0 for all x,
which we will always assume in the following.
The next (large) step will be the derivation of a formula for the travel time of
an object from A to B given the function y. We note first that the velocity v(x)
of the object at the point (x, y(x)) is already determined by y(x): Conservation of
energy implies that the sum of the kinetic and the potential energy of the object
always remains constant. Since the object is at rest at the point (0, 0), and the
difference in potential energy between the points (0, 0) and (x, y(x)) is equal to
mg y(x) (m being the mass of the object and g the gravitational acceleration), it
follows that
\[
\tfrac{1}{2}\, m v(x)^2 = m g\, y(x)
\]
or
\[
v(x) = \sqrt{2g\, y(x)}
\]
for all x.
Now we denote by s(x) the length of the path from A to (x, y(x)). Then
\[
s(x) = \int_0^x \sqrt{1 + y'(\hat{x})^2}\, d\hat{x},
\]
implying that
\[
\frac{ds}{dx} = \sqrt{1 + y'(x)^2}.
\]
Moreover, the length L of the whole path is given by
\[
L = \int_0^a \sqrt{1 + y'(x)^2}\, dx.
\]
Now we switch from the space variable x to the time variable t. By definition of
velocity, we have
\[
v(t) = \frac{ds}{dt}
\]
or
\[
\frac{dt}{ds} = \frac{1}{v(s)}.
\]
Therefore, if we denote by T the total travel time, we obtain (after some changes
of variables)
\[
T = \int_0^T dt = \int_0^L \frac{1}{v(s)}\, ds = \int_0^a \frac{\sqrt{1 + y'(x)^2}}{\sqrt{2g\, y(x)}}\, dx.
\]
Thus we can formulate the brachistochrone problem as the minimization of the
functional
\[
F(y) := \int_0^a \frac{\sqrt{1 + y'(x)^2}}{\sqrt{2g\, y(x)}}\, dx
\]
subject to the constraints y(0) = 0 and y(a) = b.
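Before deriving any optimality conditions, it can be instructive to evaluate this functional numerically for a few candidate curves. The following Python sketch does so; the helper travel_time, the use of scipy, and the particular endpoint B are illustrative choices of mine, and the cycloid comparison anticipates the solution discussed in Section 5.

import numpy as np
from scipy.integrate import quad

g = 9.81  # gravitational acceleration in m/s^2

def travel_time(y, dy, a):
    """Numerically evaluate F(y) = int_0^a sqrt((1 + y'^2) / (2 g y)) dx."""
    integrand = lambda x: np.sqrt((1.0 + dy(x)**2) / (2.0 * g * y(x)))
    # quad copes with the integrable singularity at x = 0, where y(0) = 0
    value, _ = quad(integrand, 0.0, a)
    return value

# Endpoint B = (a, b) chosen so that the optimal cycloid x = R (t - sin t),
# y = R (1 - cos t) with R = 1 reaches its lowest point at B: a = pi, b = 2.
a, b = np.pi, 2.0

# Straight line from A = (0, 0) to B = (a, b)
line_time = travel_time(lambda x: b / a * x, lambda x: b / a, a)

# Exact travel time along the cycloid: t_B * sqrt(R / g) with t_B = pi, R = 1
cycloid_time = np.pi * np.sqrt(1.0 / g)

print(f"straight line: {line_time:.3f} s, cycloid: {cycloid_time:.3f} s")

For this endpoint the straight line needs roughly 1.19 seconds, while the cycloid needs only about 1.00 seconds.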
2This reasoning is somewhat dangerous. In this particular case, it turns out that everything
we have done is justified, but there are more complicated problems where a similar restriction
does exclude the actual optimum.
2. Variations
The (first) difficulty we face in the computation of the minimizer of the func-
tional F defined above is that we are dealing with a functional defined on functions.
Thus the usual first order optimality condition (∇F (y) = 0) is not immediately ap-
plicable, because we do not have any notion of a gradient of a function of functions.3
Instead, we resort again to directional derivatives, which, in this context, are called
variations: the (first) variation of F at y in direction v is defined as
\[
\delta F(y; v) := \lim_{t \to 0} \frac{F(y + t v) - F(y)}{t} = \frac{d}{dt} F(y + t v)\Big|_{t=0}.
\]
By construction, the variation of F in direction v is nothing other than a directional
derivative. However, one always has to be aware of the fact that we are differentiating
the functional F in the direction of a function v, and that the functional F itself
depends on (spatial) derivatives of its argument. Thus, while formally everything is
the same as in the finite dimensional setting, the actual computation of a variation
can turn out to be somewhat involved.
Similarly to the finite dimensional case, we have the following first order necessary
condition: if y∗ is a (local) minimizer of F, then
\[
\delta F(y^*; v) = 0
\]
for all admissible directions v.
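This condition can be tested numerically by approximating the variation with a difference quotient in the direction v. The Python sketch below does this for the arc-length functional treated in Section 4 below, for which the straight line is the minimizer; the discretization and the names F_discrete and variation are illustrative choices of mine.

import numpy as np

a, b, n = 1.0, 2.0, 2001
x = np.linspace(0.0, a, n)

def F_discrete(y):
    """Trapezoidal approximation of F(y) = int_0^a sqrt(1 + y'(x)^2) dx."""
    dy = np.gradient(y, x)
    return np.trapz(np.sqrt(1.0 + dy**2), x)

def variation(F, y, v, t=1e-6):
    """Central difference approximation of dF(y; v) = d/dt F(y + t v) at t = 0."""
    return (F(y + t * v) - F(y - t * v)) / (2.0 * t)

y_star = b / a * x                        # candidate minimizer: the straight line
v = np.sin(np.pi * x / a)                 # direction with v(0) = v(a) = 0

print(variation(F_discrete, y_star, v))             # approximately 0
print(variation(F_discrete, y_star + 0.1 * v, v))   # clearly nonzero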
3. Euler–Lagrange equations
In the following we assume that the functional F has the specific form
\[
F(y) = \int_0^a f\bigl(x, y(x), y'(x)\bigr)\, dx
\]
for some sufficiently smooth function f. Computing the variation δF(y; v) for
directions v with v(0) = v(a) = 0 and applying the fundamental lemma of the calculus
of variations (Lemma 3), one finds that every sufficiently smooth minimizer y of F
subject to boundary conditions at 0 and a satisfies the Euler–Lagrange equation
\[
\frac{\partial}{\partial y} f\bigl(x, y(x), y'(x)\bigr) - \frac{d}{dx}\, \frac{\partial}{\partial y'} f\bigl(x, y(x), y'(x)\bigr) = 0 \qquad \text{for all } x \in (0, a).
\]
3It is possible to define the gradient in infinite dimensional settings, but this requires that the
space on which one is working is equipped with an inner product that is (in some sense) compatible
with the functional F . Some of the necessary basics are taught in the course on functional analysis.
4. Shortest paths
We first use this result to prove that a straight line is the shortest connection
between two points. Given two points A = (0, 0) and B = (a, b) with a > 0, the
length of a path of the form x ↦ (x, y(x)) from A to B is given by
\[
F(y) = \int_0^a \sqrt{1 + y'(x)^2}\, dx.
\]
In order to derive the Euler–Lagrange equation for this functional, we denote
\[
f(x, y, y') = \sqrt{1 + y'^2}.
\]
Then
\[
\frac{\partial}{\partial y} f(x, y, y') = 0
\]
and
\[
\frac{\partial}{\partial y'} f(x, y, y') = \frac{y'}{\sqrt{1 + y'^2}}.
\]
Thus we obtain the equation
\[
0 = \frac{d}{dx}\, \frac{y'(x)}{\sqrt{1 + y'(x)^2}}.
\]
In other words, there exists a constant C ∈ R such that
\[
\frac{y'(x)}{\sqrt{1 + y'(x)^2}} = C
\]
for all x. (Note that, actually, we have −1 < C < 1.) Solving this equation for y'^2
implies that
\[
y'(x)^2 = \frac{C^2}{1 - C^2}
\]
for all x. Thus it follows that y' is constant. From the boundary conditions y(0) = 0
and y(a) = b we now derive easily that
\[
y(x) = \frac{b}{a}\, x.
\]
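This hand computation can also be checked symbolically. The short sketch below uses the euler_equations helper from sympy (the use of sympy is my own addition, not part of the notes) to derive the Euler–Lagrange equation for the arc-length integrand; the resulting equation reduces to y''(x) = 0, whose solutions are exactly the affine functions.

import sympy as sp
from sympy.calculus.euler import euler_equations

x = sp.Symbol('x')
y = sp.Function('y')

# Arc-length integrand f(x, y, y') = sqrt(1 + y'^2)
f = sp.sqrt(1 + y(x).diff(x)**2)

# euler_equations returns the Euler-Lagrange equation(s) as sympy Eq objects
eq = euler_equations(f, y(x), x)[0]
print(sp.simplify(eq))   # equivalent to y''(x) = 0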
5. Brachistochrone problem — solution
We recall that the brachistochrone problem was defined by the function f given by
\[
f(x, y, y') = \sqrt{\frac{1 + y'^2}{2g\, y}}.
\]
Thus
\[
\frac{\partial}{\partial y} f(x, y, y') = -\frac{1}{2} \sqrt{\frac{1 + y'^2}{2g}}\; \frac{1}{y^{3/2}}
\]
and
\[
\frac{\partial}{\partial y'} f(x, y, y') = \frac{1}{\sqrt{2g\, y}}\; \frac{y'}{\sqrt{1 + y'^2}}.
\]
Multiplying everything by the constant factor √(2g), we thus obtain the equation
\[
-\frac{1}{2} \sqrt{\frac{1 + y'^2}{y^3}} = \frac{d}{dx}\, \frac{y'}{\sqrt{y\,(1 + y'^2)}}.
\]
The solutions of this differential equation turn out to be cycloids, which can be
written in parametrized form as x(t) = C (t − sin t), y(t) = C (1 − cos t) with a
constant C > 0 determined by the boundary conditions.
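Since f does not depend explicitly on x, the Euler–Lagrange equation moreover admits the first integral f − y' ∂f/∂y' = const (the Beltrami identity, which is not derived in these notes); for our f this reduces to y (1 + y'^2) = const. The following sympy sketch, again my own illustration, confirms that the cycloid parametrization satisfies this first integral.

import sympy as sp

t, C = sp.symbols('t C', positive=True)

# Cycloid: x = C (t - sin t),  y = C (1 - cos t)
x = C * (t - sp.sin(t))
y = C * (1 - sp.cos(t))

yp = y.diff(t) / x.diff(t)            # y'(x) via the chain rule
print(sp.simplify(y * (1 + yp**2)))   # prints 2*C: constant, as claimed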
6. Natural boundary conditions
We now consider the problem of minimizing a functional of the form
\[
F(y) = \int_p^q f\bigl(x, y(x), y'(x)\bigr)\, dx
\]
over all functions y : (p, q) → R. Again, the first order necessary condition in this
case reads as δF(y; v) = 0 for all v. Since we do not have any constraints on the
functions we consider anymore, this equation now has to hold for all functions
v, and not only for those that satisfy v(p) = v(q) = 0. Thus, if we calculate the
variation and perform an integration by parts as we did before, we obtain additional
boundary terms: the variation reads as
\[
(2) \qquad \delta F(y; v) = \int_p^q \Bigl(\frac{\partial}{\partial y} f\bigl(x, y(x), y'(x)\bigr) - \frac{d}{dx}\, \frac{\partial}{\partial y'} f\bigl(x, y(x), y'(x)\bigr)\Bigr) v(x)\, dx
+ v(q)\, \frac{\partial}{\partial y'} f\bigl(q, y(q), y'(q)\bigr) - v(p)\, \frac{\partial}{\partial y'} f\bigl(p, y(p), y'(p)\bigr).
\]
Since this term has in particular to be zero for all functions v that satisfy the
constraint v(p) = 0 = v(q), we may still apply Lemma 3, which implies that
the Euler–Lagrange equations still hold. In other words, the integral term in (2)
vanishes at a minimizer y. Using this, we see that
\[
v(q)\, \frac{\partial}{\partial y'} f\bigl(q, y(q), y'(q)\bigr) - v(p)\, \frac{\partial}{\partial y'} f\bigl(p, y(p), y'(p)\bigr) = 0
\]
for all functions v. This immediately implies the equations
\[
\frac{\partial}{\partial y} f\bigl(x, y(x), y'(x)\bigr) - \frac{d}{dx}\, \frac{\partial}{\partial y'} f\bigl(x, y(x), y'(x)\bigr) = 0 \qquad \text{for } x \in (p, q),
\]
\[
\frac{\partial}{\partial y'} f\bigl(p, y(p), y'(p)\bigr) = 0,
\]
\[
\frac{\partial}{\partial y'} f\bigl(q, y(q), y'(q)\bigr) = 0.
\]
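The last two conditions are usually called natural boundary conditions: they are not imposed on the problem, but are satisfied automatically by every minimizer. As a numerical illustration, one can minimize F(y) = ∫_0^1 (y'(x)^2 + (y(x) − sin(2πx))^2) dx over all functions y; here ∂f/∂y' = 2y', so a minimizer should have vanishing slope at both endpoints. The functional, the discretization, and all names in the Python sketch below are my own choices.

import numpy as np
from scipy.optimize import minimize

n = 200
x = np.linspace(0.0, 1.0, n)
h = x[1] - x[0]
target = np.sin(2.0 * np.pi * x)

def F(y):
    """Discretization of int_0^1 (y'^2 + (y - sin(2 pi x))^2) dx."""
    dy = np.diff(y) / h                  # forward differences for y'
    return h * np.sum(dy**2) + h * np.sum((y - target)**2)

# No boundary conditions are imposed: we minimize over all nodal values
res = minimize(F, np.zeros(n), method="L-BFGS-B")
y = res.x

# Both endpoint slopes come out close to zero, as the natural
# boundary conditions predict
print((y[1] - y[0]) / h, (y[-1] - y[-2]) / h)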
7. Additional remarks
The Euler–Lagrange equations suffer (in the context of optimization problems)
from the same shortcomings as the equation ∇F = 0 in the finite dimensional
setting. Unless the functional F is convex (which is, for instance, the case if the
function (y, y') ↦ f(x, y, y') is convex for all x), the equations are usually not
sufficient conditions for a local minimum.5 Sufficient conditions can be obtained
from the so-called second variation of F, which is a generalization of the Hessian
to functionals.
Another, even larger, problem is the existence of minimizers. In the finite dimen-
sional setting, we have seen that the existence of a minimum is implied by the lower
semi-continuity and the coercivity of the functional, the latter property meaning
that the functional tends to infinity as its argument tends to infinity. A natural
generalization of these conditions to the variational setting we are considering here
is to require the integrand f for every x to be lower semi-continuous and coercive
in the variables y and y'. However, it turns out that these assumptions are not
sufficient for guaranteeing the existence of a minimizer. It is possible to construct
(fairly simple) examples with continuous and coercive integrand that do not admit
a minimum. A classical one is the Bolza problem of minimizing
\[
\int_0^1 \bigl((y'(x)^2 - 1)^2 + y(x)^2\bigr)\, dx
\]
subject to y(0) = y(1) = 0: sawtooth functions with slopes ±1 push the value of the
functional arbitrarily close to zero, but no function attains this infimum.
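A numerical experiment illustrates this behaviour. In the Python sketch below (the grid and the sawtooth profiles are my own choices), the value of the Bolza functional decreases towards the unattained infimum 0 as the sawtooth becomes finer.

import numpy as np

n = 6401                                  # grid chosen so that all kinks are nodes
x = np.linspace(0.0, 1.0, n)
h = x[1] - x[0]

def bolza(y):
    """Discretization of int_0^1 ((y'^2 - 1)^2 + y^2) dx."""
    dy = np.diff(y) / h                   # slopes are exactly +-1 for the sawtooths
    mid = 0.5 * (y[:-1] + y[1:])          # y at the interval midpoints
    return h * np.sum((dy**2 - 1.0)**2 + mid**2)

for teeth in [1, 4, 16, 64]:
    period = 1.0 / teeth                  # triangle wave with y(0) = y(1) = 0
    y = period / 2.0 - np.abs((x % period) - period / 2.0)
    print(teeth, bolza(y))                # roughly period^2 / 12, tending to 0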
Finally, the derivation of the Euler–Lagrange equations is based on the assump-
tion of sufficient regularity of both the integrand f and the (unknown) solution y of
the variational problem. That the non-differentiability of the integrand will lead to
problems is not unexpected, as our calculation of the first order necessary condition
is based on the computation of derivatives of f . However, consider the problem of
solving
\[
F(y) = \int_0^1 \bigl(x - y(x)^3\bigr)^2\, y'(x)^6\, dx \to \min
\]
subject to the constraints y(0) = 0 and y(1) = 1. Setting y(x) = x^{1/3}, the integrand
becomes zero, which implies that F(y) = 0. The only other functions for which the
integrand becomes zero are constant functions, but these do not satisfy the boundary
conditions. Thus the minimizer of this functional is the function y(x) = x^{1/3}, which
is non-differentiable at x = 0. For this particular problem, the difficulty only occurs
at the boundary point x = 0, and the Euler–Lagrange equation is satisfied in the
interior of the interval (0, 1), but one can consider the same functional on the interval
(−1/8, 1) instead, with boundary conditions y(−1/8) = 1/2 and y(1) = 1. Then one
can argue that the function y(x) = |x|^{1/3} should still be the minimizer of the
functional, although y is non-differentiable at x = 0. Indeed, one can generalize the
notion of differentiability of a function in such a way that this holds true. More
details can be found in courses on partial differential equations (including their
numerical solution with finite elements), courses on functional analysis, and optimal
control.
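For the example on (0, 1) above, the claimed values are easy to confirm numerically. The Python sketch below (the helper name F is mine) evaluates the functional at y(x) = x^{1/3} and at the smooth competitor y(x) = x.

from scipy.integrate import quad

def F(y, dy):
    """Evaluate F(y) = int_0^1 (x - y(x)^3)^2 y'(x)^6 dx."""
    integrand = lambda x: (x - y(x)**3)**2 * dy(x)**6
    value, _ = quad(integrand, 0.0, 1.0)
    return value

print(F(lambda x: x**(1.0 / 3.0), lambda x: x**(-2.0 / 3.0) / 3.0))  # 0 up to roundoff
print(F(lambda x: x, lambda x: 1.0))                                 # 8/105 > 0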
Department of Mathematics, Norwegian University of Science and Technology, 7491
Trondheim, Norway
E-mail address: markus.grasmair@math.ntnu.no
5In order to talk about local minima for variational problems, one actually has to define a
notion of closeness of functions. One possibility (though by no means the only one) is to base
closeness on the supremum norm defined by ‖y − z‖∞ = sup_{x∈(p,q)} |y(x) − z(x)|.