Numerical Methods For Unconstrained Optimum Design

The document discusses various methods for unconstrained nonlinear optimization, categorizing them into gradient-based, direct search, and nature-inspired methods. It elaborates on gradient-based methods, including first-order and second-order techniques, and emphasizes the importance of descent direction and step size determination in the optimization process. Additionally, it covers numerical methods for step size calculation, such as the golden section method, and compares the steepest-descent and conjugate gradient methods for finding optimal solutions.


Unconstrained Nonlinear Optimization Problems
Gradient-Based and Direct Search Methods
✓ Gradient-Based Search Methods: Use the gradient of the objective function to perform the search for the optimum point. Functions are assumed to be smooth and at least twice continuously differentiable everywhere in the feasible design space. Design variables are also assumed to be continuous. These methods are generally classified into:
• 1- First-order methods, such as the steepest-descent method and conjugate gradient methods
• 2- Second-order methods, such as the modified Newton's method and quasi-Newton methods (the DFP and BFGS methods)
These methods use only local information in their search process and thus converge only to a local minimum point.
✓ Direct Search Methods: Only function values are used in the search process for the optimum solution (no gradient information is needed). Two prominent methods in this class are Hooke-Jeeves and Nelder-Mead. These methods are simple to implement.
✓ Nature-Inspired Search Methods: These are stochastic methods that use only function values. They tend to converge to a global optimum point; however, they do not have good stopping criteria, since no optimality conditions are used. Moreover, they are slower than gradient-based methods. Important methods in this category are genetic algorithms, particle swarm optimization, and simulated annealing.
General Concepts: Gradient-Based Methods

Here we try to understand the fundamental concepts underlying most gradient-based algorithms for unconstrained and constrained optimization problems. Gradient-based search methods are iterative: the same calculations are repeated in every iteration. In such approaches, we estimate an initial design and improve it until the optimality conditions are satisfied.

Here we first focus on unconstrained optimization problems and then elaborate on constrained optimization problems. Unconstrained optimization problems are important because constrained optimization problems can be transformed into a sequence of unconstrained optimization problems.
Unconstrained Optimization Problems
A General Iterative Algorithm

Many gradient-based optimization methods are described by the following iterative prescription:

x^(k+1) = x^(k) + Δx^(k);  k = 0, 1, 2, …

in which the change in the current design can be described as

Δx^(k) = α_k d^(k)

where d^(k) is the search direction and α_k > 0 is the step size in that direction.
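This prescription maps directly onto a generic solver loop. The sketch below is illustrative scaffolding (not from the slides): direction_fn and step_size_fn are placeholders for the direction-finding and line search procedures developed in the rest of these notes.

```python
import numpy as np

def iterative_minimize(f, grad, x0, direction_fn, step_size_fn,
                       tol=1e-6, max_iter=200):
    """Generic gradient-based loop: x(k+1) = x(k) + alpha_k * d(k)."""
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        c = grad(x)                    # gradient of the cost function, c(k)
        if np.linalg.norm(c) < tol:    # optimality (stopping) criterion
            break
        d = direction_fn(x, c)         # direction-finding subproblem
        alpha = step_size_fn(f, x, d)  # step size (line search) subproblem
        x = x + alpha * d              # design update
    return x
```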


A General Iterative Algorithm

Most gradient-based optimization algorithms differ in how they find the search direction d^(k) and the step size α_k. We will explain basic methods for the evaluation of d^(k) and α_k for unconstrained optimization problems.
Descent Direction
The objective of the iterative optimization process is to reach a minimum point of the cost function f(x). Let us assume that we are at the kth iteration and have determined that x^(k) is not a minimum. What is a descent direction along which the cost function will be reduced in the next iteration (k + 1)? To improve the design, we should have the following condition at iteration (k + 1):

f(x^(k+1)) < f(x^(k)),  i.e.,  f(x^(k) + α_k d^(k)) < f(x^(k))

with step size α_k > 0. Approximating the left-hand side by a linear Taylor expansion about x^(k), with c^(k) = ∇f(x^(k)) the gradient of the cost function, yields the descent condition:

c^(k) · d^(k) < 0
Any direction d^(k) satisfying the descent condition is called a direction of descent (a downhill direction) for the cost function.

A step of the iterative optimization method along such a direction is called a descent step. There can be several directions of descent at a design point, and each optimization algorithm computes them differently. The concepts of descent direction and descent step are used in most gradient-based optimization methods.
Descent Condition

Example: at a design point where c^(k) · d^(k) > 0 for a candidate direction d^(k), the descent condition is violated; thus, d^(k) is not a descent direction!
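The check itself is one dot product. A minimal sketch with a made-up function and directions (not the slide's numbers):

```python
import numpy as np

def is_descent_direction(grad, x, d):
    """Check the descent condition c(k) . d(k) < 0 at the point x."""
    return float(np.dot(grad(x), d)) < 0.0

# Hypothetical example: f(x) = x1^2 + x2^2, gradient c = 2x
grad = lambda x: 2.0 * np.asarray(x)
print(is_descent_direction(grad, [1.0, 1.0], [-1.0, -1.0]))  # True: downhill
print(is_descent_direction(grad, [1.0, 1.0], [ 1.0,  0.0]))  # False: not descent
```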


Step Size Determination: Basic Ideas
Two important sub-problems:
1. Direction-finding subproblem
2. Step size determination subproblem
For an optimization problem with several variables, the direction-finding subproblem must be solved first. Then a step size must be determined by searching for the minimum of the cost function along the search direction. This is always a one-dimensional minimization problem, also called a line search problem. To see this, assume that the search direction d^(k) is known. The cost function along the line then depends on the step size alone:

f(α) ≡ f(x^(k) + α d^(k))

This is a one-dimensional problem: the objective function is a function of only the step size α. It is important to understand this reduction of a function of n variables to a function of only one variable, since this procedure is used in almost all gradient-based optimization methods.
Step Size Determination: Basic Ideas

Recall the descent condition: c^(k) · d^(k) < 0.

Assuming d^(k) is known, note that f′(0) = c^(k) · d^(k). To satisfy the inequality f(α) < f(0), the curve of f(α) versus α must have a negative slope at the point α = 0.

It must be understood that if the search direction is a direction of descent, the graph of f(α) versus α cannot be one that rises immediately (shown by the dashed curve in the original figure), because a small positive α would then cause the function f(α) to increase, violating the inequality f(α) < f(0).

Thus the step size determination subproblem is to find 𝛼 to

Minimize 𝑓(𝛼)
Analytical Method to Compute Step Size

This is a one-dimensional problem: minimize f(α) = f(x^(k) + α d^(k)). Using chain-rule differentiation,

f′(α) = c^(k+1) · d^(k),  where c^(k+1) = ∇f(x^(k) + α d^(k))

Local minimum conditions:
First-order necessary condition: f′(α) = 0
Second-order sufficient condition: f″(α) > 0

The condition c^(k+1) · d^(k) = 0 states that the gradient of the cost function at the new point (k + 1) is orthogonal to the search direction at the kth iteration. This condition can be effectively utilized in numerical line search methods to check the accuracy of the step size.

It is noted that f′(0) = c^(k) · d^(k), which must be negative based on the descent condition.


Analytical Method to Compute Step Size-Example

It is noted that the equation for calculating the step size can be obtained directly from the condition c^(k+1) · d^(k) = 0. For the example worked in the slides, this condition reduces to 14α − 20 = 0, which gives the step size α_k = 10/7.
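Since the example's cost function is not reproduced here, the following sketch applies the same analytical procedure to a hypothetical quadratic, using SymPy to form f(α) and solve the first-order condition f′(α) = 0:

```python
import sympy as sp

alpha = sp.symbols('alpha', positive=True)
x1, x2 = sp.symbols('x1 x2')

# Hypothetical quadratic cost function (not the slide's example)
f = x1**2 + 2*x2**2

# Current design and a descent direction (the negative gradient at x = (2, 1))
x = sp.Matrix([2, 1])
c = sp.Matrix([sp.diff(f, x1), sp.diff(f, x2)]).subs({x1: 2, x2: 1})  # c = (4, 4)
d = -c

# Reduce to a one-dimensional function of the step size: f(alpha)
x_new = x + alpha * d
f_alpha = f.subs({x1: x_new[0], x2: x_new[1]})

# First-order necessary condition f'(alpha) = 0
alpha_star = sp.solve(sp.diff(f_alpha, alpha), alpha)[0]
print(alpha_star)  # exact minimizing step size along d (1/3 here)
```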
Numerical Methods to Compute Step Size
For many problems, it is not possible to obtain an explicit expression for f(α). Moreover, even if the functional form of f(α) is known, it may be too complicated to lend itself to analytical solution. Therefore, a numerical method must be used to find α_k that minimizes f(x) along the known direction d^(k).

Most one-dimensional search methods assume the line search function to be unimodal in some interval. For functions that are not unimodal, we can aim to locate only the local minimum point that is closest to the starting point (i.e., closest to α = 0).

The line search problem, then, is to find α in an interval 0 ≤ α ≤ ᾱ at which the function f(α) has a global minimum. In a numerical method it is not possible to find the exact minimum point α = α*; the objective instead is to find a small interval in which the minimum lies. The interval [α_l, α_u] in which the minimum α = α* lies is called the interval of uncertainty, and its size is designated I = α_u − α_l.
Numerical Methods to Compute Step Size

Most numerical methods iteratively reduce the interval of uncertainty until it satisfies a specified tolerance ε. Once this stopping criterion is satisfied, the minimum is taken as α* = (α_u + α_l)/2.

Methods based on this idea of reducing the interval of uncertainty are called interval-reducing methods.

PHASE I: The location of the minimum point is bracketed and the initial interval of uncertainty is established.

PHASE II: The interval of uncertainty is reduced at each iteration by eliminating regions that cannot contain the minimum.

It is very important to realize that the performance of most optimization algorithms depends heavily on the step size
calculation procedure. Therefore, it is not surprising that numerous procedures have been developed and evaluated for
step size calculation.
Numerical Methods to Compute Step Size-Equal Interval Search

Initial Bracketing of Minimum—Phase I: A small increment δ is selected, and the function is evaluated at successively larger trial step sizes until it starts to increase, which brackets the minimum and establishes the initial interval of uncertainty [α_l, α_u].

Reducing the Interval of Uncertainty—Phase II: Phase I is repeated from the lower end of the interval, α_l, with some reduced value for the increment δ, say rδ, in which r ≪ 1.
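A minimal sketch of the equal-interval idea (Phase I bracketing followed by restarts with a reduced increment), under the unimodality assumption stated above:

```python
def bracket_minimum(f, delta):
    """Phase I: march in equal steps of size delta until f starts to rise.
    Returns an interval of uncertainty (alpha_l, alpha_u)."""
    a_prev, a_curr = 0.0, delta
    f_prev, f_curr = f(a_prev), f(a_curr)
    while f_curr < f_prev:                   # still descending
        a_prev, f_prev = a_curr, f_curr
        a_curr += delta
        f_curr = f(a_curr)
    return max(a_prev - delta, 0.0), a_curr  # minimum lies in the last 2*delta span

def equal_interval_search(f, delta=0.1, r=0.1, eps=1e-6):
    """Phase II: restart Phase I from alpha_l with increment r*delta (r << 1)."""
    alpha_l, alpha_u = bracket_minimum(f, delta)
    while alpha_u - alpha_l > eps:
        lo, hi = bracket_minimum(lambda a: f(alpha_l + a), r * delta)
        alpha_l, alpha_u = alpha_l + lo, alpha_l + hi
        delta *= r
    return 0.5 * (alpha_l + alpha_u)

# Example: minimize f(alpha) = (alpha - 1.3)^2; expected result ~1.3
print(equal_interval_search(lambda a: (a - 1.3) ** 2))
```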
Numerical Methods to Compute Step Size-Alternate Equal Interval Search
This procedure is a predecessor to the more efficient golden section search presented on the next slide. The procedure is to evaluate the function at two new points, say α_a and α_b, in the interval of uncertainty, located at distances of I/3 and 2I/3 from the lower limit α_l, respectively, where I = α_u − α_l. By comparing f(α_a) and f(α_b), one third of the interval is discarded at each iteration.
Numerical Methods to Compute Step Size-Golden Section Method
Golden section search is an improvement over the alternate equal-interval search and is one of the best methods
in the class of interval-reducing methods.

Initial Bracketing of Minimum—Phase I


In the equal-interval methods, the selected increment δ is kept fixed to bracket the minimum initially. This can be an inefficient process if δ happens to be a small number. An alternative procedure is to vary the increment at each step, that is, multiply it by a constant r > 1. The golden section method is a variable-interval search method in which the value of r is selected as the golden ratio, 1.618, which can be derived in different ways. One derivation is based on the Fibonacci sequence, defined as:

1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, …

lim_{n→∞} F_n / F_{n−1} = (1 + √5)/2 ≅ 1.618

It is interesting to note that 1/1.618 = 0.618.
Numerical Methods to Compute Step Size-Golden Section Method
Two quantities a and b (a > b > 0) are said to be in the golden ratio φ if:

φ = a/b = (a + b)/a

The ratio of consecutive Fibonacci numbers (1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, …) tends to the same value: F_{n+1}/F_n → φ as n → ∞.

From the definition, φ = 1 + b/a = 1 + 1/φ, so φ² − φ − 1 = 0 (equivalently, φ² = φ + 1), whose positive root is

φ = (1 + √5)/2 ≅ 1.618

The golden ratio also appears in classical geometry. A "golden pyramid" with half-base b = 1, apothem a = φ, and height h = √φ has a face slope of 51.827…° = 51°50′, close to the slope of the Great Pyramid of Giza, about 51°52′.

https://en.wikipedia.org/wiki/Golden_ratio
Numerical Methods to Compute Step Size-Golden Section Method

Phase I

Starting from α = 0, the function is evaluated at trial points whose increments are multiplied by the golden ratio at each step. If

f(α_q) > f(α_{q−1})

the minimum has been surpassed. Therefore, the upper and lower limits on the interval of uncertainty are α_u = α_q and α_l = α_{q−2}.
Numerical Methods to Compute Step Size-Golden Section Method

Reducing the Interval of Uncertainty—Phase II

τ = 1/φ = 0.618,  1 − τ = 0.382

The two interior points divide the interval of uncertainty I at distances 0.382I and 0.618I from the left end. After one region is discarded, the new interval is I′ = τI, and one of the previous interior points already sits at the correct position within I′; the function information at this point is known from the previous iteration. Thus only one function evaluation is needed, for the new point α′_a at distance (1 − τ)I′ from the left.

Note: requiring τI′ = (1 − τ)I together with I′ = τI gives τ² + τ − 1 = 0, which yields exactly τ = 0.618.
Algorithm for One-Dimensional Search using Golden Section Method

Step 1 (Phase I): Establish the initial interval of uncertainty, I = α_u − α_l. The remaining steps carry out Phase II: two interior points are placed at distances 0.382I and 0.618I from α_l, the sub-interval that cannot contain the minimum is discarded, and the process repeats (with only one new function evaluation per iteration) until I ≤ ε, after which α* = (α_u + α_l)/2.
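A minimal Python sketch of the Phase II interval-reduction loop (assuming the minimum has already been bracketed in [alpha_l, alpha_u] by Phase I):

```python
import math

def golden_section(f, alpha_l, alpha_u, eps=1e-6):
    """Golden section search on a unimodal f over [alpha_l, alpha_u]."""
    tau = (math.sqrt(5.0) - 1.0) / 2.0       # 1/phi = 0.618...
    # Two interior points at 0.382*I and 0.618*I from the left end
    a = alpha_l + (1.0 - tau) * (alpha_u - alpha_l)
    b = alpha_l + tau * (alpha_u - alpha_l)
    fa, fb = f(a), f(b)
    while alpha_u - alpha_l > eps:
        if fa < fb:                          # minimum cannot lie in (b, alpha_u]
            alpha_u, b, fb = b, a, fa
            a = alpha_l + (1.0 - tau) * (alpha_u - alpha_l)
            fa = f(a)                        # only one new evaluation per iteration
        else:                                # minimum cannot lie in [alpha_l, a)
            alpha_l, a, fa = a, b, fb
            b = alpha_l + tau * (alpha_u - alpha_l)
            fb = f(b)
    return 0.5 * (alpha_l + alpha_u)

# Example: minimize f(alpha) = (alpha - 2)^2 + 1 on [0, 5]; expected result ~2.0
print(golden_section(lambda a: (a - 2.0) ** 2 + 1.0, 0.0, 5.0))
```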
Search Direction Determination: The Steepest-descent Method

Several methods are available for determining a descent direction for unconstrained optimization problems.
The steepest-descent method is the simplest, the oldest, and probably the best known numerical method for
unconstrained optimization. The philosophy of the method, introduced by Cauchy in 1847, is to find the
direction d, at the current iteration, in which the cost function f(x) decreases most rapidly, at least locally.
The steepest-descent method is a first-order method since only the gradient of the cost function is
calculated and used to evaluate the search direction.

Gradient of the objective function: c = ∇f(x), with components c_i = ∂f/∂x_i; i = 1 to n.

The negative gradient vector represents a direction of steepest descent for the cost function and is written as

d = −c,  i.e.,  d_i = −∂f/∂x_i;  i = 1 to n

With this choice the descent condition is automatically satisfied, since c · d = −‖c‖² < 0 whenever c ≠ 0.
The Steepest-descent Algorithm

Step 1. Estimate a starting design x^(0) and set the iteration counter k = 0. Specify a convergence tolerance ε.
Step 2. Calculate the gradient of the cost function, c^(k) = ∇f(x^(k)).
Step 3. Calculate ‖c^(k)‖; if ‖c^(k)‖ < ε, stop. Otherwise continue.
Step 4. Set the search direction as d^(k) = −c^(k).
Step 5. Calculate the step size α_k to minimize f(x^(k) + α d^(k)).
Step 6. Update the design as x^(k+1) = x^(k) + α_k d^(k). Set k = k + 1 and go to Step 2.

Since the exact line search gives c^(k+1) · d^(k) = 0 and d^(k+1) = −c^(k+1), it follows that d^(k+1) · d^(k) = 0: the steepest-descent directions at two consecutive steps are orthogonal to each other.
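A minimal sketch of the algorithm on a convex quadratic (SciPy's scalar minimizer stands in for the exact line search; the test function below is an assumed example, not the slide's):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def steepest_descent(f, grad, x0, eps=1e-6, max_iter=500):
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        c = grad(x)
        if np.linalg.norm(c) < eps:   # Step 3: stopping criterion
            break
        d = -c                        # Step 4: steepest-descent direction
        alpha = minimize_scalar(lambda a: f(x + a * d)).x  # Step 5: line search
        x = x + alpha * d             # Step 6: design update
    return x

# Assumed example: a convex quadratic with minimum at (1, 1)
f = lambda x: x[0]**2 + 2*x[1]**2 - 2*x[0]*x[1] - 2*x[1]
grad = lambda x: np.array([2*x[0] - 2*x[1], 4*x[1] - 2*x[0] - 2])
print(steepest_descent(f, grad, [0.0, 0.0]))  # approaches (1, 1)
```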
The Steepest-descent Algorithm-Example
The preceding problem is quite simple, and the optimum point is obtained in only one iteration of the steepest-descent method. This is because the condition number of the Hessian of the cost function is 1.
The Steepest-descent Algorithm-Example

The steepest-descent method is quite simple and robust, but it has some drawbacks: it uses no information from previous iterations, only first-order information, so its rate of convergence is at best linear; and the orthogonality of consecutive directions produces slow, zigzag progress, especially near the minimum point or when the Hessian is poorly conditioned.
Search Direction Determination: The Conjugate Gradient Method
Many optimization methods are based on the concept of conjugate gradients; however, here a method attributed
to Fletcher and Reeves (1964) is described.

The conjugate gradient method is a very simple and effective modification of the steepest-descent method. As shown
before, the steepest-descent directions at two consecutive steps are orthogonal to each other. This tends to slow
down the steepest-descent method. The conjugate gradient directions are not orthogonal to each other. Rather, these
directions tend to cut diagonally through the orthogonal steepest-descent directions. Therefore, they improve the rate
of convergence of the steepest-descent method considerably.

The conjugate gradient directions are orthogonal with respect to a symmetric and positive definite matrix A; that is, they are A-conjugate: d^(i)ᵀ A d^(j) = 0 for i ≠ j.
The Conjugate Gradient Algorithm

Step 1. Estimate a starting design x^(0). Set k = 0. Compute c^(0) = ∇f(x^(0)); if ‖c^(0)‖ < ε, stop. Otherwise set d^(0) = −c^(0) (the first step is a steepest-descent step).
Step 2. Compute c^(k) = ∇f(x^(k)); if ‖c^(k)‖ < ε, stop.
Step 3. Compute the new direction d^(k) = −c^(k) + β_k d^(k−1), with β_k = (‖c^(k)‖ / ‖c^(k−1)‖)².
Step 4. Compute the step size α_k to minimize f(x^(k) + α d^(k)).
Step 5. Update the design as x^(k+1) = x^(k) + α_k d^(k). Set k = k + 1 and go to Step 2.

The conjugate gradient algorithm finds the minimum in n iterations for positive definite quadratic functions having n design variables. For general functions, if the minimum has not been found by then, the iterative process needs to be restarted every (n + 1) iterations for computational stability; that is, set x^(0) = x^(n+1) and restart the process with a steepest-descent step.
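A minimal sketch of the Fletcher–Reeves version (SciPy's scalar minimizer stands in for the exact line search; restarts every n + 1 iterations as noted above):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def conjugate_gradient(f, grad, x0, eps=1e-8, max_iter=200):
    x = np.asarray(x0, dtype=float)
    n = x.size
    c = grad(x)
    d = -c                                    # first step: steepest descent
    for k in range(max_iter):
        if np.linalg.norm(c) < eps:
            break
        alpha = minimize_scalar(lambda a: f(x + a * d)).x   # line search
        x = x + alpha * d
        c_new = grad(x)
        if (k + 1) % (n + 1) == 0:            # periodic restart for stability
            d = -c_new
        else:
            beta = (np.linalg.norm(c_new) / np.linalg.norm(c)) ** 2  # Fletcher-Reeves
            d = -c_new + beta * d
        c = c_new
    return x

# Same assumed convex quadratic as before; CG converges in at most n = 2 iterations
f = lambda x: x[0]**2 + 2*x[1]**2 - 2*x[0]*x[1] - 2*x[1]
grad = lambda x: np.array([2*x[0] - 2*x[1], 4*x[1] - 2*x[0] - 2])
print(conjugate_gradient(f, grad, [0.0, 0.0]))  # ~ (1, 1)
```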
The Conjugate Gradient Method-Example
Other Conjugate Gradient Methods

Different formulas for β are given as:

Fletcher–Reeves: β_k = (c^(k) · c^(k)) / (c^(k−1) · c^(k−1))

Polak–Ribière: β_k = (c^(k) · (c^(k) − c^(k−1))) / (c^(k−1) · c^(k−1))

Based on numerical experiments, the following procedure is recommended for selecting a β value: compute both the Polak–Ribière and the Fletcher–Reeves values and use β_k = max{0, min(β_k^PR, β_k^FR)}.
Scaling the Design Variables
As discussed before, the rate of convergence of the steepest-descent method is at best linear, even for a quadratic cost function. It is possible to accelerate this rate of convergence if the condition number of the Hessian of the cost function can be reduced by scaling the design variables. As seen before in an illustrative example, the steepest-descent method converges in only one iteration for a positive definite quadratic function with a unit condition number. The concept of scaling is shown below through an illustrative example.
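A minimal sketch of the scaling idea on an assumed ill-conditioned quadratic f(x) = ½ xᵀHx: substituting x = Dy, with D built from the eigenvalues and eigenvectors of H, makes the transformed Hessian the identity matrix (condition number 1):

```python
import numpy as np

# Assumed ill-conditioned quadratic: f(x) = 0.5 * x^T H x
H = np.array([[50.0, 0.0],
              [0.0,  2.0]])
eigvals, eigvecs = np.linalg.eigh(H)
print(eigvals.max() / eigvals.min())      # condition number: 25

# Scale the design variables: x = D y with D = V * diag(1/sqrt(lambda_i))
D = eigvecs @ np.diag(1.0 / np.sqrt(eigvals))
print(np.round(D.T @ H @ D, 12))          # transformed Hessian = I, condition number 1
```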
Search Directions-Classical Newton’s Method (2nd Order Methods)
The basic idea of the classical Newton’s method is to use a second-order Taylor’s expansion of the function
around the current design point. This gives a quadratic expression for the change in design Δx.

f(x + Δx) ≈ f(x) + cᵀΔx + ½ ΔxᵀH Δx

First-order necessary condition with respect to the design change:

∂f/∂(Δx) = c + H Δx = 0

Design change: Δx = −H⁻¹c.  Update the design: x^(k+1) = x^(k) + Δx^(k).

NOTE: Each iteration of Newton’s method requires computation of the Hessian of the cost function. Since it is a
symmetric matrix, it needs computation of n(n+1)/2 second-order derivatives of f(x) (n is the number of design
variables). This can require considerable computational effort.
Modified Newton’s Method
The classical Newton’s method does not have a step size associated with the calculation of design change Δx. In other
words, the step size is taken as one (a step of length one is called an ideal step size or a Newton’s step). Therefore, there is
not possible to ensure that the cost function will be reduced at each iteration (i.e., to ensure that 𝑓(𝒙 𝑘+1 ) < 𝑓(𝒙 𝑘 )
Thus, the method is not guaranteed to converge to a local minimum point even with the use of second-order information
that requires large calculations. This can be rectified, if the use of a step size in the calculation of the design change Δx is
incorporated. This is called Modified Newton’s Method. Modified Newton’s algorithm is described below:
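A minimal sketch of a modified Newton iteration (a linear solve replaces the explicit inverse H⁻¹, and SciPy's scalar minimizer again stands in for the line search; the test function is the same assumed quadratic as before):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def modified_newton(f, grad, hess, x0, eps=1e-8, max_iter=100):
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        c = grad(x)
        if np.linalg.norm(c) < eps:
            break
        d = np.linalg.solve(hess(x), -c)      # Newton direction d = -H^{-1} c
        if np.dot(c, d) >= 0:                 # descent check (H not positive definite)
            raise RuntimeError("Newton direction is not a descent direction")
        alpha = minimize_scalar(lambda a: f(x + a * d)).x   # line search on f(alpha)
        x = x + alpha * d
    return x

# For a strictly convex quadratic, the method converges in one iteration
f = lambda x: x[0]**2 + 2*x[1]**2 - 2*x[0]*x[1] - 2*x[1]
grad = lambda x: np.array([2*x[0] - 2*x[1], 4*x[1] - 2*x[0] - 2])
hess = lambda x: np.array([[2.0, -2.0], [-2.0, 4.0]])
print(modified_newton(f, grad, hess, [0.0, 0.0]))  # ~ (1, 1)
```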
Modified Newton’s Method
It is important to realize that unless H is positive definite, the direction d^(k) may not be a descent direction for the cost function. This can be verified by using the descent condition:

c^(k)ᵀ d^(k) < 0

NOTE: If H is negative definite or negative semidefinite, the descent condition is always violated. With H indefinite or positive semidefinite, the condition may or may not be satisfied, so we must check for it. If the direction obtained in Step 4 is not a descent direction for the cost function, then we should stop there, because a positive step size cannot be determined.
Modified Newton’s Method-Example
Modified Newton’s Method-Example
Modified Newton’s Method-Drawbacks
1. It requires calculations of second-order derivatives at each iteration, which is usually quite time-
consuming. In some applications it may not even be possible to calculate such derivatives.

2. The Hessian of the cost function may be singular at some iterations. Thus, it cannot be used to compute
the search direction. Also, unless the Hessian is positive definite, the search direction cannot be guaranteed
to be that of descent for the cost function, as discussed earlier.

3. The method is not convergent unless the Hessian remains positive definite and a step size is calculated
along the search direction to update the design. However, the method has a quadratic rate of convergence
when it works. For a strictly convex quadratic function, the method converges in just one iteration from
any starting design.
Example-Performance Evaluation
Marquardt Modification
Marquardt (1963) suggested a modification to the direction-finding process that combines the desirable features of the steepest-descent and Newton's methods. Far away from the solution point, the method behaves like the steepest-descent method, which is quite good there. Near the solution point, it behaves like Newton's method, which is very effective there.

In Marquardt’s method, the Hessian is modified as (H+λI), where λ is a positive constant. λ is initially selected as a
large number that is reduced as iterations progress. The search direction is computed from
1
Note that when λ is large, the effect of H is basically neglected and 𝐝(𝑘) is essentially −( )𝐜 (𝑘)
𝝀
which is the steepest-descent direction with 1/λ as the step size.

Marquardt’s algorithm is summarized in the following steps.


Search Direction Determination:
Quasi-Newton Methods

1- Inverse Hessian Updating: The DFP Method

Davidon-Fletcher-Powell (DFP): proposed initially by Davidon (1959) and later modified by Fletcher and Powell (1963). The method builds an approximation A^(k) to the inverse of the Hessian using only gradient information and computes the search direction as d^(k) = −A^(k) c^(k).
Quasi-Newton Methods

2- Direct Hessian Updating: The BFGS Method

Broyden-Fletcher-Goldfarb-Shanno (BFGS): builds an approximation H^(k) to the Hessian itself rather than to its inverse, again using only gradient information; the direction is obtained by solving H^(k) d^(k) = −c^(k).
DFP Method

Note that the first iteration of the method is the same as that of the steepest-descent method (since A^(0) = I). Fletcher and Powell (1963) proved the following properties of the algorithm: the matrix A^(k) remains positive definite at every iteration, so each search direction is a descent direction; and for a quadratic cost function with positive definite Hessian, the generated directions are conjugate, so the method terminates in n iterations with A^(n) equal to the inverse of the Hessian.
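The update formula itself appears in the slides as an image; the standard DFP form from the literature, with s^(k) = α_k d^(k) the design change and y^(k) = c^(k+1) − c^(k) the gradient change, is A^(k+1) = A^(k) + (s sᵀ)/(s · y) − (A y)(A y)ᵀ/(y · A y). A minimal sketch of one update:

```python
import numpy as np

def dfp_update(A, s, y):
    """DFP update of the inverse-Hessian approximation A.
    s: design change alpha_k * d(k); y: gradient change c(k+1) - c(k)."""
    Ay = A @ y
    return A + np.outer(s, s) / (s @ y) - np.outer(Ay, Ay) / (y @ Ay)

# Sanity check on an assumed quadratic f = 0.5 x^T Q x (for which y = Q s):
Q = np.array([[2.0, 0.0], [0.0, 4.0]])
A, s = np.eye(2), np.array([1.0, 0.5])
A_new = dfp_update(A, s, Q @ s)
print(A_new @ (Q @ s))   # equals s: the secant condition A y = s holds
```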
DFP Method: Example
BFGS Method

It is noted that the first iteration of the method is the same as that of the steepest-descent method when H^(0) = I. It has been shown that the BFGS update formula keeps the Hessian approximation positive definite if an accurate line search is used. This is important to know because the search direction is guaranteed to be a descent direction for the cost function if H is positive definite at each iteration.
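The BFGS update also appears as an image in the slides; its standard direct-Hessian form from the literature is H^(k+1) = H^(k) + (y yᵀ)/(y · s) − (H s)(H s)ᵀ/(s · H s), with s and y as defined for DFP. A minimal quasi-Newton loop using it (same assumed quadratic as before):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def bfgs(f, grad, x0, eps=1e-8, max_iter=200):
    x = np.asarray(x0, dtype=float)
    H = np.eye(x.size)                        # H(0) = I: first step is steepest descent
    c = grad(x)
    for k in range(max_iter):
        if np.linalg.norm(c) < eps:
            break
        d = np.linalg.solve(H, -c)            # direction from H d = -c
        alpha = minimize_scalar(lambda a: f(x + a * d)).x   # line search
        s = alpha * d                         # design change
        x = x + s
        c_new = grad(x)
        y = c_new - c                         # gradient change
        Hs = H @ s
        H = H + np.outer(y, y) / (y @ s) - np.outer(Hs, Hs) / (s @ Hs)  # BFGS update
        c = c_new
    return x

f = lambda x: x[0]**2 + 2*x[1]**2 - 2*x[0]*x[1] - 2*x[1]
grad = lambda x: np.array([2*x[0] - 2*x[1], 4*x[1] - 2*x[0] - 2])
print(bfgs(f, grad, [0.0, 0.0]))  # ~ (1, 1)
```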
BFGS Method-Example
Engineering Applications of Unconstrained Optimization Problems

Data Interpolation: The problem of fitting an analytical function to known discrete data is treated by regression analysis. It can be formulated as an unconstrained optimization problem in which the error between the available data and its analytical representation is minimized. The parameters that characterize the interpolation function are treated as the design variables of the optimization problem. The discrete data (x_i, y_i), i = 1 to m, are known, e.g., from experiments.

Minimize the error function (the sum of the squares of the errors):

f = Σ_{i=1}^{m} [q(x_i) − y_i]²

where q(x) is the analytical function; it can be linear, polynomial, exponential, etc.

Example (linear): with q(x) = ax + b, find the parameters a and b to minimize

f(a, b) = Σ_{i=1}^{m} (a x_i + b − y_i)²

This is basically a linear least-squares problem.
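A minimal sketch of the linear case on synthetic data (the data and model below are assumptions for illustration; in practice the normal equations or np.polyfit solve this in closed form):

```python
import numpy as np
from scipy.optimize import minimize

# Synthetic experimental data scattered around y = 2x + 1
xi = np.linspace(0.0, 5.0, 20)
yi = 2.0 * xi + 1.0 + 0.1 * np.sin(7.0 * xi)

# Error function: sum of squared errors for the linear model q(x) = a*x + b
def error(p):
    a, b = p
    return np.sum((a * xi + b - yi) ** 2)

result = minimize(error, x0=[0.0, 0.0])   # BFGS by default for unconstrained problems
print(result.x)                            # ~ [2.0, 1.0]
```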


Engineering Applications of Unconstrained Optimization Problems

Minimization of Total Potential Energy: In structural and mechanical systems, the governing equations characterizing the equilibrium states can be found by minimizing the total potential energy of the system. This is known as the principle of stationary potential energy. Thus the system response can be found by direct minimization of the total potential energy of the system.

Example: compute the displacements x1 and x2 of a two-bar spring system by finding the x1 and x2 that minimize the total potential energy of the system, P(x1, x2), assuming small displacements. (For the geometry of the example: cos β = s/(2L) and sin β = h/L.)
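A minimal sketch of the approach on an assumed two-spring system (the stiffnesses and loads below are made-up illustration values, not the slide's example): the equilibrium displacements minimize P = strain energy − work of the external loads, and they agree with the solution of the stiffness equations Kx = F.

```python
import numpy as np
from scipy.optimize import minimize

# Assumed two-spring system with degrees of freedom x1, x2
k1, k2 = 100.0, 50.0        # spring stiffnesses (assumed values)
F1, F2 = 10.0, 5.0          # applied loads (assumed values)

def potential_energy(x):
    x1, x2 = x
    strain_energy = 0.5 * k1 * x1**2 + 0.5 * k2 * (x2 - x1)**2
    work = F1 * x1 + F2 * x2
    return strain_energy - work

result = minimize(potential_energy, x0=[0.0, 0.0])
print(result.x)   # equilibrium displacements, ~ [0.15, 0.25]

# Check against the equilibrium (stiffness) equations K x = F
K = np.array([[k1 + k2, -k2], [-k2, k2]])
print(np.linalg.solve(K, [F1, F2]))
```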
Engineering Applications of Unconstrained Optimization Problems

Solutions to Nonlinear Equations: Unconstrained optimization methods can be used to find the roots of a nonlinear system of equations. Let us consider a 2×2 system:

F1(x1, x2) = 0;  F2(x1, x2) = 0

A new function is defined as the sum of the squares of the functions:

f(x1, x2) = F1²(x1, x2) + F2²(x1, x2)

Thus, the optimization problem is to find x1 and x2 to minimize the function f(x1, x2). The minimum value of f is zero, and it is attained when F1 = 0 and F2 = 0, thus satisfying the system of nonlinear equations.
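A minimal sketch with an assumed 2×2 system (the slide's specific F1 and F2 are not reproduced here):

```python
import numpy as np
from scipy.optimize import minimize

# Assumed system: F1 = x1^2 + x2 - 3 = 0, F2 = x1 - x2 + 1 = 0 (a root is (1, 2))
def F(x):
    return np.array([x[0]**2 + x[1] - 3.0, x[0] - x[1] + 1.0])

# Minimize the sum of squares f = F1^2 + F2^2; a root of the system makes f = 0
f = lambda x: np.sum(F(x) ** 2)

result = minimize(f, x0=[1.0, 1.0])
print(result.x, f(result.x))   # ~ (1, 2), with f ~ 0 at the root
```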
References
Arora, J., Introduction to Optimum Design, Academic Press, 2016.
