
Number System and Numerical Error Analysis

SIGNIFICANT FIGURES
The concept of a significant figure, or digit, has been developed to formally designate
the reliability of a numerical value.
How many significant figures does each of the following contain?
a) 345
b) 0.00789
c) 4503
d) 45300
e) 4.5300 × 10^4
f) 20.00
ACCURACY AND PRECISION
✓ Accuracy refers to how closely a computed or measured value agrees with the true value.
✓ Precision refers to how closely individual computed or measured values agree with each other.
✓ These concepts can be illustrated graphically using an analogy from target practice. The bullet holes on
each target in Fig. below can be thought of as the predictions of a numerical technique, whereas the bull’s-
eye represents the truth.

Fig. (a) Inaccurate and imprecise; (b) accurate and imprecise; (c) inaccurate and precise; (d) accurate and precise
• ERROR DEFINITIONS
Truncation errors result when approximations are used to represent exact mathematical procedures.
• Consider evaluating the integral
I = ∫0^1 e^(-x^2) dx
There is no antiderivative of e^(-x^2) in terms of elementary functions, so the integral cannot be evaluated explicitly. Instead, we approximate it with a quantity that can be computed. For example, use the Taylor approximation centered at x = 0:
e^(-x^2) ≈ 1 - x^2 + x^4/2! - x^6/3! + x^8/4!
Then
I ≈ ∫0^1 (1 - x^2 + x^4/2! - x^6/3! + x^8/4!) dx
which can be evaluated easily. We approximate a problem we cannot solve with a new problem that we can solve; this introduces truncation error.
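The truncation error of this approximation can be sketched in Python by integrating the truncated series term by term and comparing against a high-accuracy reference value (using the identity ∫0^1 e^(-x^2) dx = (√π/2)·erf(1); the variable names are illustrative):

```python
import math

# Integral of the truncated Taylor series, term by term:
# ∫0^1 x^(2k)/k! dx = 1/((2k+1)·k!), with alternating signs, k = 0..4
series_value = sum((-1) ** k / ((2 * k + 1) * math.factorial(k)) for k in range(5))

# High-accuracy reference: ∫0^1 e^(-x^2) dx = (√π/2)·erf(1)
true_value = math.sqrt(math.pi) / 2 * math.erf(1)

truncation_error = true_value - series_value
print(series_value, true_value, truncation_error)
```

With five series terms the truncation error is already below 10^-3, illustrating that the substitute problem is a good, computable stand-in for the original integral.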
Round-off error results when numbers having limited significant figures are used to represent exact numbers.
✓ If we represent 1/3 with six significant digits as 0.333333, the rounding error is given by
1/3 - 0.333333 = 0.000000333...
✓ There are other numbers that cannot be represented exactly. For example, π and √2 are numbers that must be approximated in computer calculations.
Example 1: Compute the absolute error and true percent relative error in the approximation of p by pa, where p = √2 and pa = 1.414.
Solution
absolute error = true value - approximation
absolute error = √2 - 1.414 = 2.136 × 10^-4
true percent relative error:
εt = (true error / true value) × 100%
εt = (2.136 × 10^-4 / √2) × 100% = 0.015%
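These two error measures can be checked with a few lines of Python (a sketch; the names are illustrative):

```python
import math

p = math.sqrt(2)   # true value
pa = 1.414         # approximation

abs_error = p - pa                       # true (absolute) error
true_pct_rel_error = abs_error / p * 100  # true percent relative error

print(abs_error)           # ≈ 2.136e-4
print(true_pct_rel_error)  # ≈ 0.015 (%)
```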
In real-world applications, we will obviously not know the true answer a priori. For these situations, an alternative is to normalize the error using the best available estimate of the true value, that is, to the approximation itself:
✓ εa = (approximate error / approximation) × 100%
✓ εa = ((current approximation - previous approximation) / current approximation) × 100%
where the subscript a signifies that the error is normalized to an approximate value.
Often, when performing computations, we may not be concerned with the sign of the error, but we are interested in whether the absolute value of the percent relative error is lower than a prespecified percent tolerance εs. Therefore, it is often useful to employ the absolute value of the above equations. For such cases, the computation is repeated until |εa| < εs, with
εs = (0.5 × 10^(2-n))%
If this criterion is met, the result is correct to at least n significant figures.
• Example: The exponential function can be computed using
e^x = 1 + x + x^2/2! + x^3/3! + … + x^n/n!
As more terms are added in sequence, the approximation becomes a better and better estimate of the true value of e^x. The above equation is called a Maclaurin series expansion.
Estimate e^0.5. After each new term is added, compute the true and approximate percent relative errors. Note that the true value is e^0.5 = 1.648721… Add terms until the absolute value of the approximate error estimate εa falls below a prespecified error criterion εs conforming to three significant figures.
• Solution:
The error criterion that ensures a result correct to at least three significant figures is
εs = (0.5 × 10^(2-n))%
εs = (0.5 × 10^(2-3))% = 0.05%
Thus, we will add terms to the series until εa falls below this level.
The first estimate is simply 1. The second estimate is generated by adding the second term:
e^0.5 ≈ 1 + x = 1 + 0.5 = 1.5
This represents a true percent relative error of
εt = |(1.648721 - 1.5)/1.648721| × 100% = 9.02%
and an approximate estimate of the error of
εa = |(1.5 - 1)/1.5| × 100% = 33.3%
Because εa is not less than the required value εs, we continue the computation by adding another term, x^2/2!, and repeating the error calculations. The process is continued until |εa| < εs. After six terms are included, the approximate error falls below εs = 0.05%, and the computation is terminated.
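The term-by-term procedure above can be sketched as a short Python loop (variable names are illustrative):

```python
import math

x = 0.5
es = 0.5 * 10 ** (2 - 3)  # stopping criterion for 3 significant figures, in %

approx = 1.0              # first term of the Maclaurin series
term_count = 1
while True:
    previous = approx
    approx += x ** term_count / math.factorial(term_count)  # add the next term
    term_count += 1
    ea = abs((approx - previous) / approx) * 100            # approximate % error
    if ea < es:
        break

print(term_count, approx, ea)  # terminates after 6 terms
```

The loop stops after the sixth term, matching the hand computation, with e^0.5 ≈ 1.648698 versus the true value 1.648721.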
• Computer Representation of Numbers
Numbers on the computer are represented with a binary, or base-2, system. Just as with the decimal system, quantities can be represented using positional notation. For example, the binary number 101.1 is equivalent to (1 × 2^2) + (0 × 2^1) + (1 × 2^0) + (1 × 2^-1) = 4 + 0 + 1 + 0.5 = 5.5 in the decimal system.
Integer Representation
There are three forms in which signed integer (whole) numbers can be represented in binary:
a. Sign-magnitude,
b. 1's complement, and
c. 2's complement.
Of these, the 2's complement is the most important and the sign-magnitude is the least used.
❑ The most straightforward approach, called the sign-magnitude method, employs the first bit of a word to indicate the sign, with a 0 for positive and a 1 for negative. The remaining bits are used to store the magnitude. For example, the magnitude 173 is represented in binary as 10101101.
❑ The binary representation of the decimal integer -173 on a 16-bit computer using the sign-magnitude method is shown in the figure below.
Ex. Express the decimal number -45 as an 8-bit number in the
a. sign-magnitude,
b. 1's complement, and
c. 2's complement forms.
Solution
First write the 8-bit binary number for +45.
Convert 45 to binary using the repeated-division-by-2 method:

Division   Quotient   Remainder
45 / 2     22         1  (R0)
22 / 2     11         0  (R1)
11 / 2     5          1  (R2)
5 / 2      2          1  (R3)
2 / 2      1          0  (R4)
1 / 2      0          1  (R5)
Hence 45 in binary = R5R4R3R2R1R0 = 101101 = 00101101 (8-bit representation).
In the sign-magnitude form, -45 is produced by changing the sign bit to a 1 and leaving the magnitude bits as they are: the number is 10101101.
In the 1's complement form, -45 is produced by taking the 1's complement of +45 (00101101). The 1's complement of a binary number is found by changing all 1s to 0s and all 0s to 1s, so the 1's complement of 00101101 is 11010010.
Hence -45 = 11010010 (1's complement form).
In the 2's complement form, -45 is produced by taking the 2's complement of +45 (00101101). The 2's complement of a binary number can be found as follows:
✓ Start at the right with the LSB and write the bits as they are up to and including the first 1.
✓ Take the 1's complement of the remaining bits.
Hence -45 = 11010011 (2's complement form).
Note: The 2's complement of a binary number can also be found by adding 1 at the LSB (rightmost bit) of the 1's complement:
  11010010
+        1
  11010011
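As a quick check, both complement forms of -45 can be reproduced in Python by masking to 8 bits (a sketch, not part of the original notes):

```python
# 8-bit two's-complement pattern of -45: mask the negative value to 8 bits
pattern = format(-45 & 0xFF, '08b')
print(pattern)          # 11010011

# 1's complement of +45: flip the low 8 bits
ones_complement = format(~45 & 0xFF, '08b')
print(ones_complement)  # 11010010
```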
The Decimal Value of Signed Numbers
Sign-magnitude: Decimal values of positive and negative numbers in the sign-magnitude form are determined by summing the weights in all the magnitude bit positions where there are 1s and ignoring those positions where there are zeros. The sign is determined by examination of the sign bit.
Example: Determine the decimal value of the following signed binary number expressed in sign-magnitude form.
10101101
The seven magnitude bits and their powers-of-two weights are as follows:

Weight:  2^6  2^5  2^4  2^3  2^2  2^1  2^0
Bit:     0    1    0    1    1    0    1

Summing the weights where there are 1s: 2^5 + 2^3 + 2^2 + 2^0 = 45
The sign bit is 1; therefore, the decimal number is -45.
1's Complement: Decimal values of positive numbers in the 1's complement form are determined by summing the weights in all bit positions where there are 1s and ignoring those positions where there are zeros. Decimal values of negative numbers are determined by assigning a negative value to the weight of the sign bit, summing all the weights where there are 1s, and adding 1 to the result.
Example: Determine the decimal value of the following signed binary number expressed in 1's complement form.
11010010
Solution
The bits and their powers-of-two weights for the negative number are as follows. Notice that the negative sign bit has a weight of -2^7, or -128.

Weight:  -2^7  2^6  2^5  2^4  2^3  2^2  2^1  2^0
Bit:     1     1    0    1    0    0    1    0

Summing the weights where there are 1s,
-128 + 64 + 16 + 2 = -46
Adding 1 to the result, the final decimal number is
-46 + 1 = -45
2's Complement: Decimal values of positive and negative numbers in the 2's complement form are determined by summing the weights in all bit positions where there are 1s and ignoring those positions where there are zeros. The weight of the sign bit in a negative number is given a negative value.
Example: Determine the decimal value of the following signed binary number expressed in 2's complement form.
11010011
Solution
The bits and their powers-of-two weights for the negative number are as follows. Notice that the negative sign bit has a weight of -2^7, or -128.

Weight:  -2^7  2^6  2^5  2^4  2^3  2^2  2^1  2^0
Bit:     1     1    0    1    0    0    1    1

Summing the weights where there are 1s,
-128 + 64 + 16 + 2 + 1 = -45
Hence the decimal number is -45.
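The weighted-sum rule for two's complement amounts to the following small Python helper (a sketch; the function name is illustrative):

```python
def twos_complement_value(bits: str) -> int:
    """Decimal value of a bit string interpreted as two's complement."""
    n = int(bits, 2)
    if bits[0] == '1':          # sign bit set: its weight is -2^(width-1),
        n -= 1 << len(bits)     # equivalent to subtracting 2^width
    return n

print(twos_complement_value('11010011'))  # -45
print(twos_complement_value('00101101'))  # 45
```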
• Floating-Point Representation
Fractional quantities are typically represented in computers using floating-point format. In this approach, which is very much like scientific notation, the number is expressed as
±s × b^e
where s = the significand (or mantissa), b = the base of the number system being used, and e = the exponent.
The following figure shows the manner in which a floating-point number is stored in a word.
o Prior to being expressed in this form, the number is normalized by moving the binary point over so that only one significant digit is to the left of the binary point. This is done so computer memory is not wasted on storing useless nonsignificant zeros.
o For example, a value like 0.005678 could be represented in a wasteful manner as 0.005678 × 10^0 in a hypothetical floating-point base-10 system. However, normalization would yield 5.678 × 10^-3, which eliminates the useless zeros.
o The following figure shows the manner in which a floating-point number is stored using the IEEE standard.
1. Single Precision: Single precision is a format proposed by IEEE for the representation of floating-point numbers. It occupies 32 bits in computer memory.

2. Double Precision: Double precision is also a format given by IEEE for the representation of floating-point numbers. It occupies 64 bits in computer memory.
Example 1: Use the single-precision IEEE-754 floating-point standard to find the decimal equivalent of the floating-point machine number with sign bit 1, exponent bits 10000011, and mantissa bits 10000000000000000000000.
Solution:
Here the sign bit is 1, so the number is negative, and it can be written as
(-1)^S × 1.M × 2^(E-127)
Exponent bits = 10000011 = 2^7 + 2^1 + 2^0 = 128 + 2 + 1 = 131
M = 10000000000000000000000
Substituting these values into the formula gives
(-1)^1 × 1.10000000000000000000000 (binary) × 2^(131-127)
= -1.5 × 2^4 = -24
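The decoding can be verified in Python with the standard `struct` module, which reinterprets raw bytes as an IEEE-754 single (a sketch, not part of the original notes):

```python
import struct

# Assemble the 32-bit pattern: sign | exponent | mantissa
bits = '1' + '10000011' + '1' + '0' * 22
value = struct.unpack('>f', int(bits, 2).to_bytes(4, 'big'))[0]
print(value)  # -24.0
```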
Example 2: Convert -24 to the single-precision IEEE-754 floating-point standard.
Solution:
Convert 24 to binary using the repeated-division-by-2 method:

Division   Quotient   Remainder
24 / 2     12         0  (R0)
12 / 2     6          0  (R1)
6 / 2      3          0  (R2)
3 / 2      1          1  (R3)
1 / 2      0          1  (R4)

Hence 24 in binary = R4R3R2R1R0 = 11000.
Normalizing, 24 = 1.1 (binary) × 2^4, so the exponent is 4. Add the exponent bias 127: 4 + 127 = 131. Then convert 131 to eight-bit binary, because in the single-precision format the exponent is represented using 8 bits.
Division   Quotient   Remainder
131 / 2    65         1  (R0)
65 / 2     32         1  (R1)
32 / 2     16         0  (R2)
16 / 2     8          0  (R3)
8 / 2      4          0  (R4)
4 / 2      2          0  (R5)
2 / 2      1          0  (R6)
1 / 2      0          1  (R7)

131 = R7R6R5R4R3R2R1R0 = 10000011
The given number is negative, so the sign bit is 1, the exponent is 10000011, and the mantissa is 10000000000000000000000.

Sign (1 bit)   Exponent (8 bits)   Mantissa (23 bits)
1              10000011            10000000000000000000000
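The encoding can likewise be checked with `struct`, splitting the packed bytes back into the three fields (a sketch; the field slicing follows the 1/8/23-bit layout):

```python
import struct

# Pack -24.0 as a 32-bit IEEE-754 single and show its bit fields
raw = struct.pack('>f', -24.0)
bits = format(int.from_bytes(raw, 'big'), '032b')

sign, exponent, mantissa = bits[0], bits[1:9], bits[9:]
print(sign, exponent, mantissa)  # 1 10000011 10000000000000000000000
```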
For normalized floating-point numbers, rounding yields a lower absolute error than chopping. When chopping is employed,
|Δx/x| ≤ ε
and, for cases where rounding is employed,
|Δx/x| ≤ ε/2
where ε is referred to as the machine epsilon: the distance between 1 and the next larger representable number.
In MATLAB this value is assigned to the predefined variable eps. When the name eps is typed in the Command Window, the assigned value is displayed.
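The same quantity is available in Python as `sys.float_info.epsilon` (for IEEE-754 double precision, ε = 2^-52, matching MATLAB's eps). A short sketch:

```python
import sys

# Machine epsilon for IEEE-754 double precision: 2^-52
eps = sys.float_info.epsilon
print(eps)                  # 2.220446049250313e-16
print(eps == 2.0 ** -52)    # True

# 1 + eps is representable, but smaller increments are lost to rounding
print(1.0 + eps > 1.0)        # True
print(1.0 + eps / 3 == 1.0)   # True: eps/3 is rounded away
```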
Subtractive Cancellation
This term refers to the round-off induced when subtracting two nearly equal floating-point numbers. One common instance where this can occur involves finding the roots of a quadratic equation with the quadratic formula:
x1, x2 = (-b ± √(b^2 - 4ac)) / (2a)
For cases where b^2 >> 4ac, the difference in the numerator can be very small, and this causes round-off error. In many cases, a mathematical expression containing the subtraction of two nearly equal quantities can be rewritten in a form that is less likely to cause round-off errors. For instance, we can change the form of the quadratic formula by rationalizing the numerator as follows:
x1 = ((-b + √(b^2 - 4ac)) / (2a)) × ((-b - √(b^2 - 4ac)) / (-b - √(b^2 - 4ac))) = (b^2 - (b^2 - 4ac)) / (2a(-b - √(b^2 - 4ac)))
which simplifies to an alternate quadratic formula:
x1 = -2c / (b + √(b^2 - 4ac))
The rationalization technique can also be applied to give the following alternative quadratic formula for x2:
x2 = -2c / (b - √(b^2 - 4ac))
For example, consider the quadratic equation x^2 - 100.0001x + 0.01 = 0, for which the exact solutions are x1 = 100 and x2 = 0.0001. The solutions can be calculated with the quadratic formula:
x1, x2 = (-b ± √(b^2 - 4ac)) / (2a)
• Using MATLAB (Command Window) to calculate x1 and x2 shows that the value calculated for x2 is not exact due to round-off errors. The round-off occurs in the numerator of the expression for x2: since b is negative, the numerator involves subtraction of two numbers that are nearly equal. The alternative formula for x2, by contrast, gives the exact value.
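The cancellation can be reproduced in Python, since it uses the same double-precision arithmetic as MATLAB (a sketch; variable names are illustrative):

```python
import math

# x^2 - 100.0001x + 0.01 = 0; exact roots are 100 and 0.0001
a, b, c = 1.0, -100.0001, 0.01
d = math.sqrt(b * b - 4 * a * c)

x2_standard = (-b - d) / (2 * a)     # subtracts two nearly equal numbers
x2_alternative = -2 * c / (b - d)    # rationalized form avoids the subtraction

print(x2_standard, x2_alternative)
```

The rationalized form is at least as accurate as the standard formula here, because its denominator adds two quantities of the same sign instead of subtracting nearly equal ones.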
CHAPTER TWO
ROOTS OF EQUATIONS
GRAPHICAL METHODS
A simple method for obtaining an estimate of the root of the equation f(x) = 0 is to make a plot of the function and observe where it crosses the x axis. This point, which represents the x value for which f(x) = 0, provides a rough approximation of the root.

Graphical techniques are of limited practical value because they are not precise. However, the rough estimates they provide can be employed as starting guesses for the numerical methods discussed in this chapter.
Bracketing Methods
These techniques are called bracketing methods because two initial guesses for the root are required. As the name implies, these guesses must "bracket," or be on either side of, the root. These techniques are based on the Intermediate Value Theorem.
THE BISECTION METHOD
A simple algorithm for the bisection calculation is shown below.
Step 1: Choose lower xl and upper xu guesses for the root such that the function changes sign over the interval. This can be checked by ensuring that f(xl)f(xu) < 0.
Step 2: An estimate of the root xr is determined by
xr = (xl + xu)/2
Step 3: Make the following evaluations to determine in which subinterval the root lies:
(a) If f(xl)f(xr) < 0, the root lies in the lower subinterval. Therefore, set xu = xr and return to step 2.
(b) If f(xl)f(xr) > 0, the root lies in the upper subinterval. Therefore, set xl = xr and return to step 2.
(c) If f(xl)f(xr) = 0, the root equals xr; terminate the computation.
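The three steps can be sketched as a short Python function (an illustrative sketch; the stopping test uses the bracket width as a bound on the approximate error):

```python
def bisection(f, xl, xu, es=1e-8):
    """Bisection method; es is the stopping criterion in percent."""
    if f(xl) * f(xu) > 0:
        raise ValueError("root is not bracketed")
    while True:
        xr = (xl + xu) / 2                 # Step 2: midpoint estimate
        test = f(xl) * f(xr)
        if test < 0:                       # Step 3(a): root in lower subinterval
            xu = xr
        elif test > 0:                     # Step 3(b): root in upper subinterval
            xl = xr
        else:                              # Step 3(c): exact root found
            return xr
        ea = abs((xu - xl) / xr) * 100     # bracket width bounds the % error
        if ea < es:
            return xr

root = bisection(lambda x: x**3 + 4 * x**2 - 10, 1.0, 2.0, es=1e-6)
print(root)  # ≈ 1.365230
```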
Example: Determine the root of f(x) = x^3 + 4x^2 - 10 using the bisection method. Employ initial guesses of xl = 1 and xu = 2 and iterate until the estimated error εa falls below a level of εs = 0.000001%.
Solution:
Iteration 1
xl = 1, xu = 2
xr = (xl + xu)/2 = (1 + 2)/2 = 1.5
f(xl) = 1^3 + 4(1)^2 - 10 = -5
f(xu) = 2^3 + 4(2)^2 - 10 = 14
f(xr) = 1.5^3 + 4(1.5)^2 - 10 = 2.3750
f(xl) × f(xr) < 0; therefore, we set xu = xr.
Iteration 2
xl = 1, xu = 1.5
xr = (xl + xu)/2 = (1 + 1.5)/2 = 1.25
f(xl) = 1^3 + 4(1)^2 - 10 = -5
f(xu) = 1.5^3 + 4(1.5)^2 - 10 = 2.3750
f(xr) = 1.25^3 + 4(1.25)^2 - 10 = -1.7969
f(xu) × f(xr) < 0; therefore, we set xl = xr.
εa = |(current approximation - previous approximation)/current approximation| × 100%
εa = |(1.25 - 1.5)/1.25| × 100% = 20%
After 27 iterations εa falls below εs = 0.000001%, and the computation can be terminated. The root estimate is 1.365230.
THE FALSE-POSITION METHOD
A shortcoming of the bisection method is that, in dividing the interval from xl to xu into equal halves, no account is taken of the magnitudes of f(xl) and f(xu). For example, if f(xl) is much closer to zero than f(xu), it is likely that the root is closer to xl than to xu.
An alternative method that exploits this graphical insight is to join f(xl) and f(xu) by a straight line. The intersection of this line with the x axis represents an improved estimate of the root.
Using similar triangles in the above figure, the intersection of the straight line with the x axis can be estimated from
f(xl)/(xr - xl) = f(xu)/(xr - xu)
Cross-multiplying the above equation yields
f(xl)(xr - xu) = f(xu)(xr - xl)
Collect terms and rearrange:
xr[f(xl) - f(xu)] = xu f(xl) - xl f(xu)
Divide by f(xl) - f(xu):
xr = (xu f(xl) - xl f(xu)) / (f(xl) - f(xu))
This is one form of the method of false position. Note that it allows the computation of the root xr as a function of the lower and upper guesses xl and xu. It can be put in an alternative form by expanding it:
xr = xu f(xl)/(f(xl) - f(xu)) - xl f(xu)/(f(xl) - f(xu))
then adding and subtracting xu on the right-hand side:
xr = xu + xu f(xl)/(f(xl) - f(xu)) - xu - xl f(xu)/(f(xl) - f(xu))
Collecting terms yields
xr = xu + xu f(xu)/(f(xl) - f(xu)) - xl f(xu)/(f(xl) - f(xu))
xr = xu - f(xu)(xl - xu)/(f(xl) - f(xu))
which is algebraically equivalent to the earlier false-position equation. We use this form because it involves one less function evaluation and one less multiplication. In addition, it is directly comparable with the secant method, discussed later in this chapter.
Example: Determine the root of f(x) = x^3 + 4x^2 - 10 using the false-position method. Employ initial guesses of xl = 1 and xu = 2 and iterate until the estimated error εa falls below a level of εs = 0.000001%.
Solution:
Iteration 1
xl = 1, xu = 2
f(xl) = 1^3 + 4(1)^2 - 10 = -5
f(xu) = 2^3 + 4(2)^2 - 10 = 14
xr = xu - f(xu)(xl - xu)/(f(xl) - f(xu))
xr = 2 - 14(1 - 2)/(-5 - 14) = 1.2632
f(xr) = 1.2632^3 + 4(1.2632)^2 - 10 = -1.6023
f(xu) × f(xr) < 0; therefore, we set xl = xr.
Iteration 2
xl = 1.2632, xu = 2
f(xl) = 1.2632^3 + 4(1.2632)^2 - 10 = -1.6023
f(xu) = 2^3 + 4(2)^2 - 10 = 14
xr = xu - f(xu)(xl - xu)/(f(xl) - f(xu))
xr = 2 - 14(1.2632 - 2)/(-1.6023 - 14) = 1.3388
f(xr) = 1.3388^3 + 4(1.3388)^2 - 10 = -0.4304
f(xu) × f(xr) < 0; therefore, we set xl = xr.
εa = |(1.3388 - 1.2632)/1.3388| × 100% = 5.6520%
• After 14 iterations εa falls below εs = 0.000001%, and the computation can be terminated. The root estimate is 1.365230.
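The false-position iteration can be sketched in Python much like bisection, with the straight-line intercept replacing the midpoint (an illustrative sketch):

```python
def false_position(f, xl, xu, es=1e-6, max_iter=100):
    """False-position method; es is the stopping criterion in percent."""
    xr_old = xl
    for _ in range(max_iter):
        # intercept of the line joining (xl, f(xl)) and (xu, f(xu))
        xr = xu - f(xu) * (xl - xu) / (f(xl) - f(xu))
        if f(xl) * f(xr) < 0:
            xu = xr
        else:
            xl = xr
        ea = abs((xr - xr_old) / xr) * 100
        if ea < es:
            return xr
        xr_old = xr
    return xr

root = false_position(lambda x: x**3 + 4 * x**2 - 10, 1.0, 2.0)
print(root)  # ≈ 1.365230
```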
Open Methods
For the bracketing methods we have seen, the root is located within an interval prescribed by a lower and an upper bound. Repeated application of these methods always results in closer estimates of the true value of the root. Such methods are said to be convergent because they move closer to the truth as the computation progresses.
In contrast, the open methods described below are based on formulas that require only a single starting value of x, or two starting values that do not necessarily bracket the root. As such, they sometimes diverge, or move away from the true root, as the computation progresses. However, when the open methods converge, they usually do so much more quickly than the bracketing methods.
SIMPLE FIXED-POINT ITERATION
The formula for root estimation is obtained by rearranging the function f(x) = 0 so that x is on the left-hand side of the equation:
x = g(x)
This transformation can be accomplished either by algebraic manipulation or by simply adding x to both sides of the original equation. For example, x^2 - 2x + 3 = 0 can be simply manipulated to yield x = (x^2 + 3)/2, whereas sin x = 0 could be put into the form x = g(x) by adding x to both sides to yield x = sin x + x.
Example: Use simple fixed-point iteration to locate the root of f(x) = e^(-x) - x. Start with an initial guess of x0 = 0, iterate up to the third iteration, and compute the approximate error for each iteration.
Solution:
Express the given function in the form xi+1 = g(xi):
xi+1 = e^(-xi)
Iteration 1
x1 = e^0 = 1
Iteration 2
x2 = e^(-1) = 0.367879
εa = |(current approximation - previous approximation)/current approximation| × 100%
εa = |(0.367879 - 1)/0.367879| × 100% = 171.8%
Iteration 3
x3 = e^(-0.367879) = 0.692201
εa = |(0.692201 - 0.367879)/0.692201| × 100% = 46.9%
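The three iterations above can be sketched as a short Python loop (variable names are illustrative):

```python
import math

x = 0.0                       # initial guess x0
for i in range(1, 4):
    x_new = math.exp(-x)      # fixed-point update x = g(x) = e^(-x)
    ea = abs((x_new - x) / x_new) * 100
    print(i, x_new, ea)
    x = x_new
```

The printed errors (100%, 171.8%, 46.9%) match the hand computation; note that fixed-point iteration converges only slowly toward the true root 0.567143…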
THE NEWTON-RAPHSON METHOD
The following figure shows a graphical depiction of the Newton-Raphson method. A tangent to the function at xi [with slope f'(xi)] is extrapolated down to the x axis to provide an estimate of the root at xi+1.
The first derivative at xi is equivalent to this slope:
f'(xi) = (f(xi) - 0)/(xi - xi+1)
which can be rearranged to yield
xi+1 = xi - f(xi)/f'(xi)
Example: Use the Newton-Raphson method to estimate the root of f(x) = e^(-x) - x, employing an initial guess of x0 = 0. Iterate up to the third iteration and compute the approximate error for each iteration.
Solution:
The first derivative of the function can be evaluated as
f'(x) = -e^(-x) - 1
Iteration 1
xi = 0
xi+1 = xi - f(xi)/f'(xi) = xi - (e^(-xi) - xi)/(-e^(-xi) - 1) = 0.5000
Iteration 2
xi = 0.5000
xi+1 = 0.5000 - (e^(-0.5000) - 0.5000)/(-e^(-0.5000) - 1) = 0.5663
εa = |(0.5663 - 0.5000)/0.5663| × 100% = 11.7076%
Iteration 3
xi = 0.5663
xi+1 = 0.5663 - (e^(-0.5663) - 0.5663)/(-e^(-0.5663) - 1) = 0.5671
εa = |(0.5671 - 0.5663)/0.5671| × 100% = 0.1411%
Thus, the approach rapidly converges on the true root. Notice that the approximate percent relative error at each iteration decreases much faster than it does in simple fixed-point iteration.
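The Newton-Raphson iterations can be sketched in Python as follows (an illustrative sketch):

```python
import math

f = lambda x: math.exp(-x) - x
df = lambda x: -math.exp(-x) - 1   # derivative of f

x = 0.0                            # initial guess x0
for i in range(1, 4):
    x_new = x - f(x) / df(x)       # Newton-Raphson update
    ea = abs((x_new - x) / x_new) * 100
    print(i, round(x_new, 4), round(ea, 4))
    x = x_new
```

After three iterations the estimate agrees with the true root 0.567143… to about four decimal places, illustrating the method's rapid (quadratic) convergence.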
THE SECANT METHOD
A potential problem in implementing the Newton-Raphson method is the evaluation of the derivative. Although this is not inconvenient for polynomials and many other functions, there are certain functions whose derivatives may be extremely difficult or inconvenient to evaluate. For these cases, the derivative can be approximated by a backward finite divided difference:
f'(xi) ≈ (f(xi-1) - f(xi))/(xi-1 - xi)
This approximation can be substituted into the Newton-Raphson formula to yield the following iterative equation:
xi+1 = xi - f(xi)(xi-1 - xi)/(f(xi-1) - f(xi))
The above equation is the formula for the secant method. Notice that the approach requires two initial estimates of x. However, because f(x) is not required to change signs between the estimates, it is not classified as a bracketing method.
Example: Use the secant method to estimate the root of f(x) = e^(-x) - x. Start with initial estimates of x(-1) = 0 and x0 = 1.0, iterate up to the third iteration, and compute the approximate error for each iteration.
Solution: Recall that the true root is 0.56714329…
Iteration 1
x(-1) = 0, x0 = 1.0
f(x(-1)) = 1
f(x0) = -0.6321
x1 = 1 - (-0.6321)(0 - 1)/(1 - (-0.6321)) = 0.6127
Iteration 2
x(-1) = 1, x0 = 0.6127
f(x(-1)) = -0.6321
f(x0) = -0.0708
x1 = 0.6127 - (-0.0708)(1 - 0.6127)/(-0.6321 - (-0.0708)) = 0.5638
εa = |(0.5638 - 0.6127)/0.5638| × 100% = 8.6733%
Iteration 3
x(-1) = 0.6127, x0 = 0.5638
f(x(-1)) = -0.0708
f(x0) = 0.0052
x1 = 0.5638 - 0.0052(0.6127 - 0.5638)/(-0.0708 - 0.0052) = 0.5672
εa = |(0.5672 - 0.5638)/0.5672| × 100% = 0.5994%
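The secant iterations can be sketched in Python by carrying the two most recent estimates along (an illustrative sketch):

```python
import math

f = lambda x: math.exp(-x) - x

x_prev, x_curr = 0.0, 1.0        # two starting values (need not bracket the root)
for i in range(1, 4):
    # secant update: backward difference replaces the derivative
    x_next = x_curr - f(x_curr) * (x_prev - x_curr) / (f(x_prev) - f(x_curr))
    ea = abs((x_next - x_curr) / x_next) * 100
    print(i, round(x_next, 4), round(ea, 4))
    x_prev, x_curr = x_curr, x_next
```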
MODIFIED NEWTON-RAPHSON METHOD
The Newton-Raphson and secant methods are linearly, rather than quadratically, convergent for multiple roots (Ralston and Rabinowitz, 1978). Modifications have been proposed to alleviate this problem. Ralston and Rabinowitz (1978) have indicated that a slight change in the formulation returns it to quadratic convergence.
Define a new function u(x), the ratio of the function to its derivative:
u(x) = f(x)/f'(x)
Substituting u(x) for f(x) in the Newton-Raphson formula
xi+1 = xi - f(x)/f'(x)
gives
xi+1 = xi - u(x)/u'(x)
Then find u'(x).
Differentiate u(x) using the quotient rule (v/w)' = (v'w - vw')/w^2, where v(x) = f(x) and w(x) = f'(x):
u'(x) = (f'(x)f'(x) - f(x)f''(x)) / (f'(x))^2
Now substitute u(x) and u'(x) into the Newton-Raphson formula and simplify:
xi+1 = xi - [f(x)/f'(x)] / [(f'(x)f'(x) - f(x)f''(x)) / (f'(x))^2]
xi+1 = xi - f(x)(f'(x))^2 / (f'(x)(f'(x)f'(x) - f(x)f''(x)))
xi+1 = xi - f(xi)f'(xi) / ((f'(xi))^2 - f(xi)f''(xi))
Problem Statement. Use the modified Newton-Raphson method to evaluate a root of f(x) = x^3 - 6x^2 + 11x - 6 with an initial guess of x0 = 0.8. Round your answer to 4 decimal places.
Solution.
Calculate f'(x):
f'(x) = 3x^2 - 12x + 11
Calculate f''(x):
f''(x) = 6x - 12
Iteration 1
x0 = 0.8
Compute f(x0): f(0.8) = 0.8^3 - 6(0.8)^2 + 11(0.8) - 6 = -0.528
Compute f'(x0): f'(0.8) = 3(0.8)^2 - 12(0.8) + 11 = 3.32
Compute f''(x0): f''(0.8) = 6(0.8) - 12 = -7.2
Now calculate the next approximation x1:
x1 = x0 - f(x0)f'(x0) / ((f'(x0))^2 - f(x0)f''(x0))
x1 = 0.8 - (-0.528 × 3.32) / ((3.32)^2 - (-0.528)(-7.2))
x1 = 1.0428
Iteration 2
Set x0 = x1 = 1.0428
Compute f(x0): f(1.0428) = 1.0428^3 - 6(1.0428)^2 + 11(1.0428) - 6 = 0.0802
Compute f'(x0): f'(1.0428) = 3(1.0428)^2 - 12(1.0428) + 11 = 1.7487
Compute f''(x0): f''(1.0428) = 6(1.0428) - 12 = -5.7432
Now calculate the next approximation x1:
x1 = x0 - f(x0)f'(x0) / ((f'(x0))^2 - f(x0)f''(x0))
x1 = 1.0428 - (0.0802 × 1.7487) / ((1.7487)^2 - (0.0802)(-5.7432))
x1 = 1.0029
Therefore, after 2 iterations of the modified Newton-Raphson method with x0 = 0.8, the approximate root of the equation is 1.0029.
Note: Although the modified Newton-Raphson method is preferable for multiple roots, it is somewhat less efficient and requires more computational effort than the standard method for simple roots.
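The two iterations can be reproduced with a short Python sketch (variable names are illustrative):

```python
f   = lambda x: x**3 - 6 * x**2 + 11 * x - 6
df  = lambda x: 3 * x**2 - 12 * x + 11
d2f = lambda x: 6 * x - 12

x = 0.8
for i in range(1, 3):
    # modified Newton-Raphson update
    x = x - f(x) * df(x) / (df(x) ** 2 - f(x) * d2f(x))
    print(i, round(x, 4))
```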
MODIFIED SECANT METHOD
Rather than using two arbitrary values to estimate the derivative, an alternative approach involves a fractional perturbation of the independent variable to estimate f'(x):
f'(xi) ≈ (f(xi + δxi) - f(xi)) / (δxi)
where δ = a small perturbation fraction. This approximation can be substituted into the Newton-Raphson formula to yield the following iterative equation:
xi+1 = xi - δxi f(xi) / (f(xi + δxi) - f(xi))
Example: Use the modified secant method to estimate the root of f(x) = x^4 - x^3 - 3x^2 - x. Use a value of 0.01 for δ, start with x0 = 2, iterate up to the third iteration, and compute the approximate error for each iteration.
Solution:
Iteration 1
x0 = 2
f(x0) = -6
x0 + δx0 = 2.0200
f(x0 + δx0) = -5.8539
x1 = x0 - δx0 f(x0) / (f(x0 + δx0) - f(x0)) = 2.8216
Iteration 2
x0 = 2.8216
f(x0) = 14.2145
x0 + δx0 = 2.8498
f(x0 + δx0) = 15.5992
x1 = x0 - δx0 f(x0) / (f(x0 + δx0) - f(x0)) = 2.5319
εa = |(2.5319 - 2.8216)/2.5319| × 100% = 11.442%
Iteration 3
x0 = 2.5319
f(x0) = 3.1005
x0 + δx0 = 2.5572
f(x0 + δx0) = 3.8648
x1 = x0 - δx0 f(x0) / (f(x0 + δx0) - f(x0)) = 2.4292
εa = |(2.4292 - 2.5319)/2.4292| × 100% = 4.2277%
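The modified secant iteration can be sketched in Python; left to run, it converges to the positive root of x^4 - x^3 - 3x^2 - x, which factors as x(x + 1)(x^2 - 2x - 1) and so equals 1 + √2 ≈ 2.414214 (an illustrative sketch):

```python
import math

f = lambda x: x**4 - x**3 - 3 * x**2 - x
delta = 0.01  # perturbation fraction

x = 2.0
for i in range(50):
    dx = delta * x                         # fractional perturbation
    x = x - dx * f(x) / (f(x + dx) - f(x)) # modified secant update

print(x)  # ≈ 2.414214, i.e. 1 + sqrt(2)
```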
