Introduction To Numerical Methods
Dr Jamieson Christie
October 1, 2009
Contents

0.1 Syllabus

1 Introduction
1.1 Introduction to Linux
1.1.1 The command line
1.1.2 Files and directories
1.1.3 Creating files
1.2 Introduction to FORTRAN
1.2.1 Introduction
1.2.2 Hello World!
1.2.3 A second program
1.2.4 Real numbers
1.2.5 If statements
1.2.6 Mathematical functions
1.2.7 Matrices
1.2.8 Input and output
1.3 Introduction to gnuplot
1.4 Representation of numbers
1.4.1 Integers
1.4.2 Floating-point numbers

2 Root finding
2.1 Introduction
2.2 Bisection
2.3 Regula falsi
2.4 Secant method
2.5 Newton's method
2.6 Iterative procedures

3 Linear algebra
3.1 Introduction
3.2 Elementary operations
3.3 Solving systems of equations
3.3.1
3.3.2 Gaussian elimination
3.3.3 Computational implementation
3.3.4 Pivoting
3.3.5
3.3.6 Determinants
3.3.7
3.3.8
3.3.9 Iterative methods

4 Numerical integration
4.1 Introduction
4.2 Trapezium rule
4.3 Function interpolation
4.4 Simpson's rule
4.5
4.6

5 Numerical differentiation
5.1 Introduction
5.2 Forward difference
5.3 Centred difference
5.4 Higher-order derivatives
5.5
5.6 Richardson extrapolation
Introduction
These notes were originally written in the summer of 2007 for the introductory
part of the Introduction to Numerical Methods course given in the Condensed
Matter Diploma course at the Abdus Salam International Centre for Theoretical
Physics (ICTP). They have not been changed since, and so may not be suitable
for any other use. For example, there are occasional references to the ICTP
computer set-up.
The notes were intended to be used for students who had not necessarily had any
experience of programming or computers, and so they start from a very basic
level. They formed a ten-lecture course, where each lecture took 90 minutes.
If you find any errors while reading these notes, I would be very grateful if you
let me know.
Provided this notice is preserved, you may freely distribute all or part of these
notes for personal or academic use. You may not: remove this notice, make any
commercial use of these notes, modify these notes or distribute any modified
copy, make and/or distribute a larger work of which these notes form a part.
Jamieson Christie
jamieson.christie@ucl.ac.uk
October 2009
Syllabus
INTRODUCTION Files and directories in Linux; creating files; compiling
and executing FORTRAN code; variable declarations; logical constructs; do
loops; mathematical functions; matrices; plotting data with gnuplot; input and
output; representing numbers
ROOT FINDING Bisection; regula falsi; secant method; Newton's method;
use of functions; functional iteration
LINEAR ALGEBRA Matrix algebra; solving systems of equations
NUMERICAL INTEGRATION Trapezium rule; Simpson's rule
NUMERICAL DIFFERENTIATION Forward and centred difference methods
mv
Change the name of a directory or file with mv. For instance, to change the name
of the directory name to programs, go to the parent directory of name, and
type
[jchristi@condense-38 ~]$ mv name programs
cp
To copy a file, use cp. For instance, to make a copy of file1 called file2, type
[jchristi@condense-38 ~]$ cp file1 file2
rm
To delete a file, use rm. Once deleted, the file cannot be recovered. For example,
to delete a file file1, type
[jchristi@condense-38 ~]$ rm file1
rmdir
To remove an empty directory, use rmdir. If you want to remove a directory
with files in, use rm -r. Be careful: once deleted, the files cannot be recovered.
[jchristi@condense-38 ~]$ rmdir dir1
rmdir: dir1: Directory not empty
[jchristi@condense-38 ~]$ rm -r dir1
[jchristi@condense-38 ~]$
man
The most useful command! To find out about any command, type man, then
the name of the command. For example, for more information on the command
ls, type
[jchristi@condense-38 ~]$ man ls
more
To look in a text file one screen at a time, you can use the more command.
[jchristi@condense-38 ~]$ more filename
less
A more powerful command to look at text files is less, which allows us to move
backwards and forwards in the file.
[jchristi@condense-38 ~]$ less filename
head
To look at the start of a file, use head. By default, it shows the first 10 lines of
the file; use head -n 20 to see the first 20, for example.
[jchristi@condense-38 ~]$ head -n 2 sos.cpp
#include <iostream>
#include <fstream>
tail
The opposite command is tail, which looks at the end of a file. By default, it
shows the last 10 lines; use tail -n 20 to see the last 20, for example.
[jchristi@condense-38 ~]$ tail -n 2 sos.cpp
return 0;
}
grep
The command grep allows us to search inside files for a pattern. For example,
say I have files called file1 and file2 and I want to find out which of them
contains the word Diploma. I would type
[jchristi@condense-38 ~]$ grep Diploma file1 file2
file2:Diploma 2007-08, which will be
The program returns the whole line containing the word.
wc
The command wc counts the number of lines, words and characters in a file. For
example,
[jchristi@condense-38 ~]$ wc *cpp
 32  90  669 adjust2.cpp
 29  86  590 adjust.cpp
 61 176 1459 total
The first figure is the number of lines, the second figure is the number of words,
and the third is the number of characters.
ls -l
Although there are lots of options for ls, one of the most useful is ls -l, which
shows lots of information about the files in the directory. For example,
[jchristi@condense-38 ~]$ ls -l *cpp
-rw-r--r-- 1 jchristi smr1694 669 May 24 15:46 adjust2.cpp
-rw-r--r-- 1 jchristi smr1694 590 Dec 11 2006 adjust.cpp
Of most interest are the permissions, the owner (jchristi), the size of the file
(in bytes), and the date and time it was last modified.
*
The * acts as a wild card. For example, ls *cpp will list all files which end in
cpp, whereas ls cpp* will list all files which begin with cpp.
Tab completion
If you only know part of a filename or directory, by pressing the Tab key, you
can obtain the rest of the name, providing that it is unique. This speeds up
typing considerably.
1 Introduction

1.1 Introduction to Linux

1.1.1 The command line
The first part of this course is an introduction to the Linux operating system.
This is the operating system in which a lot of scientific computing is done. All
the computers in ICTP can boot into either the Linux1 or Windows operating
system. When the machine starts up, choose Linux, rather than Windows, from
the first menu. Eventually, you will be asked for your username and password.
When you provide this, you will log in to Linux.
Once you have logged into the computer and Linux is running, you will need
to open a terminal. In this course, we will be working mainly from the command line. This is a plain-text interface where we can type commands directly.
There are hundreds of possible commands and processes we can do from the
command line; Linux gives its users a lot of control over the computer and its
operating system. In this course I will concentrate on the essentials for scientific
computing.
A terminal can be opened with a program called XTerm or KTerm or Terminal
or something similar. When you do this, you will see something like this:
[jchristi@condense-38 ~]$
This is called the command line. You will see a slightly different command line
with your username and the name of the computer you are using (jchristi is
my username, and condense-38 is the name of the desktop computer in my
office, where I was writing these notes).
When you have this line, you can issue commands by typing their name, any
arguments that they take, and pressing the Enter key.
1.1.2 Files and directories
When you open a terminal, you are put in your home directory. Each directory
can contain files and other directories. You can create new directories with the
command mkdir, followed by the name of the directory to be created. You
should name the directory with a clear idea of what you can expect to find
in it. This makes for much easier filing and retrieval. For example, in my
home directory, I have directories called applications, backups, Diploma,
Documents, MDprogram, Papers, Pictures, private, among others. Many
of these directories have sub-directories inside them; for example, the Diploma
directory contains its own sub-directories.
1 There are many different types of Linux, and I will not go into the differences. In this
course, we will use whichever type happens to be installed on the ICTP computers.
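For example, to create a directory called programs and check that it has appeared (the name is just an illustration):

```shell
# Create a new directory called "programs" (an illustrative name)
mkdir programs
# List the contents of the current directory; programs should now appear
ls
```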
Creating files
There are very many ways to create text files. This course mainly covers programming, so it makes sense to choose an editor which offers some support for
programming. My favourite is called emacs, which is widely used. It can be
called from the command line by typing:
[jchristi@condense-38 ~]$ emacs &
The & means that emacs should be run in the background. This means that
control will be returned to the command line. emacs will open a new window
in which you can type.
When you have finished typing in emacs, you can save the file by executing
the command Ctrl-x Ctrl-s. The notation Ctrl-x means that the Ctrl key
should be held down, and the x key pressed while the Ctrl key is still being
held. To close emacs you use the command Ctrl-x Ctrl-c. To open emacs
and look at a specific file, type emacs filename &
Once you have created and saved a file, you can return to the command line,
type ls, and see that it will be listed there.
Sometimes you will not need to modify a file, but just to look at it. The
commands more and less do this: more allows us to go page by page forwards
through the file, whereas less allows us to go forward and backward. If you just
want to look at the first few lines of a file, you can use the head command, and
if you just want to look at the last few lines, you can use the tail command.
If you have many files, and can't remember which one contains a certain word
or phrase, the grep command will search the files for a given phrase.
The numbers of lines, words and characters in a certain file can be obtained
with the wc command.
Here, I have listed only a small subset of the commands: essentially, enough to
get you started. There are very many others, and you can ask us if there is
something specific that you want to do.
1.2 Introduction to FORTRAN

1.2.1 Introduction
In this course, you will learn how to program a computer to perform useful
scientific tasks. In order to do this, you will write code in a language called
FORTRAN. There are hundreds of different programming languages available.
For our purposes, FORTRAN has two main benefits: it is easy to learn, and a
lot of scientific code is written in it.
After having written the code in FORTRAN, you will need to compile it. To do
this you will use a program called a compiler, which translates the FORTRAN
code into machine code. Once we have done this, we will then create an executable file, which the computer can run. Usually, these two steps are done at
once.
1.2.2 Hello World!
Your first program will be very simple: you will write code which will print Hello
World! to the screen. Open emacs with the command emacs helloworld.f90 &.
helloworld.f90 does not yet exist, so emacs shows you a blank page.
The FORTRAN code is as follows:
program hello
write(*,*)'Hello World!'
end program
Type this code exactly as written into emacs and save the file with Ctrl-x
Ctrl-s.
The program starts with the line program hello. This tells the compiler that
the program starts there, and is called hello. You can name the program as
you wish. The command write(*,*) tells the compiler to write to the screen
whatever follows between the quotation marks. The program then ends with
end program so that the compiler knows the program has ended.
There are many different compilers that can be used. You can use g95 which
should be installed on the ICTP machines, and is available for free if not. In
order to compile the program, you must type the following on the command
line, after having saved the file:
[jchristi@condense-38 ~]$ g95 helloworld.f90 -o helloworld.x
This tells the compiler (g95) to compile the file helloworld.f90, and create an
executable program called helloworld.x. If this step is carried out successfully,
then the executable helloworld.x will appear when the files are listed with ls.
You can then run the program with the command ./helloworld.x
The ./ before the program name tells the computer that the executable file is
in the current directory.
[jchristi@condense-38 ~]$ ./helloworld.x
Hello World!
If all works well, you should see that the program has written Hello World! to
the screen, just as you wanted.
Pause here and make sure that you understand what you have done. Perhaps
change the text to print some other message to the screen (perhaps Hello world!
in your own language), or print more than one message. Change the name of
the executable file.
1.2.3 A second program
Now, you try a more complicated example. You will write a program to compute
the first 100 square numbers. Use emacs to open a new file called squares.f90,
and type the following exactly as written:
program squares
implicit none
! This program computes the first 100 squares
! and prints them to the screen
integer i
do i=1,100
write(*,*)i,i*i
enddo
end program
Don't worry about what the program does just yet; just compile it with
[jchristi@condense-38 ~]$ g95 squares.f90 -o squares.x
and run with
[jchristi@condense-38 ~]$ ./squares.x
You will see that it prints to the screen the numbers from 1 to 100, and their
squares.
When you want the computer to do a number of nearly identical things, you
can use a do loop, such as the one in the program above. The code inside the
do loop, which is everything between the do line and the enddo line, is executed
a number of times. This number is set by the first line of the loop. In this
case, the variable i is set equal to 1 to begin. Then the code inside the loop is
executed once. Then the program returns to the start of the loop, and adds 1
to i. This continues to happen until i is equal to 100, in our example. This
means that the code inside the loop is executed 100 times.
The code inside the loop is a simple write command such as you have already
used. It first writes i to the screen, and then the square of i, which in FORTRAN is written as i*i. Since i increases by 1 each time the loop is executed,
the program writes the first 100 numbers and their squares to the screen.
There is another new feature in this program. For the first time, you are using
a variable, which you have called i. The compiler needs to know what i is so
that it knows how much memory to allocate, for example. At the beginning
of the program, you must declare the variable: tell the compiler what type of
variable it is. Since here i is an integer, you include a line integer i. It is
possible for FORTRAN to use variables that have not been declared; however,
this is almost always a disaster, and offers a benefit only in a very small number
of cases. To turn off this option, and force FORTRAN to accept only declared
variables, the command implicit none is included at the start of the program,
before the declarations. This line should always be included in your programs.
The final new feature is the comments included in the program. The programs
you have written so far have been very simple, but many programs are extremely
complicated and would be difficult to understand just from the code alone. So
you can write comments to explain to yourself or future programmers what
parts of the code are doing. These comments begin with ! and the rest of the
line then forms the comment. The compiler ignores the comments.
Again, you should pause here and make sure you understand what this program
is doing. You can try changing the start and end points of the do loop, or change
the values written to the screen, for instance, writing the cube with i*i*i or
more complicated expressions like i*i - 5*i + 6
You can change the step size of the loop by including a third number in the
first line of the do loop. For example, do i=1,100,2 will add 2 to i after each
pass through the loop, so i takes the values 1, 3, 5, . . . , 99. If you want i to
decrease, use a negative step, as in do i=100,1,-1. Experiment with these options.
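As a small sketch of these variants (my own example, not from the original notes), the following program prints the even numbers from 2 to 10 and then counts down from 5 to 1:

```fortran
program steps
implicit none
integer i
! step size 2: i takes the values 2, 4, 6, 8, 10
do i=2,10,2
write(*,*)i
enddo
! negative step: i takes the values 5, 4, 3, 2, 1
do i=5,1,-1
write(*,*)i
enddo
end program
```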
In some cases, you will not want to have the output written to the screen, but
rather put into a file. The simplest way to do this is with the > command, as
follows:
[jchristi@condense-38 ~]$ ./squares.x >results
[jchristi@condense-38 ~]$
This creates a file called results, which you can examine with more, less, or
emacs, which contains the output of the squares.x program. You will use this
results file to generate some graphs.
1.2.4 Real numbers
So far, you have only used integer numbers. Using real numbers is slightly more
complicated. Later we will discuss the way numbers are stored in the computer's
memory. For the time being, let's assume that you want to use real numbers
with the highest accuracy, so-called double precision. (Single precision real
numbers also exist; they take up less memory, but are less accurate, and we
will not use them in this course.) Each compiler implements double-precision
differently. Look at the following code:
program real
implicit none
integer, parameter :: dp=kind(1.0d0)
real(kind=dp) i
integer j
i=2.7391d0
do j=1,10
write(*,*)j*i
enddo
end program
Firstly, all real numbers are written in the form 1.234d5 for 1.234 × 10⁵. The
d tells the compiler that this number must be double precision. In the first
line of code, the parameter dp is set to be equal to the kind of 1.0d0. This
means that the compiler will do the job of finding out the exact specification
of double-precision numbers for you. All you have to do is set the kind of any
remaining real numbers to dp, as done in the next line to the real number i.
The other new feature in this code is that dp is declared as a parameter. This
is FORTRAN's way of declaring a value that is not going to change during the
course of the program. Any variable can be declared as a parameter, and the
value must be set at the same time as the declaration. This has two benefits:
firstly, if you know a variable should not change during the program but you
accidentally do change it, the compiler will give you an error; and secondly, it
allows the compiler to make the code faster.
1.2.5 If statements
Many times, you will want to write code that is only executed if certain other
conditions are fulfilled. For this task, you can use an if statement. For example,
here is a code snippet2 which computes the square root of a number x, but only
if that number is greater than zero.
if(x.gt.0.0d0) then
y=sqrt(x)
write(*,*)y
else
write(*,*)'x is not greater than zero'
endif
The first thing an if statement tests is to see whether the statement in the
brackets, in this case, x.gt.0.0d0, is true or not. If it is true, then the first set
of commands is executed, if it is not true, then the second set of commands is
executed. This code also introduces the .gt. operator, which means greater
than. Similarly, the operator .lt. means less than.
It is possible to have more than two options. Each condition is evaluated in
turn, but once one has been found to be true, its commands are executed and
the program then passes to the end of the if statement.
if(x.gt.1000.0d0)then
write(*,*)'x is very big'
else if(x.gt.100.0d0)then
write(*,*)'x is quite big'
else if(x.gt.10.0d0)then
write(*,*)'x is quite small'
else
write(*,*)'x is very small'
endif
Sometimes, you will want to test more than one statement, and execute different
commands depending on how many of them are true. This can be accomplished
with .and. and .or. relations. For example, this code multiplies x and y only
if both of them are positive.
if((x.gt.0.0d0).and.(y.gt.0.0d0))then
z=x*y
endif
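The .or. relation is used in the same way; for instance, this snippet (my own illustration) prints a warning if at least one of x and y is negative:

```fortran
if((x.lt.0.0d0).or.(y.lt.0.0d0))then
write(*,*)'at least one of x and y is negative'
endif
```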
2 From now on, I will often write only the relevant pieces of code. Everything I write will
be easily transferable into legitimate FORTRAN code by simply declaring the variables, and
putting the start and end program commands.
The .eq. relation tests whether two variables are equal, and it should always
be used for this. A common mistake is to write if(x=2) instead of if(x.eq.2).
In FORTRAN, = means assignment, not comparison, so the compiler will reject
this; in languages such as C, the equivalent code compiles, silently sets x equal
to 2, and makes the test always evaluate as true.
While on the subject, another common mistake is not initialising variables. Take
a look at the following code, which multiplies two integers i and j, if they are
both positive.
program if
implicit none
! This program can cause errors in certain cases
! BE CAREFUL!
integer i,j,k
i=3
j=-2
if((i.gt.0).and.(j.gt.0))then
k=i*j
endif
write(*,*)k
end program
If i and j are both positive, there is no problem, but if not, as in the above
example, then the program never gives k a value. It will output to the screen, or
carry on using in the rest of the program, some random piece of memory. A good
compiler should be able to catch these mistakes for you, but you should be very
careful. I once lost a lot of time to an error caused in exactly this way.
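A simple way to make the program above safe (one possible fix, not the only one) is to give k a value before the if statement, so that it is defined whichever branch is taken:

```fortran
program ifsafe
implicit none
integer i,j,k
i=3
j=-2
! initialise k so that it has a defined value even when
! the body of the if statement is not executed
k=0
if((i.gt.0).and.(j.gt.0))then
k=i*j
endif
write(*,*)k
end program
```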
1.2.6 Mathematical functions
FORTRAN has many inbuilt mathematical functions, such as sin, cos, tan,
sqrt, exp and log. Inverse trigonometric functions are given by asin, acos
and atan. Some of them are used in the code below:
program mathexamples
implicit none
integer, parameter :: dp=kind(1.0d0)
real(kind=dp) a,b,c,d,e
real(kind=dp), parameter :: pi = 4.0d0*atan(1.0d0)
write(*,*)'pi = ',pi
a = sin(pi)
b = sin(2.0d0*pi)
c = cos(2.0d0*pi)
d = exp(pi)
e = log(d)
write(*,*)a,b,c,d,e
end program
Note that the answers are not exactly right. sin(π) and sin(2π) should both be
0, for example, and they are very slightly different, whilst the value of π itself
is also slightly wrong. Some causes for this behaviour will be discussed later.
1.2.7 Matrices
Very often in science, we will need to deal with matrices, often called arrays in
FORTRAN. Fortunately, these are easy to use. In FORTRAN, an array must
contain all elements of the same type, that is, all integers, or all reals, or all
logical variables etc.
Declaring and using an array is very simple. In this program, we declare and
initialise a 10 × 10 identity matrix, with ones on the diagonal, and zeroes
elsewhere.
program array
implicit none
integer identity_matrix(10,10)
integer i,j
do i=1,10
identity_matrix(i,i) = 1
do j=1,10
if(i.ne.j)then
identity_matrix(i,j) = 0
endif
enddo
enddo
end program
In FORTRAN, arrays are stored in column-major order: the first index varies
fastest in memory. Conventionally, the first index is read as the row and the
second as the column, so b(5,3) is the element in the fifth row and third column
of the matrix b. If you have learnt some C or C++, you will know that those
languages store their arrays the other way round.
Also, in this program, I have put one do loop inside another; this is called
nesting loops. I have also used the not equal to (.ne.) relationship.
Something to think about now: write a program which will multiply together
two square matrices. You will have to declare the matrices, give some values to
the elements, either by assigning them inside the program or reading them in,
and then deduce the right commands to multiply them correctly. I will give a
correct solution later.
1.2.8 Input and output
You have already seen how to write to the screen. It is often helpful to be able
to read from the screen as well. For example, take a simple yes/no question like
this.
program capitals
implicit none
character answer
write(*,*)'Do you want all capitals? y/n'
read(*,*)answer
if(answer.eq.'y')then
write(*,*)'HELLO WORLD!'
else if(answer.eq.'n')then
write(*,*)'hello world!'
else
write(*,*)'You should have typed y or n'
endif
end program
This program will wait for you to give an input which it will give to the variable
answer. If you type y, it will print the message in capitals, if you type n, it will
use small letters. If you use neither, you will get a message telling you what
you should have used.
This program also introduces a new variable type, that of character. If you
want to declare variables of type character with a given length, they should
be declared as follows, for a variable called surname with up to 20 characters:
character(20) surname
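As a small illustration (my own example, with an invented prompt), such a variable is read and written like any other:

```fortran
program names
implicit none
character(20) surname
write(*,*)'Type your surname'
read(*,*)surname
write(*,*)'Hello, ',surname
end program
```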
Rather than reading or writing to the screen, you will often want to read or
write to files. This is an example from my own work: reading in the positions
of some (72, in this case) atoms:
open(10,FILE='positions')
do i=1,72
do j=1,3
read(10,*)pos(j,i)
enddo
enddo
close(10)
Before reading from a file, it needs to be opened, which is done in the first line
above. The number 10 here is called the unit number, which is a unique number
between 9 and 99, assigned by the programmer to a specific file. Then, inside
two do loops, the atomic positions are read in to the array pos; each read
statement consumes one line of the file, so this code expects one number per
line. After having read all the necessary data, the file is closed. Note that if I
want to read or write from a different file, I must use a different unit number.
The syntax for writing is very similar.
open(11,FILE='newpositions')
do i=1,72
do j=1,3
write(11,*)newpos(j,i)
enddo
enddo
close(11)
These are the very simplest methods of reading and writing to files. Other
technicalities exist, but you do not need to know them at the beginning.
1.3 Introduction to gnuplot
Let's look at the results file you created earlier from the squares program. Start
gnuplot by typing gnuplot at the command line, then type as follows:
gnuplot> plot "results"
A graph should be displayed, with all your data points marked with red crosses.
If you prefer a nice clean line, type
gnuplot> plot "results" with lines
which will join the points for you.
You might want to check whether our FORTRAN program has given the correct
results. gnuplot can plot functions. Type
gnuplot> plot "results" with points, x*x with lines
You will see that the points from our file results sit perfectly on the line plotted
for us by gnuplot, verifying your program.
Go back to the squares program, and reprogram it to output the cubes as well,
by changing the relevant line to
write(*,*)i,i*i,i*i*i
Look at the results file so you know its format.
Start gnuplot again. You can plot the first column (x) against the second column
(x²) as before
gnuplot> plot "results" using 1:2 with lines
Now, you can also plot the first column (x) against the third (x³) using
gnuplot> plot "results" using 1:3 with lines
It is also possible to plot x² against x³, and I'm sure you can figure out how.
It is also possible to get a three-dimensional plot. To do this one must use the
command splot.
gnuplot> splot "results" using 1:2:3 with lines
Saving the graphs in postscript format is possible, although the syntax is slightly
complicated.
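For the curious, a standard sequence is the following (the filename squares.ps is just an example); set terminal switches gnuplot's output format, and set output directs the plot to a file:

```gnuplot
set terminal postscript
set output "squares.ps"
plot "results" with lines
```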
If you want to restrict your plot to a specific range, this can be done using
square brackets. For example, to have an x-range from 0 to 10, and a y-range
from 0 to 100, we can use:
gnuplot> plot [0:10][0:100] "results" u 1:2 w l
gnuplot offers many more features. Check out
http://t16web.lanl.gov/Kawano/gnuplot/intro/index-e.html for a good
online resource.
1.4 Representation of numbers

1.4.1 Integers
A computer can only give a finite amount of memory to each number. This
means that only a limited set of numbers can be stored. In the case of integers,
if 32 bits are used, which is common, the range of numbers available runs from
−2147483648 to 2147483647. At first, you might think that this would not be a
problem: after all, these are large numbers. Let's take a simple example where
this lack of storage becomes a problem.
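You can ask the compiler for this limit directly with the intrinsic function huge (a quick check, not part of the original notes):

```fortran
program intrange
implicit none
integer i
! huge(i) returns the largest value an integer of i's kind can hold,
! typically 2147483647 for a 32-bit integer
write(*,*)huge(i)
end program
```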
Say I want to sum the numbers from 1 to n. A simple code to do this is as
follows:
program integercrash
implicit none
integer i, total, n
write(*,*)'n='
read(*,*)n
total=0
do i=1,n
total = total + i
enddo
write(*,*)n,total
end program
Make sure you understand what this program does, compile and run it. It
should output the end of the range and the sum. Try it for some different
ranges less than 50000. Now change n to 70000 or some larger number. You
should see immediately that the program is not giving the right answer. The
right answer, which you can compute easily from n(n + 1)/2, is too large to be
correctly represented: for n = 70000 it is 2450035000, which exceeds 2147483647.
1.4.2 Floating-point numbers
For real (non-integer) numbers, the problem is even more acute. In addition
to the memory problem that exists for integers, it is not possible to represent
almost all real numbers exactly. The usual way of representing them is to use
64 bits for each number: 1 for the sign, 11 for the exponent, and 52 for the
mantissa.
This method of storage is rather complicated, but imagine the number 1.13 × 10⁸.
We don't need to write 113000000 to store the number, we just need to store
1.13 and 8, if we assume that the computer always uses base 10. Although a
real computer uses base 2, it stores numbers in exactly the same way. With 64
bits for each real number, this is equivalent to storing 15 or 16 significant figures
in base 10.
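You can see this limited precision directly with the intrinsic function epsilon, which returns the spacing of floating-point numbers near 1 (a quick check, not part of the original notes):

```fortran
program eps
implicit none
integer, parameter :: dp=kind(1.0d0)
real(kind=dp) x
x=1.0d0
! for 64-bit doubles this prints roughly 2.2e-16,
! i.e. about 15 or 16 significant decimal figures
write(*,*)epsilon(x)
end program
```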
This representation brings with it a number of issues and problems that you
should know about.
Many real numbers map to the same floating-point number. The computer has
only a limited range of precision, and so all real numbers closest to a given
floating-point number map to the same one.
Floating-point numbers do not form a group. That is, if you add two floating-point numbers, you will not get another floating-point number. The computer
will return the floating-point number closest to the sum. This will usually not
be exact.
If you add a very large number to a very small one, the very small number is
likely not to show up. For instance, adding 1 to 10²⁰ will give 10²⁰. There
is not enough memory to represent 20 significant figures of a number, and the
computer will drop all but the first 15 or so.
Subtracting two large numbers which are close to each other will often give
a number which is quite wrong. If this number is then used later on in the
program, wrong answers can result.
This program illustrates the last two of these problems. The '(g25.15)' in the
write statement is there to give the output format of the number.
program fperrors
implicit none
integer, parameter :: dp=kind(1.0d0)
real(kind=dp) x,y,z
integer i,j
x = 1.0d20
y = 1.0d0
write(*,'(g25.15)') x+y
x = 1.0d16+1.0d0
y = 1.0d16-1.0d0
write(*,*)'The answer should be 2, but is ',x-y
end program
A simple program shows the range of available real numbers, and what happens
once those ranges are exceeded.
program bigandsmall
implicit none
integer, parameter :: dp=kind(1.0d0)
real(kind=dp) x,y
integer i
x=1.0d0
y=1.0d0
do i=1,250
x = 50.0d0*x
y = y/50.0d0
write(*,*)x,y
enddo
end program
2 Root finding

2.1 Introduction

2.2 Bisection
If we know that the zero x lies in a certain range a < x < b, then we can use
the bisection method, which is the simplest of the four methods.
If the position x of the zero is such that a < x < b, then we know that
f(a)f(b) < 0. We simply split the range at the midpoint m = (a + b)/2 and then
find which half of the range the zero is in now, by testing which of f(a)f(m) or
f(b)f(m) is negative. We now have a range exactly half the size of the previous
range, and we know that it contains a zero. Figure 1 shows this process (the zero
is seen to be in the right-hand half of the range at this step). We then repeat
the process with the new smaller range. After a certain number of times, this
method will converge onto the zero of the function, within machine accuracy.
The advantage of this method is that it will always find the zero, and it is easy
to work out how many steps it will take to do so. (The size of the range will
halve at each step, so after n steps the zero is located to within (b - a)/2^n.)
Below is a copy of a working program which will find the zero of a function,
if it is given the function, a range which contains a zero, and a tolerance level
to tell the program when it is close enough to zero. Recall from the discussion
of floating-point numbers that the zero will almost certainly never be found
exactly.
The function is defined in this program separately as an example. In this case, it
is probably not necessary because the function is simple. In general, it can take
a much more complicated form.
! function definition
! I cannot declare a variable dp outside a function, so I
! have set the kind of these functions to the kind of 1.0d0
! directly
real(kind=kind(1.0d0)) function f(x)
real(kind=kind(1.0d0)) x
f = x*x - 3
return
end function
The function is defined after the program ends. Note that it is also required to
have a type (e.g. real, integer). The first line must declare the name of the
function, in this case f, and the arguments that the function takes, in this case
only one, called x. x also needs to be declared. The definition of the function in
this example is f = x2 3; you can of course change this. The keyword return
is then used to return the value of f (x) to the main program. The function
must end with end function.
In the main program, I have also used a do while loop. Previously, you have
seen a do loop which is executed a certain, specific number of times. In this
case, the do loop is executed only while a certain condition is true.
After compiling and running the program, you will see that the program outputs
an approximation to x = +√3. Starting from a range including the root x =
-√3 will give that answer. You can experiment by changing the range, or the
function.
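The bisection loop described above, consistent with the function f just defined, can be sketched as follows (a minimal sketch: the range, the tolerance and the variable names are illustrative, not taken from the original listing):

```fortran
! Sketch of a bisection driver for the function f above.
program zeroes_bisection
implicit none
integer, parameter :: dp=kind(1.0d0)
real(kind=dp) f
real(kind=dp) a,b,m,tol
a = 1.0d0      ! range known to contain the zero
b = 2.0d0
tol = 1.0d-10
do while (b-a > tol)
   m = (a+b)/2.0d0
   if (f(a)*f(m) < 0.0d0) then
      b = m    ! the zero is in the left half
   else
      a = m    ! the zero is in the right half
   endif
enddo
write(*,*)'The zero is approximately ',(a+b)/2.0d0
end program

real(kind=kind(1.0d0)) function f(x)
real(kind=kind(1.0d0)) x
f = x*x - 3
return
end function
```

Note that the loop terminates on the width of the range rather than on the value of f, which is what makes the number of iterations predictable.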
2.3 Regula falsi
The regula falsi method (also called the method of false position) is a slightly
more sophisticated method for finding the zero of a function. As before, we will
assume that there is only one point x at which f (x ) = 0, and that it is known
to lie in a certain range a < x < b.
The regula falsi method approximates the function f as a straight line between
the two points at the end of the range. This straight line will cross the x-axis
at a certain point m, which forms the next approximation to the zero. Defining
the two points at the end of the range as (a, f (a)) and (b, f (b)), the crossing
point m occurs at
m = b - f(b) (b - a)/(f(b) - f(a))        (1)
As in the bisection method, the program must then test to see in which subsection of the range, a < x < m or m < x < b, the zero occurs. The process
is then repeated on the smaller range, using (m, f (m)) and one of the other
values, such that the new range contains the zero. Figure 2 shows this method.
Figure 2: The regula falsi method. The solid line is the function f . The dotted
line is the straight-line approximation.
This method can converge faster than the bisection method. A disadvantage is
that an initial range known to contain the zero is needed. Also, the number of
iterations needed to find a solution is not known in advance. It can be useful
to stop the search after a given number of iterations, if the zero has not been
found.
The code is not included because it is very similar to the bisection method code.
The only difference is the method by which m is chosen.
2.4 The secant method
The secant method is very similar to the regula falsi method, but it does not
require an initial range which contains a zero. Instead it starts from any two
points, which we define as (a, f (a)) and (b, f (b)), and calculates where a straight
line passing through these two points will cross the x-axis. This point c occurs
at
c = b - f(b) (b - a)/(f(b) - f(a))        (2)
The points (b, f (b)) and (c, f (c)) are now used in the equation above to form
a fourth point, and so on, until the method eventually converges on the zero.
Figure 3 shows this method.
The advantages of this method are that it can be faster to converge than the
regula falsi method, and does not need a range containing a zero. However, it
has the disadvantage that, if the starting points are far away from the initial
zero, the method may not converge.
Figure 3: The secant method. The solid line is the function f . The dotted line
is the straight-line approximation.
The code is not included due to its similarity with the bisection and regula falsi
methods.
2.5 Newton's method
Newton's method (also known as the Newton-Raphson method for this case)
is the most sophisticated of the four methods we will discuss. It starts from a
single point (a, f(a)) and makes a straight-line approximation to the function
f which is tangent to f at the point (a, f(a)). In order to do this, the program
needs to know the derivative of the function f. In the case of a simple function,
this can be provided analytically, or it can be computed by the program.
Defining the derivative of f at the point a as f'(a), the new approximation
to the zero is given by b = a - f(a)/f'(a). The process is then repeated using
the point (b, f(b)) as the starting point, until the method is sufficiently close to
the zero.
A working code is given below.
! This program finds the zeroes of a function
! by Newton's method
program zeroes_newton
implicit none
integer, parameter :: dp=kind(1.0d0)
! the function and its derivative
real(kind=dp) f,df
real(kind=dp) a,b,tol
a = 1.0d0        ! starting point
tol = 1.0d-10
b = a - f(a)/df(a)
do while (abs(b-a) > tol)
   a = b
   b = a - f(a)/df(a)
enddo
write(*,*)'The zero is approximately ',b
end program

real(kind=kind(1.0d0)) function f(x)
real(kind=kind(1.0d0)) x
f = x*x - 3
return
end function

real(kind=kind(1.0d0)) function df(x)
real(kind=kind(1.0d0)) x
df = 2.0d0*x
return
end function
Figure 4: Newton's method. The solid line is the function f. The dotted line is
the straight-line approximation.
It is possible to define more sophisticated methods. Newton's method computes a
linear approximation to f using its first derivative. Cauchy's method computes a
quadratic approximation to f using its first and second derivatives, and three points.
2.6 Iterative procedures
Newton's method is one example of the solution to an equation found by iterating a function. I will give a further example.
Let p > 1. What is the value of the following infinite continued fraction?

a = 1/(p + 1/(p + 1/(p + ...)))        (3)

We can see that a solution to the above equation can be found iteratively as
follows.

x0 = 1        (4)
x1 = 1/(p + x0) = 1/(p + 1)        (5)
x2 = 1/(p + x1) = 1/(p + 1/(p + 1))        (6)
...        (7)
xk+1 = 1/(p + xk)        (8)

In other words,

xk+1 = F(xk),        (9)

where F is defined as

F(x) = 1/(p + x)        (10)
One can then compute a to arbitrary precision simply by iterating the above
functional form, using a code very similar to that given for Newton's method
above. Of course, in this case, the answer can be determined analytically:

x* = 1/(p + x*)        (11)
x*^2 + p x* - 1 = 0        (12)
x* = (-p + √(p^2 + 4))/2        (13)
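The iteration itself is only a few lines. A minimal sketch in Fortran (the value of p and the convergence tolerance are illustrative):

```fortran
! Sketch of the fixed-point iteration xk+1 = 1/(p + xk),
! compared against the analytical answer.
program iterate_cf
implicit none
integer, parameter :: dp=kind(1.0d0)
real(kind=dp) p,x,xold
p = 2.0d0
x = 1.0d0          ! initial guess x0
xold = 0.0d0
do while (abs(x-xold) > 1.0d-14)
   xold = x
   x = 1.0d0/(p + xold)
enddo
write(*,*)'Iterated value   ',x
write(*,*)'Analytical value ',(-p+sqrt(p*p+4.0d0))/2.0d0
end program
```

The iteration converges because |F'(x)| = 1/(p + x)^2 < 1 for p > 1 and x > 0, so F is a contraction.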
3 Linear algebra
3.1 Introduction
In this subsection, we will look at matrices and learn how to solve systems of
linear equations, taking careful note of the problems that can arise.
3.2 Elementary operations
3.3
Consider the following system of linear equations:

2x1 - 3x2 + x3 = 18        (14)
4x1 + x2 - 3x3 = -10        (15)
-x1 - 4x2 + 2x3 = 20        (16)
We can first ask which operations do not change the solution {x1, x2, x3}. There
are three of relevance: multiplying any of the equations by a (nonzero) constant,
re-ordering the equations, and adding any multiple of one equation to any multiple
of any other. I give these three without proof, but it is easy to satisfy yourself
that they will not alter the values xi. This means that we can rearrange our
system of equations using these three operations, for our convenience.
3.3.2
Gaussian elimination
The method below, which solves a set of linear equations, is called Gaussian
elimination. It uses the fact that we can add a multiple of one equation to any
other equation without altering the solution, to get rid of some of the coefficients.
We first add -2 times equation 14 to equation 15, to give the equations as follows:
2x1 - 3x2 + x3 = 18        (17)
7x2 - 5x3 = -46        (18)
-x1 - 4x2 + 2x3 = 20        (19)
The value of -2 was chosen specifically to make the coefficient of x1 zero in the
second equation.
Then we add 1/2 times equation 17 to equation 19. This gives us

2x1 - 3x2 + x3 = 18        (20)
7x2 - 5x3 = -46        (21)
-(11/2) x2 + (5/2) x3 = 29        (22)
The value of 1/2 was chosen to make the coefficient of x1 in the third equation
zero.
Now, we wish to make the coefficient of x2 in the same equation zero, which we
can do by adding 11/14 times equation 21 to equation 22. This gives us

2x1 - 3x2 + x3 = 18        (23)
7x2 - 5x3 = -46        (24)
-(20/14) x3 = -100/14        (25)
At this point, we essentially have the full solution. It is easy to see from equation
25 that x3 = 5. Substituting this value into 24 gives x2 = -3. The values of
x3 and x2 can then be substituted into equation 23 to give x1 = 2. The full
solution is {2, -3, 5}.
3.3.3 Computational implementation
The system of equations above can be written in matrix form as

( 2 -3  1 ) (x1)   ( 18  )
( 4  1 -3 ) (x2) = ( -10 )        (26)
(-1 -4  2 ) (x3)   ( 20  )

After the transformations carried out above, equations 23 - 25 can be written
as

( 2 -3  1      ) (x1)   (  18     )
( 0  7 -5      ) (x2) = ( -46     )        (27)
( 0  0 -20/14  ) (x3)   ( -100/14 )
In the following implementation, the matrices are stored as a, x and b respectively.
! This program solves a set of linear equations by Gaussian elimination
program gausselimination
implicit none
integer, parameter :: dp=kind(1.0d0)
integer, parameter :: ndim=3 ! dimension of matrix
real(kind=dp) p
integer i,j,k
! a is the set of coefficients, b is the set of answers
! x is the solution; a(:,i) holds the coefficients of row i
real(kind=dp) a(ndim,ndim),b(ndim),x(ndim)
! set matrix
open(11,file='matrix.in')
do i=1,ndim
   read(11,*)a(:,i)
enddo
read(11,*)
read(11,*)b(:)
close(11)
do i=1,ndim
   do j=i+1,ndim
      ! set pivot
      p = a(i,j)/a(i,i)
      ! work out the changed elements of row j
      do k=1,ndim
         a(k,j) = a(k,j) - p*a(k,i)
      enddo
      b(j) = b(j) - p*b(i)
   enddo
enddo
! back substitution, starting from the last row
do i=ndim,1,-1
   x(i) = b(i)
   do k=i+1,ndim
      x(i) = x(i) - a(k,i)*x(k)
   enddo
   x(i) = x(i)/a(i,i)
enddo
write(*,*)x
end program
When compiled and run, we see that the correct answers are given.
3.3.4 Pivoting
Consider the system

( 0 1 ) (x1)   ( 1 )
( 2 1 ) (x2) = ( 4 )        (28)

If we attempt Gaussian elimination as above, the first step divides by the
diagonal element, which is zero, and the method fails immediately, despite the
fact that the system is solvable by inspection with the answer x = {1.5, 1}.
The same difficulty is seen in the following system, although at first it is not
easily evident:

( ε 1 ) (x1)   ( 1 )
( 2 1 ) (x2) = ( 4 )        (29)

where ε is a very small number.
If we carry out the Gaussian elimination analytically, we find

( ε   1       ) (x1)   ( 1       )
( 0   1 - 2/ε ) (x2) = ( 4 - 2/ε )        (30)

This means that

x2 = (4 - 2/ε)/(1 - 2/ε)        (31)

If ε is very small, the computer will round both 4 - 2/ε and 1 - 2/ε to -2/ε,
giving x2 = 1 exactly. Substituting into the first equation,

x1 = (1 - x2)/ε,        (32)

this will give us x1 = 0. The true values for very small ε are {x1 = 1.5, x2 =
1 - 3ε/2}, which are consistent with the values {1.5, 1} found above for ε = 0. It is
clear that if the computer makes this approximation, the value of x2 will be
accurate, while that of x1 will be completely wrong.
Fortunately, the problem may be solved straightforwardly. If the order of the
rows is interchanged, such that the equations are now written as

( 2 1 ) (x1)   ( 4 )
( ε 1 ) (x2) = ( 1 )        (33)

and then the Gaussian elimination is performed, the following matrices are
obtained

( 2   1       ) (x1)   ( 4      )
( 0   1 - ε/2 ) (x2) = ( 1 - 2ε )        (34)

Starting from the bottom row, we find that

x2 = (1 - 2ε)/(1 - ε/2),        (35)

which the computer will set equal to 1 if ε is sufficiently small, and which, when
substituted into the top row, will give x1 = 1.5, as it should. We now have the
right answers.
This is a classic example of a small numerical instability causing a very large
error. We must be careful, whenever we use this elimination technique, to
minimise the risk of this happening. This type of row interchange is known as
partial pivoting.
The problem above occurred because the number ε was much smaller than the other
elements in the row. We solved the problem by interchanging the rows, which
meant the first coefficient of the pivot row was of comparable size to the other
elements. The technique of partial pivoting is the act of interchanging the rows
at each step such that the element of largest magnitude in the current column is
on the diagonal. (One can also use complete pivoting, which is the interchange of
rows and columns. However, the extra complexity usually outweighs any benefit.)
In the example above, we interchanged the rows so that 2 was on the diagonal,
rather than the much smaller ε.
Although we did not use partial pivoting in the code above, in practice, it should
always be used with elimination techniques of this type.
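As a sketch of how partial pivoting fits into the elimination loop, the following program solves the small ε system above with the rows interchanged automatically at each step. The value of ε, the variable names and the right-hand side are illustrative; here a(i,j) is the element in row i, column j, unlike the convention of the earlier listing.

```fortran
! Sketch: Gaussian elimination with partial pivoting on the
! 2x2 system (eps 1; 2 1)(x1 x2) = (1 4) discussed above.
program pivot_demo
implicit none
integer, parameter :: dp=kind(1.0d0)
integer, parameter :: n=2
real(kind=dp), parameter :: eps = 1.0d-17
real(kind=dp) a(n,n),b(n),x(n),p,swp
integer i,j,k,jmax
a(1,1)=eps;   a(1,2)=1.0d0; b(1)=1.0d0
a(2,1)=2.0d0; a(2,2)=1.0d0; b(2)=4.0d0
do i=1,n
   ! partial pivoting: bring the largest |a(j,i)| onto the diagonal
   jmax = i
   do j=i+1,n
      if (abs(a(j,i)) > abs(a(jmax,i))) jmax = j
   enddo
   if (jmax /= i) then
      do k=1,n
         swp=a(i,k); a(i,k)=a(jmax,k); a(jmax,k)=swp
      enddo
      swp=b(i); b(i)=b(jmax); b(jmax)=swp
   endif
   ! eliminate the entries below the diagonal
   do j=i+1,n
      p = a(j,i)/a(i,i)
      do k=1,n
         a(j,k) = a(j,k) - p*a(i,k)
      enddo
      b(j) = b(j) - p*b(i)
   enddo
enddo
! back substitution
do i=n,1,-1
   x(i) = b(i)
   do k=i+1,n
      x(i) = x(i) - a(i,k)*x(k)
   enddo
   x(i) = x(i)/a(i,i)
enddo
write(*,*)x   ! should be close to {1.5, 1}
end program
```

If the pivoting block is commented out, the program reproduces the wrong answer x1 = 0 discussed above.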
3.3.5 Singular matrices
Consider the matrix equation

( 1 2 ) (x1)   (  5 )
( 2 4 ) (x2) = ( 10 )        (36)

It is clear that the two equations are equivalent (the second is twice the first),
and that there are an infinite number of solutions for x1 and x2. Consider the
matrix equation

( 1 2 ) (x1)   (  5 )
( 2 4 ) (x2) = ( 12 )        (37)
It is clear that the equations are inconsistent, and that there is no solution for x1
and x2 . If we were to attempt Gaussian elimination on either of these matrices,
we would see a zero appear on the diagonal. This is a sign that Gaussian
elimination cannot proceed, and that the matrix is singular. A singular
matrix has a determinant of zero, and does not have an inverse.
Another way to tell if the system will not have a unique solution is to assess
whether the rows (or columns) are linearly independent. The condition for linear
independence is that it is not possible to construct a linear combination of the
rows (or columns) equal to zero (except for the trivial case 0x1 +0x2 +0x3 +. . .).
If the rows (or columns) are linearly independent, then a unique solution will
exist.
In the above cases, it is easy to see that the equations have no unique solution.
As the systems become larger, it becomes harder. At first sight, the dependency
in the following system is hard to spot, but it can be found from twice the first
column minus the second column plus the third column being equal to zero. If
Gaussian elimination is attempted, it will fail.

(  2  3  -1 ) (x1)   ( 12 )
(  4 -2 -10 ) (x2) = (  8 )        (38)
( -1  2   4 ) (x3)   (  5 )
It is not just singular matrices which can cause problems. Certain matrices
are very sensitive to the effects of small errors. Such systems are called
ill-conditioned, and are characterised by very large changes in the answer for very
small changes in the coefficients. A classic example of an ill-conditioned matrix
is the Hilbert matrix, whose elements are Hij = 1/(i + j - 1). The
matrix is not singular, but is very nearly so, and the computation of the inverse
of even a 20 × 20 Hilbert matrix will be all but impossible to do accurately.
3.3.6 Determinants
The determinant of a triangular matrix is simply the product of its diagonal
elements, and adding a multiple of one row to another does not change the
determinant. We can therefore compute a determinant by Gaussian elimination.
For the matrix of equation 26,

(  2 -3  1 )
(  4  1 -3 )        (39)
( -1 -4  2 )

the elimination carried out above produced the triangular matrix

( 2 -3  1      )
( 0  7 -5      )        (40)
( 0  0 -20/14  )

No row interchanges or multiplications were done to obtain this matrix, so the
determinant is 2 × 7 × (-20/14) = -20, as before. For larger matrices, calculating
the determinant via Gaussian elimination offers a considerable saving in time.
3.3.7
3.3.8
The methods above are valid for matrices in general. If the matrix has a definite
structure, with many elements zero, for example, a tridiagonal or sparse matrix,
then other algorithms can be found, which are faster. Some of these fall under
the class of iterative methods.
3.3.9
Iterative methods
In certain cases, iterative methods have advantages over the elimination methods
which we have already discussed. In particular, if the matrix is sparse (mostly
zeros), then they are faster. They also have very small memory requirements,
and they are very easy to parallelise (spreading the computational load over
many processors).
The first step is to solve each of the equations for one of the variables. If possible,
solve for the variable with the largest coefficient.
Then, we start with an initial estimate of our solution {x1 , x2 , x3 }. This does
not have to be particularly accurate. We substitute our approximation into the
right-hand side of the equations. Evaluating the equations gives us new values
for {x1 , x2 , x3 }. These new values, we hope, are closer than the old values.
They are then substituted in the right-hand sides of the above equations, and
the procedure is repeated until the difference between successive iterations is
sufficiently small.
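The procedure just described can be sketched as follows for a small, diagonally dominant system. The example system, the tolerance and the iteration cap are illustrative, not taken from the notes.

```fortran
! Sketch of the Jacobi iteration for the system
!   10 x1 +  x2 +  x3 = 12
!    x1 + 10 x2 +  x3 = 12
!    x1 +  x2 + 10 x3 = 12
! whose solution is x1 = x2 = x3 = 1.
program jacobi
implicit none
integer, parameter :: dp=kind(1.0d0)
integer, parameter :: ndim=3
real(kind=dp) a(ndim,ndim),b(ndim),x(ndim),xnew(ndim)
integer i,j,iter
a = 1.0d0
do i=1,ndim
   a(i,i) = 10.0d0      ! large diagonal: solve row i for x(i)
   b(i) = 12.0d0
enddo
x = 0.0d0               ! initial estimate; need not be accurate
do iter=1,100
   do i=1,ndim
      xnew(i) = b(i)
      do j=1,ndim
         if (j /= i) xnew(i) = xnew(i) - a(i,j)*x(j)
      enddo
      xnew(i) = xnew(i)/a(i,i)
   enddo
   if (maxval(abs(xnew-x)) < 1.0d-12) exit
   x = xnew              ! all components updated together (Jacobi)
enddo
write(*,*)'Solution: ',xnew
end program
```

Replacing x only at the end of each sweep is what distinguishes Jacobi from Gauss-Seidel; updating x(i) as soon as xnew(i) is available would give the Gauss-Seidel variant mentioned below.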
Note that the method can be made even faster by using the updated values of
the components of x within the same iteration. This is known as the Gauss-Seidel
method. It will converge more quickly, but is harder to parallelise.
Another profound advantage of these methods is that they are self-correcting,
at least to some extent. Any errors which are made (due to rounding or other
problems) will be removed at successive steps of iteration. A disadvantage is
that the method will not always converge on the solution.
4 Numerical integration
4.1 Introduction
(Figure: the function to be integrated.)
4.2 Trapezium rule
The simplest numerical scheme for integrating a function is called the trapezium
rule. The area under the curve of the function is divided into thin strips of width
Δx. The curve of the function is approximated as a straight line, such that each
strip takes the form of a trapezium, hence the name. The area under the curve
is then just the sum of the areas of these trapezia. Figure 5 illustrates this
procedure.

∫[a,b] f(x) dx ≈ Σ_{i=0}^{n-1} (Δx/2) [f(a + iΔx) + f(a + (i+1)Δx)],        (41)

where the integral from a to b is split into n trapezia of equal width Δx, such
that Δx = (b - a)/n.
From the definition of integration, as Δx tends to 0, this will give the exact
numerical value of the integral. In the computer, of course, we cannot use such
a small Δx, but in general, the approximation will be a good one for small Δx.
A working code is given below. The function to be integrated, in this case
f = x^2 - 3, is given in a separate function.
! evaluates an integral using the trapezium rule
program trapezium_rule
implicit none
integer, parameter :: dp=kind(1.0d0)
42
integer i
real(kind=dp) lhs,rhs
! function
real(kind=dp) f
! the range of integration
real(kind=dp) a,b
! the number of steps
integer n
! width of one strip, all assumed to be the same width
real(kind=dp) dx
! value of integral
real(kind=dp)answer
! set range
a = 1.0d0
b = 2.0d0
n = 10000
! number of steps
dx = (b-a)/n
answer = 0.0d0
do i=0,n-1
lhs = a+(i*dx)
rhs = a+((i+1)*dx)
answer = answer + (f(lhs) + f(rhs))*dx/2.0d0
enddo
write(*,*) answer
end program
real(kind=kind(1.0d0)) function f(x)
real(kind=kind(1.0d0)) x
f = x*x - 3
return
end function
The above program integrates the function f = x^2 - 3 between the values of
1 and 2, dividing the range into 10000 strips. You can check analytically that
the answer should be -2/3, and when the program is run, 10000 strips is good
enough to reproduce this answer to a high level of accuracy. The accuracy will
depend on the number of steps chosen. You must choose this number carefully.
If you make it too high, the strips will be too narrow to be evaluated correctly,
and/or the integral will take a very long time to evaluate. Too small a number
of strips will not be a good approximation to the function. If practical, the
integration should be repeated with different numbers of strips, so that you can
check that your answer has converged.
Of course, more sophisticated expressions can be developed. We will go on to
discuss these, but first we must take a look at function interpolation.
4.3 Function interpolation
Let's imagine that we have a set of data (x1, f(x1)), (x2, f(x2)), . . ., but we don't
know the function f. We can approximate a set of n + 1 data points with a
polynomial function of degree n. I state without proof that there is one
and only one polynomial of degree at most n which will pass through all the data points.
As a concrete example, take the points (1, -2), (3, 12), and (-1, 6). Since there
are three data points, I know that I can find a quadratic function which passes
through all three points.
There is a simple method of doing this. If I define my data points as (xi, f(xi)),
then this quadratic P2(x) will fit the data points:

P2(x) = [(x - x2)(x - x3)] / [(x1 - x2)(x1 - x3)] f(x1)
      + [(x - x1)(x - x3)] / [(x2 - x1)(x2 - x3)] f(x2)
      + [(x - x1)(x - x2)] / [(x3 - x1)(x3 - x2)] f(x3)        (42)
It is easy to see that in the case where x is equal to one of the xi, all but one
of the terms will vanish because their numerators will be zero. The term which
remains will be equal to f(xi), because the numerator and denominator are
equal.
This equation is obviously quadratic in x, and passes through the three points
we have been given. Some simple algebra shows that (11/4)x^2 - 4x - 3/4 is our
interpolating function. We can then use this function to evaluate intermediate
values of f(x). This procedure is very simple to implement computationally.
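As a sketch of that implementation, the following program evaluates the interpolating polynomial through the three example points at an arbitrary x, and compares it with the explicit quadratic above (the evaluation point x = 2 is illustrative):

```fortran
! Sketch: evaluate the Lagrangian interpolating polynomial
! through the points (1,-2), (3,12) and (-1,6).
program lagrange
implicit none
integer, parameter :: dp=kind(1.0d0)
integer, parameter :: n=3
real(kind=dp) xi(n),fi(n),x,p,term
integer i,j
xi = (/ 1.0d0, 3.0d0, -1.0d0 /)
fi = (/ -2.0d0, 12.0d0, 6.0d0 /)
x = 2.0d0
p = 0.0d0
do i=1,n
   ! build the ith term f(xi) * prod_{j/=i} (x-xj)/(xi-xj)
   term = fi(i)
   do j=1,n
      if (j /= i) term = term*(x-xi(j))/(xi(i)-xi(j))
   enddo
   p = p + term
enddo
! compare with the explicit quadratic (11/4)x^2 - 4x - 3/4
write(*,*)p, 2.75d0*x*x - 4.0d0*x - 0.75d0
end program
```

The double loop is a direct transcription of the general formula given below, and works for any number of distinct x-values.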
Let's write this more generally. If I am interpolating n data points, with a
polynomial of degree n - 1, then the general formula will be

Pn-1(x) = Σ_{i=1}^{n} [ Π_{j=1, j≠i}^{n} (x - xj)/(xi - xj) ] f(xi)        (44)
Note that this method will work whether the data are evenly spaced or not, as
long as the x-values are distinct. This type of interpolating polynomial is known
as the Lagrangian polynomial. Of course, one can interpolate with other types
of functions as well, but that is beyond the scope of this course.
4.4
Consider first interpolating two points (x1, f(x1)) and (x2, f(x2)) with a straight
line:

P1(x) = (x - x2)/(x1 - x2) f(x1) + (x - x1)/(x2 - x1) f(x2)        (45)
It is clear that this function is linear in x, and has the values f (x1 ) and f (x2 )
at the points x1 and x2 respectively.
We have approximated our function f(x) by this linear function P1(x) in the
region x1 < x < x2. If the approximation is a good one then

∫[x1,x2] f(x) dx ≈ ∫[x1,x2] P1(x) dx        (46)

= f(x1) [ (x - x2)^2 / (2(x1 - x2)) ]_{x1}^{x2} + f(x2) [ (x - x1)^2 / (2(x2 - x1)) ]_{x1}^{x2}        (47)

= f(x1) (x2 - x1)/2 + f(x2) (x2 - x1)/2        (48)

= (x2 - x1) [f(x1) + f(x2)] / 2        (49)
This is exactly the trapezium rule which was discussed above. The connection
between function interpolation and numerical integration should now be clear.
The trapezium rule is merely the simplest interpolation that can be made over a
small range with a polynomial function. More complicated expressions, that is,
polynomials of higher degree, can be used as approximations in a given range.
For example, fitting a quadratic through three points (x1, f(x1)), (x2, f(x2))
and (x3, f(x3)), as in equation 42,

P2(x) = [(x - x2)(x - x3)] / [(x1 - x2)(x1 - x3)] f(x1)
      + [(x - x1)(x - x3)] / [(x2 - x1)(x2 - x3)] f(x2)
      + [(x - x1)(x - x2)] / [(x3 - x1)(x3 - x2)] f(x3)        (51)

and integrating between x1 and x3, with x2 = (x1 + x3)/2 the midpoint and
h = x2 - x1, gives Simpson's rule:

∫[x1,x3] f(x) dx ≈ (h/3) [f(x1) + 4 f(x2) + f(x3)]        (52)
Simpson's rule is only useful if we can evaluate the function at the intermediate
value x2 = (x1 + x3)/2. If we are trying to integrate over an experimental
data set, for example, this will not be possible, but the trapezium rule will still
work.
The procedure for integrating a function with Simpson's rule is exactly the
same as that for the trapezium rule. The range is divided into thin strips, the
contribution of each of which to the integral is evaluated using the relevant
approximation. The total value of the integral is just the sum of the contributions.
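A sketch of this procedure, applied to the same integral as the trapezium-rule program above (the strip count is illustrative):

```fortran
! Sketch: Simpson's rule for f = x^2 - 3 on [1,2], mirroring
! the trapezium-rule program above.
program simpson_rule
implicit none
integer, parameter :: dp=kind(1.0d0)
real(kind=dp) f
real(kind=dp) a,b,dx,lhs,rhs,mid,answer
integer i,n
a = 1.0d0
b = 2.0d0
n = 1000
dx = (b-a)/n
answer = 0.0d0
do i=0,n-1
   lhs = a+(i*dx)
   rhs = a+((i+1)*dx)
   mid = (lhs+rhs)/2.0d0
   ! one strip of full width dx: (dx/6)*(f(lhs)+4f(mid)+f(rhs))
   answer = answer + (f(lhs)+4.0d0*f(mid)+f(rhs))*dx/6.0d0
enddo
write(*,*)answer   ! should be close to -2/3
end program

real(kind=kind(1.0d0)) function f(x)
real(kind=kind(1.0d0)) x
f = x*x - 3
return
end function
```

Since h in equation 52 is half the full strip width dx, the weight h/3 becomes dx/6 here. For this quadratic integrand Simpson's rule is exact up to rounding, whatever the number of strips.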
4.5 Adaptive quadrature
So far, we have only looked at integrating data in equally sized strips. There
are two cases in which this is not desirable. Firstly, if we are integrating actual
data, they may not be evenly spaced, and secondly, some functions may vary a
lot faster in some regions than in others. It would be helpful, therefore, to have
a method of varying the step size such that it is larger where the function is
flatter, and smaller where the function varies more quickly.
Of course, we could just plot the function and vary the step size by hand, perhaps
running the program separately on a series of ranges with an appropriate number
of steps, and then summing the results to get the total value of the integral.
However, it would be better to have an automated scheme. Schemes where the
strip size varies are often called adaptive quadrature schemes.
The principle of the method is straightforward. The overall integration range is
subdivided into strips as before. Each strip is integrated using Simpson's rule.
Each strip is then split in half at the midpoint, and the two pieces are each
integrated using Simpson's rule. If the sum of these two integrals is within
a user-specified tolerance of the integral of the full strip, then it is accurate
enough, and the program moves on to the next strip. If not, each sub-strip is
divided in two, and the process is repeated, until the strip is small enough that
the process gives an answer within the tolerance.
The errors will be additive, so if you want the whole integral to be evaluated
with a maximum error of ε, and there are n strips, then each one must be
evaluated to within a tolerance of less than ε/n.
4.6
The error made in a single strip can be found from a Taylor expansion. For the
trapezium rule, the error in one strip is

E_trap = -(Δx)^3 f''(ξ)/12,        (54)

and for Simpson's rule it is

E_Simp = -(Δx)^5 f''''(ξ)/2880,        (55)

where the endpoints of one strip are assumed to be x1 and x2, Δx = x2 - x1, and
f^(n)(ξ) is the nth derivative of f evaluated at some point ξ inside the strip.
From these results, it is clear that
Simpson's rule is considerably more accurate. In fact, using a cubic interpolating
polynomial in each strip is no more accurate.
5 Numerical differentiation
5.1 Introduction
5.2
The derivative of a function f is defined as

f'(x) = lim_{Δx→0} [f(x + Δx) - f(x)] / Δx.        (56)

The simplest computational approximation is to use this formula with a small
but finite Δx:

f'(x) ≈ [f(x + Δx) - f(x)] / Δx.        (57)

To see how accurate this is, write down the Taylor expansion

f(x + Δx) = f(x) + Δx f'(x) + (Δx^2/2!) f''(x) + (Δx^3/3!) f'''(x) + . . . ,        (58)

so rearranging, we find

f'(x) = (1/Δx) [f(x + Δx) - f(x)] - (Δx/2!) f''(x) - (Δx^2/3!) f'''(x) - . . . .        (59)

Because Δx is always very small, we can neglect the term in Δx^2 and all smaller
terms, and say that the error in this approximation is of order Δx.
5.3
It would be preferable to have an approximation with a smaller error. Fortunately,
this can easily be achieved. We can create Taylor expansions based on
equation 58 for the functions f(x + Δx) and f(x - Δx). We can then construct
the difference of the two, f(x + Δx) - f(x - Δx), and we can see that every second
term in the expansion will be cancelled, as follows:

f(x + Δx) = f(x) + Δx f'(x) + (Δx^2/2!) f''(x) + (Δx^3/3!) f'''(x) + . . .        (60)

f(x - Δx) = f(x) - Δx f'(x) + (Δx^2/2!) f''(x) - (Δx^3/3!) f'''(x) + . . . ,        (61)

so that

f(x + Δx) - f(x - Δx) = 2Δx f'(x) + (Δx^3/3) f'''(x) + . . . ,        (62)

and hence

f'(x) = [f(x + Δx) - f(x - Δx)] / (2Δx) - (Δx^2/6) f'''(x) + . . . .        (63)

Now, if we only use the first term in this approximation, we see that the largest
term which we ignore is of order Δx^2, which is significantly smaller than Δx.
This method is much more accurate than our first guess.
A working code for this is given as follows. We will be differentiating the function
f(x) = x^4 - 3 at the point x = 1. It is easy to calculate that the value of the
derivative at this point is 4.
! uses the centred-difference method to differentiate a known function
program diff
implicit none
integer, parameter :: dp=kind(1.0d0)
! The function
real(kind=dp)f
! The step size delta x
real(kind=dp), parameter :: dx = 1.0d-7
! The point at which we wish to differentiate
real(kind=dp), parameter :: x = 1.0d0
! The derivative
real(kind=dp) df
df = ((f(x+dx)-f(x-dx))/(2.0d0*dx))
write(*,*)'The derivative is ',df
end program
real(kind=kind(1.0d0)) function f(x)
real(kind=kind(1.0d0)) x
f = x**4-3
return
end function
On my laptop, with a step size of Δx = 10^-7 as above, the answer given for
the derivative was 4.000000000115022,⁴ which is pleasingly accurate. It will be
instructive to investigate the behaviour of the answer as we vary the step size.
When I varied the step size, I obtained the answers given in Table 1. Your
answers may be slightly different. Recall that the real
answer is exactly 4.
4 Your value will very probably be different because you are using different computers.
When doing this for real, you must always test the accuracy of your program on the system
it will be run on.
Δx        Calculated derivative
10^0      8
10^-1     4.04
10^-2     4.000400000000015
10^-3     4.000003999999668
10^-4     4.000000039998675
10^-5     4.000000000403681
10^-6     3.9999999998929776
10^-7     4.000000000115022
10^-8     3.9999999978945757
10^-9     4.000000108916878
10^-10    4.000000330961483
10^-11    4.000000330961483
10^-12    4.000133557724438
10^-13    3.9990233346998134
10^-14    3.996802888650563
10^-15    4.218847493575595
10^-16    2.2204460492503126
10^-17    0
10^-18    0
10^-19    0

Table 1: The value of the calculated derivative and how it depends on Δx. The
exact answer is 4.
5.4 Higher-order derivatives
By keeping more terms in the Taylor expansion, it is possible to construct
computational expressions for higher-order derivatives, although careful attention
should be paid to the accuracy of these schemes: if differentiating a function
once introduces certain errors, differentiating again compounds those errors on
top of the already slightly wrong answers.
The expression for the second derivative, which can be found by substituting our
expression for f'(x) back into the main Taylor expansion, is

f''(x) = [f(x + Δx) - 2 f(x) + f(x - Δx)] / Δx^2        (64)
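Equation 64 can be sketched as follows, reusing the function f(x) = x^4 - 3 from the differentiation program above. The step size is illustrative, and is deliberately larger than before: the Δx^2 in the denominator makes very small steps even more dangerous here.

```fortran
! Sketch: second derivative of f = x^4 - 3 at x = 1 using
! the centred-difference formula; the exact answer is 12.
program diff2
implicit none
integer, parameter :: dp=kind(1.0d0)
real(kind=dp) f
real(kind=dp), parameter :: dx = 1.0d-4
real(kind=dp), parameter :: x = 1.0d0
real(kind=dp) d2f
d2f = (f(x+dx) - 2.0d0*f(x) + f(x-dx))/(dx*dx)
write(*,*)'The second derivative is ',d2f
end program

real(kind=kind(1.0d0)) function f(x)
real(kind=kind(1.0d0)) x
f = x**4-3
return
end function
```

It is instructive to repeat the step-size experiment of Table 1 with this program; the breakdown due to subtracting nearly equal numbers sets in at much larger Δx than for the first derivative.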
5.5
Just as the trapezium rule and Simpson's rule from the integration subsection
are connected to function interpolation, so the numerical rules for differentiation
can be recovered in the same way. For example, let's assume we are approximating
three points (x1, f(x1)), (x2, f(x2)) and (x3, f(x3)) by a quadratic polynomial
exactly as we did in the integration subsection.
Then we have

P2(x) = [(x - x2)(x - x3)] / [(x1 - x2)(x1 - x3)] f(x1)
      + [(x - x1)(x - x3)] / [(x2 - x1)(x2 - x3)] f(x2)
      + [(x - x1)(x - x2)] / [(x3 - x1)(x3 - x2)] f(x3)        (65)

Differentiating with respect to x gives

P2'(x) = (2x - x2 - x3) / [(x1 - x2)(x1 - x3)] f(x1)
       + (2x - x1 - x3) / [(x2 - x1)(x2 - x3)] f(x2)
       + (2x - x1 - x2) / [(x3 - x1)(x3 - x2)] f(x3)        (67)

and evaluating this at the middle point x = x2 gives

P2'(x2) = (x2 - x3) / [(x1 - x2)(x1 - x3)] f(x1)
        + (2x2 - x1 - x3) / [(x2 - x1)(x2 - x3)] f(x2)
        + (x2 - x1) / [(x3 - x1)(x3 - x2)] f(x3)        (69)

The above equation is general. We can link it to the other work if we assume
that the three points are evenly spaced, and occur at x1 = x - Δx, x2 = x,
x3 = x + Δx; then we find

P2'(x) = -f(x - Δx)/(2Δx) + f(x + Δx)/(2Δx)        (71)

But this is just the centred-difference expression for the derivative we have
already used.
5.6 Richardson extrapolation
We have discussed at some length in this course the errors inherent in computing
various properties, and the care needed in programming and running to take
account of them. In many cases, the sizes of the errors can be reduced by
a clever choice of algorithm. Richardson extrapolation is one such technique,
which allows us, in this case, to calculate formulas for the derivatives with
even greater accuracy.
Above, I have given two formulas for computing f'(x). They had errors of the
order of Δx and Δx^2 respectively. Using Richardson extrapolation, we can create
formulas with more accuracy.
From the Taylor expansions of f(x + Δx) and f(x - Δx), we can see that in the
difference, all the terms with an even power of Δx will vanish:

f(x + Δx) - f(x - Δx) = 2Δx f'(x) + (2/3!) Δx^3 f^(3)(x) + (2/5!) Δx^5 f^(5)(x) + . . .        (72)

Define the centred-difference estimate

F(Δx) = (1/(2Δx)) [f(x + Δx) - f(x - Δx)],        (73)

so that F(Δx) = f'(x) + (Δx^2/3!) f^(3)(x) + (Δx^4/5!) f^(5)(x) + . . . .
Evaluating F at two different step sizes, and combining the results so that the
Δx^2 terms cancel, gives

f'(x) = [4 F(Δx/2) - F(Δx)] / 3 + O(Δx^4).        (74)
This equation gives an approximation for f'(x) which has an error of the order of
Δx^4, which is much more accurate than our previous two expressions. Although
this seems like a massive advantage, the disadvantage is that one has to carry
out two evaluations of F, which will increase the computing time.
Of course, this is only the first step. It is possible to carry out the same procedure
to eliminate the term proportional to Δx^4. This will give an equation with three
evaluations of F, and an error of the order of Δx^6. This can then be repeated to arbitrary
accuracy, introducing at each step an additional evaluation of the function F.
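One step of this extrapolation can be sketched as follows; here F is the centred-difference estimate, written as a separate function, and the step size is illustrative:

```fortran
! Sketch: one step of Richardson extrapolation for f = x^4 - 3
! at x = 1, combining F(dx) and F(dx/2); the exact answer is 4.
program richardson
implicit none
integer, parameter :: dp=kind(1.0d0)
real(kind=dp) bigF
real(kind=dp), parameter :: dx = 1.0d-2
real(kind=dp), parameter :: x = 1.0d0
write(*,*)'Centred difference: ',bigF(x,dx)
write(*,*)'Richardson:         ',(4.0d0*bigF(x,dx/2.0d0)-bigF(x,dx))/3.0d0
end program

! the centred-difference estimate F(dx)
real(kind=kind(1.0d0)) function bigF(x,dx)
real(kind=kind(1.0d0)) x,dx,f
bigF = (f(x+dx)-f(x-dx))/(2.0d0*dx)
return
end function

real(kind=kind(1.0d0)) function f(x)
real(kind=kind(1.0d0)) x
f = x**4-3
return
end function
```

Even with the relatively coarse step Δx = 10^-2, the extrapolated value is far closer to 4 than the plain centred difference, at the cost of one extra evaluation of F.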