Chapter 5 - Arrays and String and Ch6-Pointers
Chapter 5 - Arrays and String and Ch6-Pointers
Fundamental of Programming
Chapter Five
Array and String
5.1. Introduction
Variables in a program have values associated with them. During program execution
these values are accessed by using the identifier associated with the variable in
expressions etc. In none of the programs written so far have very many variables been
used to represent the values that were required. Thus even though programs have been
written that could handle large lists of numbers it has not been necessary to use a separate
identifier for each number in the list. This is because in all these programs it has never
been necessary to keep a note of each number individually for later processing. For
example in summing the numbers in a list only one variable was used to hold the current
entered number which was added to the accumulated sum and was then overwritten by
the next number entered. If that value were required again later in the program there
would be no way of accessing it because the value has now been overwritten by the later
input.
If only a few values were involved a different identifier could be declared for each
variable, but now a loop could not be used to enter the values. Using a loop and assuming
that after a value has been entered and used no further use will be made of it allows the
following code to be written. This code enters six numbers and outputs their sum:
sum = 0.0;
for (i = 0; i < 6; i++)
{
cin >> x;
sum += x;
}
This of course is easily extended to n values where n can be as large as required.
However if it was required to access the values later the above would not be suitable. It
would be possible to do it as follows by setting up six individual variables:
float a, b, c, d, e, f;
which is obviously a very tedious way to program. To extend this solution so that it
would work with more than six values then more declarations would have to be added,
extra assignment statements added and the program re-compiled. If there were 10000
values imagine the tedium of typing the program (and making up variable names and
remembering which is which)!
To get round this difficulty all high-level programming languages use the concept of a
data structure called an Array.
An array can be thought of as a collection of numbered boxes each containing one data
item. The number associated with the box is the index of the item. To access a particular
item the index of the box associated with the item is used to access the appropriate box.
The index must be an integer and indicates the position of the element in the array. Thus
the elements of an array are ordered by the index.
For example data on the average temperature over the year in Ethiopia for each of the last
100 years could be stored in an array declared as follows:
float annual_temp[100];
then if more records come to light it is easy to amend the program to cope with more
values by changing the value of NE. This works because the compiler knows the value of
the constant NE at compile time and can allocate an appropriate amount of space for the
array. It would not work if an ordinary variable was used for the size in the array
declaration since at compile time the compiler would not know a value for it.
An array element is accessed by writing the identifier of the array followed by the
subscript in square brackets. Thus to set the 15th element of the array above to 1.5 the
following assignment is used:
annual_temp[14] = 1.5;
Note that since the first element is at index 0, then the ith element is at index i-1. Hence
in the above the 15th element has index 14.
An array element can be used anywhere an identifier may be used. Here are some
examples assuming the following declarations:
count[i] += 5;
Array elements can form part of the condition for an if statement, or indeed, for any other
logical expression:
The following code finds the average temperature recorded in the first ten elements of the
array.
sum = 0.0;
for (i = 0; i <10; i++)
sum += annual_temp[i];
av1 = sum / 10;
Notice that it is good practice to use named constants, rather than literal numbers such as
10. If the program is changed to take the average of the first 20 entries, then it all too easy
to forget to change a 10 to 20. If a const is used consistently, then changing its value will
be all that is necessary.
For example, the following example finds the average of the last k entries in the array. k
could either be a variable, or a declared constant. Observe that a change in the value of k
will still calculate the correct average (provided k<=NE).
sum = 0.0;
for (i = NE - k; i < NE; i++)
sum += annual_temp[i];
av2 = sum / k;
Important - C++ does not check that the subscript that is used to reference an array
element actually lies in the subscript range of the array. Thus C++ will allow the
assignment of a value to annual_temp[200], however the effect of this assignment is
unpredictable. For example it could lead to the program attempting to assign a value to a
memory element that is outside the program's allocated memory space. This would lead
Note that the array has not been given a size, the compiler will make it large enough to
hold the number of elements in the list. In this case primes would be allocated space for
seven elements. If the array is given a size then this size must be greater than or equal to
the number of elements in the initialization list. For example:
A set of positive data values (200) are available. It is required to find the average value of
these values and to count the number of values that are more than 10% above the average
value.
In the above the variable nogt10 is the number greater than 10% above the average value.
It is easy to argue that after exiting the while loop, count is set to the number of positive
numbers entered. Before entering the loop count is set to zero and the first number is
entered, that is count is one less than the number of numbers entered. Each time round the
loop another number is entered and count is incremented hence count remains one less
than the number of numbers entered. But the number of numbers entered is one greater
than the number of positive numbers so count is therefore equal to the number of positive
numbers.
A main() program written from the above algorithmic description is given below:
void main()
{
const int NE = 200; // maximum no of elements in array
float sum = 0.0; // accumulates sum
int count = 0; // number of elements entered
int nogt10 = 0; // counts no greater than 10%
// calculate average
average = sum/count;
// Output results
cout << "Number of values input is " << n;
cout << endl
<< "Number more than 10% above average is "
<< nogt10 << endl;
}
Since it was assumed in the specification that there would be less than 200 values the
array size is set at 200. In running the program less than 200 elements may be entered, if
n elements where n < 200 elements are entered then they will occupy the first n places in
the array indata. It is common to set an array size to a value that is the maximum we
think will occur in practice, though often not all this space will be used.
The following program simulates the throwing of a dice by using a random number
generator to generate integers in the range 0 to 5. The user is asked to enter the number of
trials and the program outputs how many times each possible number occurred.
An array has been used to hold the six counts. This allows the program to increment the
correct count using one statement inside the loop rather than using a switch statement
#include <iostream.h>
#include <stdlib.h> // time.h and stdlib.h required for
#include <time.h> // random number generation
void main()
{
const int die_sides = 6; // maxr-sided die
int count[die_sides]; // holds count of each
// possible value
int no_trials, // number of trials
roll, // random integer
i; // control variable
float sample; // random fraction 0 .. 1
Notice the use of a constant to store the array size. This avoids the literal constant '10'
appearing a number times in the code. If the code needs to be edited to use different sized
arrays, only the constant needs to be changed. If the constant is not used, all the '10's
would have to be changed individually - it is easy to miss one out.
Arrays can have any number of dimensions, although most of the arrays that you create
will likely be of one or two dimensions.
Suppose the program contains a class named square. The declaration of array named
board that represents would be
Square board[8][8];
The program could also represent the same data with a one dimensional, 64-square array.
For example, it could include the statement
Square board[64];
Such a representation does not correspond as closely to the real-world object as the two
dimensional array, however.
int x[] = { 1, 2, 3, 4} ;
This initialization creates an array of four elements.
Note however:
int x[][] = { {1,2}, {3,4} } ; // error is not allowed.
and must be written
int x[2][2] = { {1,2}, {3,4} } ;
In C++ strings of characters are held as an array of characters, one character held in
each array element. In addition a special null character, represented by `\0', is appended
to the end of the string to indicate the end of the string. Hence if a string has n characters
then it requires an n+1 element array (at least) to store it. Thus the character `a' is stored
in a single byte, whereas the single-character string "a" is stored in two consecutive bytes
holding the character `a' and the null character.
The string variable s1 could hold strings of length up to nine characters since space is
needed for the final null character. Strings can be initialized at the time of declaration just
as other variables are initialized. For example:
In the first case the array would be allocated space for eight characters, that is space for
the seven characters of the string and the null character. In the second case the string is
The setw(width) I/O manipulator can be used before outputting a string, the string will
then be output right-justified in the field width. If the field width is less than the length of
the string then the field width will be expanded to fit the string exactly. If the string is to
be left-justified in the field then the setiosflags manipulator with the argument
ios::left can be used.
Note that the last two elements are a relic of the initialization at declaration time. If the
string that is entered is longer than the space available for it in the character array then
C++ will just write over whatever space comes next in memory. This can cause some
very strange errors when some of your other variables reside in that space!
Thus it will read strings consisting of a single word, but anything typed after a space is
thrown away.
We have solved the problem of reading strings with embedded blanks, but what about
strings with multiple lines? It turns out that the cin::get() function can take a third
argument to help out in this situation.
This argument specifies the character that tells the function to stop reading. The default
value of this argument is the newline('\n')character, but if you call the function
with some other character for this argument, the default will be overridden by the
specified character.
now you can type as many lines of input as you want. The function will continue to
accept characters until you enter the terminated character $ (or untill you exceed the size
of the array. Remember, you must still press Enter key after typing the '$' character .
However, it is possible to tell the >> operator to limit the number of characters it places
in an array.
#include<iostream.h>
void main(){
char str[] = "Welcome to C++ programming language";
cout<<str;
}
if you tried to the string program with strings that contain more than one word , you may
have unpleasant surprise. Copying string the hard way
#include<iostream.h>
#include<string.h> //for strlen()
void main()
{
const int max=80;
char str1[]='' Oh, Captain, my Captain!"
our fearful trip is done";
char str2[max];
for(int i=0; i<strlen(str1);i++)
str2[i]=str1[1];
str2[i]='\0';
cout<<endl;
cout<<str2;
}
strcpy(destination, source);
strcpy copies characters from the location specified by source to the location
specified by destination. It stops copying characters after it copies the terminating null
character.
Example:
#include <iostream.h>
#include <string.h>
void main(){
char me[20] = "David";
cout << me << endl;
strcpy(me, "YouAreNotMe");
cout << me << endl ;
return;
}
The function strcat concatenates (appends) one string to the end of another string.
strcat(destination, source);
o The first character of the source string is copied to the location of the terminating null
character of the destination string.
o The destination string must have enough space to hold both strings and a terminating
null character.
Example:
#include <iostream.h>
#include <string.h>
void main() {
char str1[30];
strcpy(str1, "abc");
cout << str1 << endl;
strcat(str1, "def");
cout << str1 << endl;
The function strncat is like strcat except that it copies only a specified number of
characters.
strncat(destination, source, int n);
It may not copy the terminating null character.
Example:
#include <iostream.h>
#include <string.h>
void main() {
char str1[30];
strcpy(str1, "abc");
cout << str1 << endl;
strncat(str1, "def", 2);
str1[5] = '\0';
cout << str1 << endl;
char str2[] = "xyz";
strcat(str1, str2);
cout << str1 << endl;
str1[4] = '\0';
cout << str1 << endl;
}
strncmp does not compare characters after a terminating null character has been found
in one of the strings.
Chapter Six
6.1. Pointers
A pointer is simply the address of a memory location and provides an indirect way of
accessing data in memory. A pointer variable is defined to ‘point to’ data of a specific
type. For example:
int num;
we can write:
ptr1 = #
The symbol & is the address operator; it takes a variable as argument and returns the
memory address of that variable. The effect of the above assignment is that the address of
num is assigned to ptr1. Therefore, we say that ptr1 points to num. Figure 5.Error!
Bookmark not defined. illustrates this diagrammatically.
ptr1 num
Figure: A simple integer pointer.
Given that ptr1 points to num, the expression
*ptr1
dereferences ptr1 to get to what it points to, and is therefore equivalent to num. The
symbol * is the dereference operator; it takes a pointer as argument and returns the
contents of the location to which it points.
Two operators are used for allocating and deallocating memory blocks on the heap. The
new operator takes a type as argument and allocated a memory block for an object of that
type. It returns a pointer to the allocated block. For example,
Memory allocated from the heap does not obey the same scope rules as normal
variables. For example, in
when Foo returns, the local variable str is destroyed, but the memory block
pointed to by str is not. The latter remains allocated until explicitly
released by the programmer.
Listing 5.1
1 #include <string.h>
5 strcpy(copy, str);
6 return copy;
7 }
Annotation (analysis)
1 This is the standard string header file which declares a variety of
functions for manipulating strings.
4 The strlen function (declared in string.h) counts the characters in its
string argument up to (but excluding) the final null character. Because
the null character is not included in the count, we add 1 to the total and
allocate an array of characters of that size.
5 The strcpy function (declared in string.h) copies its second argument
to its first, character by character, including the final null character.
str++ advances str by one char (i.e., one byte) so that it points to the
second character of "HELLO", whereas ptr++ advances ptr by one int (i.e.,
four bytes) so that it points to the second element of nums. Figure 5.1
illustrates this diagrammatically.
str ptr
str++ ptr++
Listing 5.2
1 void CopyString (char *dest, char *src)
2 {
3 while (*dest++ = *src++)
4 ;
5 }
Annotation
3 The condition of this loop assigns the contents of src to the contents of
dest and then increments both pointers. This condition becomes 0 when
the final null character of src is copied to dest.
Annotation
1 Instead of passing an array to the function, we pass an int pointer and
two additional parameters which specify the dimensions of the array. In
this way, the function is not restricted to a specific array size.
6 The expression *(temp + i * columns + j) is equivalent to
temp[i][j] in the previous version of this function.
Listing 5.4
1 int HighestTemp (const int *temp, const int rows, const int columns)
2 {
3 int highest = 0;
Listing 5.5
Annotation
1 Binary search is a well-known algorithm for searching through a sorted
list of items. The search list is denoted by table which is an array of
strings of dimension n. The search item is denoted by item.
2 Compare is the function pointer to be used for comparing item against
the array elements.
7 Each time round this loop, the search span is reduced by half. This is
repeated until the two ends of the search span (denoted by bot and top)
collide, or until a match is found.
9 The item is compared against the middle item of the array.
10 If item matches the middle item, the latter’s index is returned.
11 If item is less than the middle item, then the search is restricted to the
lower half of the array.
14 If item is greater than the middle item, then the search is restricted to
the upper half of the array.
16 Returns -1 to indicate that there was no matching item.
The following example shows how BinSearch may be called with
strcmp passed as the comparison function:
6.1.4. References
A reference introduces an alias for an object. The notation for defining
references is similar to that of pointers, except that & is used instead of *.
For example,
defines num2 as a reference to num1. After this definition num1 and num2
both refer to the same object, as if they were the same variable. It should be
emphasized that a reference does not create a copy of an object, but merely
a symbolic alias for it. Hence, after
num1 = 0.16;
The 1 in the first and the 1 in the third line are likely to be the same object
(most compilers do constant optimization and allocate both 1’s in the same
memory location). So although we expect y to be 3, it could turn out to be
4. However, by forcing x to be a copy of 1, the compiler guarantees that the
object denoted by x will be different from both 1’s.
The most common use of references is for function parameters.
Reference parameters facilitates the pass-by-reference style of arguments,
as opposed to the pass-by-value style which we have used so far. To
observe the differences, consider the three swap functions in Listing 5.6.
Listing 5.6
The effect of these definitions is that String becomes an alias for char*,
Name becomes an alias for an array of 12 chars, and uint becomes an alias
for unsigned int. Therefore:
String str; // is the same as: char *str;
Namename; // is the same as: char name[12];
uintn; // is the same as: unsigned int n;
The typedef introduces Compare as a new type name for any function with
the given prototype. This makes BinSearch’s signature arguably simpler.