Cs 310 Notes
Cs 310 Notes
Instructor Notes
In
The C++ Programming Language
I The Language 1
1 Overview 3
1.1 Systems and Application Programming . . . . . . . . . . . . . 4
1.2 Development Environment . . . . . . . . . . . . . . . . . . . . 5
1.2.1 Secure Shell . . . . . . . . . . . .
FT . . . . . . . . . . . . 7
1.2.2 Visual and Visual Improved . . . . . . . . . . . . . . . 11
1.2.3 Secure Copy . . . . . . . . . . . . . . . . . . . . . . . . 11
1.3 Compilation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.3.1 The Preprocessor . . . . . . . . . . . . . . . . . . . . . 12
1.3.2 Header Files . . . . . . . . . . . . . . . . . . . . . . . 15
A
1.3.3 Compiler . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.3.4 Linker . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
R
i
Notes (C++) 2019-2020
3 Functions 45
3.1 Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.1.1 Passing by Value . . . . . . . . . . . . . . . . . . . . . 47
3.1.2 Passing by Reference . . . . . . . . . . . . . . . . . . . 48
3.1.3 Default Arguments . . . . . . . . . . . . . . . . . . . . 50
3.2 Return Values . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.3 Recursion . . . . . . . . . . . . . . . . . . . . .
FT . . . . . . . . 51
3.4 Function Pointers . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.5 Lambdas, Anonymous Functions, and Closures . . . . . . . . 51
3.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
ii
Notes (C++) 2019-2020
II Analysis 81
8 Complexity 83
8.1 Empirical Analysis . . . . . . . . . . . . . . . . . . . . . . . . 83
8.2 Theoretical Analysis . . . . . . . . . . . . . . . . . . . . . . . 84
8.2.1 Complexity Classes . . . . . . . . . . . . . . . . . . . . 84
8.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
10 Lists 105
A
10.1 Stack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
10.2 Queue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
10.2.1 Priority Queue . . . . . . . . . . . . . . . . . . . . . . 105
R
iii
Notes (C++) 2019-2020
V Algorithms 149
R
iv
Part I
FT
The Language
A
R
D
1
D
R
A
FT
Chapter 1
Overview
This document details some of the terminology and major concepts rele-
vant for upper-division computer science students working with C++ at San
Diego State University (SDSU); as a degree requirement, the department re-
FT
quires undergraduate computer science students complete an upper-division
course in data structures and algorithms, and these lecture notes supplement
the weekly discussion and lectures occurring in the class.
SDSU introduces students to programming using the Java language.
A
Thus, this document accordingly assumes the reader understands prereq-
uisite Java syntax and programming, for it builds upon those concepts.
Specifically, students should understand the basics of programming syntax,
R
taught in Java, knowing the language bears fruit when we juxtapose the two
languages as we will during discussions of memory management.
That said, students with similar programming experience in C# or
Python should achieve an equatable level of comprehension. Having expe-
rience in a language with automatic garbage collection, for example, assists
the discussion of memory management, but lacking this depth should not
prevent a student from understanding C++. On the other hand, students
with no prior programming experience will find this document, and the class
in general, entirely insufficient for their needs.
3
Notes (C++) 2019-2020
4
Notes (C++) 2019-2020
C++ Java
Audience Systems, games, and Cross-platform applica-
optimization-centered tion development
developers
Inheritance Multiple (code) Single (code)
Operator Yes No
Overloading
Memory Man- Raw pointers Garbage Collected
agement
Structs Yes No
Header Files Yes No
Functions Yes Everything in a Class
Threading C++11 and later Built-into the language
Figure 1.1: Some of the major differences between C++ and Java
FT
Maven to correctly build the program.
Although Java emerged in response to C++, neither language remains
unchanged. Standardization committees routinely update the language to
A
include modern programming features. Both Java and C++ evolved to
include support for lambda expressions. The standard library in C++ and
the Java Collections Framework each include updated data structures and
R
5
Notes (C++) 2019-2020
Figure 1.2: Real Programmers, courtesy of XKCD and the Creative Com-
FT
mons.
functionality, however, comes with a cost, for machines using an IDE require
additional resources. One cannot easily remotely connect to a server and
A
launch an IDE when running from the shell 1 .
On the other hand, one may write both Java and C++ source directly
in a text editor and compile it from the command line without additional
R
software. Much of the assistance IDEs offer comes in the form of automating
the mundane tasks in computer science, but doing so obscures understanding
D
6
Notes (C++) 2019-2020
for our needs. Alternatively, Eclipse and Netbeans, IDEs familiar to Java
developers, also provide C++ integration.
D
7
Notes (C++) 2019-2020
Linux and MacOS users should find the SSH process relatively painless,
for the operating systems include support for the feature at the command
line. Thus, after opening a terminal window on a remote machine, one
may simply execute the ssh command with the appropriate parameters and
FT
connect to the server. Windows users must either install a third-party tool
like PuTTY, or they must enable the Windows Subsystem for Linux which
allows them to install Ubuntu or Kali via the Microsoft Store.
A
+ The course instructor highly encourages every Windows 10
user to migrate to the Windows Subsystem for Linux, for it
enhances the development experience and simplifies interfac-
R
Account Password
Historically, information technology officers, using standards developed by
the National Institute of Standards and Technology (NIST) based upon what
security professionals believed to be reasonable assumptions. After count-
less cracked passwords, however, the failure of this system became obvious.
Although using multiple character sets and requiring strange permutations
of ASCII symbols allows a computer to produce a wide range of passwords,
humans struggle to memorize these patterns.
Consequently, the very shop that developed the original standard man-
dating these obscure combinations now strongly argues aginst its continued
use. In short, the NIST now encourages people to use password manager
tools and follow some general guidelines for memorized secrets[3]:
8
Notes (C++) 2019-2020
ing.
Finally, these recommendations better reflect how modern computer at-
tacks occur. Brute force password attacks, like the WOPR attempting to
D
crack the United States nuclear launch codes [4], rarely occur. The original,
cryptic rules helped prevent this iterative approach, but they also created
an unusable system. In practice, with examples of millions of cracked pass-
words available on the Internet to simply try, the brute force hack makes
little sense. Furthermore, many computers and online services quickly detect
this attack and they implement strategies to minimize its impact.
9
Notes (C++) 2019-2020
FT
A
R
D
Figure 1.4: Password Strength, courtesy of XKCD and the Creative Com-
mons.
10
Notes (C++) 2019-2020
With access to a shell window, users may launch applications on the host
machine. After connecting to the university server, students may open one of
the editors available on the server. Both Visual (vi) and its successor Visual
Improved (vim), offer basic text editing features. Vim acts as a super-set of
vi, so it supports the same interface, but it also includes code highlighting
and other features useful to developers. These tools also possess the inherent
advantage in that they ship on nearly every Linux installation, so learning
the interface guarantees the developer may modify essential system files
without dependency on a particular program.
That said, the editor’s interface possesses a steep learning curve 3 . One
FT
must quickly learn the difference between command mode and insert-text
mode. Although learning vi benefits the computing professional, one need
not use it exclusively when developing code in this course. Critically, the
chosen editor must generate ASCII files without any additional formatting
or hidden text. That is, they must only write to the file what they show the
A
user.
For this reason, Microsoft Word documents will not compile. Although
R
the application displays only limited text to the user, the saved file includes
a great deal of additional formatting and special characters Word needs to
properly display and work with the document. All this additional overhead
D
will cause compilation to fail. Notepad, on the other hand, may write files
suitable for compilation. Look for a text editor, not a word processor.
3
How do I exit this program?! Why won’t it save? I’m typing, nothing’s happening!!!!
11
Notes (C++) 2019-2020
1.3 Compilation
Computers must convert between the high-level language we use to write
software and the low-level, bare-bones machine code necessary to operate
the processor. Ultimately, every program must toggle bits on a physical
device. It could hold a pin high or toggle the contents of some internal
flip-flops. The approach to this compilation process varies based upon the
language in question.
For example, Java compiles the source file into platform-independent
Java bytecode. The platform-dependent Java Virtual Machine (JVM) then
interprets this bytecode into the appropriate machine instructions for the
FT
host processor. With Java, one need not compile the application on every
new system. Instead, one simply writes a unique JVM for the hardware,
and this JVM uses the same, compiled bytecode to function.
By contrast, C++ compiles source code into an executable appropriate
for the target machine, for it does not include an intermediate bytecode step.
A
The executable file produced after running the C++ compiler specifically
works on its target hardware, so one must compile separately for Mac, Linux,
R
file must meet several syntax requirements before the compiler can use it to
produce any meaningful output for the next stage.
12
Notes (C++) 2019-2020
source.cpp
pre-
processor
source
tokens
FT
compiler
A
source.o
R
D
linker
executable
13
Notes (C++) 2019-2020
# 1 "bare.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 31 "<command-line>"
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 32 "<command-line>" 2
# 1 "bare.c"
int main(){
return 0;
}
The C++ compiler also uses the pre-processor to insert function defi-
A
nitions and variables defined in external files. Similar to Java’s import di-
rective, the #include pre-processor directive instructs the compiler to paste
the contents of the indicated file at the point the #include directive appears
R
in the code. Errant use of include statements may bloat the output from
the pre-processor, for the pre-processor recursively expands each of the files
D
it includes, so if the user’s file includes a header file which, itself, includes
several header files, the pre-processor will expand those where they appear
in the included file.
In addition to inserting text from other sources, say when using the
standard library or the output streams, one may actually program constant
values and write simple pseudo-functions. Because a pre-processor expan-
sion simply replaces the text with its saved text, C and C++ developers
my write what looks like a function but does not require the overhead as-
sociated with a function call. Modern C++ development frowns on using a
macro for a function like this 4 , for there exists a double-evaluation problem.
Because macros, even macro functions, simply serve as a text replacement
before compilation, one might inadvertently call a function multiple times.
4
The standard library includes a max function, so you need not create your own without
reason.
14
Notes (C++) 2019-2020
max(x++, y++);
Figure 1.8: A basic C macro for finding the maximal value. The pre-
processor replaces the macro variable a with x++, so what values do x and
y have after executing this expression?
#include <iostream>
#define SWAP(X,Y) {int temp=X; X=Y; Y=temp;}
int main() {
int a = 2;
int temp = 17;
std::cout << a << ":" << temp << std::endl;
SWAP(a, temp); FT
std::cout << a << ":" << temp << std::endl;
}
Figure 1.9: An example of variable capture. What values for a and temp
does the program print?
A
What happens when a macro declares a temporary variable with the same
name as one currently in use? Because it simply substitutes the text it finds,
D
15
Notes (C++) 2019-2020
FT
the function. Traditionally, a C header file includes global variable and
function declarations and typically ends in .h; C++ header files include the
actual working code and might end in .hpp, but the language imposes no
requirements on the actual suffix. In C++, one may define and implement
A
a class in a single file, and some teams choose to write code using this
style. Others, however, prefer to separate the interface logic from the actual
R
business logic by placing only function and class declarations in the .h file
and their definitions in a .cpp file.
D
One may use either angular brackets or quotation marks when including
the files, but the two symbols possess subtle distinctions. Angular brackets
reference the global include directories; one may view them as external to
the project. Files in quotes are relative to the current directory for the
source file; these only work with directories in the include path. Consider
placing project-specific include files in quotation marks while leaving the
major, common libraries in angular brackets.
16
Notes (C++) 2019-2020
#ifndef __USER_SETTING_H
#define __USER_SETTING_H
#include <iostream>
#include "my_project.h"
namespace sdsu
{
struct UserSettings {
bool darkMode = true;
bool sUser = false;
};
} FT
#endif
Figure 1.11: An example of a header guard which only includes the contents
of a header file the first time it encounters it in code. This style works with
both C and C++.
A
R
#pragma once
D
#include <iostream>
#include "my_project.h"
namespace sdsu
{
struct UserSettings {
bool darkMode = true;
bool sUser = false;
};
}
17
Notes (C++) 2019-2020
1.3.3 Compiler
After pre-processing each of the source files included in the compilation pro-
cess, the tool begins translating the expanded output into an intermediate
machine code. This object file includes the machine code for the statements
in the current file with placeholders to statements made in other source
units. Each object file remains independent of its peers in the program at
this point.
The compiler does not need to know what an external function call does
in order to compile the current file, for it only needs the variable or function’s
signature. The machine code compiled for the current file only needs to know
what parameters to load on the stack before the current call and what return
value to expect. A program does not need to know what happens inside max
or min to know how to prepare for the call and what to do after it completes.
1.3.4 Linker FT
Finally, using the resultant object files produced by the compiler, the linker
fills in the final blanks necessary to produce an executable the user may run
on the machine. The linker replaces the external references in each of the
object files with references to each compiled unit. Now that the machine
A
understands where the max and min functions reside within the compiled
object file, it can replace those placeholder calls with the actual jumps to
R
the appropriate instruction. That is, the compiler cannot include this during
its initial generation of code, for it will not know everything until the first
stage of compilation completes.
D
18
Notes (C++) 2019-2020
#include <iostream>
int main(){
std::cout << "" << std::endl;
}
mate needs. How does one plan to deploy the software? Are the machines
regularly maintained? Are the versions of the external libraries critical,
for a required system update may break third-party applications using the
D
updated libraries.
Statically linking in the external libraries effectively copies their code
into the project and integrates it with the final executable. This reduces
the number of additional installation dependencies the user needs during
installation, for everything comes statically packaged together. If a vendor
updates a third-party library the application uses, the application experi-
ences no changes, for it uses its own, internal copy. Obviously, carrying
around this additional library causes the resultant program to grow in size,
but it also speeds up performance, for the application need not consult the
system to locate the external libraries.
Dynamically linking a library, however, offers its own set of benefits.
When multiple applications use the same dynamically linked library (DLL),
the operating system only loads the shared source in memory once. Further-
more, when one discovers a security flaw with a DLL, a single update to the
19
Notes (C++) 2019-2020
library potentially solves the error for every application using the library.
Although third-party updates may break an application, they also provide
support. Additionally, one may use a DLL across several programming lan-
guages.
1.4 Exercises
FT
A
R
D
20
Chapter 2
The C++ language builds upon C with Classes [6], and, in fact, one may
view C++ as a super-set of the C language, for, with very few exceptions,
all legacy C code compiles with C++. In fact, one may still use the older
FT
C-style header files, with their associated functions, in C++. Consequently,
students familiar with the C language should find the transition to C++
manageable. Java emerged, in part, as an object-oriented response to C++,
and the language developers sought to make the Java programming lan-
A
guage as accessible to existing C++ programmers. Thus, much of Java’s
syntax looks familiar in C++. Significant differences exist between the two
languages, but each is closer to the other than, for example, something like
R
Python.
D
21
Notes (C++) 2019-2020
int main(){
int first = 1;
char second = 'b';
float value = 3.50;
}
Figure 2.1: Manifest typing in the C++ language. The source code must
specify the distinct types for each memory location.
a = 1
a = "lucky charm"
def tikki():
print("spots on")
a = tikki() FT
a()
Figure 2.2: With dynamic typing in Python, a variable may change its
fundamental type within the current scope. In this example, a assumes the
form of an integer, string, and function within the same scope.
A
One does not need to supply initial for variables in C++ or Java. Java
initializes primitives, but C++ remains performance focused, and initializa-
D
tion requires additional work. Consequently, C++ does not generate code
to initialize these variables before using them. Although C++ permits this
style of code, secure coding guidelines 1 strongly advise against it, for one
cannot anticipate the initial values. [10]
22
Notes (C++) 2019-2020
#include<iostream>
int main(){
int a;
int *ptr;
std::cout << a << ':' << ptr << std::endl;
}
like its predecessor C, C++ imposes the following restrictions on all variable
names:
• After the first character, the variable name may include any count of
additional underscore or alpha-numeric characters.
A
• Variable names are case-sensitive.
suit one’s personal writing style. This degree of customization created source
code with wildly varying appearance. Moreover, this made maintenence sub-
stantially more difficult, for variable names help the reader understand code.
When every programmer arbitrarily names variables, or uses poor names,
the entire project suffers. Consequenlty, most major software development
firms use a coding standard or style guide [7][8][9][10].
Some style guides specify naming conventions which identify how one
should construct a variable’s name. That is, when creating a private pointer
to a constant, unsigned character, a naming convention might demand pro-
grammers prefix the name with _cptruc so as to make clear 2 the variable’s
type and intended use. Hungarian notation, an early coding style, sought to
include information about the variable’s type and scope in its name so when
developers encounter it in source, they immediately understood its use.
2
Crystal.
23
Notes (C++) 2019-2020
24
Notes (C++) 2019-2020
// An assignment statement-expression
bool a = true;
bool a{true};
if( !a ) a = true;
Figure 2.6: Three conditional uses of boolean types, but the while loop may
not behave as expected.
25
Notes (C++) 2019-2020
2.1.3 Characters FT
We can extend single-bit, boolean values into an arbitrarily large binary
number by simply combining them and correctly interpreting their new
values. The number of bits we use to represent the data establishes the
A
maximum value we may store. For example, an eight bit character type
may hold 256 distinct values. We may choose to interpret these data in any
number of ways. For example, the binary value 0b11111111 could represent
R
255 if one wished to ignore negative values, but it might also act as -0 using
one’s-compliment.
D
Although the character type in C++ may hold literal, ASCII character
values, it also frequently appears when the software design needs a small
number. Systems programmers optimize code for time and space, and they
typically select data types based upon an optimization strategy. Thus, both
C++ and C include extensive data type ranges to accommodate this level
of optimization as illustrated in Figure 2.7.
26
Notes (C++) 2019-2020
Type Notes
short At least 16-bits
signed Accepts positive or negative values.
Default if omitted.
unsigned short At least 16-bits
int 16-bits minimum, but 32-bits typi-
cally on 32 or 64 bit machines
long 32-bits minimum
long long 64-bits minimum (since C++11)
2.1.4 Integers
By increasing the number of bits used to store a number, we dramatically
FT
expand the range of potential values for that data type. The integer data
type represents the next step after a simple character. Much like the char
type, the int type offers multiple flavors. One might use signed or unsigned
varieties. Figure 2.8 details some of the possibilities. Unlike the char type,
which uses eight bits, the C++ compiler implementation establishes the size
A
of an int.
The multitude of options C++ offers for data types stems from its use in
R
Much like string-literals, which store values in the current character set,
one may also use integer-literals when assigning values to a variable. On
a computer, we frequently need numbers in different bases based upon its
use. Obviously, base-ten integers remain the most common, so when one
sets a variable to an integer-literal without adding any prefix or suffix, the
compiler defaults to this numbering system.
27
Notes (C++) 2019-2020
int main(){
// Assignment statement
int a = 1;
return 0;
}
FT
Figure 2.9: Some of the various integer-literal representations in a basic
C++ program.
defines the behavior for floating-point numbers, and most programming lan-
guages adhere to its specifications [11].
D
28
Notes (C++) 2019-2020
4195835.0/3145727.0 1.333
820 449 136 241 002 5 Correct value
739 068 902 037 589 4 Flawed Pentium
Figure 2.10: The Pentium FDIV Error incorrectly computes some floating
point calculations.
2.1.6 Auto
As a manifestly typed language, C++ source requires developers specify each
and every variable type in the file, for this allows the compiler to allocate
the proper number of bits to hold the data. In some cases, however, one
might correctly deduce the type from its context. For example, given the
statement: x = true, one might naturally suspect the variable x stores bool
type data. Certainly, x might represent a long, but the smallest, natural fit
FT
remains a boolean value.
The C++11 update introduced placeholder type values to help simplify
the source code in these situations, and each update seems to introduce
increased functionality to this keyword. For example, C++14 allows auto
to act as the return type for lambda expressions, and C++17 introduced
A
structured-bindings.
R
and C++, when we declare a variable, we must also specify its type. After
introducing the variable curCount as a char, we could not then assign it the
value 3.5, for the compiler will prevent this type of implicit down casting,
for the char type cannot hold the number without losing data.
On the other hand, both C++ and Java support implicit up casting.
In this type conversion, we attempt to store something small in a larger con-
tainer. Many programming languages automatically perform this operation
quietly behind the scenes during compilation. One might easily store some-
thing ranging from 0-255 in a container capable of storing values ranging
from 0-65535, for we are widening the value.
The C++ language supports two ways for performing explicit type-
casting for its primitive data types. To maintain backward compatibility,
developers may choose to use the old c-style casting method. Alternatively,
one may elect to cast to a target type in something resembling a function
29
Notes (C++) 2019-2020
#include <iostream>
int main()
{
// accept becomes type char
auto accept = 'y';
std::set<std::string> courses;
courses.insert({ "CS310", "CS320", "CS440", "CS496" });
for (auto it = courses.begin(); it != courses.end(); it++)
std::cout << *it << std::endl;
return 0;
}
FT
Figure 2.11: Examples of placeholder types where the compiler deduces the
data type.
A
call. The code in Figure 2.12 illustrates use of the two styles.
R
Arrays, in Java, C, and C++, all hold a collection of elements of the same
type in contiguous memory locations. The array promises the data will
exist in memory sequentially, and this promise proves essential for its per-
formance. Due to locality of reference, the processor tends to access memory
locations repeatedly relatively near one another. That is, when working with
curLine[4], one likely passed through curLine[3] and may very well be on
the way to curLine[5]. Because the data exist close together in memory,
the processor can directly access them. In fact, due to the way the processor
fetches data from memory, it may already have those values in a fast cache.
One declares an array using the general format: T[size], where T rep-
resents its type (e.g., char, long, void*, Pokemon) and size indicates how
many elements exist in the array. Figure 2.14 illustrates declaring multiple
arrays using C++. After creating an array, one naturally needs to access
30
Notes (C++) 2019-2020
short data;
// A widening (up-cast)
data = cCount;
sRequires = cCount;
char stackBuff[8];
memset(stackBuff,0,8);
31
Notes (C++) 2019-2020
the members. One may do so through the [] notation or via pointer ma-
nipulation as shown in Figure 2.15.
FT
Random Access Because the compiler understands that the elements in
an array of integers are adjacent in memory, it can quickly compute the
correct address to jump to when accessing elements, randomly, from within
the array. Certainly, progressing through the array sequentially from front
A
to back occurs quickly, but it also supports rapidly accessing any member
within the structure.
R
To do so, the compiler uses the array’s base address as a starting point.
The address of the first element in an array, (the item at index zero), cor-
responds to the arrays’s name. So given an array lines[16], the value of
D
32
Notes (C++) 2019-2020
FT
A
R
Figure 2.16: Row and Column major order as illustrated by the Wikimedia
Commons.
D
int main ()
{
for (auto rowCounter=0; rowCounter<32; rowCounter++)
for (auto colCounter=0; colCounter<32; colCounter++)
{
bw_image[rowCounter][colCounter]=(rowCounter+1)*(colCounter+1);
}
}
33
Notes (C++) 2019-2020
FT
Figure 2.18: Pointers hold the value of an address where the actual data
rest. One accesses the data indirectly by first looking at the address held
in the pointer and then visiting that address. Image from Dinosaur Comics
A
©2005 Ryan North.
R
2.2.2 Pointers
All memory locations on a computer possess a unique identifying number so
D
the machine may access and mutate the data stored therein. The address
may exist on a local hard disk or in RAM. Imagine memory as a continual
list of cells one may access by number. Regardless of the underlying data
type, the address of every location in memory is simply an integer.
void *current = nullptr;
char** ptr = &buffer;
34
Notes (C++) 2019-2020
Figure 2.19: Use of nullptr and the older C-style evaluation to test for
pointer initialization.
Null Pointer The Java programming includes the special keyword null
to denote un-instantiated objects. C++ provides a similar feature with the
nullptr, and one may set a pointer of any type to the nullptr. Historically,
the C language uses 0 as an indication of a null pointer, for it represents an
invalid memory location. To support readability, however, nullptr makes
clear the context. FT
Pointing into structures
Structures allow developers to cleanly group together related fields. One can
create a structure for the player’s ship, and in this structure it might hold
A
several related fields. What if one wished to implement a star-ship dealership
where the user scrolled through the dealership browsing for available craft?
How might one store the currently selected ship?
R
2.2.3 Dereferencing
Pointers represent memory addresses, but this presents little utility in and
of itself. That is, although knowing the address where data reside may serve
a use in a particular algorithm, the data themselves possess meaning in the
application. Sometimes one needs to know something’s name and not simply
where the name is on the computer. We dereference an address by visiting
the location pointed to by the address. This process requires several steps in
35
Notes (C++) 2019-2020
#include <stdio.h>
#include <stdlib.h>
struct entry {
char *key;
int count;
};
struct node {
struct entry *data;
struct node *next;
};
int main(){
FT
struct entry *player = (struct entry*)malloc(1);
player->key="Mr. Jones";
player->count=1;
A
struct node *head = (struct node*)malloc(1);
head->data = player;
R
return 0;
}
Figure 2.20: The C language introduces new, special notation for pointing
into structures. This same style applies to how C++ addresses pointers into
structures and classes.
36
Notes (C++) 2019-2020
#include <iostream>
int main(){
int a = 1;
int b = 2;
std::cout << "Before: a(" << a << "), b(" << b << std::endl;
swap(&a, &b);
std::cout << "After: a(" << a << "), b(" << b << std::endl;
return 0; FT
}
2.2.4 References
D
How does one swap the value of two variables in Java? One may pass in
an address of an object, so, in effect, one transfers in a copy of the address
the method manipulates during its execution, but when it terminates, the
address returns to its original value. Certainly, one may visit the locations
pointed to by the parameter copies and make persistent changes, but one
may not actually swap the address A holds and the address B holds in a
Java method, for Java passes references by value.
The C++ language, however, supports creating aliases3 to variables as
Figure 2.21 illustrates. Unlike a copy, an alias references the actual variable
directly. Consequently, a function may create changes impacting the outer
scope, and this makes aliasing a powerful feature in C++4 .
3
Sydney Bristow.
4
And a significant security risk.
37
Notes (C++) 2019-2020
return addresses. If one can overwrite a return address with the address of
malicious code5 , one may take control of the computer. Perhaps the buffer
D
strcpy(buffer,str);
}
void main() {
char large_string[256];
int i;
5
Friendly to you though.
38
Notes (C++) 2019-2020
function(large_string);
}
FT
2.3 Strings
The C++ language supports characters, integers, floats, and Boolean values
A
as primitive, fundamental types. For string types, however, one must use
an array of characters or a class provided in the standard library. The
standard library string includes methods one may use when working with
R
String Literals appear in read only memory. One may inspect the ex-
ecutable file and read the string literals contained in the program. Conse-
quently, string literals are immutable. Figure 2.23 illustrates several meth-
ods for declaring string literals in C++.
Null Termination The length of a string literal always exceeds the num-
ber of ASCII characters stored in the string by one. The C language, and
C++ by default, terminates string literals with the null character. That is,
rather than storing the size of the string like an object, C uses the literal
character code of 0 to indicate the end of the string. This differs than the
character code for the number ’0’ which is 48. This fact presents interesting
problems for some later functions, like cin, when reading a string from input
into a string variable.
6
As it does pretty much everything else.
39
Notes (C++) 2019-2020
#include <iostream>
#include <cstring>
int main(){
const char* name = "how\0long am I?";
std::cout << strlen(name) << std::endl;
std::cout << sizeof(name) << std::endl;
}
Figure 2.24: Calculating string length when the string includes the null
termination symbol. The main method fails to return a value.
well as the character ’T’, both represent expressions. One may use them on
the right hand side of an equal sign when making an assignment.
D
40
Notes (C++) 2019-2020
int x;
int y;
{
x = 3;
y = 4;
}
{
int x; // declaration statement
int y;
x = 3; // expression statement
y = 4;
} FT
Figure 2.26: Blocks of code include variable declarations with their associ-
ated stack implications.
A
+ Expression statements in C++ evaluate to a result.
R
Block We make the distinction that any compound statement that also
includes a variable declaration is called a block of code. Variables declared
within a block of code, as shown in Figure 2.26, only have visibility within
the block, or any nested blocks, and fall out of context when the block ends.
41
Notes (C++) 2019-2020
2.5 Exercises
1. Using your knowledge of data types, what value should the integer res
contain after executing the following statement:
7
Job Security
8
EXP50-CPP. Do not depend on the order of evaluation for side effects
42
Notes (C++) 2019-2020
j = ++j + 1;
arr[i++] = i;
Prepare experimental code which tests the result and compile it using
R
int main()
{
int course = 320;
int *ptr = &course;
pointyStack(ptr);
43
Notes (C++) 2019-2020
return -1;
}
44
Chapter 3
Functions
it completes. The compiler must know the return type and expected param-
eters every function takes before using it in a translation unit.
Generally, a function declaration takes the form:
Function Definition At some point, the linker must connect the declared
function to actual working code. The function definition, the general form
of which appears in Figure 3.2, provides the actionable software for the
45
Notes (C++) 2019-2020
Figure 3.2: The general form of a function definition. It provides the source
FT
code for the linker to use when creating the executable file.
stead choose to simply declare the function one intends to use in the source
file and trust the linker to find the appropriate translation unit during com-
pilation. This way, if an external header changes in some minor way, it does
D
not impact the current source file. That is, when one changes a header file,
every file including the header must also undergo recompilation. By includ-
ing a function declaration for the relevant functions in the current scope,
one breaks this dependency.
Unfortunately, doing so obscures the dependency and impacts code read-
ability. Header files express preconditions for the compilation to succeed,
and by inspecting them a developer may discern how the file fits in the over-
all project. Moreover, if the method signature does change in the library
and its header file, those changes will not appear in the current translation
unit. That is, when one includes gps.hpp in a file, any change to a function
declaration impacts every source file using the header file, but with a for-
ward declaration, this no longer holds true, for the translation unit need not
consult the header when preparing for the function call. For these reasons,
some firms strongly discourage the use of forward declarations [7].
46
Notes (C++) 2019-2020
3.1 Parameters
Every function and method definition in C++ must include a list of its
formal parameters. The compiler uses this information when it prepares for
and returns from the function call. In order to use a function, one must
understand what arguments it requires. Recall that the C++ compilation
process passes through several phases. It does not compile the function call
FT
when it appears in code. Instead, it inserts a placeholder reference for the
linker to replace when it builds the final executable.
Every declaration of a function produces a unique signature, so two
functions may use the same name if they accept a different number or type
of formal parameters.
A
Questions arise, however, when one questions how those arguments en-
tered the function. Did the compiler include code to copy versions of them
R
behind the scenes, or did it include the actual variable? This becomes crit-
ical when one considers the state of the input arguments after the function
call returns. If a function accepted the current number of users as in input
D
argument, and it modified the count during its execution, should the count
in the caller reflect the new value? Should functions work with the actual
data, or should they each receive their own copy?
These issues become critical when one begins developing in a concurrent
application, for what happens if one process modifies data used by another?
If two threads launched with copies of the same variable at the same time,
but they return at different times, which result should the caller assume?
47
Notes (C++) 2019-2020
#include <iostream>
int main(){
int myData[4] = {0,2,4,6};
printInfo(myData);
mutate(myData, 4); FT
printInfo(myData);
}
uses that version. Of note, for classes and structures, this results in the
copy-constructor activating. Pointers, which represent addresses, enter the
D
48
Notes (C++) 2019-2020
Figure 3.5: A version of swap which passes by value. The swap successfully
consumes processor resources, but it fails to perform any meaningful work.
Figure 3.6: A version of swap using raw pointers. Instead of accepting the
actual values to modify, this version takes their address and de-references it
A
to change the variables’ values in memory.
R
passing by value, as shown in Figure 3.5, the modified copies of the original
variables disappear. Consequently, it fails to actually swap the values.
D
49
Notes (C++) 2019-2020
Figure 3.7: A version of swap using pass by reference. Here the function
works with aliases to the integer variables.
Figure 3.8: Modifying the example in Figure 3.3 to accept default arguments.
FT
variable automatically take place in the outer program scope.
puting a logarithm in a program, for example, one typically uses the same
base. Although one may compute a logarithm using any base, some appear
more frequently than others. Rather than providing unique functions for
D
these common operations (e.g. b2log and b10log) one might instead code
a single common function that accepts a parameter for the base to use in
the calculation.
Although this approach creates a common function, it forces every call
to the log function to also include information about the base to use. The
additional parameter requires the caller to enter more text every time the
caller uses the function. Instead, one may provide default arguments for the
compiler to use should the source code omit the parameter.
50
Notes (C++) 2019-2020
3.3 Recursion FT
3.4 Function Pointers
Some languages, like C++ and Python, allow source code to use a function in
a parameter list or as a function’s return value. This is not simply using the
A
result of calling the function, as one would do in recursion, but this involves
actually storing a function in a variable and then calling the variable using
R
51
Notes (C++) 2019-2020
#include <iostream>
A
int compress(int hashcode, int numbins)
{
return hashcode % numbins;
R
{
return (*functocall)(first, second);
}
int main()
{
int (*stars)(int, int) = compress;
return 0;
}
52
Notes (C++) 2019-2020
std::vector<int> some_list{ 1, 2, 3, 4, 5 };
int total = 0;
std::for_each(begin(some_list), end(some_list), [&total](int x) {
total += x;
});
Figure 3.12: This computes the total of all elements in the list. The variable
total is stored as a part of the lambda function’s closure. Since it is a
reference to the stack variable total, it can change its value.
[10]. Instead, one might capture the variable by value. This results in the
function receiving its own copy of the variable, so it does not depend on the
stack’s contents.
2
EXP61-CPP. A lambda object must not outlive any of its reference captured objects.
53
Notes (C++) 2019-2020
auto badlamb() {
int n = 108;
return [&] {
n = 310;
return n;
};
}
void awesometown() {
int n = badlamb()();
}
Figure 3.15: This lambda indicates captures n by reference with the & sym-
bol. What happens when badlamb returns and n falls out of scope?
3.6 Exercises FT
1. Given an unsorted array of integers, its length, and a target sum,
write a function that returns true if any two items in the input array
combine to equal the target value.
A
2. The 0-1 Knapsack: Encumbrance limits restrict the number of items
games allow players to carry. Consequently, while exploring and col-
lecting inventory items, players must manually calculate which items
R
54
Notes (C++) 2019-2020
within the original array where the pattern begins or -1 if not present.
6. Modify Problem 3 such that the input array may wrap-around or over-
flow. FT
Given the input array: data[] = {65515,65525,9,20,31,40,55}
The function shall return: 0
7. Using the following code fragment, what does nigma return when
value = 5? value = 25? Symbolic answers suffice.
A
int nigma(int ed){
return (ed <= 0) ? 1 : ed * nigma(ed-1);
R
}
D
8. Write a function that accepts four parameters: two sorted integer ar-
rays and each of their associated lengths. This function shall return
nothing. Instead, it shall modify the contents of the arrays passed into
the function in-place. It shall merge the contents of the two arrays such
that the smallest elements in the combined set shall appear in the first
array and the largest elements shall be in the second array. That is,
at the conclusion of the function f irst[f length − 1] <= second[0].
10. Using the function defined in Figure 3.9, write a function that returns
the least common multiple of two numbers. The solution may use
std::abs.
55
Notes (C++) 2019-2020
|a ∗ b|
lcm(a, b) =
gcd(a, b)
11. Describe what happens when the caller invokes the gcd function de-
fined in Figure 3.9 with a=20 and b=-20. Does the computer produce
a result? How might one modify the code to prevent poor behavior
when provided ill-formed input?
12. The Josephus Problem: A group of n people form a circle and begin a
counting out game. Counting begins at an initial starting point, and
it continues around the circle along the specified direction skipping
k people. Players exit the game when the counting finishes on their
position, so the circle shrinks with each iteration. After removing
a player, counting resumes from the next player in the circle. The
final player wins the game. Given the total number of players n and
k which skips k − 1 players, write a function that returns the safe
position within the circle. For example, if provided 5 players and with
FT
a skip value of 3, the safe position returned is 4.
Players: 5 Width: 3
1 2 3 4 5 Initial
1 2 X 4 5 Round 1
A
X 2 X 4 5 Round 2
X 2 X 4 X Round 3
R
X X X 4 X Winner
D
56
Chapter 4
When working with values and data, a typical development task includes
extracting the data from an existing file or producing some sort of modified
output file. An application might prompt a user for input or display ASCII
FT
text as output. In C++, the language accomplishes these input and output
tasks through I/O Streams.
Input streams use the <istream>header file while output streams use
the <ostream>library header file. If using both, one may simply include
<iostream>for access to both.
A
Figure 4.5.
Figure 4.1: The standard I/O streams in C++. Developers may override
the default destinations.
57
Notes (C++) 2019-2020
#include <ostream>
std::cout << "All those moments will be lost in time," << std::endl;
std::cout << "like tears in rain.\nTime to die.";
Figure 4.2: Writing to the screen. This code uses two different methods for
inserting newline characters at the end of the text.
#include <iostream>
int main( )
{
int value;
cout << "What do you want: ";
cin >> value; FT
cout << "I'm not going to give you " << value;
return 0;
}
A
Figure 4.3: Prompting and reading an integer from the user in C++. What
happens when a giant crustacean from the paleolithic era inputs 3.50?
R
#include <iostream>
#include <string>
D
int main( )
{
std::string fullName;
std::cout << "Enter your full name (First M. Last): ";
std::cin >> fullName;
std::cout << fullName << " is only your first name" << std::endl;
return 0;
}
Figure 4.4: Although the cin operation accepts the user’s full name, when
it extracts the value from the input, it terminates when it reaches the first
space. Consequently, only the first name appears in the string variable
fullName.
58
Notes (C++) 2019-2020
#include <iostream>
#include <string>
int main ()
{
std::string fullName;
std::cout << "Enter your full name (First M. Last): ";
getline (std::cin, fullName);
std::cout << fullName << " is an awesome full name." << std::endl;
return 0;
}
Figure 4.5: Although the cin operation accepts the user’s full name, when
it extracts the value from the input, it terminates when it reaches the first
FT
space. Consequently, only the first name appears in the string variable
fullName.
A
R
Format Manipulators
endl End of the line.
D
59
Notes (C++) 2019-2020
#include <iostream>
#include <fstream>
datafile.close();
R
return 0;
}
D
Figure 4.7: Producing a generic CSV file in the current directory. The
interface closely matches producing input and output to the screen.
60
Chapter 5
Classes
Prior to using a class, the source code must describe its methods and fields.
Class definitions must include the method declarations.
class Student
{
std::string name;
long redID;
};
Figure 5.1: A basic class definition holding two member variables with the
default visibility.
61
Notes (C++) 2019-2020
virtual ~Vector3(){};
Vector3();
Vector3(double, double, double);
Vector3(double*);
virtual ~Vector3(){};
5.1.1 Constructors
1
DCL57-CPP. Do not let exceptions escape from destructors or deallocation functions
62
Notes (C++) 2019-2020
5.1.4 Destructors
R
In Java, the JVM calls the finalize() method attached to every object.
D
For almost all applications, the default method provided in the Object class
remains sufficient, but for objects with complicated resource allocations,
users may wish to override this behavior. One might wish to disconnect
from an external server, shutdown resources, or notify an Observer. To do
so, one need only override the method in the class definition.
The C++ language also provides a default destructor, and for many ap-
plications the default suffices. If an object allocates heap memory, however,
one must explicitly remove the reference and free the memory. Failing to do
so might lead to a memory leak. Destructors are noted by a tilde appearing
in front of the class name.
63
Notes (C++) 2019-2020
#include<cstdlib>
using namespace std;
class Label{
private:
int serno;
public:
Label(int number){serno = number;}
};
class Orange{
private:
Label *farmer;
public:
Orange(){farmer = new Label(1);}
~Orange(){free(farmer);}
}; FT
Figure 5.4: Example of a class with a destructor and constructor in C++.
The allocation of Heap memory requires a free to avoid memory leaks.
A
to indicate the visibility of a given variable within a scope. The goal remains
simplifying the interface to other components so that only the essential is
R
exposed.
The C programming language provides no mechanism for access modifi-
D
cation, for it does not include classes and the structures remain data focused.
That is, structures in C hold data, but they do not include methods or the
notion of operator overloading. Python fully supports classes, but it does
not provide a method for access modification.
To indicate a variable or method is internal in python, one simply pref-
aces its name with an underscore character. This naming convention helps
developers achieve much of the same encapsulation without forcing it in
software.
In Java Access modifiers are the keywords which are used with classes,
variables, methods and constructors to control their level of access. Java
has four access modifiers: public, private, protected, and package (or
default).
64
Notes (C++) 2019-2020
Friendly Modifiers
access with the friend keyword before use. That is, a class must explicitly
identify all external functions it considers friendly.
D
Using inheritance, the new types defined by the software become a spe-
cialized type of the parent class. They possess an IS-A relationship. Using
inheritance, we may establish a model for a simple Duck class. In this model,
2
Genetic testing may complicate this in the future.
65
Notes (C++) 2019-2020
// classes_as_friends1.cpp
// compile with: /c
class B;
class A {
public:
int Func1( B& b );
private:
int Func2( B& b );
FT
};
class B {
A
private:
int _b;
R
};
66
Notes (C++) 2019-2020
we want to simulate multiple duck types: Wood Ducks, Mallards, and Can-
vasback. All three of these animals possess the abilities to quack and fly.
For ducks, apparently, each type produces a slightly different call. Con-
sequently, in this application, we want to specify that all ducks possess this
ability, but we want to require every duck type to implement their own duck
sound. For our purposes, the birds in this model all fly the same way, so
they do not need unique behavior for that method.
Our goal is to:
• Create a simple, universal fly method all ducks can share.
associated code, then every existing class that inherits from Duck also gains
that ability.
D
• Pro: Code-reuse
5.4 Exercises
1. Using the code in Figure 5.11, code a copy constructor that sends the
text ”copied.” to the standard output each time it executes.
67
Notes (C++) 2019-2020
#include <iostream>
namespace Targets
{
class Duck
{
public:
virtual void quack(void){};
FT
void fly(){std::cout << "Flap Flap" << std::endl;};
};
public:
void quack(void);
};
68
Notes (C++) 2019-2020
void WoodDuck::quack(void){
std::cout << "Woody Quack!" << std::endl;
};
void Mallard::quack(void){
std::cout << "Malignant Quack!" << std::endl;
};
void Canvasback::quack(void){
std::cout << "Burlap Quack!" << std::endl;
} FT
int main(){
Targets::WoodDuck fred;
fred.quack();
A
fred.fly();
Mallard mary;
R
mary.quack();
mary.fly();
D
Canvasback carl;
carl.quack();
carl.fly();
Figure 5.10: Using the earlier duck example defining class methods outside
their definition.
69
Notes (C++) 2019-2020
#include<iostream>
#include<cmath>
using namespace std;
class Vec3 {
private:
double values[3];
public:
Vec3( double v0, double v1, double v2 ){
values[0] = v0; values[1] = v1; values[2] = v2;
}
}
Figure 5.11: A code fragment for a simple Vec3 mathematical vector to use
for exercises.
FT
2. Building on the code in Figure 5.11, create a two parameter mutator
method that accepts an index and updated value to use.
3. Overload the * operator in the code in Figure 5.11 such that when
the Vec3 is multiplied by a float it returns a scaled Vec3 object (e.g.
A
Vec3(1,2,3) * 2 == Vec3(2,4,6)).
4. Create a normalize method which modifies the vector’s current values
R
to unit length. That is, the normalize function adjusts the existing
values in the Vec3 object such that the length from the origin at 0,0,0
D
s0 = a1 b2 − a2 b1
s1 = a2 b0 − a0 b2
s3 = a0 b1 − a1 b0
70
Notes (C++) 2019-2020
71
Notes (C++) 2019-2020
FT
A
R
D
72
Chapter 6
6.1 Scope
Programs bind the names of variables in source to particular memory lo-
FT
cations during their execution. A variable is in scope in a program all the
places where one may legally reference that variable name. We introduce a
new scope in C++ when we begin a function call, iteration, or even when
we simply introduce empty curly braces. Figure 6.1 illustrates using curly
A
braces to begin a new scope, but it contains an error that prevents com-
pilation. The inner-block of code declares a variable, but the outer-block
attempts to assign to it, and the compiler prevents the assignment.
R
{
{
int block = 2;
}
block = -1;
}
Figure 6.1: This source contains errors and will not compile due to vari-
able scope issues. The variable block is not visible in the outer compound
statement.
73
Notes (C++) 2019-2020
include <iostream>
int main(){
{
unsigned short data[65535] = {310};
ptr = data;
}
6.2 Namespaces
D
74
Notes (C++) 2019-2020
Figure 6.3: Variable visibility in a simple iterative for loop. The loop control
variable falls out of scope when the loop ends, so the compiler does not
understand the binding.
#include <iostream>
int x = 1;
int y = 2; FT
void clash(int x, int y){
x = 3;
y = 4;
}
A
void bang(){
x++;
R
y++;
}
D
int main(){
int x = 50;
int y = 60;
clash(x,y);
bang();
Figure 6.4: The compiler attempts to resolve the binding as close to the use
as possible, so the inner declaration shadows the outer name. This really
feels like an exam question, doesn’t it?
75
Notes (C++) 2019-2020
namespace Fleet
{
// a class object in the namespace
class SuperDimensionalFortress
{
public:
void Transform(){}
bool isSelfAware(){}
}
using Fleet::SuperDimensionalFortress;
SuperDimensionalFortress sdf;
Figure 6.7: Bringing one identifier into scope with the using directive.
Figure 6.8: Grabbing everything in the namespace and bringing it into scope.
76
Notes (C++) 2019-2020
import java.util.ArrayList;
import java.util.List;
...
when you reference your custom Vector object, it does not conflict with a
similar structure in the Java Collections Framework.
All entities within the same package in Java have access to the public,
D
2
Or edu.sdsu.cs.datastructures if you have an awesome 310 instructor
77
Notes (C++) 2019-2020
FT
A
R
D
78
Chapter 7
Class Templates provide a way to specify that some of the class’ mem-
bers come from templates.
FT
Function Templates identify that a function, independent of a class,
accepts generic parameters.
79
Notes (C++) 2019-2020
public:
typedef void (Receiver::* Action)();
SimpleCommand(Receiver* r, Action a) :
_rec(r), _act(a) {}
m1a1 = Tank();
SimpleCommand<Tank> tnkOff = SimpleCommand(m1a1, &Tank::powerOff())
tnkOff.Execute(); FT
Figure 7.1: An example of a C++ template for the Command design pattern
in Design Patterns.
A
The Python programming does not require any formal syntax with gener-
ics. Instead, it uses a strategy called duck typing. The idea follows the say-
R
ing, ”if it looks like a duck and quacks like a duck, it’s a duck.” Instead of
trying to verify the operation is permitted at compile time, Python assumes
it is and acts accordingly.
D
80
Part II
FT
Analysis
A
R
D
81
D
R
A
FT
Chapter 8
Complexity
How rapidly does something grow? For example, if a linear algorithm takes
an hour to complete with the current number of users, how long will it
require with twice the users? Growth might occur in the amount of time it
FT
takes something to complete, or it could manifest in the additional amount
of storage required as the problem size increases.
When measuring complexity growth, one needs at least two data points.
For example, knowing a car passed a highway patrol officer at exactly noon
A
provides little actionable information about its speed. If, however, we in-
clude an observation from a second highway patrol officer precisely one mile
away who notices the exact vehicle passing that officer’s position thirty sec-
R
onds later. Given these two data points, reckless driving seems like a likely
outcome 1 .
D
83
Notes (C++) 2019-2020
use the analysis to justify implementing the design if one needs the imple-
mented design to perform the analysis. In order to justify changes, however,
one may use empirical timing results to illustrate a system’s current archi-
tectural bottlenecks.
Additionally, an analysis of working code remains strictly tied to the
machine, and operating conditions, under which the testers performed the
analysis. That is, timing results on one computer carry no significance
on any other machine. Algorithms running on a machine with a modern
graphics processing unit, for example, mean nothing when contrasted against
the same algorithm implemented on a TRS-80 color computer. Even on
machines running identical hardware, background processes may impact the
measured time.
Both Java and C++ provide convenient library and system functions
that accomplish timings at various precision (e.g., System.nanoTime(),
System.currentTimeMillis(), std::clock()). The code in Figure 8.1
illustrates the C++ mechanism for accomplishing timings.
FT
8.2 Theoretical Analysis
Machine independent. May work from design concept.
A
8.2.1 Complexity Classes
R
8.3 Exercises
For each code fragment below, give the complexity class of the algorithm
D
2.
for(int i=1; i <= n; i*=2)
for(int j=1; j <= i; j+=2)
x++;
84
Notes (C++) 2019-2020
#include <iostream>
#include <iomanip>
#include <chrono>
#include <ctime>
void consumeTime()
{
double d = 0;
for(int n=0; n<10000; ++n) FT
for(int m=0; m<10000; ++m)
d += d*n*m;
}
int main()
{
A
std::clock_t clockStart = std::clock();
consumeTime();
std::clock_t clockEnd = std::clock();
R
85
Notes (C++) 2019-2020
3.
for(int x = 0; x < n; x++)
{
for(int i=0; i < n; i++)
std::cout << "Single";
for(int j=0; j < n; j++)
std::cout << "Letter";
for(int k=0; k < n; k++)
std::cout << "Variables";
}
4.
int x = n;
for(int i=0; i < 10000; i++)
while((x-=2) > 0)
std::cout << "The value of x is: " << x << std::endl;
FT
5. Given the empirical times below for the algorithm X, what is the likely
O complexity of this algorithm?
n Time in nS
100K 1206
A
200K 2707
300K 3496
R
400K 5310
500K 7774
D
600K 8922
86
Notes (C++) 2019-2020
he noticed the test only included the men in the original data. After
he corrected this mistake and included the women and children in the
test, the sample size increased to n = 1000. Predict, as accurately as
possible, the maximum duration the algorithm will take with the new
sample size if the algorithm’s complexity is O(n2 ).
9. How long will the algorithm in question 8 take if the algorithm is O(n!)
instead of O(n2 )?
10. Given a list of integers, return the sub-sequence that produces the
maximum sum. For example, the input list 2, 3, 5, -33, 5, 6,
1, 3, 3 produces: 5, 6, 1, 3, 3
FT
A
R
D
87
Notes (C++) 2019-2020
FT
A
R
D
88
Chapter 9
When discussing different sort algorithms, one must consider the sort’s be-
havior with respect to the items it sorts and any additional resources the sort
requires. Two terms help characterize an algorithm: in-place and stable.
FT
An otherwise speedy algorithm may become unusable in an application
if the algorithm’s storage requirements grow beyond the machine’s capacity.
The term in-place references algorithms whose memory requirements remain
constant no matter how large the problem grows. These procedures may use
scratch variables as necessary, but the amount of scratch memory remains
A
totally independent of the input size. Algorithms cannot allocate a duplicate
array or data structure in any way tied to the input size and still claim in-
R
place performance.
How should a sorting algorithm handle duplicate entries? For simply
sorting numbers, without any additional context, this characteristic may
D
seem meaningless. When the numbers the sort works with reference cus-
tomer priority levels, however, and each number corresponds to a unique
person, the order suddenly matters1 A stable sorting algorithm guarantees
that the order the individual duplicates appeared in the original list appears
in the sorted list.
89
Notes (C++) 2019-2020
Case Complexity
D
Best O(n2 )
Worse O(n2 )
Selection sort works in-place by partitioning the array to sorted and
unsorted portions. Initially, the sorted portion holds a size of zero and
the unsorted portion takes up the entire array space, but after the first sort
iteration, the sorted size will equal one and the unsorted portion will contain
one less item. The algorithm works by scanning the entire unsorted portion
of the array for the smallest item it contains. After it finds the item with the
smallest value, it swaps it with the item at the front of the unsorted array.
This effectively grows the sorted portion by one and shrinks the unsorted
region by one.
Anytime an algorithm blindly swaps into an array, as the selection sort
algorithm does when it throws the item at the front of the unsorted portion
into the spot formally associated with the smallest value item, it scrambles
90
Notes (C++) 2019-2020
return -1;
}
Figure 9.2: The source code for a recursive version of the binary search
algorithm. FT
the algorithm’s stability. Due to this behavior, selection sort lacks stability.
A
9.3 Insertion Sort
Case Complexity
R
Best O(n)
Worse O(n2 )
Actual
D
O(kn)
Much like selection sort, insertion sort works in-place by partitioning
the input array into two parts: sorted and unsorted. Unlike the earlier sort,
however, insertion sort only works with one new item at a time. Rather
than scan the entire unsorted array for the smallest item, simply insert the
item at the front of the unsorted portion where it belongs with respect to
the sorted array.
The initial array:
10 9 2 8 7 3 6 5 4 0
dataToSort =
[0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
1. Pass One:
10 9 2 8 7 3 6 5 4 0
dataToSort =
[0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
91
Notes (C++) 2019-2020
Figure 9.4: Insertion sort’s in-place source code requires a constant number
of scratch variables.
92
Notes (C++) 2019-2020
2. Pass Two:
10 9 2 8 7 3 6 5 4 0
dataToSort =
[0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
9 10 2 8 7 3 6 5 4 0
dataToSort =
[0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
9 10 2 8 7 3 6 5 4 0
dataToSort =
[0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
3. Pass Three:
9 10 2 8 7 3 6 5 4 0
dataToSort =
[0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
9 2 10 8 7 3 6 5 4 0
dataToSort =
[0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
2 9 10
FT 8 7 3 6 5 4 0
dataToSort =
[0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
2 9 10 8 7 3 6 5 4 0
dataToSort =
[0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
At this point, we begin to witness the complexity buildup. As we add
A
2 to the sorted list, it must shuffle through every position currently
in the sorted portion. This swap requires processor time. If every
R
new item entering the sorted portion of the array requires shifting
the existing contents, which requires linear time, then sorting n items
requires:
D
4. Pass Four:
2 9 10 8 7 3 6 5 4 0
dataToSort =
[0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
2 9 8 10 7 3 6 5 4 0
dataToSort =
[0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
2 8 9 10 7 3 6 5 4 0
dataToSort =
[0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
2 8 9 10 7 3 6 5 4 0
dataToSort =
[0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
93
Notes (C++) 2019-2020
5. Pass Five:
2 8 9 10 7 3 6 5 4 0
dataToSort =
[0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
2 8 9 7 10 3 6 5 4 0
dataToSort =
[0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
2 8 7 9 10 3 6 5 4 0
dataToSort =
[0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
2 7 8 9 10 3 6 5 4 0
dataToSort =
[0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
2 7 8 9 10 3 6 5 4 0
dataToSort =
[0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
FT
9.4 Merge Sort
Case Complexity
A
Best O(n log(n))
Worse O(n log(n))
R
Rather than comparing and swapping entries throughout the array, the
merge sort algorithm breaks the input array into sub-arrays and then com-
D
bines them back together in order in a new array. Unlike both insertion and
selection sort, merge sort uses another working array during its operation.
Consequently, this algorithm fails to meet the requirements for in-place.
With merge sort, one might actually more items than one can store in
memory or on a single storage media. For example, one could combine a
massive customer list in Asia with one from Europe onto two new storage
disks in sorted order. The first would contain an alphabetical list of cus-
tomers, starting from the first and ending when the disk lacks additional
capacity. Then, on a second new disk, the algorithm would begin storing
the remaining customers in alphabetical order where it left off on the first
list.
Although clearly not an in-place algorithm, it does preserve the order of
duplicates so long as the code always takes from the left array first in the
event of duplicate entries at the front of the left and right lists.
94
Notes (C++) 2019-2020
FT
A
R
D
Figure 9.5: A breakdown of merge sort’s steps working from top down. This
image courtesy Wikimedia and the Public Domain.
95
Notes (C++) 2019-2020
Figure 9.6: Radix sort assisted Herman Hollerith during the tabulation of
the United States census.
ity of an algorithm, with Radix Sort, the coefficient grows large enough to
impact performance. Moreover, even though radix sort uses linear complex-
ity, another sort belonging to O(n log(n)) will scale the linear factor less
D
than radix sort’s coefficient. Additionally, unlike insertion sort, radix sort
requires additional working memory proportional to the input size.
The algorithm works by shuffling items from the input array into different
buckets. In a base-10 system, unlike what a computer uses, one requires ten
different buckets for each of our digits 0-9. For each digit, the algorithm
allocates an array capable of holding new items inside. As the algorithm
works from least significant digit toward most significant digit, it places the
item in the appropriate bucket. After processing the least significant digit
for every number in the input array, it pulls everything out of the buckets
it sorted them into and starts again using the next digit.
02 17 28 90 10 03 06 15 24 00
dataToSort =
[0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
96
Notes (C++) 2019-2020
Digit
0 90, 10, 00
1
2 02
3 03
4 24
5 15
6 06
7 17
8 28
9
After placing each value into the appropriate bin, we may now return
these items to the original array. We do so by emptying out each bucket,
in-order.
90 10 00 02 03 24 15 06 17 28
dataToSort =
[0] [1]
FT
[2] [3] [4] [5] [6] [7]
Then we may again place the items from the original array into the
[8] [9]
3
4
D
5
6
7
8
9 90
Finally, we may empty out each bucket and return the data to the orig-
inal array. Having processed each digit, the algorithm completes and leaves
us with a sorted array.
00 02 03 06 10 15 17 24 28 90
dataToSort =
[0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
Thankfully, computers use binary representations of numbers, not their
base-10 form, so we require only two buckets within which we sort items.
Unfortunately, we must process every item in the list to sort at least once
97
Notes (C++) 2019-2020
for each digit. Binary numbers contain as many digits as bits required
to represent the underlying type, so a two-byte integer requires thirty-two
passes.
2. Rearrange the elements in the array such that everything that appears
to the left of the pivot contains smaller values and everything to the
right holds larger values.
D
Performance degrades when the pivot selection fails to divide the ar-
ray. An in-order series of numbers (either descending or ascending) places
the pivot value solidly at one extreme2 . Consequently, quick sort’s worst
case performance reflects this imperfection. When quick sort cannot divide-
and-conquer, it must linearly scan the list a linear number of times. This
produces O(n2 ) performance.
2
Assuming one simply selects the last item in the list as the pivot. One cannot escape
this behavior by permuting it and choosing an alternate element as the index.
98
Notes (C++) 2019-2020
FT
A
R
D
Figure 9.7: An example of quick sort on an input array. This image courtesy
WikiMedia and the public domain.
99
Notes (C++) 2019-2020
9.7 Exercises
1. What is the complexity, in O() notation, for performing each of the
following sorts on an array of size n.:
(a) Using selection sort when the input is in its natural order (as-
cending)?
(b) Using insertion sort when the input is in its natural order (as-
cending)?
(c) Performing merge sort on an array filled entirely with duplicate
items?
(d) Using quick sort on an array in apparently random order?
(e) Using quick sort on an array filled entirely with n copies of the
same item?
FT
2. Using the insertion sort algorithm, starting from index position 0, show
the contents of the input array after each indicated position swap. The
unsorted portion should shrink and the sorted portion should grow
with every pass.
10 -7 4 2 -10 22 8 7 9 0
A
dataToSort =
[0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
After completing the third pass:
R
dataToSort =
[0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
D
3. The code for binary search shown in Figure 9.2 uses recursion to per-
form the search. Rewrite the algorithm to eliminate the recursion (e.g.,
using a while loop).
100
Part III
FT
Data Structures: Linear
A
R
D
101
D
R
A
FT
Notes (C++) 2019-2020
AVL Tree
Red-Black Tree
Graph Adjacency List
D
Adjacency Matrix
Figure 9.8: Abstract data types and data structures implementing the in-
terface.
103
Notes (C++) 2019-2020
FT
A
R
D
104
Chapter 10
Lists
10.1 Stack
Possible to create with an array.
FT
10.2 Queue
Single linked list configured the correct direction
A
10.3.1 Array
10.3.2 Dynamic Array
10.3.3 Circular Array
105
Notes (C++) 2019-2020
Figure 10.1: The basic structure of a sequential access list. Each Node
connects to the next through a saved address.
struct Node
{
Node(int value) : data(value){next = NULL;};
int data;
Node *next;
}
Figure 10.2: A basic node for a singly linked list. It contains a place to store
the data value as well as a link to the next Node in the chain.
FT
Linked lists provide all the same List functions as their array-based
counterparts, but they implement the functions in different ways. Rather
than keeping the data grouped together, a linked list distributes it through-
out the memory space. As an advantage, one no longer needs to move items
A
in memory when inserting in the front, rear, or middle of the list. For this
functionality, however, the linked list sacrifices random access. Instead, link
R
lists use sequential access. This means that in order to get to position i + 1
in the list, one must first navigate to i. Thus, to get to the fourth item in
the list, one must first visit the first, second, and third elements.
D
Each position in the list maintains both a value (the data relevant to
the application), as well as the address to the next item in the list. This
requires more overhead than a traditional array, for the index positions
naturally inform one where to look in memory for the data.
106
Notes (C++) 2019-2020
ming, however, so one need not introduce classes and methods. Rather than
implementing the add and remove methods as part of a class, one simply
D
creates functions that take the node as a parameter. The code in Figure
10.3 shows one method for achieving this by passing in the working node.
Removing items from the front of a singly-linked list, unlike a standard
array, requires constant time, for one need only update the pointer to the
start of the list. Instead of pointing start to the current front item, move
it so it points to front->next. No matter how long the list spans, this
operation requires the same number of steps.
107
Notes (C++) 2019-2020
struct DNode
{
DNode(int value) : data(value){next=NULL; prev=NULL;};
int data;
DNode *next;
DNode *prev;
}
Figure 10.5: An updated node for a doubly linked list. With the addition of
the back-pointer, most of the space to store the node goes to keeping track
of list maintenance data.
in C++ might actually modify an alias to the input node. Rather than
passing in simply a pointer to the start of the list, pass in the address where
that pointer lives in memory. Then, when one changes it, one changes its
D
value in the outer scope. In C and C++, a function can set the linked list’s
starting node to an entirely new node. Java developers cannot achieve this
precise functionality due to the language’s design.
Significantly, the use of aliased variables in C++ falls well within the
language’s intent and style. This functionality allows the language to sup-
port these types of data structures without incorporating a strictly object-
oriented design. One need not preserve or maintain the number of items
inside the list if one never intends to check it. C++ developers need not in-
clude this inefficiency. At the same time, one might alternatively construct
a class in C++ which accomplishes all the list requirements while preserving
its state in some internal variables.
To achieve specifically tracer functionality, as shown in Figure 10.7, one
must use triple reference pointers, and the Java language took steps in its
design to strictly avoid this level of referencing. The power it offers in-
108
Notes (C++) 2019-2020
toAdd->next = previous->next;
previous->next = toAdd;
toAdd->prev = previous;
if (toAdd->next != nullptr)
toAdd->next->prev = toAdd;
}
int cursor = 0;
while((*tracer) && (++cursor < position))
tracer = &(*tracer)->next;
toAdd->next = *tracer;
*tracer = toAdd;
}
Figure 10.7: Triple reference pointers in C and C++ allow one to use casting
to access the data contents, yet it makes it easy to work directly with the
head reference if necessary.
109
Notes (C++) 2019-2020
1
Job security . . . or how to get kicked off a team?
110
Notes (C++) 2019-2020
10.5 Exercises
1. What is the complexity, in O() notation, for:
2. Rewrite the code in Figure 10.3 by replacing the recursive call with a
while loop.
3. How does one correctly implement a queue, within the expected com-
plexity for each major operation, using only a singly-linked list? What
additional variables must one track? From what end does the dequeue
FT
operation remove items in the queue?
4. Doubly linked lists require additional overhead, but,they add func-
tionality not available with singly linked lists. Describe two benefits
provided by doubly linked lists that are not provided by singly linked
A
lists. The lists in this question are unordered.
5. Given an array of unsorted integers and a target value, write a function
R
that scans the input array and displays two values, and their index
positions, within the array with a sum totaling the target value. What
D
111
Notes (C++) 2019-2020
FT
A
R
D
112
Part IV
FT
Data Structures: Non-linear
A
R
D
113
D
R
A
FT
Chapter 11
2. Uses left-justification
based data structure. These extra links require additional storage space, for
one must store the address to the next datum. With an array, however, all
values are adjacent with one another in the computer’s memory. Due to the
heap’s rules, which carefully limit its shape,
left 2i + 1
right 2i + 2
i−1
parent 2
11.2 Insertion
Due to the heap’s strict structural rules, adding one item to the heap changes
its shape in a deterministic way. Without question, and regardless of the
data set, a heap with six items will take the shape as shown in Figure 11.4.
This fact holds true no matter how many values the heap contains. The
115
Notes (C++) 2019-2020
FT
Figure 11.1: Trees and Heaps courtesy of XKCD and the Creative Commons.
A
R
-40
D
0 -10
1 13 0 0
92 2 14 55 1
Figure 11.2: The minimum heap property holds for every node within the
heap. All heaps, regardless of the input values, are complete, left-justified
trees.
116
Notes (C++) 2019-2020
-40 0 -10 1 13 0 0 92 2 55 14 1
[0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11]
Figure 11.3: The array backing the abstract tree shown in Figure 11.2. All
heaps use arrays, so this data structure represents the entire limits of the
heap.
x0
x1 x2
x3 x4 x5
FT
Figure 11.4: A left-justified, complete binary tree with six items will always
take this shape.
A
heap rules force it to assume one, and only one, shape for every heap of a
R
given size. Consequently, when the heap grows to include seven values, the
heap will take the next well-defined.
The heap rules make implementing the data structure on an array not
D
only possible, but the ideal method. With each addition, the backing array
simply grows its size by one1 . We lack all choice in the heap’s shape, so,
instead, we must swap values within the nodes themselves during run-time.
The insertion operation simply involves placing the new value in the next
available array space. From there, it then bubbles-up to its correct position
through a series of swaps with its parent.
The insertion procedure:
1. Drop the new value into array in the first position ”outside” the heap.
1
A data structure with a buffer, which reduces array re-allocations, will improve per-
formance for a queue application.
117
Notes (C++) 2019-2020
(a) if the parent’s value is lower (in a min heap), stop, for the item
is in its correct position.
(b) else swap the new value with it’s parent’s value.
(c) continue these steps until parent is lower or item is now root (no
parent).
11.3 Deletion
When deleting from a heap, we always remove the root value. As the heap
represents a queue, this appears natural, for the first item in a line stands at
its exit. The next customer is the one at the front of the line. Thankfully,
one may quickly remove an item from a heap.
Removal Process:
1. Store the old root value in a variable to return.
FT
2. Overwrite the contents of the root with the item at the end of the
heap.
(c) continue these steps until the item falls into its correct position.
D
11.4 Heapification
By slightly modifying the algorithm’s approach to building a heap, one may
reduce the complexity associated with building a heap. The performance im-
provement impacts wall-clock time, for it fails to reduce the algorithm’s com-
plexity for any meaningful task. Certainly, one may build a queue quickly
with this method, but one must still remove items from the heap, and that
time dominates the analysis. This is analogous to speeding up into slower
moving traffic.
11.5 Performance
Optimized for a priority queue, heaps quickly fulfill that ADT’s major op-
erations.
118
Notes (C++) 2019-2020
if (maxIdx != cur) {
A
temp = arr[cur];
arr[cur] = arr[maxIdx];
arr[maxIdx] = arr[cur];
R
Figure 11.5: Source code for a recursive heapification. This approach only
works when one knows all the data structure’s contents in advance like when
sorting an array.
119
Notes (C++) 2019-2020
FT
A
R
D
120
Notes (C++) 2019-2020
11.6 Exercises
1. After inserting n items into a maximum heap in sequential order, what
is the worst case performance for finding the single largest item in the
tree?
5. What is the worst case performance for removing the smallest item
from a minimum heap of size n?
FT
6. The heapsort algorithm is not stable, but you can make it appear so
by using a Wrapper on items inserted into the heap. Describe this
process. What modification must one make to the heap, if any? What
modification does one make to the items it contains?
A
7. On a separate sheet of paper, create a heap by inserting each of the
items in the following list into the heap, one item at a time, as if the
R
8. A heap containing 1, 048, 576 items must have, at most, how many
edges from its root to its farthest leaf?
9. What is the array index for the parent of an item located at index 12
in a heap? The heap’s root uses index 0.
121
Notes (C++) 2019-2020
FT
A
R
D
122
Chapter 12
The binary search algorithm, covered in Section 9.2, locates a single item
in a sorted array in O(log(n)) time. If we build a structure to mimic the
algorithm’s behavior, we will create a map implementation with fast key
lookup times. The algorithm requires a sorted list, but what if we built
a singly-linked list, and instead of simply having a next field, the data
structure’s nodes included two links. Rather than sort the data as they
arrive by rearranging the list, simply add the new items to one of the two,
internal lists. The code in Figure 12.1 shows a node in one such structure.
The binary search algorithm calls itself on either the left or right half
of the array depending on the value of the currently inspected item. Lower
targets must appear in the lower portion, or left, side of the array, and larger
values appear to the right of the current value.
123
Notes (C++) 2019-2020
MapNode *left;
MapNode *right;
} root;
Figure 12.1: A node in the binary search tree associates the key and value
using a linked group of binary nodes.
50
25 80
FT
60 85
A
81
R
Figure 12.2: The unbalanced binary search tree resulting from entering in
the keys in the order: 50, 80, 60, 25, 85, and 81. A different sequence may
D
12.2.1 Insertion
Recall that maps must contain unique keys. The values may appear multiple
times, but the search keys themselves must appear unique. Adding a new
key and value pair to the data structure requires one to instantiate a new
node if the key is missing. All additions, or insertions, in this style of data
structure occur at a null child. For new keys, the tree neither changes nor
124
Notes (C++) 2019-2020
2. Jump either left or right based upon if the new key is greater or smaller
than the current node’s key.
4. Instantiate a new node in the map with the desired key and value
provided.
12.2.2 Removal
Removing from a singly-linked list required one to stop at the link before
the one to remove. In a binary tree, we may view this as deleting from
FT
the parent. Deletion always happens at the parent’s level, but the strategy
changes based upon the number of children to which the node to delete
connects.
Typically, the remove or delete method returns the value previously
A
stored inside. This tends to simplify the caller’s software in many situa-
tions. If the value proves unnecessary to the implementation, developers
may simply ignore it in code.
R
Zero Children In the easiest case, where the desired node to remove from
D
the tree is a leaf value, one need only set the parent’s corresponding left
or right links to nullptr2 .
One Child For parent nodes with an only-child, removal closely mimics
removal from the middle of a singly-linked list. After locating the key one
wishes to delete and determining it has only one child, the algorithm simply
needs to redirect the parent’s left or right links (as applicable) such that
they now point to the node-to-delete’s only child. The parent will point to
its former grandchild.
1
This differs slightly from the put method which either inserts a new node in the tree,
if the key is absent, or sets the value associated with the existing key in the structure to
the new value specified in the latest put call.
2
Obviously, one must take special consideration with the root node. That is, when
deleting the only key in a tree.
125
Notes (C++) 2019-2020
if (start->left == NULL)
{ FT
struct node *temp = start->right;
free(start);
return val;
}
else if (start->right == NULL)
{
A
struct node *temp = start->left;
free(start);
return val;
R
K inOrderKey = inOrderPredecessor(start->left);
D
start->key = inOrderKey;
start->value = delete( start->left, key);
return val;
}
return nullptr;
}
Figure 12.3: Rough code for a removal method in a binary search tree. This
fragment misses several error conditions and significant use cases.
126
Notes (C++) 2019-2020
template<typename K>
K inOrderPredecessor(MapNode *root){
if( root == nullptr || root->left == nullptr )
return nullptr;
return maxKey( root->left );
}
template<typename K>
K inOrderSuccessor(MapNode *root){
if( root == nullptr || root->right == nullptr )
return nullptr;
return minKey( root->right );
}
Figure 12.4: Locating the in-order predecessor is simpler with helper func-
tions.
FT
Two Children When deleting nodes with two children from the binary
search tree, one must replace the root key and value with something mean-
ingful. That is, it must preserve the binary search tree ordering so that
subsequent key searches correctly locate the values. Using the in-order
A
predecessor or in-order successor will supply a value that keeps the other
connections between nodes valid.
R
In-Order Predecessor Deleting any node with two children from the
tree requires an algorithm to efficiently locate a replacement. One may do
D
12.2.3 Performance
Maps and dictionaries focus on quickly locating a record within the structure,
so the binary search tree favors these operations. Every function associated
with finding a key must operate quickly. These functions include: put, get,
and contains. Assuming a relatively complete tree, these lookup times oc-
cur in O(log(n)) time, for the tree eliminates half the potential values with
every step from parent to child.
Unfortunately, this reflects the best performance, for a different entry
order may produce a different tree. For example, inserting values in a tree
that are already in sorted order will produce an unbalanced binary search
tree. These trees more closely resemble linked lists, and, consequently, their
127
Notes (C++) 2019-2020
50
80
85
81
Figure 12.5: This tree more closely resembles a linked list, and it performs
as poorly as well. Searching for the single key 82 requires linear time.
FT
worst case performance mirrors the worst-case performance of a linked list.
A severely unbalanced list may require O(n) when locating a single value in
the tree. Figure 12.5 illustrates one such tree.
the basic binary search ordering, but its performance depends upon the
insertion order. Randomly distributed numbers should create a fairly com-
plete tree, but an in-order sequence builds something more like a linked-list.
D
128
Notes (C++) 2019-2020
A A
B B C
C A C B
Figure 12.6: Several configurations possible using the same input data but
in a different order. Only the middle configuration, however, offers balanced
performance.
we need to either rotate them right or rotate them left to move the middle
value to the root position. Imagine a roller-coaster approaching the apex of
its climb and starting to roll over the top. The unbalanced three nodes look
like a line of cars going up the hill. As the first car rotates over the top, the
middle car moves up. We want to mimic this behavior using three nodes
and source code.
Rotating a series of three nodes to the left requires the previous grand-
parent node (i.e. the root in the sub-tree) to become its child’s left child.
Effectively, the grandparent’s right child becomes the new root node, and
the grandparent moves down into its child’s position. This requires the ro-
tation code to simply move the node links around. The data values and keys
in each node, however, remain unchanged – it is only their left or right links
that accept new values.
The Java source in Figure 12.7 illustrates this on a right rotation. Here,
129
Notes (C++) 2019-2020
For each node in the AVL tree, the difference in height between
the left and right sub-trees shall differ by no more than one.
A
Mistakenly, on the internet, many casual sources suggest this is the con-
dition required for a balanced tree. This assertion remains false. Although a
R
a complete tree.
The total number of nodes in a complete tree structure, like the heap
discussed in Section 11, must contain a specific number of nodes.
130
Notes (C++) 2019-2020
Figure 12.8: The minimum number of nodes a complete tree vs an AVL tree.
Observe that the AVL condition fails to produce a truly complete tree, but
it performs well enough for a map.
and then traverses either left or right based upon the value of the key in the
root. After we add the new node to the tree, however, we must evaluate the
D
impact of the new node on the tree’s balance. All new nodes enter the tree
with a height of zero, for they are leaves with neither a left nor a right child.
Their parent nodes, however, must verify the addition did not violate the
AVL condition and rotate if necessary. An addition may require the parent
node to update its own height, and this could create a significant impact on
the tree above it.
Let us walk through the procedure by inserting each of the following
keys into the tree in sequential order from left to right4 .
After inserting the first five keys, the AVL tree and unbalanced binary
search tree look identical.
4
Exactly like you will do on the exam
131
Notes (C++) 2019-2020
AvlNode *left;
AvlNode *right;
~AvlNode() {
delete left; left = nullptr;
delete right; right = nullptr;
}
};
FT
Figure 12.9: An example of the source required for a node in an AVL tree.
It tracks the left and right children just as a standard binary search tree,
A
but it now includes the additional space required to track its height.
R
18
D
4 78
1 1
11 22
0 0
keys
Figure 12.10: An AVL tree with the heights of each node. After inserting five
keys in the AVL tree, it still resembles its basic, unbalanced counterpart.
132
Notes (C++) 2019-2020
18
3
4 78
1 2
11 22
0 1
42
0
FT
Figure 12.11: The addition of the 42 key introduces an AVL tree violation.
The 78 node has a left tree height of 2 and nothing on the right.
A
18
3
R
4 78 18
1 2 2
D
11 42 4 42
0 1 1 1
22 11 22 78
0 0 0 0
Figure 12.12: The fix-up for the violation introduced with the 42 key involves
first rotating the 22 and 42 nodes left and then rotating to the right through
78.
133
Notes (C++) 2019-2020
With AVL trees, a single insertion may unbalance multiple nodes higher
up in the tree. Adding one to one side of the tree may cause several internal
nodes in the sub-tree to detect an AVL violation. After the insertion, the
tree moves back along the insertion path all the way to the root. As it moves
back through the tree, it verifies that each node satisfies the AVL condition.
The first node it encounters on its trip from the new leaf back to the root
that has sub-trees that violate the condition must rotate.
By correcting the error at its lowest level in the tree, this automatically
reduces the height of any violating nodes above it in the tree. The only
reason the nodes above it have unbalanced subtrees is because the latest
insertion violated the AVL condition. A single insertion can only increase
the height of a sub-tree by no more than one, so all nodes above it in the
tree violating the AVL condition may only be off by one. Thus, reducing
the height at the lowest level also corrects those nodes.
In an AVL tree, one only ever needs to perform a single rotation fix-up
step. This fact does not hold true for the red-black trees discussed in Section
FT
12.3.3, for they may require multiple adjustments to correct the structure.
Things in the AVL tree appear slightly simpler in that regard.
only a single additional bit5 For this tree to work, however, it requires ad-
ditional rules which define the tree’s properties and what to do when an
D
• Check: the path from every leaf to the tree’s root passes through the
same number of black nodes.
5
We cannot allocate a single bit in Java or C++. Instead, an 8-bit or boolean value
stores the color.
134
Notes (C++) 2019-2020
NIL 2
NIL 3
NIL NIL
Figure 12.13: The addition of the key 3 causes a red-violation in this red-
black tree. This configuration triggers a rotation due to the NIL attached
to node 1.
FT
Insertion As in the unbalanced tree, all additions to the structure occur
at null children. That is, we do not overwrite existing data, nor do we
place the node inside the middle of the structure and manage those links.
Insertion always increases the map’s size, so it makes sense that new values
go where nothing existed before.
A
When inserting the first node in an empty red-black tree, the node enters
the tree as a red leaf, for the tree rules demand all new nodes arrive red, but
R
the tree then immediately changes its color to black. Let us run through the
process with an in-order series of numbers:
D
{ 1, 2, 3, 4, 5, 6 }
135
Notes (C++) 2019-2020
1 3
Figure 12.14: The reduced height in the red-black tree after the rotation
triggered by 3’s addition to the map.
FT
A
2
R
D
1 3
NIL NIL
Figure 12.15: Yet another red-violation when we insert the next number in
the in-order sequence.
136
Notes (C++) 2019-2020
1 3
NIL NIL
Figure 12.16: Changing the color of the new node’s parent, grandparent,
and aunt resolves this problem. Because this color flip includes the root, it
actually causes two fix-up steps: color flip, root change to black.
FT
As Figure 12.17 shows, yet another red-violation appears with the next
number in the in-order sequence. In this case, because node 3’s other child
is null, we treat it as black per the red-black tree rules. As previously
established, we rotate the nodes when we detect this condition. Figure
A
12.18 shows what the final tree looks like after this concludes with the key
5 successfully inserted in the map.
R
12.4 B-Trees
D
The binary search tree structure expands to more than one link. B-Trees
generalize the idea. The binary tree acts as a simple 2-tree, for each node
contains two links. What if we could expand this idea to include three links?
In this way, we might slightly optimize our tree’s structure and search times.
We could use a lower link and greater link, just like in the binary search tree,
but what if there were also some concept of a middle value?
The 2-3 tree, one form of B-tree, captures this idea. In order for there
to be a middle tree to traverse, then the nodes inside the tree must contain
two keys: a lower and upper key. Items falling between the two keys will
enter the middle sub-tree. Additionally, items lower than the smaller key
will fall in the left sub-tree while items greater than the larger key will fall
into the right sub-tree.
Figure 12.25 illustrates this idea with a 2-3 tree. Unlike the binary
tree, or a simple 3-tree, the 2-3 tree uses both traditional binary search
137
Notes (C++) 2019-2020
1 3
NIL 5
FT NIL NIL
Figure 12.17: Adding key 5 to the map triggers a rotation to the left for
nodes 3, 4, and 5.
A
R
2
D
1 4
NIL NIL 3 5
Figure 12.18: After the rotation triggered by 5’s addition to the map.
138
Notes (C++) 2019-2020
1 4
NIL NIL 3 5
NIL NIL
FT
Figure 12.19: With node 6, another red-violation appears.
A
2
R
1 4
D
NIL NIL 3 5
NIL NIL
Figure 12.20: A color flip clears the problem. Observe there are the same
number of black nodes between every leaf and the root.
139
Notes (C++) 2019-2020
1 4
NIL NIL 3 6
NIL NIL 5 7
NIL NILNIL 8
FT NIL NIL
Figure 12.21: A simple color resolves the initial red-violation here, but it
introduces side-effects above it in the tree.
A
tree nodes as well as 3-nodes. When items enter the map via an insertion
method, the tree decides if it should enter the map as a new binary node
R
or, alternatively, swell an existing binary node into a 3-node. The 2-3 tree’s
insertion procedure chooses between adding a new binary node or expanding
D
140
Notes (C++) 2019-2020
1 4
NIL NIL 3 6
NIL NIL 5 7
NIL NILNIL 8
FT NIL NIL
Figure 12.22: Now nodes 4 and 6 share red as their color, but they are
adjacent, so this creates a violation.
A
R
4
D
2 6
1 3 5 7
NIL NIL
141
Notes (C++) 2019-2020
Figure 12.24: A node in a 5-tree tracks four keys and includes links to five
other 5-nodes. This illustration courtesy the WikiMedia Commons
FT
A
R
11
D
-30 7 15 40
Figure 12.25: The 2-3 tree changes the number of keys and links each node
in the tree uses so it is always complete. The name references the number
of outbound links each node will possess.
142
Notes (C++) 2019-2020
includes two keys. What if our storage media were terribly slow, like a tape-
drive or external memory array? Using the 3-node would slightly speed up
our overall performance if we planned on doing substantial processing or
searching in the tree.
What if, however, rather than limiting ourselves to simply two keys,
what if we expanded it to 128, or 256, or 2048? Searching through the keys
will become a chore itself, but fetching the 2048-node from memory supplies
2047 keys in a single read. When reading from a slow drive, this style of
tree may offer empirical performance improvements.
FT
A
R
D
143
Notes (C++) 2019-2020
12.5 Exercises
1. The code-fragment for the inOrderPredecessor in Figure 12.4 calls
the functions minKey and maxKey. Supply the code for each of these
functions.
2. A hash table using probing currently has a total of 128 bins inside
to hold items. Due to the nature of this specific data set, all items
produce a hashcode with the common factor of 8. If the hash table
simply performs a modulus of the hash code by 128 to determine the
proper item placement, what is the maximum number of items the
table can hold without rehashing?
3. Identify the worst case performance for inserting a single item into
a hash table using open-addressing given the table currently holds n
items.
FT
4. What is the worst case amortized cost of inserting n items into an
initially empty hash table?
5. What is the average complexity one should expect from a Hash Table
using probing when inserting a single item into a table with n items?
A
6. An incorrect chained hash table implementation uses only a single
bucket, for this avoids the need to rehash. In this Map implementation,
R
7. Draw the final AVL tree resulting from inserting the following number
D
sequence in order from left to right. Show each major insertion (e.g.,
one where the tree rotates) for partial credit: 88, 45, 3, 12, 11,
43, 22, 10, 0
8. Draw the final Red-Black tree, noting each node’s color, resulting from
inserting the following number sequence in order from left to right.
Show each major insertion for partial credit: 79, 5, 19, 20, 39,
40, 41, 6, 30
144
Notes (C++) 2019-2020
to put keys into the map as well as to retrieve the corresponding value.
10. Modify the tree created in Problem 9 such that it supports producing
an in-order keyset iterator. The driver shall demonstrate this func-
tionality.
11. Modify the tree created in Problem 9 such that it supports producing
FT
a value iterator and demonstrate its performance. The iterator shall
produce values corresponding to the keyset order. That is, the second
value iterated shall correspond to the second key iterated.
12. Include and demonstrate the deletion method in your solution to Prob-
A
lem 9. The solution may use the in-order predecessor or successor.
eters and returns true if the strings are anagrams of one another.
D
14. Create a program that randomly combines words contained in its input
stream and sends them to the output stream. At program execution,
the application shall read in the entire set of words to use during the
combination process. After reaching the EOF symbol, the program
shall then begin randomly selecting words and combining them into a
concatenated output word. The program produces only one combined
output word for each execution.
By default, the program randomly squeezes four words in its dictionary
together and omits any spacing between them (e.g., somewhatlikethis-
butrandom). When unable to construct strings meeting the minimum
length from the input stream, the program shall alert the user to an
error and suggest corrective action. There shall be no duplicate words
in any single output combination.
145
Notes (C++) 2019-2020
Example Use:
user@computer: $ ./crunch <dict.txt
reducefishersplendidprogram
user@computer: $ ./crunch -s <dict.txt
windows omlette lethargic amount
user@computer: $ ./crunch -n 2 -s<dict.txt
hospitable chicken repair overflow
FT
scrawny belief remain actually
15. Modify the program described in Problem 14 to accept a command line
argument which specifies the number of strings the program outputs
in a given execution. The program shall crunch together four words n
A
times for the current execution.
16. Students shall create a program which reads ASCII strings from the
R
standard input until it reaches the EOF symbol. From the accumu-
lated input, the program shall extract only its unique tokens, separated
by the space character, and send these to the standard output. For
D
146
Notes (C++) 2019-2020
Example Use:
user@computer: $ ./unique<iamsam.txt
FT
A
R
D
147
Notes (C++) 2019-2020
FT
A
R
D
148
Part V
FT
Algorithms
A
R
D
149
D
R
A
FT
Notes (C++) 2019-2020
12.5.1 Performance
12.7 Exercises
1. Given the following graph, draw the adjacency matrix in the space
provided. For any directed edges, include the appropriate arrow indi-
cating the direction of travel.
1 Destination
A B C D E
A
B B
1 1 C
151
D
1 E
A D
1 1
C
Notes (C++) 2019-2020
Input File The CSV file provided as the program’s input argument
adheres to the following formatting specification.
• Only one entry (i.e. Vertex or Edge) appears on each line in the
file
• Entry may represent either a vertex name or an edge connection
• Vertex names appear as single items on the line
• Edge connections include three parts: Source, Destination
• Edge and Vertex names appear as strings
FT
• Edges and Vertex Names may appear interlaced
• A new vertex may appear as a vertex entry or when included in
an edge connection.
A
Program Name: buildGraph
Command Line Args: path to CSV to use
R
152
List of Figures
1.8 A basic C macro for finding the maximal value. The pre-
processor replaces the macro variable a with x++, so what
D
2.1 Manifest typing in the C++ language. The source code must
specify the distinct types for each memory location. . . . . . 22
153
Notes (C++) 2019-2020
154
Notes (C++) 2019-2020
lasting effects. . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.5 A version of swap which passes by value. The swap success-
fully consumes processor resources, but it fails to perform any
D
meaningful work. . . . . . . . . . . . . . . . . . . . . . . . . 49
3.6 A version of swap using raw pointers. Instead of accepting
the actual values to modify, this version takes their address
and de-references it to change the variables’ values in memory. 49
3.7 A version of swap using pass by reference. Here the function
works with aliases to the integer variables. . . . . . . . . . . . 50
3.8 Modifying the example in Figure 3.3 to accept default argu-
ments. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.9 An example of a recursive algorithm to compute the greatest
common divisor of two integers. . . . . . . . . . . . . . . . . . 51
3.10 An example of a recursive algorithm. This version of binary
search returns the index of the matching element, but at what
cost? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.11 An example of first-class and higher-order functions. . . . . . 52
155
Notes (C++) 2019-2020
3.12 This computes the total of all elements in the list. The vari-
able total is stored as a part of the lambda function’s closure.
Since it is a reference to the stack variable total, it can change
its value. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.13 A simple lambda function illustrating the general syntax. . . 53
3.14 Lambda capture specifiers if the object should be copied by
value or reference. . . . . . . . . . . . . . . . . . . . . . . . . 53
3.15 This lambda indicates captures n by reference with the & sym-
bol. What happens when badlamb returns and n falls out of
scope? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
156
Notes (C++) 2019-2020
6.1 This source contains errors and will not compile due to vari-
able scope issues. The variable block is not visible in the
outer compound statement. . . . . . . . . . . . . . . . . . . . 73
6.2 Although free of syntax errors, this code includes a pointer
to a variable out of scope. This represents a significant logic
error. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
FT 74
6.3 Variable visibility in a simple iterative for loop. The loop
control variable falls out of scope when the loop ends, so the
compiler does not understand the binding. . . . . . . . . . . . 75
6.4 The compiler attempts to resolve the binding as close to the
A
use as possible, so the inner declaration shadows the outer
name. This really feels like an exam question, doesn’t it? . . 75
6.5 An example of namespace in C++ . . . . . . . . . . . . . . . 76
R
157
Notes (C++) 2019-2020
updating. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
10.7 Triple reference pointers in C and C++ allow one to use cast-
ing to access the data contents, yet it makes it easy to work
directly with the head reference if necessary. . . . . . . . . . . 109
11.1 Trees and Heaps courtesy of XKCD and the Creative Commons.116
11.2 The minimum heap property holds for every node within the
heap. All heaps, regardless of the input values, are complete,
left-justified trees. . . . . . . . . . . . . . . . . . . . . . . . . 116
11.3 The array backing the abstract tree shown in Figure 11.2. All
heaps use arrays, so this data structure represents the entire
limits of the heap. . . . . . . . . . . . . . . . . . . . . . . . . 117
11.4 A left-justified, complete binary tree with six items will al-
ways take this shape. . . . . . . . . . . . . . . . . . . . . . . . 117
158
Notes (C++) 2019-2020
12.1 A node in the binary search tree associates the key and value
using a linked group of binary nodes. . . . . . . . . . . . . . . 124
12.2 The unbalanced binary search tree resulting from entering in
the keys in the order: 50, 80, 60, 25, 85, and 81. A different
sequence may produce a different structure. . . . . . . . . . . 124
12.3 Rough code for a removal method in a binary search tree.
This fragment misses several error conditions and significant
use cases. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
12.4 Locating the in-order predecessor is simpler with helper func-
tions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
12.5 This tree more closely resembles a linked list, and it performs
as poorly as well. Searching for the single key 82 requires
FT
linear time. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
12.6 Several configurations possible using the same input data but
in a different order. Only the middle configuration, however,
offers balanced performance. . . . . . . . . . . . . . . . . . . 129
A
12.7 Rotating Node objects in the Java programming language. . . 130
12.8 The minimum number of nodes a complete tree vs an AVL
tree. Observe that the AVL condition fails to produce a truly
R
159
Notes (C++) 2019-2020
160
Bibliography
[2] D. Jackson, “The google technical interview: How to get your dream
job,” XRDS, vol. 20, pp. 12–14, Dec. 2013.
161
Notes (C++) 2019-2020
FT
A
R
D
162