10 Linked Data
10 Linked Data
March 21 – 26
Linked Data Structures
Readings: CP:AMA 17.5
The primary goal of this section is to be able to use linked lists and trees.
2
Linked List
Creating
Destroying
Inserting
Linked Lists
A Linked List is a sequence of nodes, where each node contains some data
and a link to the next node in the list. The last node in a linked list does
not link to another node.
Linked lists can grow and shrink at run-time. A significant advantage over
an array, for example, is that it is possible to easily add items to and
remove items from the front and the middle.
4
Linked Lists in C
In C, nodes are implemented as structures (llnode), and the link
between nodes is implemented as a pointer. The pointer value of the last
node is NULL, which indicates the end of the linked list.
5
Linked Lists in C
Clients interact with the linked list only through the wrapper structure
llist. As such, they do not know about the exitance of llnode.
6
Linked Lists in C – Structure Definitions
An llist points to the front node (which is NULL for an empty list). Each
llnode stores an item and a link (pointer) to the next node (which is
NULL for the last node).
llnode is a recursive data structure (it has a pointer to its own structure type). By
contrast, llist is not recursive.
7
Creating a Linked List
Creating an empty linked list means creating the wrapper-structure of a
linked list (llist), without any llnode since the list is still empty.
// [Client]
struct llist *my_list = ll_create();
8
Creating a Linked List
// llist.h [INTERFACE]
// ll_create() creates a new empty linked list.
// effects: allocates heap memory [client must call ll_destroy]
// time: O(1)
struct llist *ll_create(void);
// llist.c [IMPLEMENTATION]
struct llist *ll_create(void) {
struct llist *llst = malloc(sizeof(struct llist));
assert(llst);
llst->front = NULL;
return llst;
}
9
Creating a Linked List Node
Creating a linked list nodes creates an llnode-structure and stores the
value within, but does not “hook” the node into a linked list yet.
struct llnode *my_node = lln_create(136);
Be aware that llnode are only used internally as part of the linked list
ADT.
Remember that the client will only communicate with the linked list
through the wrapper-structure (llist).
10
Creating a Linked List Node
// llist.c [IMPLEMENTATION]
struct llnode *lln_create(int data) {
struct llnode *lln = malloc(sizeof(struct llnode));
assert(lln);
lln->data = data;
lln->next = NULL;
return lln;
}
11
List Insertion – Inserting at the Front
Inserting into a linked list can happen in three different location: at the
front (or head), in the middle, or at the back (or tail).
A potential special case is inserting into an empty linked list (see below).
// [Client]
struct llist *my_list = ll_create();
ll_insert_front(my_list, -1);
ll_insert_front(my_list, 9001);
ll_insert_front(my_list, 42);
ll_insert_front(my_list, 136);
12
List Insertion – Inserting at the Front
The order in which pointers are mutated is crucial. Using an incorrect
order can lead to losing the “connection” to existing nodes.
13
List Insertion – Inserting at the Front
// llist.h [INTERFACE]
// ll_insert_front(lst, itm) inserts the item itm at the front
// of the list *lst.
// effects: modifies *lst
// time: O(1)
void ll_insert_front(struct llist *lst, int itm);
// llist.c [IMPLEMENTATION]
void ll_insert_front(struct llist *lst, int itm) {
struct llnode *new_node = lln_create(itm);
new_node->next = lst->front; // either NULL or existing node
lst->front = new_node;
}
14
Traversing a List
Traversing a linked list means visiting every element (= node), from front
to back.
Unlike arrays, linked list nodes are not arranged sequentially in memory.
As a result, it is not possible to skip elements, and there is no fast and
convenient way to “jump” to the i-th element. The list must be traversed
from the front. Traversing a linked list is therefore O 𝑛 .
15
Traversing a List – Iterative Approach
When iterating through a list, we typically use a struct llnode-pointer
to keep track of the current node.
16
Traversing a List – Recursive Approach
When using recursion, remember to recurse over the nodes (llnode)
and not the wrapper-structure (llist).
17
Destroying a List
Since the creation of a linked list included heap-memory allocation (via a
call to malloc), we must provide facilities to release the allocated
memory again.
For the same reason, each linked-list node in the list must be destroyed as
well. As a result, destroying a list requires list traversal, which can be
achieved imperatively or recursively.
// [Client]
struct llist *llst = list_create();
ll_insert_front(llst, -1);
ll_insert_front(llst, 9001);
ll_insert_front(llst, 42);
ll_insert_front(llst, 136);
ll_destroy(llst);
18
Destroying a List – Iterative Approach
while (current) { // current != NULL
struct llnode *to_destroy = current;
current = current->next; // basic list traversal
lln_destroy(to_destroy); // destroying the llnode
}
19
Destroying a List – Iterative Approach
The order of statements is crucial when
mutating a linked list (or any other data
structure). An incorrect order can lead to
losing the “connection” to the remaining
part of the list.
while (current) {
struct llnode *to_destroy = current;
current = current->next;
lln_destroy(to_destroy);
}
20
Destroying a List – Iterative Approach
When using an iterative approach, we are going to need an additional
node pointer (to_destroy) so that we do not lose the “connection” to
the rest of the list.
// llist.h [INTERFACE]
// ll_destroy(lst) releases all resources used by *lst.
// effects: *lst becomes invalid
// time: O(n), where n is the length of the list
void ll_destroy(struct llist *lst);
// llist.c [IMPLEMENTATION]
void ll_destroy(struct llist *lst) {
struct llnode *current = lst->front; // basic list traversal
while (current) { // basic list traversal
struct llnode *to_destroy = current;
current = current->next; // basic list traversal
lln_destroy(to_destroy); // destroying the llnode
}
free(lst); // destroying the llist wrapper structure
}
21
Destroying a List – Recursive Approach
With a recursive approach, it is more convenient to free the rest of the list
before we free the first node.
22
Destroying a Linked List Node
// llist.c [IMPLEMENTATION]
void lln_destroy(struct llnode * lln) {
assert(lln);
free(lln);
}
23
Time for a Quiz! [Select most appropriate.]
What is the minimum number of bytes allocated on the heap if the
following code successfully reads in 𝑛 integers?
struct llist {
struct llnode *first; A. 8𝑛 + 8
}; B. 8𝑛 + 16
struct llnode {
C. 12𝑛 + 8
int data; D. 12𝑛 + 16
struct llnode *next; E. None of the above
};
int main(void) {
struct llist *llst = list_create();
int input = INT_MIN;
while (scanf("%d", &i) == 1) {
ll_insert_front(llst, input);
}
ll_destroy(llst);
}
24
Time for a Quiz! [Select most appropriate.]
Answers will be discussed in class!
25
List Insertion – Inserting at the Back
Inserting into a linked list can happen in three different location: at the
front (or head), in the middle, or at the back (or tail).
// [Client]
struct llist *my_list = ll_create();
ll_insert_back(my_list, 136);
ll_insert_back(my_list, 42);
ll_insert_back(my_list, 9001);
ll_insert_back(my_list, -1);
26
List Insertion – Iterative Approach
void ll_insert_back(struct llist *lst, int itm) {
struct llnode *new_node = lln_create(itm);
if (lst->front == NULL) {
lst->front = new_node;
} else {
struct llnode *insert_after = lst->front;
while (insert_after->next != NULL) {
insert_after = insert_after->next;
} // Fig 1
insert_after->next = new_node; // Fig 2
}
}
27
List Insertion – Iterative Approach
// llist.h [INTERFACE]
// ll_insert_back(lst, itm) inserts the item itm at the back
// of the list *lst.
// effects: modifies *lst
// time: O(n)
void ll_insert_back(struct llist *lst, int itm);
// llist.c [IMPLEMENTATION]
void ll_insert_back(struct llist *lst, int itm) {
struct llnode *new_node = lln_create(itm);
if (lst->front == NULL) { // empty list: insert at front
lst->front = new_node;
} else { // non-empty list: find the node AFTER which to insert
struct llnode *insert_after = lst->front;
while (insert_after->next != NULL) {
insert_after = insert_after->next;
}
insert_after->next = new_node;
}
}
28
List Insertion – Recursive Approach
// llist.c [IMPLEMENTATION]
void ll_insert_back_wrkr(struct llnode *lln, int itm) {
if (lln->next == NULL) {
lln->next = lln_create(itm);
} else {
ll_insert_back_wrkr(lln->next, itm);
}
}
29
List Insertion at an Arbitrary Position
If a linked list maintains additional properties, for example, being sorted
from lowest to highest element, an insertion can happen at any possible
location, i.e., at the front, in the middle, or at the back
30
List Insertion at an Arbitrary Position
// llist.h [INTERFACE]
// slst_insert(slst, itm) inserts the item itm into the sorted
// list *slst.
// effects: modifies *slst
// time: O(n)
void slst_insert(struct llist *slst, int itm);
// llist.c [IMPLEMENTATION]
void slst_insert(struct llist *slst, int itm) {
struct llnode *new_node = lln_create(itm);
if (slst->front == NULL || itm <= slst->front->data) {
new_node->next = slst->front;
slst->front = new_node;
} else {
struct llnode *insert_after = slst->front;
while (insert_after->next != NULL &&
itm > insert_after->next->data) {
insert_after = insert_after->next;
}
new_node->next = insert_after->next;
insert_after->next = new_node;
}
31
}
List Insertion at an Arbitrary Position – Front
struct llnode *new_node = lln_create(itm);
// inserting at the front
if (slst->front == NULL || itm < slst->front->data) {
new_node->next = lst->front; // either NULL or an existing node
lst->front = new_node;
}
// ...
32
List Insertion at an Arbitrary Position – Middle
As mentioned before, the order of statements
is crucial when mutating a linked list. An
incorrect order can lead to losing the
“connection” to the remaining part of the list.
In this example, the item is inserted into the
middle of the list.
33
List Insertion at an Arbitrary Position – Back
As mentioned before, the order of statements
is crucial when mutating a linked list. An
incorrect order can lead to losing the
“connection” to the remaining part of the list.
In this example, the item is inserted at the
back of the list.
34
Removing Nodes
As with insertion, there are three possible location from which we can
remove a node: the front, the middle, or the back.
35
Removing Nodes from the Front
// llist.h [INTERFACE]
// ll_remove_front(lst) removes the item from the front of
// the list *lst.
// requires: *lst must not be empty
// effects: modifies *lst
// time: O(1)
void ll_remove_front(struct llist *lst);
// llist.c [IMPLEMENTATION]
void ll_remove_front(struct llist *lst) {
assert(lst->front);
struct llnode *to_remove = lst->front;
lst->front = lst->front->next;
lln_destroy(to_remove);
}
36
Removing Nodes from the Back
// llist.h [INTERFACE]
// ll_remove_back(lst) removes the item from the back of the
// list *lst.
// requires: *lst must not be empty
// effects: modifies *lst
// time: O(n)
void ll_remove_back(struct llist *lst);
37
Removing Nodes from the Back
// llist.c [IMPLEMENTATION]
void ll_remove_back(struct llist *lst) {
assert(lst->front);
struct llnode *destroy_after = lst->front;
struct llnode *to_destroy = lst->front;
while (to_destroy->next != NULL) {
destroy_after = to_destroy;
to_destroy = to_destroy->next;
}
if (to_destroy == lst->front) { // remove only element
lst->front = NULL;
} else { // remove non-only element
destroy_after->next = NULL;
}
lln_destroy(to_destroy);
}
38
Removing Nodes from an Arbitrary Position
// ll_remove_item(lst, itm) removes the first occurrence of
// item itm in list *lst and returns true if item is
// successfully removed, and false otherwise.
bool ll_remove_item(struct llist *lst, int itm) {
if (lst->front == NULL) return false;
if (lst->front->data == itm) {
ll_remove_front(lst);
return true;
}
struct llnode *remove_after = lst->front;
while (remove_after->next && itm != remove_after->next->data) {
remove_after = remove_after->next;
}
if (remove_after->next == NULL) return false;
struct llnode *to_remove = remove_after->next;
remove_after->next = remove_after->next->next;
lln_destroy(to_remove);
return true;
}
39
Time for a Quiz! [Select most appropriate.]
Which of the function below void C(struct llist *lst) {
assert(lst);
correctly destroy a linked list? struct llnode *node1 = lst->front;
struct llnode *node2 = NULL;
void A(struct llist *lst) { free(lst);
assert(lst); while (node1 != NULL) {
struct llnode *node = lst->front; node2 = node1;
while (node != NULL) { node1 = node1->next;
free(node); free(node2);
node = node->next; }
} }
free(lst);
}
void D(struct llist *lst) {
void B(struct llist *lst) { assert(lst);
assert(lst); free(lst);
struct llnode *node1 = lst->front; struct llnode *node1 = lst->front;
struct llnode *node2 = NULL; struct llnode *node2 = NULL;
while (node1 != NULL) { while (node1 != NULL) {
node2 = node1->next; node2 = node1;
free(node1); free(node2);
node1 = node2;
node1 = node1->next;
}
free(lst); }
} }
41
Linked Data Augmentations
List Length
Linked List with Back-pointer
Caching Information – List Length
Consider that we are writing an application where the length of a linked
list will be queried often.
Typically, finding the length of a linked list is 𝑂 𝑛 .
However, we can store (or “cache”) the length in the llist structure, so
the length can be retrieved in 𝑂 1 time.
struct llist {
struct llnode *front;
int length;
};
43
Caching Information – List Length
Naturally, other list functions would have to update the length as
necessary:
44
Data Integrity
The introduction of the length field to the linked list may seem like a
great idea to improve efficiency.
However, it introduces new ways that the structure can be corrupted.
What if the length field does not accurately reflect the true length? For
example, imagine that someone implements the ll_remove_item
function, but forgets to update the length field?
Or a naïve coder may think that the following statement removes all of
the nodes from the list:
lst->length= 0;
45
Data Integrity
Whenever the same information is stored in more than one way, it is susceptible to
integrity (consistency) issues.
Advanced testing methods can often find these types of errors, but you
must exercise caution.
46
Caching Information – Back Pointer
Sometimes we have to add elements to the end of a linked list. In a
normal linked list, ll_insert_back has a runtime behaviour of O(n).
With storing a back pointer, i.e., a pointer to the back-element of a linked
list, we can improve ll_insert_back to O(1).
47
Linked List with a Back Pointer
The interface of linked list with a back pointer is the same as the one for
a regular linked list. In fact, the presence of a back pointer is an
implementation detail that the client is not aware of.
48
Linked List with a Back Pointer
// llist.c [IMPLEMENTATION]
// a linked list with a back pointer
struct llist {
struct llnode *front;
struct llnode *back; // back-pointer
};
// [CLIENT]
struct llist *lst = ll_create();
49
Linked List with a Back Pointer
// llist.c [IMPLEMENTATION]
void ll_insert_front(struct llist *lst, int itm) {
struct llnode *new_node = lln_create(itm);
if (lst->front== NULL) { // inserting into empty list
lst->front = lst->back = new_node;
} else { // inserting into non-empty list
new_node->next = lst->front;
lst->front = new_node;
}
}
// [CLIENT]
ll_insert_front(llst, 136);
50
Linked List with a Back Pointer
// llist.c [IMPLEMENTATION]
void ll_insert_back(struct llist *lst, int itm) {
struct llnode *new_node = lln_create(itm);
if (lst->front == NULL) { // inserting into empty list
lst->front = lst->back = new_node;
} else { // inserting into non-empty list
lst->back->next = new_node;
lst->back = new_node
}
}
// [CLIENT]
ll_insert_back(llst, 136);
// ...
ll_insert_back(llst, 9001);
51
Linked List with a Back Pointer
// llist.c [IMPLEMENTATION]
void ll_remove_front(struct llist *lst) {
assert(lst->front);
struct llnode *to_remove = lst->front;
lst->front = lst->front->next;
lln_destroy(to_remove);
if (lst->front == NULL) {
lst->back = NULL;
}
}
52
Linked List with a Back Pointer
// llist.c [IMPLEMENTATION]
void ll_remove_back(struct llist *lst) {
assert(lst->back);
struct llnode *destroy_after = lst->front;
struct llnode *to_destroy = lst->front;
while (to_destroy->next != NULL) {
destroy_after = to_destroy;
to_destroy = to_destroy->next;
}
if (to_destroy == lst->first) { // removed final element
lst->front = lst->back = NULL;
} else { // removed non-final element
destroy_after->next = NULL;
lst->back = destroy_after;
}
lln_destroy(to_destroy);
}
53
Linked Node Augmentations
Node Augmentation
It is also possible to augment nodes with additional information about
the node or the structure.
55
Node Augmentation
Linked List Linked List w/ Doubly Linked List w/
Back Pointer Back Pointer
insert_front 𝑂 1 𝑂 1 𝑂 1
insert_back 𝑂 𝑛 𝑂 1 𝑂 1
remove_front 𝑂 1 𝑂 1 𝑂 1
remove_back 𝑂 𝑛 𝑂 𝑛 𝑂 1
56
Node Augmentation
Nodes can have additional augmentations to store more data. It might be
preferable, however, to instead store data in dedicated data-structures.
57
Trees
Tree Terminology
• the root node has no parent, all
others have exactly one
• nodes can have multiple children
• in a binary tree, each node has at
most two children
• a leaf node has no children
• the height of a tree is the
maximum possible number of
nodes from the root to a leaf
(inclusive) (here: 3)
• the height of an empty tree is
zero
• the node count is the number of
nodes in a tree (here: 6)
59
Tree Nodes
At the implementation level, trees are very similar to linked lists. Each
node can link to two nodes (binary tree) or more.
60
Binary Search Tree (BST)
Binary Search Tree (BSTs) enforce the ordering property: for every node
with data d, all data in the left subtree is smaller than d, and data in the
right subtree is larger than d.
61
Binary Search Tree – Implementation
Our BST node (btnode) is similar to our linked list node definition in that
it uses pointers to connect to its left and right sub-tree.
struct btnode {
int data;
struct btnode *left;
struct btnode *right;
};
struct bst {
struct btnode *root;
};
62
Binary Search Tree – Create
As with linked lists, we need a function to create a new BST.
63
Binary Search Tree – Traverse
Tree traversal can be achieved recursively and imperatively, but the
recursive approach is easier. (The imperative approach requires an
additional ADT.)
64
Binary Search Tree – Nodes
Before writing code to insert a new node, first we write a helper to create
and destroy nodes.
65
Binary Search Tree – Insert
As with lists, we can write tree functions recursively or iteratively.
For the recursive version, we will need a wrapper, and we can handle the
special case that the tree is empty.
66
Binary Search Tree – Insert
For the worker function, we recurse on nodes.
67
Binary Search Tree – Insert
void bst_insert_wrkr(struct btnode *node, int data) {
// find node after which to insert
struct btnode *insert_after = NULL;
while (node != NULL) {
insert_after = node;
if (data < node->data) { // data should go left
node = node->left;
} else if (data > node->data) { // data should go right
node = node->right;
} else { // data already exists
return;
}
}
// inserting new data
if (data < insert_after->data) {
insert_after->left = node_create(data);
} else if (data > insert_after->data) {
insert_after->right = node_create(data);
}
}
68
Binary Search Tree – Iteration vs. Recursion
Use iteration when you have to find Use recursive when you have to
a single node in the BST: traverse the entire BST:
69
Trees and Efficiency
What is the efficiency of bst_insert?
The worst case is when the tree is unbalanced, and every node in the
tree must be visited.
70
Trees and Efficiency
The running time of bst_insert is O ℎ : it depends more on the height
of the tree (ℎ) than the number of nodes in the tree (𝑛).
The definition of a balanced tree is a tree where the height (ℎ) is
O log 𝑛 .
Conversely, an unbalanced tree is a tree with a height that is not
O log 𝑛 . The height of an unbalanced tree is O 𝑛 .
71
Trees and Efficiency
With a balanced tree, the running time of standard tree functions (e.g.,
insert, remove, search) are all O log 𝑛 .
With an unbalanced tree, the running time of each function is O 𝑛 .
A self-balancing tree “re-arranges” the nodes to ensure that tree is
always balanced.
With a good self-balancing implementation, all standard tree functions
preserve the balance of the tree and have an O log 𝑛 running time.
In CS 240 and CS 341 you will see self-balancing trees. Self-balancing trees often use
node augmentations to store additional information to aid re-balancing.
72
Node Augmentation – Node Count
A popular tree node augmentation is to store in each node the count
(number of nodes) in its subtree.
struct btnode {
int data;
struct btnode *left;
struct btnode *right;
int count;
};
It also allows us to implement a select function, which finds the item with
index k in the tree, in O ℎ time.
73
Node Augmentation – Node Count
74
Node Augmentation – Node Count
The following code illustrates how to select item with index k in a BST with a
count node augmentation.
76
Array-based Trees
We can transform the binary tree
on the right, which uses linked
nodes to structure and store the
data, into a tree that uses an array
of nodes instead.
77
Array-based trees are often used to implement “complete trees”, where
there are no empty nodes, and every level of the tree is filled (except
the bottom).
78
Time for a Quiz! [Select all that apply.]
Which of the following are correct statements about binary trees?
79
Time for a Quiz! [Select all that apply.]
Answers will be discussed in class!
80
Time for a Quiz! [Select most appropriate.]
What is the complexity of finding the minimum value in a balanced BST /
in an unbalanced BST with 𝑛 nodes?
81
Time for a Quiz! [Select most appropriate.]
Answers will be discussed in class!
82
End of the Session
At the end of this section, you should be able to: Any further
• use the new linked list and tree terminology questions?
introduced
• use linked lists and trees with a recursive or iterative
approach
• use wrapper structures and node augmentations to
improve efficiency
• explain why an unbalanced tree can affect the
efficiency of tree functions
83