DATA STRUCTURES AND
ALGORITHMS
Searching
1
CONTENT
• Sequential Search
• Binary Search
• Binary Search Tree
• Hashing
2
Sequential Search
• Given a sequence X[L..R] and a sequentialSearch(X[], int L, int R,
value Y. Find the index i such that int Y) {
X[i] = Y for(i = L; i <= R; i++)
if(X[i] = Y) return i;
return -1;
}
3
Binary Search
• Sequence of objects is sorted in a
non-increasing (non-decreasing)
order of keys
• Base on divide and conquer:
• Compare the input key with
the key of the object in the
middle of the sequence and
decide to perform binary
search on the left
subsequence or the right
subsequence of the object in
the middle
1 2 3 4 5 6 7 8 9 10
2 4 5 8 9 12 14 16 17 20 Y = 17
4
Binary Search
• Sequence of objects is sorted in a binarySearch(X[], int L, int R,
non-increasing (non-decreasing) int Y) {
order of keys if(L = R){
if(X[L] = Y) return L;
• Base on divide and conquer:
return -1;
• Compare the input key with }
the key of the object in the int mid = (L+R)/2;
middle of the sequence and
if(X[mid] = Y) return mid;
decide to perform binary
if(X[mid] < Y)
search on the left
return binarySearch(X,mid+1,R,Y);
subsequence or the right
subsequence of the object in return binarySearch(X,L,mid-1,Y);
the middle }
• Running time: O(log(R-L))
5
Binary Search
• Exercise Given a sequence of distinct elements a1, a2, …, aN and a
value b. Count the number of pairs (ai, aj) having ai + aj = b (i < j)
6
Binary Search Tree - BST
• BST is a data structure storing
objects under a binary tree:
• Key of each node is greater
than the keys of nodes of the
left sub-tree and smaller than 20
the keys of nodes of the right
sub-tree
10 26
struct Node{ 7 15 23 30
int key;
Node* leftChild;
Node* rightChild;
};
Node* root; 3 8
7
Binary Search Tree - BST
• Operations
• Node* makeNode(int v): create a node and return a pointer to the
created node
• Node* insert(Node* r, int v): create and insert a node with key v
into the BST with root r
• Node* search(Node* r, int v): find and return a node having key v
in the BST with root r
• Node* findMin(Node* r): find and return the node having minimum
key
• Node* del(Node* r, int v): remove the node having key v from the
BST with root r
8
Binary Search Tree - BST
Node* makeNode(int v) { Node* insert(Node* r, int v) {
Node* p = new Node; if(r == NULL)
p->key = v; r = makeNode(v);
p->leftChild = NULL; else if(r->key > v)
p->rightChild = NULL; r->leftChild = insert(r->leftChild,v);
return p; else if(r->key <= v)
} r->rightChild = insert(r->rightChild,v);
return r;
}
9
Binary Search Tree - BST
Node* search(Node* r, int v) { Node* findMin(Node* r) {
if(r == NULL) if(r == NULL)
return NULL; return NULL;
if(r->key == v) Node* lmin = findMin(r->leftChild);
return r; if(lmin != NULL) return lmin;
else if(r->key > v) return r;
return search(r->leftChild, v); }
return search(r->rightChild, v);
}
10
Binary Search Tree - BST
• Xóa nút gốc
20 23
Thay thế 20 bằng 23
10 26 10 26
7 15 23 30 7 15 30
3 8 3 8
11
Binary Search Tree - BST
Node* del(Node* r, int v) {
if(r == NULL) return NULL;
else if(v < r->key) r->leftChild = del(r->leftChild, v);
else if(v > r->key) r->rightChild = del(r->rightChild, v);
else{
if(r->leftChild != NULL && r->rightChild != NULL){
Node* tmp = findMin(r->rightChild);
r->key = tmp->key; r->rightChild = del(r->rightChild, tmp->key);
}else{
Node* tmp = r;
if(r->leftChild == NULL) r = r->rightChild;
else r = r->leftChild;
delete tmp;
}
}
return r;
}
12
Balanced Binary Search Tree - AVL
• AVL is a BST with balance property
• Difference of heights of the left child and the right child of each
node is at most 1
• The height of a AVL is logN (N is the number of nodes)
• Insertion and removal of a node from the AVL must conserve the
balance property
13
Balanced Binary Search Tree - AVL
20 20
10 26 10 26
7 15 23 30 7
3 8 3 8
AVL BST but not AVL
14
Balanced Binary Search Tree - AVL
• Difference of the heights of two children of a node might be 2 after the
insertion or removal of a node
• Perform rotations to recover the balance property
15
Balanced Binary Search Tree - AVL
Case 1
K2 K1
K1 C K2
Right rotation at K2 A
A B B C
TC
TA
TB TB TC
TA
16
Balanced Binary Search Tree - AVL
Case 2
Left rotation at K1 Right rotation at K3 K2
K3 K3
K1 C K2 C K3
K1
A K2 K1 B
TC TC A D B C
D B A D
TA TB
TA TD TB TC
TD TB TA TD
17
Hashing
• Mapping: a data structure storing pairs (key, value)
• put(k,v): Set a mapping from a key k to a value v
• get(k): return the value of the key k
• Implementation
• Binary Search Trees
• Hash tables
18
Hashing
• Direct addressing
• Value of the key k is the address indicate the place in the table
storing the pair (k,v)
• Advantages: simple, fast lookup
• Disadvantages: memory usage might be ineffective
19
Hashing
• Hash function h(k) specifies the address where the pair (k, value) is
stored
• h(k) should be simple and easy to compute
20
Hashing
• Collision: Two different keys have the same value of the hash function
(hash code):
• Resolution:
• Chaining: group keys having the same hash code into buckets
• Open Addressing
21
Hashing
• Modulo: h(k) = k mod m where m is the size of the hash table
22
Hashing: Open Addressing
• Open Addressing
• Pairs (key, value) are stored in the table itself
• Operations put(k, v) and get(k) need to probe the table until the desired
slot found
• put(k, v): probe for finding a free slot for storing (k, v)
• get(k): probe for finding the slot where the key k is stored
• Probing order: h(k, 0), h(k, 1), h(k, 2), …, h(k, m-1)
• Methods
• Linear probing: h(k, i) = (h1(k) + i) mod m where h1 is normal
hash function
• Quadratic probing: h(k, i) = (h1(k) + c1i + c2i2) mod m where h1
is normal hash function
• Double hashing: h(k, i) = (h1(k) + i * h2(k)) mod m where h1
and h2 are normal hash functions
23
Hashing: Open Addressing
get(k) put(k, v)
{ {
// T: the table // T: the table
i = 0; x.key = k; x.value = v;
while(i < m) { i = 0;
j = h(k,i); while(i < m) {
if(T[j].key = k) { j = h(k,i);
return T[j]; if(T[j] = NULL) {
} T[j] = x; return j;
i = i + 1; }
} i = i + 1;
return NULL; }
} error(“Hash table overflow”);
}
24
Hashing: Open Addressing
• Exercise: A table has m slots, apply the open addressing method in
which h(k, i) has the form:
h(k, i) = (k mod m + i) mod m
• Initialization, the table is free, present the status of the table after
inserting following sequence of keys 7, 8, 6, 17, 4, 28 into the table with
m = 10
25
Hashing: Open Addressing
• Exercise: A table has m slots, apply the open addressing method in
which h(k, i) has the form:
h(k, i) = (k mod m + i) mod m
• Initialization, the table is free, present the status of the table after
inserting following sequence of keys 7, 8, 6, 17, 4, 28 into the table with
m = 10
0 1 2 3 4 5 6 7 8 9
28 x x x 4 x 6 7 8 17
26
Hashing functions
• Key is an integer
• h(k) = mod m
• Key is a string
• k = s[0..n] → h(k) = (s[0]*256n + s[1]*256n-1 + … + s[n]*2560) mod m
27
28