Final Report SOS
Final Report
Varun Luhadia
July 29, 2024
CONTENTS
Abstract and Timeline
Chapter 1: Basics of Programming
1.1 Data Types
1.2 If-Else Condition Statements
1.3 For and While Loops
1.4 Functions
1.5 Arrays
1.6 Strings
1.7 Pointers
Chapter 2: Recursion and Backtracking
2.1 Recursion
2.2 Backtracking
2.3 Application Examples
Chapter 3: Object Oriented Programming
3.1 Classes and Objects
3.2 Encapsulation
3.3 Inheritance
3.4 Polymorphism
3.5 Abstraction
Chapter 4: Linked List
4.1 Introduction
4.2 Types of Linked Lists
4.3 Operations
Chapter 5: Stack
5.1 Introduction
5.2 Operations
Chapter 6: Queue
6.1 Introduction
6.2 Operations
Chapter 7: Binary Search Tree
7.1 Introduction
7.2 Implementation of Binary Search Tree
7.3 Insertion in a BST
7.4 Deletion in a BST
7.5 Traversal Methods
7.6 Duplicate Elimination
7.7 Binary Tree Search
7.8 Other Binary Tree Operations
Chapter 8: Heap
8.1 Introduction
8.2 Types of Heaps
8.3 Heap Operations
8.4 Properties of Heap
8.5 Time Complexity of Heap
8.6 Applications of Heaps
8.7 Advantages of Heaps
8.8 Disadvantages of Heaps
8.9 Comparison between Heap and Tree
8.10 Other Types of Heaps
Chapter 9: Hashing
9.1 Introduction
9.2 Components of Hashing
9.3 Working of Hashing
9.4 Hash Function
9.5 Types of Hash Functions
9.6 Time and Space Complexity of a Hash Function
9.7 Collision in Hashing
9.8 Separate Chaining
9.9 Open Addressing
9.10 Load Factor in Hashing
9.11 Re-Hashing
Chapter 10: Tries
10.1 Introduction to Trie
10.2 Purpose and Need of Trie
10.3 Advantages Over Hash Tables
10.4 Properties of Trie
10.5 Working of Trie
10.6 Applications of Trie
10.7 Complexity Analysis
10.8 Advantages of Trie
10.9 Disadvantages of Trie
Chapter 11: Dynamic Programming
11.1 Introduction
11.2 Working of Dynamic Programming
11.3 Characteristics of Dynamic Programming
11.4 Approaches of Dynamic Programming
11.5 Difference between Dynamic Programming and Recursion
11.6 Difference between Tabulation and Memoization
Chapter 12: Greedy Algorithm
12.1 Introduction
12.2 Steps for implementing a Greedy Algorithm
12.3 Characteristics of Greedy Algorithms
12.4 Characteristic Components of Greedy Algorithm
12.5 Difference between Greedy Algorithm and Dynamic Programming
Chapter 13: Graphs
13.1 Introduction
13.2 Types of Graphs
13.3 Graph Representation
13.4 Graph Traversal Algorithms
13.5 Shortest Path Algorithms
13.6 Minimum Spanning Tree (MST)
13.7 Applications of Graph Theory
13.8 Advanced Topics
13.9 Complexity Analysis
Abstract and Timeline
This mid-term report highlights the basics of Data Structures and Algorithms, consisting of
introductory theory and some generic code examples. My learning so far is based on the
YouTube series DSA Playlist by Striver (Raj Vikramaditya) and the book C How to Program with
an Introduction to C++ by Paul Deitel and Harvey Deitel.
Week Target
Week 1 Basics (Data Types, If-Else, Looping, Functions, Arrays, Strings, Pointers)
Week 5 Heap
Week 6 Hashmaps
Chapter 1: Basics of Programming
1.1 Data Types
Definition: Data types specify the kind of data that can be stored and manipulated within a
program.
Common Data Types:
● Integer (int): Represents whole numbers. Used in counting, indexing, and loops.
Example: 5, -3, 42.
● Floating Point (float): Represents numbers with a fractional component. Used in
precision calculations. Example: 3.14, -0.001.
● Character (char): Represents single characters. Example: 'a', 'Z'.
● String (str): Represents a sequence of characters. Example: "Hello, World!".
● Boolean (bool): Represents truth values. Example: True, False.
● List: Ordered collection of mixed data types. Example: [1, "two", 3.0].
● Tuple: Ordered collection of immutable elements. Example: (1, 2, "three").
● Set: Unordered collection of unique elements. Example: {1, 2, 3}.
● Dictionary (dict): Collection of key-value pairs. Example: {"name": "Alice", "age": 25}.
● Stack: LIFO collection. Example operations: push, pop.
● Queue: FIFO collection. Example operations: enqueue, dequeue.
● Tree: Hierarchical data structure. Example: binary tree.
● Graph: Collection of nodes and edges. Example: directed graph.
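The list above mixes C-family and Python-style type names. A minimal sketch of how the core
built-in types look in C++ syntax (the variable names are illustrative, not from the source
material):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Core types from the list above, written in C++ syntax.
int count = 42;                            // whole numbers: counting, indexing, loops
float pi = 3.14f;                          // numbers with a fractional component
char grade = 'A';                          // a single character
bool done = false;                         // truth values
std::string greeting = "Hello, World!";    // sequence of characters
std::vector<int> numbers = {1, 2, 3};      // ordered, resizable collection
```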
Application in Algorithms:
1.2 If-Else Condition Statements
Definition: Control structures that execute different blocks of code based on conditions.
Importance:
● Facilitates decision-making.
● Enables specific code paths depending on dynamic conditions.
Basic Syntax:
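A minimal sketch of the if-else syntax in C++ (the classify function is an illustrative example):

```cpp
#include <cassert>
#include <string>

// Classify a number as "positive", "negative", or "zero" using if-else.
std::string classify(int n) {
    if (n > 0) {
        return "positive";
    } else if (n < 0) {
        return "negative";
    } else {
        return "zero";
    }
}
```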
1.3 For and While Loops
Importance:
Types:
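A minimal sketch of both loop types in C++ (the function names are illustrative):

```cpp
#include <cassert>

// A for loop runs a known number of times: sum the integers 1..n.
int sumTo(int n) {
    int total = 0;
    for (int i = 1; i <= n; ++i) total += i;
    return total;
}

// A while loop runs until a condition fails: count the decimal digits of n.
int digitCount(int n) {
    int digits = 1;
    while (n >= 10) {
        n /= 10;
        ++digits;
    }
    return digits;
}
```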
1.4 Functions
Importance:
Basic Syntax:
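A minimal sketch of function syntax in C++, including a reference parameter (the names are
illustrative):

```cpp
#include <cassert>

// A function packages a task behind a name, with typed parameters and a return value.
int square(int x) { return x * x; }

// A reference parameter (int&) lets the function modify the caller's variable.
void doubleInPlace(int& x) { x *= 2; }

// Helper showing the effect of doubleInPlace on a local copy.
int doubledCopy(int x) {
    doubleInPlace(x);
    return x;
}
```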
1.5 Arrays
Importance:
Types:
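A minimal sketch of a fixed-size array in C++ (the names are illustrative):

```cpp
#include <cassert>

// A fixed-size array: elements share one type and are accessed by zero-based index.
int primes[5] = {2, 3, 5, 7, 11};

// Traverse an array by index; valid indices run 0..n-1.
int sumArray(const int* arr, int n) {
    int total = 0;
    for (int i = 0; i < n; ++i) total += arr[i];
    return total;
}
```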
1.6 Strings
Operations:
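A minimal sketch of common std::string operations in C++ (the function names are
illustrative):

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Concatenation with operator+.
std::string makeGreeting(const std::string& name) {
    return "Hello, " + name + "!";
}

// In-place reversal of a copy using std::reverse.
std::string reversed(std::string s) {
    std::reverse(s.begin(), s.end());
    return s;
}
```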
1.7 Pointers
Definition: Variables storing memory addresses of other variables.
Importance:
● Memory management.
● Performance improvement.
● Flexibility in function implementation.
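A minimal sketch of pointer basics in C++: taking an address with &, dereferencing with *, and
writing through a pointer (the function names are illustrative):

```cpp
#include <cassert>

// Writing through a pointer changes the pointed-to variable.
void setTo(int* target, int value) {
    *target = value;
}

int pointerDemo() {
    int a = 5;
    int* pa = &a;      // pa holds the address of a
    setTo(pa, 42);     // modifies a through the pointer
    return *pa;        // reads the value stored at that address
}
```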
Chapter 2: Recursion and Backtracking
2.1 Recursion
1. Definition:
○ Recursion is a programming technique where a function calls itself directly or
indirectly to solve a problem. It involves solving a problem by breaking it down
into smaller, more manageable sub-problems of the same type.
2. Types of Recursion:
○ Functional Recursion: Involves a function that calls itself to solve smaller
instances of the same problem until a base case is reached. This type is
straightforward and often used in problems like calculating the factorial of a
number or traversing a tree structure.
○ Parameterized Recursion: Extends functional recursion by passing additional
parameters that change with each recursive call, influencing the problem-solving
process. This type can be more flexible and powerful, as seen in generating the
Fibonacci series.
3. Importance:
○ Recursion simplifies complex problems by dividing them into simpler
sub-problems, which can be solved more easily and combined to form the final
solution.
○ It provides elegant and intuitive solutions for many problems, especially those
involving hierarchical structures such as trees or graphs. Examples include tree
traversal algorithms, factorial calculations, and sorting algorithms like quicksort
and mergesort.
2.2 Backtracking
1. Definition:
○ Backtracking is a technique for solving problems recursively by attempting to
build a solution incrementally, one piece at a time, and backing out (backtracking)
when a dead-end is reached. It is particularly useful in situations where a
sequence of decisions is made to arrive at a solution.
2. Importance:
○ Backtracking is useful for solving problems that involve finding a sequence of
decisions or choices that lead to a solution. It is commonly used in constraint
satisfaction problems where the goal is to find a solution that satisfies a set of
constraints, such as the Sudoku puzzle, the N-Queens problem, and graph
traversal problems like finding Hamiltonian paths or cycles.
3. Key Concepts:
○ Decision Space: Each step in backtracking involves making a decision and
exploring all possible choices. If a choice leads to a solution, it is accepted;
otherwise, the algorithm backtracks to the previous decision point and tries the
next option.
○ Recursion: The backtracking algorithm typically uses recursion to explore each
decision path. Each recursive call represents a step in the decision-making
process.
○ Backtracking: If a path does not lead to a solution, the algorithm backtracks to
the previous decision point and tries a different path. This ensures that all
possible solutions are explored systematically.
2.3 Application Examples
● Recursion:
○ Factorial Calculation: A simple example where a function calls itself to
compute the factorial of a number. The function reduces the problem size by one
in each call until it reaches the base case (factorial of 0 or 1).
○ Fibonacci Series: Demonstrates parameterized recursion by generating the
Fibonacci series. The function is called with additional parameters representing
the previous two terms in the series, and the parameters are updated with each
call.
● Backtracking:
○ N-Queens Problem: Involves placing N queens on an N×N chessboard such that
no two queens threaten each other. The backtracking algorithm explores possible
positions for each queen row by row, ensuring that no two queens are in the same
row, column, or diagonal. If placing a queen leads to a conflict, the algorithm
backtracks and tries a different position.
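The two recursion examples above can be sketched in C++: factorial illustrates functional
recursion, and this Fibonacci variant illustrates parameterized recursion, carrying the previous
two terms as extra parameters (function names are illustrative):

```cpp
#include <cassert>

// Functional recursion: the problem size shrinks by one each call
// until the base case (0! = 1! = 1) is reached.
long long factorial(int n) {
    if (n <= 1) return 1;          // base case
    return n * factorial(n - 1);   // recursive case
}

// Parameterized recursion: prev and curr are the previous two terms
// of the series and are updated on every call.
int fib(int n, int prev = 0, int curr = 1) {
    if (n == 0) return prev;       // base case
    return fib(n - 1, curr, prev + curr);
}
```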
Chapter 3: Object Oriented Programming
3.1 Classes and Objects
● Definition: A class is a blueprint for creating objects that encapsulate data (attributes)
and behaviors (methods) into a single unit. Objects are instances of classes, containing
the specific data and functions defined by the class.
● Example: Consider a Car class with attributes like speed and model. Objects of this
class, such as car1 and car2, represent specific cars with defined models and speeds.
Methods within the class, such as displayInfo(), allow interaction with the object's data.
3.2 Encapsulation
● Definition: Encapsulation binds data (attributes) and methods (behaviors) that operate
on the data into a single unit (class), protecting the internal state of an object from
outside interference and misuse.
● Example: In a BankAccount class, attributes like balance are private, meaning they
cannot be directly accessed from outside the class. Methods like deposit() and withdraw()
are provided to interact with the balance, ensuring control over how the balance is
modified.
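The BankAccount example above can be sketched in C++: the balance is private, so the only way
to change it is through methods that enforce the rules (method names are illustrative):

```cpp
#include <cassert>

// Encapsulation: balance is private (the default in a class), so it can only
// change through deposit() and withdraw().
class BankAccount {
    double balance = 0.0;
public:
    void deposit(double amount) {
        if (amount > 0) balance += amount;       // reject non-positive deposits
    }
    bool withdraw(double amount) {
        if (amount <= 0 || amount > balance) return false;  // reject invalid requests
        balance -= amount;
        return true;
    }
    double getBalance() const { return balance; }
};
```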
3.3 Inheritance
● Definition: Inheritance allows one class (derived or child class) to inherit the attributes
and methods of another class (base or parent class), promoting code reuse and
establishing a hierarchy.
● Example: An Animal class might have a method sound(). A Dog class, derived from
Animal, can inherit this method and override it to provide specific behavior, such as
barking.
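The Animal/Dog example above can be sketched in C++. Calling sound() through a base-class
reference also demonstrates runtime polymorphism: the overriding version runs (the return
strings are illustrative):

```cpp
#include <cassert>
#include <string>

// Base class with a virtual method that derived classes may override.
class Animal {
public:
    virtual std::string sound() const { return "some generic sound"; }
    virtual ~Animal() = default;
};

// Dog inherits from Animal and overrides sound() with specific behavior.
class Dog : public Animal {
public:
    std::string sound() const override { return "bark"; }
};
```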
3.4 Polymorphism
3.5 Abstraction
● Example: An abstract Shape class may declare a pure virtual function draw(). Concrete
subclasses like Circle and Rectangle implement this function, providing specific details
for drawing the shapes, while the user interacts with the abstract interface without
needing to know the underlying details.
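The abstract Shape example above can be sketched in C++: draw() is a pure virtual function, so
Shape cannot be instantiated and every concrete subclass must implement it (the return strings
are illustrative):

```cpp
#include <cassert>
#include <string>

// Abstract class: the pure virtual function (= 0) has no default body.
class Shape {
public:
    virtual std::string draw() const = 0;
    virtual ~Shape() = default;
};

class Circle : public Shape {
public:
    std::string draw() const override { return "drawing a circle"; }
};

class Rectangle : public Shape {
public:
    std::string draw() const override { return "drawing a rectangle"; }
};
```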
Chapter 4: Linked List
4.1 Introduction
A linked list is a fundamental data structure that consists of a sequence of elements, where each
element points to the next one, forming a chain-like structure. This dynamic linear collection
allows for efficient insertion and deletion of elements. Unlike arrays, linked lists do not require a
contiguous block of memory; they can grow and shrink dynamically. Each element in a linked list
is called a node, and every node contains two components: data and a reference (or pointer) to the
next node in the sequence. The last node in a linked list points to a null value, indicating the end
of the list.
4.3 Operations
1. Insertion:
○ Nodes can be inserted at the beginning, middle, or end of the list.
○ To insert a node, we need to adjust the pointers of the neighboring nodes to link to
the new node.
○ Inserting at the beginning involves making the new node point to the current
head and then updating the head to the new node.
○ Inserting at the end involves traversing to the last node and updating its pointer
to the new node.
○ Inserting in the middle involves finding the correct position and updating the
pointers accordingly.
2. Deletion:
○ Nodes can be deleted from any position in the list.
○ To delete a node, we need to update the pointers of the neighboring nodes to
bypass the node to be deleted.
○ Deleting the first node involves updating the head to the next node.
○ Deleting the last node involves updating the pointer of the second-to-last node to
null.
○ Deleting a node in the middle involves adjusting the pointers of the surrounding
nodes.
3. Traversal:
○ Traversing a linked list involves visiting each node from the head to the end.
○ During traversal, we can perform operations like printing the data or searching
for a specific value.
4. Searching:
○ To find an element in a linked list, we traverse the list from the head, comparing
each node’s data with the target value until we find a match or reach the end of
the list.
5. Reversing:
○ Reversing a linked list involves changing the direction of the pointers so that each
node points to its previous node instead of the next one.
○ The head becomes the tail and vice versa.
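The operations above can be sketched for a minimal singly linked list: insertion at the head,
traversal-based search and counting, and reversal (the function names are illustrative):

```cpp
#include <cassert>

struct Node {
    int data;
    Node* next;   // the last node's next is nullptr
};

// Insert at the beginning: the new node points to the current head
// and becomes the new head.
Node* pushFront(Node* head, int value) {
    return new Node{value, head};
}

// Traverse from the head, comparing each node's data with the target.
bool contains(const Node* head, int target) {
    for (const Node* cur = head; cur != nullptr; cur = cur->next)
        if (cur->data == target) return true;
    return false;
}

int length(const Node* head) {
    int n = 0;
    for (const Node* cur = head; cur != nullptr; cur = cur->next) ++n;
    return n;
}

// Reverse: redirect each node's pointer to its previous node;
// the old tail becomes the new head.
Node* reverseList(Node* head) {
    Node* prev = nullptr;
    while (head != nullptr) {
        Node* next = head->next;
        head->next = prev;
        prev = head;
        head = next;
    }
    return prev;
}
```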
Chapter 5: Stack
5.1 Introduction
A stack is a linear data structure that follows the Last In, First Out (LIFO) principle. It is
analogous to a stack of plates, where the last plate added is the first one to be removed. In a
stack, elements can be added and removed only from the top. Stacks are used in various
applications, including expression evaluation, backtracking algorithms, and managing function
calls in recursion.
5.2 Operations
1. Push:
○ Adding an element to the top of the stack.
○ If the stack is implemented using an array, this involves increasing the top index
and placing the new element at that index.
○ If the stack is implemented using a linked list, this involves creating a new node
and adjusting the pointers to make it the new top.
2. Pop:
○ Removing the top element from the stack.
○ If the stack is implemented using an array, this involves accessing the element at
the top index and then decreasing the top index.
○ If the stack is implemented using a linked list, this involves adjusting the
pointers to remove the top node and making the next node the new top.
3. Peek:
○ Accessing the top element without removing it.
○ This operation returns the value of the top element.
4. IsEmpty:
○ Checking whether the stack is empty.
○ For an array-based stack, this involves checking if the top index is -1.
○ For a linked list-based stack, this involves checking if the head pointer is null.
5. Size:
○ Determining the number of elements in the stack.
○ This can be achieved by maintaining a counter that is updated with each push
and pop operation.
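The array-based variant described above can be sketched in C++: top starts at -1, push
increments it, and pop decrements it (class and method names are illustrative):

```cpp
#include <cassert>

// Array-based stack: top == -1 means empty.
class Stack {
    static const int CAPACITY = 100;
    int items[CAPACITY];
    int top = -1;
public:
    bool isEmpty() const { return top == -1; }
    int size() const { return top + 1; }
    bool push(int value) {
        if (top + 1 == CAPACITY) return false;   // stack overflow
        items[++top] = value;
        return true;
    }
    int peek() const { return items[top]; }      // caller must check isEmpty() first
    int pop() { return items[top--]; }           // caller must check isEmpty() first
};
```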
Chapter 6: Queue
6.1 Introduction
A queue is a linear data structure that follows the First In, First Out (FIFO) principle. It is
analogous to a line of people waiting for a service, where the first person in line is the first to be
served. In a queue, elements are added at the rear and removed from the front. Queues are used
in various applications, including task scheduling, managing requests in a system, and
breadth-first search in graphs.
6.2 Operations
1. Enqueue:
○ Adding an element to the rear of the queue.
○ If the queue is implemented using an array, this involves increasing the rear
index and placing the new element at that index.
○ If the queue is implemented using a linked list, this involves creating a new node
and adjusting the pointers to add it to the end of the list.
2. Dequeue:
○ Removing the front element from the queue.
○ If the queue is implemented using an array, this involves accessing the element at
the front index and then increasing the front index.
○ If the queue is implemented using a linked list, this involves adjusting the
pointers to remove the front node and making the next node the new front.
3. Front:
○ Accessing the front element without removing it.
○ This operation returns the value of the front element.
4. IsEmpty:
○ Checking whether the queue is empty.
○ For an array-based queue, this involves checking if the front index is greater than
the rear index.
○ For a linked list-based queue, this involves checking if the head pointer is null.
5. Size:
○ Determining the number of elements in the queue.
○ This can be achieved by maintaining a counter that is updated with each enqueue
and dequeue operation.
6. Circular Queue:
○ A variation of the queue where the rear pointer wraps around to the front of the
array when it reaches the end.
○ This allows for efficient utilization of space in the array.
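The circular queue described above can be sketched in C++: front and rear wrap around using
modulo arithmetic, reusing slots freed by dequeue (class and method names are illustrative):

```cpp
#include <cassert>

// Circular queue over a fixed array; rear index is (front + count) % CAPACITY.
class CircularQueue {
    static const int CAPACITY = 4;
    int items[CAPACITY];
    int front = 0;
    int count = 0;
public:
    bool isEmpty() const { return count == 0; }
    bool enqueue(int value) {
        if (count == CAPACITY) return false;          // queue full
        items[(front + count) % CAPACITY] = value;    // rear wraps around
        ++count;
        return true;
    }
    int dequeue() {                                   // caller must check isEmpty() first
        int value = items[front];
        front = (front + 1) % CAPACITY;               // front wraps around too
        --count;
        return value;
    }
};
```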
Chapter 7: Binary Search Tree
7.1 Introduction
A Binary Search Tree (BST) is a specialized form of a binary tree, which is a hierarchical,
nonlinear data structure. Unlike linear data structures like arrays, linked lists, stacks, and
queues, trees represent data in a more complex, branched form. In a binary tree, each node has
at most two children, referred to as the left and right children. This characteristic makes binary
trees particularly effective for search operations.
In a BST, the structure adheres to a specific ordering property: for each node, all elements in its
left subtree are less than the node’s value, and all elements in its right subtree are greater. This
property ensures that the BST is sorted in an ordered fashion, facilitating efficient search,
insertion, and deletion operations. Additionally, BSTs do not contain duplicate values, ensuring
each element is unique within the tree.
7.2 Implementation of Binary Search Tree
1. Insertion: Adding a new node to the BST while maintaining its ordering property.
2. Deletion: Removing a node from the BST and ensuring the tree remains properly
ordered.
3. Traversal: Visiting all nodes in the tree in a specific order, commonly performed in three
ways:
○ In-order Traversal: Visits nodes in ascending order.
○ Pre-order Traversal: Visits nodes in the order they were inserted.
○ Post-order Traversal: Visits all children nodes before their respective parent
nodes.
7.3 Insertion in a BST
Insertion in a BST starts at the root node and involves comparing the value to be inserted with
the current node's value. If the value is less than the current node’s value, the insertion process
moves to the left child; if greater, it moves to the right child. This process continues recursively
until a null position is found, where the new node is then inserted.
1. Start at the root node. If the tree is empty, the new node becomes the root.
2. Compare the new value with the current node’s value.
3. Move left or right based on the comparison:
○ If the new value is less, move to the left child.
○ If the new value is greater, move to the right child.
4. Repeat the process until a null position is found.
5. Insert the new node at the null position.
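The insertion steps above can be sketched in C++, together with an in-order traversal that
verifies the ordering property by producing the values in ascending order (names are
illustrative):

```cpp
#include <cassert>
#include <vector>

struct TreeNode {
    int value;
    TreeNode* left = nullptr;
    TreeNode* right = nullptr;
    TreeNode(int v) : value(v) {}
};

// Compare, move left or right, and insert at the first null position;
// duplicates are discarded.
TreeNode* insert(TreeNode* root, int value) {
    if (root == nullptr) return new TreeNode(value);   // null position found
    if (value < root->value)
        root->left = insert(root->left, value);
    else if (value > root->value)
        root->right = insert(root->right, value);
    // equal: duplicate, discard
    return root;
}

// In-order traversal: left subtree, current node, right subtree.
void inorder(const TreeNode* root, std::vector<int>& out) {
    if (root == nullptr) return;
    inorder(root->left, out);
    out.push_back(root->value);
    inorder(root->right, out);
}
```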
7.4 Deletion in a BST
Deleting a node in a BST is more complex than insertion and involves three primary scenarios:
a leaf node can be removed directly; a node with one child is replaced by that child; and a node
with two children is replaced by its in-order successor (the smallest value in its right subtree),
which is then deleted from its original position.
7.5 Traversal Methods
Traversal of a BST is a method to visit all nodes in a specific order. There are three main
traversal techniques:
1. In-order Traversal:
○ Traverse the left subtree.
○ Visit the current node.
○ Traverse the right subtree.
○ Use case: Retrieves the BST elements in ascending order.
2. Pre-order Traversal:
○ Visit the current node.
○ Traverse the left subtree.
○ Traverse the right subtree.
○ Use case: Useful for creating a copy of the tree.
3. Post-order Traversal:
○ Traverse the left subtree.
○ Traverse the right subtree.
○ Visit the current node.
○ Use case: Used in deleting the tree or evaluating postfix expressions.
7.6 Duplicate Elimination
BSTs inherently manage duplicates by their structure. When an insertion operation encounters a
duplicate value (a value already present in the tree), it simply discards the duplicate. This is
achieved by the recursive comparison process during insertion, ensuring each value in the tree is
unique.
7.7 Binary Tree Search
Searching for a value in a BST is highly efficient due to its ordered nature. Starting at the root,
the search process involves comparing the target value with the current node's value and
deciding to move left or right accordingly. This binary decision-making process significantly
reduces the number of comparisons needed to locate a value.
In a balanced BST, the height of the tree is logarithmic relative to the number of nodes (log₂n).
Therefore, searching for an element in a BST with n nodes takes at most log₂n comparisons. For
example, searching a BST with 1,000 elements requires no more than 10 comparisons, while
searching a tree with 1,000,000 elements requires no more than 20 comparisons.
7.8 Other Binary Tree Operations
1. Level Order Traversal: Visits nodes level by level, starting from the root. This can be
implemented using a queue.
2. Tree Height Calculation: Determines the number of levels in the tree.
3. Balancing the Tree: Ensures the tree remains balanced for optimal performance.
Balanced trees prevent skewed structures that degrade performance to linear search
times.
Chapter 8: Heap
8.1 Introduction
A heap is a binary tree-based data structure that satisfies the heap property: in a max-heap, the
value of each node is greater than or equal to the values of its children, while in a min-heap it is
less than or equal. The root node therefore contains either the maximum or minimum value,
depending on the type of heap. Heaps are
used in a variety of applications, including priority queues, memory management, and graph
algorithms.
8.2 Types of Heaps
● Max Heap: The root node contains the maximum value, and the values decrease as you
move down the tree.
● Min Heap: The root node contains the minimum value, and the values increase as you
move down the tree.
8.4 Properties of Heap
Complete Binary Tree: A heap is a complete binary tree, meaning all levels are fully filled
except possibly the last level, which is filled from left to right. This allows efficient array
representation of the tree.
Heap Property: The heap property ensures that the minimum (in a min-heap) or maximum (in
a max-heap) element is always at the root of the tree.
Parent-Child Relationship: In a heap, the relationship between a parent node at index i and
its children is given by the formulas: left child at index 2i + 1 and right child at index 2i + 2.
Efficient Insertion and Removal: Insertion involves placing the new element at the next
available position at the bottom-rightmost level and then restoring the heap property by
comparing and possibly swapping the element with its parent. Removal involves replacing the
root with the last element and heapifying down.
Efficient Access to Extremal Elements: The minimum or maximum element is always at the
root, allowing constant-time access.
8.3 Heap Operations
Heapify: This process rearranges elements to maintain the heap property. It takes O(log N) time
to balance the tree after an insertion or deletion.
Insertion: Inserting a new element may distort the heap's properties, necessitating a heapify
operation to restore them. This also takes O(log N) time.
getMax (for max-heap) or getMin (for min-heap): Finding the maximum or minimum
element is O(1) as it is always at the root.
removeMin or removeMax: This operation returns and deletes the root element (maximum in
a max-heap or minimum in a min-heap).
8.5 Time Complexity of Heap
Building a heap from an input array has a time complexity of O(n). This is derived from the fact
that the Heapify operation, which takes O(log n) time, is called for each node in the tree. Since
the number of nodes at each level increases exponentially while the height of the tree decreases,
the overall time complexity for building the heap is O(n).
8.6 Applications of Heaps
● Priority Queues: Heaps are commonly used to implement priority queues, where
elements are stored based on their priority. This allows efficient access to the
highest-priority element.
● Heapsort Algorithm: This sorting algorithm uses a heap to sort elements efficiently
with a worst-case time complexity of O(n log n).
● Memory Management: Heaps are used in memory management systems for dynamic
memory allocation and deallocation.
● Graph Algorithms: Algorithms like Dijkstra’s, Prim’s, and Kruskal’s use heaps to
manage priority queues efficiently.
● Job Scheduling: Heaps are used to schedule tasks based on their priority or deadline,
ensuring that the highest-priority task is always executed first.
8.7 Advantages of Heaps
● Efficient Insertion and Deletion: Operations like insertion and deletion are efficient,
typically taking O(log N) time.
● Efficient Priority Queue: Heaps allow constant-time access to the highest priority
element, making them suitable for priority queues.
● Guaranteed Access to Extremal Elements: The maximum or minimum element is
always at the root, ensuring quick access.
● Space Efficiency: Array-based heaps store no per-element pointers, so they use less
memory than linked structures such as linked lists.
● Heap-sort Algorithm: Provides an efficient sorting algorithm with a worst-case time
complexity of O(n log n).
8.8 Disadvantages of Heaps
● Lack of Flexibility: Heaps are not very flexible as they are designed to maintain a
specific order of elements.
● Not Ideal for Searching: Searching for a specific element in a heap requires traversing
the entire tree, which has a time complexity of O(n).
● Not a Stable Data Structure: The relative order of equal elements may not be
preserved.
● Memory Management: Requires dynamic memory allocation, which can be challenging
in systems with limited memory.
● Complexity: Individual updates cost O(log N) and heap-based sorting costs O(n log n)
overall, which may not be optimal for every application.
8.10 Other Types of Heaps
● Binomial Heap: A collection of binomial trees that satisfies the heap property.
● Fibonacci Heap: A collection of trees with a heap-ordered structure, optimized for
amortized running times of O(1) for insert and O(log n) for extract-min.
● Leftist Heap: A heap-ordered binary tree in which the null path length of the left child is
always at least that of the right child, keeping the right spine short.
● K-ary Heap: A generalization of binary heaps where each node has K children instead of
2. This type includes max k-ary heaps and min k-ary heaps.
Chapter 9: Hashing
9.1 Introduction
Hashing in data structures refers to the process of transforming a given key into another value.
This involves mapping data to a specific index in a hash table using a hash function, enabling
fast retrieval of information based on its key. The transformation of a key to the corresponding
value is done using a hash function, and the value obtained is called a hash code.
With the exponential growth of data on the internet, efficient storage and retrieval become
crucial. Traditional structures like arrays can store data in constant time O(1), but searching
even a sorted array takes at least O(log n) time, which can be inefficient for large datasets.
Hashing addresses this inefficiency by allowing both storage and retrieval in constant time O(1)
on average.
9.2 Components of Hashing
1. Key: The input to the hash function, which can be a string or integer.
2. Hash Function: Computes the index in the hash table for a given key.
3. Hash Table: A data structure that maps keys to values using the hash function.
9.3 Working of Hashing
1. Hash Function Calculation: The hash function is applied to the key to determine the
index for storage.
2. Key-Value Storage: The key-value pair is stored at the computed index in the hash
table.
3. Retrieval: The same hash function is used to retrieve the value using the key.
9.4 Hash Function
A hash function creates a mapping between key and value using a mathematical formula. Key
properties of a hash function include determinism, fixed output size, efficiency, uniformity,
preimage resistance, collision resistance, and the avalanche effect. A good hash function should
be efficiently computable, uniformly distribute keys, minimize collisions, and have a low load
factor.
9.5 Types of Hash Functions
1. Division Method: Uses the remainder of the key divided by a prime number.
2. Multiplication Method: Uses a constant to multiply the key and takes the fractional
part of the product.
3. Mid-Square Method: Squares the key and uses the middle digits of the result.
4. Folding Method: Divides the key into parts, sums the parts, and then takes the modulo.
5. Cryptographic Hash Functions: Designed for security, e.g., MD5, SHA-1, SHA-256.
6. Universal Hashing: Uses a family of hash functions to minimize collision probability.
7. Perfect Hashing: Creates a collision-free hash function for a static set of keys.
9.7 Collision in Hashing
Collisions occur when two different keys map to the same hash value. Hash collisions can be
handled using separate chaining or open addressing.
9.8 Separate Chaining
This method uses a linked list for each index of the hash table to handle collisions. Multiple
elements that hash to the same slot are stored in that index's linked list. The performance of
chaining depends on the load factor n/m, where n is the number of stored keys and m is the
number of slots.
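Separate chaining can be sketched with one linked list per bucket and the division-method hash
key % m (the class and method names are illustrative; keys are non-negative integers for
simplicity):

```cpp
#include <cassert>
#include <list>
#include <vector>

// A set of non-negative integer keys using separate chaining.
class ChainedHashSet {
    std::vector<std::list<int>> buckets;
public:
    explicit ChainedHashSet(int m) : buckets(m) {}

    // Division method: the bucket index is key % m.
    int hash(int key) const { return key % (int)buckets.size(); }

    void insert(int key) {
        if (!contains(key)) buckets[hash(key)].push_back(key);
    }

    // Colliding keys share a bucket, so search walks that bucket's list.
    bool contains(int key) const {
        for (int k : buckets[hash(key)])
            if (k == key) return true;
        return false;
    }
};
```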
9.9 Open Addressing
In open addressing, all elements are stored in the hash table itself. Methods include linear
probing, quadratic probing, and double hashing. The process involves probing for the next
available slot if a collision occurs.
Linear Probing
Sequentially searches for the next available slot starting from the original hash location.
Quadratic Probing
Probes at quadratically increasing offsets (1, 4, 9, ...) from the original hash location, which
reduces the primary clustering seen with linear probing.
Double Hashing
Uses a second hash function to determine the probe step size, so colliding keys follow different
probe sequences.
9.11 Re-Hashing
Rehashing involves increasing the size of the hash table and redistributing elements to new
buckets based on new hash values. This is done to prevent collisions and maintain the efficiency
of the hash table. When the load factor exceeds a certain threshold (typically 0.75), the table is
resized and elements are rehashed to reduce the load factor and collisions.
Chapter 10: Tries
10.1 Introduction to Trie
● Trie, short for retrieval, is a tree-based data structure used for efficiently storing and
retrieving strings.
● The name "Trie" comes from the word "reTRIEval," emphasizing its use in quick search
operations.
● Trie is designed primarily for scenarios where fast prefix-based searches are required,
such as autocomplete features in search engines or text editors.
● Unlike hash tables, which are generally faster for exact match lookups, tries excel in
operations involving partial matches and prefix searches.
10.3 Advantages Over Hash Tables
● Prefix Search: Trie allows efficient prefix searching (or autocomplete) by structuring
data in a way that facilitates quick traversal down common prefixes.
● Alphabetical Order: It inherently supports alphabetical ordering of stored strings,
which is beneficial for applications requiring sorted outputs.
● No Hash Function Overhead: Unlike hash tables, which require computing hash
functions, tries directly map characters to nodes, eliminating this overhead.
● Search Efficiency: Searching in a trie for a string takes O(L) time complexity, where L
is the length of the string, making it potentially faster for large datasets compared to
hash tables in certain contexts.
10.4 Properties of Trie
● Node Structure: Each node in a trie represents a single character, and nodes are linked
to represent strings.
● Path Representation: The path from the root to any node represents a complete string.
● Child Nodes: Nodes typically use arrays or hash maps to store references to child nodes,
corresponding to possible characters following the current prefix.
● End of Word Marker: Nodes may also have a flag or counter indicating the end of a
valid word.
10.5 Working of Trie
● Inserting a word involves traversing from the root through each character of the word
and creating new nodes where necessary.
● Searching involves following the nodes corresponding to each character in the query
string and checking if a valid word endpoint exists.
● Deletion requires careful handling to maintain the integrity of the trie structure,
potentially removing nodes only when necessary to avoid breaking other valid words.
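The insert and search steps above can be sketched for lowercase words, using a fixed 26-slot
child array per node and an end-of-word flag (the names are illustrative; deletion is omitted):

```cpp
#include <cassert>
#include <string>

// One child slot per lowercase letter, plus an end-of-word marker.
struct TrieNode {
    TrieNode* children[26] = {nullptr};
    bool isEndOfWord = false;
};

// Walk character by character from the root, creating nodes where needed.
void insert(TrieNode* root, const std::string& word) {
    TrieNode* cur = root;
    for (char c : word) {
        int i = c - 'a';
        if (cur->children[i] == nullptr) cur->children[i] = new TrieNode();
        cur = cur->children[i];
    }
    cur->isEndOfWord = true;
}

// asPrefix == true asks "does any stored word start with key?";
// asPrefix == false asks "is key itself a stored word?".
bool find(const TrieNode* root, const std::string& key, bool asPrefix) {
    const TrieNode* cur = root;
    for (char c : key) {
        cur = cur->children[c - 'a'];
        if (cur == nullptr) return false;
    }
    return asPrefix || cur->isEndOfWord;
}
```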
10.8 Advantages of Trie
● Efficient Prefix Search: Ideal for applications requiring fast prefix-based lookups.
● Ordered Output: Enables alphabetical ordering of stored strings without additional
sorting operations.
● Space Efficiency: Tries can be more space-efficient than other structures for certain
datasets, particularly those with many shared prefixes.
● Flexible Use: Suitable for managing dictionaries, implementing search functionalities in
text editors, and optimizing routing algorithms.
10.9 Disadvantages of Trie
● Memory Usage: Tries can consume more memory than hash tables due to the overhead
of maintaining node structures for each character.
● Complexity: Implementing and managing tries can be more complex compared to
simpler data structures like arrays or hash tables.
● Performance Degradation: In cases of poorly distributed keys or large datasets,
performance can degrade compared to hash tables optimized for exact match lookups.
Chapter 11: Dynamic Programming
11.1 Introduction
Dynamic Programming (DP) is a method used in mathematics and computer science to solve
complex problems by breaking them down into simpler subproblems. By solving each subproblem
only once and storing the results, it avoids redundant computations, leading to more efficient
solutions for a wide range of problems.
11.2 Working of Dynamic Programming
● Identify Subproblems: Divide the main problem into smaller, independent subproblems.
● Store Solutions: Solve each subproblem and store the solution in a table or array.
● Build Up Solutions: Use the stored solutions to build up the solution to the main problem.
● Avoid Redundancy: By storing solutions, DP ensures that each subproblem is solved only
once, reducing computation time.
11.3 Characteristics of Dynamic Programming
Dynamic programming is an optimization technique used when solving problems that have the following characteristics:
A. Optimal Substructure:
Optimal substructure means that we combine the optimal results of subproblems to achieve the
optimal result of the bigger problem.
B. Overlapping Subproblems:
The same subproblems are solved repeatedly in different parts of the problem.
● In dynamic programming, a problem is solved by breaking it down into smaller subproblems and combining their solutions, whereas recursion is simply a function calling and executing itself. Dynamic programming can work without recursion, but since its purpose is to optimize and accelerate computation, programmers usually combine it with recursion to make the process efficient.
● A function that can accomplish a specific task by calling itself is called a recursive function. To perform its work, the function invokes itself whenever it needs to be executed.
● Using dynamic programming, you can break a problem into smaller parts, called subproblems, and solve it. Dynamic programming solves each subproblem the first time it arises, then uses memoization to store the solution for reuse.
● Therefore, the main difference between the two techniques is their intended use: recursion is a mechanism for a function to repeat itself, whereas dynamic programming is an optimization technique used to solve problems efficiently.
● A recursive function calls itself whenever the condition for doing so is met; this is called the recursive case. The function must stop once the task is completed, at a condition known as the base case.
● By establishing states, dynamic programming identifies the problem and divides it into subproblems in order to solve the whole. After solving these subproblems, or variables, the programmer must establish a mathematical relationship (a recurrence) between them. Finally, these solutions and results are stored so they can be accessed in the future without having to solve the whole problem again.
● Subproblem solving: If all subproblems must be solved at least once, a bottom-up dynamic programming algorithm usually outperforms a top-down memoized algorithm by a constant factor. If, however, some subproblems in the subproblem space need not be solved at all, the memoized solution has the advantage of solving only those subproblems that are definitely required.
Chapter 12: Greedy Algorithms
12.1 Introduction
A greedy algorithm is a type of optimization algorithm that makes locally optimal choices at each
step to find a globally optimal solution. It operates on the principle of “taking the best option
now” without considering the long-term consequences.
● Optimal solution: The feasible solution that achieves the desired extremum is called an
“optimal solution”. In other words, the feasible solution that either minimizes or maximizes
the objective function specified in a problem is known as an “optimal solution”.
● Feasibility check: It investigates whether the selected input fulfills all constraints mentioned
in a problem or not. If it fulfills all the constraints then it is added to a set of feasible
solutions; otherwise, it is rejected.
● Optimality check: It investigates whether a selected input produces either a minimum or maximum value of the objective function while fulfilling all the specified constraints. If an element in a solution set produces the desired extremum, then it is added to a set of optimal solutions.
● Optimal substructure property: The globally optimal solution to a problem includes the
optimal sub solutions within it.
● Greedy choice property: The globally optimal solution is assembled by making locally optimal choices. The greedy approach applies a locally optimal criterion to obtain a partial solution that seems best at that moment, and then finds the solution for the remaining subproblem.
The local decisions (or choices) must possess three characteristics as mentioned below:
● Feasibility: The selected choice must fulfill local constraints.
● Optimality: The selected choice must be the best at that stage (locally optimal choice).
● Irrevocability: The selected choice cannot be changed once it is made.
Chapter 13: Graphs
13.1 Introduction
Graph Theory is a fundamental area of study in computer science and mathematics that deals
with the study of graphs, which are mathematical structures used to model pairwise
relationships between objects. Graphs consist of vertices (nodes) and edges (connections between
nodes). In this comprehensive overview, we'll cover the major topics in Graph Theory, including
basic definitions, types of graphs, representations, traversal algorithms, shortest path
algorithms, and applications.
Graphs can represent various real-world scenarios such as social networks, road networks,
computer networks, etc.
1. Undirected and Directed Graphs:
● Undirected Graph: Edges have no direction. If there's an edge between vertices u and v, it can be traversed in both directions.
● Directed Graph (Digraph): Edges have a direction. If there's a directed edge from
vertex u to vertex v, it can only be traversed from u to v.
2. Weighted Graphs:
● Edges have weights or costs associated with them, which represent the cost, distance, or
capacity between vertices.
3. Cyclic and Acyclic Graphs:
● Acyclic Graph: Contains no cycles (a cycle is a path that starts and ends at the same vertex).
● Cyclic Graph: Contains at least one cycle.
1. Adjacency Matrix:
● A 2D array A of size |𝑉| * |𝑉|, where A[i][j] is non-zero if there's an edge from vertex i to
vertex j.
● Suitable for dense graphs but consumes O(V²) space.
2. Adjacency List:
● An array of lists (or a hashmap of lists), where element i contains a list of vertices adjacent to vertex i.
● Suitable for sparse graphs and consumes O(V + E) space.
1. Depth-First Search (DFS):
● Explores as far as possible along each branch before backtracking.
● Uses the call stack (recursive implementation) or an explicit stack (iterative implementation).
2. Breadth-First Search (BFS):
● Explores all neighbors at the present depth level before moving on to nodes at the next depth level.
● Uses a queue.
1. Dijkstra's Algorithm:
● Finds the shortest paths from a source vertex to all other vertices in a graph with
non-negative edge weights.
● Uses a priority queue (min-heap).
2. Bellman-Ford Algorithm:
● Finds the shortest paths from a single source vertex to all other vertices, even in the
presence of negative edge weights.
● Uses relaxation technique over all edges |𝑉| − 1 times.
3. Floyd-Warshall Algorithm:
● Finds shortest paths between all pairs of vertices in a graph, including negative edge
weights.
● Uses dynamic programming.
1. Kruskal's Algorithm:
● Finds an MST by sorting all edges and adding the smallest edge that doesn’t form a cycle
until |𝑉| − 1 edges are added.
● Uses Union-Find data structure for cycle detection.
2. Prim's Algorithm:
● Finds an MST by starting from an arbitrary vertex and greedily adding the minimum
weight edge that connects the tree to an outside vertex.
● Uses a priority queue (min-heap).
● Strongly Connected Components: Maximal subgraphs of a directed graph in which each vertex is reachable from every other vertex in the subgraph.
● Eulerian Path/Cycle: Visit each edge exactly once (Cycle returns to the starting vertex).
● Hamiltonian Path/Cycle: Visit each vertex exactly once (Cycle returns to the starting
vertex).
Graph algorithms vary in complexity depending on the representation and the specific problem. Common complexities include O(V + E) for traversal algorithms, O(V log V + E) for MST algorithms, and O(V³) for all-pairs shortest path algorithms.