
Unit – 1

Abstract data type


Before understanding what an ADT is, let us consider the different inbuilt data types that are provided
to us. Data types such as int, float, double, long, etc. are considered inbuilt data types, and we can
perform basic operations on them such as addition, subtraction, division, multiplication, etc. Now there
might be a situation when we need operations for a user-defined data type, and these operations have to
be defined. They can be defined only as and when we require them. So, in order to simplify the process of
solving problems, we can create data structures along with their operations, and such data structures that
are not built in are known as Abstract Data Types (ADT).

An Abstract Data Type (ADT) is a type (or class) for objects whose behavior is defined by a set of values
and a set of operations. The definition of an ADT mentions only what operations are to be performed, not
how these operations will be implemented. It does not specify how data will be organized in memory or
what algorithms will be used for implementing the operations. It is called “abstract” because it gives an
implementation-independent view.

The process of providing only the essentials and hiding the details is known as abstraction.

The user of a data type does not need to know how that data type is implemented. For example, we have been
using primitive values like the int, float, and char data types with only the knowledge of what operations
can be performed on them, without any idea of how they are implemented.
1. List ADT

View of list

• The data is generally stored in key sequence in a list which has a head structure consisting of a
count, pointers, and the address of a compare function needed to compare the data in the list.
• A data node contains a pointer to a data structure and a self-referential pointer which points to the
next node in the list.
• The List ADT functions are given below (a minimal sketch follows this list):
• get() – Return an element from the list at any given position.
• insert() – Insert an element at any position of the list.
• remove() – Remove the first occurrence of any element from a non-empty list.
• removeAt() – Remove the element at a specified location from a non-empty list.
• replace() – Replace an element at any position by another element.
• size() – Return the number of elements in the list.
• isEmpty() – Return true if the list is empty, otherwise return false.
• isFull() – Return true if the list is full, otherwise return false.
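
A minimal Python sketch of this List ADT (the class name, the list backing, and the fixed capacity are assumptions for illustration):

```python
class ListADT:
    """A minimal List ADT sketch backed by a Python list."""

    def __init__(self, capacity=10):
        self._items = []             # internal storage, hidden from the user
        self._capacity = capacity    # assumed fixed capacity for isFull()

    def get(self, pos):
        return self._items[pos]

    def insert(self, pos, element):
        if self.isFull():
            raise OverflowError("list is full")
        self._items.insert(pos, element)

    def remove(self, element):
        self._items.remove(element)  # removes the first occurrence

    def removeAt(self, pos):
        return self._items.pop(pos)

    def replace(self, pos, element):
        self._items[pos] = element

    def size(self):
        return len(self._items)

    def isEmpty(self):
        return self.size() == 0

    def isFull(self):
        return self.size() >= self._capacity
```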

2. Stack ADT

View of stack

• In Stack ADT Implementation instead of data being stored in each node, the pointer to data is stored.
• The program allocates memory for the data and address is passed to the stack ADT.
• The head node and the data nodes are encapsulated in the ADT. The calling function can only see the
pointer to the stack.
• The stack head structure also contains a pointer to top and count of number of entries currently in
stack.
• push() – Insert an element at one end of the stack called top.
• pop() – Remove and return the element at the top of the stack, if it is not empty.
• peek() – Return the element at the top of the stack without removing it, if the stack is not empty.
• size() – Return the number of elements in the stack.
• isEmpty() – Return true if the stack is empty, otherwise return false.
• isFull() – Return true if the stack is full, otherwise return false.

3. Queue ADT
View of Queue

• The queue abstract data type (ADT) follows the basic design of the stack abstract data type.
• Each node contains a void pointer to the data and the link pointer to the next element in the queue. The
program’s responsibility is to allocate memory for storing the data.
• enqueue() – Insert an element at the end of the queue.
• dequeue() – Remove and return the first element of the queue, if the queue is not empty.
• peek() – Return the element of the queue without removing it, if the queue is not empty.
• size() – Return the number of elements in the queue.
• isEmpty() – Return true if the queue is empty, otherwise return false.
• isFull() – Return true if the queue is full, otherwise return false.

Some of the key features of ADTs include:


• Abstraction: The user does not need to know the implementation of the data structure; only the
essentials are provided.
• Better Conceptualization: ADT gives us a better conceptualization of the real world.
• Robust: The program is robust and has the ability to catch errors.
• Encapsulation: ADTs hide the internal details of the data and provide a public interface for users to
interact with the data. This allows for easier maintenance and modification of the data structure.
• Data Abstraction: ADTs provide a level of abstraction from the implementation details of the data.
Users only need to know the operations that can be performed on the data, not how those operations
are implemented.
• Data Structure Independence: ADTs can be implemented using different data structures, such as
arrays or linked lists, without affecting the functionality of the ADT.
• Information Hiding: ADTs can protect the integrity of the data by allowing access only to authorized
users and operations. This helps prevent errors and misuse of the data.
• Modularity: ADTs can be combined with other ADTs to form larger, more complex data structures.
This allows for greater flexibility and modularity in programming.

Advantages:

• Encapsulation: ADTs provide a way to encapsulate data and operations into a single unit, making it
easier to manage and modify the data structure.
• Abstraction: ADTs allow users to work with data structures without having to know the implementation
details, which can simplify programming and reduce errors.
• Data Structure Independence: ADTs can be implemented using different data structures, which can make
it easier to adapt to changing needs and requirements.
• Information Hiding: ADTs can protect the integrity of data by controlling access and preventing
unauthorized modifications.
• Modularity: ADTs can be combined with other ADTs to form more complex data structures, which can
increase flexibility and modularity in programming.

Disadvantages:

• Overhead: Implementing ADTs can add overhead in terms of memory and processing, which can affect
performance.
• Complexity: ADTs can be complex to implement, especially for large and complex data structures.
• Learning Curve: Using ADTs requires knowledge of their implementation and usage, which can take time
and effort to learn.
• Limited Flexibility: Some ADTs may be limited in their functionality or may not be suitable for all
types of data structures.
• Cost: Implementing ADTs may require additional resources and investment, which can increase the cost
of development.
Iterators in Python
An iterator in Python is an object that is used to iterate over iterable objects like lists, tuples, dicts, and
sets. The Python iterators object is initialized using the iter() method. It uses the next() method for
iteration.
1. iter(): The iter() method is called for the initialization of an iterator. It returns an iterator
object.
2. next(): The next() method returns the next value from the iterable. When we use a for loop to traverse
any iterable object, it internally uses the iter() method to get an iterator object, which further uses the
next() method to iterate over the values. This method raises StopIteration to signal the end of the iteration.
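
For example, the sketch below drives an iterator by hand, mirroring what a for loop does internally:

```python
nums = [10, 20, 30]

it = iter(nums)      # iter() initializes and returns an iterator object
print(next(it))      # 10
print(next(it))      # 20
print(next(it))      # 30
# One more next(it) would raise StopIteration, signalling the end

# A for loop performs the same steps internally:
for n in nums:
    print(n)
```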

Array Representation

Arrays are represented as a collection of multiple containers where each container stores one element. These containers are
indexed from 0 to n-1, where n is the size of that particular array.

Arrays can be declared in various ways in different languages. As an illustration, consider an array of 10 elements; the following are the important points to be considered −

• Index starts with 0.
• Array length is 10, which means it can store 10 elements.
• Each element can be accessed via its index; for example, the element at index 6 can be fetched directly.
Creating Array in Python

To create an array in Python, import the array module and use its array() function. We can create arrays
of three basic kinds of elements, namely integers, floating-point numbers, and Unicode characters, using
this function.

The array() function accepts a typecode and an optional initializer as parameters and returns an object of
the array class.
Syntax

The syntax for creating an array in Python is −

# importing
import array as array_name

# creating array
obj = array_name.array(typecode[, initializer])

Where,

• typecode − The typecode character used to specify the type of elements in the array.
• initializer − An optional value from which the array is initialized. It must be a list, a bytes-like
object, or an iterable of elements of the appropriate type.

typecode Python data type Byte size


'b' signed integer 1
'B' unsigned integer 1
'u' Unicode character 2
'h' signed integer 2
'H' unsigned integer 2
'i' signed integer 2
'I' unsigned integer 2
'l' signed integer 4
'L' unsigned integer 4
'q' signed integer 8
'Q' unsigned integer 8
'f' floating point 4
'd' floating point 8
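
For example, the calls below create an integer array and a floating-point array using typecodes from the table (the values are arbitrary):

```python
import array

int_arr = array.array('i', [1, 2, 3, 4])   # signed integer elements
flt_arr = array.array('f', [1.5, 2.5])     # floating point elements

print(int_arr[2])        # 3
print(flt_arr.typecode)  # f
```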

Python | Using 2D arrays/lists the right way

• Python provides powerful data structures called lists, which can store and manipulate collections of
elements. It also provides many ways to create 2-dimensional lists/arrays.

• However, one must know the differences between these ways, because they can create complications
in code that can be very difficult to trace. In this article, we will explore the right way to use 2D
arrays/lists in Python.

• Using 2D arrays/lists the right way involves understanding the structure, accessing elements, and
efficiently manipulating data in a two-dimensional grid. When working with structured data or grids,
2D arrays or lists can be useful. A 2D array is essentially a list of lists, which represents a table-like
structure with rows and columns.
Creating a 1D list
• In Python, initializing a collection of elements in a linear sequence requires creating a 1D array, which
is a fundamental process.
• Although Python does not have a built-in data structure called a ‘1D array’, we can use a list that can
achieve the same functionality.
• Python lists are dynamic and versatile, making them an excellent choice for representing 1D arrays.
Let’s start by looking at common ways of creating a 1D array of size N initialized with 0s.
Creating 1D List using Naive Methods
Manually initializing and populating a list without using any advanced features or constructs in Python is
known as creating a 1D list using “naive methods”, as sketched below.
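
A minimal sketch of the naive approach, assuming a desired size N of 5:

```python
N = 5

# Naive method: append the initial value one element at a time
arr = []
for _ in range(N):
    arr.append(0)
print(arr)        # [0, 0, 0, 0, 0]

# The idiomatic shorthand achieves the same result
arr2 = [0] * N
print(arr2)       # [0, 0, 0, 0, 0]
```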
Two-Dimensional Arrays

A two-dimensional array, often referred to as a matrix, is a data structure that consists of a
collection of elements arranged in rows and columns. It can be visualized as a grid, where each
element is accessed using two indices: one for the row and one for the column.

Characteristics:

• Structure: Defined by dimensions m × n, where m is the number of rows and n is the number of columns.
• Indexing: Elements are accessed using two indices, typically starting from 0 (e.g.,
array[row][column]).
• Homogeneous Data: Typically stores elements of the same data type.

Example in Python:

```python
# Creating a 3x3 matrix
matrix = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
]

# Accessing elements
print(matrix[0][1])  # Output: 2
print(matrix[2][0])  # Output: 7

# Modifying elements
matrix[1][1] = 10
print(matrix)
```

Matrix Abstract Data Type (MAT)

The Matrix Abstract Data Type (MAT) defines a set of operations that can be performed on a
matrix, abstracting away the underlying representation. This allows for manipulation of matrices
without needing to manage their internal storage details.

Operations Typically Included:

1. Creation: Initialize a matrix of specified dimensions.


2. Access: Retrieve the value at a specific row and column.
3. Modification: Set the value at a specific row and column.
4. Addition: Add two matrices of the same dimensions.
5. Multiplication: Multiply two matrices (subject to dimensional constraints).
6. Transpose: Create a new matrix that is the transpose of the original.
7. Display: Print the matrix in a readable format.
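
A minimal Python sketch of such a Matrix ADT (the class and method names are assumptions chosen to mirror the operations above; multiplication is omitted for brevity):

```python
class Matrix:
    """A minimal Matrix ADT sketch over a list of lists."""

    def __init__(self, rows, cols):                       # Creation
        self.rows, self.cols = rows, cols
        self._data = [[0] * cols for _ in range(rows)]

    def get(self, r, c):                                  # Access
        return self._data[r][c]

    def set(self, r, c, value):                           # Modification
        self._data[r][c] = value

    def add(self, other):                                 # Addition
        if (self.rows, self.cols) != (other.rows, other.cols):
            raise ValueError("dimension mismatch")
        result = Matrix(self.rows, self.cols)
        for r in range(self.rows):
            for c in range(self.cols):
                result.set(r, c, self.get(r, c) + other.get(r, c))
        return result

    def transpose(self):                                  # Transpose
        result = Matrix(self.cols, self.rows)
        for r in range(self.rows):
            for c in range(self.cols):
                result.set(c, r, self.get(r, c))
        return result

    def display(self):                                    # Display
        for row in self._data:
            print(row)
```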

Next, let's look at sets, maps, and multi-dimensional arrays, including their definitions,
characteristics, and examples.

Sets

A set is an abstract data type that represents a collection of unique elements. Sets are often used
to perform operations such as union, intersection, and difference.

# Characteristics:

Unique Elements: No duplicates allowed.

Unordered: The order of elements is not significant.

Dynamic Size: The size of a set can change as elements are added or removed.

# Common Operations:

1. Add: Insert an element into the set.

2. Remove: Delete an element from the set.


3. Contains: Check if an element exists in the set.

4. Union: Combine two sets.

5. Intersection: Get common elements from two sets.

6. Difference: Get elements that are in one set but not in another.

# Example in Python:

```python

# Using a set in Python

my_set = {1, 2, 3}

my_set.add(4) # Add an element

my_set.remove(2) # Remove an element

print(my_set) # Output: {1, 3, 4}

print(3 in my_set) # Check membership: Output: True

# Set operations

set_a = {1, 2, 3}

set_b = {3, 4, 5}

union_set = set_a.union(set_b) # {1, 2, 3, 4, 5}

intersection_set = set_a.intersection(set_b) # {3}

difference_set = set_a.difference(set_b) # {1, 2}

```

Maps
A map (also known as a dictionary or associative array) is an abstract data type that consists of
key-value pairs. It allows for fast retrieval of values based on their keys.

# Characteristics:

Key-Value Pairs: Each key maps to a specific value.

Unique Keys: Keys must be unique; duplicate keys are not allowed.

Dynamic Size: The map can grow or shrink as needed.

# Common Operations:

1. Insert: Add a key-value pair to the map.

2. Remove: Delete a key-value pair by key.

3. Get: Retrieve a value associated with a specific key.

4. Contains: Check if a key exists in the map.

# Example in Python:

```python
# Using a map (dictionary) in Python
my_map = {'a': 1, 'b': 2, 'c': 3}

my_map['d'] = 4  # Add a key-value pair

print(my_map['b'])  # Output: 2

del my_map['a']  # Remove a key-value pair

print('c' in my_map)  # Check if key exists: Output: True

# Iterating through a map
for key, value in my_map.items():
    print(f"{key}: {value}")
```

Multi-Dimensional Arrays
A multi-dimensional array is an array that contains arrays as its elements, allowing for the
representation of data in more than one dimension. The most common example is a
two-dimensional array (matrix).

# Characteristics:

Rectangular Shape: Each dimension typically has a fixed size.

Access via Multiple Indices: Elements are accessed using multiple indices (e.g., `array[i][j]` for
a 2D array).

Homogeneous Data: All elements are usually of the same type.

# Common Use Cases:

Representing matrices and grids.

Storing multi-dimensional data, like images (height, width, color channels).

Example in Python:

```python
# Creating a 2D array (matrix)
matrix = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
]

# Accessing elements
print(matrix[1][2])  # Output: 6 (element at second row, third column)

# Modifying elements
matrix[0][1] = 10
print(matrix)

# Iterating through a 2D array
for row in matrix:
    for value in row:
        print(value, end=' ')
    print()  # New line after each row
```

Unit – 2
Algorithm Analysis

The efficiency of an algorithm can be analyzed at two different stages, before implementation and after
implementation, as follows −

• A Priori Analysis − This is a theoretical analysis of an algorithm. Efficiency of an algorithm is
measured by assuming that all other factors, for example, processor speed, are constant and
have no effect on the implementation.
• A Posteriori Analysis − This is an empirical analysis of an algorithm. The selected algorithm is
implemented using a programming language and executed on a target computer machine.
In this analysis, actual statistics like running time and space required are collected.

Algorithm Complexity

Suppose X is an algorithm and n is the size of the input data. The time and space used by algorithm X
are the two main factors which decide the efficiency of X.

• Time Factor − Time is measured by counting the number of key operations such as
comparisons in the sorting algorithm.
• Space Factor − Space is measured by counting the maximum memory space required by the
algorithm.

The complexity of an algorithm f(n) gives the running time and/or the storage space required by the
algorithm in terms of n as the size of input data.

Space Complexity

Space complexity of an algorithm represents the amount of memory space required by the
algorithm in its life cycle. The space required by an algorithm is equal to the sum of the
following two components −

• A fixed part is the space required to store certain data and variables that are independent of
the size of the problem. For example, simple variables and constants used, program size, etc.
• A variable part is the space required by variables whose size depends on the size of the problem.
For example, dynamic memory allocation, recursion stack space, etc.

Space complexity S(P) of any algorithm P is S(P) = C + S_P(I), where C is the fixed part and S_P(I) is
the variable part of the algorithm, which depends on instance characteristic I. Following is a
simple example that tries to explain the concept −

Algorithm: SUM(A, B)
Step 1 − START

Step 2 − C ← A + B + 10

Step 3 − Stop

Here we have three variables A, B, and C and one constant, so S(P) = 1 + 3 = 4. The actual space further
depends on the data types of the given variables and constants, and the total is multiplied
accordingly.

Time Complexity

Time complexity of an algorithm represents the amount of time required by the algorithm to run to
completion. Time requirements can be defined as a numerical function T(n), where T(n) can be
measured as the number of steps, provided each step consumes constant time.

For example, the addition of two n-bit integers takes n steps. Consequently, the total computational time
is T(n) = c ∗ n, where c is the time taken for the addition of two bits. Here, we observe that T(n)
grows linearly as the input size increases.

Asymptotic Analysis
The efficiency and accuracy of algorithms have to be analyzed to compare them and choose a
specific algorithm for certain scenarios. The process of making this analysis is called
asymptotic analysis. It refers to computing the running time of any operation in mathematical
units of computation. The running time of an algorithm is usually classified into three cases:

• Best Case − Minimum time required for program execution.


• Average Case − Average time required for program execution.
• Worst Case − Maximum time required for program execution.
Asymptotic Notations

The commonly used asymptotic notations to calculate the running time complexity of an algorithm.

• Ο Notation
• Ω Notation
• θ Notation
Big Oh Notation, Ο

The notation Ο(n) is the formal way to express the upper bound of an algorithm's running time. It
measures the worst-case time complexity, or the longest amount of time an algorithm can
possibly take to complete.

For example, for a function f(n),

Ο(f(n)) = { g(n) : there exist constants c > 0 and n0 such that g(n) ≤ c·f(n) for all n > n0 }

Omega Notation, Ω

The notation Ω(n) is the formal way to express the lower bound of an algorithm's running time. It
measures the best-case time complexity, or the minimum amount of time an algorithm can possibly
take to complete.

For example, for a function f(n),

Ω(f(n)) = { g(n) : there exist constants c > 0 and n0 such that g(n) ≥ c·f(n) for all n > n0 }
Theta Notation, θ

The notation θ(n) is the formal way to express both the lower bound and the upper bound of an
algorithm's running time. It is represented as follows −

θ(f(n)) = { g(n) : g(n) = Ο(f(n)) and g(n) = Ω(f(n)) }

Common Asymptotic Notations

A list of some common asymptotic notations is mentioned below −

constant − Ο(1)

logarithmic − Ο(log n)

linear − Ο(n)

n log n − Ο(n log n)

quadratic − Ο(n²)

cubic − Ο(n³)

polynomial − n^Ο(1)

exponential − 2^Ο(n)
Algorithm Classes
Algorithms are unambiguous steps which should give us a well-defined output by processing zero or
more inputs. This leads to many approaches in designing and writing algorithms. It has been
observed that most algorithms can be classified into the following categories.

Greedy Algorithms

Greedy algorithms try to find a localized optimum solution, which may eventually lead to a globally
optimized solution. Generally, however, greedy algorithms do not guarantee globally optimized solutions.

Greedy algorithms look for an easy solution at that point in time, without considering how it impacts
future steps. This is similar to how humans solve problems without going through the complete
details of the inputs provided.

Many networking algorithms use the greedy approach. Here is a list of a few of them −

• Travelling Salesman Problem


• Prim's Minimal Spanning Tree Algorithm
• Kruskal's Minimal Spanning Tree Algorithm
• Dijkstra's Shortest Path Algorithm

Divide and Conquer

This class of algorithms involves dividing the given problem into smaller sub-problems and then
solving each sub-problem independently. When the problem cannot be sub-divided further, we start
merging the solutions of the sub-problems to arrive at the solution for the bigger problem.

When we keep on dividing the sub-problems into even smaller sub-problems, we may eventually
reach a stage where no more division is possible. Those "atomic" smallest possible sub-problems
(fractions) are solved, and the solutions of all sub-problems are finally merged in order to obtain
the solution of the original problem.

Broadly, we can understand the divide-and-conquer approach as a three-step process.

Divide/Break

This step involves breaking the problem into smaller sub-problems. Sub-problems should represent a
part of the original problem. This step generally takes a recursive approach to divide the
problem until no sub-problem is further divisible. At this stage, sub-problems become atomic in
nature but still represent some part of the actual problem.

Conquer/Solve

This step receives a lot of smaller sub-problems to be solved. Generally, at this level, the problems
are considered 'solved' on their own.

Merge/Combine

When the smaller sub-problems are solved, this stage recursively combines their solutions until they
formulate a solution of the original problem. This algorithmic approach works recursively, and the
conquer and merge steps work so closely together that they appear as one.

Examples

The following program is an example of the divide-and-conquer programming approach, where
binary search is implemented using Python.

Binary Search implementation
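
A minimal recursive sketch, assuming a sorted input list:

```python
def binary_search(arr, target, low, high):
    """Divide and conquer: halve the search range on each call."""
    if low > high:
        return -1                                        # target not present
    mid = (low + high) // 2
    if arr[mid] == target:
        return mid
    if target < arr[mid]:
        return binary_search(arr, target, low, mid - 1)  # search left half
    return binary_search(arr, target, mid + 1, high)     # search right half

data = [2, 5, 8, 12, 16, 23, 38]
print(binary_search(data, 23, 0, len(data) - 1))         # 5
```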

Divide and conquer algorithms are −

• Merge Sort
• Quick Sort
• Kruskal's Minimal Spanning Tree Algorithm
• Binary Search

Dynamic Programming

Dynamic programming involves dividing the bigger problem into smaller ones but, unlike divide
and conquer, it does not solve each sub-problem independently; the results of overlapping
sub-problems are stored and reused, as the sketch after the following list illustrates.

These algorithms are used for optimization. Dynamic algorithms are motivated by an overall
optimization of the problem and not the local optimization.

Dynamic programming algorithms are −

• Fibonacci number series


• Knapsack problem
• Tower of Hanoi
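
As an illustration, here is a minimal memoized Fibonacci sketch: each subproblem is solved once and its result stored and reused, rather than recomputed independently.

```python
from functools import lru_cache

@lru_cache(maxsize=None)            # cache each subproblem's answer
def fib(n):
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)  # reuses cached overlapping results

print([fib(i) for i in range(10)])  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```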

Recursion
Recursion allows a function to call itself. Fixed steps of code get executed again and again for new
values.

What is a Recursive Algorithm?

A recursive algorithm calls itself with smaller input values and returns the result for the current
input by carrying out basic operations on the returned value for the smaller input. Generally, if a
problem can be solved by applying solutions to smaller versions of the same problem, and the
smaller versions shrink to readily solvable instances, then the problem can be solved using a
recursive algorithm.

To build a recursive algorithm, you will break the given problem statement into two parts. The first
one is the base case, and the second one is the recursive step.

• Base Case: It is nothing more than the simplest instance of a problem, consisting of a condition
that terminates the recursive function. This base case evaluates the result when a given
condition is met.

• Recursive Step: It computes the result by making recursive calls to the same function, but with
the inputs decreased in size or complexity.

For example, consider this problem statement: print the sum of the first n natural numbers using recursion.
This statement clarifies that we need to formulate a function that will calculate the summation of all
natural numbers in the range 1 to n. Hence, mathematically you can represent the function as:

F(n) = 1 + 2 + 3 + 4 + … + (n-2) + (n-1) + n

It can further be simplified as F(n) = n + F(n-1).

You can break this function down into two parts as follows:

F(n) = 1             if n = 1 (base case)
F(n) = n + F(n-1)    if n > 1 (recursive step)
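
A minimal Python sketch of this breakdown:

```python
def natural_sum(n):
    if n == 1:                     # base case: simplest instance
        return 1
    return n + natural_sum(n - 1)  # recursive step: smaller input

print(natural_sum(5))              # 15 = 1 + 2 + 3 + 4 + 5
```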

Different Types of Recursion

There are four different types of recursive algorithms, you will look at them one by one.

• Direct Recursion

A function is called direct recursive if it calls itself in its function body repeatedly. To better
understand this definition, look at the structure of the direct recursive program sketched below.

In this program, you have a method named fun that calls itself again in its function body. Thus, you
can say that it is direct recursive.
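
A minimal sketch (the countdown body is an assumption for illustration):

```python
def fun(n):
    if n <= 0:
        return
    print(n)
    fun(n - 1)   # fun calls itself directly in its own body

fun(3)           # prints 3, 2, 1
```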

• Indirect Recursion
The recursion in which a function calls itself via another function is called indirect recursion.
Now, look at the indirect recursive program structure sketched below.

In this example, you can see that the function fun1 explicitly calls fun2, which invokes fun1
again. Hence, you can say that this is an example of indirect recursion.
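
A minimal sketch (the bodies are assumptions for illustration):

```python
def fun1(n):
    if n <= 0:
        return
    print("fun1:", n)
    fun2(n - 1)          # fun1 calls fun2 ...

def fun2(n):
    if n <= 0:
        return
    print("fun2:", n)
    fun1(n - 1)          # ... which invokes fun1 again

fun1(3)
```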

• Tailed Recursion

A recursive function is said to be tail-recursive if the recursive call is the last execution done by the
function. Let’s try to understand this definition with the help of the example sketched below.

If you observe this program, you can see that the last action method fun executes is the
recursive call. Because of that, there is no need to remember any previous state of the
program.
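
A minimal sketch (the body is an assumption for illustration):

```python
def fun(n):
    if n == 0:
        return
    print(n)
    return fun(n - 1)   # the recursive call is the very last action

fun(3)                  # prints 3, 2, 1
```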

• NonTailed Recursion

A recursive function is said to be non-tail recursive if the recursive call is not the last thing done by
the function; after returning, there is something left to evaluate. Now, consider the example
sketched below.

In this function, you can observe that there is another operation after the recursive call. Hence the
program will have to remember the previous state inside this method block. That is why this
program can be considered non-tail recursive.
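
A minimal sketch (the body is an assumption for illustration):

```python
def fun(n):
    if n == 0:
        return
    fun(n - 1)
    print(n)   # work remains after the recursive call returns

fun(3)         # prints 1, 2, 3; each frame is kept until its print runs
```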

Moving forward, let us look at how memory is allocated for recursive methods.

Memory Allocation of Recursive Method

Each recursive call generates a new copy of the function's frame on stack memory. Once the call
returns its result, that copy is deleted from storage. Each recursive call maintains a separate
stack frame because all parameters and other variables defined inside the function are kept on the stack.
The frame is deleted after the value from the corresponding call is returned.

Recursion is quite complicated in terms of resolving and monitoring the values at each recursive
call. As a result, you have to maintain the stack and track the values of the variables specified in
it. To better understand the memory allocation of recursive functions, examine the following
example.

Now, have a look at the recursive Fibonacci sketch below for n = 5. First, all stack frames are
preserved, each of which prints the matching value of n, until the termination condition is
reached; the frames are then destroyed one at a time by returning their values to the calling
frame.
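
A sketch matching that description (the print call is added only to expose the stack frames):

```python
def fib(n):
    print("frame created for n =", n)  # each frame reports its value of n
    if n == 0:
        return 0                       # termination: return 0 to the caller
    if n == 1:
        return 1
    return fib(n - 1) + fib(n - 2)     # two further frames per call

print("fib(5) =", fib(5))              # fib(5) = 5
```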
Analysis of Recursive Algorithms
A recursive function is a function that is defined in terms of itself. Similarly, an algorithm is
said to be recursive if the same algorithm is invoked in its body; an algorithm that calls
itself is direct recursive.
Recursive algorithms are extremely powerful, but they generally need more time and storage than
iterative algorithms (those consisting of iterative constructs like for, while, and repeat),
because they use the stack. In that respect recursive algorithms are worse than iterative
algorithms, but they are often clearer.
When analyzing a recursive function for its step count (analyzing its running time), we often
obtain a recursive formula. These recursive formulas are referred to as recurrence relations,
which are solved by the repeated (iterative) substitution method.

Ex. 6:

Find the summation of a one-dimensional array's elements. A recursive Rsum function for this
problem is sketched below.

The instance characteristic of this problem is n (the number of elements).
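
A minimal sketch, assuming Rsum takes the array and the count n:

```python
def Rsum(a, n):
    """Recursively sum the first n elements of list a."""
    if n <= 0:
        return 0                       # base case: size 0
    return Rsum(a, n - 1) + a[n - 1]   # one further call per element

print(Rsum([3, 1, 4, 1, 5], 5))        # 14
```

Each call shrinks n by one, so there are n + 1 calls in total (for sizes n, n-1, …, 0), matching the analysis below.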
Space complexity:

Total space complexity of a recursive function = (number of cells needed per call) × (number of calls).
Each call to the Rsum function needs two cells (one for the variable S and one for the
return address).

Number of calls (depth of recursion) = |size in the first call − size in the last call| + 1
= |n − 0| + 1 = n + 1.

Thus the space complexity = 2(n + 1) = 2n + 2, which is a linear formula.

This space is not constant; it depends on the instance characteristic n.

Time complexity (step count):

The time is formulated as a recurrence relation and then solved using iterative substitution.
Best, Worst and Average Cases

We can extricate ourselves from the difficulties that arise when the chosen parameters
are not adequate to determine the step count uniquely by defining three kinds of step counts: best
case, worst case, and average case.
1) The best-case step count is the minimum number of steps that can be executed for the given
parameters.
2) The worst-case step count is the maximum number of steps that can be executed for the given
parameters.
3) The average step count is the average number of steps executed on instances with the given
parameters.

In some instances, the above three cases are equivalent (there is only one formula) for the same value
of the instance characteristic, as in the array-summation example (where the instance characteristic is
n). But in other instances, the instance characteristic alone is not enough to determine the time
complexity; for example, when searching for an element in an array, the time complexity is affected by
the position of the required element in the array in addition to the instance characteristic.

UNIT – 3

STACK

A stack is a fundamental data structure in computer science that operates on a Last In, First Out
(LIFO) principle. This means that the most recently added item is the first one to be removed.
Here’s a detailed overview of stacks

Key Operations

1. Push: Adds an item to the top of the stack.

2. Pop: Removes and returns the item at the top of the stack.

3. Peek (or Top): Returns the item at the top of the stack without removing it.

4. IsEmpty: Checks if the stack is empty.

5. Size: Returns the number of items in the stack (optional in some implementations).

Characteristics
• LIFO Principle: The last element added is the first to be removed.
• Dynamic Size: Stacks can grow and shrink dynamically as elements are added or
removed (in implementations like linked lists or resizable arrays).
• Fixed Size: In some cases, stacks have a fixed size, and attempts to push an item onto a
full stack or pop from an empty stack might result in errors.

Implementations

1. Array Based Stack:

• Uses a fixed size array to hold elements.


• Simple to implement and access elements directly using indices.
• Potential issue: if the stack grows beyond its initial size, it requires resizing, which can be
inefficient.

2. Linked List Based Stack:

• Uses a linked list where each node points to the next.


• Dynamic in size and avoids issues related to fixed capacity.
• Involves more complex memory management and pointer manipulation.

Applications of Stacks:
• Function calls: Stacks are used to keep track of the return addresses of function calls,
allowing the program to return to the correct location after a function has finished
executing.
• Recursion: Stacks are used to store the local variables and return addresses of recursive
function calls, allowing the program to keep track of the current state of the recursion.
• Expression evaluation: Stacks are used to evaluate expressions in postfix notation
(Reverse Polish Notation).
• Syntax parsing: Stacks are used to check the validity of syntax in programming languages
and other formal languages.
• Memory management: Stacks are used to allocate and manage memory in some operating
systems and programming languages.
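
A minimal list-based sketch of these stack operations in Python:

```python
stack = []

# Push
stack.append('A')
stack.append('B')
stack.append('C')
print("Stack:", stack)         # Stack: ['A', 'B', 'C']

# Pop
print("Pop:", stack.pop())     # Pop: C

# Peek
print("Peek:", stack[-1])      # Peek: B

# isEmpty and Size
print("isEmpty:", not stack)   # isEmpty: False
print("Size:", len(stack))     # Size: 2
```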

Advantages of Stacks:
• Simplicity: Stacks are a simple and easy-to-understand data structure, making them suitable
for a wide range of applications.
• Efficiency: Push and pop operations on a stack can be performed in constant time (O(1)),
providing efficient access to data.
• Last-in, First-out (LIFO): Stacks follow the LIFO principle, ensuring that the last element
added to the stack is the first one removed. This behavior is useful in many scenarios, such
as function calls and expression evaluation.
• Limited memory usage: Stacks only need to store the elements that have been pushed
onto them, making them memory-efficient compared to other data structures.
Disadvantages of Stacks:
• Limited access: Elements in a stack can only be accessed from the top, making it difficult
to retrieve or modify elements in the middle of the stack.
• Potential for overflow: If more elements are pushed onto a stack than it can hold, an
overflow error will occur, resulting in a loss of data.
• Not suitable for random access: Stacks do not allow for random access to elements,
making them unsuitable for applications where elements need to be accessed in a specific
order.
• Limited capacity: Stacks have a fixed capacity, which can be a limitation if the number of
elements that need to be stored is unknown or highly variable.

QUEUE

A queue is a fundamental data structure that operates on a First In, First Out (FIFO) principle.
This means that the first item added to the queue is the first one to be removed. Here’s an
overview of queues:

Basic Operations on Queue:


• enqueue(): Inserts an element at the end of the queue, i.e. at the rear end.
• dequeue(): Removes and returns the element that is at the front end of the queue.
• front(): Returns the element at the front end without removing it.
• rear(): Returns the element at the rear end without removing it.
• isEmpty(): Indicates whether the queue is empty or not.
• isFull(): Indicates whether the queue is full or not.
• size(): Returns the size of the queue, i.e. the total number of elements it contains.
Characteristics

• FIFO Principle: The first element added is the first to be removed, making it suitable for
scenarios where order of processing is important.
• Dynamic Size: Queues can grow and shrink dynamically as elements are added or
removed (in implementations like linked lists or resizable arrays).
• Fixed Size: Some implementations use a fixed-size array, which can lead to overflow
issues if the queue grows beyond its capacity.

Implementations

1. Array Based Queue:

• Uses a fixed size array to store elements.


• Operations involve managing the front and rear pointers, which can lead to inefficiencies
if the array becomes full or needs resizing.
• Circular Queue: A variation where the end of the array wraps around to the beginning,
making efficient use of space.

2. Linked List Based Queue:

• Uses a linked list where each node points to the next.


• Avoids issues related to fixed capacity and allows dynamic resizing.
• Requires more memory for storing pointers and involves more complex memory
management.

3. Double-Ended Queue (Deque):

• Allows insertion and deletion of elements from both ends (front and rear).
• Can be implemented using arrays or linked lists and provides more flexibility compared
to standard queues.

Applications

1. Task Scheduling:

Operating systems and applications often use queues to manage tasks and processes,
scheduling them in the order they arrive.

2. Buffering:

Queues are used in buffering data streams, such as in IO buffering, where data is collected and
processed in the order it arrives.

3. BreadthFirst Search (BFS):

BFS algorithm uses a queue to explore nodes in a graph or tree level by level.

4. Print Spooling:

Print jobs are managed using a queue, ensuring that print requests are processed in the order
they are received.

queue = []

# Enqueue

queue.append('A')

queue.append('B')

queue.append('C')

print("Queue: ", queue)

# Dequeue

element = queue.pop(0)

print("Dequeue: ", element)

# Peek

frontElement = queue[0]

print("Peek: ", frontElement)

# isEmpty

isEmpty = not bool(queue)

print("isEmpty: ", isEmpty)

# Size

print("Size: ", len(queue))

output:

Queue: ['A', 'B', 'C']


Dequeue: A
Peek: B
isEmpty: False
Size: 2

DEQUE
A Deque (short for Double-Ended Queue) is a data structure that allows insertion and deletion of
elements from both ends, both the front and the rear. This makes it a versatile and powerful
structure for various applications. Here’s an overview of deques:

Key Operations

1. Add to Front: Inserts an element at the front of the deque.

2. Add to Rear: Inserts an element at the rear of the deque.

3. Remove from Front: Removes and returns the element at the front of the deque.

4. Remove from Rear: Removes and returns the element at the rear of the deque.

5. Peek Front: Returns the element at the front without removing it.

6. Peek Rear: Returns the element at the rear without removing it.

7. IsEmpty: Checks if the deque is empty.

8. Size: Returns the number of elements in the deque (optional in some implementations).

Characteristics

• Double Ended: Elements can be added or removed from both ends, making it more
flexible than a standard queue.
• Dynamic Size: Deques can grow and shrink dynamically, depending on the
implementation (e.g., linked lists or resizable arrays).
• Fixed Size: Some implementations use a fixed size array, which can lead to overflow
issues if the deque grows beyond its capacity.

Implementations

1. ArrayBased Deque:

Typically implemented with a circular array to efficiently manage the front and rear pointers.

Allows constanttime operations for adding and removing elements, but resizing may be
necessary if the array is full.

2. Linked ListBased Deque:

Uses a doubly linked list where each node points to both its previous and next node.

Allows efficient addition and removal of elements from both ends but requires extra space for
pointers and involves more complex memory management.
Applications

1. Algorithm Optimization:

Deques are used in algorithms like the sliding window maximum or minimum, where you
need to keep track of a subset of elements and efficiently manage them.

2. Scheduling:

Useful in task scheduling systems where tasks can be dynamically added or removed from
both ends of a queue.

3. Palindrome Checking:

Deques can efficiently check for palindromes by comparing characters from both ends towards
the center.

4. Undo Mechanisms:

In applications where you need to add and remove operations from both ends, such as
undo/redo functionalities.

Example Usage

Simple Deque Implementation in Python:

Python’s `collections` module provides a `deque` class, but here’s a basic implementation for
educational purposes:

```python

class Deque:

def __init__(self):

self.items = []

def is_empty(self):

return len(self.items) == 0

def add_front(self, item):


self.items.insert(0, item)

def add_rear(self, item):

self.items.append(item)

def remove_front(self):

if not self.is_empty():

return self.items.pop(0)

raise IndexError("Remove from an empty deque")

def remove_rear(self):

if not self.is_empty():

return self.items.pop()

raise IndexError("Remove from an empty deque")

def peek_front(self):

if not self.is_empty():

return self.items[0]

raise IndexError("Peek from an empty deque")

def peek_rear(self):

if not self.is_empty():

return self.items[-1]

raise IndexError("Peek from an empty deque")


def size(self):

return len(self.items)

```

LINKED LISTS

Linked List

o A linked list can be defined as a collection of objects called nodes that are randomly
stored in memory.
o A node contains two fields, i.e. the data stored at that particular address and a
pointer which contains the address of the next node in memory.
o The last node of the list contains a pointer to null.

Key Characteristics:

1. Directional: Each node has a single pointer to the next node, creating a oneway traversal from
the head to the end of the list.

2. Dynamic Size: Nodes can be added or removed without reallocating or reorganizing the entire
structure.

3. Efficient Operations: Insertion and deletion of nodes can be performed efficiently, but only if
you have a reference to the node immediately before the insertion or deletion point.

Basic Operations:

1. Traversal: To access elements, you start from the head and follow the `next` pointers until you
reach the desired node or the end of the list.
2. Insertion: To insert a node, you adjust the pointers of the surrounding nodes.

3. Deletion: To delete a node, you need to adjust the pointer of the preceding node to bypass the
node being removed.

Types of Linked list

The following are the types of linked list:

o Singly Linked list


o Doubly Linked list
o Circular Linked list
o Doubly Circular Linked list

Singly Linked list

A singly linked list is the simplest kind of linked list. It takes up less space in memory because
each node holds only one address, that of the next node.

Example in Python:

```python

class Node:

def __init__(self, data):

self.data = data

self.next = None

class SinglyLinkedList:

def __init__(self):

self.head = None

def append(self, data):

new_node = Node(data)
if not self.head:

self.head = new_node

return

last = self.head

while last.next:

last = last.next

last.next = new_node

def print_list(self):

current = self.head

while current:

print(current.data, end=" -> ")

current = current.next

print("None")

```

Circularly Linked Lists

Circularly Linked List is a variant where the last node points back to the first node, forming a
circular structure. This circular link allows for continuous traversal through the list without
encountering an end condition.

# Key Characteristics:

1. Circular Reference: The last node’s `next` pointer refers back to the head, creating a circular
chain of nodes.

2. Traversal: Can start from any node and continue indefinitely since there is no end node.

3. Insertion and Deletion: Insertion and deletion operations can be slightly more complex
compared to singly linked lists due to the circular nature.

# Types:
1. Circular Singly Linked List: Each node points to the next node, and the last node points to the
first node.

2. Circular Doubly Linked List: Each node points to both the next and the previous node, with
the last node pointing to the first and vice versa.

Example in Python:

```python

class CircularNode:

def __init__(self, data):

self.data = data

self.next = None

class CircularLinkedList:

def __init__(self):

self.head = None

def append(self, data):

new_node = CircularNode(data)

if not self.head:

self.head = new_node

new_node.next = new_node

else:

last = self.head

while last.next != self.head:

last = last.next

last.next = new_node

new_node.next = self.head
def print_list(self):

if not self.head:

print("List is empty")

return

current = self.head

while True:

print(current.data, end=" -> ")

current = current.next

if current == self.head:

break

print("(circular)")

```

Doubly Linked Lists

Doubly Linked List is a more complex variation where each node contains two pointers: one to

the next node and one to the previous node. This allows traversal in both directions.

# Key Characteristics:

1. Bidirectional Traversal: Nodes can be traversed in both forward and backward directions due
to the presence of two pointers.

2. Insertion and Deletion: More flexible compared to singly linked lists as you can easily access
both preceding and succeeding nodes.

3. Memory Usage: Requires more memory for storing two pointers per node.
# Basic Operations:

1. Traversal: Nodes can be traversed in both directions, which can be advantageous for certain
algorithms.

2. Insertion: Easier insertion as you have references to both the next and previous nodes.

3. Deletion: Simplified deletion since you can directly adjust pointers in both directions.

Example in Python:

```python

class DoublyNode:

def __init__(self, data):

self.data = data

self.next = None

self.prev = None

class DoublyLinkedList:

def __init__(self):

self.head = None

def append(self, data):

new_node = DoublyNode(data)

if not self.head:

self.head = new_node

return

last = self.head

while last.next:
last = last.next

last.next = new_node

new_node.prev = last

def print_list(self):

current = self.head

while current:

print(current.data, end=" <-> ")

current = current.next

print("None")

```

TREE

Trees Overview

A tree is a hierarchical data structure that consists of nodes connected by edges. It is used to
represent relationships with a hierarchical structure, where each node can have zero or more
children, but only one parent. Trees are fundamental in computer science for various
applications, including representing hierarchical data, organizing information, and supporting
efficient data retrieval and manipulation.

Key Terminology

1. Node: The basic unit of a tree, containing data and references to its children.

2. Root: The topmost node in a tree, which does not have a parent. Every tree has exactly one
root.

3. Parent: A node that has one or more child nodes.

4. Child: A node that is a descendant of another node (its parent).

5. Leaf: A node that has no children.

6. Internal Node: A node that has at least one child.


7. Subtree: A tree formed by a node and its descendants.

8. Depth: The length of the path from the root to a node. The root has a depth of 0.

9. Height: The length of the longest path from a node to a leaf. The height of a leaf is 0.

10. Level: The level of a node is the number of edges from the root to the node. The root is at
level 0.

Types of Trees

1. Binary Trees:

• Binary Tree: Each node has at most two children, commonly referred to as the left child
and the right child.
• Full Binary Tree: Every node other than the leaves has exactly two children.
• Complete Binary Tree: All levels are fully filled except possibly the last level, which is
filled from left to right.
• Perfect Binary Tree: All internal nodes have exactly two children, and all leaf nodes are
at the same level.
• Balanced Binary Tree: The height of the left and right subtrees of any node differs by at
most one.

2. Binary Search Trees (BST):

Binary Search Tree: A binary tree where each node’s left subtree contains only nodes with
values less than the node’s value, and the right subtree contains only nodes with values greater
than the node’s value.

3. AVL Trees:

AVL Tree: A self-balancing binary search tree where the difference in height between the left
and right subtrees of any node is at most one. Balancing is maintained through rotations during
insertions and deletions.

4. Red-Black Trees:


Red-Black Tree: A balanced binary search tree with an additional property where each node is
colored red or black, and it satisfies specific balancing rules, ensuring the tree remains balanced
and operations remain efficient.

5. B-Trees:

B-Tree: A self-balancing tree data structure that maintains sorted data and allows searches,
sequential access, insertions, and deletions in logarithmic time. Often used in databases and file
systems.

6. Trie (Prefix Tree):

Trie: A tree used for efficient retrieval of a key in a dataset of strings, where nodes represent
prefixes of the strings. It is particularly useful for tasks like autocomplete and spell-checking.

7. N-ary Trees:

N-ary Tree: A tree where each node can have up to N children. It is a generalization of binary
trees.

8. Heap:

Binary Heap: A complete binary tree used to implement priority queues. It satisfies the heap
property, where the key at the root is either the maximum (max-heap) or minimum (min-heap)
among all keys in the tree.

Basic Operations in tree

1. Traversal:

Inorder Traversal: Visit the left subtree, then the node, and then the right subtree. Commonly
used with binary search trees to retrieve values in sorted order.

Preorder Traversal: Visit the node first, then the left subtree, and finally the right subtree.
Useful for creating a copy of the tree.

Postorder Traversal: Visit the left subtree, then the right subtree, and finally the node. Useful
for deleting the tree or evaluating expressions.

Level Order Traversal: Visit nodes level by level from top to bottom and left to right.
2. Insertion:

Binary Trees: Insert a new node at the first available position in level order traversal.

Binary Search Trees: Insert nodes while maintaining the BST property.

3. Deletion:

Binary Trees: Remove a node and adjust the tree structure.

Binary Search Trees: Remove a node while preserving the BST property, which may involve
reorganizing the tree.

4. Searching:

Binary Search Trees: Search operations are efficient with O(log n) complexity on average due
to the tree’s sorted nature.

Applications

1. Hierarchical Data Representation: Trees are used to represent hierarchical relationships, such
as organizational structures and file systems.

2. Database Indexing: B-trees and B+ trees are used in database systems to maintain indices and
facilitate fast data retrieval.

3. Search Algorithms: Binary search trees and their variants are used in search operations to
maintain sorted data.

4. Memory Management: Trees are used in memory allocation and management, such as in the
buddy system for memory allocation.
BINARY TREE

Binary Trees

A Binary Tree is a foundational data structure in computer science where each node has at most
two children, typically referred to as the left child and the right child. This structure provides a
simple yet powerful way to organize data and supports various algorithms and operations. Here’s
an in-depth look at binary trees, including their properties, types, and common operations.

Key Properties

1. Nodes: Each node in a binary tree contains:

• Data: The value or information stored in the node.


• Left Pointer: A reference to the left child node.
• Right Pointer: A reference to the right child node.

2. Root: The top node of the tree. There is only one root in a binary tree.

3. Leaf: A node with no children (both left and right pointers are `None`).

4. Internal Node: A node with at least one child.

5. Height: The length of the longest path from a node to a leaf. The height of a leaf node is 0.

6. Depth: The length of the path from the root to a node. The root node has a depth of 0.

7. Level: The level of a node is defined as the number of edges from the root to the node. The
root node is at level 0.
Types of Binary Trees

1. Full Binary Tree:

• Every node has either 0 or 2 children. No node has exactly one child.

2. Complete Binary Tree:

• All levels are fully filled except possibly the last level, which is filled from left to right.

3. Perfect Binary Tree:

• All internal nodes have exactly two children, and all leaf nodes are at the same level. It
is a type of complete binary tree.

4. Balanced Binary Tree:

• The height of the left and right subtrees of any node differs by at most one. AVL trees
are a common example of balanced binary trees.

5. Degenerate (or Pathological) Tree:

• Each node has only one child, making the tree essentially a linked list. This occurs in
cases where the tree is not balanced.

6. Binary Search Tree (BST):

• A binary tree where, for each node, the left subtree contains only nodes with values less
than the node’s value, and the right subtree contains only nodes with values greater
than the node’s value.

7. Heap:
A complete binary tree used to implement priority queues. It follows the heap property, where
the key at the root is either the maximum (in a max-heap) or minimum (in a min-heap) among all
keys.

Basic Operations

1. Traversal:

• Inorder Traversal: Visit the left subtree, then the node, then the right subtree. This
results in nodes being visited in ascending order for a BST.
• Preorder Traversal: Visit the node first, then the left subtree, then the right subtree.
Useful for creating a copy of the tree.
• Postorder Traversal: Visit the left subtree, then the right subtree, then the node. Useful
for deleting or evaluating expressions.
• Level Order Traversal: Visit nodes level by level from top to bottom and left to right.
Typically implemented using a queue.

2. Insertion:

• Binary Trees: Insert a new node at the first available position in level-order traversal.
• Binary Search Trees: Insert nodes while maintaining the BST property, placing each node
in its correct position relative to its parent nodes.

3. Deletion:

• Binary Trees: Remove a node and adjust the tree structure accordingly.
• Binary Search Trees: Remove a node while maintaining the BST property, which may
involve reorganizing the tree. There are three cases to handle:
• Node to be deleted is a leaf node.
• Node to be deleted has one child.
• Node to be deleted has two children (requires finding the successor or predecessor).

4. Searching:
• Binary Search Trees: Search operations are efficient, with average time complexity of
O(log n) for balanced trees, due to the sorted nature of the tree.

Example in Python

Here is a basic implementation of a binary tree and its operations in Python:

class Node:

def __init__(self, data):

self.data = data

self.left = None

self.right = None

class BinaryTree:

def __init__(self):

self.root = None

def insert(self, data):

if not self.root:

self.root = Node(data)

else:

self._insert(self.root, data)

def _insert(self, current_node, data):

if data < current_node.data:


if current_node.left is None:

current_node.left = Node(data)

else:

self._insert(current_node.left, data)

else:

if current_node.right is None:

current_node.right = Node(data)

else:

self._insert(current_node.right, data)

def inorder_traversal(self):

elements = []

self._inorder(self.root, elements)

return elements

def _inorder(self, node, elements):

if node:

self._inorder(node.left, elements)

elements.append(node.data)

self._inorder(node.right, elements)

def preorder_traversal(self):

elements = []

self._preorder(self.root, elements)

return elements
def _preorder(self, node, elements):

if node:

elements.append(node.data)

self._preorder(node.left, elements)

self._preorder(node.right, elements)

def postorder_traversal(self):

elements = []

self._postorder(self.root, elements)

return elements

def _postorder(self, node, elements):

if node:

self._postorder(node.left, elements)

self._postorder(node.right, elements)

elements.append(node.data)

def level_order_traversal(self):

elements = []

if self.root is None:

return elements

queue = [self.root]

while queue:
node = queue.pop(0)

elements.append(node.data)

if node.left:

queue.append(node.left)

if node.right:

queue.append(node.right)

return element

Applications

1. Expression Parsing: Binary trees, particularly expression trees, are used to parse and evaluate
mathematical expressions.

2. Binary Search Trees: Used for efficient searching, insertion, and deletion operations in
dynamic datasets.

3. Heaps: Implemented for priority queues, heapsort, and efficient access to the maximum or
minimum element.

4. Decision Trees: Used in machine learning for classification and regression tasks.

IMPLEMENTING TREES

Implementing trees involves creating a data structure where each node can have a varying
number of children, depending on the type of tree. Below, we’ll cover the implementation of
several common types of trees: a basic binary tree, a binary search tree (BST), and an AVL tree
(a self-balancing BST). We'll use Python for these implementations.

1. Binary Tree

A Binary Tree is a tree where each node has at most two children. Here’s how to implement a
basic binary tree:
Basic Binary Tree Implementation

```python

class TreeNode:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

class BinaryTree:
    def __init__(self):
        self.root = None

    def insert(self, data):
        if not self.root:
            self.root = TreeNode(data)
        else:
            self._insert(self.root, data)

    def _insert(self, node, data):
        if data < node.data:
            if node.left is None:
                node.left = TreeNode(data)
            else:
                self._insert(node.left, data)
        else:
            if node.right is None:
                node.right = TreeNode(data)
            else:
                self._insert(node.right, data)

    def inorder_traversal(self):
        elements = []
        self._inorder(self.root, elements)
        return elements

    def _inorder(self, node, elements):
        if node:
            self._inorder(node.left, elements)
            elements.append(node.data)
            self._inorder(node.right, elements)

    def preorder_traversal(self):
        elements = []
        self._preorder(self.root, elements)
        return elements

    def _preorder(self, node, elements):
        if node:
            elements.append(node.data)
            self._preorder(node.left, elements)
            self._preorder(node.right, elements)

    def postorder_traversal(self):
        elements = []
        self._postorder(self.root, elements)
        return elements

    def _postorder(self, node, elements):
        if node:
            self._postorder(node.left, elements)
            self._postorder(node.right, elements)
            elements.append(node.data)

    def level_order_traversal(self):
        elements = []
        if self.root is None:
            return elements
        queue = [self.root]
        while queue:
            node = queue.pop(0)
            elements.append(node.data)
            if node.left:
                queue.append(node.left)
            if node.right:
                queue.append(node.right)
        return elements

```
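
A short usage sketch of the class above (the sample values are arbitrary):

```python
# Example usage
bt = BinaryTree()
for value in [8, 3, 10, 1, 6]:
    bt.insert(value)

print(bt.inorder_traversal())      # Output: [1, 3, 6, 8, 10]
print(bt.preorder_traversal())     # Output: [8, 3, 1, 6, 10]
print(bt.level_order_traversal())  # Output: [8, 3, 10, 1, 6]
```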

2. Binary Search Tree (BST)

A Binary Search Tree (BST) is a binary tree where each node's left subtree contains values less
than the node's value, and its right subtree contains values greater than the node's value.

Binary Search Tree Implementation

```python

class BSTNode:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

class BinarySearchTree:
    def __init__(self):
        self.root = None

    def insert(self, data):
        if not self.root:
            self.root = BSTNode(data)
        else:
            self._insert(self.root, data)

    def _insert(self, node, data):
        if data < node.data:
            if node.left is None:
                node.left = BSTNode(data)
            else:
                self._insert(node.left, data)
        else:
            if node.right is None:
                node.right = BSTNode(data)
            else:
                self._insert(node.right, data)

    def search(self, data):
        return self._search(self.root, data)

    def _search(self, node, data):
        if node is None:
            return False
        if data == node.data:
            return True
        elif data < node.data:
            return self._search(node.left, data)
        else:
            return self._search(node.right, data)

    def inorder_traversal(self):
        elements = []
        self._inorder(self.root, elements)
        return elements

    def _inorder(self, node, elements):
        if node:
            self._inorder(node.left, elements)
            elements.append(node.data)
            self._inorder(node.right, elements)

```
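
A short usage sketch (the sample values are arbitrary):

```python
# Example usage
bst = BinarySearchTree()
for value in [50, 30, 70, 20, 40]:
    bst.insert(value)

print(bst.search(40))           # Output: True
print(bst.search(99))           # Output: False
print(bst.inorder_traversal())  # Output: [20, 30, 40, 50, 70]
```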

3. AVL Tree

An AVL Tree is a self-balancing binary search tree where the difference in heights of the left and
right subtrees of any node is at most one. This ensures O(log n) time complexity for insertion,
deletion, and search operations.

AVL Tree Implementation

```python
class AVLNode:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None
        self.height = 1  # height of node

class AVLTree:
    def __init__(self):
        self.root = None

    def insert(self, data):
        self.root = self._insert(self.root, data)

    def _insert(self, node, data):
        if not node:
            return AVLNode(data)
        if data < node.data:
            node.left = self._insert(node.left, data)
        else:
            node.right = self._insert(node.right, data)

        node.height = 1 + max(self._get_height(node.left), self._get_height(node.right))
        balance = self._get_balance(node)

        # Left heavy
        if balance > 1:
            if data < node.left.data:
                return self._right_rotate(node)
            else:
                node.left = self._left_rotate(node.left)
                return self._right_rotate(node)

        # Right heavy
        if balance < -1:
            if data > node.right.data:
                return self._left_rotate(node)
            else:
                node.right = self._right_rotate(node.right)
                return self._left_rotate(node)

        return node

    def _left_rotate(self, z):
        y = z.right
        T2 = y.left
        y.left = z
        z.right = T2
        z.height = 1 + max(self._get_height(z.left), self._get_height(z.right))
        y.height = 1 + max(self._get_height(y.left), self._get_height(y.right))
        return y

    def _right_rotate(self, z):
        y = z.left
        T3 = y.right
        y.right = z
        z.left = T3
        z.height = 1 + max(self._get_height(z.left), self._get_height(z.right))
        y.height = 1 + max(self._get_height(y.left), self._get_height(y.right))
        return y

    def _get_height(self, node):
        if not node:
            return 0
        return node.height

    def _get_balance(self, node):
        if not node:
            return 0
        return self._get_height(node.left) - self._get_height(node.right)

    def inorder_traversal(self):
        elements = []
        self._inorder(self.root, elements)
        return elements

    def _inorder(self, node, elements):
        if node:
            self._inorder(node.left, elements)
            elements.append(node.data)
            self._inorder(node.right, elements)

```
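
A short usage sketch: inserting keys in ascending order, which would degenerate a plain BST into a linked list, still leaves the AVL tree balanced (the sample values are arbitrary):

```python
# Example usage: sorted insertions stay balanced
avl = AVLTree()
for value in [10, 20, 30, 40, 50]:
    avl.insert(value)

print(avl.inorder_traversal())  # Output: [10, 20, 30, 40, 50]
print(avl.root.data)            # Output: 20 (not 10, thanks to rotations)
```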

TREE TRAVERSAL ALGORITHMS

Tree Traversal Algorithms

Tree traversal refers to the process of visiting all the nodes in a tree in a specific order. Tree
traversal algorithms are fundamental for performing various operations on trees, such as
searching, sorting, and manipulating data. There are several methods for traversing trees, and the
choice of algorithm often depends on the specific requirements of the problem. The main
traversal algorithms for trees are:

1. Depth-First Traversal (DFT):


Inorder Traversal

Preorder Traversal

Postorder Traversal

2. Breadth-First Traversal (BFT):

Level Order Traversal

Each traversal method has different characteristics and use cases. Below is a detailed explanation
of each traversal algorithm.

1. Depth-First Traversal (DFT)

Depth-First Traversal explores as far as possible along each branch before backtracking. It uses a
stack-based approach, either explicitly or through recursion.

Inorder Traversal

Inorder traversal visits nodes in the following order: left subtree, node itself, then right subtree.
For a binary search tree (BST), this traversal yields nodes in ascending order.

Algorithm:

1. Traverse the left subtree.

2. Visit the node.

3. Traverse the right subtree.

Implementation in Python:
```python

def inorder_traversal(root):
    elements = []
    _inorder(root, elements)
    return elements

def _inorder(node, elements):
    if node:
        _inorder(node.left, elements)
        elements.append(node.data)
        _inorder(node.right, elements)

```

Preorder Traversal

Preorder traversal visits nodes in the following order: node itself, left subtree, then right subtree.
It is useful for creating a copy of the tree or for prefix notation.

Algorithm:

1. Visit the node.

2. Traverse the left subtree.

3. Traverse the right subtree.

Implementation in Python:
```python

def preorder_traversal(root):
    elements = []
    _preorder(root, elements)
    return elements

def _preorder(node, elements):
    if node:
        elements.append(node.data)
        _preorder(node.left, elements)
        _preorder(node.right, elements)

```

Postorder Traversal

Postorder traversal visits nodes in the following order: left subtree, right subtree, then the node
itself. It is useful for deleting the tree or for postfix notation.

Algorithm:

1. Traverse the left subtree.

2. Traverse the right subtree.

3. Visit the node.

Implementation in Python:
```python

def postorder_traversal(root):
    elements = []
    _postorder(root, elements)
    return elements

def _postorder(node, elements):
    if node:
        _postorder(node.left, elements)
        _postorder(node.right, elements)
        elements.append(node.data)

```

2. Breadth-First Traversal (BFT)

Breadth-First Traversal explores all nodes at the present depth level before moving on to nodes at
the next depth level. It uses a queue to keep track of nodes.

Level Order Traversal

Level order traversal visits nodes level by level from top to bottom and left to right. It is useful
for operations that require processing nodes in their level order, such as finding the shortest path
in an unweighted tree.

Algorithm:
1. Initialize a queue with the root node.

2. While the queue is not empty:

Dequeue a node and visit it.

Enqueue its left child (if it exists).

Enqueue its right child (if it exists).

Implementation in Python:

```python

from collections import deque

def level_order_traversal(root):
    if not root:
        return []
    elements = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        elements.append(node.data)
        if node.left:
            queue.append(node.left)
        if node.right:
            queue.append(node.right)
    return elements

```

Comparison of Traversal Methods

Inorder Traversal: Produces nodes in nondecreasing order for a BST. Useful for sorted output.

Preorder Traversal: Useful for copying the tree, prefix expression evaluation, or serialization.

Postorder Traversal: Useful for deletion of the tree or postfix expression evaluation.

Level Order Traversal: Useful for processing nodes level by level, such as finding the shortest
path in an unweighted tree.

Applications

1. Inorder Traversal: Used to retrieve sorted elements from a BST.

2. Preorder Traversal: Useful for generating a copy of the tree or prefix expression in expression
trees.

3. Postorder Traversal: Useful for safely deleting a tree or postfix expression in expression trees.

4. Level Order Traversal: Used in algorithms that need to process nodes level by level, such as in
finding the shortest path or managing hierarchical data.

UNIT – 4

PRIORITY QUEUE

Priority Queues

A Priority Queue is an abstract data type that supports efficient retrieval of the
element with the highest (or lowest) priority. Unlike standard queues, where elements are
processed in a first-in, first-out (FIFO) manner, priority queues allow elements to be retrieved
based on their priority levels rather than their order of insertion.

Key Concepts

1. Priority: Each element in a priority queue has a priority level. The element with the highest
priority is the one that gets dequeued first. In a max-priority queue, the element with the highest
value has the highest priority, while in a min-priority queue, the element with the lowest value has
the highest priority.

2. Insertion: Elements are added to the priority queue with an associated priority level. The
priority queue maintains the internal order based on these priorities.

3. Deletion/Removal: The element with the highest priority is removed from the priority queue.
If the priority queue is implemented as a max-priority queue, the element with the maximum
priority is removed, and if it is a min-priority queue, the element with the minimum priority is
removed.

4. Peek: This operation allows you to view the element with the highest priority without
removing it from the queue.

Implementations

Priority queues can be implemented using several underlying data structures:

1. Binary Heap

2. Balanced Binary Search Tree (e.g., Red-Black Tree)

3. Fibonacci Heap
4. Pairing Heap

5. Ordered List

The most common implementation is the binary heap, which provides efficient operations for
priority queue functionalities.

1. Binary Heap

A binary heap is a complete binary tree that satisfies the heap property. There are two types
of binary heaps:

• Max-Heap: The key at the root node is the maximum among all keys in the heap. Every
parent node has a key greater than or equal to the keys of its children.
• Min-Heap: The key at the root node is the minimum among all keys in the heap. Every
parent node has a key less than or equal to the keys of its children.

Operations on Binary Heap:

• Insert: Add a new element to the heap while maintaining the heap property.
• Extract Max/Min: Remove the root element (maximum or minimum) and restore the heap
property.
• Peek: View the root element without removing it.

2. Balanced Binary Search Tree

A balanced binary search tree, like a Red-Black Tree, maintains the sorted order of elements and
provides efficient insertion and deletion. It can also be used to implement a priority queue,
though it is less common compared to binary heaps.
Operations:

Insertion: Insert an element while maintaining the balanced tree property.

Deletion: Remove the root element and maintain the balanced tree property.

Peek: View the root element.

3. Fibonacci Heap

A Fibonacci Heap is a more advanced heap data structure that supports a collection of trees and
offers better amortized time complexities for operations compared to binary heaps.

4. Pairing Heap

A Pairing Heap is another type of heap with simpler implementation and good amortized time
complexities for insertions and deletions.

5. Ordered List

A priority queue can also be implemented using an ordered list, where elements are kept in a
sorted order based on priority.

Operations:

• Insert: Insert an element into the correct position in the list with O(n) time complexity.
• Extract Max/Min: Remove the first element in the list with O(1) time complexity.
• Peek: View the first element in the list with O(1) time complexity.

Priority Queue Abstract Data Type Implementing a Priority Queue

Priority Queue Abstract Data Type (ADT)

The Priority Queue is an abstract data type (ADT) that extends the functionality of a standard
queue by associating each element with a priority. In a priority queue, elements are dequeued
based on their priority rather than their insertion order. The basic idea is that elements with
higher priority are served before elements with lower priority.
Key Operations

A priority queue supports several key operations:

1. Insert (or Enqueue): Add an element with a given priority to the priority queue.

2. Remove (or Dequeue): Remove and return the element with the highest priority.

3. Peek (or Front): View the element with the highest priority without removing it from the
queue.

4. Is Empty: Check whether the priority queue is empty.

Abstract Data Type Definition

The priority queue ADT can be formally defined by the following operations:

1. Insert (item, priority):

• Description: Adds an element `item` with a specified `priority` to the priority queue.
• Complexity: The time complexity depends on the underlying implementation (e.g., O(log
n) for binary heaps).

2. Remove ():

• Description: Removes and returns the element with the highest priority. In a max-priority
queue, this would be the element with the largest priority value, while in a min-priority
queue, it would be the element with the smallest priority value.
• Complexity: The time complexity is typically O(log n) for binary heaps but can vary
based on the data structure used.

3. Peek ():

• Description: Returns the element with the highest priority without removing it from the
queue.
• Complexity: The time complexity is O(1) for binary heaps and other similar data
structures.

4. Is Empty():

• Description: Returns a boolean indicating whether the priority queue is empty.


• Complexity: The time complexity is O(1) for most implementations.
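
To make these operations concrete, here is a minimal sketch of the priority queue ADT built on Python's `heapq` module (a binary min-heap). The class name and the use of (priority, item) tuples are choices made for this example, not part of the ADT definition.

```python
import heapq

class PriorityQueue:
    """Min-priority queue: lower priority values are served first."""

    def __init__(self):
        self._heap = []

    def insert(self, item, priority):
        # heapq keeps the smallest tuple at index 0
        heapq.heappush(self._heap, (priority, item))

    def remove(self):
        # Removes and returns the item with the highest priority
        # (the smallest priority value in this min-oriented version)
        priority, item = heapq.heappop(self._heap)
        return item

    def peek(self):
        return self._heap[0][1]

    def is_empty(self):
        return len(self._heap) == 0

# Example usage
pq = PriorityQueue()
pq.insert("low", 5)
pq.insert("urgent", 1)
print(pq.remove())  # Output: urgent
```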

Priority Queue Implementations


The priority queue can be implemented using various data structures, each with its own strengths
and weaknesses. Common implementations include:

1. Binary Heap

2. Balanced Binary Search Tree (e.g., Red-Black Tree)

3. Fibonacci Heap

4. Pairing Heap

5. Ordered List

1. Binary Heap

A binary heap is a complete binary tree that maintains the heap property, where:

Max-Heap: Each parent node is greater than or equal to its child nodes.

Min-Heap: Each parent node is less than or equal to its child nodes.

• Time Complexities:
• Insert: O(log n)
• Remove: O(log n)
• Peek: O(1)

2. Balanced Binary Search Tree (BST)

A balanced BST like a Red-Black Tree or AVL Tree maintains sorted order and can be used for
implementing priority queues.

• Time Complexities:
• Insert: O(log n)
• Remove: O(log n)
• Peek: O(1)

3. Fibonacci Heap

A Fibonacci Heap is a more advanced data structure that provides better amortized time
complexities for several operations.

• Time Complexities:
• Insert: O(1) amortized
• Remove: O(log n) amortized
• Decrease Key: O(1) amortized

4. Pairing Heap

A Pairing Heap is a simpler alternative with good amortized time complexities for priority queue
operations.

• Time Complexities:
• Insert: O(1) amortized
• Remove: O(log n) amortized

5. Ordered List

An ordered list maintains elements in a sorted order, allowing for efficient access to the highest
(or lowest) priority element.

Time Complexities:

Insert: O(n)

Remove: O(1)

Peek: O(1)

Applications

Priority queues are widely used in various algorithms and systems, including:

• Scheduling: Operating systems use priority queues to manage tasks based on priority
levels.
• Dijkstra’s Algorithm: Finds the shortest path in a graph by repeatedly extracting the node
with the smallest distance.
• A* Search Algorithm: Used in path finding and graph traversal, where nodes are
prioritized based on a cost function.
• Huffman Coding: Data compression algorithms use priority queues to build optimal
prefix codes.

SORTING
Sorting with a Priority Queue

Sorting with a priority queue leverages the properties of a priority queue to sort elements
efficiently. This approach is particularly useful when dealing with large datasets or when a direct
sorting algorithm like quicksort or merge sort might be less practical. The most common sorting
algorithm that uses a priority queue is heapsort, but priority queues can also be used in other
sorting contexts.

Heapsort Algorithm

Heapsort is a comparison-based sorting algorithm that uses a binary heap data structure to sort
elements. It works by leveraging the heap property to repeatedly extract the maximum (or
minimum) element from the heap and build a sorted list. Heapsort is particularly efficient
because it sorts in place and has a time complexity of O(n log n).

Steps of Heapsort

1. Build a Max-Heap (or Min-Heap): Convert the input list into a max-heap (for ascending order)
or a min-heap (for descending order). This step ensures that the largest (or smallest) element is at
the root of the heap.

2. Extract Elements: Repeatedly extract the root element of the heap, which is the maximum (or
minimum) element, and place it at the end of the sorted section of the list. After each extraction,
restore the heap property by re-heapifying.

3. Repeat: Continue extracting the root element and re-heapifying until the heap is empty,
resulting in a sorted list.
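
The steps above can be realized with an in-place sift-down. Below is a minimal heapsort sketch; the helper name `_sift_down` is a choice made for this example.

```python
def heapsort(arr):
    n = len(arr)
    # Build a max-heap: sift down every internal node
    for i in range(n // 2 - 1, -1, -1):
        _sift_down(arr, i, n)
    # Repeatedly move the max to the end and shrink the heap
    for end in range(n - 1, 0, -1):
        arr[0], arr[end] = arr[end], arr[0]
        _sift_down(arr, 0, end)

def _sift_down(arr, root, size):
    # Restore the max-heap property for the subtree rooted at `root`
    while True:
        largest = root
        left, right = 2 * root + 1, 2 * root + 2
        if left < size and arr[left] > arr[largest]:
            largest = left
        if right < size and arr[right] > arr[largest]:
            largest = right
        if largest == root:
            return
        arr[root], arr[largest] = arr[largest], arr[root]
        root = largest

# Example usage
data = [12, 11, 13, 5, 6, 7]
heapsort(data)
print(data)  # Output: [5, 6, 7, 11, 12, 13]
```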

Priority Queue Based Sorting

More generally, any priority queue can be used to sort:

1. Build a Priority Queue: Insert all elements into a priority queue. If using a min-priority queue,
the smallest element will be at the front, and if using a max-priority queue, the largest element
will be at the front.

2. Extract Elements: Continuously extract elements from the priority queue and insert them into
a list. This list will be sorted in ascending order if a min-priority queue is used, or in descending
order if a max-priority queue is used.

Steps of Priority Queue Based Sorting

1. Insert All Elements: Insert each element of the input list into the priority queue. This operation
generally has a time complexity of O(n log n), where n is the number of elements.

2. Extract Elements: Remove elements from the priority queue one by one and place them into a
result list. Each extraction operation also has a time complexity of O(log n), but since there are n
extractions, the overall complexity is O(n log n).

Priority Queue-Based Sorting Implementation in Python

Here's a Python implementation of sorting using a priority queue:

```python

import heapq

def priority_queue_sort(iterable):
    # Create a min-heap (priority queue) from the iterable
    priority_queue = []
    for item in iterable:
        heapq.heappush(priority_queue, item)

    # Extract elements from the priority queue and build the sorted list
    sorted_list = []
    while priority_queue:
        sorted_list.append(heapq.heappop(priority_queue))
    return sorted_list

# Example usage

data = [4, 10, 3, 5, 1]

sorted_data = priority_queue_sort(data)

print(sorted_data) # Output: [1, 3, 4, 5, 10]

```

MAPS

Maps (Dictionaries)

A map, also known as a dictionary, associative array, or hash table, is an abstract data type that
associates keys with values. It allows for efficient retrieval, insertion, and deletion of key-value
pairs. Maps are widely used in programming due to their versatility and efficiency in handling
associative data.

Key Concepts

1. Key-Value Pair: Each entry in a map is a pair consisting of a unique key and an associated
value. The key is used to identify and access the value.

2. Uniqueness: Keys in a map are unique; no two entries can have the same key. Values,
however, can be duplicated.

3. Operations:

• Insert (or Put): Add a key-value pair to the map.
• Remove (or Delete): Remove a key-value pair by its key.
• Get (or Lookup): Retrieve the value associated with a given key.
• Contains Key: Check if a particular key exists in the map.
• Size: Determine the number of key-value pairs in the map.
• Iterate: Traverse through all key-value pairs in the map.

Implementations

Maps can be implemented using various underlying data structures, each with its own
performance characteristics:
1. Hash Table

2. Binary Search Tree (BST)

3. Balanced Trees (e.g., Red-Black Tree)

4. Trie

1. Hash Table

A hash table uses a hash function to compute an index into an array of buckets or slots, from
which the desired value can be found. This provides averagecase constant time complexity O(1)
for insertion, deletion, and retrieval operations.

Hash Table Characteristics:

• Hash Function: Maps keys to indices in the array.


• Collision Resolution: Techniques such as chaining (linked lists at each array index) or
open addressing (probing) handle hash collisions.

Example in Python:

Python's built-in `dict` type is implemented using a hash table.

```python

# Creating a map

my_map = {}

# Inserting key-value pairs

my_map['apple'] = 3

my_map['banana'] = 5
# Retrieving values

print(my_map['apple']) # Output: 3

# Checking if a key exists

print('banana' in my_map) # Output: True

# Removing a key-value pair

del my_map['apple']

# Iterating through the map

for key, value in my_map.items():

print(f'{key}: {value}')

```

2. Binary Search Tree (BST)

A BST is a binary tree where each node has at most two children, and for any node, the left
child’s key is less than the node’s key, and the right child’s key is greater.

BST Characteristics:

• Insert: O(log n) average-case time complexity if the tree is balanced.
• Search: O(log n) average-case time complexity.
• Deletion: O(log n) average-case time complexity.

3. Balanced Trees (e.g., Red-Black Tree)

Balanced trees like Red-Black Trees are BSTs with additional properties to ensure that the tree
remains balanced. This guarantees logarithmic time complexity for all basic operations.

Red-Black Tree Characteristics:

• Insert: O(log n) time complexity.


• Search: O(log n) time complexity.
• Delete: O(log n) time complexity.

4. Trie (Prefix Tree)

A Trie is a special type of tree used for storing associative data structures. It is often used for
dictionaries, where keys are strings.

Trie Characteristics:

• Insert: O(m) time complexity, where m is the length of the key.
• Search: O(m) time complexity.
• Prefix Search: Efficient for operations like autocomplete.
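
A minimal Trie sketch with insert, exact search, and prefix search; the class and method names are illustrative choices for this example.

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # maps a character to the child TrieNode
        self.is_end = False  # True if a key ends at this node

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, key):
        node = self.root
        for ch in key:
            node = node.children.setdefault(ch, TrieNode())
        node.is_end = True

    def search(self, key):
        node = self._walk(key)
        return node is not None and node.is_end

    def starts_with(self, prefix):
        return self._walk(prefix) is not None

    def _walk(self, s):
        # Follow the path for string s; return None if it falls off the trie
        node = self.root
        for ch in s:
            if ch not in node.children:
                return None
            node = node.children[ch]
        return node

# Example usage
t = Trie()
t.insert("apple")
print(t.search("apple"))     # Output: True
print(t.starts_with("app"))  # Output: True
print(t.search("app"))       # Output: False
```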

HASHTABLE

Hash Table

A hash table is a data structure that provides efficient methods for storing and retrieving data
using a hash function. It maps keys to values, enabling average-case constant time complexity,
O(1), for common operations such as insertion, deletion, and lookup. Hash tables are widely used
in various applications, including databases, caches, and associative arrays.

Key Concepts

1. Hash Function:

• A hash function is a function that takes an input (or key) and computes an index into an
array of buckets or slots. The goal is to distribute keys uniformly across the buckets to
minimize collisions.
• A good hash function should minimize collisions and distribute keys uniformly.

2. Buckets:

Buckets are storage slots in the hash table where data is stored. Each bucket corresponds to an
index computed by the hash function.
3. Collisions:

Collisions occur when two keys hash to the same index. Effective collision resolution
strategies are crucial for maintaining the efficiency of the hash table.

4. Collision Resolution Techniques:

• Chaining: Each bucket maintains a linked list of entries. When a collision occurs, the new
entry is added to the list at the corresponding bucket.
• Open Addressing: When a collision occurs, the hash table searches for the next available
slot according to a probing sequence (e.g., linear probing, quadratic probing).

5. Load Factor:

The load factor is defined as the ratio of the number of elements to the number of buckets. It
measures the density of the hash table. A high load factor may lead to increased collisions and
degraded performance, prompting the need for resizing.

6. Resizing:

To maintain efficiency, hash tables are resized (usually doubled) when the load factor exceeds
a certain threshold. Resizing involves rehashing all existing keys to the new array of buckets.

KEY OPERATIONS:

1. Insert (or Put):

• Description: Adds a key-value pair to the hash table.


• Time Complexity: O(1) on average, assuming a good hash function and effective
collision resolution.

2. Remove (or Delete):

• Description: Removes a key-value pair from the hash table.


• Time Complexity: O(1) on average, assuming a good hash function and effective
collision resolution.

3. Get (or Lookup):

• Description: Retrieves the value associated with a given key.


• Time Complexity: O(1) on average, assuming a good hash function and effective
collision resolution.

4. Contains Key:

• Description: Checks if a key exists in the hash table.


• Time Complexity: O(1) on average.

Hash Table Implementation

Here's a basic implementation of a hash table using chaining in Python:

```python

class HashTable:
    def __init__(self, size=10):
        self.size = size
        self.table = [[] for _ in range(size)]

    def _hash(self, key):
        # Simple hash function: use modulo operation
        return hash(key) % self.size

    def insert(self, key, value):
        index = self._hash(key)
        # Check if key already exists and update it
        for kv in self.table[index]:
            if kv[0] == key:
                kv[1] = value
                return
        # Otherwise, add a new key-value pair
        self.table[index].append([key, value])

    def remove(self, key):
        index = self._hash(key)
        for i, kv in enumerate(self.table[index]):
            if kv[0] == key:
                del self.table[index][i]
                return

    def get(self, key):
        index = self._hash(key)
        for kv in self.table[index]:
            if kv[0] == key:
                return kv[1]
        return None

    def contains_key(self, key):
        index = self._hash(key)
        for kv in self.table[index]:
            if kv[0] == key:
                return True
        return False

    def __str__(self):
        return str(self.table)

# Example usage
ht = HashTable()
ht.insert("apple", 3)
ht.insert("banana", 5)
print(ht.get("apple"))            # Output: 3
print(ht.contains_key("banana"))  # Output: True
ht.remove("apple")
print(ht.get("apple"))            # Output: None
print(ht)

```

Advanced Hash Table Concepts

1. Dynamic Resizing:

• Description: Hash tables often grow dynamically to accommodate more elements and
maintain efficiency. When the load factor exceeds a threshold, the hash table is resized
(usually doubled) and all existing keys are rehashed.
• Implementation: This involves creating a new, larger array and rehashing all entries from
the old array to the new one (a sketch follows this list).

2. Perfect Hashing:

• Description: A technique where the hash function is chosen such that there are no
collisions. This is often used in situations where the set of keys is known in advance
and remains static.

3. Cuckoo Hashing:

• Description: A hash table scheme where each key may be hashed to multiple possible
locations. When a collision occurs, the existing key is displaced (kicked out) and
reinserted into its alternative location.
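
As referenced in item 1 above, here is a minimal sketch of dynamic resizing layered on the chaining-based `HashTable` from the previous section; the 0.75 threshold and the doubling policy are assumptions made for illustration.

```python
class ResizableHashTable(HashTable):
    LOAD_FACTOR_THRESHOLD = 0.75  # assumed threshold for this sketch

    def __init__(self, size=10):
        super().__init__(size)
        self.count = 0  # number of stored key-value pairs

    def insert(self, key, value):
        if not self.contains_key(key):
            self.count += 1
        super().insert(key, value)
        if self.count / self.size > self.LOAD_FACTOR_THRESHOLD:
            self._resize()

    def _resize(self):
        # Double the bucket array and rehash every existing entry,
        # since each key's index depends on the table size.
        old_table = self.table
        self.size *= 2
        self.table = [[] for _ in range(self.size)]
        for bucket in old_table:
            for key, value in bucket:
                self.table[self._hash(key)].append([key, value])
```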

SKIP LISTS

Skip Lists
A skip list is a probabilistic data structure that is used to implement a sorted dictionary or
associative array. It provides an efficient way to perform dynamic set operations such as
insertion, deletion, and search. Skip lists are an alternative to balanced binary search trees,
offering similar performance with simpler implementation.

Key Concepts

1. Layered Structure:

• A skip list consists of multiple levels of linked lists. Each level is a sorted list that
contains a subset of the elements in the level below it.
• The bottom level (level 0) contains all the elements, while higher levels contain fewer
elements. Each element in a higher level has a probability of being promoted from the
level below.

2. Nodes:

• Each node in a skip list contains a key, a value, and multiple forward pointers. The
number of forward pointers depends on the level of the node.
• Nodes are connected in a sorted order at each level. The level of a node determines the
number of levels it appears in.

3. Probabilistic Balancing:

• Skip lists use randomness to maintain balance. When inserting a new element, it is
randomly promoted to higher levels with a certain probability. This ensures that the
skip list remains balanced with high probability.

4. Search, Insert, and Delete Operations:

• Search: Traverse the skip list from the highest level down to the lowest level, moving
right and down to find the desired element.
• Insert: Insert a new element at the bottom level and probabilistically promote it to
higher levels.
• Delete: Remove the element from all levels where it appears.

Operations

1. Search:
• Description: Locate a key in the skip list.
• Time Complexity: O(log n) on average.
• Procedure: Start at the top-left corner (the header node at the highest level) and move
right while the next key is less than the target, then drop down to the next level and
continue the process.

2. Insert:

• Description: Add a key-value pair to the skip list.
• Time Complexity: O(log n) on average.
• Procedure: Insert the new element at the bottom level, and with a certain probability,
insert it into higher levels.

3. Delete:

• Description: Remove a key-value pair from the skip list.
• Time Complexity: O(log n) on average.
• Procedure: Locate the element and remove it from all levels where it appears.
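
A minimal skip list sketch with search and insert (delete is omitted for brevity); the max level of 16 and promotion probability of 0.5 are assumptions for this example, and the class and method names are illustrative.

```python
import random

class SkipNode:
    def __init__(self, key, value, level):
        self.key = key
        self.value = value
        # One forward pointer per level this node participates in
        self.forward = [None] * (level + 1)

class SkipList:
    def __init__(self, max_level=16, p=0.5):
        self.max_level = max_level
        self.p = p
        self.level = 0
        # Header node spanning the maximum number of levels
        self.header = SkipNode(None, None, max_level)

    def _random_level(self):
        # Promote with probability p per level (probabilistic balancing)
        lvl = 0
        while random.random() < self.p and lvl < self.max_level:
            lvl += 1
        return lvl

    def search(self, key):
        node = self.header
        for i in range(self.level, -1, -1):
            while node.forward[i] and node.forward[i].key < key:
                node = node.forward[i]
        node = node.forward[0]
        if node and node.key == key:
            return node.value
        return None

    def insert(self, key, value):
        update = [None] * (self.max_level + 1)
        node = self.header
        # Record the rightmost node before the target at each level
        for i in range(self.level, -1, -1):
            while node.forward[i] and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        node = node.forward[0]
        if node and node.key == key:
            node.value = value  # key already present: update its value
            return
        lvl = self._random_level()
        if lvl > self.level:
            for i in range(self.level + 1, lvl + 1):
                update[i] = self.header
            self.level = lvl
        new_node = SkipNode(key, value, lvl)
        for i in range(lvl + 1):
            new_node.forward[i] = update[i].forward[i]
            update[i].forward[i] = new_node
```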

Advantages of Skip Lists

1. Simplicity:

Easier to implement compared to balanced trees like AVL or Red-Black Trees.

2. Efficient:

Provides O(log n) average case time complexity for search, insertion, and deletion operations.

3. Dynamic:

Supports dynamic changes (inserts and deletes) efficiently without the need for complex
rebalancing.

Disadvantages of Skip Lists

1. Probabilistic Guarantees:

Performance is probabilistic: while the average-case performance is good, worst-case behavior
is not strictly guaranteed.

2. Memory Overhead:

Requires additional space for multiple levels and forward pointers, which can increase
memory usage compared to simpler data structures.
SETS, MULTISETS AND MULTIMAPS

Sets, Multisets, and Multimaps

In computer science, sets, multisets, and multimaps are different types of abstract data types that
manage collections of elements with varying properties. Each has specific characteristics and use
cases that make them suitable for different scenarios.

Sets

A set is a collection of distinct elements. The primary operations associated with sets are
insertion, deletion, and membership testing. Sets do not allow duplicate elements and are
commonly used to handle collections of unique items.

Characteristics

1. Uniqueness:

Sets do not allow duplicate elements. Each element can appear only once in the set.

2. Operations:

• Insert: Add an element to the set.


• Remove: Remove an element from the set.
• Contains: Check if an element is present in the set.
• Union: Combine two sets to include all unique elements from both sets.
• Intersection: Find elements that are common to both sets.
• Difference: Find elements present in one set but not in the other.
• Subset: Determine if one set is a subset of another.
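
These operations map directly onto Python's built-in `set` type, as the brief sketch below shows (the sample values are arbitrary):

```python
# Python's built-in set supports the operations above directly
a = {1, 2, 3, 4}
b = {3, 4, 5}

a.add(6)            # insert
a.remove(6)         # remove
print(2 in a)       # contains     -> True
print(a | b)        # union        -> {1, 2, 3, 4, 5}
print(a & b)        # intersection -> {3, 4}
print(a - b)        # difference   -> {1, 2}
print({1, 2} <= a)  # subset       -> True
```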

Multisets

A multiset (also known as a bag) is a generalization of a set that allows duplicate elements.
Unlike sets, multisets can contain multiple occurrences of the same element.

Characteristics:

1. Duplicates Allowed:

Multisets can contain multiple instances of the same element. The number of occurrences of
each element is tracked.
2. Operations:

• Insert: Add an element to the multiset with a specified count.


• Remove: Remove occurrences of an element.
• Count: Retrieve the number of occurrences of an element.
• Union: Combine two multisets, summing up occurrences of each element.
• Intersection: Find the common elements, taking the minimum count from both multisets.
• Difference: Find elements present in one multiset but not in the other, considering the
counts.
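
Python has no built-in multiset, but `collections.Counter` behaves like one; a brief sketch (the sample values are arbitrary):

```python
from collections import Counter

# Counter acts as a multiset: it tracks the count of each element
m1 = Counter({'a': 2, 'b': 1})
m2 = Counter({'a': 1, 'c': 3})

m1['a'] += 1     # insert another occurrence of 'a'
print(m1['a'])   # count -> 3
print(m1 + m2)   # union (sums counts)        -> Counter({'a': 4, 'c': 3, 'b': 1})
print(m1 & m2)   # intersection (min counts)  -> Counter({'a': 1})
print(m1 - m2)   # difference (subtracts)     -> Counter({'a': 2, 'b': 1})
```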

Multimaps:

A multimap is a generalization of a map that allows multiple values for a single
key. Unlike standard maps, where each key is associated with a single value, multimaps allow
each key to be associated with a collection of values.

Characteristics

1. Multiple Values per Key:

Each key in a multimap can be associated with multiple values, usually stored in a collection
such as a list or set.

2. Operations:

• Insert: Add a key-value pair, allowing multiple values for the same key.
• Remove: Remove a specific value for a key or all values for a key.
• Get: Retrieve all values associated with a key.
• Contains Key: Check if a key exists in the multimap.
• Contains Value: Check if a value exists for a given key.

Example in Python

Python's `collections.defaultdict` combined with a list or set can be used to create a multimap.

```python

from collections import defaultdict

# Creating a multimap
mm = defaultdict(list)

# Adding key-value pairs
mm['a'].append(1)
mm['a'].append(2)
mm['b'].append(3)
mm['b'].append(4)

# Retrieving values
print(mm['a'])  # Output: [1, 2]
print(mm['b'])  # Output: [3, 4]

# Checking if a key exists
print('a' in mm)  # Output: True

# Removing values
mm['a'].remove(1)
print(mm['a'])  # Output: [2]

# Removing a key entirely
del mm['b']
print(mm)  # Output: defaultdict(<class 'list'>, {'a': [2]})

```
UNIT – 5

Search Trees

Binary search trees

A Binary Search Tree is a data structure used in computer science for organizing and storing
data in a sorted manner. Each node in a Binary Search Tree has at most two children,
a left child and a right child, with the left child containing values less than the parent node and
the right child containing values greater than the parent node. This hierarchical structure
allows for efficient searching, insertion, and deletion operations on the data stored in the tree.

Properties of Binary Search Tree:


• The left subtree of a node contains only nodes with keys less than the node’s key.
• The right subtree of a node contains only nodes with keys greater than the node’s key.
• The left and right subtrees must each also be a binary search tree.
• There must be no duplicate nodes (though some BST variants allow duplicate values with
special handling).
Basic Operations on Binary Search Tree:
1. Searching a node in BST:

Searching in a BST means locating a specific node in the data structure. In a binary search tree,
searching for a node is straightforward because the nodes are kept in a specific order. The steps
for searching a node in a binary search tree are as follows:

1. First, compare the element to be searched with the root element of the tree.
• If the root matches the target element, return the node’s location.
• If it does not match, check whether the item is less than the root element; if it is
smaller than the root element, move to the left subtree.
• If it is larger than the root element, move to the right subtree.
2. Repeat the above procedure recursively until a match is found.
3. If the element is not found or not present in the tree, return NULL.

Advantages of Binary Search Tree (BST):


• Efficient searching: O(log n) time complexity for searching with a self-balancing BST
• Ordered structure: Elements are stored in sorted order, making it easy to find the next or
previous element
• Dynamic insertion and deletion: Elements can be added or removed efficiently
• Balanced structure: Balanced BSTs maintain a logarithmic height, ensuring efficient
operations
• Double-Ended Priority Queue: In BSTs, we can maintain both the maximum and minimum
efficiently
Disadvantages of Binary Search Tree (BST):
• Not self-balancing: Unbalanced BSTs can lead to poor performance
• Worst-case time complexity: In the worst case, BSTs can have linear time complexity for
searching and insertion
• Memory overhead: BSTs require additional memory to store pointers to child nodes
• Not suitable for large datasets: BSTs can become inefficient for very large datasets
• Limited functionality: BSTs primarily support searching, insertion, and deletion operations

2. AVL Tree

An AVL Tree is a self-balancing binary search tree where the difference in height between the
left and right subtrees of any node (called the balance factor) is at most one. This balance ensures
that operations like insertion, deletion, and lookup have O(log n) time complexity.

Operations

• Insert: O(log n) time complexity, including rebalancing.


• Delete: O(log n) time complexity, including rebalancing.
• Search: O(log n) time complexity.

Balancing: AVL trees use rotations to maintain balance. The four types of rotations are:

1. Right Rotation
2. Left Rotation
3. Right-Left Rotation
4. Left-Right Rotation

3. Red-Black Tree

A Red-Black Tree is a balanced binary search tree with additional properties to maintain balance:

1. Each node is either red or black.

2. The root is always black.

3. Red nodes cannot have red children (no two red nodes can be adjacent).

4. Every path from a node to its descendant leaves has the same number of black nodes.

Operations

• Insert: O(log n) time complexity, including rebalancing.


• Delete: O(log n) time complexity, including rebalancing.
• Search: O(log n) time complexity.
• Balancing: Uses color changes and rotations to maintain properties after insertion and
deletion.

4. B-Tree

A B-Tree is a self-balancing search tree that maintains sorted data and allows searches, sequential
access, insertions, and deletions in logarithmic time. It is commonly used in databases and file
systems.

Characteristics

1. Multilevel Index: Nodes in a B-tree can have multiple children and keys.

2. Balanced: All leaf nodes are at the same depth.

3. Order: A B-tree of order `m` can have at most `m-1` keys and `m` children.

Operations

• Insert: O(log n) time complexity.


• Delete: O(log n) time complexity.
• Search: O(log n) time complexity.

5. B+ Tree

A B+ Tree is a variation of a Btree where all values are stored at the leaf level, and internal
nodes only store keys to guide the search. It is widely used in databases and file systems for
indexing.

Characteristics

1. Leaf Nodes: All values are stored in leaf nodes, which are linked in a doubly linked list.

2. Internal Nodes: Only keys are stored in internal nodes to guide the search.

3. Balanced: All leaf nodes are at the same depth.

Operations

• Insert: O(log n) time complexity.


• Delete: O(log n) time complexity.
• Search: O(log n) time complexity.
Search trees

Search trees are crucial for managing dynamic sets of data where efficient querying and updating
are required. Different types of search trees provide various advantages:

Binary Search Tree (BST): Simple, but can become unbalanced.

1) AVL Tree: Maintains balance with rotations, ensuring O(log n) operations.
2) Red-Black Tree: Balances the tree using color properties, also with O(log n) operations.
3) B-Tree: Used for disk-based storage, with multilevel indexing.
4) B+ Tree: An enhanced B-Tree for indexing in databases.
5) Trie: Specializes in prefix-based searching and efficient string operations.

BINARY SEARCH TREE

A Binary Search Tree (BST) is a type of binary tree where each node has at most two children,
commonly referred to as the left and right child. It is used to store data in a way that allows for
efficient retrieval, insertion, and deletion operations. The BST property ensures that the left
subtree of a node contains only nodes with keys less than the node’s key, and the right subtree
contains only nodes with keys greater than the node’s key.

Characteristics

1. Binary Tree Structure: Each node has up to two children: a left child and a right child.

2. Ordering Property: All nodes in the left subtree have keys less than the node’s key.

All nodes in the right subtree have keys greater than the node’s key.

3. Dynamic:

BSTs can grow and shrink dynamically as elements are added or removed.

Basic Operations

1. Insertion:

To insert a new node into a BST, start at the root and traverse the tree according to the BST
property until you find an appropriate null position where the new node should be inserted.

2. Search:

To search for a value, start at the root and traverse the tree based on the BST property. If the
key is less than the current node’s key, move to the left child; if greater, move to the right child.
If the key matches, return the node; otherwise, return null when the end of the tree is reached.
3. Deletion:

Deleting a node from a BST involves three cases (see the sketch after this list):

• Node with no children (leaf node): Simply remove the node.
• Node with one child: Remove the node and replace it with its child.
• Node with two children: Find the node’s inorder successor (the smallest node in the right
subtree) or predecessor (the largest node in the left subtree), copy its key to the node, and
delete the successor/predecessor.

4. Traversal:

• Inorder: Visit the left subtree, then the node, and then the right subtree. This traversal
prints the keys in ascending order.
• Preorder: Visit the node, then the left subtree, and then the right subtree. This traversal is
useful for creating a copy of the tree.
• Postorder: Visit the left subtree, then the right subtree, and then the node. This traversal is
useful for deleting the tree or evaluating expression trees.
• Level-order: Visit nodes level by level from top to bottom. This traversal is typically
implemented using a queue.

Properties and Variants

1. Height:

The height of a BST is the length of the longest path from the root to a leaf. In the worst case,
a BST can become skewed (like a linked list), making operations O(n). Balanced BSTs ensure
O(log n) height.

2. Balanced BSTs:

To ensure that the height remains O(log n), self-balancing BSTs like AVL trees and Red-Black
trees are used. They automatically adjust the tree structure during insertions and deletions to
maintain balance.

3. Applications:

Searching: Efficiently find elements.

Sorting: Inorder traversal produces a sorted list of elements.

Dynamic Set Operations: Insertion, deletion, and lookup operations in logarithmic time on
average.
4. Advantages:

Simple and intuitive.

Supports efficient search, insertion, and deletion operations.

5. Disadvantages:

Can become unbalanced, leading to degraded performance (O(n)) in the worst case.

BALANCED SEARCH TREES

Balanced Search Trees are a subset of binary search trees (BSTs) designed to maintain a
balanced structure, which ensures that the height of the tree remains logarithmic in relation to the
number of nodes. This balance guarantees that operations such as insertion, deletion, and search
can be performed efficiently in O(log n) time complexity, where n is the number of nodes in the
tree.

Balanced search trees address the problem of a BST becoming unbalanced, which can occur
when elements are inserted in a sorted or nearly sorted order, leading to a degenerate (skewed)
tree structure similar to a linked list.

Key Types of Balanced Search Trees

1. AVL Trees

AVL Trees are a type of self-balancing binary search tree. They maintain balance by ensuring
that the heights of the two child subtrees of any node differ by no more than one. This balance is
maintained through rotations during insertion and deletion operations.

Characteristics

• Balance Factor: For any node in the AVL tree, the difference between the heights of the
left and right subtrees (the balance factor) is -1, 0, or 1.
• Rotations: To maintain balance, AVL trees use four types of rotations:
• Right Rotation (Single Rotation)
• Left Rotation (Single Rotation)
• Left-Right Rotation (Double Rotation)
• Right-Left Rotation (Double Rotation)

Operations

• Insertion: After inserting a node, check the balance factors of the nodes and perform
necessary rotations to restore balance.
• Deletion: After deleting a node, adjust the tree to restore balance by performing rotations
as needed.
• Search: O(log n) time complexity due to balanced structure.

An example AVL tree implementation in Python is given in the AVL Tree section earlier in this document.

2. Red-Black Trees

Red-Black Trees are another type of self-balancing binary search tree where each node has an
additional color attribute (red or black) to ensure the tree remains approximately balanced. The
balancing criteria ensure that no path from the root to a leaf is more than twice as long as any
other such path.

Characteristics

Color Rules:

• Each node is either red or black.


• The root is always black.
• Red nodes cannot have red children (no two red nodes can be adjacent).
• Every path from a node to its descendant leaves has the same number of black nodes.
• Balancing: Balancing is maintained through color changes and rotations during insertion
and deletion operations.

Operations

Insertion: Insert the node and then fix any violations of red-black properties by performing
rotations and color changes.

Deletion: Remove the node and adjust the tree to maintain red-black properties through
rotations and color changes.

Search: O(log n) time complexity due to the balanced structure.

3. B-Trees

B-Trees are self-balancing tree data structures that maintain sorted data and allow searches,
sequential access, insertions, and deletions in logarithmic time. They are often used in databases
and file systems to manage large datasets efficiently.
Characteristics

Order: A B-tree of order `m` can have at most `m-1` keys and `m` children.

Balanced: All leaf nodes are at the same depth.

Multilevel Index: Allows efficient multilevel indexing and management of large datasets.

Operations

• Insertion: Insert a key into the appropriate leaf node and split nodes if necessary to
maintain balance.
• Deletion: Remove a key and ensure the tree remains balanced by merging or
redistributing nodes if needed.
• Search: O(log n) time complexity due to balanced structure.

Splay Trees

A Splay Tree is a type of self-adjusting binary search tree with the property that recently
accessed elements are moved to the root of the tree. This self-adjusting property is achieved
through a process known as "splaying," which involves a series of tree rotations.

Characteristics

1. Self-Splaying:
   o After accessing (inserting, deleting, or searching) a node, the tree is rearranged so
     that the accessed node is moved to the root of the tree through a series of
     rotations.
2. No Strict Balancing:
   o Unlike AVL or Red-Black trees, splay trees do not maintain strict balance criteria.
     Instead, they use the splaying operation to ensure that frequently accessed nodes
     are quickly accessible.
3. Amortized Analysis:
   o Although individual operations can be inefficient in the worst case, splay trees
     provide good performance on average. The amortized time complexity for splay
     tree operations is O(log n), where n is the number of nodes in the tree.

Splaying Operations

The splaying operation involves three types of tree rotations: zig, zig-zig, and zig-zag.

1. Zig Operation:
   o Applied when the node to be splayed is the child of the root. This involves a
     single rotation.
2. Zig-Zig Operation:
   o Applied when the node to be splayed and its parent are both left children or both
     right children. This involves two rotations.
3. Zig-Zag Operation:
   o Applied when the node to be splayed is a left child of a right child, or vice versa
     (i.e., it is on the opposite side of its parent). This involves two rotations.

Operations in Splay Trees

1. Search:
o Search for a node involves traversing the tree as in a standard binary search tree.
After the node is found (or if it’s not found), the splay operation is performed to
bring the accessed node to the root.
o Time Complexity: O(log n) amortized.
2. Insertion:
o Insert the node as you would in a regular binary search tree. After the insertion,
perform the splay operation to bring the newly inserted node to the root.
o Time Complexity: O(log n) amortized.
3. Deletion:
o To delete a node, first splay the node to the root (if it exists). Then, remove the
root and restructure the tree by combining the left and right subtrees.
o Time Complexity: O(log n) amortized.
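
To make the rotations concrete, here is a minimal recursive splay sketch; the node layout and function names are illustrative, and a full splay tree would wrap this routine in insert and delete operations.

```python
class SplayNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def _rotate_right(x):
    y = x.left
    x.left = y.right
    y.right = x
    return y

def _rotate_left(x):
    y = x.right
    x.right = y.left
    y.left = x
    return y

def splay(root, key):
    # Brings the node with `key` (or the last node on the search path)
    # to the root using zig, zig-zig, and zig-zag rotations.
    if root is None or root.key == key:
        return root
    if key < root.key:
        if root.left is None:
            return root
        if key < root.left.key:            # zig-zig (left-left)
            root.left.left = splay(root.left.left, key)
            root = _rotate_right(root)
        elif key > root.left.key:          # zig-zag (left-right)
            root.left.right = splay(root.left.right, key)
            if root.left.right:
                root.left = _rotate_left(root.left)
        return root if root.left is None else _rotate_right(root)
    else:
        if root.right is None:
            return root
        if key > root.right.key:           # zig-zig (right-right)
            root.right.right = splay(root.right.right, key)
            root = _rotate_left(root)
        elif key < root.right.key:         # zig-zag (right-left)
            root.right.left = splay(root.right.left, key)
            if root.right.left:
                root.right = _rotate_right(root.right)
        return root if root.right is None else _rotate_left(root)

# Example: after `root = splay(root, 30)`, the node containing 30
# (if present) becomes the new root.
```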

Advantages and Disadvantages

Advantages:

• Amortized Time Complexity: Operations like search, insertion, and deletion have an
  amortized time complexity of O(log n).
• Self-Adjusting: The tree adapts based on access patterns, potentially speeding up access
  to frequently used nodes.

Disadvantages:

• Lack of Strict Balancing: Unlike AVL and Red-Black trees, splay trees do not guarantee
  strict balance, so individual operations can be inefficient in the worst case.
• Implementation Complexity: The splaying operation and rotations can be more
  complex to implement correctly compared to other balanced trees.

Applications

Splay trees are particularly useful in scenarios where the access patterns are not uniform. They
are used in applications where:

• Locality of Reference: Recently accessed elements are likely to be accessed again soon.
• Cache Optimization: Splay trees can optimize cache performance due to their
  self-adjusting nature.

In summary, splay trees provide a unique approach to maintaining binary search trees
through self-adjusting mechanisms. While they may not always provide the same worst-case
guarantees as other balanced trees, their amortized performance and adaptability can be
beneficial in many real-world applications.

Sorting and Selection

Sorting and selection are fundamental operations in computer science, used to arrange elements
in a specific order and to find particular elements within a dataset, respectively. These operations
are essential for data management, search optimization, and efficient algorithm design. Here’s an
overview of sorting and selection techniques:

Sorting

Sorting refers to the process of arranging elements in a specific order, usually ascending or
descending. There are numerous sorting algorithms, each with its own performance
characteristics and use cases.

Selection

Selection refers to finding specific elements from a dataset, such as the smallest, largest, or kth
smallest/largest element. This is useful for tasks where only a subset of data needs to be
analyzed.

Common Selection Algorithms:

1. Linear Search:
   o Description: Scans through the list to find the desired element.
   o Time Complexity: O(n) in the worst case.
   o Use Case: Simple and effective for small or unsorted datasets.
2. Binary Search:
   o Description: Efficiently finds an element in a sorted list by repeatedly dividing
     the search interval in half.
   o Time Complexity: O(log n) in the worst case.
   o Use Case: Efficient for searching in sorted data.
3. Selection Algorithms:
   o Description: Algorithms to find the kth smallest or largest element in an unsorted
     array (a quickselect sketch follows this list). Examples include:

   ▪ Quickselect: An efficient algorithm based on Quick Sort that finds the kth
     smallest element.
     • Time Complexity: O(n) average case, O(n^2) worst case.
   ▪ Median of Medians: An algorithm that finds the kth smallest element
     with a guaranteed O(n) worst-case time complexity.
     • Time Complexity: O(n).
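
As referenced above, here is a minimal quickselect sketch using list partitioning; it is simple but not in-place, and the function name and the 1-indexed `k` are choices made for this example.

```python
import random

def quickselect(arr, k):
    # Returns the k-th smallest element (1-indexed) of arr.
    # Average O(n); worst case O(n^2) with unlucky pivots.
    pivot = random.choice(arr)
    less = [x for x in arr if x < pivot]
    equal = [x for x in arr if x == pivot]
    greater = [x for x in arr if x > pivot]
    if k <= len(less):
        return quickselect(less, k)
    elif k <= len(less) + len(equal):
        return pivot
    else:
        return quickselect(greater, k - len(less) - len(equal))

# Example usage
print(quickselect([4, 10, 3, 5, 1], 2))  # Output: 3 (2nd smallest)
```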

Merge Sort

Merge Sort and Quick Sort are two popular sorting algorithms with distinct characteristics and
use cases. Both algorithms are efficient but differ in their approaches, performance
characteristics, and suitability for various types of data.

Merge Sort is a divide-and-conquer sorting algorithm that divides the array into smaller subarrays,
sorts those subarrays, and then merges them to produce the sorted array.

1. Divide:

Recursively split the array into two halves until each subarray contains a single element.

2. Conquer:

Merge the sorted subarrays to produce the sorted array. The merging process involves
combining two sorted arrays into a single sorted array.

3. Combine:

Continue merging until the entire array is merged into a single sorted array.

Time Complexity

a) Worst-Case: O(n log n)
b) Average-Case: O(n log n)
c) Best-Case: O(n log n)

Space Complexity

Auxiliary Space: O(n), as extra space is needed for merging.

Stability

Stable: Maintains the relative order of equal elements.

Implementation

Here’s a basic implementation of Merge Sort in Python:


```python

def merge_sort(arr):
    if len(arr) > 1:
        mid = len(arr) // 2
        left_half = arr[:mid]
        right_half = arr[mid:]

        merge_sort(left_half)
        merge_sort(right_half)

        i = j = k = 0
        while i < len(left_half) and j < len(right_half):
            if left_half[i] < right_half[j]:
                arr[k] = left_half[i]
                i += 1
            else:
                arr[k] = right_half[j]
                j += 1
            k += 1
        while i < len(left_half):
            arr[k] = left_half[i]
            i += 1
            k += 1
        while j < len(right_half):
            arr[k] = right_half[j]
            j += 1
            k += 1

# Example usage
arr = [12, 11, 13, 5, 6, 7]
merge_sort(arr)
print("Sorted array is:", arr)

```

Advantages:

Consistent Performance: Merge sort always performs at O(n log n) time complexity.

Stable: Preserves the order of equal elements.

Disadvantages:

Space Complexity: Requires additional space proportional to the size of the array.

Overhead: Extra time and space are needed for merging operations.

Quick Sort

Quick Sort is another divide-and-conquer sorting algorithm that selects a 'pivot' element, partitions
the array into elements less than and greater than the pivot, and recursively sorts the partitions.

1. Partition:

Choose a pivot element from the array and partition the other elements into two subarrays
according to whether they are less than or greater than the pivot.
2. Recursively Sort:

Recursively apply the same process to the subarrays formed by the partitioning.

3. Combine:

The array is sorted in place, so no merging is needed.

Time Complexity

a) Worst-Case: O(n^2) (occurs when the pivot is the smallest or largest element each time)
b) Average-Case: O(n log n)
c) Best-Case: O(n log n) (occurs when the pivot divides the array into two equal halves)

Space Complexity

Auxiliary Space: O(log n) due to recursion stack space.

Stability

Unstable: May not preserve the relative order of equal elements.

Here’s a basic implementation of Quick Sort in Python:

```python

def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quick_sort(left) + middle + quick_sort(right)

# Example usage
arr = [12, 11, 13, 5, 6, 7]
sorted_arr = quick_sort(arr)
print("Sorted array is:", sorted_arr)

```

Advantages:

In-Place Sorting: Does not require additional space for merging.

Average-Case Performance: Generally faster in practice than merge sort for large datasets.

Disadvantages:

Worst-Case Performance: Can degrade to O(n^2) if the pivot selection is poor.

Unstable: Does not guarantee to maintain the relative order of equal elements.
