DSA Notes 3rd Sem
Data Structure is a particular way of storing and organizing data in the memory of the computer so that
the data can be easily retrieved and efficiently utilized when required in the future.
Data Structure is a way of organizing all data items that considers not only the elements stored but also
their relationship to each other.
We can also define data structure as a mathematical or logical model of a particular organization of
data items.
The representation of a particular data structure in the main memory of a computer is called the storage
structure.
The storage structure representation in auxiliary memory is called the file structure.
It is defined as the way of storing and manipulating data in organized form so that it can be used
efficiently.
Data Structure mainly specifies the following four things
o Organization of Data
o Accessing methods
o Degree of associativity
o Processing alternatives for information
Algorithm + Data Structure = Program
Data structure study covers the following points
o Amount of memory required to store.
o Amount of time required to process.
o Representation of data in memory.
o Operations performed on that data.
[Figure: classification of data structures into Primitive and Non-Primitive types]
Data types
A particular kind of data item, as defined by the values it can take, the programming language used, or
the operations that can be performed on it.
Pointer: A variable that holds the memory address of another variable is called a pointer.
Graph: Graph is a collection of nodes (Information) and connecting edges (Logical relation) between nodes.
o A tree can be viewed as a restricted graph.
o Graphs have many types:
Un-directed Graph
Directed Graph
Mixed Graph
Multi Graph
Simple Graph
Null Graph
Weighted Graph
1. Create
The create operation results in reserving memory for program elements. This can be done by a declaration
statement. Creation of a data structure may take place either at compile-time or at run-time. The malloc()
function of the C language is used for run-time creation.
2. Destroy
Destroy operation destroys memory space allocated for specified data structure. free() function of C language
is used to destroy data structure.
3. Selection
Selection operation deals with accessing a particular data within a data structure.
4. Updation
It updates or modifies the data in the data structure.
5. Searching
It finds the presence of a desired data item in the list of data items; it may also find the locations of all
elements that satisfy certain conditions.
6. Sorting
Sorting is a process of arranging all data items in a data structure in a particular order, say for example,
either in ascending order or in descending order.
7. Merging
Merging is a process of combining the data items of two different sorted lists into a single sorted list.
8. Splitting
Splitting is a process of partitioning a single list into multiple lists.
9. Traversal
Traversal is a process of visiting each and every node of a list in a systematic manner.
An algorithm is a procedure that can be written as a C function, as a program, or in any other language.
Algorithm Efficiency
Some algorithms are more efficient than others. We would prefer to choose an efficient algorithm, so it
would be nice to have metrics for comparing algorithm efficiency.
The complexity of an algorithm is a function describing the efficiency of the algorithm in terms of the
amount of data the algorithm must process.
Usually there are natural units for the domain and range of this function. There are two main complexity
measures of the efficiency of an algorithm
Time complexity
Time Complexity is a function describing the amount of time an algorithm takes in terms of the
amount of input to the algorithm.
"Time" can mean the number of memory accesses performed, the number of comparisons between
integers, the number of times some inner loop is executed, or some other natural unit related to the
amount of real time the algorithm will take.
Space complexity
Space complexity is a function describing the amount of memory (space) an algorithm takes in terms of
the amount of input to the algorithm.
We often speak of "extra" memory needed, not counting the memory needed to store the input itself.
Again, we use natural (but fixed-length) units to measure this.
We can use bytes, but it's easier to use, say, number of integers used, number of fixed-sized structures,
etc. In the end, the function we come up with will be independent of the actual number of bytes needed
to represent the unit.
Space complexity is sometimes ignored because the space used is minimal and/or obvious, but
sometimes it becomes as important an issue as time.
Asymptotic Notations
The commonly used asymptotic notations for calculating the running-time complexity of an
algorithm are given below:
o Big O Notation (O)
o Omega Notation (Ω)
o Theta Notation (θ)
Big O Notation (O)
For example, if f(n) and g(n) are two functions defined for positive integers,
Bhavisha Patel - Data Structure
Introduction to Data Structure
then f(n) = O(g(n)) (read: f(n) is big oh of g(n), or f(n) is of the order of g(n)) if there exist constants c
and n0 such that:
f(n) <= c.g(n) for all n >= n0
This implies that f(n) does not grow faster than g(n), or g(n) is an upper bound on the function f(n). In
this case, we are calculating the growth rate of the function which eventually calculates the worst time
complexity of a function, i.e., how worst an algorithm can perform.
Example: let f(n) = 2n+3 and g(n) = n. Taking c = 5:
f(n) <= c.g(n)
If n = 1: 2*1+3 <= 5*1, i.e. 5 <= 5
If n = 2: 2*2+3 <= 5*2, i.e. 7 <= 10
The condition holds for every n >= 1, so f(n) = O(g(n)).
Omega Notation (Ω)
If we require that an algorithm takes at least a certain amount of time (a lower bound rather than an
upper bound), we use big-Ω notation, i.e. the Greek letter "omega". It is used to bound the growth of the
running time from below for large input sizes.
If f(n) and g(n) are two functions defined for positive integers,
then f(n) = Ω(g(n)) (read: f(n) is omega of g(n), or f(n) is at least of the order of g(n)) if there exist
constants c and n0 such that:
f(n) >= c.g(n) for all n >= n0
Example: let f(n) = 2n+3 and g(n) = n. Is f(n) = Ω(g(n))?
f(n) >= c.g(n)
To check the condition, we first replace f(n) by 2n+3 and g(n) by n:
2n+3 >= c*n
Suppose c = 1:
2n+3 >= n (true for any value of n starting from 1), so f(n) = Ω(g(n)).
Theta Notation (θ)
Let f(n) and g(n) be functions of n, where n is the number of steps required to execute the program.
Then f(n) = θ(g(n)) if there exist constants c1 and c2 such that:
c1.g(n) <= f(n) <= c2.g(n)
Here the function is bounded by two limits, an upper and a lower limit, and f(n) lies in between. The
condition f(n) = θ(g(n)) is true if and only if c1.g(n) is less than or equal to f(n) and c2.g(n) is greater
than or equal to f(n). The graphical representation of theta notation is given below:
Example: let f(n) = 2n+3 and g(n) = n. As c1.g(n) should be less than or equal to f(n), c1 can be taken
as 1, whereas c2.g(n) should be greater than or equal to f(n), so c2 can be taken as 5. Here c1.g(n) is the
lower limit of f(n) while c2.g(n) is the upper limit of f(n):
c1.g(n) <= f(n) <= c2.g(n)
1*n <= 2n+3 <= 5*n
If n = 2: 1*2 <= 2*2+3 <= 5*2, i.e. 2 <= 7 <= 10
Therefore, for any value of n >= 1 the condition c1.g(n) <= f(n) <= c2.g(n) is satisfied. Hence it is proved
that f(n) is big theta of g(n). So, this corresponds to the average-case scenario, which provides the
realistic time complexity.
What is an Algorithm?
An algorithm is a process or a set of rules required to perform calculations or some other problem-
solving operations especially by a computer.
It is a finite set of instructions that are carried out in a specific order to perform a specific task.
It is not the complete program or code; it is just the solution (logic) of a problem, which can be represented
either as an informal description, a flowchart, or pseudocode.
Characteristics of an Algorithm
The following are the characteristics of an algorithm:
o Input: An algorithm has some input values. We can pass zero or more input values to an algorithm.
o Output: We will get one or more outputs at the end of an algorithm.
o Unambiguity: An algorithm should be unambiguous which means that the instructions in an algorithm
should be clear and simple.
o Finiteness: An algorithm should have finiteness. Here, finiteness means that the algorithm should contain
a limited number of instructions, i.e., the instructions should be countable.
o Effectiveness: An algorithm should be effective as each instruction in an algorithm affects the overall
process.
o Language independent: An algorithm must be language-independent so that the instructions in an
algorithm can be implemented in any of the languages with the same output.
[Figure: representation of an array — elements [1,1], [1,2], [1,3], …, [1,m] stored in consecutive
locations (col 1, col 2, col 3, …), with index bounds b1 … u2]
Representation of an array
In the above image, we have shown the memory allocation of an array arr of size 5. The array follows a
0-based indexing approach. The base address of the array is 100; it is the address of arr[0]. Here, the
size of the data type used is 4 bytes; therefore, each element takes 4 bytes in memory.
Advantages of Array
o An array provides a single name for a group of variables of the same type. Therefore, it is easy to
remember the names of all the elements of an array.
o Traversing an array is a very simple process; we just need to increment the base address of the array in
order to visit each element one by one.
o Any element in the array can be directly accessed by using the index.
Disadvantages of Array
o An array is homogeneous. It means that only elements of the same data type can be stored in it.
o An array uses static memory allocation, that is, the size of an array cannot be altered.
o There will be wastage of memory if we store fewer elements than the declared size.
Applications of Array
1. Symbol Manipulation (matrix representation of polynomial equation)
2. Sparse Matrix
Once we have an algorithm for converting a polynomial equation to an array representation and another
algorithm for converting an array back to a polynomial equation, the different operations on the array
(matrix) will correspond to operations on the polynomial equation.
0 0 0 2 0 0 1 0
0 6 0 0 7 0 0 3
0 0 0 9 0 8 0 0
0 4 5 0 0 0 0 0
Fig (a) 4 x 8 matrix
Terms 0 1 2 3 4 5 6 7 8
Row 1 1 2 2 2 3 3 4 4
Column 4 7 2 5 8 4 6 2 3
Value 2 1 6 7 3 9 8 4 5
Fig (b) Linear Representation of above matrix
To construct the matrix structure we need to record:
(a) the original row and column of each non-zero entry
(b) the number of rows and columns in the matrix
So each element of the array into which the sparse matrix is mapped needs to have three fields: row,
column and value.
A corresponding amount of time is saved by creating the linear-list representation rather than
initializing a two-dimensional array.
0 0 6 0 9 0 0
A= 2 0 0 7 8 0 4
10 0 0 0 0 0 0
0 0 12 0 0 0 0
0 0 0 0 0 0 0
0 0 0 3 0 0 5
Linear Data Structure
Here, out of 6×7 = 42 elements, only 10 are non-zero: A[1,3]=6, A[1,5]=9, A[2,1]=2, A[2,4]=7, A[2,5]=8,
A[2,7]=4, A[3,1]=10, A[4,3]=12, A[6,4]=3, A[6,7]=5.
One basic method for storing such a sparse matrix is to store the non-zero elements in a one-dimensional
array and to identify each array element with its row and column indices, as in fig (c).
ROW COLUMN A
1 1 3 6
2 1 5 9
3 2 1 2
4 2 4 7
5 2 5 8
6 2 7 4
7 3 1 10
8 4 3 12
9 6 4 3
10 6 7 5
Fig (c )
ROW vector (the ith entry is the index of the first COLUMN/A entry for row i; 0 means the row is empty):
Row i:   1  2  3  4  5  6
ROW(i):  1  3  7  8  0  9

COLUMN and A arrays (entries in row-major order):
No.  COLUMN  A
1    3       6
2    5       9
3    1       2
4    4       7
5    5       8
6    7       4
7    1       10
8    3       12
9    4       3
10   7       5
Fig (d)
A more efficient representation in terms of storage requirement and access time to the rows of the matrix
is shown in fig (d). The row vector is changed so that its ith element is the index of the first of the column
indices for the elements in row i of the matrix.
The same matrix can also be held as a linked list of (row, column, value) nodes:
(1,3,6) → (1,5,9) → (2,1,2) → (2,4,7) → (2,5,8) → (2,7,4) → (3,1,10) → (4,3,12) → (6,4,3) → (6,7,5) → NULL
What is a Stack?
o A Stack is a linear data structure that follows the LIFO (Last-In-First-Out) principle.
o A Stack has only one open end, whereas a Queue has two ends (front and rear). It contains only one
pointer, the top pointer, which points to the topmost element of the stack.
o Whenever an element is added to the stack, it is added on the top of the stack, and an element
can be deleted only from the top of the stack.
o In other words, a stack can be defined as a container in which insertion and deletion can be
done from the one end known as the top of the stack.
Working of Stack
Stack works on the LIFO pattern. As we can observe in the below figure there are five memory blocks in
the stack; therefore, the size of the stack is 5.
Suppose we want to store the elements in a stack and let's assume that stack is empty. We have taken
the stack of size 5 as shown below in which we are pushing the elements one by one until the stack
becomes full.
Our stack is now full, since the size of the stack is 5. From the above cases, we can observe that the top
moves upward as each new element is entered: the stack gets filled up from the bottom to the top.
When we perform the delete operation on the stack, there is only one way for entry and exit as the other
end is closed. It follows the LIFO pattern, which means that the value entered first will be removed last.
In the above case, the value 5 is entered first, so it will be removed only after the deletion of all the other
elements.
o push(): When we insert an element in a stack then the operation is known as a push. If the stack is full
then the overflow condition occurs.
o pop(): When we delete an element from the stack, the operation is known as a pop. If the stack is empty,
meaning that no element exists in the stack, this state is known as an underflow state.
o isEmpty(): It determines whether the stack is empty or not.
o isFull(): It determines whether the stack is full or not.
o peek(): It returns the topmost element of the stack without removing it.
PUSH operation
The steps involved in the PUSH operation are given below:
o Before inserting an element into the stack, we check whether the stack is full.
o If we try to insert an element into a full stack, the overflow condition occurs.
o If the stack is not full, the top is first incremented by 1, i.e., top=top+1.
o The new element is then placed at the position pointed to by the top.
POP operation
The steps involved in the POP operation are given below:
o Before deleting an element from the stack, we check whether the stack is empty.
o If we try to delete an element from an empty stack, the underflow condition occurs.
o If the stack is not empty, we first access the element pointed to by the top.
o Once the pop operation is performed, the top is decremented by 1, i.e., top=top-1.
Applications of Stack
The following are the applications of the stack:
o Balancing of symbols: Stack is used for balancing symbols. For example, we have the following program:
#include <iostream>
using namespace std;
int main()
{
cout<<"Hello";
cout<<"javaTpoint";
}
As we know, each program has opening and closing braces; when an opening brace comes, we push it
onto a stack, and when the closing brace appears, we pop the matching opening brace from the stack.
Therefore, the net count comes out to be zero. If any symbol is left in the stack at the end, it means that a
syntax error exists in the program.
o String reversal: Stack is also used for reversing a string. For example, we want to reverse a "javaTpoint"
string, so we can achieve this with the help of a stack.
First, we push all the characters of the string in a stack until we reach the null character.
After pushing all the characters, we start taking out the character one by one until we reach the bottom
of the stack.
o UNDO/REDO: It can also be used for performing UNDO/REDO operations. For example, we have an editor
in which we write 'a', then 'b', and then 'c'; therefore, the text written in an editor is abc. So, there are three
states, a, ab, and abc, which are stored in a stack. There would be two stacks in which one stack shows
UNDO state, and the other shows REDO state.
If we want to perform UNDO operation, and want to achieve 'ab' state, then we implement pop operation.
o Recursion: The recursion means that the function is calling itself again. To maintain the previous states,
the compiler creates a system stack in which all the previous records of the function are maintained.
o DFS(Depth First Search): This search is implemented on a Graph, and Graph uses the stack data structure.
o Backtracking: Suppose we have to create a path to solve a maze problem. If, while moving along a
particular path, we realize that we have taken a wrong way, then in order to return to the beginning of the
path and create a new path, we have to use the stack data structure.
o Expression conversion: Stack can also be used for expression conversion. This is one of the most
important applications of stack. The list of the expression conversion is given below:
o Infix to prefix
o Infix to postfix
o Prefix to infix
o Prefix to postfix
o Postfix to infix
o Memory management: The stack manages memory. Memory is assigned in contiguous memory blocks.
It is known as stack memory because all the variables of a function call are assigned in the function-call
stack. The memory size assigned to the program is known to the compiler. When a function is invoked, all
its variables are assigned on the stack; when the function completes its execution, all the variables
assigned on the stack are released.
Tower of Hanoi
o The Tower of Hanoi is a mathematical puzzle.
o It consists of three poles and a number of disks of different sizes which can slide onto
any pole.
o The puzzle starts with the disks in a neat stack in ascending order of size on one pole,
the smallest at the top, thus making a conical shape.
o The objective of the puzzle is to move all the disks from one pole (say ‘source pole’) to
another pole (say ‘destination pole’) with the help of the third pole (say auxiliary pole).
o A total of 2^n − 1 moves are required for n disks.
o The puzzle has the following two rules:
1. You can’t place a larger disk onto a smaller disk
2. Only one disk can be moved at a time
Example:
Let us understand with a simple example with 3 disks:
So, total number of moves required = 7 (with poles labelled S = source, A = auxiliary, D = destination).
So, after all these moves the destination pole contains all the disks in order of size.
After observing above iterations, we can think that after a disk other than the smallest
disk is moved, the next disk to be moved must be the smallest disk because it is the top
disk resting on the spare pole and there are no other choices to move a disk.
There are two major cases in any recursive solution. They are
Base case in which the problem is simple enough to be solved directly without the need for any more calls
to the same function
Recursive case
The problem at hand is initially subdivided into simpler sub-parts.
The function calls itself again, but this time with sub-parts of the problem obtained in the first step.
The final result is obtained by combining the solutions of simpler sub-parts.
A recursive function is said to be well-defined if it has the following two properties:
There must be a base criterion for which the function does not call itself.
Every time the function calls itself, it must be closer to the base criterion.
Factorial Function
To find n!, multiply the number by the factorial of the number that is one less than it. That is,
n! = n * (n - 1)!
Assume we need to find the value of 5!. Then,
5! = 5 * 4!, where 4! = 4 * 3!
Therefore,
5! = 5 * 4 * 3!
Expanding further, we get
5! = 5 * 4 * 3 * 2 * 1!
We can now write a recursive function that computes the factorial of a number. Here the base case is
when n = 1, because the result will be 1 as 1! = 1. The recursive case of the factorial function will call
itself, but with a smaller value of n, as
factorial(n) = n * factorial(n - 1)
#include <stdio.h>
int Fact(int);
int main()
{
    printf("%d\n", Fact(5));
    return 0;
}
int Fact(int N)
{
    if (N <= 1)               /* base case: 1! = 1 */
        return 1;
    return N * Fact(N - 1);   /* recursive case */
}
Given an integer N, this algorithm computes the factorial of N. Stack A is used to store an activation record
associated with each recursive call. Each activation record contains the current value of N and the current
return address RET_ADDR. TEMP_REC is also a record which contains two variables, PARAM and ADDRESS.
TOP is a pointer to the top element of stack A. Initially the return address is set to the main calling address,
and PARAM is set to the initial value N.