
Unit-1.

Fundamentals of Data Structures


Chapter Index:
1. Fundamentals
   1.1 Data Structures
   1.2 Abstract Data Types and their Realization
2. Types of Data Structures
   2.1 Primitive, Non-Primitive
   2.2 Linear, Non-Linear
   2.3 Static, Dynamic
   2.4 Persistent, Ephemeral Data Structures
3. Performance Analysis of Algorithms
   3.1 Algorithm
   3.2 Characteristics of an Algorithm
   3.3 Performance Analysis of an Algorithm
       3.3.1 Time Complexity
       3.3.2 Space Complexity

1. Fundamentals:
A data structure is a way of organizing, processing, retrieving and storing data. There
are several basic and advanced types of data structures, all designed to arrange data to
suit a specific purpose. Data structures make it easy for users to access and work with
the data they need in appropriate ways. Most importantly, data structures frame the
organization of information so that machines and humans can better understand it.

In computer science and computer programming, a data structure may be selected or designed to store data for the purpose of using it with various algorithms. In some cases, the algorithm's basic operations are tightly coupled to the data structure's design. Each data structure contains information about the data values, relationships between the data and, in some cases, functions that can be applied to the data.

1.1 Data Structures:


 A data structure is a way of storing and organizing data so that it can be used
efficiently.
 This data is organized in memory.
 An array, for example, is one of the data structures in the C language: a
collection of memory elements in which data is stored sequentially.
 A data structure should be seen as a logical concept that must address two
fundamental concerns:
i. How will the data be stored?
ii. What operations will be performed on it?
It is not only important to use data structures, but also to choose the proper data
structure for each task. Choosing the wrong data structure can result in slow
runtimes or unresponsive code. Five factors to consider when picking a data
structure include the following:
a. What kind of information will be stored?
b. How will that information be used?
c. Where should data persist, or be kept, after it is created?
d. What is the best way to organize the data?
e. What aspects of memory and storage reservation management should be
considered?

1.2 Abstract Data Types and their Realization:


Data types such as int, float, double, long, etc. are considered to be built-in data types,
and we can perform basic operations with them such as addition, subtraction, division,
multiplication, etc. There may, however, be situations where we need operations for a
user-defined data type, and these operations have to be defined by us, as and when we
require them. So, in order to simplify the process of solving problems, we can create
data structures along with their operations, and such data structures that are not
built-in are known as Abstract Data Types (ADTs).

An abstract data type (ADT) is a type (or class) for objects whose behavior is defined
by a set of values and a set of operations. The definition of an ADT only mentions what
operations are to be performed, not how these operations will be implemented. It
does not specify how the data will be organized in memory or what algorithms will be
used to implement the operations. It is called “abstract” because it gives an
implementation-independent view.
The process of providing only the essentials and hiding the details is known as
abstraction.

Fig. ADT

The user of a data type does not need to know how that data type is implemented. For
example, we use primitive types like int, float, and char knowing only what operations
can be performed on them, without any idea of how they are implemented.



So a user only needs to know what a data type can do, but not how it is implemented.
Think of an ADT as a black box which hides the inner structure and design of the data type.
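As an illustration, here is a minimal single-file sketch of a stack ADT in C. All the names (Stack, stack_create, stack_push, stack_pop) are our own choices for illustration: the typedef and the function declarations are the "what" a user sees, while the struct layout and the function bodies are the hidden "how".

    /* A stack ADT in C: the declarations say WHAT a stack can do; */
    /* the struct and function bodies say HOW it is done.          */
    #include <stdio.h>
    #include <stdlib.h>

    typedef struct Stack Stack;        /* users never see the inside */

    struct Stack {                     /* the hidden realization: an */
        int *data;                     /* array here, but a linked   */
        int  top, capacity;            /* list would work equally    */
    };

    Stack *stack_create(int capacity) {
        Stack *s = malloc(sizeof *s);
        s->data = malloc(capacity * sizeof *s->data);
        s->top = 0;
        s->capacity = capacity;
        return s;
    }

    int stack_push(Stack *s, int v) {  /* 0 on success, -1 if full  */
        if (s->top == s->capacity) return -1;
        s->data[s->top++] = v;
        return 0;
    }

    int stack_pop(Stack *s, int *out) { /* 0 on success, -1 if empty */
        if (s->top == 0) return -1;
        *out = s->data[--s->top];
        return 0;
    }

    int main(void) {
        Stack *s = stack_create(10);
        stack_push(s, 1);
        stack_push(s, 2);
        int v;
        while (stack_pop(s, &v) == 0)
            printf("%d\n", v);         /* prints 2, then 1 */
        free(s->data); free(s);
        return 0;
    }

Because callers rely only on the operations, the array inside could be swapped for a linked list without changing any code that uses the stack.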
1.2.1 Data Type:
A data type is a way to classify various types of data, such as integer, string, etc. There are two categories:
a. Built-in data types
b. Derived data types

a. Built-in data types:
The data types for which a language has built-in support are known as built-in data
types. For example: integer, Boolean, float, character, string.
b. Derived data types:
The data types which are implementation-independent, as they can be implemented in
one way or another, are known as derived data types. For example: list, array, stack,
queue.

2. Types of Data Structures:

Fig. 1.1 Types of Data Structures


2.1 Primitive and Non-primitive data structures:
2.1.1 Primitive data structures store data of only one type. A primitive variable always
contains some value, i.e. it cannot be NULL, and its size depends on its type.
Examples: integer, float, character, Boolean.

2.1.2 Non-primitive data structures store data of more than one type. Non-primitive
data structures can contain NULL values, and their size is not fixed. Examples: arrays,
linked lists, stacks, queues.

2.2 Linear and Non-Linear data structures:


2.2.1 In a linear data structure, data is arranged in a sequential manner. Since the
elements are arranged sequentially, linear structures are easy to implement.
Examples: arrays, linked lists, stacks, queues.

2.2.2 In a non-linear data structure, data is not arranged sequentially; instead, the
elements are arranged in a hierarchical order. Examples: trees, graphs.

2.3 Static and Dynamic data structures:


2.3.1 With a static data structure, the size of the data structure is fixed, i.e. once
the data structure is created we cannot increase or decrease its size, but the
content in it can be modified without changing the memory space allocated to it.
Memory for static data structures is allocated at compile time.
Example: arrays.

Index:     0   1   2   3   4
Element:  12  33  10  22  21

Array length = 5, first index = 0, last index = 4

2.3.2 In a dynamic data structure, the size is not fixed and can be modified
(increased or decreased) during the operations performed on it. Dynamic data
structures are designed to facilitate changing the structure at runtime.
Example: linked lists

Fig. 1.2 linked list
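The contrast can be seen in a short C sketch (a minimal example; the struct Node name is our own):

    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        /* Static: size fixed at compile time; contents may change, size may not */
        int arr[5] = {12, 33, 10, 22, 21};
        arr[2] = 99;                      /* allowed: modifying content */
        /* arr holds exactly 5 ints; it can never grow or shrink */

        /* Dynamic: nodes allocated (and freed) at runtime, one at a time */
        struct Node { int data; struct Node *next; };
        struct Node *head = NULL;
        for (int i = 1; i <= 3; i++) {    /* grow the list at runtime */
            struct Node *n = malloc(sizeof *n);
            n->data = i;
            n->next = head;
            head = n;
        }
        for (struct Node *p = head; p != NULL; p = p->next)
            printf("%d ", p->data);       /* prints: 3 2 1 */
        while (head) {                    /* release the nodes */
            struct Node *t = head;
            head = head->next;
            free(t);
        }
        return 0;
    }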

2.4 Persistent and Ephemeral data structures:


2.4.1 A persistent data structure always preserves the previous version of itself
when it is modified; it can be considered immutable (non-changeable). Because of
this, we can keep track of all the changes made to the data structure, i.e. it is
easy to roll back to a previous state.
Example: persistent segment tree.

2.4.2 An ephemeral data structure does not preserve previous versions of itself;
only the current state of the data structure is maintained. Since previous versions
are not maintained, the overall structure is simpler.
Examples: linked lists, arrays
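A persistent segment tree is too long to show here, but the idea of persistence fits in a few lines with a persistent singly linked list (a sketch; all the names are our own). Prepending never modifies the old version; it allocates a new head node that shares the old nodes:

    #include <stdio.h>
    #include <stdlib.h>

    struct Node { int data; const struct Node *next; };

    /* Prepend returns a NEW list head; the old list is left untouched,
       so both versions remain readable afterwards (structural sharing). */
    const struct Node *prepend(int value, const struct Node *tail) {
        struct Node *n = malloc(sizeof *n);
        n->data = value;
        n->next = tail;
        return n;
    }

    void print_list(const char *name, const struct Node *p) {
        printf("%s:", name);
        for (; p; p = p->next) printf(" %d", p->data);
        printf("\n");
    }

    int main(void) {
        const struct Node *v1 = prepend(2, prepend(3, NULL)); /* version 1: 2 3 */
        const struct Node *v2 = prepend(1, v1);               /* version 2: 1 2 3 */
        print_list("v1", v1);   /* v1: 2 3, still intact */
        print_list("v2", v2);   /* v2: 1 2 3, shares v1's nodes */
        return 0;               /* nodes deliberately not freed in this tiny sketch */
    }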

3. Performance Analysis of Algorithms:


3.1 Algorithm:
 An algorithm is a finite set of instructions that, if followed, accomplishes a
particular task.
 In simple language: an algorithm is any well-defined computational
procedure that takes some value, or set of values, as input and produces some
value, or set of values, as output.
 An algorithm is independent of any programming language, i.e. it is the same
for any programming language.

3.2 Characteristics of an Algorithm:

The following are the characteristics of an algorithm:

o Input: An algorithm has some input values; we can pass zero or more inputs to an
algorithm.
o Output: An algorithm produces one or more outputs at its end.
o Unambiguity: An algorithm should be unambiguous, meaning that its instructions
should be clear and simple.
o Finiteness: An algorithm must be finite, i.e. it should terminate after a finite time.
o Effectiveness: An algorithm should be effective, as each instruction affects the
overall process.
o Language independence: An algorithm must be language-independent, so that its
instructions can be implemented in any language with the same output.

Fig. 1.3 characteristics of an algorithm

3.3 Performance Analysis of an Algorithm


Analysis of algorithms is the determination of the amount of time and space required
to execute any program/code.
That is, the performance analysis of an algorithm depends upon two factors:
1. the amount of memory used during execution, and
2. the amount of time required for execution.
Formally, these are expressed as complexities in terms of:
 Space Complexity
 Time Complexity

3.3.1 Time Complexity: The time complexity of an algorithm is the amount of computer
time it needs to run to completion.
In simple language, every algorithm requires some amount of computer time to execute its
instructions and perform its task; this time is called its time complexity.

Generally, the running time of an algorithm depends upon the following:

i. Whether it runs on a single-processor or a multiprocessor system.
ii. Whether it is a 32-bit or a 64-bit system.
iii. The read and write speed of the system.
iv. The amount of time required by the algorithm to perform arithmetic operations,
logical operations, assignment operations and return values.
v. The input data.

Time complexity counts how many times each statement executes.
Is the time complexity of an algorithm the same as its running time/execution time?
Answer: the time complexity of an algorithm is not the actual time required to execute it,
but the number of times its statements execute.

T(P) = compile time + execution time

Since the compile time is fixed and does not depend on the instance characteristics, we
consider only the execution time:

T(P) = tp (execution time)

Step count rules:
i. For the algorithm heading = 0 time
ii. For braces = 0
iii. For an expression = 1 unit of time
iv. For a looping statement = the number of times the loop repeats

Example:
i. Algorithm abc(d, e, f) ................................... 0 time
   { ........................................................ 0 time
     return d + e*f + (d + e + f)/(d + e) + 4.0; ............ 1 unit of time
   } ........................................................ 0 time
   Total time = 1 unit

ii. Algorithm sum(a, n) ..................................... 0 time
    { ....................................................... 0 time
      s = 0.0; .............................................. 1
      for i = 1 to n do ..................................... n + 1  // n true cases, 1 false case
        s = s + a[i]; ....................................... n
      return s; ............................................. 1
    } ....................................................... 0
    Total time = 2n + 3
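For reference, example ii rendered as real C code (a sketch, assuming a float array; C is 0-based where the pseudocode is 1-based, and the step counts from the analysis appear as comments):

    /* Sum of n array elements; step counts from the analysis above. */
    float sum(const float a[], int n) {
        float s = 0.0f;                 /* 1 step                       */
        for (int i = 0; i < n; i++)     /* n + 1 tests: n true, 1 false */
            s = s + a[i];               /* n steps                      */
        return s;                       /* 1 step                       */
    }                                   /* total: 2n + 3 steps          */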

iii. Algorithm rsum(a, n) // recursive algorithm
     {
       if (n <= 0) then
         return 0.0;
       else
         return rsum(a, n - 1) + a[n];
     }

Assume n = 0:
T(0) = 2 // 1 unit of time for the if test and 1 for the return statement

Assume n > 0:
T(n) = 2 + T(n-1)
     = 2 + (2 + T(n-2)) = 2*2 + T(n-2)
     = 2*2 + (2 + T(n-3)) = 2*3 + T(n-3)
     .
     .
     .
     = 2*n + T(n-n) = 2n + T(0) // T(0) = 2, from above
T(n) = 2n + 2
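Example iii in C, for reference (a sketch; C arrays are 0-based, so the nth element is a[n-1]):

    /* Recursive sum: T(n) = 2 + T(n-1), T(0) = 2  =>  T(n) = 2n + 2 steps. */
    float rsum(const float a[], int n) {
        if (n <= 0)                      /* 1 step: the test              */
            return 0.0f;                 /* 1 step: the return            */
        return rsum(a, n - 1) + a[n-1];  /* 0-based: a[n-1] = nth element */
    }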

iv. Write C/C++ code to find the maximum among N numbers, where N varies over
10, 100, 1000, 10000 (a sketch is given below). On a Linux operating system, we can
compile and time the program with the following commands:
gcc program.c -o program
time ./program

Result:
for N = 10 .................. may take 0.5 ms
for N = 10,000 .............. may take 0.2 ms

This example shows that the actual time required to execute code is machine
dependent.
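A minimal sketch of such a program (the fill values are arbitrary, and N is a compile-time constant here purely for simplicity):

    #include <stdio.h>

    #define N 10000                    /* vary: 10, 100, 1000, 10000 */

    int main(void) {
        static int a[N];
        for (int i = 0; i < N; i++)    /* fill with some values */
            a[i] = (i * 7) % 1000;

        int max = a[0];
        for (int i = 1; i < N; i++)    /* single pass: O(N) comparisons */
            if (a[i] > max)
                max = a[i];

        printf("max = %d\n", max);
        return 0;
    }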

Consider the following two scenarios to better understand time complexity:
Suppose you have one problem and 3 algorithms for it. Among these 3 algorithms you
want to choose the best one. How do you choose it?
Solution 1: run the 3 algorithms on 3 different computers, provide the same input, find the
time taken by each, and choose the one that took the least time. But in this solution there is
a possibility that the systems use different processors, so processing speed may vary. So
this solution is not reliable.
Solution 2: run all 3 algorithms on the same computer and find which algorithm takes the
least time. But here also we might get wrong results, because at the time of execution of a
program, other things are running simultaneously. So this solution is also not reliable.
So there should be some standard notation to analyze algorithms. These notations are
called “asymptotic notations”.
In asymptotic notation, the system configuration is not considered; rather, the order of
growth of the input is considered.
The study of the change in performance of an algorithm with the change in the order of the
input size is defined as asymptotic analysis, i.e. we try to find out how the time or space
taken by the algorithm increases/decreases as the input size increases/decreases.
There are 3 asymptotic notations that are used to represent the time complexity of an
algorithm:
a. Big O notation
b. Omega (Ω) notation
c. Theta (Θ) notation
Usually, the time required by an algorithm falls into 3 cases:
a. Best case: the minimum time required for program execution.
b. Average case: the average time required for program execution.
c. Worst case: the maximum time required for program execution.
Example:
Consider an array of size 5 and suppose we want to find one particular element. If the
array is [1, 2, 3, 4, 5] and we want to find 1, we find it at the very first position, so this is
the best case. Now suppose the array is [2, 3, 4, 5, 1] and we want to find 1 again; it is
present, but this time at the last position, so this can be considered the average case.
Again, if the array is [2, 4, 5, 3, 1] and we are trying to find the element 6, this is the
worst-case scenario, as 6 is not present in the array at all.
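A linear-search sketch in C matching this example (the function name find is our own; it returns the index of the key, or -1 if the key is absent):

    /* Linear search: best case 1 comparison, worst case n comparisons. */
    int find(const int a[], int n, int key) {
        for (int i = 0; i < n; i++)
            if (a[i] == key)
                return i;       /* found: best case when i == 0 */
        return -1;              /* not found: worst case, n comparisons */
    }

    /* e.g. on [1,2,3,4,5], find(a, 5, 1) -> 0   (best case)
            on [2,3,4,5,1], find(a, 5, 1) -> 4   (element at the last position)
            on [2,4,5,3,1], find(a, 5, 6) -> -1  (worst case: absent)       */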
1. Big O Notation:



The Big O notation defines the upper bound of an algorithm, i.e. the algorithm cannot
take more time than this.
In other words, Big O denotes the maximum time taken by an algorithm; thus it gives
the worst-case complexity of the algorithm.
Big O is the most commonly used notation for the time complexity of an algorithm.

A function f(n) is said to be in O(g(n)) if f(n) is bounded above by some constant
multiple c of g(n) for all large n, i.e. if there exist a positive constant c and a
non-negative integer n0 such that

f(n) <= c * g(n) for every n >= n0

Note: f(n) = the running time of our algorithm;
g(n) = an arbitrary time complexity we are trying to relate to our algorithm.

Fig. 1.4 Big O Notation

Example:
1. f(n) = 2n + 2, g(n) = n²
Note: we need f(n) to be less than or equal to c * g(n); here take c = 1.

Let n = 1
f(1) = (2*1) + 2 = 4; g(1) = 1² = 1 // f(1) > g(1)

Let n = 2
f(2) = (2*2) + 2 = 6; g(2) = 2² = 4 // f(2) > g(2)

Let n = 3
f(3) = (2*3) + 2 = 8; g(3) = 3² = 9 // f(3) < g(3)

Let n = 4
f(4) = (2*4) + 2 = 10; g(4) = 4² = 16 // f(4) < g(4)

Hence f(n) = O(g(n)) with n0 = 3, since g(n) bounds f(n) from above for all n >= 3.
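The crossover at n = 3 can also be checked numerically with a throwaway C snippet using the f and g from this example:

    #include <stdio.h>

    int main(void) {
        for (int n = 1; n <= 6; n++) {
            int f = 2 * n + 2;   /* f(n) = 2n + 2        */
            int g = n * n;       /* g(n) = n^2, with c = 1 */
            printf("n=%d  f=%2d  g=%2d  %s\n", n, f, g,
                   f <= g ? "f <= g" : "f >  g");
        }
        return 0;
    }
    /* From n = 3 onward, f(n) <= g(n), matching the conclusion above. */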
2. Omega (Ω) Notation:
It denotes the lower bound of an algorithm, i.e. the time taken by the algorithm cannot
be lower than this.
In other words, this is the fastest time an algorithm can take when provided with its
best-case input.
A function f(n) is said to be in Ω(g(n)), denoted f(n) = Ω(g(n)), if f(n) is bounded
below by some constant multiple of g(n) for all large n, i.e. if there exist a positive
constant c and a non-negative integer n0 such that

f(n) >= c * g(n) for all n >= n0

Fig. 1.5 Omega Ω Notation

Example:
f(n) = 2n² + 3, g(n) = 7n
Note: we need f(n) to be greater than or equal to c * g(n); here take c = 1.

Let n = 1
f(1) = 2*(1²) + 3 = 5; g(1) = 7*1 = 7 // f(1) < g(1)

Let n = 2
f(2) = 2*(2²) + 3 = 11; g(2) = 7*2 = 14 // f(2) < g(2)

Let n = 3
f(3) = 2*(3²) + 3 = 21; g(3) = 7*3 = 21 // f(3) = g(3)

Let n = 4
f(4) = 2*(4²) + 3 = 35; g(4) = 7*4 = 28 // f(4) > g(4)

Hence, f(n) = Ω(g(n)) for all n >= 3.

3. Theta (Θ) Notation:
A function f(n) is said to be in Θ(g(n)), denoted f(n) = Θ(g(n)), if f(n) is bounded both
above and below by some positive constant multiples of g(n) for all large n.
Θ is used to find the tight (average) bound of an algorithm, i.e. it defines both an upper
bound and a lower bound, and the algorithm lies between these levels:

c2 * g(n) <= f(n) <= c1 * g(n) for all n >= n0

Fig. 1.6 Theta Θ Notation


Example:
f(n) = 2n + 8, g(n) = n
Assume c1 = 8 and c2 = 2.

Let n = 1
f(1) = (2*1) + 8 = 10; g(1) = 1
c1 * g(1) = 8*1 = 8
c2 * g(1) = 2*1 = 2
c2 * g(n) <= f(n) <= c1 * g(n)
2 <= 10 <= 8 // not satisfied

Let n = 2
f(2) = (2*2) + 8 = 12; g(2) = 2
c1 * g(2) = 8*2 = 16
c2 * g(2) = 2*2 = 4
c2 * g(n) <= f(n) <= c1 * g(n)
4 <= 12 <= 16 // satisfied

Therefore, f(n) = Θ(g(n)) for n >= 2.

Let’s see all different time complexities in one graph:

Fig. 1.7 Time complexity graph


Common Asymptotic Notations:
Following is a list of some common time complexities:

Sr. No  Time Complexity  Notation    Example
1.      Constant         O(1)        Accessing an array element by index:
                                     int a[] = {1, 2, 3, 4, 5};
                                     printf("%d", a[2]); // output: 3
2.      Logarithmic      O(log n)    Binary search
3.      Linear           O(n)        Linear search
4.      Linearithmic     O(n log n)  Merge sort
5.      Quadratic        O(n²)       Bubble sort (2 nested for loops)
6.      Cubic            O(n³)       Matrix update (3 nested for loops)
7.      Exponential      O(2ⁿ)       Brute-force algorithms
8.      Factorial        O(n!)       Travelling Salesman Problem (TSP)
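For instance, the O(log n) entry above, binary search, halves the remaining search range on every comparison. A standard C version for a sorted array (a sketch; the function name is our own):

    /* Binary search on a sorted array: O(log n) comparisons. */
    int binary_search(const int a[], int n, int key) {
        int lo = 0, hi = n - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;  /* avoids overflow of (lo + hi) / 2 */
            if (a[mid] == key)
                return mid;                /* found */
            else if (a[mid] < key)
                lo = mid + 1;              /* discard the left half */
            else
                hi = mid - 1;              /* discard the right half */
        }
        return -1;                         /* not found */
    }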

3.3.2 Space Complexity:


Space complexity is the amount of memory space that an algorithm takes during the
execution of a particular problem.
The space needed by an algorithm is given by:

S(P) = C (fixed part) + Sp (variable part)

Fixed part: independent of the instance characteristics, i.e. space for simple variables,
constants, etc. (e.g. int a; int b).

Variable part: space for variables whose size depends on the particular problem
instance.
Example: arrays
Example:
i. Algorithm max(A, n)
   {
     result = A[1];
     for i = 2 to n do
       if A[i] > result then
         result = A[i];
     return result;
   }
Answer:
Variables i, n, result = 1 unit each // 4 bytes each; fixed part
Variable A = n units // n * 4 bytes; depends entirely on n
Total = n + 3 units
Removing the constant value:
Space complexity: O(n)

ii. Algorithm abc(d, e, f)
    {
      return d + e*f + (d + e + f)/(d + e) + 4.0;
    }
Answer:
d: 4 bytes, fixed
e: 4 bytes, fixed
f: 4 bytes, fixed
Total: at least 12 bytes
Removing the constant values:
Space complexity: O(1), constant space

iii. Algorithm sum(a, n)
     {
       s = 0.0;
       for i = 1 to n do
         s = s + a[i];
       return s;
     }
Answer:
Variables i, n, s = 4 bytes each, i.e. 4 + 4 + 4 = 12 // fixed part
Variable a = n * 4 bytes // variable part
Total = (4n + 12) bytes
Removing the constant values:
Total = 4n
Space complexity: O(n)

