Unit - 1
The term data structure is used to describe the way data is stored, and the term algorithm
is used to describe the way data is processed. Data structures and algorithms are interrelated.
Choosing a data structure affects the kind of algorithm we might use, and choosing an algorithm
affects the data structures we use.
An Algorithm is a finite sequence of instructions, each of which has a clear meaning and
can be performed with a finite amount of effort in a finite length of time. No matter what the
input values may be, an algorithm terminates after executing a finite number of instructions.
A data structure is said to be linear if its elements form a sequence or a linear list. Linear
data structures such as arrays, stacks, queues and linked lists organize data in linear order. A
data structure is said to be non-linear if its elements form a hierarchical classification in which
data items appear at various levels.
Trees and graphs are widely used non-linear data structures. Tree and graph structures
represent hierarchical relationships between individual data elements. Graphs are nothing but trees
with certain restrictions removed.
Primitive data structures are the basic data structures that are operated upon directly by
machine instructions. They have different representations on different computers. Integers,
floating point numbers, character constants, string constants and pointers come under this category.
Non-primitive data structures are more complicated data structures and are derived from
primitive data structures. They emphasize grouping the same or different data items, with a
relationship between each data item. Arrays, lists and files come under this category.
Data structures: Organization of data
The collection of data you work with in a program has some kind of structure or
organization. No matter how complex your data structures are, they can be broken down into two
fundamental types:
• Contiguous
• Non-Contiguous.
In contiguous structures, items of data are kept together in memory (either in RAM or in a
file). An array is an example of a contiguous structure, since each element in the array is located
next to one or two other elements. In contrast, items in a non-contiguous structure are scattered
in memory, but are linked to each other in some way. A linked list is an example of a
non-contiguous data structure. Here, the nodes of the list are linked together using pointers stored
in each node. Figure 1.2 below illustrates the difference between contiguous and non-contiguous
structures.
Contiguous structures:
Contiguous structures can be broken down further into two kinds: those that contain data
items of all the same size, and those where the size may differ. Figure 1.2 shows an example of each
kind. The first kind is called the array. Figure 1.3(a) shows an example of an array of numbers.
In an array, each element is of the same type, and thus has the same size.
The second kind of contiguous structure is called a structure. Figure 1.3(b) shows a simple
structure consisting of a person's name and age. In a struct, elements may be of different data
types and thus may have different sizes.
For example, a person's age can be represented with a simple integer that occupies two
bytes of memory. But his or her name, represented as a string of characters, may require many
bytes and may even be of varying length.
Coupled with the atomic types (that is, the single data-item built-in types such as integer, float
and pointers), arrays and structs provide all the "mortar" you need to build more exotic forms of
data structure, including the non-contiguous forms.
Non-contiguous structures:
Hybrid structures:
If the two basic types of structures are mixed, the result is a hybrid form, with one part
contiguous and another part non-contiguous. For example, figure 1.5 shows how to implement a
double linked list using three parallel arrays, possibly stored apart from each other in memory.
The array D contains the data for the list, whereas the arrays P and N hold the previous
and next "pointers". The pointers are actually nothing more than indexes into the D array. For
instance, D[i] holds the data for node i and P[i] holds the index of the node previous to i, which
may or may not reside at position i-1. Likewise, N[i] holds the index of the next node in the list.
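The parallel-array layout described above can be sketched in C roughly as follows. This is only an illustration: the capacity MAX_NODES, the marker NIL (-1) standing for "no node", and the helper names link_after and print_forward are assumptions, not part of the original figure.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_NODES 8   /* assumed capacity of the three arrays */
#define NIL (-1)      /* assumed marker meaning "no node" */

int D[MAX_NODES];     /* D[i]: data of node i */
int P[MAX_NODES];     /* P[i]: index of the node before node i, or NIL */
int N[MAX_NODES];     /* N[i]: index of the node after node i, or NIL */

/* Store value in slot i and link it into the list right after node `after`.
   Pass after == NIL to make node i the first node of a new list. */
void link_after(int after, int i, int value)
{
    assert(i >= 0 && i < MAX_NODES);
    assert(after == NIL || (after >= 0 && after < MAX_NODES));

    D[i] = value;
    P[i] = after;
    N[i] = (after == NIL) ? NIL : N[after];
    if (after != NIL) {
        if (N[after] != NIL)
            P[N[after]] = i;   /* old successor now points back at i */
        N[after] = i;
    }
}

/* Walk the list forward from node `head`, printing each value.
   Returns the number of nodes visited. */
int print_forward(int head)
{
    int count = 0;
    for (int i = head; i != NIL; i = N[i]) {
        printf("%d ", D[i]);
        count++;
    }
    printf("\n");
    return count;
}
```

Note that the logical order of the nodes is given by N and P, not by the array positions, so neighbouring nodes need not sit in neighbouring slots.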
The design of a data structure involves more than just its organization. You also need to
plan for the way the data will be accessed and processed, that is, how the data will be
interpreted. In fact, non-contiguous structures (including lists, trees and graphs) can be
implemented either contiguously or non-contiguously. Likewise, the structures that are normally
treated as contiguous (arrays and structures) can also be implemented non-contiguously.
The notion of a data structure in the abstract needs to be treated differently from whatever
is used to implement the structure. The abstract notion of a data structure is defined in terms
of the operations we plan to perform on the data. Considering both the organization of data and
the expected operations on the data leads to the notion of an abstract data type. An abstract data
type is a theoretical construct that consists of data as well as the operations to be performed on
the data, while hiding the implementation.
For example, a stack is a typical abstract data type. Items stored in a stack can only be
added and removed in a certain order: the last item added is the first item removed. We call these
operations pushing and popping. In this definition, we haven't specified how items are stored on
the stack, or how the items are pushed and popped. We have only specified the valid operations
that can be performed. As another example, if we wanted to read a file, we once wrote the code to
read the physical file device, and we might have had to write the same code over and over again. So
we created what is known today as an ADT: we wrote the code to read a file and placed it in a
library for programmers to use.
Similarly, the code to read from a keyboard is an ADT. It has a data
structure (a character) and a set of operations that can be used to read that data structure.
To be made useful, an abstract data type (such as a stack) has to be implemented, and this is
where data structures come into play. For instance, we might choose the simple data structure of
an array to represent the stack, and then define the appropriate indexing operations to perform
pushing and popping.
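As a rough illustration of the last point, a stack of integers backed by an array might be sketched as follows. The fixed capacity STACK_CAPACITY and the function names are assumptions made for this example only; other representations would serve the same abstract data type equally well.

```c
#include <assert.h>
#include <stdbool.h>

#define STACK_CAPACITY 100   /* assumed fixed capacity */

struct stack {
    int items[STACK_CAPACITY];
    int top;                 /* number of items currently stored */
};

/* Make the stack empty. */
void stack_init(struct stack *s)
{
    s->top = 0;
}

bool stack_is_empty(const struct stack *s)
{
    return s->top == 0;
}

/* Push value on top of the stack. Returns false if the stack is full. */
bool stack_push(struct stack *s, int value)
{
    if (s->top == STACK_CAPACITY)
        return false;
    s->items[s->top++] = value;
    return true;
}

/* Pop and return the most recently pushed value.
   The caller must make sure the stack is not empty. */
int stack_pop(struct stack *s)
{
    assert(!stack_is_empty(s));
    return s->items[--s->top];
}
```

Code that uses the stack calls only stack_push() and stack_pop(); it never needs to know that an array sits underneath.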
The most important step in designing a program involves choosing which data
structure to use. The choice depends greatly on the type of operations you wish to perform.
Suppose we have an application that uses a sequence of objects, where one of the main
operations is to delete an object from the middle of the sequence. The code for this is as follows:
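The original listing is not reproduced in this text. A minimal sketch of such a function, assuming the sequence is an array of int with a caller-maintained element count, could look like this. The names delete_at, a and n are illustrative; only posn comes from the text.

```c
#include <assert.h>

/* Delete the element at index posn from the array a, which currently
   holds *n elements. Every element after posn moves one slot towards
   the front, and the element count is reduced by one. */
void delete_at(int a[], int *n, int posn)
{
    assert(posn >= 0 && posn < *n);
    for (int i = posn; i < *n - 1; i++)
        a[i] = a[i + 1];
    (*n)--;
}
```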
This function shifts towards the front all elements that follow the element at position
posn. This shifting involves data movement that, for integer elements, is not too costly.
However, suppose the array stores larger objects, and lots of them. In this case, the overhead for
moving data becomes high. The problem is that, in a contiguous structure such as an array, the
logical ordering (the ordering that we wish to interpret our elements to have) is the same as the
physical ordering (the ordering that the elements actually have in memory).
The process of deleting a node from a list is independent of the type of data stored in the
node, and can be accomplished with some pointer manipulation as illustrated in the figure below:
Since very little data is moved during this process, deletion using linked lists will
often be faster than when arrays are used. It may seem that linked lists are superior to arrays. But
is that always true? There are tradeoffs. Our linked lists yield faster deletions, but they take up
more space because they require two extra pointers per element.
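The pointer manipulation involved can be sketched as follows for a list whose nodes carry two link pointers each. The type name dnode and the function dlist_delete are assumptions made for this example; the point is that only pointers change and no element data is copied.

```c
#include <assert.h>
#include <stdlib.h>

struct dnode {
    int data;
    struct dnode *prev;   /* previous node, or NULL for the first node */
    struct dnode *next;   /* next node, or NULL for the last node */
};

/* Unlink target from the list whose first node is *head, then free it.
   The neighbours are re-linked around the deleted node. */
void dlist_delete(struct dnode **head, struct dnode *target)
{
    assert(head != NULL && target != NULL);

    if (target->prev != NULL)
        target->prev->next = target->next;
    else
        *head = target->next;          /* target was the first node */

    if (target->next != NULL)
        target->next->prev = target->prev;

    free(target);
}
```

The cost of the operation does not depend on how large the stored data is, which is where the advantage over shifting array elements comes from.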
Linked List
Linked lists and arrays are similar since they both store collections of data. The array is the
most common data structure used to store collections of elements. Arrays are convenient to
declare and provide an easy syntax to access any element by its index number. Once the array is
set up, access to any element is convenient and fast. The disadvantages of arrays are:
• The size of the array is fixed. Most often this size is specified at compile time. This makes
programmers allocate arrays that seem "large enough", which are often larger than required.
• Inserting new elements at the front is potentially expensive because existing elements need to
be shifted over to make room.
• Deleting an element is similarly expensive: the elements that follow it must be shifted to close
the gap, and the slot it occupied cannot be released.
Linked lists have their own strengths and weaknesses, but they happen to be strong where
arrays are weak. Generally an array allocates the memory for all its elements in one block, whereas
linked lists use an entirely different strategy. Linked lists allocate memory for each element
separately and only when necessary.
Here is a quick review of the terminology and rules of pointers. The linked list code will depend
on the following functions:
malloc() is a system function which allocates a block of memory in the "heap" and returns a
pointer to the new block. The prototypes of malloc() and the other heap functions are in stdlib.h.
malloc() returns NULL if it cannot fulfill the request. It is declared as:
void *malloc (size_t number_of_bytes);
Since a void * is returned, the C standard states that this pointer can be converted to any
object pointer type.
For example
char *cp;
cp = (char *) malloc (100);
This attempts to get 100 bytes and assigns the starting address to cp. We can also use the sizeof
operator to specify the number of bytes. For example,
int *ip;
ip = (int *) malloc (100*sizeof(int));
free() is the opposite of malloc(): it de-allocates memory. The argument to free() is a pointer
to a block of memory in the heap, a pointer which was obtained from a malloc() call. The
syntax is:
free (ptr);
The advantage of free() is simply memory management: it returns a block to the heap when we no
longer need it.
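Putting the two functions together, a typical pattern is to check the result of malloc() before using the block and to call free() exactly once when the block is no longer needed. The sketch below illustrates this; the helper name make_filled is an assumption made for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate an array of n ints and set every element to value.
   Returns NULL if malloc() cannot fulfil the request.
   The caller owns the block and must release it with free(). */
int *make_filled(size_t n, int value)
{
    assert(n > 0);                      /* malloc(0) is implementation-defined */
    int *ip = malloc(n * sizeof *ip);   /* no cast is required in C */
    if (ip == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        ip[i] = value;
    return ip;
}
```

A caller would write something like `int *ip = make_filled(100, 0);`, test `ip` against NULL, use the array, and finish with `free(ip);`.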
Linked lists have many advantages. Some of the very important advantages are:
1. Linked lists are dynamic data structures. i.e., they can grow or shrink during the execution of a
program.
2. Linked lists have efficient memory utilization. Here, memory is not pre-allocated. Memory is
allocated whenever it is required and it is de-allocated (removed) when it is no longer needed.
3. Insertions and deletions are easier and more efficient. Linked lists provide flexibility in inserting a
data item at a specified position and deleting a data item from a given position.
4. Many complex applications can be easily carried out with linked lists.
Linked lists also have some disadvantages:
1. They consume more space because every node requires an additional pointer to store the address of
the next node.
2. Searching for a particular element in a list is difficult and also time consuming.
Basically we can put linked lists into the following four types:
1. Single Linked List.
2. Double Linked List.
3. Circular Linked List.
4. Circular Double Linked List.
A single linked list is one in which all nodes are linked together in some sequential manner.
Hence, it is also called a linear linked list.
A double linked list is one in which all nodes are linked together by multiple links, which help in
accessing both the successor node (next node) and the predecessor node (previous node) from any
arbitrary node within the list. Therefore each node in a double linked list has two link fields
(pointers) to point to the left node (previous) and the right node (next). This helps to traverse the
list in both the forward and the backward direction.
A circular linked list is one which has no beginning and no end. A single linked list can be
made a circular linked list by simply storing the address of the very first node in the link field of the
last node. A circular double linked list is one which has both the successor pointer and the
predecessor pointer arranged in a circular manner.
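The conversion from a single linked list to a circular one can be sketched as below. The function names make_circular and circular_length are assumptions made for this example, and the node layout is the usual data field plus one link field.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int data;
    struct node *next;
};

/* Turn a non-empty, NULL-terminated list into a circular list by
   storing the address of the first node in the last node's link field. */
void make_circular(struct node *start)
{
    assert(start != NULL);
    struct node *last = start;
    while (last->next != NULL)
        last = last->next;
    last->next = start;
}

/* Count the nodes of a circular list by walking until we are back
   at the node we started from. Returns 0 for an empty list. */
int circular_length(const struct node *start)
{
    if (start == NULL)
        return 0;
    int count = 1;
    for (const struct node *p = start->next; p != start; p = p->next)
        count++;
    return count;
}
```

Because no link field is NULL in a circular list, traversal code has to stop when it returns to its starting node rather than when it meets NULL.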
Applications of linked list:
1. Linked lists are used to represent and manipulate polynomials. Polynomials are expressions
containing terms with non-zero coefficients and exponents.
For example: P(x) = a_0 x^n + a_1 x^{n-1} + ... + a_{n-1} x + a_n
2. Representing very large numbers and operations on them, such as addition,
multiplication and division.
3. Linked lists are used to implement stacks, queues, trees and graphs.
4. Implementing the symbol table in compiler construction.
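To make the first application concrete, one possible representation stores each term of the polynomial in its own node. The sketch below assumes integer coefficients and non-negative integer exponents; the names term and poly_eval are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* One term coeff * x^exp of a polynomial. */
struct term {
    int coeff;
    int exp;
    struct term *next;   /* next term, or NULL after the last term */
};

/* Evaluate the polynomial at x by summing coeff * x^exp over all terms.
   An empty list (NULL) is treated as the zero polynomial. */
long poly_eval(const struct term *p, long x)
{
    long sum = 0;
    for (; p != NULL; p = p->next) {
        assert(p->exp >= 0);
        long power = 1;
        for (int i = 0; i < p->exp; i++)
            power *= x;
        sum += p->coeff * power;
    }
    return sum;
}
```

Only the terms with non-zero coefficients need a node, so a sparse polynomial such as x^100 + 1 takes two nodes instead of an array of 101 coefficients.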
Singly Linked List:
A linked list allocates space for each element separately in its own block of
memory called a "node". The list gets an overall structure by using pointers to connect all
its nodes together like the links in a chain. Each node contains two fields: a "data" field to store
the element, and a "next" field which is a pointer used to link to the next node. Each node is
allocated in the heap using malloc(), so the node memory continues to exist until it is explicitly
de-allocated using free(). The front of the list is a pointer to the "start" node.
The beginning of the linked list is stored in a "start" pointer which points to the
first node. The first node contains a pointer to the second node. The second node contains
a pointer to the third node, and so on. The last node in the list has its next field set to NULL to
mark the end of the list. Code can access any node in the list by starting at start and following
the next pointers.
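Following the next pointers from start can be sketched as below. The function names traverse and nth_node are assumptions made for this example.

```c
#include <assert.h>
#include <stdio.h>

struct node {
    int data;
    struct node *next;
};

/* Visit every node from start up to the terminating NULL, printing
   each data value. Returns the number of nodes visited. */
int traverse(const struct node *start)
{
    int count = 0;
    for (const struct node *p = start; p != NULL; p = p->next) {
        printf("%d -> ", p->data);
        count++;
    }
    printf("NULL\n");
    return count;
}

/* Return the k-th node (counting from 0), or NULL if the list has
   fewer than k + 1 nodes. */
struct node *nth_node(struct node *start, int k)
{
    assert(k >= 0);
    struct node *p = start;
    while (p != NULL && k > 0) {
        p = p->next;
        k--;
    }
    return p;
}
```

Reaching the k-th node takes k pointer steps, in contrast to an array where any element can be reached directly by its index.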
The start pointer is an ordinary local pointer variable, so it is drawn separately on the left
top to show that it is in the stack. The list nodes are drawn on the right to show that they are
allocated in the heap.
Before writing the code to build the above list, we need a start pointer, used to
create and access the other nodes in the linked list. The following structure definition will do:
• Create a structure with one data item and a next pointer, which will point to the next node
of the list. This is called a self-referential structure.
• Initialise the start pointer to NULL.
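struct node
{
    int data;              /* the element stored in this node */
    struct node *next;     /* link to the next node; NULL at the end of the list */
};
struct node *start = NULL; /* the list is initially empty */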
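With this structure in place, creating a list amounts to allocating nodes one at a time and linking each new node after the previous one. The sketch below is one way to do it; the helper names getnode, create_list and free_list are assumptions made for this example, and the values are taken from an array instead of being read from the user.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    int data;
    struct node *next;
};

/* Allocate one node holding value; its next field starts as NULL.
   Returns NULL if malloc() cannot fulfil the request. */
struct node *getnode(int value)
{
    struct node *newnode = malloc(sizeof *newnode);
    if (newnode != NULL) {
        newnode->data = value;
        newnode->next = NULL;
    }
    return newnode;
}

/* Release every node of the list. */
void free_list(struct node *start)
{
    while (start != NULL) {
        struct node *next = start->next;
        free(start);
        start = next;
    }
}

/* Build a list holding values[0..n-1] in order.
   Returns the start pointer, or NULL if n is 0 or an allocation fails. */
struct node *create_list(const int values[], size_t n)
{
    assert(values != NULL || n == 0);
    struct node *start = NULL;
    struct node *last = NULL;
    for (size_t i = 0; i < n; i++) {
        struct node *newnode = getnode(values[i]);
        if (newnode == NULL) {
            free_list(start);      /* do not leak a half-built list */
            return NULL;
        }
        if (start == NULL)
            start = newnode;       /* first node of the list */
        else
            last->next = newnode;  /* link after the current last node */
        last = newnode;
    }
    return start;
}
```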
The basic operations in a single linked list are:
• Creation.
• Insertion.
• Deletion.
• Traversing.
Figure shows 4 items in a single linked list stored at different locations in memory.
Insertion of a Node:
One of the most primitive operations that can be done in a singly linked list is the
insertion of a node. Memory is to be allocated for the new node (in the same way as is done
while creating a list) before reading the data. The new node will contain an empty data field and
an empty next field. The data field of the new node is then filled with the information read from the
user. The next field of the new node is set to NULL. The new node can then be inserted at
three different places, namely:
• Inserting a node at the beginning.
• Inserting a node at the end.
• Inserting a node at intermediate position.
The following steps are to be followed to insert a new node at the beginning of the list:
• Get the new node.
• If the list is empty then start = newnode.
• If the list is not empty, follow the steps given below:
newnode -> next = start;
start = newnode;
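The steps above can be sketched as a C function. The name insert_at_beg and the choice to pass the value as a parameter (instead of reading it from the user) are assumptions made for this example. Note that the two assignments also cover the empty-list case, because start is NULL then and the new node's next field simply becomes NULL.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    int data;
    struct node *next;
};

/* Insert a new node holding value at the beginning of the list.
   *start is updated to point to the new node.
   Returns 1 on success, 0 if memory could not be allocated. */
int insert_at_beg(struct node **start, int value)
{
    assert(start != NULL);
    struct node *newnode = malloc(sizeof *newnode);
    if (newnode == NULL)
        return 0;
    newnode->data = value;
    newnode->next = *start;   /* NULL when the list is empty */
    *start = newnode;
    return 1;
}
```

The work done is the same however long the list is, which is why insertion at the front of a linked list avoids the shifting an array would need.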
Deletion of a node:
Another primitive operation that can be done in a singly linked list is the deletion of
a node. Memory is to be released for the node to be deleted. A node can be deleted from the list
from three different places, namely:
• Deleting a node at the beginning.
• Deleting a node at the end.
• Deleting a node at intermediate position.
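All three cases can be handled by a single routine that takes the position of the node to delete. The following is only a sketch under that assumption: the name delete_at_pos and the 0-based position parameter are illustrative, and separate functions for each case would work equally well.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    int data;
    struct node *next;
};

/* Delete the node at 0-based position pos from the list.
   pos == 0 deletes the first node; the last valid position deletes
   the node at the end. Returns 1 if a node was deleted, 0 if the
   list has no node at that position. */
int delete_at_pos(struct node **start, int pos)
{
    assert(start != NULL && pos >= 0);

    /* link points at the pointer that leads to the current node:
       first at start itself, then at some node's next field. */
    struct node **link = start;
    while (*link != NULL && pos > 0) {
        link = &(*link)->next;
        pos--;
    }
    if (*link == NULL)
        return 0;

    struct node *victim = *link;
    *link = victim->next;   /* bypass the node */
    free(victim);           /* release its memory */
    return 1;
}
```

Deleting at the beginning takes one step, while deleting at the end or at an intermediate position first requires walking to the node just before the one being removed.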