TECHNICAL UNIVERSITY OF CRETE
DEPT OF ELECTRONIC AND COMPUTER ENGINEERING
DATA STRUCTURES
AND
FILE STRUCTURES
Euripides G.M. Petrakis
http://www.intelligence.tuc.gr/~petrakis
Chania, 2007
E.G.M. Petrakis Abstract Data Types (ADT) 1
Introduction
We study data structures and we
learn how to write efficient programs
this has nothing to do with programming tricks
but rather with
good organization of information and
good algorithms that save
computer memory and running time
Data Structures
Representation of data in the memory
file structure: representation of data on
the disk
e.g., collection of records (list, tree, etc)
Efficient programs require efficient
data structures
a problem has to be solved within the
given time and space constraints
Problem Constraints
Each problem puts constraints on time and
space
e.g., bank example:
start account: a few minutes
transactions: a few seconds
close account: overnight
A solution is efficient if it solves the
problem within its space and time
constraints
Cost of solution: amount of resources
consumed
Goals of this course
Teach data structures for main memory
and disk
Teach algorithms for different tasks and
data structures
Teach the idea of trade-offs
there are costs and benefits associated with
each data structure and each algorithm
Teach how to measure effectiveness of
each algorithm and data structure
Selecting Data Structures
1. Analyze the problem to determine the
resource constraints a solution must meet
2. Determine the operations that must be
supported
• e.g., record search, insertion, deletion etc.
3. Quantify the constraints for each
operation
• e.g., search operations must be very fast
4. Select the data structure that best meets
these requirements
Costs & Benefits
Each data structure requires:
space for each data item it stores
time to perform each operation
programming effort to implement it
Each data structure has costs and benefits
rarely one data structure is better than
another in all situations
one may permit faster search (or insertion or
deletion) operations than another
are all operations equally important?
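The trade-off can be illustrated with a minimal sketch in C. The type IntArray, the constant CAPACITY and the four function names are hypothetical, chosen only for this example. An unsorted array inserts in constant time but searches in linear time; a sorted array searches in logarithmic time but pays linear time per insertion.

```c
#include <assert.h>
#include <stddef.h>

#define CAPACITY 100   /* illustrative fixed capacity (assumption) */

typedef struct {
    int items[CAPACITY];
    size_t count;      /* number of items currently stored */
} IntArray;

/* Unsorted array: insertion appends at the end, O(1). */
void unsorted_insert(IntArray *a, int key) {
    assert(a->count < CAPACITY);
    a->items[a->count++] = key;
}

/* Unsorted array: search scans the items one by one, O(n).
   Returns the index of key, or -1 if it is absent. */
int unsorted_search(const IntArray *a, int key) {
    for (size_t i = 0; i < a->count; i++)
        if (a->items[i] == key) return (int)i;
    return -1;
}

/* Sorted array: insertion shifts larger items one place right, O(n). */
void sorted_insert(IntArray *a, int key) {
    assert(a->count < CAPACITY);
    size_t i = a->count;
    while (i > 0 && a->items[i - 1] > key) {
        a->items[i] = a->items[i - 1];
        i--;
    }
    a->items[i] = key;
    a->count++;
}

/* Sorted array: binary search over the range [lo, hi), O(log n).
   Returns the index of key, or -1 if it is absent. */
int sorted_search(const IntArray *a, int key) {
    size_t lo = 0, hi = a->count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (a->items[mid] == key) return (int)mid;
        if (a->items[mid] < key) lo = mid + 1;
        else hi = mid;
    }
    return -1;
}
```

Which variant is preferable depends on the mix of operations: many insertions and few searches favor the unsorted array, the opposite mix favors the sorted one.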
Abstract Data Type (ADT)
ADT: definition of a data type in terms of
a set of values and
a set of operations allowed on that data type.
Each ADT operation is defined by its
inputs and outputs
ADTs hide implementation details
A data structure is the implementation of
an ADT
operations associated with the ADT are
implemented by one or more functions
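As an illustration, the sketch below separates an ADT from one data structure that implements it. The stack of integers, the names stack_init, stack_push, stack_pop, stack_is_empty and the constant STACK_MAX are assumptions made for this example; they do not come from the slides.

```c
#include <assert.h>
#include <stdbool.h>

/* ---- ADT: a set of values plus the operations allowed on them ---- */
typedef struct Stack Stack;            /* hypothetical type name */
void stack_init(Stack *s);
bool stack_is_empty(const Stack *s);
void stack_push(Stack *s, int x);      /* precondition: stack is not full */
int  stack_pop(Stack *s);              /* precondition: stack is not empty */

/* ---- Data structure: one possible implementation (fixed array) ---- */
#define STACK_MAX 64                   /* illustrative capacity */
struct Stack {
    int data[STACK_MAX];
    int top;                           /* number of items stored */
};

void stack_init(Stack *s) { s->top = 0; }

bool stack_is_empty(const Stack *s) { return s->top == 0; }

void stack_push(Stack *s, int x) {
    assert(s->top < STACK_MAX);        /* check the precondition */
    s->data[s->top++] = x;
}

int stack_pop(Stack *s) {
    assert(s->top > 0);                /* check the precondition */
    return s->data[--s->top];
}
```

Code that calls only the four operations keeps working if the array is later replaced by, say, a linked list, because only the lower half of the sketch would change.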
Logical and Physical forms
Data items have both a logical and a
physical form
1. Logical form: definition of the data
item within an ADT
e.g., integers in mathematical sense: +, -
2. Physical form: implementation of the
data item
e.g., 16 or 32 bit integers
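The gap between the two forms can be sketched in C. In the logical form, a sum of two integers always exists; in a 16-bit physical form it may not be representable. The function name fits_in_16_bits is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Logical form: integers are unbounded, so a + b is always an integer.
   Physical form: int16_t holds only INT16_MIN..INT16_MAX
   (-32768..32767), so some sums cannot be stored. */

/* Returns 1 if a + b is representable in 16 bits, 0 otherwise.
   The sum is computed in the wider 32-bit type, where it cannot overflow. */
int fits_in_16_bits(int16_t a, int16_t b) {
    int32_t sum = (int32_t)a + (int32_t)b;
    return sum >= INT16_MIN && sum <= INT16_MAX;
}
```

For example, 1000 + 2000 fits, while INT16_MAX + 1 does not, although both are ordinary additions in the mathematical sense.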
Data Type
Data Type: ADT (type + operations); data items in logical form
Data Structure: storage space + functions; data items in physical form
ADT String: Sequence of chars
• ADT function length (s: string): integer;
postcondition: length = len(s);
• ADT function concat (s1, s2: string): string;
postcondition: concat = s1 + s2;
• ADT function substr (s: string, i, j: integer): string;
precondition: 0 < i < len(s), 0 < j < len(s) - i + 1
postcondition: substr(s, i, j);
• ADT function pos (s1, s2): integer;
precondition …
postcondition …
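One possible C implementation of the first three operations is sketched below. The names str_length, str_concat and str_substr are hypothetical. The sketch assumes that i is a 1-based start position and that j is the length of the substring; the slide does not state this. The preconditions are checked with assert exactly as written above. The function pos is omitted because its conditions are not given.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* length: postcondition length = len(s). */
size_t str_length(const char *s) {
    return strlen(s);
}

/* concat: returns a newly allocated string holding s1 followed by s2.
   The caller must free the result. Returns NULL if allocation fails. */
char *str_concat(const char *s1, const char *s2) {
    size_t n1 = strlen(s1), n2 = strlen(s2);
    char *r = malloc(n1 + n2 + 1);
    if (r == NULL) return NULL;
    memcpy(r, s1, n1);
    memcpy(r + n1, s2, n2 + 1);      /* also copies the terminating '\0' */
    return r;
}

/* substr: ASSUMPTION: i is a 1-based start position, j is the length.
   The caller must free the result. Returns NULL if allocation fails. */
char *str_substr(const char *s, size_t i, size_t j) {
    size_t len = strlen(s);
    assert(0 < i && i < len);            /* precondition on i */
    assert(0 < j && j < len - i + 1);    /* precondition on j */
    char *r = malloc(j + 1);
    if (r == NULL) return NULL;
    memcpy(r, s + (i - 1), j);
    r[j] = '\0';
    return r;
}
```

Under these assumptions, str_substr("abcdef", 2, 3) yields "bcd".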
Definition of an ADT
Depends on the application
Different definitions for the same
application
An ADT hides implementation details
different implementations for the same ADT
When the ADT is given, its data type can
be used by the programmer
e.g., string, math libraries in C
when the implementation changes the programs
need not be changed
Algorithms
The method that solves a problem
An algorithm takes the input to a problem
and transforms it to the output
a mapping of input to output
a problem can have many algorithms
A program is the implementation of an
algorithm in some programming language
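A small sketch of one problem with two algorithms: computing the sum 1 + 2 + ... + n. Both functions map the same input to the same output but do different amounts of work. The function names are hypothetical, and n is assumed small enough that the result fits in an unsigned long.

```c
#include <assert.h>

/* Algorithm 1: add the numbers one by one (n additions). */
unsigned long sum_loop(unsigned long n) {
    unsigned long total = 0;
    for (unsigned long k = 1; k <= n; k++)
        total += k;
    return total;
}

/* Algorithm 2: closed form n(n+1)/2 (a constant number of operations). */
unsigned long sum_formula(unsigned long n) {
    return n * (n + 1) / 2;
}
```

Each function is a program, that is, an implementation of its algorithm in C.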
Properties of Algorithms
Effectiveness: the algorithm can be written as a
program
there are problems for which no algorithm
exists
Correctness: finds the correct solution for every
input
Termination: terminates after a finite number of
steps
each step requires a finite amount of time
Efficiency: makes efficient use of the computer’s
resources
Complexity: it must be easy to implement, code
and debug
Tiling Problem
The algorithm inputs a finite set T of tiles
it is assumed that an unlimited number of tiles
of each type is available
asks whether any finite area, of any size, can
be covered using tiles in T so that
the colors in any two touching edges are the
same
For any algorithm there can be inputs T for
which the algorithm never terminates or
finds a wrong answer
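The matching rule itself is easy to check for one given arrangement, as the sketch below shows. This does not solve the tiling problem: it only verifies a single finite arrangement supplied by the caller. The Tile representation (one integer color per edge) and the function name is_valid_tiling are assumptions made for this example.

```c
#include <assert.h>

/* Hypothetical representation: one color code per edge of a tile. */
typedef struct { int top, right, bottom, left; } Tile;

/* Returns 1 if every pair of touching edges in the rows x cols
   arrangement has the same color, 0 otherwise.
   The grid is stored row by row: grid[r * cols + c]. */
int is_valid_tiling(const Tile *grid, int rows, int cols) {
    for (int r = 0; r < rows; r++) {
        for (int c = 0; c < cols; c++) {
            const Tile *t = &grid[r * cols + c];
            /* compare with the right-hand neighbor, if any */
            if (c + 1 < cols && t->right != grid[r * cols + c + 1].left)
                return 0;
            /* compare with the neighbor below, if any */
            if (r + 1 < rows && t->bottom != grid[(r + 1) * cols + c].top)
                return 0;
        }
    }
    return 1;
}
```

Checking one arrangement takes a finite number of steps; the difficulty of the tiling problem lies in deciding the question for areas of every size.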
Figure: tile types that can tile any area, and tile types that cannot tile any area
From “Algorithmics”, David Harel
A Termination Problem
An algorithm must terminate with the
correct solution for any input
int OddEven( int n ) {
    /* repeat until n reaches 1 */
    while ( n > 1 )
        if (( n % 2 ) == 0) n = n / 2;   /* even: halve n */
        else n = 3 * n + 1;              /* odd: triple n and add 1 */
    return n;
}
No one has been able to prove that the
algorithm terminates for any positive n
although most people believe that it does!!
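To observe the behavior for particular inputs, the loop can be instrumented to count its iterations. The sketch below is an illustration, not a proof of anything: the function name odd_even_steps and the safety bound max_steps are assumptions, the bound being added precisely because termination is unproven. A wider unsigned type is used since intermediate values grow well beyond the starting n.

```c
#include <assert.h>

/* Counts how many times the OddEven loop body runs before n reaches 1.
   Returns -1 if max_steps iterations were not enough
   (max_steps is a hypothetical safety bound). */
long odd_even_steps(unsigned long long n, long max_steps) {
    long steps = 0;
    assert(n >= 1);                      /* defined for positive n only */
    while (n > 1) {
        if (steps == max_steps) return -1;
        n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
        steps++;
    }
    return steps;
}
```

For n = 6 the sequence is 6, 3, 10, 5, 16, 8, 4, 2, 1, so the function returns 8; for n = 27 it returns 111.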
Taxonomy of Algorithms
An algorithmic problem that admits no algorithm
is termed “non-computable”
If it is a decision problem it is termed “undecidable”
Disk Model
T = Taccess + Trotation + Tread
Block: unit of memory size for the disk
size of data transferred to main memory
per disk access
In most cases
page=block=track
e.g., 1, 2, 4, 8Kbytes
Disk Model (cont.)
Taccess > Trotation > Tread ⇒ increase the amount of data
transferred to main memory per disk access
large blocks, compression, data in memory
in multi-user systems, the disk head can be anywhere
Figure: time versus distance covered by the disk head
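The effect of the block size can be sketched with the cost model T = Taccess + Trotation + Tread. The struct DiskModel, the function names and every numeric value used with them are illustrative assumptions, not measurements. The sketch also assumes that each block needs its own head movement and rotational delay, as happens when the disk head can be anywhere between accesses.

```c
#include <assert.h>

/* Illustrative disk parameters (assumed values are supplied by the caller). */
typedef struct {
    double t_access_ms;        /* Taccess: moving the disk head */
    double t_rotation_ms;      /* Trotation: rotational delay */
    double t_read_ms_per_kb;   /* Tread for one Kbyte of data */
} DiskModel;

/* T for one disk access that transfers a block of block_kb Kbytes. */
double access_time_ms(const DiskModel *d, double block_kb) {
    return d->t_access_ms + d->t_rotation_ms
         + d->t_read_ms_per_kb * block_kb;
}

/* Number of disk accesses needed to read file_kb Kbytes, rounded up. */
long accesses_needed(long file_kb, long block_kb) {
    assert(block_kb > 0);
    return (file_kb + block_kb - 1) / block_kb;
}

/* Total time to read the file, one access per block. */
double file_read_time_ms(const DiskModel *d, long file_kb, long block_kb) {
    return accesses_needed(file_kb, block_kb)
         * access_time_ms(d, (double)block_kb);
}
```

With assumed values of 8 ms, 4 ms and 0.0625 ms per Kbyte, reading 1024 Kbytes costs 1024 accesses of 12.0625 ms each with 1 Kbyte blocks, but only 128 accesses of 12.5 ms each with 8 Kbyte blocks: the total Tread is unchanged, while Taccess and Trotation are paid far fewer times.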