University of Veszprem
Hungary
DATA STRUCTURES
AND
ALGORITHM ANALYSIS
lecture notes
Sandor Dominich
Adrienn Skrop
Maria Horvath
2001.
Mathematics Review
1. Exponents
a^b · a^c = a^(b+c)
a^b / a^c = a^(b−c)
(a^b)^c = a^(bc)
a^n + a^n = 2a^n ≠ a^(2n)
2^n + 2^n = 2^(n+1)
2. Logarithms
In these notes, log typically means log base 2: log₂ n is written simply log n.
Definition: log_a b = x  ⟺  a^x = b
Properties:
  log(ab) = log a + log b
  log(a/b) = log a − log b
  log_a b = log_c b / log_c a
  log 1 = 0
  log 2 = 1
Note: ln denotes log_e and lg denotes log_10; writing log for log₂ is just a convention (a compromise).
3. Series
Geometric series

  Σ_{i=0}^{n} a^i = (a^(n+1) − 1) / (a − 1)

  a = 2:  Σ_{i=0}^{n} 2^i = 2^(n+1) − 1

  0 < a < 1:  Σ_{i=0}^{∞} a^i = 1 / (1 − a)

Derivation (the '=' is understood for n → ∞):
  1 + a + a² + a³ + ... = S
  a + a² + a³ + a⁴ + ... = aS
Subtracting: S − aS = 1, so

  S = 1 / (1 − a)

Note: only valid if the series converges (|a| < 1).

HW: show that  Σ_{i=1}^{∞} i/2^i = 2
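The geometric-series formulas of this section can be checked numerically; the sketch below (an illustration, not part of the original notes) verifies the finite formula, the infinite limit, and the HW sum.

```python
# Numerical check of the geometric-series formulas above.

def geometric_sum(a, n):
    """Sum_{i=0}^{n} a^i computed term by term."""
    return sum(a ** i for i in range(n + 1))

# Finite case: sum_{i=0}^{n} a^i = (a^(n+1) - 1) / (a - 1); here a = 2, n = 10.
assert geometric_sum(2, 10) == (2 ** 11 - 1) // (2 - 1) == 2047

# Infinite case (0 < a < 1): partial sums approach 1 / (1 - a).
partial = geometric_sum(0.5, 60)
assert abs(partial - 1 / (1 - 0.5)) < 1e-12

# HW: sum_{i=1}^{inf} i / 2^i = 2 (partial sums approach 2).
hw = sum(i / 2 ** i for i in range(1, 60))
assert abs(hw - 2.0) < 1e-12
```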
Arithmetic series
  Σ_{i=1}^{n} i = n(n + 1) / 2

  Σ_{i=1}^{n} i² = n(n + 1)(2n + 1) / 6

Harmonic number

  H_n = Σ_{i=1}^{n} 1/i = log_e n + γ_n

  γ_n → γ = 0.57721566... (Euler's constant)

  H_n ≈ ln n
4. Proofs
1. Proof by induction
We want to prove that a proposition p(n) holds.
1. Prove that p(n) holds for a base case n = n0.
2. Assume p(n) holds for some n ≥ n0.
3. Prove p(n) ⟹ p(n + 1).

Fibonacci numbers
  F0 = 1, F1 = 1 (by definition)
  F2 = F1 + F0 = 2
  Fi = Fi−1 + Fi−2,  i ≥ 2
Claim (to be proved by induction below): Fi < (5/3)^i,  i ≥ 1
2. Proof by contradiction (indirect, reductio ad absurdum)
To prove: P
1. Assume that ¬P holds.
2. Derive a contradiction with some known property; then:
3. P must be true.
Ex.:
P: there are infinitely many primes
¬P: there are finitely many primes: P1, P2, ..., Pn
Let N = P1 · P2 · ... · Pn + 1.
N > Pi, i = 1, ..., n, so N is not prime (it is not on the list).
But rem(N : Pi) = 1 ≠ 0 for every i, so no prime divides N, which contradicts
the Fundamental Theorem of Arithmetic.
We MUST conclude: ¬P is false, so P is true.

Ex.: √2 is irrational, i.e. √2 ∈ I = R \ Q
P: √2 is irrational
¬P: √2 is rational
  √2 = a/b, gcd(a, b) = 1
  2b² = a², so a is even: a = 2c
  then b² = 2c², so b is even: b = 2d
  but then gcd(a, b) ≠ 1, a contradiction.

Note on forming ¬P:
P: The sun is shining.
¬P: The sun is NOT shining.
("It is not the sun that is shining" is not the negation; using it leads to a false contradiction.)
Negation needs care with complex statements, e.g.:
P: The rain, whose mean value in South Africa exceeds that of Central Europe in
August, is only half welcome in parts of a rain forest.
or with quantified statements such as the Cauchy criterion:
  ∀ε > 0 ∃S: |x_n − x_m| < ε for all n, m > S.
3. Evaluate:

  Σ_{i=0}^{∞} 1/4^i ;   Σ_{i=0}^{∞} i/4^i

Prove:  Σ_{i=1}^{n} (2i − 1) = n²
Asymptotic notation
Θ-notation

  Θ(g(n)) = { f(n) | ∃ c1, c2, n0 : 0 ≤ c1·g(n) ≤ f(n) ≤ c2·g(n), ∀n > n0 }   (def)

[Figure: f(n) running between c1·g(n) and c2·g(n) for n > n0]

Θ: asymptotically tight bound
Ex.:
  f(n) = n²/2 − 3n
  c1·n² ≤ f(n) ≤ c2·n²
  e.g. c1 = 1/6, c2 = 1/2: the upper bound always holds, and the lower bound
  n²/2 − 3n ≥ n²/6 holds for n ≥ 9, so n0 = 9 works;
  f(n) = Θ(n²)
HW: show that 6n³ ≠ Θ(n²)

Property:
  p(n) = Σ_{i=0}^{d} a_i n^i, a_d > 0  ⟹  p(n) = Θ(n^d)
  c = const  ⟹  c = Θ(n^0) = Θ(1)
O-notation

  O(g(n)) = { f(n) | ∃ c, n0 : 0 ≤ f(n) ≤ c·g(n), ∀n ≥ n0 }   (def)

  n² = O(n²)
  n = O(n²)

[Figure: c·g(n) above f(n) beyond n0, illustrating f(n) = O(g(n))]

O: upper bound for f(n)

Ω-notation

  Ω(g(n)) = { f(n) | ∃ c, n0 : 0 ≤ c·g(n) ≤ f(n), ∀n ≥ n0 }   (def)

[Figure: f(n) above c·g(n) beyond n0, illustrating f(n) = Ω(g(n))]

Ω: lower bound for f(n)
  Ex.: n³ = Ω(n²), n² = Ω(n²), n log n = Ω(n)
Ex.:
  i, S: integers                  (declaration takes no time)
  S := 0                          (assignment: 1 unit)
  FOR i = 1 TO n                  (loop control: n + 1 units)
      S = S + i * i               (1 multiplication + 1 addition + 1 assignment per iteration)

Measuring effective running time in such units is the core of algorithm analysis.
Total: 1 + (n + 1) + 3n = 4n + 2 = Θ(n) = O(n)
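The unit-counting convention above can be made concrete in code; the sketch below (illustrative, with the accounting convention stated in the comments as an assumption) counts one unit for the initial assignment, one per loop test, and three per loop body.

```python
def sum_of_squares_with_count(n):
    """Compute S = 1^2 + ... + n^2 while counting abstract time units:
    1 for the initial assignment, n + 1 for the loop tests,
    and 3 per iteration (multiply, add, assign)."""
    units = 0
    S = 0
    units += 1                 # S := 0
    for i in range(1, n + 1):
        units += 1             # loop test
        S = S + i * i
        units += 3             # *, +, assignment
    units += 1                 # final (failing) loop test
    return S, units

S, units = sum_of_squares_with_count(10)
assert S == 385                # 10 * 11 * 21 / 6
assert units == 4 * 10 + 2     # 4n + 2 = Theta(n)
```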
Similarly, for nested loops (a loop in a loop):
  FOR ...
      FOR ...
          ...
the running time is Θ(n²).
1. Calculate:

  S = Σ_{i=1}^{∞} i/2^i = 1/2 + 2/4 + 3/8 + ...

  S/2 = 1/4 + 2/8 + 3/16 + ...

Subtracting:

  S − S/2 = 1/2 + 1/4 + 1/8 + ... = 1

so S/2 = 1, i.e. S = 2.
2. Prove:
a)  Fi < (5/3)^i,  i ≥ 1

  i = 1:  F1 = 1 < 5/3
  i = 2:  F2 = 2 < 25/9

Assume the claim holds up to i; show it for i + 1:
  Fi+1 = Fi + Fi−1 < (5/3)^i + (5/3)^(i−1)
       = (3/5 + (3/5)²) · (5/3)^(i+1)
       = (24/25) · (5/3)^(i+1)
       < (5/3)^(i+1)
b)  Σ_{i=1}^{n} i = n(n + 1)/2

  n = 1:  1 = 1·2/2 = 1
  Assume Σ_{i=1}^{n} i = n(n + 1)/2. Then
  Σ_{i=1}^{n+1} i = n(n + 1)/2 + (n + 1) = (n + 1)(n + 2)/2,
  since n(n + 1) + 2(n + 1) = (n + 1)(n + 2).
c)  Σ_{i=1}^{n} i² = n(n + 1)(2n + 1)/6

  n = 1:  1 = (1·2·3)/6
  n → n + 1:
  Σ_{i=1}^{n+1} i² = Σ_{i=1}^{n} i² + (n + 1)²
                  = n(n + 1)(2n + 1)/6 + (n + 1)²
                  = (n + 1) · [n(2n + 1) + 6n + 6] / 6
                  = (n + 1)(2n² + 7n + 6)/6
                  = (n + 1)(n + 2)(2n + 3)/6,
  since (n + 2)(2n + 3) = 2n² + 7n + 6, which is true.
3.  Σ_{i=1}^{n} (2i − 1) = n²

  Σ_{i=1}^{n} (2i − 1) = 2·Σ_{i=1}^{n} i − Σ_{i=1}^{n} 1
                       = 2 · n(n + 1)/2 − n = n² + n − n = n²
4. Prove:  Σ_{i=1}^{n−2} Fi = Fn − 2

  n = 3:  Σ_{i=1}^{1} Fi = F1 = 1 = F3 − 2 = 3 − 2
  Assume Σ_{i=1}^{n−2} Fi = Fn − 2. Then
  Σ_{i=1}^{n−1} Fi = Σ_{i=1}^{n−2} Fi + Fn−1 = (Fn − 2) + Fn−1 = Fn+1 − 2,
  since Fn+1 − 2 = Fn + Fn−1 − 2.
Asymptotic notation in equations

  n = O(n²)
  2n² + 3n + 1 = 2n² + Θ(n)   (there exist some functions in Θ(n) such that the equation holds)
  2n² + Θ(n) = Θ(n²)

o- and ω-notation
  upper bound: O may be tight, o is strictly loose
  lower bound: Ω may be tight, ω is strictly loose

  f ∈ o(g)  ⟺  lim_{n→∞} f(n)/g(n) = 0     (g(n) grows much more rapidly than f(n))
  f ∈ ω(g)  ⟺  lim_{n→∞} f(n)/g(n) = ∞     (f(n) grows much more rapidly than g(n))
Properties
Transitivity
  ( f(n) = X(g(n)) ∧ g(n) = X(h(n)) ) ⟹ f(n) = X(h(n)),  where X ∈ {O, Ω, Θ, o, ω}
Reflexivity
  f(n) = X(f(n)),  where X ∈ {O, Ω, Θ}
Symmetry
  f(n) = Θ(g(n)) ⟺ g(n) = Θ(f(n))
Analogy between asymptotic notation and real numbers
  f(n) = O(g(n))  ~  a ≤ b
  f(n) = Ω(g(n))  ~  a ≥ b
  f(n) = Θ(g(n))  ~  a = b
  f(n) = o(g(n))  ~  a < b
  f(n) = ω(g(n))  ~  a > b
Trichotomy
Dichotomy: Boolean logic (two values)
Trichotomic: three different values
Trichotomy: a, b ∈ R: exactly one of the following holds:
  a < b
  a = b
  a > b
Note: this holds for every pair of real numbers, and we can prove it.
Asymptotic notation: trichotomy doesn't hold.
Not any two functions can be compared in an asymptotic sense.
Ex.: f(n) = n,  g(n) = n^(1+sin n)
They can't be compared (the exponent oscillates between 0 and 2).
f(n) = n,  g(n) = n^(sin n)
They can be compared: f(n) is a tight upper bound for g(n).
Floor, ceiling
Floor of x ∈ R:
  ⌊x⌋: the greatest integer less than or equal to x
Ceiling of x ∈ R:
  ⌈x⌉: the least integer greater than or equal to x.
  x − 1 < ⌊x⌋ ≤ x ≤ ⌈x⌉ < x + 1
  n ∈ Z:  ⌈n/2⌉ + ⌊n/2⌋ = n
Rate of growth of polynomials and exponentials
  lim_{x→∞} x^n / a^x = 0,   n ∈ N \ {0},  a ∈ (1, ∞)

Any exponential grows faster than any polynomial!

Series
  e^x = Σ_{i=0}^{∞} x^i / i!
  e^x ≥ 1 + x
  1 + x ≤ e^x ≤ 1 + x + x²,  |x| ≤ 1
  x → 0:  e^x = 1 + x + Θ(x²)
  e^x = lim_{n→∞} (1 + x/n)^n
  ln(1 + x) = x − x²/2 + x³/3 − ...,  |x| < 1
Stirling approximation:

  n! = √(2πn) · (n/e)^n · (1 + Θ(1/n))

  √(2πn) · (n/e)^n ≤ n! ≤ √(2πn) · (n/e)^n · e^(1/(12n))
Fibonacci numbers and the Golden ratio
  Fi = Fi−1 + Fi−2,  i ≥ 2,  F0 = 0, F1 = 1

Golden section of a segment AB by a point P:
  AB/AP = AP/PB
  AP = AB · (√5 − 1)/2
  AB = 1:  AP = (√5 − 1)/2 ≈ 0.61803

  φ = (1 + √5)/2,  φ' = (1 − √5)/2

The relation between the Fibonacci numbers and the Golden ratio (Binet's formula):

  Fi = (φ^i − φ'^i) / √5

Fibonacci numbers grow exponentially.
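A quick numeric check of the relation between the Fibonacci numbers and the golden ratio (a sketch; the function names are illustrative):

```python
import math

# Check Binet's formula F_i = (phi^i - phihat^i) / sqrt(5) against the
# recurrence, with F0 = 0, F1 = 1 as in this section.

phi = (1 + math.sqrt(5)) / 2
phihat = (1 - math.sqrt(5)) / 2

def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

for i in range(20):
    binet = (phi ** i - phihat ** i) / math.sqrt(5)
    assert round(binet) == fib(i)

# Exponential growth: F_i is the nearest integer to phi^i / sqrt(5).
assert fib(30) == round(phi ** 30 / math.sqrt(5))
```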
Summations
Sequence: a1, ..., an  (finite)
Series: a1 + ... + an + ... = Σ_{i=1}^{∞} ai  (infinite)
  lim_{n→∞} Σ_{k=1}^{n} ak : if this limit exists, we have a convergent series (else divergent).

Linearity
  Σ_{k=1}^{n} (c·ak + bk) = c·Σ_{k=1}^{n} ak + Σ_{k=1}^{n} bk
  Σ_{k=1}^{n} Θ(f(k)) = Θ( Σ_{k=1}^{n} f(k) )
Differentiation
  Σ_{k=0}^{∞} x^k = 1/(1 − x),  |x| < 1
Differentiating both sides and multiplying by x:
  Σ_{k=0}^{∞} k·x^k = x / (1 − x)²
Telescoping
Telescoping series (sums)
  Σ_{k=1}^{n} (ak − ak−1) = (a1 − a0) + (a2 − a1) + (a3 − a2) + ... + (an − an−1) = an − a0

Ex.:
  Σ_{k=1}^{n−1} 1/(k(k+1)) = Σ_{k=1}^{n−1} (1/k − 1/(k+1)) = 1 − 1/n
Products
  lg Π_{k=1}^{n} ak = Σ_{k=1}^{n} lg ak
Bounding summations
By induction
Prove:
  Σ_{k=0}^{n} 3^k = O(3^n),  i.e.  ∃c: Σ_{k=0}^{n} 3^k ≤ c·3^n

Initial value: n = 0:  1 ≤ c·1, so c ≥ 1.
Assume Σ_{k=0}^{n} 3^k ≤ c·3^n. Then
  Σ_{k=0}^{n+1} 3^k = Σ_{k=0}^{n} 3^k + 3^(n+1)
                   ≤ c·3^n + 3^(n+1)
                   = (1/3 + 1/c) · c·3^(n+1)
                   ≤ c·3^(n+1),  provided 1/3 + 1/c ≤ 1 (e.g. c = 3/2).
By bounding terms
  Σ_{k=1}^{n} ak ≤ n·amax,  where amax = max_k ak

Ex.:
  Σ_{k=1}^{n} k ≤ n·n = n²
By splitting (n even):
  Σ_{k=1}^{n} k = Σ_{k=1}^{n/2} k + Σ_{k=n/2+1}^{n} k
              ≥ 0 + Σ_{k=n/2+1}^{n} (n/2)
              = (n/2)·(n/2) = n²/4
so Σ_{k=1}^{n} k = Ω(n²).
By integrals
For monotonically increasing f:
  ∫_{m−1}^{n} f(x)dx ≤ Σ_{k=m}^{n} f(k) ≤ ∫_{m}^{n+1} f(x)dx
(for decreasing f the inequalities are reversed).

Ex.:
  Σ_{k=1}^{n} 1/k ≥ ∫_{1}^{n+1} dx/x = ln(n + 1)

[Figure: unit-width rectangles of height f(k) between the curve y = f(x), x from m−1 to n+1]
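The integral-bounding idea can be checked on the harmonic number; the sketch below (illustrative) uses the standard consequence ln(n + 1) ≤ H_n ≤ ln(n) + 1 of the bounds above for f(x) = 1/x.

```python
import math

# Check ln(n + 1) <= H_n <= ln(n) + 1, which follows from the integral bounds
# applied to the decreasing function f(x) = 1/x.

def harmonic(n):
    return sum(1.0 / k for k in range(1, n + 1))

for n in (10, 100, 1000):
    H = harmonic(n)
    assert math.log(n + 1) <= H <= math.log(n) + 1
```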
Elementary Data Structures
Data should be organised into structures so that it can be processed as required by the problem.
Elementary (basic or fundamental):
There are just a few elementary data structures. All the others rely on the elementary ones.
STACK
a) LIFO (Last in, first out)
b) FIFO (First in, first out)

LIFO
The stack is organised into locations, each with an address.
p: pointer
stack pointer: the address of the next available location
(another view: p is the address of the topmost location that isn't empty).

[Figure: a column of numbered locations with the stack pointer p marking the top]

Two basic operations:
  write: PUSH: D (new data) → p (location); p ← p + 1
  read:  POP:  p ← p − 1; (location p) → D

Note:
In theory the stack is infinite; in practice it is finite.
When it's impossible to increment the stack pointer, the stack is full: overflow.
Before writing, an overflow check is needed.
Before reading, check whether the stack is empty (underflow check), otherwise we
would try to extract data from an empty stack.

Application:
  management of the main memory when the OS is loaded
  compilers: evaluating expressions
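The PUSH/POP operations with their overflow and underflow checks can be sketched as an array-based stack (names and capacity here are illustrative):

```python
# A minimal array-based stack with the overflow/underflow checks described above.

class Stack:
    def __init__(self, capacity):
        self.loc = [None] * capacity   # locations
        self.p = 0                     # stack pointer: next available location

    def push(self, d):
        if self.p == len(self.loc):    # overflow check before writing
            raise OverflowError("stack is full")
        self.loc[self.p] = d
        self.p += 1

    def pop(self):
        if self.p == 0:                # underflow check before reading
            raise IndexError("stack is empty")
        self.p -= 1
        return self.loc[self.p]

s = Stack(3)
s.push('A'); s.push('B')
assert s.pop() == 'B'      # LIFO: last in, first out
assert s.pop() == 'A'
```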
FIFO
It's called a QUEUE.

[Figure: a row of locations with Head at one end and Tail at the other]

In theory it is infinite at both ends, but data is written in from one direction only.
  Tail: where we write the data in
  Head: where we read the data out
The space we can allocate for a QUEUE is finite, so:
  write: ENQUEUE (with overflow check)
  read:  DEQUEUE (with underflow check)

Application:
  modelling the dataflow in online processing (pressure, temperature measurements)
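ENQUEUE/DEQUEUE with the two checks can be sketched as a circular buffer over a finite array (an illustration; the circular-buffer choice is an assumption, any finite backing store works):

```python
# A minimal circular-buffer queue with ENQUEUE/DEQUEUE and overflow/underflow checks.

class Queue:
    def __init__(self, capacity):
        self.loc = [None] * capacity
        self.head = 0      # where we read data out
        self.tail = 0      # where we write data in
        self.count = 0

    def enqueue(self, d):
        if self.count == len(self.loc):          # overflow check
            raise OverflowError("queue is full")
        self.loc[self.tail] = d
        self.tail = (self.tail + 1) % len(self.loc)
        self.count += 1

    def dequeue(self):
        if self.count == 0:                      # underflow check
            raise IndexError("queue is empty")
        d = self.loc[self.head]
        self.head = (self.head + 1) % len(self.loc)
        self.count -= 1
        return d

q = Queue(3)
q.enqueue(1); q.enqueue(2)
assert q.dequeue() == 1    # FIFO: first in, first out
assert q.dequeue() == 2
```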
ARRAY
1 dimensional
  sequence: a1, a2, a3, ..., an
2 dimensional
  matrix: (a_ij), n × n
higher dimensional
  matrix of matrices
Usage: represent stack, queue, list
LIST
L: E1, E2, ..., Ei, ..., En : the elements of the list
Ei = (Di, pi): data and pointer fields
pi can contain several pointers: pij, j ∈ {1, ...}
j = 1: singly linked list

[Figure: E1 = (D1, p1) → E2 = (D2, p2) → ... → En = (Dn, NIL)]

NIL: no more list elements
p1: the address of E2
pi: the address of Ei+1
Important: the order in which the elements are linked through the pointers (the
logical order) is not necessarily the same as the physical order.
Example application:
  to store data on disk (files)
  element = track: the elements are stored on the available tracks
Operations:
INSERT: write a new element E into the list
a. in front of the first element of the list
   [Figure: E → E1 → E2 → ... → En → NIL]
   the pointer of E points towards E1
b. after the last element
   [Figure: E1 → E2 → ... → En → E → NIL]
   replacing NIL with a pointer to E
c. between the existing elements
   [Figure: E1 → ... → Ei → E → Ej → ... → En → NIL]
   rearranging the pointers
SEARCHING: Θ(n)
D: the data to be searched for
Sequentially compare D with Di, i = 1, ...: we either find it or not.

[Figure: walking the list E1 = (D1, p1) → ... → Ei = (Di, pi) → ... → En → NIL]
DELETION: delete an existing element
D: find the element that contains D and take it out of the list
a) searching
b) deletion of Ei: the pointer of Ei−1 is set to point to Ei+1

[Figure: E1 → ... → Ei−1 → Ei+1 → ... → En, bypassing Ei]

Logical deletion: just the pointers are rearranged; there is no physical deletion.
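The three list operations above can be sketched as a singly linked list (an illustration; None plays the role of NIL, and deletion is logical, only pointers are rearranged):

```python
# A singly linked list with INSERT (front), SEARCH and (logical) DELETE.

class Node:
    def __init__(self, data, nxt=None):
        self.data = data   # D_i: data field
        self.next = nxt    # p_i: pointer field (None plays the role of NIL)

class LinkedList:
    def __init__(self):
        self.head = None

    def insert_front(self, data):              # case a: in front of E1
        self.head = Node(data, self.head)

    def search(self, data):                    # sequential search: Theta(n)
        e = self.head
        while e is not None and e.data != data:
            e = e.next
        return e                               # the node found, or None

    def delete(self, data):                    # search, then bypass the node
        prev, e = None, self.head
        while e is not None and e.data != data:
            prev, e = e, e.next
        if e is None:
            return False
        if prev is None:
            self.head = e.next
        else:
            prev.next = e.next                 # pointer of E_{i-1} -> E_{i+1}
        return True

L = LinkedList()
for d in ('C', 'B', 'A'):
    L.insert_front(d)                          # list is now A -> B -> C
assert L.search('B') is not None
assert L.delete('B') is True
assert L.search('B') is None
```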
Example: array representation of a list
A list with elements A, B, C can be stored in a two-dimensional array, one column
for the data and one for the pointer (the row index of the next element):

  row | data | pointer
   1  |  A   |   2
   2  |  B   |   3
   3  |  C   |  NIL
Doubly linked list

[Figure: E1 ⇄ E2 ⇄ ... ⇄ En, each element holding pointers to both neighbours]

Sentinel: contains no data, points to the first element.

Circular list
Circular singly linked list

[Figure: E1 = (D1, p1) → E2 = (D2, p2) → ... → En → back to E1]

The sentinel always tells us which is the first element of the list.
TREE
Mathematically: acyclic connected graph
ROOT, level 0: parent of its children, has no parent
level 1: children of the root, parents of level 2
level 2: children of the nodes of level 1, parents of level 3
...
last level:
  leaves: have no children

Binary tree: every node has at most two children (a left child and a right child).
Number of levels: HEIGHT (h) of the tree.
Number of leaves ≤ 2^h.

Implementation: by means of lists.
Example: a (linked) MULTI-LIST, one list node per tree node, with NIL marking a
missing child.
Binary search tree

[Figure: root 5; left child 3 with children 2 and 5; right child 7 with child 8]

Binary search tree property:
Every key in every left subtree is at most the key of its root.
Every key in every right subtree is at least the key of its root.

WALK:
  INORDER:   L, Root, R
  PREORDER:  Root, L, R
  POSTORDER: L, R, Root

Example (for the tree above):
  INORDER:   2 3 5 5 7 8   (ascending order of keys)
  PREORDER:  5 3 2 5 7 8
  POSTORDER: 2 5 3 8 7 5
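The BST property and the three walks can be sketched in code; the keys below reproduce the example tree of this section (the insertion order is an assumption chosen to produce that shape):

```python
# Binary search tree: insertion keeping the BST property, plus the three walks.

class TNode:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(root, key):
    if root is None:
        return TNode(key)
    if key <= root.key:                 # left subtree: keys at most the root's
        root.left = insert(root.left, key)
    else:                               # right subtree: keys at least the root's
        root.right = insert(root.right, key)
    return root

def inorder(t):
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)

def preorder(t):
    return [] if t is None else [t.key] + preorder(t.left) + preorder(t.right)

def postorder(t):
    return [] if t is None else postorder(t.left) + postorder(t.right) + [t.key]

root = None
for k in (5, 3, 7, 2, 5, 8):
    root = insert(root, k)

assert inorder(root) == [2, 3, 5, 5, 7, 8]    # ascending order of keys
assert preorder(root) == [5, 3, 2, 5, 7, 8]
assert postorder(root) == [2, 5, 3, 8, 7, 5]
```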
Application:
Evaluation of an arithmetic expression:
  (A + B) * C − D : the usual (infix) mathematical form, needing parentheses and
  operator precedence rules.
Polish form:
  no parentheses, no need to check the precedence of the operators.
  postorder walk of the expression tree: A B + C * D −  : POSTFIX POLISH FORM

[Figure: expression tree with − at the root (children * and D); * has children +
and C; + has children A and B]
Application:
SORTING: arrange data
  numbers 1, 7, 6 → 1, 6, 7
Comparison sorting can be modelled by a decision tree (a specialised binary tree):
for a1, a2, a3 the root is the comparison a1:a2, its children are comparisons such
as a2:a3 and a1:a3, and each leaf is one possible sorted order.
How many comparisons are needed to sort the numbers, i.e. h = ?
Lower bound for comparison-based sorting:
  n! : number of leaves (possible orders)
  n! ≤ 2^h
  h ≥ log(n!)
HASHING
Hash table

[Figure: a table of m slots, 0 ... m−1, with keys K1, K2 hashed into slots]

SLOTS (locations): every slot is identified by a code (e.g., a serial number) 0, 1, ..., m−1.
U: Universe of all possible key values (from where the keys can take values).
K1, K2, ..., K: keys, used to identify records.
Hash function h: maps the key values to the slots.
  h: U → {0, 1, ..., m−1}
  h: K ↦ h(K)
Collision: h(K1) = h(K2) when K1 ≠ K2.
Resolution: chaining; data that collide are chained into a list (with a sentinel).

Complexity
Analysing hashing with chaining:
  n: number of keys
  m: number of slots
  α = n/m: load factor, the average number of elements in a list
  (the analysis is made in terms of α)

Search for key K:
  computing h(K): Θ(1)
  search in the list at slot h(K): Θ(α)
  total: Θ(1) + Θ(α) = Θ(1 + α)
Assume n = O(m) (about as many keys as slots):
  α = n/m = O(m)/m = O(1)
The search takes constant time.
Uniform hashing: any given key is equally likely to hash into any of the slots
(probability 1/m).
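Chaining can be sketched as a list per slot (an illustration; the division-method hash and the table size are assumptions):

```python
# A chained hash table: collisions are resolved by keeping a list per slot.
# Search cost is Theta(1) for h(K) plus Theta(alpha) for the chain, alpha = n/m.

class ChainedHashTable:
    def __init__(self, m):
        self.m = m
        self.slots = [[] for _ in range(m)]    # one chain per slot

    def h(self, key):                          # division method
        return key % self.m

    def insert(self, key):
        self.slots[self.h(key)].append(key)

    def search(self, key):
        return key in self.slots[self.h(key)]  # walk only one chain

t = ChainedHashTable(m=7)
for k in (10, 17, 24, 3):                      # 10, 17, 24 all land in slot 3
    t.insert(k)
assert t.h(10) == t.h(17) == t.h(24) == 3      # collisions, resolved by chaining
assert t.search(17) is True
assert t.search(99) is False
```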
Application
  spelling checkers
  compilers
  game playing
  graphs

Hash function
What makes a good hash function? (One that ensures uniform hashing: probability 1/m per slot.)
Usually we do not know the distribution of the key values, so it is difficult to design a good hash function.
In practice: spread the key values as much as possible.
Division method
  h(K) = K mod m
It does not work well in every case: if m = 2^p, then h(K) is just the lowest p
bits of K (the remainder).
  Ex.: p = 2, m = 2² = 4, slots 0, 1, 2, 3; K = 5 = 101₂: h(K) depends only on the
  last 2 bits, not on the third bit.
m should be a prime near n divided by the desired chain length;
  ex.: n = 2000 keys, about 3 per chain: m near 2000/3.

Multiplication method
  h(K) = ⌊m · ((K·A) mod 1)⌋
  0 < A < 1; A = 0.618033... (the golden section) gives good results.

Uniform method
  h(K) = ⌊K·m⌋
  K uniformly distributed over [0, 1)
Practically and theoretically a good function.
Interpreting strings as numbers
  SOUNDEX CODING
  RADIX 128: a string C1 C2 ... Cn of ASCII characters (0 ≤ Cj ≤ 127) is read as a
  number in radix 128.
Ex.: p = 112, t = 116; "pt" = 116 + 128 · 112 = 14452
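The radix-128 interpretation can be written in a few lines; the sketch below reproduces the "pt" example using Python's `ord` for the ASCII codes:

```python
# Interpreting a string as a radix-128 number built from its ASCII codes.

def radix128(s):
    value = 0
    for ch in s:
        code = ord(ch)
        assert 0 <= code <= 127          # ASCII only
        value = value * 128 + code       # shift one radix-128 digit, add the next
    return value

assert ord('p') == 112 and ord('t') == 116
assert radix128('pt') == 116 + 128 * 112 == 14452
```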
Universal hashing
For any fixed hash function there exist key values such that more than one of them
will be hashed into the same slot; this is unavoidable. The remedy: choose the
hash function at random from a suitable collection.
  H = {h1, h2, ..., hr}: a set of hash functions
  For a randomly chosen hi from a universal H, we get the behaviour of uniform hashing.
Statement
If h is chosen from a universal collection H of hash functions and is used to hash
n keys into a hash table of size m, where n ≤ m, then the expected number of
collisions involving a particular key x is less than 1.

Proof
For each pair of keys y, z let c_yz be a random variable:
  c_yz = 1 if h(y) = h(z), and 0 otherwise.
H universal: the probability for y and z to collide is 1/m (by definition), so
  E[c_yz] = 1/m.
Let c_x be the number of collisions involving key x among the other n − 1 keys
y1, ..., y_{n−1}:
  c_x = c_{x y1} + ... + c_{x y_{n−1}},  E[c_{x yi}] = 1/m for each i.
By linearity of expectation (E[Σ ξi] = Σ E[ξi]):
  E[c_x] = Σ_y E[c_xy] = (n − 1) · (1/m) = (n − 1)/m < 1,  since n ≤ m.
Construction of a universal set H:
  m prime
  a given key x is decomposed into r + 1 digits: x = ⟨x0, x1, ..., xr⟩, each xi < m
  a = ⟨a0, a1, ..., ar⟩, every ai chosen randomly from {0, 1, ..., m−1}
  h_a(x) = ( Σ_{i=0}^{r} ai·xi ) mod m
  H = ∪_a {h_a} : the universal set
Theorem
The set H is universal.

Proof (sketch)
Take two distinct keys x = ⟨x0, x1, ..., xr⟩ and y = ⟨y0, y1, ..., yr⟩; say x0 ≠ y0.
  h_a(x) = ( Σ_{i=0}^{r} ai·xi ) mod m
  h_a(y) = ( Σ_{i=0}^{r} ai·yi ) mod m
h_a(x) = h_a(y) means
  Σ_{i=0}^{r} ai·xi ≡ Σ_{i=0}^{r} ai·yi  (mod m)
i.e.
  a0·(x0 − y0) ≡ − Σ_{i=1}^{r} ai·(xi − yi)  (mod m)
Since m is prime and x0 ≠ y0, for each choice of a1, ..., ar exactly one value of
a0 satisfies this equation: there are as many collisions as such equations, i.e.
m^r of the |H| = m^(r+1) functions collide on x and y:
  m^r / m^(r+1) = 1/m
so H is universal.

Note: H assumes n ≤ m, so n and m should be comparable.
Choose h at random any time when hashing is to be done, then apply h.
Application
  databases: hash table + B-tree
  it is not easy to estimate n
  it is not easy to choose h
ESTIMATION OF COMPLEXITY
Running time: the time required for a computer to execute a program.
Running time depends on the following factors:
CPU: the higher the speed, the quicker the computer.
Memory: main and secondary memory available to execute the program.
Input data: size, type, operations.
Software: compiler, operating system, etc.
Algorithm (based on which the particular program is written).
Example: computing a1 + a2 + a3 + a4 (two equivalent programs):

  S = 0                    S = 0
  S = S + a1               FOR i = 1 TO 4
  S = S + a2                   S = S + ai
  S = S + a3
  S = S + a4
The asymptotic notation is a way to express how bad (slow) or good (fast) an algorithm is.
Expressing the complexity gives us a measure of the 'character' or behaviour of an
algorithm, and allows the comparison of algorithms in terms of complexity:
we can decide which algorithm is better, i.e. which one we should choose.
Note: the complexity will not guarantee that the algorithm will really be faster or
that the computer will always execute the program faster, because physical running
time, as seen above, depends on other factors, too.
Estimation of the complexity of an algorithm:
1. Assignment and elementary operations: 1 unit each (ex.: S = 0)
2. Consecutive statements: sum of the costs of each
3. Loop: time required to evaluate the body, multiplied by the number of iterations
4. Nested loops: cost of the inside multiplied by the product of the iteration counts
5. IF: cost of the test + max(cost of THEN, cost of ELSE)
This technique can overestimate (see rule 5), but it will never underestimate the complexity.
SORTING
To arrange given data according to given criteria.
Ex.:
1. data ci, i = 1, ..., n and criteria j = 1, ..., m: arrange the ci taking every criterion into account
2. List of names: alphabetic order
3. Temperature values: ascending or descending order
4. Sorting a1, a2, ..., an input data; criterion: ascending (descending)
BUBBLE SORTING
Main idea: find the smallest and put it on the top,
then find the second smallest and put it next to the top one, and so on.
Ex.: 2, 1, 7, 6
Fix the first number and compare it with all the others; on a wrong order, swap.
Then repeat from the second position, etc.:
  2, 1, 7, 6 → 1, 2, 7, 6
  1, | 2, 7, 6 → 1, 2, 7, 6
  1, 2, | 7, 6 → 1, 2, 6, 7

For a1, a2, ..., ai, ..., aj, ..., an:
  FOR i = 1 TO n − 1
      FOR j = i + 1 TO n
          IF ai > aj THEN swap(ai, aj)

Number of comparisons: (n − 1) + (n − 2) + ... + 1 = n(n − 1)/2 = O(n²)
Best case: the input is already in the right order (no swap): still O(n²) comparisons
Worst case: all the numbers are in the wrong order: O(n²)
Average case: the numbers are given at random (typical case): O(n²)
Bubble sorting (comparison-based): O(n²)
Comparison-based sorting: Ω(n log n) in general; the best comparison sorts achieve O(n log n).
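The double loop above translates directly into code (a sketch with 0-based indices, as is usual in Python):

```python
# The selection-style "bubble" sort described above, with the two nested loops.

def bubble_sort(a):
    n = len(a)
    for i in range(n - 1):
        for j in range(i + 1, n):
            if a[i] > a[j]:
                a[i], a[j] = a[j], a[i]   # swap when the pair is in the wrong order
    return a

assert bubble_sort([2, 1, 7, 6]) == [1, 2, 6, 7]
assert bubble_sort([]) == []
assert bubble_sort([3, 3, 1]) == [1, 3, 3]
```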
QUICK SORTING
One of the fastest known methods.
Divide and conquer philosophy: partition the region A(p..r) around a pivot value q
into A(p..q) with elements x ≤ q and A(q+1..r) with elements y > q, then sort the
two parts recursively.
q: a computed value, not necessarily an element of the sequence.
Ex.: 9, 2, 11, 20, 7
  pivot q = (9 + 7)/2 = 8:    7, 2 | 11, 20, 9
  q = 4:  2 | 7               q = 10:  9 | 11, 20
                              q = 15:  11 | 20
  result: 2 7 9 11 20
QUICKSORT(A, p, r)
  IF p < r THEN
      q ← PARTITION(A, p, r)
      QUICKSORT(A, p, q)
      QUICKSORT(A, q + 1, r)
Initial call: QUICKSORT(A, 1, length(A))

PARTITION(A, p, r)
  x ← A(p); i ← p − 1; j ← r + 1
  WHILE TRUE DO
      REPEAT j ← j − 1 UNTIL A(j) ≤ x
      REPEAT i ← i + 1 UNTIL A(i) ≥ x
      IF i < j THEN SWAP(A(i), A(j))
      ELSE RETURN j
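The pseudocode above (Hoare-style partitioning) can be sketched in Python with 0-based indices:

```python
# Quicksort with Hoare-style PARTITION, following the pseudocode above.

def partition(a, p, r):
    x = a[p]                      # pivot: the first element of the region
    i, j = p - 1, r + 1
    while True:
        j -= 1
        while a[j] > x:           # REPEAT j <- j - 1 UNTIL a[j] <= x
            j -= 1
        i += 1
        while a[i] < x:           # REPEAT i <- i + 1 UNTIL a[i] >= x
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j

def quicksort(a, p=0, r=None):
    if r is None:
        r = len(a) - 1
    if p < r:
        q = partition(a, p, r)
        quicksort(a, p, q)        # sort A(p..q)
        quicksort(a, q + 1, r)    # sort A(q+1..r)

a = [9, 2, 11, 20, 7]
quicksort(a)
assert a == [2, 7, 9, 11, 20]
```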
Recurrence, resolved by telescoping:

  T(n) = 2T(n/2) + Θ(n)

Take T(n) = 2T(n/2) + n and divide by n:

  T(n)/n = T(n/2)/(n/2) + 1
  T(n/2)/(n/2) = T(n/4)/(n/4) + 1
  ...
  T(2)/2 = T(1)/1 + 1

Adding up the log n equations (the intermediate terms cancel):

  T(n)/n = T(1)/1 + log n
  T(n) = n·c + n·log n
  T(n) = O(n log n): average (and best) case

Worst case: one element per region:

  T(n) = T(n − 1) + Θ(n)
  T(n − 1) = T(n − 2) + Θ(n − 1)
  ...
  T(2) = T(1) + Θ(2)
  ________________________
  T(n) = T(1) + Σ_{k=2}^{n} Θ(k) = Θ( Σ_{k=2}^{n} k ) = Θ(n²)

Worst case: O(n²)
SEARCHING
There may be different situations where searching is performed.

SEQUENTIAL SEARCH
Given a1, ..., ai, ..., an (numbers / characters: the objects to be searched)
Find: x
(naive or brute force search, or straight search)
It is in fact one loop:
  FOR i = 1 TO n
      IF x = ai THEN STOP
O(n)
Note: this kind of search is very simple (primitive), but what if ai is a matrix,
or the records of a file? The comparison is more complicated there, but in
principle the search is still the same simple loop.
RANDOM SEARCH
Introduce some sort of probability.
Given a1, ..., ai, ..., an to be searched.
Find: x
Flip a coin (the probability element):
  If head: search from 1 to n.
  If tail: search from n to 1.
Two sequential searches are combined, applied together.
Assume: x = ai (ith position) and the coin is fair (head and tail occur with
equal probability).
  If head: i comparisons to find x.
  If tail: n − i + 1 comparisons to find x.
Averaging the two cases: (i + (n − i + 1))/2 = (n + 1)/2, better than n.
On average we need (n + 1)/2 comparisons rather than n.
The order of the elements is irrelevant (no need for pre-sorting) in
  sequential search,
  randomized search.
BINARY SEARCH
It is very quick and used almost everywhere.
The elements to be searched must be sorted.
Given a1 ≤ ... ≤ ai ≤ ... ≤ an
Find: x
Idea: "guess the number I am thinking of" by repeatedly halving the sequence and
comparing the middle element with x.
Ex.: 2 3 7 | 8 9 | 10 ; find: 9

  low: index of the leftmost element of the current half = 1
  high: index of the rightmost element of the current half = n
  REPEAT
      mid = (low + high) DIV 2
      IF low > high THEN mid = 0            (x is not present)
      ELSE
          IF a(mid) < x THEN low = mid + 1
          ELSE high = mid − 1
  UNTIL x = a(mid)
Complexity: the length of the region halves at every step:
  n/2⁰, n/2¹, n/2², ..., n/2^k = 1  ⟹  k = log n
O(log n): very fast.
Note: n is finite. If n were infinite, binary search (dividing by 2^m forever)
would not terminate; this may happen in theory, but not in practice.
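The halving scheme above can be sketched as follows (0-based indices; returning None instead of the sentinel mid = 0 is an assumption made for idiomatic Python):

```python
# Binary search over a sorted sequence: halve the region at every step.

def binary_search(a, x):
    low, high = 0, len(a) - 1
    while low <= high:
        mid = (low + high) // 2          # middle of the current region
        if a[mid] == x:
            return mid
        if a[mid] < x:
            low = mid + 1                # continue in the right half
        else:
            high = mid - 1               # continue in the left half
    return None                          # region empty: x is not present

a = [2, 3, 7, 8, 9, 10]
assert binary_search(a, 9) == 4
assert binary_search(a, 2) == 0
assert binary_search(a, 5) is None
```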
Note:
1. Will the search work when all elements are given at once (at the same time)?
2. Will the search work when the elements are given online (one by one)?

                        1. (at once)   2. (online)
  Sequential search         YES            YES
  Randomized search         YES            NO
  Binary search             YES            NO

In every case we assume that we have just one processor to do the job.
PARALLEL BINARY SEARCH
It speeds up the binary search.
p processors with shared memory (PRAM: parallel RAM).
CREW: Concurrent Read, Exclusive Write.
Given the elements a1, ..., ai, ..., an sorted.
Find: x
Divide the sequence into p + 1 parts:
  a1, ..., a_i1 | a_{i1+1}, ..., a_i2 | ..., a_ij | ..., an
x is compared with the boundary elements (the leftmost or the rightmost of each
part) in parallel:
  processor j compares x with the jth boundary a_ij
  processor j sets a variable cj:
      cj = 0, if x > a_ij
      cj = 1, otherwise
Thus the s with cs = 0 and c_{s+1} = 1 locates the part that can contain x.
Repeat recursively until x = a_ij.
Complexity:
  BS:  the region shrinks as n/2⁰, n/2¹, ..., n/2^k = 1        ⟹  k = log n steps
  PBS: the region shrinks as n/(p+1)⁰, n/(p+1)¹, n/(p+1)², ..., n/(p+1)^k = 1
                                                               ⟹  k = log_{p+1} n steps
O(log_{p+1} n): better than O(log n) if p > 1.
Note: PBS doesn't work online; PBS works offline.
STRING SEARCHING (straight search, naive, brute force)
Idea:
  text (string of characters) of length n, pointer (index) i
  find: pattern of length m, pointer j
Comparisons go from left to right:
  match: both i and j are incremented
  mismatch: reset j to the beginning of the pattern, and set i to the position
  corresponding to moving the pattern one position to the right
Ex.: text 3 2 4 5 6 (n = 5), pattern 4 5 (m = 2):
  3 2 4 5 6      3 2 4 5 6      3 2 4 5 6
  4 5              4 5              4 5     match
O(n·m)
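The brute-force scheme above can be sketched as follows (a sketch; returning the 0-based position of the first match, or −1, is an assumption):

```python
# Brute-force string search: on a mismatch, slide the pattern one position right.

def brute_force_search(text, pattern):
    n, m = len(text), len(pattern)
    for i in range(n - m + 1):           # candidate alignment of the pattern
        j = 0
        while j < m and text[i + j] == pattern[j]:
            j += 1                       # match: advance both pointers
        if j == m:
            return i                     # full match found
    return -1                            # not found; O(n * m) in the worst case

assert brute_force_search("32456", "45") == 2
assert brute_force_search("32456", "47") == -1
```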
KNUTH-MORRIS-PRATT ALGORITHM
Given text: length n, pointer i; pattern: length m, pointer j.
Both pointers are incremented by 1 as long as there is a match.
On a mismatch at position k, j is reset (using what is already known about the
pattern) and the comparison is restarted without moving i backwards.
Ex.: text 3 2 3 4 5, pattern 3 4:
  I.   i = 1, j = 1: match (3)
  II.  i = 2, j = 2: mismatch (2 vs 4)
  III. i = 2, j = 1: mismatch (2 vs 3)
  IV.  i = 3, j = 1: match (3), then (4): found
O(n + m)
BOYER-MOORE ALGORITHM
  text: s1, length n
  pattern: s2, length m
Idea: the pattern is compared from right to left, while it is moved from left to
right an appropriate number of positions.
Ex.: text 3 2 7 4 5 6, pattern 4 5:
  I.   mismatch, and the text character does not occur in the pattern:
       move the pattern m positions to the right
  II.  mismatch, but the text character 4 occurs in the pattern: align the four
  III. match: stop
Best case O(n/m): really very fast.
Application: word processing (spelling checker)
SIGNATURE FILES
Given a text.
Every word is hashed (transformed) into a string of bits (ex.: radix 128, soundex coding).
synonym: in the hashing sense, not grammatically
Hash table: the locations contain the possible bit representations together with a
sector (block) address.
Find word x:
  x is transformed into a string of bits,
  the hash table is presorted,
  x is searched for in the hash table using binary search (or, better, a hash function),
  we get a block address on the disk; the word is somewhere in that block,
  we apply string searching within the block.
Suited to maximum-sized, relatively constant files, e.g. bibliographic DB searching.
INVERTED FILE
Huge file (DB), such as in a library or the WWW.
Search engines work like this to locate a word (not when the Web page is returned).
Crawler (programme): scans the Web all the time and builds (updates) an inverted file.
Inverted file:
  w1 : URL1, ...
  w2 : URL2, ...
  ...
  wn : URLn, ...
  wi: words on the Web
  records: the URLs containing that word
  search (whether the word exists in the inverted file): B-tree (to implement the
  inverted file)
  sorting (the inverted file is updated every time a new word appears)
When we enter a word to search:
The search engine locates the word by searching the inverted file (string searching).
If the word is found in the inverted file then, depending on the retrieval
technique, the document / URL will be presented or not (this depends not solely on
whether the word is present).