DDB Lectures
Review
DISTRIBUTED DATABASE
3- Because the same data is often duplicated in two or more files, an
update makes the copies inconsistent unless it is applied simultaneously
to all the files concerned.
4- A file is usually created in a format that suits one application, so
the same file format sometimes cannot be used by another application.
3- Data isolation.
Database Advantages
6- Minimum cost.
Distributed Database
1. Distributed file systems simply allow users to access files that are located on
machines other than their own. These files have no explicit structure (i.e., they
are flat) and the relationships among data in different files (if there are any) are
not managed by the system and are the user’s responsibility. A DDB, on the other
hand, is organized according to a schema that defines both the structure of the
distributed data, and the relationships among the data. The schema is defined
according to some data model, which is usually relational or object-oriented.
In distributed databases, data are “delivered” from the sites where they are
stored to where the query is posed. We characterize the data delivery
alternatives along three orthogonal dimensions: delivery modes, frequency and
communication methods. The combinations of alternatives along each of these
dimensions provide a rich design space.
B- Physical data independence, on the other hand, deals with hiding the
details of the storage structure from user applications. When a user
application is written, it should not be concerned with the details of
physical data organization. Therefore, the user application should not
need to be modified when data organization changes occur due to
performance considerations.
3- Network Transparency: the user should be protected from the
operational details of the network; possibly even hiding the existence
of the network. Then there would be no difference between database
applications that would run on a centralized database and those that
would run on a distributed database. This type of transparency is
referred to as network transparency or distribution transparency.
4- Replication Transparency: for performance, reliability, and availability reasons, it is
usually desirable to be able to distribute data in a replicated fashion
across the machines on a network. Such replication helps performance
since diverse and conflicting user requirements can be more easily
accommodated. For example, data that are commonly accessed by one
user can be placed on that user’s local machine as well as on the
machine of another user with the same access requirements. This
increases the locality of reference. Furthermore, if one of the machines
fails, a copy of the data is still available on another machine on the
network.
5- Fragmentation Transparency: it is commonly desirable to divide
each database relation into smaller fragments and treat each fragment
as a separate database object (i.e., another relation). This is commonly
done for reasons of performance, availability, and reliability.
Furthermore, fragmentation can reduce the negative effects of
replication. Each replica is not the full relation but only a subset of it;
thus, less space is required and fewer data items need be managed.
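The replication transparency described above can be sketched in a few lines. This is a minimal illustration, not a real DBMS interface: the class name `ReplicatedItem`, the site names, and the use of `None` to model a failed machine are all assumptions made for the example. The caller asks for a key and never names a site; the wrapper prefers the local copy and falls back to another replica if that copy is unavailable.

```python
class ReplicatedItem:
    """Hypothetical wrapper that hides which replica serves a read."""

    def __init__(self, replicas, local_site):
        # replicas: dict mapping site name -> dict of data,
        # or None when that site is down (an assumption of this sketch).
        self.replicas = replicas
        self.local_site = local_site

    def read(self, key):
        # Try the local copy first (locality of reference), then the others.
        order = [self.local_site] + [
            site for site in self.replicas if site != self.local_site
        ]
        for site in order:
            store = self.replicas.get(site)
            if store is not None:
                return store[key], site
        raise RuntimeError("no replica available")


# Illustrative data: the same item stored at two sites.
replicas = {"site_A": {"x": 1}, "site_B": {"x": 1}}
item = ReplicatedItem(replicas, local_site="site_A")
value_before, served_by_before = item.read("x")

replicas["site_A"] = None  # simulate a failure of the local machine
value_after, served_by_after = item.read("x")
```

The application code calling `read` is the same before and after the failure, which is the point of the transparency: only the site that answers changes.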
1. Since each site handles only a portion of the database, contention for CPU
and I/O services is not as severe as for centralized databases.
2. Localization reduces remote access delays that are usually involved in wide
area networks (for example, the minimum round-trip message propagation delay
in satellite-based systems is about 1 second).
1- Level of sharing: In terms of the level of sharing, there are three possibilities.
First, there is no sharing: each application and its data execute at one site, and
there is no communication with any other program or access to any data file at
other sites. This characterizes the very early days of networking and is probably
not very common today.
2. Behavior of access patterns: The access patterns of user requests may be
static, so that they do not change over time, or dynamic. It is obviously
considerably easier to plan for and manage the static environments than would
be the case for dynamic distributed systems. Unfortunately, it is difficult to find
many real-life distributed applications that would be classified as static. The
significant question, then, is not whether a system is static or dynamic, but how
dynamic it is. Incidentally, it is along this dimension that the relationship
between the distributed database design and query processing is established.
The requirements study also specifies where the final system is expected to
stand with respect to the objectives of a distributed DBMS. These objectives
are defined with respect to performance, reliability and availability,
economics, and expandability (flexibility).
There is a relationship between the conceptual design and the view design.
In one sense, the conceptual design can be interpreted as being an integration
of user views. Even though this view integration activity is very important,
the conceptual model should support not only the existing applications, but
also future applications.
– If the relation is not replicated, we get a high volume of remote data accesses.
– Might be an OK solution if queries need all the data in the relation and the
data stays at the only sites that use it.
– Reliability.
– Performance.
– Communication costs.
– Security.
Types of Fragmentation
Example:
Horizontal fragmentation
Consists of partitioning the tuples of a global relation r into subsets r1, r2, …,
rn, where each subset can contain data with common properties. The relation r
can be reconstructed by taking the union of all fragments, that is:
r = r1 ∪ r2 ∪ … ∪ rn. For example, suppose that the relation r is the deposit
relation of Table (1). This relation has only two branches, Baghdad and Mosul;
if we choose the attribute branch-name to horizontally fragment the relation,
the result is the two fragments shown in Table (2).
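The partition-and-union idea can be tried out with a short sketch. The rows below are illustrative sample data, not a copy of Table (1), and the function names `horizontal_fragment` and `reconstruct` are hypothetical helpers introduced only for this example.

```python
# Illustrative deposit rows (assumed sample data, not the full Table (1)).
deposit = [
    {"branch_name": "Mosul", "customer_name": "Ahmad", "account_number": 177, "balance": 205},
    {"branch_name": "Mosul", "customer_name": "Hassan", "account_number": 402, "balance": 1000},
    {"branch_name": "Baghdad", "customer_name": "Hassan", "account_number": 155, "balance": 62},
    {"branch_name": "Mosul", "customer_name": "Ali", "account_number": 639, "balance": 75},
]


def horizontal_fragment(relation, attribute):
    """Split the rows into one fragment per distinct value of `attribute`."""
    fragments = {}
    for row in relation:
        fragments.setdefault(row[attribute], []).append(row)
    return fragments


def reconstruct(fragments):
    """Rebuild the relation as the union of all fragments."""
    rows = []
    for fragment in fragments.values():
        rows.extend(fragment)
    return rows


fragments = horizontal_fragment(deposit, "branch_name")
rebuilt = reconstruct(fragments)


def by_account(row):
    return row["account_number"]
```

Each row lands in exactly one fragment, so the union returns every original tuple once; only the row order may differ from the original relation.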
Mosul     Ahmad    3
Mosul     Hassan   4
Baghdad   Hassan   5
Mosul     Hassan   6
Mosul     Ali      7
Deposit 3

Account-number   balance   tuple-id
305              500       1
226              336       2
177              205       3
402              1000      4
155              62        5
408              1123      6
639              75        7
Deposit 4
Table (4)
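Because both fragments carry a tuple-id column, the original rows can be recovered by joining the fragments on it. The sketch below is a minimal illustration using only the rows that appear above; it assumes the first two columns of Deposit 3 are branch name and customer name (its header does not appear above), and `join_on_tuple_id` is a hypothetical helper name.

```python
# Rows as they appear above; the last element of each tuple is the tuple-id.
# Column meaning for deposit3 (branch, customer) is an assumption.
deposit3 = [
    ("Mosul", "Ahmad", 3),
    ("Mosul", "Hassan", 4),
    ("Baghdad", "Hassan", 5),
    ("Mosul", "Hassan", 6),
    ("Mosul", "Ali", 7),
]
deposit4 = [
    (305, 500, 1),
    (226, 336, 2),
    (177, 205, 3),
    (402, 1000, 4),
    (155, 62, 5),
    (408, 1123, 6),
    (639, 75, 7),
]


def join_on_tuple_id(left, right):
    """Join two fragments on their last column (the tuple-id)."""
    right_by_id = {row[-1]: row[:-1] for row in right}
    joined = []
    for row in left:
        tuple_id = row[-1]
        if tuple_id in right_by_id:
            # left attributes + right attributes + the shared tuple-id
            joined.append(row[:-1] + right_by_id[tuple_id] + (tuple_id,))
    return joined


joined = join_on_tuple_id(deposit3, deposit4)
```

Only tuple-ids present in both fragments survive the join, so with the rows listed here the result covers tuple-ids 3 to 7.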
3- Distribution Design Issues
Client/Server Systems