Chapter - 6 Distributed Database System

This document provides an overview of distributed databases and distributed database management systems (DDBMS). It discusses the need for distributed databases, the differences between distributed databases, distributed processing, and parallel databases. It also covers the functions of a DDBMS, architectures for DDBMSs, and issues related to distributed database design such as fragmentation, replication, and allocation.

Uploaded by

dawod

In this chapter you will learn:

• The need for distributed databases.
• The differences between distributed database systems, distributed processing, and parallel database systems.
• The advantages and disadvantages of distributed DBMSs.
• The functions that should be provided by a distributed DBMS.
• An architecture for a distributed DBMS.
• The main issues associated with distributed database design, namely fragmentation, replication, and allocation.
Distributed Database Concepts
– Distributed database: a logically interrelated collection of shared data (and a description of this data) physically distributed over a computer network.
– DDBMS: a software system that manages a distributed database while making the distribution transparent to the user.
Characteristics of DDBMS:
– A collection of logically related shared data;
– The data is split into a number of fragments;
– Fragments may be replicated;
– The sites are linked by a communications network;
– The data at each site is under the control of a DBMS;
– The DBMS at each site can handle local applications autonomously;
– Each DBMS participates in at least one global application.
Advantages of DDS
1. Management of distributed data with different levels of transparency:
• Distribution transparency
– The physical placement of data (files, relations, etc.) is not known to the user.
• Network transparency
– Users do not have to worry about operational details of the network.
• Location transparency
– A command can be issued from any location without affecting how it works.
• Naming transparency
– Any named object (files, relations, etc.) can be accessed from any location.
• Replication transparency
– Copies of data can be stored at multiple sites without the user needing to be aware of them.
– This is done to minimize access time to the required data.
• Fragmentation transparency
– A relation can be segmented horizontally (a subset of the tuples of a relation) or vertically (a subset of the columns of a relation) without the user needing to be aware of it.
2. Increased reliability and availability:
– Reliability is the probability that the system is running (not down) at a given point in time.
– Availability is the probability that the system is continuously available (usable or accessible) during a time interval.
– A distributed database system has multiple nodes (computers), and if one fails then others are available to do the job.
3. Improved performance:
– A DDBMS fragments the database to keep data closer to where it is needed most.
– This reduces data management (access and modification) time significantly.
4. Scalability (easier expansion):
– New nodes (computers) can be added at any time without changing the entire configuration.
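The availability argument in item 2 can be made concrete with a little arithmetic. The sketch below is only an illustration under a strong assumption that is not stated in the slides: node failures are independent, and any single surviving node can do the job. The function name `system_availability` and the 0.95 figure are hypothetical.

```python
def system_availability(node_availability: float, replicas: int) -> float:
    """Probability that at least one of `replicas` nodes is up.

    Assumes (hypothetically) that nodes fail independently and that any
    one surviving node is enough to serve the request.
    """
    if not 0.0 <= node_availability <= 1.0:
        raise ValueError("node_availability must be between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    # The system is down only if every node is down at the same time.
    return 1.0 - (1.0 - node_availability) ** replicas


# Compare one, two, and three nodes that are each available 95% of the time.
for n in (1, 2, 3):
    print(n, round(system_availability(0.95, n), 6))
```

Under these assumptions each extra node shrinks the chance of a total outage by a factor of (1 - node availability); correlated failures, such as a shared network link going down, would make the real figure worse.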
Disadvantages of DDS
– Complexity
– Cost
– Security
– Integrity control more difficult
– Lack of standards
– Lack of experience
– Database design more complex

Database system architectures
• A database architecture is a representation of DBMS design.
• It helps to design, develop, implement, and maintain the database management system.
• There are three database system architectures:
1. Centralized Database Architecture
2. Parallel Database Architectures
3. Distributed Database Architecture

Centralized database
• A centralized database is basically a type of database that is
stored, located and maintained at a single location only.
• This type of database is modified and managed from that
location itself.
Parallel database architectures
• Parallel DBMSs link multiple, smaller machines to achieve the same throughput as a single, larger machine, often with greater scalability and reliability.
• The three main architectures for parallel DBMSs:
– Shared memory: a tightly coupled architecture in which multiple processors share secondary (disk) storage and primary memory.
– Shared disk: a loosely coupled architecture in which multiple processors share secondary (disk) storage but each has its own primary memory.
– Shared nothing (massively parallel processing, MPP): a multiple-processor architecture in which each processor is part of a complete system, with its own memory and disk storage.
Distributed database
• A distributed database system allows applications to access data from local and remote databases.
• Two types of distributed database system:
– Homogeneous distributed database.
– Heterogeneous distributed database.
Homogeneous
• All sites of the database system have an identical setup, i.e., the same database system software.
• The underlying operating systems can be a mixture of Linux, Windows, Unix, etc.
• For example, all sites run Oracle, or DB2, or Sybase, or some other database system.
• Advantages
– Easy to use
– Easy to manage
– Easy to design
• Disadvantages
– Difficult for most organizations to force a homogeneous environment
[Figure: Sites 1 to 5, each running Oracle on Windows, Unix, or Linux, linked by a communications network.]
Homogeneous Distributed Database Systems
• Autonomy determines the extent to which individual nodes or DBs in a connected DDB can operate independently.
– Design autonomy refers to independence of data model usage and transaction management techniques among nodes.
– Communication autonomy determines the extent to which each node can decide on sharing of information with other nodes.
– Execution autonomy refers to independence of users to act as they please.
Heterogeneous
• Different data centers may run different DBMS products, with possibly different underlying data models.
• Translations are required to allow for:
– Different hardware: change of codes and word lengths.
– Different DBMS products: mapping of data structures in one data model to the equivalent data structures in another data model, and translation of the query language used (for example, relational SQL SELECT statements are mapped to the network FIND and GET statements).
– Different hardware and different DBMS products: if both the hardware and software are different, then both these types of translation are required. This makes the processing extremely complex.
[Figure: Sites 1 to 5 running relational, object-oriented, hierarchical, and network DBMSs on Unix, Windows, and Linux, linked by a communications network.]
Heterogeneous
• Advantages
– Large volumes of data from different data centers can be stored in one global center.
– Remote access is done using the global schema.
– Different DBMSs may be used at each node.
• Disadvantages
– Difficult to manage.
– Difficult to design.
Multidatabase system (MDBS)
• Multidatabase system (MDBS): a distributed DBMS in which each site maintains complete autonomy.
– MDBSs logically integrate a number of independent DDBMSs while allowing the local DBMSs to maintain complete control of their operations.
– An MDBS allows users to access and share data without requiring full database schema integration.
• Federated database system: a collection of cooperating database systems that are autonomous and possibly heterogeneous.
– Differences in data models
– Differences in constraints
– Differences in query language
Distributed Processing and Distributed Database
DDBMS Components
• Computer workstations
– Form the network system.
• Network hardware and software
– Components that reside in each workstation.
• Communications media
– Carry the data from one workstation to another.
• Transaction processor (TP)
– Receives and processes the application's data requests.
• Data processor (DP)
– Stores and retrieves data located at the site.
– Also known as data manager (DM).
DDBMS protocol
• The DDBMS protocol determines how the DDBMS will:
– Interface with the network to transport data and commands between DPs and TPs.
– Synchronize all data received from DPs (TP side) and route retrieved data to the appropriate TPs (DP side).
– Ensure common database functions in a distributed system: security, concurrency control, backup, and recovery.
Distributed Database Design
• The design of a distributed database introduces three new issues:
– How to partition the database into fragments?
– Which fragments to replicate?
– Where to locate those fragments and replicas?

Data Fragmentation
• Data fragmentation allows us to break a single object into two or more segments or fragments.
• There are three types of fragmentation strategies:
– Horizontal fragmentation
– Vertical fragmentation
– Mixed fragmentation
Horizontal Fragmentation
• A horizontal fragment consists of a subset of the tuples of a relation.
• The fragment represents the equivalent of a SELECT statement, with the WHERE clause on a single attribute.
Vertical Fragmentation
• A vertical fragment consists of a subset of the attributes of a relation.
• Equivalent to the PROJECT statement.
Mixed Fragmentation
• A mixed fragment consists of a horizontal fragment that is subsequently vertically fragmented, or a vertical fragment that is then horizontally fragmented.
• A mixed fragment is defined using the Selection and Projection operations of the relational algebra.
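The three fragmentation strategies above can be sketched on an in-memory relation. This is a minimal illustration, not a DDBMS feature: the helper names `horizontal_fragment` and `vertical_fragment`, the attribute names, and the sample rows are all hypothetical. The vertical fragment keeps the key attribute `EMP_ID` so that the original relation could be rebuilt by a join.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def horizontal_fragment(relation: List[Row], predicate: Callable[[Row], bool]) -> List[Row]:
    """Selection: keep only the tuples that satisfy the predicate."""
    return [dict(row) for row in relation if predicate(row)]


def vertical_fragment(relation: List[Row], attributes: List[str]) -> List[Row]:
    """Projection: keep only the listed attributes of every tuple."""
    return [{name: row[name] for name in attributes} for row in relation]


# Hypothetical sample relation.
EMPLOYEE: List[Row] = [
    {"EMP_ID": 1, "EMP_NAME": "Ada", "EMP_LOCATION": "New York", "EMP_SALARY": 5200},
    {"EMP_ID": 2, "EMP_NAME": "Ben", "EMP_LOCATION": "Atlanta", "EMP_SALARY": 4100},
    {"EMP_ID": 3, "EMP_NAME": "Cleo", "EMP_LOCATION": "New York", "EMP_SALARY": 6100},
]

# Horizontal: subset of tuples, WHERE clause on a single attribute.
ny_rows = horizontal_fragment(EMPLOYEE, lambda r: r["EMP_LOCATION"] == "New York")

# Vertical: subset of attributes, keeping the key.
pay_columns = vertical_fragment(EMPLOYEE, ["EMP_ID", "EMP_SALARY"])

# Mixed: a horizontal fragment that is then vertically fragmented.
ny_pay = vertical_fragment(ny_rows, ["EMP_ID", "EMP_SALARY"])

print(ny_pay)
```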
Data Replication
• Data replication refers to the storage of data copies at multiple sites served by a computer network.
– It enhances data availability and response time, reducing communication and total query costs.
• Mutual Consistency Rule
– All copies of data fragments must be identical.
– The DDBMS must ensure that a database update is performed at all sites where replicas exist.
• Replication Conditions
– A fully replicated database stores multiple copies of all database fragments at multiple sites.
– A partially replicated database stores multiple copies of some database fragments at multiple sites.
• Factors for Data Replication Decision
– Database size
– Usage frequency
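The mutual consistency rule can be illustrated with a toy replicated fragment in which every write is applied to all copies. This is only a sketch: the class `ReplicatedFragment` and its methods are hypothetical, and it ignores site failures and concurrent updates, which a real DDBMS must handle.

```python
class ReplicatedFragment:
    """A fragment stored as identical copies at several sites (toy model)."""

    def __init__(self, sites):
        # One independent copy of the fragment per site.
        self.copies = {site: {} for site in sites}

    def write(self, key, value):
        # Mutual consistency: the update is performed at every site
        # where a replica exists.
        for copy in self.copies.values():
            copy[key] = value

    def read(self, key, site):
        # Any replica can serve a read, e.g. the one nearest the user.
        return self.copies[site][key]

    def is_consistent(self):
        copies = list(self.copies.values())
        return all(copy == copies[0] for copy in copies)


fragment = ReplicatedFragment(["New York", "Atlanta", "Miami"])
fragment.write("EMP_1", {"EMP_NAME": "Ada"})
print(fragment.read("EMP_1", "Miami"), fragment.is_consistent())
```

The trade-off behind the "usage frequency" factor is visible here: reads touch one copy, while every write touches all of them.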
Data Allocation
• Data allocation describes the process of deciding where to locate data.
• Data allocation strategies:
– Centralized: the entire database is stored at one site.
– Partitioned: the database is divided into several disjoint parts (fragments) and stored at several sites.
– Replicated: copies of one or more database fragments are stored at several sites.
Data allocation algorithms
• Data allocation algorithms take into consideration a variety of factors:
– Performance and data availability goals.
– Size, number of rows, and the number of relations that an entity maintains with other entities.
– Types of transactions to be applied to the database, and the attributes accessed by each of those transactions.
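As a rough illustration of how usage information might drive a partitioned allocation, the sketch below places each fragment at the site that accesses it most often. This greedy rule is an assumption made for the example, not an algorithm given in the slides; the function name `allocate_by_usage` and the access counts are hypothetical, and the rule ignores the other factors listed above (size, availability goals, transaction types).

```python
def allocate_by_usage(access_counts):
    """Map each fragment to the site that accesses it most often.

    access_counts: {fragment: {site: number_of_accesses}}
    Ties are broken alphabetically by site name so the result is stable.
    """
    allocation = {}
    for fragment, per_site in access_counts.items():
        # max() returns the first maximal element, so iterating the sites
        # in sorted order gives an alphabetical tie-break.
        allocation[fragment] = max(sorted(per_site), key=lambda site: per_site[site])
    return allocation


# Hypothetical access statistics for three fragments of EMPLOYEE.
stats = {
    "E1": {"New York": 120, "Atlanta": 15, "Miami": 5},
    "E2": {"New York": 10, "Atlanta": 90, "Miami": 20},
    "E3": {"New York": 30, "Atlanta": 30, "Miami": 8},
}
print(allocate_by_usage(stats))
```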
Transparencies in a DDBMS
• Transparency hides implementation details from the user.
– Distribution transparency
– Transaction transparency
– Failure transparency
– Performance transparency
Distribution Transparency
• Distribution transparency allows the user to perceive the database as a single, logical entity.
• It allows us to manage a physically dispersed database as though it were a centralized database.
• Three levels of distribution transparency:
– Fragmentation transparency
– Location transparency
– Local mapping transparency
• Example:
– Employee data (EMPLOYEE) are distributed over three locations: New York, Atlanta, and Miami.
– Depending on the level of distribution transparency support, three different cases of queries are possible:
• Case 1: DB supports fragmentation transparency (the user names neither fragments nor locations)

SELECT * FROM EMPLOYEE WHERE EMP_DOB < '01-JAN-1940';

• Case 2: DB supports location transparency (the user names the fragments but not their locations)

SELECT * FROM E1 WHERE EMP_DOB < '01-JAN-1940'
UNION
SELECT * FROM E2 WHERE EMP_DOB < '01-JAN-1940'
UNION
SELECT * FROM E3 WHERE EMP_DOB < '01-JAN-1940';

• Case 3: DB supports local mapping transparency (the user names both the fragments and their locations)

SELECT * FROM E1 NODE NY WHERE EMP_DOB < '01-JAN-1940'
UNION
SELECT * FROM E2 NODE ATL WHERE EMP_DOB < '01-JAN-1940'
UNION
SELECT * FROM E3 NODE MIA WHERE EMP_DOB < '01-JAN-1940';
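One way to get a feel for fragmentation transparency is to hide the fragments E1, E2, and E3 behind a view named EMPLOYEE, so that the Case 1 query works unchanged. The sketch below uses Python's built-in `sqlite3` module with everything in a single in-memory database; it only mimics what a DDBMS does across sites. The column layout and sample rows are hypothetical, and dates are stored as ISO strings (an assumption, unlike the '01-JAN-1940' literal above) so that text comparison follows date order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Three fragments; in a real system each would live at a different site.
    CREATE TABLE E1 (EMP_ID INTEGER, EMP_NAME TEXT, EMP_DOB TEXT);  -- New York
    CREATE TABLE E2 (EMP_ID INTEGER, EMP_NAME TEXT, EMP_DOB TEXT);  -- Atlanta
    CREATE TABLE E3 (EMP_ID INTEGER, EMP_NAME TEXT, EMP_DOB TEXT);  -- Miami

    INSERT INTO E1 VALUES (1, 'Ada', '1936-05-02');
    INSERT INTO E1 VALUES (2, 'Ben', '1951-11-20');
    INSERT INTO E2 VALUES (3, 'Cleo', '1939-12-31');
    INSERT INTO E3 VALUES (4, 'Dev', '1962-07-14');

    -- The view plays the role of the global relation.
    CREATE VIEW EMPLOYEE AS
        SELECT * FROM E1
        UNION ALL
        SELECT * FROM E2
        UNION ALL
        SELECT * FROM E3;
    """
)

# Case 1 style query: no fragment names, no locations.
rows = conn.execute(
    "SELECT EMP_ID, EMP_NAME FROM EMPLOYEE "
    "WHERE EMP_DOB < '1940-01-01' ORDER BY EMP_ID"
).fetchall()
print(rows)
```

Querying E1, E2, and E3 directly with UNION, as in Case 2, returns the same rows; the view simply moves that knowledge out of the user's query.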
Transaction Transparency
• Transaction transparency ensures that database transactions will maintain the database's integrity and consistency.
• Transaction transparency distinguishes between:
– Remote requests
– Remote transactions
– Distributed transactions
– Distributed requests
A Remote Request
• Allows us to access data to be processed by a single remote database processor.
A Remote Transaction
• Composed of several requests, but may access data at only a single site.
A Distributed Transaction
• Allows a transaction to reference several (local or remote) DP sites.
A Distributed Request
• References data from several remote DP sites.
• Allows a single request to reference a physically partitioned table.
[Figure: Example 2, distributed request.]
Distributed Transactions and Two-Phase Commit
• Transaction transparency in a DDBMS environment ensures that all distributed transactions maintain the distributed database's integrity and consistency.
• A transaction may access data at several sites.
• Each site has a local transaction manager responsible for:
– Maintaining a log for recovery purposes.
– Participating in coordinating the concurrent execution of the transactions executing at that site.
• Each site has a transaction coordinator, which is responsible for:
– Starting the execution of transactions that originate at the site.
– Distributing subtransactions to appropriate sites for execution.
– Coordinating the termination of each transaction that originates at the site.

Two-Phase Commit Protocol
• The two-phase commit protocol relies on the DO-UNDO-REDO protocol and the write-ahead protocol:
– DO performs the operation and records the "before" and "after" values in the transaction log.
– UNDO reverses an operation, using the log entries written by the DO portion of the sequence.
– REDO redoes an operation, using the log entries written by the DO portion of the sequence.
– The write-ahead protocol forces the log entry to be written to permanent storage before the actual operation takes place.
• The two-phase commit protocol defines the operations between two types of nodes:
– The coordinator, and
– One or more subordinates, or cohorts.
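The DO-UNDO-REDO sequence and the write-ahead rule described above can be sketched with a small key-value store. This is a simplified illustration: the class `LoggedStore` is hypothetical, a Python list stands in for the log on permanent storage, and `None` is used as the "no previous value" marker, so stored values themselves must not be `None`.

```python
class LoggedStore:
    """Toy data store that logs before/after values for every operation."""

    def __init__(self):
        self.data = {}
        self.log = []  # stands in for a log on permanent storage

    def do(self, key, after):
        before = self.data.get(key)  # None means the key did not exist
        # Write-ahead: the log entry is recorded before the data changes.
        self.log.append({"key": key, "before": before, "after": after})
        self.data[key] = after

    def undo(self):
        # Reverse the logged operations, newest first, using "before" values.
        for entry in reversed(self.log):
            if entry["before"] is None:
                self.data.pop(entry["key"], None)
            else:
                self.data[entry["key"]] = entry["before"]

    def redo(self):
        # Reapply the logged operations, oldest first, using "after" values.
        for entry in self.log:
            self.data[entry["key"]] = entry["after"]


store = LoggedStore()
store.do("balance", 100)
store.do("balance", 80)
store.undo()
print(store.data)  # state after UNDO
store.redo()
print(store.data)  # state after REDO
```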

• The protocol is implemented in two phases:
• Phase 1: Preparation
– The coordinator sends a PREPARE TO COMMIT message to all subordinates.
– The subordinates receive the message, write the transaction log using the write-ahead protocol, and send an acknowledgement message to the coordinator.
– The coordinator makes sure that all nodes are ready to commit, or it aborts the transaction.
• Phase 2: The Final Commit
– The coordinator broadcasts a COMMIT message to all subordinates and waits for the replies.
– Each subordinate receives the COMMIT message and then updates the database, using the DO protocol.
– The subordinates reply with a COMMITTED or NOT COMMITTED message to the coordinator.
– If one or more subordinates did not commit, the coordinator sends an ABORT message, thereby forcing them to UNDO all changes.
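The two phases described above can be simulated in a few lines. The sketch below follows the message flow as the slides describe it; the class `Subordinate`, the function `two_phase_commit`, and the `can_prepare` / `can_commit` switches (used to inject failures) are hypothetical, and logging, timeouts, and coordinator failure are deliberately left out.

```python
class Subordinate:
    """A participant site; the flags simulate success or failure."""

    def __init__(self, name, can_prepare=True, can_commit=True):
        self.name = name
        self.can_prepare = can_prepare
        self.can_commit = can_commit
        self.state = "INITIAL"

    def prepare(self):
        # Phase 1: write the log (omitted here) and acknowledge readiness.
        self.state = "PREPARED" if self.can_prepare else "NOT PREPARED"
        return self.can_prepare

    def commit(self):
        # Phase 2: apply the update and report the outcome.
        self.state = "COMMITTED" if self.can_commit else "NOT COMMITTED"
        return self.can_commit

    def abort(self):
        # UNDO all changes made on behalf of the transaction.
        self.state = "ABORTED"


def two_phase_commit(subordinates):
    """Run both phases as the coordinator; return the final outcome."""
    # Phase 1: PREPARE TO COMMIT goes to every subordinate.
    ready = [sub.prepare() for sub in subordinates]
    if not all(ready):
        for sub in subordinates:
            sub.abort()
        return "ABORTED"

    # Phase 2: COMMIT goes to every subordinate; collect the replies.
    committed = [sub.commit() for sub in subordinates]
    if not all(committed):
        for sub in subordinates:
            sub.abort()
        return "ABORTED"
    return "COMMITTED"


sites = [Subordinate("NY"), Subordinate("ATL"), Subordinate("MIA")]
print(two_phase_commit(sites), [s.state for s in sites])
```

Passing `can_prepare=False` or `can_commit=False` for one site makes the whole transaction abort, which is the all-or-nothing behaviour the protocol is meant to guarantee.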
Performance Transparency and Query Optimization
• Query optimization must provide distribution transparency as well as replica transparency.
• Replica transparency refers to the DDBMS's ability to hide the existence of multiple copies of data from the user.
• Query optimization algorithms are based on two principles:
– Selection of the optimum execution order
– Selection of sites to be accessed to minimize communication costs
Operation Modes of Query Optimization
• Automatic query optimization
– The DDBMS finds the most cost-effective access path without user intervention.
• Manual query optimization
– Optimization is selected and scheduled by the end user or programmer.
Timing of Query Optimization
• Static query optimization takes place at compilation time.
• Dynamic query optimization takes place at execution time.
Optimization Techniques
• Statistically based query optimization uses statistical information about the database.
• Rule-based query optimization is based on a set of user-defined rules to determine the best query access strategy.
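The second principle, choosing sites to minimize communication costs, can be illustrated with a deliberately simple cost model. The sketch below is an assumption made for the example, not the optimizer of any real DDBMS: it counts only bytes shipped over the network, and the function name `cheapest_join_site` and the sizes are hypothetical.

```python
def cheapest_join_site(relation_sizes, result_size, result_site):
    """Pick the site where a join should run to ship the fewest bytes.

    relation_sizes: {site: size in bytes of the relation stored there}
    result_size:    estimated size in bytes of the join result
    result_site:    site where the user needs the result
    Returns (best_site, bytes_shipped).
    """
    total = sum(relation_sizes.values())
    costs = {}
    for site, local_size in relation_sizes.items():
        # Every relation stored elsewhere must be shipped to this site ...
        ship_inputs = total - local_size
        # ... and the result must be shipped on unless it is needed here.
        ship_result = 0 if site == result_site else result_size
        costs[site] = ship_inputs + ship_result
    # Iterate sites in sorted order so ties are broken alphabetically.
    best = min(sorted(costs), key=lambda site: costs[site])
    return best, costs[best]


# A large relation in New York, a small one in Atlanta; the user is in Atlanta.
sizes = {"New York": 1_000_000, "Atlanta": 50_000}
print(cheapest_join_site(sizes, result_size=10_000, result_site="Atlanta"))
```

In this made-up case it is cheaper to ship the small Atlanta relation to New York and send the result back than to move the large relation, even though the user sits in Atlanta. A statistically based optimizer would estimate such sizes from database statistics.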
Date's Twelve Rules for a DDBMS
• In this final section, we list Date's twelve rules (or objectives) for DDBMSs (Date, 1987b).
• Fundamental principle
– To the user, a distributed system should look exactly like a non-distributed system.
1) Local autonomy
2) No reliance on a central site
3) Continuous operation
4) Location independence
5) Fragmentation independence
6) Replication independence
7) Distributed query processing
8) Distributed transaction processing
9) Hardware independence
10) Operating system independence
11) Network independence
12) Database independence

Questions?
1. Explain what is meant by a DDBMS and discuss the motivation in providing such a system.
2. Compare and contrast a DDBMS with a parallel DBMS. Under what circumstances would you choose a DDBMS over a parallel DBMS?
3. Discuss the advantages and disadvantages of a DDBMS.
4. What is the difference between a homogeneous and a heterogeneous DDBMS? Under what circumstances would such systems generally arise?
