RDBMS by P Nagaiah Goud

UNIT-5

5.1 Distributed database and Need for Distributed database management systems

Distributed database

It is a collection of multiple interconnected databases that are spread physically across
various locations and communicate via a computer network.

Features

▪ Databases in the collection are logically interrelated with each other. Often, they
represent a single logical database.

▪ Data is physically stored across multiple sites. Data in each site can be managed by a
DBMS independent of the other sites.

▪ The processors in the sites are connected via a network. They do not have any
multiprocessor configuration.

▪ A distributed database is not a loosely connected file system.

▪ A distributed database incorporates transaction processing, but it is not synonymous
with a transaction processing system.

Distributed Database Management System

A distributed database management system (DDBMS) is a centralized software system that
manages a distributed database as if it were all stored in a single location.

Features

▪ It is used to create, retrieve, update and delete distributed databases.

▪ It synchronizes the database periodically and provides access mechanisms by virtue
of which the distribution becomes transparent to the users.

▪ It ensures that the data modified at any site is universally updated.

▪ It is used in application areas where large volumes of data are processed and accessed
by numerous users simultaneously.

▪ It is designed for heterogeneous database platforms.



▪ It maintains confidentiality and data integrity of the databases.

Need for DDBMS:

The following factors encourage moving over to DDBMS

▪ Distributed Nature of Organizational Units

▪ Need for Sharing of Data

▪ Support for Both OLTP and OLAP

▪ Database Recovery

▪ Support for Multiple Application Software

Distributed Nature of Organizational Units

Most organizations in the current times are subdivided into multiple units that are
physically distributed over the globe. Each unit requires its own set of local data. Thus, the
overall database of the organization becomes distributed.

Need for Sharing of Data



The multiple organizational units often need to communicate with each other and share
their data and resources. This demands common databases or replicated databases that
should be used in a synchronized manner.

Support for Both OLTP and OLAP

Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP) run on
different systems that may share common data. Distributed database systems support
both kinds of processing by providing synchronized data.

Database Recovery

One of the common techniques used in DDBMS is replication of data across different
sites. Replication helps in data recovery if the database at any site is damaged: users
can access the data from other sites while the damaged site is being reconstructed.
Thus, a database failure may become almost inconspicuous to users.

Support for Multiple Application Software

Most organizations use a variety of application software each with its specific database
support. DDBMS provides a uniform functionality for using the same data among
different platforms.

5.2 Advantages and Disadvantages of DDBMS


Advantages of DDBMS are as follows:

▪ Data are located near the greatest demand site. The data in a distributed database
system are dispersed to match business requirements, which reduces the cost of data
access.

▪ Faster data access. End users often work with only a locally stored subset of the
company’s data.

▪ Faster data processing. A distributed database system spreads out the system's
workload by processing data at several sites.

▪ Growth facilitation. New sites can be added to the network without affecting the
operations of other sites.

▪ Improved communications. Because local sites are smaller and located closer to
customers, local sites foster better communication among departments and between
customers and company staff.

▪ Reduced operating costs. It is more cost-effective to add workstations to a network
than to update a mainframe system. Development work is done more cheaply and more
quickly on low-cost PCs than on mainframes.

▪ User-friendly interface. PCs and workstations are usually equipped with an easy-to-use
graphical user interface (GUI). The GUI simplifies training and use for end users.

▪ Less danger of a single-point failure. When one of the computers fails, the workload is
picked up by other workstations. Data are also distributed at multiple sites.

▪ Processor independence. The end user is able to access any available copy of the data,
and an end user's request is processed by any processor at the data location.

Disadvantages of DDBMS:

▪ Complexity of management and control. Applications must recognize data location, and
they must be able to stitch together data from various sites. Database administrators
must have the ability to coordinate database activities to prevent database degradation
due to data anomalies.

▪ Technological difficulty. Data integrity, transaction management, concurrency control,
security, backup, recovery, query optimization, access path selection, and so on, must all
be addressed and resolved.

▪ Security. The probability of security lapses increases when data are located at multiple
sites. The responsibility of data management will be shared by different people at
several sites.

▪ Lack of standards. There are no standard communication protocols at the database
level. (Although TCP/IP is the de facto standard at the network level, there is no
standard at the application level.) For example, different database vendors employ
different, and often incompatible, techniques to manage the distribution of data and
processing in a DDBMS environment.

▪ Increased storage and infrastructure requirements. Multiple copies of data are
required at different sites, thus requiring additional disk storage space.

▪ Increased training cost. Training costs are generally higher in a distributed model than
they would be in a centralized model, sometimes even to the extent of offsetting
operational and hardware savings.

▪ Costs. Distributed databases require duplicated infrastructure to operate (physical
location, environment, personnel, software, licensing, etc.).

5.3 Explain DDBMS Structure


Structure of Distributed Database System (DDBMS)
A Distributed Database Management System (DDBMS) is software used for managing
distributed databases. The system consists of multiple DBMSs, each of which executes
locally. These DBMSs are connected by a message handling mechanism. The major
component of a DDBMS is the data dictionary, which contains the information needed
to manage:
i) Data
ii) Data location
iii) Data replication and
iv) Data fragmentation.

The purpose of the data dictionary is to provide relevant information about location and
replication during query processing. It also helps ensure that updates are transferred to the
desired locations. The data dictionary can be managed at a centralized location or at
distributed locations. However, to obtain the complete data dictionary, all the distributed
subsets of the dictionary must be integrated.

A DDBMS lets users interact with it by means of transactions. A transaction is the
mechanism for executing programs. A DDBMS transaction consists of multiple processes,
each of which is controlled by an independent software module.

Every process involved in a transaction is referred to as an agent. A transaction may
require a single agent or multiple agents. In the former case it is called a local transaction,
and in the latter a global transaction. An agent accesses only the data that is under the
control of its local data management software.
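
The distinction between local and global transactions can be illustrated with a minimal sketch. It assumes a tiny data dictionary that maps each data item to one site; the item names (`emp_lbn` and so on), the site names and both helper functions are hypothetical and only serve the illustration.

```python
# Hypothetical data dictionary: maps each data item to the site that stores it.
DATA_LOCATION = {
    "emp_lbn": "site1",
    "emp_kkp": "site2",
    "emp_htn": "site3",
}


def agents_needed(items, dictionary=DATA_LOCATION):
    """Return the sites that must run an agent for the given data items."""
    return {dictionary[item] for item in items}


def classify(items, dictionary=DATA_LOCATION):
    """One agent means a local transaction; several mean a global one."""
    return "local" if len(agents_needed(items, dictionary)) == 1 else "global"


print(classify(["emp_htn"]))             # local
print(classify(["emp_htn", "emp_kkp"]))  # global
```

A real DDBMS would also have to handle replicated items, where one data item maps to several sites; the sketch leaves that out.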

The execution of a transaction requires an initiating agent, which activates agents at
other sites by sending them a request so as to get access to the required data. Once the
agents are activated, they interact with each other through a message exchange
mechanism. This interaction requires the cooperation of two or more agents.

To access records, a transaction issues read and write operations. The sites that take part
in the DDBMS can run one or more software modules such as a Transaction Manager (TM), a
Data Manager (DM) and a scheduler.

The relationships among various software modules involved in DDBMS are as follows.

▪ Transactions interact with TMs

▪ TMs interact with schedulers

▪ Schedulers interact among themselves and with DMs

▪ DMs manage the data
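
The layering listed above can be sketched for a single site as follows. This is only an illustration under simplifying assumptions: the class and method names are invented here, the scheduler simply forwards operations in arrival order, and the interaction between schedulers of different sites is left out.

```python
class DataManager:
    """DM: manages the data stored at one site."""

    def __init__(self):
        self.store = {}

    def read(self, key):
        return self.store.get(key)

    def write(self, key, value):
        self.store[key] = value


class Scheduler:
    """Passes operations to the DM in arrival order and records that order."""

    def __init__(self, data_manager):
        self.data_manager = data_manager
        self.history = []

    def submit(self, txn_id, op, key, value=None):
        self.history.append((txn_id, op, key))
        if op == "read":
            return self.data_manager.read(key)
        self.data_manager.write(key, value)
        return None


class TransactionManager:
    """TM: the only module that transactions talk to."""

    def __init__(self, scheduler):
        self.scheduler = scheduler
        self.next_id = 1

    def begin(self):
        txn_id = self.next_id
        self.next_id += 1
        return txn_id

    def read(self, txn_id, key):
        return self.scheduler.submit(txn_id, "read", key)

    def write(self, txn_id, key, value):
        self.scheduler.submit(txn_id, "write", key, value)


tm = TransactionManager(Scheduler(DataManager()))
t1 = tm.begin()
tm.write(t1, "E101.salary", 12000)
print(tm.read(t1, "E101.salary"))  # 12000
```

A real scheduler would apply a concurrency control protocol (for example locking) before forwarding an operation; the sketch only shows who talks to whom.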

Components of Distributed DBMS Architecture

User processor and Data processor are the two major components of Distributed DBMS
architecture. In a Peer-to-Peer Distributed DBMS, these major components handle user
requests using several sub-components. They are:

▪ User Processor

▪ Data Processor

User Processor

▪ User interface handler – interprets user commands as they come in and formats the
result sets when the request is answered.

▪ Semantic data controller – uses the Global Conceptual Schema to check the integrity
constraints defined on database elements and also to check the authorizations on
accessing the requested database.

▪ Global query optimizer and decomposer – devises the best execution strategy for the
given user request at minimal cost (in terms of time, processor and memory). It is like
the query optimizer in a centralized database system, except that it has to devise a
strategy that is globally optimal.

▪ Distributed execution monitor – it is the transaction manager. The transaction
managers of the various sites that participate in a query execution communicate with
each other as part of execution monitoring.

Data Processor

▪ Local query optimizer – it optimizes data access by choosing the best access path. For
example, the local query optimizer decides which index to use for optimally executing
the given query.

▪ Local recovery manager – deals with the consistency of the local database. In case of
failure, local recovery manager is responsible for maintaining a consistent database.

▪ Run-time support processor – it accesses the database physically according to the
strategy suggested by the local query optimizer. The run-time support processor is the
interface to the operating system and contains the database buffer (or cache) manager,
which is responsible for maintaining the main memory buffers and managing the data
accesses.

5.4 Advantages and disadvantages of Distributed Database (or Data distribution)

Advantages:

• Increased reliability and availability: – A distributed database system is robust to failure
to some extent. Hence, it is reliable when compared to a Centralized database system.
• Local control: – The data is distributed in such a way that every portion of it is local to
some sites (servers). The site in which the portion of data is stored is the owner of the
data.
• Modular growth (resilient): – Growth is easier. We do not need to interrupt any of the
functioning sites to introduce (add) a new site. Hence, the expansion of the whole
system is easier. Removal of a site also does not cause much of a problem.
• Lower communication costs (More Economical): – Data are distributed in such a way
that they are available near to the location where they are needed more. This reduces
the communication cost much more compared to a centralized system.
• Faster response: – Most of the data are local and in close proximity to where they are
needed. Hence, the requests can be answered quickly compared to a centralized
system.
• Secured management of distributed data: – Various transparencies like network
transparency, fragmentation transparency, and replication transparency are
implemented to hide the actual implementation details of the whole distributed system.
In this way, a distributed database provides security for data.
• Robust: – The system continues to work in case of failures. For example, a replicated
distributed database keeps working in spite of the failure of some sites.
• Improved performance and Parallelism in executing transactions can be achieved.

Disadvantages:
• Complex software: – Implementation is complex and software costs more than in a
centralized system. Additional software is needed in most cases.
• Increased processing overhead: – Many messages must be exchanged between sites to
complete a distributed transaction.
• Data integrity: – Maintaining data integrity becomes complex and may use a large share
of network resources.
• Different data formats might be used at different sites – converting between them costs
time.
• Deadlock is more difficult to handle than in a centralized system.
• Write operations on a replicated distributed database may cause much more network
traffic.
• An operating system that supports distributed systems is required to implement a
distributed database system.
• Data shared between sites over networks is vulnerable to attack. Hence, network-
oriented security protocols must be used, chosen according to the sensitivity of the data
shared.
• Database design is more complex – depending on the applications, we may need to
fragment a database, replicate it, or both.
• Handling failures is a difficult task. In some cases, we may not be able to distinguish
between site failure, network partition and link failure.

5.5 Data fragmentation


The main goal of DDBMS is to provide data to users from the location nearest to them, as
fast as possible. Hence the data in a table are divided according to their location or as per
the users' requirements. Dividing the whole table's data into smaller chunks and storing them
in different DBs in the DDBMS is called data fragmentation. Fragmenting a relation in the DB
allows:

• Easy usage of data: It places the most frequently accessed set of data near the
user. Hence these data can be accessed easily as and when required.
• Efficiency: It increases the efficiency of queries by reducing the table to a
smaller subset and making it available with less network access time.
• Security: It provides security to the data. Only the valid and useful records are
available to the actual user; the DB near the user contains only the information
that is necessary for them and no unwanted data.
• Parallelism: Fragmentation allows users to access the same table at the same
time from different locations. Users at different locations access the fragment of
the table in the DB at their own location, seeing the data that are meant for them.
If they all accessed the table at one location, they would have to wait for locks
to perform their transactions.
• Reliability: It increases the reliability of fetching the data. If users at different
locations all access a single DB, the network load is huge, which increases the
risk that records are lost or returned incorrectly. Accessing the fragment of data
in the nearest DB reduces the risk of data loss and of incorrect data.
• Balanced storage: Data will be distributed evenly among the databases in the DDB.

Fragmentation of data can be done according to the DBs and the user requirements. But while
fragmenting the data, the points below should be kept in mind:

• Completeness: No record (or part of a record) of the table should be left out
while creating the fragments. Fragmentation should be performed on the whole
table's data, so that every data item appears in some fragment.
• Reconstruction: When all the fragments are combined, they should give the whole
table's data. That means the whole table should be reconstructible from its
fragments.
• Disjointness: There should not be any overlapping data in the fragments. If there
is, it becomes difficult to maintain the consistency of the data, because every
copy of the overlapping data has to be kept identical.

Eid   Ename      Salary  Location
E101  Naveen     12000   HTN
E102  Jaswanth   15000   HTN
E103  Satvik     13000   LBN
E104  Samarth    14000   LBN
E105  Sai kiran  20000   KKP
E106  Vinod      18000   KKP

Local DBMS Site 1 (LBN)
E103  Satvik     13000   LBN
E104  Samarth    14000   LBN

Local DBMS Site 2 (KKP)
E105  Sai kiran  20000   KKP
E106  Vinod      18000   KKP

Local DBMS Site 3 (HTN)
E101  Naveen     12000   HTN
E102  Jaswanth   15000   HTN
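
The example above, together with the three conditions (completeness, reconstruction, disjointness), can be sketched in a few lines. The rows are the ones from the table; the function names are hypothetical and the checks are deliberately simple set comparisons, not how a DDBMS would implement them.

```python
EMPLOYEE = [
    ("E101", "Naveen", 12000, "HTN"),
    ("E102", "Jaswanth", 15000, "HTN"),
    ("E103", "Satvik", 13000, "LBN"),
    ("E104", "Samarth", 14000, "LBN"),
    ("E105", "Sai kiran", 20000, "KKP"),
    ("E106", "Vinod", 18000, "KKP"),
]


def horizontal_fragments(rows, column=3):
    """Group whole rows by the value in the given column (here: Location)."""
    fragments = {}
    for row in rows:
        fragments.setdefault(row[column], []).append(row)
    return fragments


def is_complete(rows, fragments):
    """Completeness: every row of the table appears in some fragment."""
    stored = {row for frag in fragments.values() for row in frag}
    return set(rows) <= stored


def is_disjoint(fragments):
    """Disjointness: no row appears in more than one fragment."""
    all_rows = [row for frag in fragments.values() for row in frag]
    return len(all_rows) == len(set(all_rows))


def reconstruct(fragments):
    """Reconstruction: the union of all fragments gives back the table."""
    return sorted(row for frag in fragments.values() for row in frag)


frags = horizontal_fragments(EMPLOYEE)
print(sorted(frags))                            # ['HTN', 'KKP', 'LBN']
print(is_complete(EMPLOYEE, frags))             # True
print(is_disjoint(frags))                       # True
print(reconstruct(frags) == sorted(EMPLOYEE))   # True
```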

Types:
• Horizontal Data Fragmentation
• Vertical Data Fragmentation
• Hybrid Data Fragmentation
Horizontal Data Fragmentation:

• Here the records are fragmented horizontally, i.e. horizontal subsets (sets of rows) of
the table are created and stored in different databases in the DDB.

• For example, consider the employees working at different locations of the organization,
like India, USA, UK etc. The number of employees across all these locations is huge. If
the table is kept whole, the whole table needs to be accessed whenever the details of any
one employee are required, and the table may be stored at any location in the world. But
the concept of DDB is to place the data in the nearest DB so that it can be accessed
quickly. Hence we divide the entire employee table horizontally based on location.

Vertical Data Fragmentation:

This is a vertical subset of a relation. That means a relation / table is fragmented by
its columns.

For example, consider the EMPLOYEE table with ID, Name, Address, Age, Location, DeptID,
ProjID. A vertical fragmentation of this table divides it into different tables, each
with one or more columns from EMPLOYEE.
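
A minimal sketch of vertical fragmentation of such a table follows. It assumes that every fragment keeps the key column `ID`, which is the usual way to make the original table reconstructible by a join; the text above does not state this. The sample rows and function names are made up for the illustration; only the column names come from the text.

```python
# Made-up sample rows; only the column names come from the text.
EMPLOYEE = [
    {"ID": 1, "Name": "Asha", "Address": "Street 1", "Age": 30,
     "Location": "India", "DeptID": 10, "ProjID": 100},
    {"ID": 2, "Name": "Ben", "Address": "Street 2", "Age": 41,
     "Location": "UK", "DeptID": 20, "ProjID": 200},
]


def vertical_fragment(rows, columns, key="ID"):
    """Project the rows onto the key plus the chosen columns."""
    wanted = [key] + [c for c in columns if c != key]
    return [{c: row[c] for c in wanted} for row in rows]


def rejoin(left, right, key="ID"):
    """Rebuild full rows by matching two fragments on the shared key."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]


personal = vertical_fragment(EMPLOYEE, ["Name", "Address", "Age"])
work = vertical_fragment(EMPLOYEE, ["Location", "DeptID", "ProjID"])
print(sorted(personal[0]))                 # ['Address', 'Age', 'ID', 'Name']
print(rejoin(personal, work) == EMPLOYEE)  # True
```

Because both fragments repeat `ID`, vertical fragments are disjoint only with respect to the non-key columns.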

Hybrid Data Fragmentation:

This is the combination of horizontal and vertical fragmentation. It uses horizontal
fragmentation to obtain the subsets of rows to be distributed over the DBs, and vertical
fragmentation to obtain subsets of the table's columns.

The two kinds of fragmentation can be applied in any order, based solely on the user
requirement, but the result should satisfy the fragmentation conditions.

5.6 Data Replication and types of Data Replication


Data replication is the process in which the data is copied at multiple locations (Different
computers or servers) to improve the availability of data. Data replication is done with an aim
to:
• Increase the availability of data.
• Speed up the query evaluation.

Types of data replication

There are two types of data replication:

Synchronous Replication:
In synchronous replication, the replica is modified immediately after changes are made to
the relation (table). So, there is no difference between the original data and the replica.

Asynchronous replication:
In asynchronous replication, the replica is modified only after the commit is issued on the
database.
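
The difference between the two types can be shown with a small in-memory sketch. It is an illustration only: the class `ReplicatedTable` and its methods are hypothetical, the replicas are plain dictionaries, and `propagate()` stands in for whatever later point (after commit) the system picks to update the replicas.

```python
from collections import deque


class ReplicatedTable:
    """A primary copy plus replicas, updated synchronously or asynchronously."""

    def __init__(self, replica_count, synchronous):
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]
        self.synchronous = synchronous
        self.pending = deque()  # changes not yet applied to the replicas

    def write(self, key, value):
        self.primary[key] = value
        if self.synchronous:
            # Synchronous: every replica is changed as part of the write.
            for replica in self.replicas:
                replica[key] = value
        else:
            # Asynchronous: remember the change and apply it later.
            self.pending.append((key, value))

    def propagate(self):
        """Apply all pending changes to every replica, in order."""
        while self.pending:
            key, value = self.pending.popleft()
            for replica in self.replicas:
                replica[key] = value


sync = ReplicatedTable(replica_count=2, synchronous=True)
sync.write("E101", 12000)
print(sync.replicas)  # [{'E101': 12000}, {'E101': 12000}]

lazy = ReplicatedTable(replica_count=2, synchronous=False)
lazy.write("E101", 12000)
print(lazy.replicas)  # [{}, {}]
lazy.propagate()
print(lazy.replicas)  # [{'E101': 12000}, {'E101': 12000}]
```

Until `propagate()` runs, a read from an asynchronous replica can return stale data, which is the price paid for faster writes.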

Replication Schemes

The three replication schemes are as follows:


• Full replication
• No replication
• Partial replication

Full Replication

In the full replication scheme, the database is available to almost every location or user
in the communication network.

Advantages of full replication

• High availability of data, as the database is available to almost every location.
• Faster execution of queries.

Disadvantages of full replication

• Concurrency control is difficult to achieve in full replication.
• Update operations are slower.

No Replication

No replication means that each fragment is stored at exactly one location.

Local DBMS Site 1 (LBN)
E103  Satvik     13000   LBN
E104  Samarth    14000   LBN

Local DBMS Site 2 (KKP)
E105  Sai kiran  20000   KKP
E106  Vinod      18000   KKP

Local DBMS Site 3 (HTN)
E101  Naveen     12000   HTN
E102  Jaswanth   15000   HTN



Advantages of no replication

• Concurrency control overhead is minimized, since there is only one copy of each fragment.
• Easy recovery of data.

Disadvantages of no replication

• Poor availability of data.
• Slows down query execution, as multiple clients access the same server.

Partial replication

Partial replication means only some fragments are replicated from the database.

Advantages of partial replication

The number of replicas created for each fragment depends upon the importance of the data
in that fragment.
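
The three schemes can be told apart by looking at where each fragment is stored. The sketch below assumes a hypothetical allocation map (fragment name to list of sites) and treats "full" strictly as "every fragment at every site", which is slightly stronger than the "almost every location" wording above.

```python
SITES = ["site1", "site2", "site3"]


def replication_scheme(allocation, sites=SITES):
    """Classify an allocation map {fragment: [sites holding a copy]}."""
    placements = [set(where) for where in allocation.values()]
    if all(len(p) == 1 for p in placements):
        return "no replication"
    if all(p == set(sites) for p in placements):
        return "full replication"
    return "partial replication"


none = {"LBN": ["site1"], "KKP": ["site2"], "HTN": ["site3"]}
full = {name: list(SITES) for name in none}
partial = {"LBN": ["site1", "site2"], "KKP": ["site2"], "HTN": ["site3"]}

print(replication_scheme(none))     # no replication
print(replication_scheme(full))     # full replication
print(replication_scheme(partial))  # partial replication
```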

Advantages of Data Replication

• Availability of data is increased.
• Copies of the data are kept consistent across every node of the database.
• Reliability is increased.
• There can be many users without much load at any one site, as the data is distributed
to various sites.
• Processing and execution are faster.
• The data may be found at the site where the transaction is executed, which reduces
data movement.
• Performance is increased.
• Retrieval or modification of data becomes easier.

Disadvantages of Data Replication



• More storage space is required, as the replicas take up space at several sites at the
same time.
• The cost of replicating the data at all sites increases, as every site needs to be
updated together.
• It becomes hard to maintain the consistency of data.
• The complexity of managing the data increases as well.

5.7 Client Server database and Need for Client Server Computing

Client server database system:

It is a system in which the server manages the resources and the client consumes these
resources. The client and the server are logical entities that work together over a network
to accomplish a task.

Characteristics of The Client and The Server:

• Service
• Resource sharing
• Asymmetrical protocols
• Transparency of location
• Inter-communication via messages
• Encapsulation of services
• Scalability
• Integrity

Service: Client/server is basically a relationship between processes running on
distributed devices, in which the server process is a supplier of services and the client
process is a consumer of services. In brief, this approach separates functions on the
basis of the services offered.

Resource sharing: A server can handle many clients simultaneously, controlling their
access to the shared resources.

Asymmetrical protocols: Client/server is a many-to-one relationship in which clients
initiate the dialogue by requesting a service while the server passively awaits requests.
Sometimes a client may pass a reference to a callback object when it requests a service.
This lets the server call back the client, making the server a client itself.

Transparency of location: The server process can reside on the same machine as the client
or on any machine across a network. Client/server software hides the server's location by
redirecting service calls. Therefore, a program can be a client, a server, or both.

Inter-communication via messages: Clients and servers interact through a message-passing
mechanism, mainly to deliver service requests and responses.

Encapsulation of services: A server is a specialist that decides for itself how to satisfy
client requests, and it can be upgraded without affecting its external environment
(clients, shared resources) as long as the message interface remains the same.

Scalability: Client/server systems can be scaled horizontally or vertically. Horizontal
scaling means adding or removing client workstations with only a minor impact on
performance. Vertical scaling means migrating to a more powerful server or dividing the
workload over several servers.

Integrity: Since the server code and server data are managed centrally, maintenance costs
less and the consistency of shared data is protected, while the clients remain independent.
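
The message-based interaction described in these characteristics can be sketched with two connected sockets in one process. This is a toy illustration, not a database server: the JSON message layout, the single `upper` service and the function names are all assumptions made for the example.

```python
import json
import socket
import threading


def serve(conn):
    """Server: wait passively and answer each request message with a response."""
    with conn, conn.makefile("r", encoding="utf-8") as reader, \
            conn.makefile("w", encoding="utf-8") as writer:
        for line in reader:
            request = json.loads(line)
            if request["service"] == "upper":
                reply = {"ok": True, "result": request["arg"].upper()}
            else:
                reply = {"ok": False, "error": "unknown service"}
            writer.write(json.dumps(reply) + "\n")
            writer.flush()


def run_demo():
    """Client: initiate the dialogue by sending requests and reading responses."""
    client_sock, server_sock = socket.socketpair()
    worker = threading.Thread(target=serve, args=(server_sock,), daemon=True)
    worker.start()
    replies = []
    with client_sock, client_sock.makefile("r", encoding="utf-8") as reader, \
            client_sock.makefile("w", encoding="utf-8") as writer:
        for service, arg in [("upper", "hello"), ("lower", "HELLO")]:
            writer.write(json.dumps({"service": service, "arg": arg}) + "\n")
            writer.flush()
            replies.append(json.loads(reader.readline()))
    worker.join(timeout=5)
    return replies


replies = run_demo()
print(replies[0])  # {'ok': True, 'result': 'HELLO'}
print(replies[1])  # {'ok': False, 'error': 'unknown service'}
```

The client only knows the message format, not how the server produces the result, which is the encapsulation property: the body of `serve` could be rewritten freely as long as the messages stay the same.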

Need for client server computing:

Client server computing is typically needed in large database systems for the
following reasons.

• To increase the availability of the data.
• To control multiple requests from multiple users.
• To simplify the system implementation through a centralized server and
separate client and server operations.
• To eliminate the need for expensive servers which are not capable of carrying out
efficient user interactions.
• To let users carry out their operations in a user-friendly way through a GUI.
• To provide shared access to the clients accessing the centralized databases.
• To hand over all the computational responsibilities to the server, while the client
only generates requests for the data.
• To develop a customized platform for specific applications.
• To speed up processing by providing high-speed links that connect hard disks and
processors.
• To develop applications that are based on management information systems.
• To distribute the workload among client and server.

5.8 Structure of client server system and its Advantages


Definition:

Client-server architecture is also called the “Client/Server Network” or “Network Computing
Model”, because in this architecture all services and requests are spread over the network.
It works like a distributed computing system, because all components perform their tasks
independently of each other.

Client-server architecture is a shared computer network architecture in which several
clients (remote systems) send requests to, and obtain services from, a centralized server
machine (host system). The client machine provides a user-friendly interface that lets users
send service requests to the server computer and shows the output on the client system.

Types of Client-Server Architecture:

The following are the different types of architectures.

• 1-Tier Architecture
• 2-Tier Architecture
• 3-Tier Architecture
• N-Tier Architecture

1-Tier Architecture

In the 1-tier architecture, all client/server configuration settings, the user interface
environment, data logic and business logic exist on the same system. Such services are
reliable, but they are very difficult to manage because each system holds all the data in
its own variant, so the whole work ends up being replicated. This architecture also
contains different layers, for example the Presentation, Business and Data Access layers,
within a single software package. All data is saved on the local machine. Applications that
handle all three tiers themselves, such as an MP3 player or MS Office, are 1-tier
architecture applications.

2-Tier Architecture

2-tier architecture provides a client/server environment in which the user interface is
stored on the client system and the whole database is saved on the server machine. Business
logic and database logic reside on either the client or the server, and they have to be
maintained there. When the data logic and business logic are placed on the client terminal,
it is known as “fat client thin server architecture”. If the business logic and data logic
are controlled at the server machine, it is known as “thin client fat server architecture”.

In this architecture, client and server machines are connected directly: when a client sends
input to the server there is no intermediary in between. So, output is delivered quickly
and confusion between clients is avoided. For example, 2-tier architecture is used in
online ticket reservation programs.

Benefits :

• Easy to design all applications


• Maximum user satisfaction
• Implementation of Homogeneous Environment
• Best performance

Limitations:

• Performance degrades as the number of users, and hence connections, grows
• Less security
• All clients are totally dependent upon the vendor's database.
• Less portability, because this architecture is totally dependent upon the particular
database.

3-Tier Architecture

In the 3-tier architecture, middleware is needed: when the client machine sends a request
to the server machine, the request is first received by the middle layer and then passed on
to the server. Likewise, the server's response is first received by the middle layer and
then passed on to the client machine. All data logic and business logic are stored in the
middleware. The use of middleware improves flexibility and delivers excellent performance.

3-tier architecture is divided into 3 layers: the presentation layer (Client Tier), the
application layer (Business Tier) and the database layer (Data Tier). The client machine
handles the presentation layer, the application server controls the application layer, and
the server machine takes care of the database layer.
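
The separation into three layers can be sketched in one file, with one function per tier. This is a simplified illustration: all three tiers run in the same process, SQLite's in-memory database stands in for the database server, and the table layout, the function names and the 15000 threshold for the "high" band are invented for the example.

```python
import sqlite3


# Data tier: owns the database and nothing else.
def open_database():
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE employee (eid TEXT PRIMARY KEY, ename TEXT, salary INTEGER)"
    )
    db.executemany(
        "INSERT INTO employee VALUES (?, ?, ?)",
        [("E101", "Naveen", 12000), ("E105", "Sai kiran", 20000)],
    )
    return db


# Application (business) tier: the only code that queries the data tier.
def salary_band(db, eid):
    row = db.execute(
        "SELECT ename, salary FROM employee WHERE eid = ?", (eid,)
    ).fetchone()
    if row is None:
        return None
    ename, salary = row
    # Hypothetical business rule, invented for this sketch.
    return {"name": ename, "band": "high" if salary >= 15000 else "standard"}


# Presentation tier: formats results for the user, knows nothing about SQL.
def render(result):
    if result is None:
        return "No such employee"
    return f"{result['name']}: {result['band']} salary band"


db = open_database()
print(render(salary_band(db, "E105")))  # Sai kiran: high salary band
print(render(salary_band(db, "E999")))  # No such employee
```

Because `render` never sees the table, the database structure stays hidden from the client tier, which is one of the benefits listed for this architecture.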

Benefits:

• Better data integrity
• Improved security compared with 2-tier architecture
• Hides the database structure

Limitation:

▪ Communication between client and server becomes more complex, because middleware is
also involved.

N-Tier Architecture

This architecture is also known as the “Multitier Architecture”; it is a scaled-up form of
3-tier architecture. In this architecture, the presentation, application processing and data
management functions are all isolated from each other.

Benefit

▪ It delivers the flexible and reusable applications.

Limitations

▪ Harder to implement because it uses the complex structure (componentization of tiers)

5.9 Multimedia database


Multimedia database:

It is a collection of interrelated multimedia data that includes text, graphics, images,
animations, video, audio etc. and holds vast amounts of multisource multimedia data. The
framework that manages different types of multimedia data, which can be stored, delivered
and utilized in different ways, is known as a multimedia database management system. There
are three classes of multimedia database:

▪ Static media

▪ Dynamic media

▪ Dimensional media

Content of Multimedia Database management system:

• Media data – The actual data representing an object.

• Media format data – Information such as sampling rate, resolution, encoding scheme
etc. about the format of the media data after it goes through the acquisition, processing
and encoding phase.

• Media keyword data – Keywords description relating to the generation of data. It is also
known as content descriptive data. Example: date, time and place of recording.

• Media feature data – Content dependent data such as the distribution of colors, kinds
of texture and different shapes present in data.
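
The four kinds of content listed above can be pictured as the fields of one record. The sketch below is only an illustration: the class name `MediaObject` and every sample value are made up, and a real multimedia DBMS would use a far richer schema.

```python
from dataclasses import dataclass, field


@dataclass
class MediaObject:
    media_data: bytes                                  # the actual encoded object
    format_data: dict = field(default_factory=dict)    # sampling rate, resolution, encoding
    keyword_data: dict = field(default_factory=dict)   # date, time, place of recording
    feature_data: dict = field(default_factory=dict)   # colour distribution, textures, shapes


# Placeholder values, invented for the example.
clip = MediaObject(
    media_data=b"\x00\x01\x02",
    format_data={"encoding": "PCM", "sampling_rate_hz": 44100},
    keyword_data={"place": "studio"},
)
print(clip.format_data["sampling_rate_hz"])  # 44100
print(clip.feature_data)                     # {}
```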

Types of multimedia applications based on data management characteristic are:

▪ Repository applications – A large amount of multimedia data as well as meta-data
(media format data, media keyword data, media feature data) is stored for
retrieval purposes, e.g., repositories of satellite images, engineering drawings, radiology
scanned pictures.

▪ Presentation applications – They involve delivery of multimedia data subject to
temporal constraints. Optimal viewing or listening requires the DBMS to deliver data at
a certain rate, offering a quality of service above a certain threshold. Here data is
processed as it is delivered. Example: annotating of video and audio data, real-time
editing analysis.

▪ Collaborative work using multimedia information – It involves executing a complex task


by merging drawings, changing notifications. Example: Intelligent healthcare network.

Challenges to multimedia databases:

▪ Modelling – Work in this area can improve both database and information retrieval
techniques; documents in particular constitute a specialized area and deserve special
consideration.

▪ Design – The conceptual, logical and physical design of multimedia databases has not
yet been fully addressed, as performance and tuning issues at each level are far more
complex: the data comes in a variety of formats such as JPEG, GIF, PNG and MPEG, which are not
easy to convert from one form to another.

▪ Storage – Storing a multimedia database on any standard disk presents problems of
representation, compression, mapping to device hierarchies, archiving and buffering
during input-output operations. In a DBMS, a “BLOB” (Binary Large Object) facility allows
untyped bitmaps to be stored and retrieved.

▪ Performance – For an application involving video playback or audio-video
synchronization, physical limitations dominate. The use of parallel processing may
alleviate some problems, but such techniques are not yet fully developed. Apart from
this, multimedia databases consume a lot of processing time as well as bandwidth.

▪ Queries and retrieval – For multimedia data such as images, video and audio, accessing data
through queries raises many issues, such as efficient query formulation, query execution
and optimization, which still need to be worked on.
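
The BLOB facility mentioned under storage can be tried with Python's built-in `sqlite3` module. This is a minimal sketch: the `media` table, its columns and the stand-in bytes are assumptions for illustration, not part of any particular multimedia DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE media ("
    "id INTEGER PRIMARY KEY, name TEXT, encoding TEXT, content BLOB)"
)

# Stand-in for real image bytes; the database treats them as untyped binary.
image_bytes = bytes(range(256))
conn.execute(
    "INSERT INTO media (name, encoding, content) VALUES (?, ?, ?)",
    ("sample", "PNG", image_bytes),
)
conn.commit()

# The format metadata is queryable; the BLOB itself comes back as raw bytes.
stored_encoding, stored_bytes = conn.execute(
    "SELECT encoding, content FROM media WHERE name = ?", ("sample",)
).fetchone()
print(stored_encoding, len(stored_bytes))
```

The database can filter on `name` or `encoding` but cannot look inside `content`, which illustrates why querying multimedia by its actual content is listed as a separate challenge.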

Areas where multimedia database is applied:

▪ Documents and record management: Industries and businesses that keep detailed
records and a variety of documents. Example: Insurance claim record.

▪ Knowledge dissemination: A multimedia database is a very effective tool for knowledge
dissemination in terms of providing several resources. Example: Electronic books.

▪ Education and training: Computer-aided learning materials can be designed using
multimedia sources, which are nowadays very popular sources of learning. Example:
Digital libraries.

▪ Marketing, advertising, retailing, entertainment and travel. Example: a virtual tour of
cities.

▪ Real-time control and monitoring: Coupled with active database technology,
multimedia presentation of information can be a very effective means for monitoring and
controlling complex tasks. Example: Manufacturing operation control.

5.10 Mobile database


Mobile databases:

These databases are physically separate from the main database and can easily be transported to
various places. Even though they are not always connected to the main database, they can still
communicate with it over a communication link to share and exchange data.

Mobile database components:

▪ The main system database that stores all the data and is linked to the mobile database.

▪ The mobile database that allows users to view information even while on the move. It
shares information with the main database.

▪ The device that uses the mobile database to access data. This device can be a mobile
phone, laptop etc.

▪ A communication link that allows the transfer of data between the mobile database and
the main database.
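
How the four components fit together can be sketched with two in-memory SQLite databases, one playing the main database and one the mobile database, with a function call standing in for the communication link. The `orders` table, the `synced` flag and the `sync` function are hypothetical names chosen for this illustration; real products use their own synchronization protocols.

```python
import sqlite3


def make_db():
    """Create a database with the same (assumed) schema on both sides."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "id TEXT PRIMARY KEY, item TEXT, synced INTEGER DEFAULT 0)"
    )
    return conn


main_db = make_db()     # the main system database
mobile_db = make_db()   # the database carried on the device

# Work done on the device while it is disconnected.
mobile_db.execute("INSERT INTO orders (id, item) VALUES ('m-1', 'pen')")
mobile_db.execute("INSERT INTO orders (id, item) VALUES ('m-2', 'ink')")
mobile_db.commit()


def sync(mobile, main):
    """Copy unsynced rows to the main database, then mark them as synced."""
    pending = mobile.execute(
        "SELECT id, item FROM orders WHERE synced = 0"
    ).fetchall()
    for order_id, item in pending:
        main.execute(
            "INSERT OR REPLACE INTO orders (id, item, synced) VALUES (?, ?, 1)",
            (order_id, item),
        )
    main.commit()
    mobile.execute("UPDATE orders SET synced = 1 WHERE synced = 0")
    mobile.commit()
    return len(pending)


sent = sync(mobile_db, main_db)  # the link becomes available
print(sent)
```

A second call to `sync` sends nothing, because every row on the device is already marked as synced.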

Advantages of Mobile Databases

▪ The data in a database can be accessed from anywhere using a mobile database. It
provides wireless database access.

▪ The database systems are kept synchronized through mobile databases, and multiple users can
access the data with a seamless delivery process.

▪ Mobile databases require very little support and maintenance.

▪ The mobile database can be synchronized with multiple devices such as mobiles,
computer devices, laptops etc.

Disadvantages of Mobile Databases

Some disadvantages of mobile databases are −

▪ The mobile data is less secure than data that is stored in a conventional stationary
database. This presents a security hazard.

▪ The mobile unit that houses a mobile database may frequently lose power because of
its limited battery. The database must be designed so that this does not lead to loss of data.

5.11 Web Database


Web database:

It is a database application designed to be managed and accessed through the Internet.
Website operators can manage this collection of data and present analytical results based on
the data in the Web database application. Web databases first appeared in the 1990s and have
been an asset for businesses, allowing the collection of large amounts of data from
large numbers of customers.

Data Organization in web database

Web databases enable collected data to be organized and cataloged thoroughly within
hundreds of parameters. The Web database does not require advanced computer skills, and
many database software programs provide an easy "click-and-create" style with no complicated
coding. Fill in the fields and save each record. Organize the data however you choose, such as
chronologically, alphabetically or by a specific set of parameters.

Web Database Software

Web database software programs are found within office productivity suites, such as
Microsoft Office Access and OpenOffice Base. Other programs include the Webex WebOffice
database and FormLogix Web database. The most advanced software applications can set up
data collection forms, polls and feedback forms, and present data analysis in real time.

Applicable Uses

Businesses both large and small can use Web databases to create website polls, feedback
forms, client or customer and inventory lists. Personal Web database use can range from
storing personal email accounts to a home inventory to personal website analytics. The Web
database is entirely customizable to an individual's or business's needs.

5.12 Multidimensional Database


Multidimensional databases

▪ These are used mostly for OLAP (online analytical processing) and data warehousing.
They can be used to show multiple dimensions of data to users.

▪ A multidimensional database is created from multiple relational databases. While
relational databases allow users to access data in the form of queries,
multidimensional databases allow users to ask analytical questions related to business
or market trends.

▪ Multidimensional databases use MOLAP (multidimensional online analytical
processing) to access their data. They allow users to get answers to their
requests quickly by generating and analyzing the data rapidly.

▪ The data in multidimensional databases is stored in a data cube format. This means that
data can be seen and understood from many dimensions and perspectives.

▪ For example, the revenue and costs of a company can be understood and analyzed on the basis
of various factors such as the company's products, the geographical locations of the company
offices, the time taken to develop a product, promotions done etc.
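
The idea of viewing the same data along different dimensions can be sketched with a small dictionary-based cube. This is a toy illustration under assumptions: the three dimensions, the `roll_up` helper and all revenue figures are made up, and a real MOLAP engine stores cubes in specialized array structures.

```python
from collections import defaultdict

# Each cell is keyed by (product, region, quarter) and holds a revenue figure.
# All values are invented for illustration.
cube = {
    ("pen", "north", "Q1"): 100,
    ("pen", "south", "Q1"): 150,
    ("pen", "north", "Q2"): 120,
    ("ink", "north", "Q1"): 80,
    ("ink", "south", "Q2"): 60,
}
DIMENSIONS = ("product", "region", "quarter")


def roll_up(cube, dimension):
    """Total the measure along one dimension, collapsing the other two."""
    axis = DIMENSIONS.index(dimension)
    totals = defaultdict(int)
    for key, revenue in cube.items():
        totals[key[axis]] += revenue
    return dict(totals)


by_product = roll_up(cube, "product")
by_region = roll_up(cube, "region")
print(by_product)
print(by_region)
```

Both views are computed from the same cells; only the dimension chosen for grouping changes, which is what "seen from many perspectives" means in practice.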

Advantages of Multidimensional Databases

▪ Increased performance

The performance of multidimensional databases on analytical queries is much superior to that
of normal databases such as relational databases.

▪ Easy maintenance

The multidimensional database is easy to handle and maintain.

▪ Better data presentation

The data in a multidimensional database is multi-faceted and contains many different
factors. Hence, the data presentation is far superior to that of conventional databases.

Disadvantages of Multidimensional Databases

▪ One of the disadvantages of multidimensional databases is that they are quite complex, and
it takes professionals to truly understand and analyse the data in the database.

5.13 OLTP and OLAP


On Line Transaction Processing (OLTP)

An On-Line Transaction Processing (OLTP) system is a system that manages transaction-oriented
applications. Such systems are designed to support on-line transactions and to process
queries quickly, often over the Internet.

For example: the POS (point of sale) system of any supermarket is an OLTP system.

Industries of every kind use OLTP systems to record their transactional data. The main
concern of OLTP systems is to enter, store and retrieve data. They cover all day-to-day
operations of an organization, such as purchasing, manufacturing, payroll, accounting, etc. Such
systems have large numbers of users who conduct short transactions. They support simple
database queries, so the response time of any user action is very fast.

The data acquired through an OLTP system is stored in commercial RDBMS, which can be used
by an OLAP System for data analytics and other business intelligence operations.

Some other examples of OLTP systems include order entry, retail sales, and financial
transaction systems.
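
A short OLTP-style transaction can be sketched with Python's built-in `sqlite3` module. The example below imitates a point-of-sale step under assumed names (`stock`, `sales`, `record_sale`): a sale either reduces the stock and is logged, or neither change happens.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stock ("
    "item TEXT PRIMARY KEY, qty INTEGER NOT NULL CHECK (qty >= 0))"
)
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, item TEXT, qty INTEGER)")
conn.execute("INSERT INTO stock VALUES ('pen', 5)")
conn.commit()


def record_sale(conn, item, qty):
    """One short transaction: reduce stock and log the sale, or do neither."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE stock SET qty = qty - ? WHERE item = ?", (qty, item)
            )
            conn.execute(
                "INSERT INTO sales (item, qty) VALUES (?, ?)", (item, qty)
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a sale larger than the stock on hand.
        return False


ok = record_sale(conn, "pen", 2)         # succeeds, stock becomes 3
rejected = record_sale(conn, "pen", 10)  # fails, stock stays at 3
print(ok, rejected)
```

Each call touches only a row or two and finishes immediately, which is the typical OLTP workload: many users, short transactions, simple queries.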

Advantages of an OLTP System:

• OLTP systems are user friendly and can be used by anyone having a basic understanding.

• They allow users to perform operations like reading, writing and deleting data quickly.

• They respond to user actions immediately, as they can process queries very quickly.

• These systems are the original source of the data.

• They help to administer and run fundamental business tasks.

Challenges of an OLTP system:

• It allows multiple users to access and change the same data at the same time, so it
requires concurrency control and recovery mechanisms to avoid inconsistent data and other
unexpected situations.

• The data acquired through OLTP systems are not suitable for decision making. OLAP
systems are used for the decision making or “what if” analysis.

OLAP

OLAP stands for Online Analytical Processing. It is a software technology that allows
users to analyze information from multiple database systems at the same time. It is based on
the multidimensional data model and allows the user to query multi-dimensional data. OLAP
databases are divided into one or more cubes, and these cubes are known as hypercubes.

Advantages of OLAP

• OLAP is a platform for all types of business activities, including planning, budgeting,
reporting, and analysis.
• Information and calculations are consistent in an OLAP cube. This is a crucial benefit.
• Quickly create and analyze “What if” scenarios
• Easily search OLAP database for broad or specific terms.
• OLAP provides the building blocks for business modeling tools, Data mining tools,
performance reporting tools.
• Allows users to slice and dice cube data by various dimensions, measures, and
filters.
• It is good for analyzing time series.
• Finding some clusters and outliers is easy with OLAP.
• It is a powerful online analytical processing and visualization system that provides fast
response times.
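
The slice and dice operations named above can be sketched on a handful of cube cells. The helper names `slice_cube` and `dice_cube`, the dimensions and the revenue values are assumptions for illustration only; OLAP servers expose these operations through their own query languages.

```python
# Each dictionary is one cube cell: three dimensions plus one measure.
cells = [
    {"product": "pen", "region": "north", "quarter": "Q1", "revenue": 100},
    {"product": "pen", "region": "south", "quarter": "Q1", "revenue": 150},
    {"product": "pen", "region": "north", "quarter": "Q2", "revenue": 120},
    {"product": "ink", "region": "north", "quarter": "Q1", "revenue": 80},
    {"product": "ink", "region": "south", "quarter": "Q2", "revenue": 60},
]


def slice_cube(cells, dimension, value):
    """Slice: fix one dimension to a single value."""
    return [c for c in cells if c[dimension] == value]


def dice_cube(cells, **allowed):
    """Dice: keep a sub-cube by restricting several dimensions at once."""
    return [
        c for c in cells
        if all(c[dim] in values for dim, values in allowed.items())
    ]


q1 = slice_cube(cells, "quarter", "Q1")
pen_north = dice_cube(cells, product={"pen"}, region={"north"})

q1_revenue = sum(c["revenue"] for c in q1)
pen_north_revenue = sum(c["revenue"] for c in pen_north)
print(q1_revenue, pen_north_revenue)
```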

Disadvantages of OLAP

• OLAP requires organizing data into a star or snowflake schema. These schemas are
complicated to implement and administer.
• You cannot have a large number of dimensions in a single OLAP cube.
• Transactional data cannot be accessed with an OLAP system.
• Any modification in an OLAP cube needs a full update of the cube. This is a
time-consuming process.
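
A star schema places one fact table at the centre and links it to a dimension table for each dimension. The sketch below builds a tiny one with Python's `sqlite3` module; the table names (`fact_sales`, `dim_product`, `dim_region`) and the data are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_region  (region_id  INTEGER PRIMARY KEY, name TEXT);

-- The fact table holds the measure plus one foreign key per dimension.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    region_id  INTEGER REFERENCES dim_region(region_id),
    revenue    INTEGER
);

INSERT INTO dim_product VALUES (1, 'pen'), (2, 'ink');
INSERT INTO dim_region  VALUES (1, 'north'), (2, 'south');
INSERT INTO fact_sales  VALUES (1, 1, 100), (1, 2, 150), (2, 1, 80);
""")

# A typical analytical query: join the fact table to a dimension and aggregate.
rows = conn.execute("""
    SELECT p.name, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.name
    ORDER BY p.name
""").fetchall()
print(rows)
```

A snowflake schema would further split a dimension table (for example, region into city and state tables), adding more joins and more tables to administer.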

Difference between OLAP and OLTP

S.No. | OLAP | OLTP
1 | OLAP stands for online analytical processing. | OLTP stands for online transaction processing.
2 | It includes software tools that help in analyzing data mainly for business decisions. | It helps in managing online database modification.
3 | It utilizes the data warehouse. | It utilizes traditional approaches of DBMS.
4 | It is popular as an online database query management system. | It is popular as an online database modifying system.
5 | OLAP employs the data warehouse. | OLTP employs traditional DBMS.
6 | It holds old data from various databases. | It holds current operational data.
7 | Here, the tables are not normalized. | Here, the tables are normalized.
8 | It mostly allows read operations and hardly any write operations. | It allows both read and write operations.
9 | Here, complex queries are involved. | Here, the queries are simple.

5.14 Parallel Database

Parallel Databases

Companies need to handle huge amounts of data at high data transfer rates. Client-server
and centralized systems are not efficient enough for this. The need to improve efficiency
gave rise to the concept of parallel databases.

A parallel database system improves the performance of data processing by using multiple
resources in parallel, for example multiple CPUs and disks. It also parallelizes many
operations, such as data loading and query processing.
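
The basic pattern of parallel query processing (partition the data, let each worker compute a partial result, then combine) can be sketched with Python's standard `concurrent.futures` module. This is only an illustration of the pattern: the helper names and data are made up, and threads in CPython do not give true CPU parallelism for this kind of work, whereas a parallel DBMS runs the partitions on separate processors and disks.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, n):
    """Round-robin split of the rows into n partitions, one per worker."""
    return [rows[i::n] for i in range(n)]


def local_sum(part):
    """Partial aggregate computed by one worker on its own partition."""
    return sum(amount for _, amount in part)


# Hypothetical table of (id, amount) rows.
rows = [(i, i * 10) for i in range(1, 101)]

parts = partition(rows, 4)
with ThreadPoolExecutor(max_workers=4) as pool:
    partial_sums = list(pool.map(local_sum, parts))

total = sum(partial_sums)  # combine step
print(partial_sums, total)
```

The combine step works here because a sum of partial sums equals the overall sum; aggregates such as averages need each worker to return both a sum and a count.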

Goals of Parallel Databases

Improve performance:
The performance of the system can be improved by connecting multiple CPUs and disks in
parallel. Many small processors can also be connected in parallel.

Improve availability of data:
Data can be copied to multiple locations to improve the availability of data.
For example: if a module containing a relation (a table in the database) is unavailable, it is
important to make the relation available from another module.

Improve reliability:
The reliability of the system is improved through completeness, accuracy and availability of data.

Provide distributed access of data:
Companies having many branches in multiple cities can access data with the help of a parallel
database system.

5.15 Data warehouse

Data Warehouse

• A data warehouse is a relational database that is designed for query and analysis
rather than transaction processing. It includes historical data derived from
transaction data from single or multiple sources.
• A data warehouse provides integrated, enterprise-wide, historical data and
focuses on providing support for decision makers for data modeling and analysis.
• A data warehouse is a group of data specific to the entire organization, not only
to a particular group of users.
• It is not used for daily operations and transaction processing but for making
decisions.
Data Warehouse Features
The key features of a data warehouse are discussed below −
• Subject Oriented − A data warehouse is subject oriented because it provides
information around a subject rather than the organization's ongoing operations.
These subjects can be product, customers, suppliers, sales, revenue, etc. A data
warehouse does not focus on the ongoing operations, rather it focuses on
modelling and analysis of data for decision making.
• Integrated − A data warehouse is constructed by integrating data from
heterogeneous sources such as relational databases, flat files, etc. This
integration enhances the effective analysis of data.
• Time Variant − The data collected in a data warehouse is identified with a
particular time period. The data in a data warehouse provides information from
the historical point of view.
• Non-volatile − Non-volatile means the previous data is not erased when new
data is added to it. A data warehouse is kept separate from the operational
database, and therefore frequent changes in the operational database are not reflected
in the data warehouse.
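
Three of these features (integrated, time variant, non-volatile) can be sketched with a small loading routine. Everything here is an assumption made for illustration: the `warehouse` list, the `load` function, the source names and the sample rows. Real warehouses use dedicated ETL tools for this step.

```python
import csv
import io

warehouse = []  # append-only: existing rows are never updated or erased


def load(rows, source, load_date):
    """Bring rows from one source into a common format and append them."""
    for r in rows:
        warehouse.append({
            "customer": r["customer"].strip().title(),  # one naming convention
            "amount": float(r["amount"]),               # one numeric type
            "source": source,
            "load_date": load_date,                     # time-variant stamp
        })


# Source 1: rows as they might come from an operational relational database.
db_rows = [{"customer": "asha", "amount": 120}]

# Source 2: a flat file with different formatting conventions.
flat_file = io.StringIO("customer,amount\nRAVI ,75.5\n")

load(db_rows, "orders_db", "2024-01-31")
load(csv.DictReader(flat_file), "branch_csv", "2024-02-29")

for row in warehouse:
    print(row)
```

Later loads only add rows with a newer `load_date`, so earlier snapshots remain available for historical analysis.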

Applications:
Data warehouses are widely used in the following fields −

• Financial services
• Banking services
• Consumer goods
• Retail sectors
• Controlled manufacturing

Types of Data Warehouse

Information processing, analytical processing, and data mining are the three types of data
warehouse applications.
• Information Processing − A data warehouse allows the data stored in it to be
processed. The data can be processed by means of querying, basic statistical analysis,
and reporting using crosstabs, tables, charts, or graphs.
• Analytical Processing − A data warehouse supports analytical processing of the
information stored in it. The data can be analyzed by means of basic OLAP
operations, including slice-and-dice, drill down, drill up, and pivoting.
• Data Mining − Data mining supports knowledge discovery by finding hidden
patterns and associations, constructing analytical models, performing
classification and prediction. These mining results can be presented using the
visualization tools.

5.16 NoSQL Database


NoSQL Database:
A NoSQL database is a non-relational data management system that does not require a fixed
schema. It avoids joins and is easy to scale. NoSQL databases are mainly used for
distributed data stores with very large data storage needs. NoSQL is used for big data and
real-time web apps. For example, companies like Twitter, Facebook and Google collect
terabytes of user data every single day.
NoSQL stands for “Not Only SQL” or “Not SQL.” A traditional RDBMS uses SQL syntax
to store and retrieve data for further insights. In contrast, a NoSQL database system
encompasses a wide range of database technologies that can store structured, semi-structured,
unstructured and polymorphic data.

Features of NoSQL
Non-relational

• NoSQL databases never follow the relational model


• Never provide tables with flat fixed-column records

Schema-free

• NoSQL databases are either schema-free or have relaxed schemas


• Do not require any sort of definition of the schema of the data

Simple API

• Offers easy-to-use interfaces for storing and querying data
• APIs allow low-level data manipulation & selection methods

Distributed

• Multiple NoSQL databases can be executed in a distributed fashion


• Offers auto-scaling and fail-over capabilities
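
The schema-free and simple-API features can be illustrated with a toy key-value document store. This sketch is not modelled on any specific NoSQL product: the `put`, `get` and `find` functions and the sample documents are assumptions, and a real store adds persistence, indexing and distribution.

```python
import json

store = {}  # key -> JSON document; no table definition is needed up front


def put(key, document):
    """Store a document under a key, serialized as JSON."""
    store[key] = json.dumps(document)


def get(key):
    """Fetch and deserialize the document stored under a key."""
    return json.loads(store[key])


def find(field_name, value):
    """Return the keys of all documents where field_name equals value."""
    return [k for k in store if get(k).get(field_name) == value]


# Two documents in the same store with different sets of fields.
put("user:1", {"name": "Asha", "city": "Hyderabad"})
put("user:2", {"name": "Ravi", "city": "Hyderabad", "tags": ["admin"], "age": 31})

in_hyderabad = find("city", "Hyderabad")
admins = find("tags", ["admin"])
print(in_hyderabad, admins)
```

Adding the `tags` and `age` fields to the second document required no schema change, whereas a relational table would need new columns first.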

Advantages of NoSQL

• Big Data Capability


• No Single Point of Failure
• Easy Replication
• It provides fast performance and horizontal scalability.
• Can handle structured, semi-structured, and unstructured data with equal effect
• NoSQL databases don’t need a dedicated high-performance server
• Simpler to implement than an RDBMS
• It can serve as the primary data source for online applications.
• Excels at distributed database and multi-data center operations
• Eliminates the need for a specific caching layer to store data
• Offers a flexible schema design which can easily be altered without downtime or service
disruption

Disadvantages of NoSQL

• No standardization rules
• Limited query capabilities
• Many NoSQL databases do not offer traditional database capabilities, such as consistency
when multiple transactions are performed simultaneously.
• As the volume of data increases, it becomes difficult to maintain unique values for
keys
• Doesn’t work as well with relational data
• Many options are open source, which makes them less popular with some enterprises.
