ADBMS Notes
M.Tech. SEM-II
EVEREST COLLEGE OF ENGINEERING & TECHNOLOGY
(Affiliated to Dr. BATU, Lonere, Approved by AICTE – Accredited by NAAC – B Grade – ISO 9001:2015
Institute)
Schema Refinement and Normal Forms: Introduction to schema refinement, functional dependencies,
reasoning about FD’s, Normal forms, properties of decompositions, Normalization, schema refinement in
database designs, other kinds of dependencies.
Physical Database Design and Tuning: Introduction to physical database design, guidelines for index
selection, basic examples of index selection, clustering and indexing, indexes that enable index-only plans,
tools to assist in index selection, overview of database tuning, choices in tuning the conceptual schema, choices
in tuning queries and views, Impact of concurrency.
UNIT – II: PARALLEL AND DISTRIBUTED DATABASES 09 Hours
Parallel databases: introduction, architecture for parallel databases, parallel query evaluation, parallelizing
individual operations, parallel query optimization. Distributed DBMS-Concepts and Design: Introduction,
overview of networking, functions and architecture of a DBMS, distributed relational database design,
transparencies in a DDBMS, Date's twelve rules for a
DDBMS. Distributed DBMS—Advanced Concepts: Distributed Transaction Management, Distributed
Concurrency control, Distributed deadlock management, distributed database Recovery, the X/Open Distributed
Transaction Processing Model, Distributed Query optimization, distribution in oracle.
UNIT – III: OBJECT DATABASE MANAGEMENT SYSTEMS 09 Hours
comparison of ORDBMS and OODBMS, object-oriented database design and object-oriented analysis and
design with UML.
Object-Oriented DBMS—Standards and Systems: Object Management Group, Object Data Standard
ODMG 3.0, 1999, ObjectStore.
UNIT – IV: Web and DBMS 09 Hours
Web Technology and DBMS: Introduction to the internet and the Web, the web, scripting languages, common
gateway interface, HTTP Cookies, Extending the web server, Java, Microsoft web platform and oracle internet
platform. Semi-structured Data and XML: Semi-structured data, Introduction to XML, XML-Related
technologies, XML schema, XML Query Languages, XML and databases and XML in oracle.
UNIT – I: Database Design and tuning
Schema refinement is a critical process in designing efficient and reliable databases. Schema refinement, also
known as database normalization, is the process of organizing data in a database to eliminate redundancy,
ensure data integrity, and make the database easier to maintain.
1. Schema Refinement: Schema refinement means improving a given schema by applying systematic techniques. The most important
technique of schema refinement is decomposition.
Normalisation or Schema Refinement is a technique of organizing the data in the database. It is a
systematic approach of decomposing tables to eliminate data redundancy and undesirable characteristics
like Insertion, Update and Deletion Anomalies.
Redundancy refers to repetition of the same data, i.e., duplicate copies of the same data stored in different locations.
Anomalies: Anomalies are the problems that occur in poorly planned, unnormalized databases where
all the data is stored in one table, which is sometimes called a flat-file database.
Here all the data is stored in a single table, which causes redundancy of data (anomalies), as SID and Sname
are repeated for the same CID. Let us discuss the anomalies one by one. Due to redundancy of data, we may get
the following problems:
1. Insertion anomalies: It may not be possible to store some information unless some other information is
stored as well.
2. Redundant storage: Some information is stored repeatedly.
3. Update anomalies: If one copy of redundant data is updated, an inconsistency is created unless all
redundant copies of the data are updated.
4. Deletion anomalies: It may not be possible to delete some information without losing some other information
as well.
Normalization:
Normalization is a process of designing a consistent database with minimum redundancy which supports data
integrity, by decomposing a given relation into smaller relations while preserving the constraints on the relation.
OR
Normalization removes data redundancy and helps in designing a good database. It involves a set of normal forms as
follows:
1) First normal form (1NF)
2) Second normal form (2NF)
3) Third normal form (3NF)
4) Boyce–Codd normal form (BCNF)
5) Fourth normal form (4NF)
6) Fifth normal form (5NF)
7) Sixth normal form (6NF)
8) Domain-key normal form (DKNF)
Purpose of Normalization:
➢ Minimize the redundancy in data.
➢ Remove insert, update, and delete anomalies during the database activities.
➢ Reduce the need to organize the data when it is modified or enhanced.
➢ Normalization reduces a complex user view to a set of small and sub groups of fields or relations. This
process helps to design a logical data model known as conceptual data model.
Advantages of Normalization: 1. Greater overall database organization will be gained. 2. The amount of
unnecessary redundant data is reduced. 3. Data integrity is easily maintained within the database. 4. The database
& application design processes are much more flexible. 5. Security is easier to maintain or manage.
Disadvantages of Normalization: 1. The disadvantage of normalization is that it produces a lot of tables with a
relatively small number of columns. These columns then have to be joined using their primary/foreign key
relationship. 2. This has two disadvantages. Performance: all the joins required to merge data slow processing &
place additional stress on your hardware. Complex queries: developers have to code complex queries in order to
merge data from different tables.
Properties of decomposition:
1) Lossless decomposition: while joining two smaller tables no data should be lost and should satisfy all
the rules of decomposition. No additional data should be generated on natural join of decomposed tables.
2) Lack of redundancy: It is also known as repetition of information. A proper decomposition should
not suffer from any data redundancy.
Key aspects of schema refinement:
1. Normalization:
o Summary: Primary technique to decompose tables into smaller relations, eliminating redundancy
and anomalies (insertion, deletion, update). Achieves higher normal forms (1NF, 2NF, 3NF,
BCNF) by managing functional dependencies.
o Example: Splitting a table with redundant customer names into Customer and Order tables for
3NF.
2. Elimination of Anomalies:
o Summary: Prevents insertion (cannot add data due to missing attributes), deletion (losing
unrelated data), and update (multiple updates for consistency) anomalies by logically structuring
data.
o Example: Ensuring deleting an order doesn’t remove customer data.
3. Decomposition:
o Summary: Breaks large relations into smaller ones while ensuring lossless join (no data loss) and
dependency preservation (maintaining constraints).
o Example: Decomposing a table into smaller tables that can be re-joined without losing
information.
4. Functional Dependency Analysis:
o Summary: Analyzes functional dependencies to identify and remove redundant attributes or
transitive dependencies causing anomalies.
o Example: Decomposing a schema with dependencies A → B and B → C to eliminate the transitive
dependency for 3NF (see the SQL sketch after this list).
5. Ensuring Data Integrity:
o Summary: Enforces constraints like primary keys, foreign keys, and referential integrity to
maintain consistency across relations.
o Example: Ensuring an Order table’s foreign key references a valid Customer ID.
6. Performance Optimization:
o Summary: Balances normalization with performance, using controlled denormalization or
indexing to avoid complex queries in read-heavy applications.
o Example: Adding a redundant attribute to reduce frequent joins.
7. Higher Normal Forms:
o Summary: Refines schema to achieve 3NF or BCNF for minimal redundancy. BCNF is stricter
but may not preserve dependencies, while 3NF is practical.
o Example: Converting a schema to 3NF to eliminate transitive dependencies.
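For illustration, here is a minimal SQL sketch of items 1 and 4 above, using hypothetical EMPLOYEE and DEPARTMENT tables (not from the original notes): the transitive dependency emp_id → dept_id → dept_name is removed by decomposing the flat table into two relations that can be re-joined losslessly.

-- Unnormalized: dept_name depends on dept_id, not directly on the key emp_id.
CREATE TABLE employee_flat (
    emp_id    INT PRIMARY KEY,
    emp_name  VARCHAR(50),
    dept_id   INT,
    dept_name VARCHAR(50)    -- transitively dependent attribute, repeated per employee
);

-- 3NF decomposition: each non-key attribute depends only on the key of its table.
CREATE TABLE department (
    dept_id   INT PRIMARY KEY,
    dept_name VARCHAR(50)
);

CREATE TABLE employee (
    emp_id   INT PRIMARY KEY,
    emp_name VARCHAR(50),
    dept_id  INT REFERENCES department(dept_id)   -- foreign key preserves the link
);

Joining employee and department on dept_id reconstructs the original data, so the decomposition is lossless and dependency-preserving.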
Steps in the schema refinement process:
1. Analyze Requirements:
o Summary: Understands data, relationships, and functional dependencies based on application
needs.
2. Identify Anomalies:
o Summary: Detects potential insertion, deletion, or update anomalies in the initial schema.
3. Normalize the Schema:
o Summary: Converts schema to 1NF (atomic attributes), 2NF (no partial dependencies), 3NF (no
transitive dependencies), and optionally BCNF.
o Example: Removing repeating groups for 1NF.
4. Decompose Relations:
o Summary: Breaks tables into smaller relations while ensuring lossless join and dependency
preservation.
5. Validate Constraints:
o Summary: Ensures primary keys, foreign keys, and other constraints are correctly implemented.
6. Optimize for Performance:
o Summary: Evaluates query patterns and applies denormalization or indexing for efficiency.
7. Test the Schema:
o Summary: Verifies the schema supports all operations without anomalies or data loss.
Physical Database Design is the process of translating a logical database schema (the structure defined
during logical design, such as tables, relationships, and constraints) into a physical structure that can be
implemented on a specific database management system (DBMS).
It focuses on optimizing the database for performance, storage efficiency, and scalability while ensuring
data integrity and security.
Physical database design bridges the gap between the conceptual/logical model and the actual
implementation on hardware and software.
Index selection is a crucial part of physical database design, where you decide which columns in a database
table should have indexes to speed up queries while balancing performance trade-offs.
Performance impact:
- Clustered index: fast for range queries (e.g., WHERE col BETWEEN x AND y); slow for inserts/updates if they disrupt the order.
- Non-clustered index: fast for lookups, joins, and filters; slows inserts/updates due to index maintenance.
Storage overhead:
- Clustered index: minimal, as it reorders existing data (no extra structure unless partitioning is used).
- Non-clustered index: additional storage for the index structure (e.g., B+ tree, hash table).
Choices in tuning:
SQL Database Tuning involves a set of techniques and best practices designed to optimize
database performance. By tuning a database, we can prevent it from becoming a bottleneck, ensuring faster
query execution and improved system efficiency. Database tuning includes strategies such as query
optimization, indexing, normalization, and hardware resource enhancements.
Query and View Tuning Techniques:
1. Database Normalization
Normalization eliminates duplicate data by breaking down large tables into smaller, related
tables. This reduces storage requirements and speeds up data retrieval. This structure ensures data
consistency and reduces redundancy, allowing for faster and more efficient queries.
We have a single table called CUSTOMERS that combines customer and order data. Let’s normalize it step
by step.
Step 1: Denormalized table
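The step-by-step tables are not reproduced here; the following is a minimal sketch, with hypothetical column names, of what the denormalized CUSTOMERS table and its normalized replacement might look like.

-- Step 1: denormalized table; customer details repeat for every order row.
CREATE TABLE customers_denorm (
    customer_id INT,
    name        VARCHAR(50),
    address     VARCHAR(100),
    order_id    INT,
    order_date  DATE,
    amount      DECIMAL(10, 2)
);

-- Step 2: normalized tables; customer data is stored once and referenced by orders.
CREATE TABLE customers (
    customer_id INT PRIMARY KEY,
    name        VARCHAR(50),
    address     VARCHAR(100)
);

CREATE TABLE orders (
    order_id    INT PRIMARY KEY,
    customer_id INT REFERENCES customers(customer_id),
    order_date  DATE,
    amount      DECIMAL(10, 2)
);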
2. Proper Indexing
Indexes are database structures that act as pointers to the location of specific data within a table, significantly
reducing query execution time. By creating indexes on frequently searched columns, we can optimize query
performance and enhance the efficiency of data retrieval, especially in large databases.
Example:
Create an index on the NAME column in a CUSTOMERS table:
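A minimal sketch of such an index (the index name is hypothetical):

-- Speeds up searches and joins that filter on NAME; maintained automatically on insert/update.
CREATE INDEX idx_customers_name ON CUSTOMERS (NAME);

The join query below would benefit in the same way from indexes on the CustomerID columns used in the join condition.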
SELECT c.NAME, o.ORDER_ID
FROM CUSTOMERS c
JOIN ORDERS o ON c.CustomerID = o.CustomerID;
The conceptual schema defines entities, relationships, and constraints at a high level, so tuning focuses on
refining these abstractions.
1. Normalization vs. Denormalization:
Choice: Normalize for Consistency: Normalize the schema (up to 3NF or BCNF) to eliminate redundancy,
reduce anomalies, and ensure data integrity. This is ideal for OLTP systems with frequent updates.
Choice: Denormalize for Performance: Introduce controlled redundancy (e.g., adding derived attributes or
precomputed aggregates) to reduce joins in read-heavy OLAP systems. For example, store a customer’s total
order value in a summary table.
Trade-off: Balance normalization (less storage, fewer anomalies) with denormalization (faster reads, higher
storage).
3. Entity and Relationship Design:
o Choice: Consolidate Entities: Merge entities with one-to-one relationships or low-cardinality relationships
if they’re always queried together to reduce joins.
o Choice: Split Entities: Break large entities into smaller ones (vertical partitioning) to isolate frequently
accessed attributes, improving query efficiency.
o Choice: Optimize Relationships: Replace many-to-many relationships with associative entities or
precomputed tables for faster lookups in high-read scenarios.
2. Attribute Optimization:
Choice: Select Appropriate Data Types: Use precise data types (e.g., INTEGER vs. VARCHAR for IDs) to
reduce storage and improve query performance. For example, use DATE instead of VARCHAR for date
fields.
Choice: Eliminate Redundant Attributes: Remove attributes that can be derived from others (e.g., age from
birthdate) to save space and simplify queries.
Choice: Add Derived Attributes: Include computed fields (e.g., total_price = quantity * unit_price) in the
schema for frequent queries to avoid runtime calculations.
4. Partitioning Strategies:
o Choice: Logical Partitioning: Design the schema to support table partitioning (e.g., range, list, or hash)
based on query patterns, such as partitioning sales data by year or region (see the sketch after this list).
o Choice: Subpartitioning: Use composite partitioning (e.g., range-list) for complex datasets to enable
finer-grained access and pruning.
o Choice: Align with Workload: Ensure partitions align with common query filters to reduce scanned data.
5. Constraints Optimization:
o Choice: Selective Constraints: Apply only necessary primary key, foreign key, and unique constraints to
minimize overhead while ensuring integrity.
o Choice: Deferrable Constraints: Use deferrable constraints (e.g., in PostgreSQL or Oracle) to delay
enforcement until transaction commit, improving bulk operation performance.
o Choice: Disable Constraints Temporarily: For large data loads, temporarily disable non-critical constraints
to speed up inserts.
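As an illustration of the partitioning choices in item 4 above, here is a minimal sketch using PostgreSQL-style declarative range partitioning on a hypothetical SALES table:

-- Parent table partitioned by sale_date, aligned with queries that filter on date.
CREATE TABLE sales (
    sale_id   BIGINT,
    region    VARCHAR(20),
    sale_date DATE,
    amount    NUMERIC(10, 2)
) PARTITION BY RANGE (sale_date);

CREATE TABLE sales_2023 PARTITION OF sales
    FOR VALUES FROM ('2023-01-01') TO ('2024-01-01');

CREATE TABLE sales_2024 PARTITION OF sales
    FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');

-- Partition pruning: only the matching partition is scanned for this filter.
SELECT SUM(amount) FROM sales WHERE sale_date >= DATE '2024-01-01';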
Impact of concurrency:
Concurrency: Concurrency in a database refers to multiple users or processes accessing and modifying
data at the same time. It’s like several people trying to use the same resource simultaneously. Proper
concurrency management ensures the database remains fast, reliable, and accurate, but poor management
can cause issues like delays, errors, or data corruption.
Impact of Concurrency:
1. Performance
Concurrency maximizes resource utilization by enabling parallel execution of transactions, reducing idle
time. For example:
In an online store’s ADBMS, hundreds of customers can browse products, add items to carts, and place
orders simultaneously without waiting for each other, as long as they access different data. This
leverages multi-core CPUs and parallel I/O operations.
2. Data Integrity
Concurrency can jeopardize data consistency if multiple transactions modify the same data without
coordination. ADBMS uses isolation mechanisms to ensure transactions don’t interfere destructively, but these
mechanisms have trade-offs.
Proper concurrency control ensures ACID (Atomicity, Consistency, Isolation, Durability) properties,
maintaining data integrity
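For example, a minimal sketch of how isolation is chosen per transaction (MySQL-style syntax; the PRODUCTS table is hypothetical). Stricter isolation protects integrity but increases blocking under heavy concurrency.

-- Applies to the next transaction in this session.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
START TRANSACTION;
-- Concurrent sessions updating the same row are serialized, preventing lost updates.
UPDATE products SET stock = stock - 1 WHERE product_id = 42 AND stock > 0;
COMMIT;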
3.Scalability
Concurrency determines how well the database handles increasing numbers of users or
transactions, critical for large-scale applications.
Horizontal Scaling: In distributed ADBMS (e.g., Oracle RAC, Snowflake), concurrency allows
transactions to run across multiple nodes, supporting more users. For example, sharding the online
store’s database by region lets customers in different areas access separate shards concurrently.
Vertical Scaling: Multi-core CPUs and large memory allow parallel query execution, handling more
transactions. For instance, parallel scans of the product catalog improve response times during peak
traffic
4. Resource Utilization
Concurrency affects how efficiently the database uses hardware resources (CPU, memory, disk,
network).
UNIT – II: Parallel and Distributed Databases
A parallel database is a database system that splits large amounts of data and tasks across multiple
computers or processors to work on them at the same time.
Key Features of a Parallel Database:
1. Data Partitioning: The data is split into smaller chunks and stored on different machines.
2. Parallel Processing: Multiple processors or computers process queries simultaneously.
3. Scalability: It can handle more data or users by adding more machines.
4. Speed: It’s faster for big tasks because the work is shared.
In Parallel Databases, mainly there are three architectural designs for parallel DBMS. They are as follows:
1. Shared Memory Architecture
2. Shared Disk Architecture
3. Shared Nothing Architecture
1. Shared Memory Architecture:
It is to be noted that, in this architecture, a single copy of a multi-threaded operating system and a
multithreaded DBMS can support these multiple CPUs. Shared memory is a tightly
coupled architecture in which multiple CPUs share their memory.
It is also known as symmetric multiprocessing (SMP). This architecture covers a very wide range,
starting from personal workstations that support a few microprocessors in parallel via RISC.
Advantages:
1. It has high-speed data access for a limited number of processors.
2. The communication is efficient.
Disadvantages:
1. It cannot use beyond 80 or 100 CPUs in parallel.
2. The bus or the interconnection network gets blocked as a large number of CPUs are added.
Fig: shared memory architecture
2. Shared Disk Architecture:
Advantages:
1. The interconnection network is no longer a bottleneck, since each CPU has its own memory.
2. Load-balancing is easier in shared disk architecture.
3. There is better fault tolerance.
Disadvantages:
1. If the number of CPUs increases, the problems of interference and memory contention also
increase.
2. There also exists a scalability problem.
3. Shared Nothing Architecture:
Advantages:
1. It has better scalability as no sharing of resources is done
2. Multiple CPUs can be added
Disadvantages:
1. The cost of communication is higher, as it involves sending data and software interaction at
both ends.
2. The cost of non-local disk access is higher than in shared disk architectures.
Note that this technology is typically used for very large databases (of the order of 10^12 bytes, i.e.,
terabytes) or for systems that process thousands of transactions per second.
4. Hierarchical Architecture:
This architecture is a combination of shared disk, shared memory and shared nothing architectures. This
architecture is scalable due to the availability of more memory and many processors, but it is costlier than the
other architectures.
Parallel query evaluation refers to the technique of executing database queries using multiple processors
or machines concurrently to improve performance and throughput. This is achieved by breaking down a
query into smaller tasks that can be processed in parallel, reducing the overall execution time.
Inter-query parallelism:
Multiple, independent queries are executed simultaneously on different processors or machines. This
approach improves overall system throughput and reduces the time required to process a set of
queries.
Intra-query parallelism:
A single query is divided into smaller tasks that are executed concurrently on different processors or
machines. This approach speeds up the execution of individual, complex queries.
Inter-operator parallelism:
Different operators within a query execution plan (e.g., joins, filters) are executed in parallel on different
processors.
Intra-operator parallelism:
A single operator (e.g., a join) is parallelized by dividing its input data and processing it concurrently
across multiple processors.
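For example, a minimal sketch of intra-query parallelism using an Oracle-style optimizer hint (the ORDERS table and the degree of parallelism are assumptions for illustration):

-- Asks the optimizer to scan and aggregate ORDERS using 4 parallel workers.
SELECT /*+ PARALLEL(o, 4) */ o.region, SUM(o.amount) AS total_amount
FROM   orders o
GROUP  BY o.region;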
Parallel Query Optimization is the process of finding the best way to divide and execute a query across
multiple processors or nodes in parallel, so that it runs faster and more efficiently.
The optimizer's job is to create a plan that divides the query into smaller tasks, assigns them to different
processors, and ensures the work is done as quickly and cost-effectively as possible.
Query Parsing:
o The database system interprets the SQL query (e.g., SELECT, JOIN, WHERE).
o Output: A logical query tree (query tree or relational algebra expression).
Plan Generation:
o Create multiple logical plans for the query (e.g., different ways to order joins or apply filters).
o Example: For “SELECT * FROM Customers JOIN Orders ON Customers.id = Orders.customer_id
WHERE Customers.region = ‘Asia’”, plans might include different orderings of the join and placements of the region filter.
Cost Estimation:
Estimate the cost of each parallel plan based on CPU, I/O, network, and load balance.
Techniques in Parallel Query Optimization:
Distributed DBMS: The Distributed DBMS is defined as the software that allows for the management of
the distributed database and makes the distributed data available to the users.
A distributed DBMS consists of a single logical database that is divided into a number of pieces called
fragments.
In a DDBMS, each site is capable of independently processing the users' requests.
Users can access the DDBMS via applications classified:
1. Local Applications - Those applications that do not require data from other sites are
classified under the category of local applications.
2. Global Applications - Those applications that require data from the other sites are classified under
the category of Global applications.
Distributed database systems play an important role in modern data management by distributing data across
multiple nodes
A distributed database system refers to a collection of multiple, interconnected databases that are physically
dispersed across different locations but function as a single unified database. This system allows for data to
be stored, accessed, and managed across multiple sites, providing several advantages over centralized or
single-site databases.
Vertical Distribution:
o Also known as vertical partitioning , this involves dividing a table into multiple
partitions based on columns.
o Each partition contains a subset of columns from the original table.
o Vertical partitioning is useful when different columns are frequently accessed together,
allowing those columns to be stored together on the same nodes.
Hybrid Distribution:
o In some cases, a combination of horizontal and vertical partitioning may be used to
optimize data storage and access patterns.
o This hybrid approach allows for flexibility in managing large datasets by combining the
benefits of both horizontal and vertical partitioning strategies.
1. Replication
Replication in distributed database systems involves creating and maintaining copies (replicas) of data on
multiple nodes (servers or locations) within a network. The primary goals of replication are to enhance data
availability, fault tolerance, and performance. Replication can be categorized into different types:
Full Replication: Every data item is replicated on every node in the distributed system. This
ensures high availability and fault tolerance because any node can serve data even if others fail.
However, it increases storage and update overhead.
Partial Replication: Only certain data items or subsets of data are replicated across nodes. This
approach can be more efficient in terms of storage and update propagation but requires careful
planning to ensure critical data availability.
2. Data Consistency
Data Consistency refers to ensuring that all replicas of a data item or database reflect the same value at any
given time. In distributed database systems, maintaining consistency across replicas is challenging due to
factors like network latency, node failures, and concurrent updates. Consistency models define how
updates are propagated and how consistent data appears to users:
Strong Consistency: Requires that all replicas reflect the most recent update before any read
operation. This ensures that all read operations return the latest committed data but can introduce
higher latency and coordination overhead.
Eventual Consistency: Allows replicas to diverge temporarily but guarantees that they will
converge to the same state eventually, without requiring immediate synchronization. Eventual
consistency improves availability and performance but may lead to temporary inconsistencies in
read operations.
In a distributed database, maintaining consistency across numerous nodes is a challenging problem. Conflicts
that result in inconsistent data might happen when many transactions are being carried out simultaneously.
Distributed databases use a variety of concurrency control techniques, such as locking, timestamp ordering, or
optimistic concurrency control, to assure consistency.
The ability to communicate over a network is essential in distributed database systems. To get the best
performance and availability, low-latency networks and effective communication protocols are necessary. For
distributed database architecture, minimizing the quantity of data sent over the network and enhancing data
transfer rates are crucial factors.
Concurrency control:
Concurrency controlling techniques ensure that multiple transactions are executed simultaneously while
maintaining the ACID properties of the transactions and serializability in the schedules.
Transaction processing in a distributed database system is also distributed, i.e., the same transaction may be
processing at more than one site.
The two main deadlock handling concerns in a distributed database system that are not present in a centralized
system are transaction location and transaction control.
Once these concerns are addressed, deadlocks are handled through any of deadlock prevention, deadlock
avoidance or deadlock detection and removal.
Transaction Location
Transactions in a distributed database system are processed in multiple sites and use data items in multiple sites.
The amount of data processing is not uniformly distributed among these sites. The time period of processing
also varies. Thus, the same transaction may be active at some sites and inactive at others. When two conflicting
transactions are located in a site, it may happen that one of them is in inactive state. This condition does not
arise in a centralized system. This concern is called transaction location issue.
Transaction Control
Transaction control is concerned with designating and controlling the sites required for processing a transaction
in a distributed database system. There are many options regarding the choice of where to process the
transaction and how to designate the centre of control.
Distributed Deadlock Prevention
Just like in centralized deadlock prevention, in distributed deadlock prevention approach, a transaction should
acquire all the locks before starting to execute. This prevents deadlocks.
The site where the transaction enters is designated as the controlling site. The controlling site sends messages to
the sites where the data items are located to lock the items. Then it waits for confirmation. When all the sites
have confirmed that they have locked the data items, transaction starts. If any site or communication link fails,
the transaction has to wait until they have been repaired.
The X/Open Distributed Transaction Processing (DTP) Model is a standardized framework developed by the
X/Open consortium (now part of The Open Group) to facilitate reliable transaction processing in distributed
environments. It provides a structured approach for coordinating transactions across multiple resource managers
(e.g., databases, file systems) in a distributed system, ensuring the ACID properties (Atomicity, Consistency,
Isolation, Durability) are maintained.
1. Distributed Transaction:
o A distributed transaction involves multiple resource managers (e.g., databases on different
servers) that must work together to complete a single logical transaction.
o The goal is to ensure that all operations either commit successfully or roll back entirely,
maintaining data consistency across systems.
2. Two-Phase Commit (2PC):
o The X/Open DTP model relies on the Two-Phase Commit protocol to ensure atomicity in
distributed transactions.
o Phase 1 (Prepare): The transaction coordinator asks all resource managers if they are ready to
commit.
o Phase 2 (Commit/Rollback): Based on the responses, the coordinator instructs all resource
managers to either commit or roll back (see the sketch after this list).
3. Interoperability:
o The model standardizes interfaces to ensure that components from different vendors (e.g.,
databases, transaction monitors) can work together seamlessly.
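As a concrete illustration of the XA-style interface that the X/Open DTP model standardizes, the following minimal sketch uses MySQL's XA statements on one resource manager; the transaction id 'txn1' and the ACCOUNTS table are hypothetical.

XA START 'txn1';
UPDATE accounts SET balance = balance - 100 WHERE account_id = 1;
XA END 'txn1';
-- Phase 1: the coordinator asks this resource manager to prepare (vote).
XA PREPARE 'txn1';
-- Phase 2: once every participant has prepared, the coordinator commits everywhere;
-- if any participant failed to prepare, it would issue XA ROLLBACK 'txn1' instead.
XA COMMIT 'txn1';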
The model has four key players that work together: the Application Program (AP), which defines the transaction boundaries; the Transaction Manager (TM), which coordinates the transaction and drives the two-phase commit; the Resource Managers (RMs), such as databases or message queues, which manage the actual data; and the Communication Resource Manager (CRM), which handles communication between distributed transaction domains.
Real-World Applications
Enterprise Applications: Java EE application servers (e.g., WebSphere, WebLogic) use the XA interface
to manage distributed transactions across databases.
Banking Systems: Ensures atomicity in financial transactions involving multiple accounts or banks.
E-Commerce: Coordinates updates to inventory, payment, and order databases in a single transaction.
Middleware: Transaction processing monitors like IBM CICS or Tuxedo implement the X/Open DTP
model.
Distributed Query Optimization is the process of finding the most efficient way to execute a query over a
Distributed Database System (DDBS), where data is stored across multiple physical locations or nodes.
OR
In a distributed database, data is stored across multiple computers (or sites) connected by a network. When you
run a query (e.g., "Find all customers who bought T-shirts last month"), the system needs to fetch and combine
data from these different sites. Distributed Query Optimization is the process of figuring out the best way to
execute this query to get results quickly and efficiently.
Think of it like planning the fastest route for a road trip across multiple cities. You want to minimize travel
time, fuel, and effort. Similarly, query optimization minimizes the time, network usage, and computing
resources needed to process a query.
When you submit a query, the distributed database system creates an execution plan—a step-by-step strategy to
fetch and process data. Here’s how it happens in simple terms:
1. Query Parsing:
o The system checks if your query is valid and breaks it into smaller parts (e.g., "select," "join,"
"where").
o It creates a logical query tree to represent the operations.
2. Query Decomposition:
o The query is split into smaller sub-queries that can be sent to different sites.
o For example, if customer data is on Site A and order data is on Site B, the system splits the query
to fetch data from both (see the sketch after this list).
3. Optimization:
o The system evaluates multiple ways to execute the query (called query plans).
o It estimates the cost of each plan based on:
Data size: How much data needs to be moved (e.g., transferring 1 GB over the network is
costly).
Network speed: How fast data travels between sites.
Processing power: How quickly each site can process its part.
Join operations: Combining data from different sites (e.g., matching customers with
orders).
o It picks the plan with the lowest cost (fastest, least resource-intensive).
4. Data Localization:
o The system decides where to process each part of the query.
o For example, it might move all data to one site or process data locally at each site to reduce
network traffic.
5. Execution:
o The chosen plan is executed, fetching data from sites, combining it, and returning the results.
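To illustrate step 2 (query decomposition) above: a minimal sketch, assuming hypothetical CUSTOMERS data at Site A and ORDERS data at Site B, of how the “customers who bought T-shirts last month” query might be split so that only a small intermediate result crosses the network.

-- Sub-query executed at Site B: reduce ORDERS locally before shipping the result.
SELECT DISTINCT customer_id
FROM   orders
WHERE  product = 'T-shirt'
  AND  order_date >= DATE '2024-05-01';

-- Sub-query executed at Site A: join the shipped result (site_b_result is the
-- hypothetical intermediate relation) with the local CUSTOMERS table.
SELECT c.customer_id, c.name
FROM   customers c
JOIN   site_b_result r ON r.customer_id = c.customer_id;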
Optimization Techniques
Challenges
UNIT NO. III OBJECT DATABASE MANAGEMENT SYSTEMS
An Object-Oriented Database Management System (OODBMS) is a database system that stores data as objects,
similar to how objects are used in object-oriented programming (OOP) languages like Java or C++. Unlike
traditional relational databases that store data in tables with rows and columns, an OODBMS stores data along
with its behaviour (methods), making it easier to model complex, real-world entities like a "Car" or a "Student"
that have both attributes (e.g., colour, name) and actions (e.g., drive, enroll).
Natural Modeling: Objects in the database mirror real-world entities, making it easier to represent
complex data like videos, 3D models, or hierarchical structures.
No Impedance Mismatch: In relational databases, you need to convert objects from code (e.g., Java
objects) into tables and back, which is cumbersome. OODBMS stores objects directly, so code and data
work seamlessly together.
Flexibility: Supports complex data types (e.g., images, graphs) and relationships (e.g., inheritance,
associations).
Behavior with Data: Stores methods (functions) with data, so objects can "act" on themselves.
1. Objects:
o An object is a single instance of data with attributes (e.g., a Student’s name, ID) and methods (e.g.,
calculate GPA).
o Each object has a unique Object ID (OID), like a permanent address, to identify it.
2. Classes:
o A class is a blueprint for objects, defining their attributes and methods.
o Example: A "Student" class defines that all students have a name, ID, and a method to enroll in courses.
3. Encapsulation:
o Data and methods are bundled together in an object, and access is controlled (e.g., private attributes can
only be changed by the object’s methods).
o Example: A Student object might hide its GPA and only allow updates through a specific method.
4. Inheritance:
o Classes can inherit properties and methods from other classes, creating hierarchies.
o Example: A "GraduateStudent" class can inherit from the "Student" class, adding specific attributes like
thesis topic.
5. Polymorphism:
o Objects of different classes can respond to the same method call in their own way.
o Example: A "printDetails" method might show different details for a Student vs. a Teacher.
6. Associations:
o Objects can be linked to other objects (e.g., a Student is enrolled in Courses).
o Supports complex relationships like one-to-many or many-to-many.
7. Persistence:
o Objects are stored permanently in the database, not just in memory, so they can be retrieved later.
Issues in OODBMSs:
OODBMSs store data as objects (like in object-oriented programming), combining data and behavior. While
powerful for complex data, they face several challenges that can make them harder to use compared to
relational databases.
1. Complexity in Design and Implementation
What It Means: Designing an OODBMS is harder than a relational database because you need to define
classes, attributes, methods, and relationships (like inheritance or associations). It’s like building a
detailed blueprint for every object, which takes more effort.
Why It’s a Problem: Programmers and database designers need advanced skills in object-oriented
concepts, which increases development time and cost.
Example: Creating a "Student" class with attributes (name, ID), methods (calculate GPA), and links to
"Course" objects is more complex than creating tables in a relational database.
Impact: Slower development, higher chance of errors, and need for specialized expertise.
2. Lack of Standardization
What It Means: Unlike relational databases, which use SQL as a standard query language, OODBMSs
don’t have a widely accepted standard for querying or managing objects. Each OODBMS (e.g.,
ObjectDB, db4o) may use its own query language or approach.
Why It’s a Problem: Developers can’t easily switch between OODBMSs, and skills learned for one
system may not apply to another.
Example: A query in ObjectDB’s language (e.g., JPQL) might not work in Versant’s OODBMS, unlike
SQL, which works across MySQL, Oracle, etc.
Impact: Reduced portability, vendor lock-in, and difficulty integrating different systems.
3. Poor Performance for Simple Queries
What It Means: OODBMSs are great for complex data (e.g., multimedia, graphs), but they can be slower
for simple queries like fetching a list of names or numbers compared to relational databases.
Why It’s a Problem: OODBMSs are designed for objects with relationships and behaviors, which adds
overhead when you just need basic data retrieval.
Example: A query like “List all student names” is faster in a relational database’s table than navigating an
OODBMS’s object structure.
Impact: Inefficient for applications with many simple, tabular queries.
4. Limited Adoption and Ecosystem
What It Means: OODBMSs are less popular than relational databases, so fewer companies use them, and
there are fewer tools, libraries, and experts available.
Why It’s a Problem: It’s harder to find support, tutorials, or third-party tools, and fewer developers are
trained in OODBMSs.
Example: Relational databases like MySQL have huge communities and tools (e.g., phpMyAdmin),
while OODBMSs like ObjectDB have smaller ecosystems.
Impact: Higher costs for training, support, and integration with other systems.
5. Query Optimization Challenges
What It Means: Optimizing queries in an OODBMS is trickier because objects have complex structures
(e.g., hierarchies, relationships) compared to flat tables in relational databases.
Why It’s a Problem: The system may struggle to find the fastest way to retrieve or combine objects,
especially in distributed environments.
Example: Finding all students enrolled in a course might require navigating multiple object links, which
is harder to optimize than a simple table join.
Impact: Slower query performance and higher resource usage for complex queries.
6. Concurrency Control Complexity
What It Means: When multiple users access or modify objects at the same time, ensuring consistency
(e.g., no conflicts) is harder in OODBMSs due to their complex structures.
Why It’s a Problem: Objects often have methods and relationships, making locking or versioning trickier
than in relational databases.
Example: If two users try to update a Student object’s course list simultaneously, the OODBMS must
carefully manage locks to avoid errors.
Impact: Increased risk of conflicts, deadlocks, or performance slowdowns in multi-user systems.
7. Scalability Limitations
What It Means: OODBMSs can struggle to handle very large datasets or many users across distributed
systems compared to relational or NoSQL databases designed for scalability.
Why It’s a Problem: Complex object relationships and persistence mechanisms can slow down as data
grows or when systems are spread across multiple servers.
Example: A social media app with millions of user objects and relationships (e.g., friends, posts) may
perform poorly compared to a NoSQL database like MongoDB.
Impact: Limited use in big data or highly distributed applications.
8. Data Migration and Integration Difficulties
What It Means: Moving data from an OODBMS to another system (e.g., relational or NoSQL) or
integrating it with other databases is difficult because of the object-oriented structure.
Why It’s a Problem: Objects include data and methods, which don’t easily map to tables or other
formats, and there’s no standard way to export/import data.
Example: Converting a Student object with methods and relationships to a relational table loses its
behavior and structure.
Impact: High effort and cost for data migration or system integration.
9. Learning Curve and Maintenance
What It Means: OODBMSs require knowledge of both database management and object-oriented
programming, which is harder to learn and maintain than relational databases.
Why It’s a Problem: Developers need to understand complex concepts like inheritance, polymorphism,
and object persistence, and maintaining the system (e.g., updating schemas) is time-consuming.
Example: Updating a class hierarchy (e.g., adding a new subclass) may require changing many objects
and methods, unlike adding a column in a relational table.
Impact: Increased training and maintenance costs.
10. Security Concerns
What It Means: Securing objects in an OODBMS is more complex because objects include methods,
which could be exploited if not properly restricted.
Why It’s a Problem: Methods might allow unintended access or changes to data, and securing object-
level access is less straightforward than table-level permissions in relational databases.
Example: A poorly designed method in a Student object could accidentally expose private data like
grades.
Impact: Higher risk of security vulnerabilities and more effort to enforce access control.
Advantages of OODBMSs:
1)Natural Modelling
Explanation: OODBMSs store data as objects, which closely mimic real-world entities with attributes
(data) and methods (behaviour). This makes it easier to represent complex data like multimedia, graphs,
or hierarchical structures.
Simplifies modelling for applications like CAD, multimedia, or gaming, where data has complex
relationships or behaviours.
2)No Impedance Mismatch
Explanation: In relational databases, you must convert objects from code (e.g., Java objects) to tables and
back, which is time-consuming. OODBMSs store objects directly, so the data in the database matches
the code structure.
Saves development time and reduces errors when integrating applications with the database.
3)Support for OOP Features
Explanation: OODBMSs support OOP concepts like encapsulation, inheritance, and polymorphism,
allowing complex relationships and reusable code.
Enables flexible and reusable database designs, especially for hierarchical or dynamic data.
4)Support for Complex Data Types
Explanation: OODBMSs handle complex data types like images, videos, or geospatial data, which are
hard to manage in relational databases.
Ideal for applications like multimedia, scientific data, or 3D modeling, where non-tabular data is common.
5)Direct Relationships
Explanation: OODBMSs support direct relationships (e.g., one-to-many, many-to-many) using object
references, avoiding complex joins like in relational databases.
Faster and simpler queries for applications with complex relationships, like social networks or e-
commerce.
6)Persistence of Objects
Explanation: Objects in an OODBMS are stored permanently with their data and methods, making it
easy to retrieve and use them later.
Disadvantages of OODBMSs:
1)Lack of Standardization: Explanation: Unlike relational databases with SQL as a universal standard, OODBMSs use different
query languages (e.g., OQL, proprietary APIs), making it hard to switch between systems.
2)Poor Performance for Simple Queries: Explanation: OODBMSs are optimized for complex objects but can be slower for simple, tabular queries
compared to relational databases.
3)Limited Adoption: Explanation: OODBMSs are less popular than relational or NoSQL databases, so there are fewer tools,
libraries, and experts available.
4)Query Optimization Challenges: Explanation: Optimizing queries in OODBMSs is harder due to complex object structures (e.g.,
hierarchies, relationships) compared to flat tables.
5)Concurrency and Scalability Challenges: Explanation: Managing multiple users accessing objects simultaneously or scaling to large datasets is
harder due to complex object relationships and methods.
Comparison of ORDBMS and OODBMS:
Definition:
- ORDBMS: A database system that combines relational database features (tables, SQL) with object-oriented features like objects, inheritance, and complex data types.
- OODBMS: A database system that stores data as objects, fully embracing object-oriented programming principles (data + behavior) without relying on tables.
Data Model:
- ORDBMS: Stores data in tables but supports objects, complex data types (e.g., arrays, JSON), and some OOP features like inheritance.
- OODBMS: Stores data as objects with attributes and methods, directly mirroring OOP structures. No tables are used.
Query Language:
- ORDBMS: Uses SQL with extensions for objects (e.g., PostgreSQL’s JSONB queries or user-defined types). Standardized and widely supported.
- OODBMS: Uses object-oriented query languages (e.g., OQL, proprietary APIs). No universal standard; varies by system.
OOP Features:
- ORDBMS: Supports some OOP features (encapsulation, inheritance, polymorphism) but within a relational framework. Limited compared to OODBMS.
- OODBMS: Fully supports OOP features (encapsulation, inheritance, polymorphism, associations) as a core part of the system.
Performance:
- ORDBMS: Faster for simple, tabular queries (e.g., “List all student names”) due to the optimized relational engine. May be slower for complex object queries.
- OODBMS: Faster for complex object queries (e.g., navigating object relationships) but slower for simple, tabular queries.
Relationships:
- ORDBMS: Uses joins (e.g., foreign keys) to manage relationships, which can be complex for hierarchical data.
- OODBMS: Uses direct object references (e.g., a Student object links to Course objects), simplifying complex relationships.
Applications:
- ORDBMS: Suited for enterprise applications, analytics, and systems needing both relational and complex data (e.g., e-commerce, GIS).
- OODBMS: Suited for niche applications like multimedia, CAD, gaming, or scientific systems with complex, object-oriented data.
Advantages:
- ORDBMS: Combines relational strengths (SQL, scalability) with object-oriented features; widely supported; good for mixed workloads.
- OODBMS: Eliminates impedance mismatch; excels at complex data and relationships; supports full OOP features.
Disadvantages:
- ORDBMS: Partial OOP support; some impedance mismatch; complex to manage object extensions.
- OODBMS: Complex design; poor performance for simple queries; limited adoption and scalability; non-standardized.
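As a small illustration of the ORDBMS side of this comparison, Oracle-style object-relational SQL lets tables hold typed objects with inheritance; a minimal sketch with hypothetical person_t/student_t types:

-- A supertype; NOT FINAL allows other types to inherit from it.
CREATE TYPE person_t AS OBJECT (
    name  VARCHAR2(50),
    email VARCHAR2(100)
) NOT FINAL;

-- A subtype inheriting person_t's attributes and adding its own.
CREATE TYPE student_t UNDER person_t (
    student_id NUMBER,
    gpa        NUMBER(3,2)
);

-- An object table whose rows are student_t objects, still queried with plain SQL.
CREATE TABLE students OF student_t;
INSERT INTO students VALUES ('Asha', 'asha@example.com', 101, 3.80);
SELECT s.name, s.gpa FROM students s WHERE s.student_id = 101;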
Key components of the ODMG 3.0 standard:
Object Model:
Defines how objects, classes, attributes, methods, and relationships are represented.
Supports OOP features: encapsulation (data and methods bundled), inheritance (class hierarchies),
polymorphism (methods behaving differently for different classes), and object identity (unique OIDs).
Example: A "Student" class with attributes (name, ID) and methods (enroll) can inherit from a "Person"
class.
Object Definition Language (ODL):
A language to define the schema (structure) of classes, including attributes, methods, and relationships.
Similar to SQL’s DDL (Data Definition Language) but for objects.
Object Query Language (OQL):
A query language for retrieving objects, similar to SQL but designed for object-oriented structures.
Supports path expressions (e.g., student.enrolled) to navigate relationships.
Example: SELECT s.name FROM Student s WHERE s.enrolled.title = "Math101"; retrieves names of
students enrolled in Math101.
Object Manipulation Language (OML):
Defines how to create, update, and delete objects, typically embedded in programming languages like
C++ or Java.
Example: Creating a new Student object in a Java program and saving it to the database.
Language Bindings:
Provides interfaces for programming languages (e.g., C++, Java, Smalltalk) to interact with the
OODBMS.
Ensures seamless integration between application code and the database.
Example: A Java program uses ODMG bindings to persist a "Student" object directly.
Object Identity: Each object has a unique OID, independent of its attributes.
Persistence: Objects are stored permanently and retrievable.
Transactions: Supports ACID properties (Atomicity, Consistency, Isolation, Durability).
Relationships: Supports associations (e.g., one-to-many, many-to-many) via object references.
Extensibility: Allows user-defined types and methods.
Several OODBMS systems implement object-oriented concepts, some adhering to ODMG standards and others
using proprietary approaches
ObjectDB:
A pure OODBMS designed for Java applications, supporting JPA (Java Persistence API) and JDO (Java
Data Objects).
Features: High performance, zero configuration, supports ODMG-like object persistence.
Use Case: Embedded databases for Java-based applications (e.g., web apps, mobile apps).
Example: A Java app stores a "Product" object with attributes (name, price) and methods (discount)
directly.
Applications of OODBMS
Multimedia Systems: Storing videos, images, or audio with methods (e.g., play, edit).
CAD/CAM: Managing complex designs (e.g., car parts with 3D models).
Healthcare: Storing patient records with relationships (e.g., treatments, doctors).
Gaming: Managing game objects (e.g., characters, items) with attributes and behaviors.
Scientific Applications: Handling complex data like geospatial or genetic data.
UNIT – IV: Web and DBMS
Internet: The Internet is a global network of interconnected computers and devices that communicate
using standardized protocols (e.g., TCP/IP) to share data and resources.
Web: The World Wide Web (WWW or Web) is a service on the Internet that allows users to access and
share information stored on servers through hypertext documents (web pages) using web browsers.
Web scripting languages are programming languages designed to create dynamic and interactive web
content. Unlike static web pages (e.g., plain HTML), these languages allow web pages to respond to
user actions, fetch data from servers, and interact with databases.
They are executed either on the client side (in the user’s browser) or the server side (on the web
server), enabling features like form validation, dynamic content updates, and data retrieval from a
DBMS.
1. Client-Side Scripting Languages:
o Run in the user’s web browser.
o Handle user interactions, update the webpage without reloading, and enhance user experience.
o Examples: JavaScript, TypeScript.
o Use Case: Validating a form (e.g., checking if an email field is valid) before sending data to the
server.
2. Server-Side Scripting Languages:
o Run on the web server to generate dynamic content, process user requests, and interact with
databases.
o Examples: PHP, Python (with frameworks like Django), Ruby, Node.js (JavaScript on the
server), Java (with frameworks like Spring).
o Use Case: Querying a DBMS to fetch product details for an e-commerce website
PHP:
Type: Server-side.
Features:
o Generates dynamic HTML by processing server-side logic.
o Integrates easily with DBMS like MySQL for data-driven websites.
o Widely used in content management systems (CMS) like WordPress.
Example: A PHP script queries a MySQL database to display a list of blog posts.
Use Case: E-commerce platforms (e.g., Magento), CMS, and web forms.
Python:
Type: Server-side.
Features:
o Clean syntax, easy to learn, and versatile for web and database tasks.
o Frameworks like Django provide built-in tools for database integration, authentication, and URL
routing.
o Connects to DBMS (e.g., PostgreSQL, MongoDB) via libraries like SQLAlchemy or PyMongo.
Example: A Django app retrieves product details from a PostgreSQL database and displays them on a
webpage.
Use Case: Web applications (e.g., Instagram uses Django), APIs, and data-driven sites.
Ruby:
Type: Server-side.
Features:
o Emphasizes simplicity and developer productivity.
o Ruby on Rails (RoR) framework simplifies database integration and rapid development.
o Supports relational and NoSQL databases.
Example: A Rails app fetches user profiles from a DBMS for a social networking site.
Use Case: Web startups, e-commerce (e.g., Shopify uses RoR).
Node.js:
Type: Server-side.
Features:
o Uses JavaScript for both client and server, enabling full-stack development.
o Non-blocking, event-driven architecture for high performance.
o Integrates with NoSQL (e.g., MongoDB) and relational DBMS.
Example: A Node.js app handles real-time chat by querying a MongoDB database.
Use Case: Real-time apps (e.g., chat, streaming), APIs.
TypeScript:
Type: Client-side (compiled to JavaScript); adds static typing to JavaScript, making large, data-driven front-end code easier to maintain.
Semi-structured data is a type of data that does not follow a rigid structure like relational databases (tables with
fixed rows and columns) but has some organization, making it more flexible than unstructured data (e.g., plain
text, images). It is partially structured with tags, labels, or hierarchies, allowing it to represent complex and
varied data while remaining adaptable.
This flexibility makes semi-structured data ideal for situations where the data schema is not fully defined in
advance or when integrating data from diverse sources with different formats. Common examples include:
JSON (JavaScript Object Notation): A lightweight format for storing and exchanging data.
XML (Extensible Markup Language): A markup language for representing semi-structured data (more
on this below).
NoSQL databases: Systems like MongoDB that store data in flexible, document-based formats.
Web data: Such as HTML pages, which have a loose structure.
Introduction to XML
XML, or Extensible Markup Language, is a popular markup language designed to store and transport data in a
semi-structured format. Unlike HTML, which focuses on displaying data, XML is built to describe and structure
data in a way that is both human-readable and machine-processable. It uses a tree-like structure, where data is
organized into elements enclosed by user-defined tags.
Self-descriptive: The tags in XML describe the data they contain, making the document easy to
understand.
Hierarchical structure: Data is nested within elements, allowing for complex relationships.
Platform-independent: XML works across different systems and programming languages, making it
perfect for data exchange.
Extensible: Users can create their own tags and structure to suit specific needs
XML-Related Technologies
Several technologies enhance XML’s functionality, making it a powerful tool for data management and
integration. Here are the key ones:
XML Schema:
XML Schema is a language used to define the structure, rules, and constraints for an XML document. Think of
it as a blueprint that ensures an XML document is organized correctly and contains the right kind of data. It
specifies what elements and attributes are allowed, their data types (like string or integer), and how they relate
to each other. This makes sure the XML document is valid and follows a specific format, which is critical for
sharing and processing data between systems.
Structure Definition: It outlines the hierarchy of elements and how they are nested, keeping the
document consistent.
Data Types: It defines what kind of data (e.g., numbers, text, dates) elements and attributes can contain.
Validation: It enforces rules, like requiring certain elements or limiting values, to ensure data quality.
Namespaces: It supports namespaces, which help avoid naming conflicts in complex documents by
distinguishing elements with the same name.
Flexibility: Compared to the older DTD (Document Type Definition), XML Schema is more powerful,
offering detailed constraints and better support for modern needs.
In short, XML Schema ensures XML documents are reliable and consistent, making them easier to use in
applications.
When sending data from a sender to a receiver, it is essential that both parts have the same "expectations" about
the content.
With XML Schemas, the sender can describe the data in a way that the receiver will understand.
A date like: "03-11-2004" will, in some countries, be interpreted as 3. November and in other countries as 11.
March.
However, an XML element with a data type like this:
<date type="date">2004-03-11</date>
ensures a mutual understanding of the content, because the XML data type "date" requires the format "YYYY-
MM-DD"
1. XQuery
Description: XQuery is a functional programming language developed as a W3C standard specifically for
querying and manipulating XML data. It is widely regarded as the most powerful and versatile XML
query language.
Key Features:
o Enables complex operations such as filtering, grouping, sorting, and joining of XML data.
o Uses FLWOR expressions (For, Let, Where, Order by, Return), which are analogous to SQL’s
SELECT statements.
o Allows construction of new XML documents or transformation of existing ones.
o Integrates well with XML databases and various XML processing tools.
Use Case: Best suited for applications requiring sophisticated queries or transformations on large XML
datasets, such as data integration, content management systems, or reporting.
2. XPath
Description: XPath, another W3C standard, is a language designed to navigate and select specific
elements and attributes within an XML document. It is often used as a building block within XQuery
and XSLT.
Key Features:
o Uses path expressions to locate nodes or sets of nodes in an XML structure.
o Includes functions for string manipulation, numeric operations, and boolean logic.
o Provides a lightweight way to extract specific data or identify parts of an XML document.
Use Case: Ideal for simpler tasks like extracting specific elements or attributes, such as in web scraping,
XML validation, or pinpointing data for further processing.
3. XSLT (Extensible Stylesheet Language Transformations)
Description: XSLT is an XML-based language focused on transforming XML documents into other
formats, such as HTML, plain text, or different XML structures. While not a query language in the
traditional sense, it can extract and manipulate XML data.
Key Features:
o Employs templates to define transformation rules for XML content.
o Supports conditional logic, looping, and sorting operations.
o Produces output in various formats, enhancing its flexibility for presentation or conversion tasks.
Use Case: Perfect for transforming XML into human-readable formats, such as generating web pages
from XML data or converting XML into CSV files for reporting purposes.
4. SQL/XML
Description: SQL/XML is an extension of the SQL standard that enables querying of XML data using
SQL-like syntax. It is supported by relational database management systems (RDBMS) like Oracle,
IBM DB2, and Microsoft SQL Server.
Key Features:
o Allows querying of XML data stored within relational databases using familiar SQL constructs.
o Offers functions to extract and manipulate XML data within SQL queries.
o Facilitates integration of XML and relational data in a unified query.
Use Case: Useful in enterprise environments where XML data is stored in a relational database and needs
to be queried alongside traditional relational data.
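A minimal sketch of the standard SQL/XML publishing functions, which build XML from ordinary relational rows (the emp and dept tables and their columns are hypothetical):
-- One <department> element per department, containing its employees
SELECT XMLElement("department",
         XMLAttributes(d.deptno AS "id"),
         XMLAgg(
           XMLElement("employee",
             XMLForest(e.ename AS "name", e.sal AS "salary")))) AS dept_xml
FROM   emp e
       JOIN dept d ON d.deptno = e.deptno
GROUP BY d.deptno;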
XML in Oracle:
XML in Oracle is a robust feature set that allows you to store, query, and manipulate XML data directly within
the database, primarily through Oracle XML DB. This functionality has been part of Oracle since version 9i,
making it a mature and well-supported component of the database.
Oracle XML DB stores XML documents in XMLType columns using one of several storage models:
CLOB (Character Large Object): XML is stored as a text string. This is simple but not optimized for
querying or indexing.
Object-Relational Storage: XML is shredded into relational tables based on an XML schema. This
allows for efficient querying and indexing but requires schema registration.
Binary XML: A compact, optimized format that balances storage efficiency and query performance. It’s
the recommended option for most use cases, as it supports fast parsing and querying without requiring a
schema.
These storage methods offer flexibility depending on whether you prioritize simplicity, performance, or schema
validation.
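A minimal sketch of creating an XMLType column with the recommended Binary XML storage (table and column names are hypothetical):
-- Binary XML storage supports fast parsing and querying without a registered schema
CREATE TABLE purchase_orders (
  po_id NUMBER PRIMARY KEY,
  doc   XMLTYPE
)
XMLTYPE COLUMN doc STORE AS BINARY XML;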
To query XML stored in the database, Oracle supports several languages and functions:
XQuery: A powerful language for querying XML data, which can be embedded directly in SQL
statements. It allows you to extract and manipulate XML elements efficiently.
XPath: Used for navigating and selecting specific parts of an XML document. It’s often integrated with
XQuery or SQL functions to target specific nodes.
SQL/XML: Oracle supports this standard, extending SQL with XML-related functions to query XML
data as if it were relational.
For example, the XMLTABLE function can project XML data into rows and columns, making it easy to
integrate with SQL queries.
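A minimal sketch of XMLTABLE projecting repeating XML elements into rows and columns (hypothetical table catalog with XMLType column doc):
-- Each <book> element becomes one relational row
SELECT bk.title, bk.price
FROM   catalog c,
       XMLTABLE('/catalog/book' PASSING c.doc
         COLUMNS
           title VARCHAR2(100) PATH 'title',
           price NUMBER        PATH 'price') bk;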
These functions are particularly useful for transforming database query results into XML documents for export
or application use.
5. Performance Considerations
Oracle optimizes XML processing with features like:
XML Indexes: These can be created on XML Type columns to accelerate queries, especially for large
datasets.
Binary XML Storage: Designed for high-performance querying and efficient storage, this is ideal for
applications with significant XML workloads.
These features ensure scalability and speed for enterprise-level XML processing.
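A minimal sketch of creating an XMLIndex on an XMLType column (index and table names are hypothetical):
-- Speeds up XPath/XQuery access over Binary XML storage
CREATE INDEX po_xml_idx
  ON purchase_orders (doc)
  INDEXTYPE IS XDB.XMLINDEX;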
Oracle also integrates XML with other database components:
Advanced Queuing (AQ): XML can be used for messaging between applications.
Oracle Data Integrator (ODI): XML supports ETL processes for transforming and loading data from
various sources.
This makes Oracle XML DB a versatile tool for data exchange and application integration.
UNIT – V: Mobile Database and Security
Mobile DBMS:
Mobile Databases refer to databases that reside on or are accessed by mobile devices, such as smartphones,
tablets, or IoT devices. These databases store and manage data locally on the device or synchronize with a
central server to support applications in a mobile computing environment. Unlike traditional databases, mobile
databases operate in resource-constrained environments and must handle unique challenges like intermittent
connectivity, limited battery life, and mobility.
A Mobile DBMS (Database Management System) is software designed to manage these mobile databases. It
provides functionality for data storage, retrieval, updates, and synchronization while addressing the constraints
of mobile devices. Examples of Mobile DBMS include SQLite (widely used in Android and iOS apps), Realm,
and Couchbase Lite.
Key features of a Mobile DBMS include:
1. Lightweight Design: Optimized for devices with limited CPU, memory, and storage.
2. Offline Support: Enables data access and updates without an active network connection.
3. Synchronization: Facilitates data exchange between the mobile device and a central server (see the sketch after this list).
4. Embedded Architecture: Typically embedded within the application, requiring minimal user
intervention.
5. Portability: Supports cross-platform operation (e.g., iOS, Android).
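A minimal SQLite sketch of offline support and synchronization (the table, columns, and the flag-based sync scheme are hypothetical illustrations, not a prescribed design):
-- Local table with sync metadata so offline changes can be pushed later
CREATE TABLE notes (
  id           INTEGER PRIMARY KEY,
  body         TEXT NOT NULL,
  updated_at   TEXT NOT NULL,              -- ISO-8601 timestamp of last local change
  sync_pending INTEGER NOT NULL DEFAULT 1  -- 1 = not yet sent to the central server
);

-- Offline update: change the row locally and mark it for synchronization
UPDATE notes
SET    body = 'revised text',
       updated_at = strftime('%Y-%m-%dT%H:%M:%SZ', 'now'),
       sync_pending = 1
WHERE  id = 7;

-- When connectivity returns, the app reads pending rows and uploads them
SELECT id, body, updated_at FROM notes WHERE sync_pending = 1;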
Mobile DBMS face several challenges due to the unique characteristics of mobile environments. Below are the
key issues, explained in detail:
1. Resource Constraints:
Limited Processing Power: Mobile devices have less powerful CPUs compared to desktop or
server systems, which limits the performance of complex queries or large-scale data processing.
Memory Limitations: Mobile devices typically have limited RAM, restricting the size of
datasets that can be processed at once.
Storage Constraints: Storage capacity on mobile devices is often limited, requiring efficient
data compression and storage techniques.
Impact: Mobile DBMS must be lightweight and optimized to minimize resource usage while
maintaining functionality.
2. Battery Life:
Power Consumption: Database operations like querying, indexing, and synchronization can
drain a device’s battery quickly.
Challenge: Mobile DBMS must implement energy-efficient algorithms, such as batching
operations or minimizing disk I/O, to conserve power.
Impact: Excessive power usage can degrade user experience and limit the practicality of
data-intensive mobile apps.
3. Intermittent Connectivity:
Network Instability: Mobile devices often operate in environments with unreliable or
intermittent network connections (e.g., moving between Wi-Fi and cellular networks or entering
areas with no coverage).
Challenge: Mobile DBMS must support offline operations and efficient synchronization when
connectivity is restored. This requires robust mechanisms for conflict resolution and data
consistency.
Impact: Poor handling of connectivity issues can lead to data inconsistencies or delays in
synchronization.
4. Data Synchronization and Consistency:
Synchronization Complexity: Mobile databases often need to synchronize local data with a
central server, which can lead to conflicts if data is modified on both ends.
Challenge: Ensuring data consistency across distributed systems (device and server) requires
sophisticated algorithms, such as version control or timestamp-based reconciliation.
Impact: Without proper synchronization, applications may face data loss, duplication, or
inconsistencies, affecting reliability.
5. Security and Privacy:
Vulnerability to Theft or Loss: Mobile devices are prone to being lost or stolen, exposing
sensitive data stored in the database.
Challenge: Mobile DBMS must implement strong encryption, secure authentication, and access
control to protect data. However, encryption adds computational overhead, conflicting with
resource constraints.
Impact: Weak security measures can lead to data breaches, compromising user privacy and trust.
6. Scalability Limitations:
Small-Scale Data Handling: Mobile DBMS are designed for small to medium-sized datasets,
unlike enterprise DBMS, which handle large-scale data.
Challenge: Managing growing datasets or supporting complex queries on resource-constrained
devices is difficult.
Impact: Applications requiring large-scale data processing may need to offload tasks to a server,
increasing dependency on connectivity.
7. Heterogeneous Platforms:
Cross-Platform Compatibility: Mobile devices run on diverse operating systems (e.g., Android,
iOS) and hardware configurations, complicating DBMS design.
Challenge: Developing a Mobile DBMS that performs consistently across platforms requires
significant effort in portability and optimization.
Impact: Inconsistent performance or compatibility issues can limit the adoption of a Mobile
DBMS in multi-platform applications.
8. Data Integrity and Transaction Management:
Limited Transaction Support: Mobile DBMS often have simplified transaction models to
reduce resource usage, which can compromise ACID (Atomicity, Consistency, Isolation,
Durability) properties.
Challenge: Ensuring data integrity during operations like updates or synchronization in a
disconnected environment is complex.
Impact: Weak transaction support can lead to partial updates or corrupted data, especially in
critical applications like banking.
9. User Interface and Usability:
Minimal Administrative Interface: Unlike traditional DBMS, Mobile DBMS are embedded
and lack robust administrative tools, making it harder to monitor or optimize performance.
Challenge: Developers must handle database management programmatically, increasing
development complexity.
Impact: This can lead to longer development cycles and potential errors in database
management.
Database security is a critical aspect of database administration that focuses on protecting a database from
unauthorized access, misuse, or destruction. Given that databases often store sensitive information—such as
personal data, financial records, or intellectual property—ensuring their security is paramount.
Database security:
Database security refers to the collective measures and practices used to safeguard a database from threats that
could compromise its confidentiality (keeping data private), integrity (ensuring data accuracy and
trustworthiness), or availability (ensuring data is accessible to authorized users). It involves protecting not only
the data itself but also the database management system (DBMS), the applications that access the database, and
the physical or virtual environment in which the database resides.
Database security is built on several essential elements, each addressing specific vulnerabilities and protection
needs:
1. Authentication
What it is: The process of verifying the identity of users or systems attempting to access the database.
How it works: Typically involves credentials like usernames and passwords, but stronger methods like
multi-factor authentication (MFA), biometrics, or digital certificates can be used.
Example: A user enters a username and password to log into the database system.
2. Authorization
What it is: The process of determining what actions or resources an authenticated user is allowed to
access.
How it works: Managed through roles and privileges, ensuring users can only perform tasks (e.g., read,
write, delete) appropriate to their role.
Example: A database administrator (DBA) can modify schemas, while a regular user can only view
specific tables.
3. Encryption
What it is: The process of converting data into a coded format that can only be deciphered with a
decryption key.
How it works: Protects data at rest (stored in the database) and in transit (sent over networks), making it
unreadable if intercepted or stolen.
Example: Credit card numbers are encrypted before storage in the database.
4. Auditing
What it is: The practice of monitoring and recording database activities to detect and investigate potential
security breaches.
How it works: Logs user actions like login attempts or data modifications to identify suspicious behavior
or policy violations.
Example: An audit log tracks every access to a sensitive table.
5. Backup and Recovery
What it is: The process of creating copies of the database and ensuring data can be restored after loss,
corruption, or disaster.
How it works: Regular backups protect against hardware failures or cyberattacks, while recovery plans
minimize downtime.
Example: Daily backups are stored offsite and tested regularly.
Understanding potential threats is key to designing effective security measures. Some of the most common
include:
SQL Injection Attacks
Description: Attackers inject malicious SQL code into input fields to manipulate the database, potentially
gaining unauthorized access or extracting data.
Impact: Can lead to data breaches or loss.
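A minimal PL/SQL sketch of the difference between an injectable query and a safe one (the accounts table and the input value are hypothetical):
DECLARE
  v_owner VARCHAR2(100) := 'x'' OR ''1''=''1';  -- hostile input from a form field
  v_cnt   NUMBER;
BEGIN
  -- UNSAFE: the input is concatenated into the statement text and changes its logic
  EXECUTE IMMEDIATE
    'SELECT COUNT(*) FROM accounts WHERE owner = ''' || v_owner || ''''
    INTO v_cnt;

  -- SAFER: the input is passed as a bind variable and is never parsed as SQL
  EXECUTE IMMEDIATE
    'SELECT COUNT(*) FROM accounts WHERE owner = :owner'
    INTO v_cnt USING v_owner;
END;
/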
Insider Threats
Description: Authorized users (e.g., employees) misuse their privileges to steal, alter, or destroy data.
Impact: Hard to detect due to legitimate access, often causing significant damage.
Malware
Description: Malicious software (e.g., ransomware) infects the database system, corrupting data or
locking access.
Impact: Results in data loss, downtime, or financial losses.
Physical Security Breaches
Description: An attacker gains physical access to the servers or storage media holding the database and can copy, alter, or destroy the data directly.
Impact: Bypasses many logical controls, making encryption and physical access restrictions essential.
Other threats include weak passwords, unpatched software vulnerabilities, and denial-of-service (DoS) attacks
that disrupt availability.
Microsoft Access is a relational database management system (DBMS) included in the Microsoft Office suite,
designed primarily for small to medium-sized databases. It is commonly used for desktop applications and small
team projects due to its user-friendly interface and integration with other Office tools. Since databases often
store sensitive information, securing an Access database is essential to protect against unauthorized access, data
theft, and corruption. Below, we explore the key security features, limitations, and best practices for securing
Microsoft Access.
Microsoft Access offers several built-in security mechanisms to safeguard your data. Here’s an overview of the
main features:
1. Database Password
o You can set a password to lock the entire database, restricting access to only those who know the
password.
o How it works: When a password is set, users must enter it to open the database file.
o Limitation: This provides basic protection but lacks granularity—once a user has the password,
they gain full access to all data and functionality.
2. User-Level Security (Legacy)
o In older versions of Access (e.g., pre-2007, using the .mdb file format), user-level security
allowed administrators to define users and groups with specific permissions (e.g., read-only, edit,
or administrative access).
o Current status: This feature is not supported in the newer .accdb file format introduced with
Access 2007. For advanced permission management, Microsoft recommends integrating Access
with SQL Server or SharePoint.
3. Encryption
o Access databases in the .accdb format support encryption, which protects data at rest by making
it unreadable without the correct password.
o Benefit: Encryption ensures that even if the database file is copied or stolen, the data remains
secure.
o This feature was enhanced in Access 2007 and later versions, offering stronger protection
compared to earlier formats.
4. Splitting the Database
o Access allows you to split a database into two parts:
Back-end: Contains the tables with the actual data.
Front-end: Contains forms, queries, and reports for user interaction.
o Security advantage: The back-end can be stored in a secure network location with restricted
access, while the front-end is distributed to users, limiting direct access to the raw data.
5. Macro and VBA Security
o Access provides settings to control the execution of macros and Visual Basic for Applications
(VBA) code:
Macros can be disabled or restricted to those digitally signed by a trusted source to
prevent malicious code from running.
VBA projects can be password-protected to block unauthorized changes to the code.
o Purpose: This protects against threats introduced through automation or scripting.
6. Trusted Locations
o You can designate specific folders or network locations as "trusted," allowing databases stored
there to run macros and code without security prompts.
o Use case: This balances usability and security by enabling trusted databases while restricting
unverified ones.
7. Network and File System Security
o When an Access database is stored on a network, you can leverage file system permissions (e.g.,
Windows NTFS permissions) to control who can access the database file.
o Benefit: This adds an additional layer of protection, especially in multi-user environments.
While Access provides useful security tools, it has limitations that make it less suitable for high-security or
large-scale scenarios:
No Granular Permissions in Newer Versions: Without user-level security in the .accdb format, it’s
difficult to assign different access levels to users within the same database.
Scalability Constraints: Access is not designed for enterprise-level use, with limits on file size (2 GB)
and concurrent users (typically under 255). These constraints can indirectly impact security by
increasing the risk of data corruption or performance issues.
Physical Security Risks: As a file-based DBMS, Access databases can be easily copied. Without
encryption, someone with physical access could potentially extract data using external tools.
Web Integration Risks: When used with SharePoint for web-enabled databases, additional
vulnerabilities (e.g., SQL injection) may arise, requiring careful web security management.
Oracle DBMS offers robust security features, including authentication, authorization, encryption, and
auditing, to protect sensitive data.
Advanced tools like Oracle Database Vault and Data Safe enhance security by restricting access and
monitoring activities, especially against insider threats.
Regular patching, secure configuration, and user education remain crucial for maintaining
security, given the evolving threat landscape.
1. Authentication
o Oracle supports multiple authentication methods to verify user identities:
Local Authentication: Users are authenticated using credentials stored within the
database, managed through the CREATE USER and ALTER USER commands.
External Authentication: Integrates with external systems like operating systems,
LDAP, or Kerberos, allowing for centralized identity management.
Single Sign-On (SSO): Enables users to access multiple applications with a single set of
credentials, enhancing user convenience while maintaining security.
o Password Management: Oracle allows administrators to enforce strong password policies,
including complexity requirements, expiration periods, and account lockout after failed login
attempts, using password profiles (for example, CREATE PROFILE with FAILED_LOGIN_ATTEMPTS and a PASSWORD_VERIFY_FUNCTION).
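A minimal sketch of a password profile and a locally authenticated user (all names and limits are hypothetical):
-- Lock the account after 5 failed logins; passwords expire every 90 days
CREATE PROFILE app_user_profile LIMIT
  FAILED_LOGIN_ATTEMPTS 5
  PASSWORD_LIFE_TIME    90
  PASSWORD_LOCK_TIME    1;

CREATE USER report_user IDENTIFIED BY "Str0ng#Passw0rd"
  PROFILE app_user_profile;

ALTER USER report_user PASSWORD EXPIRE;  -- force a password change at first login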
2. Authorization
o Oracle uses a role-based access control (RBAC) model to manage user privileges:
Privileges: Specific permissions (e.g., SELECT, INSERT, UPDATE, DELETE) can be
granted to users or roles, controlled through SQL commands like GRANT and REVOKE.
Roles: Collections of privileges that can be assigned to users, simplifying privilege
management. For example, a DBA role might include all administrative privileges, while
a READ_ONLY role restricts users to query operations.
o Administrators can define granular access controls, ensuring users only have the permissions
necessary for their roles, reducing the risk of privilege abuse.
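A minimal sketch of role-based authorization with GRANT and REVOKE (schema, table, role, and user names are hypothetical):
-- Collect read privileges in a role and grant the role to a user
CREATE ROLE read_only_role;
GRANT CREATE SESSION TO report_user;
GRANT SELECT ON hr.employees   TO read_only_role;
GRANT SELECT ON hr.departments TO read_only_role;
GRANT read_only_role TO report_user;

-- Privileges can be withdrawn just as easily
REVOKE read_only_role FROM report_user;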
3. Encryption
o Oracle provides robust encryption mechanisms to protect data at rest and in transit:
Transparent Data Encryption (TDE): Automatically encrypts sensitive data in
tablespaces, columns, or tables without requiring application changes. TDE uses a master
key kept in a keystore (Oracle wallet), managed with the ADMINISTER KEY MANAGEMENT command.
Column-Level Encryption: Encrypts specific columns containing sensitive data, such as
credit card numbers or Social Security numbers, using the DBMS_CRYPTO PL/SQL
package.
Network Encryption: Uses Oracle Advanced Security to encrypt data transmitted over
the network, ensuring secure communication between clients and the database server.
This can be configured using Oracle Net Services, supporting protocols like SSL/TLS.
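A minimal sketch of TDE column encryption (table and column names are hypothetical; an open keystore/wallet is assumed):
-- The card number is encrypted on disk but transparent to queries
CREATE TABLE payments (
  payment_id NUMBER PRIMARY KEY,
  card_no    VARCHAR2(19) ENCRYPT USING 'AES256',
  amount     NUMBER(10,2)
);

-- An existing column can be encrypted the same way
ALTER TABLE payments MODIFY (card_no ENCRYPT USING 'AES256');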
4. Auditing
o Oracle's auditing capabilities allow administrators to monitor and record database activities,
crucial for detecting unauthorized access and ensuring compliance:
Fine-Grained Auditing (FGA): Audits specific actions on specific data, such as access
to sensitive columns, using the DBMS_FGA package. For example, auditing all SELECT
statements on a salary table.
Unified Auditing: A centralized auditing mechanism introduced in Oracle 12c,
simplifying audit management by consolidating standard and fine-grained audits into a
single framework, accessible through the UNIFIED_AUDIT_TRAIL view.
o Audit trails can be used to investigate security incidents, generate compliance reports, and meet
regulatory requirements, with options to purge old audit data for performance optimization.
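A minimal sketch of a fine-grained auditing policy with DBMS_FGA (the schema, table, and policy name are hypothetical):
-- Record every SELECT that touches the SALARY column of HR.EMPLOYEES
BEGIN
  DBMS_FGA.ADD_POLICY(
    object_schema   => 'HR',
    object_name     => 'EMPLOYEES',
    policy_name     => 'AUDIT_SALARY_READS',
    audit_column    => 'SALARY',
    statement_types => 'SELECT');
END;
/

-- Captured events can then be reviewed in the audit trail view
SELECT db_user, timestamp, sql_text
FROM   dba_fga_audit_trail
WHERE  policy_name = 'AUDIT_SALARY_READS';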
5. Network Security
o Securing network communications is critical to prevent data interception during transmission.
Oracle Net Services provides features to enhance network security:
o Encryption and Integrity: Ensures data transmitted over the network is encrypted and protected
from tampering, configurable through the sqlnet.ora file with options like
SQLNET.ENCRYPTION_CLIENT and SQLNET.ENCRYPTION_SERVER.
o Strong Authentication: Supports advanced authentication methods like Kerberos and SSL/TLS
for secure connections, reducing the risk of man-in-the-middle attacks.
o Firewall Integration: Oracle databases can be configured to work with external firewalls to
restrict access based on IP addresses or other criteria, enhancing network segmentation.
A Database Management System (DBMS) is software that helps create, manage, and interact with databases. It
ensures data is stored safely, retrieved quickly, and kept accurate, which is vital for applications like online
shopping or banking.
Web Security protects websites and web apps from cyber threats, ensuring user data stays private and systems
work smoothly. It’s like a shield against hackers trying to steal information or disrupt services.
In web apps, DBMS stores the data users see, like account details. Securing the DBMS is key to web security,
involving protecting against SQL injection, ensuring strong logins, and encrypting data to keep it safe from
hackers.