ADBMS PYQ
Question 1 (2022)
a) Explain different types of anomalies.
[Frequently Asked]
Anomalies are problems that can occur in poorly designed databases where data is stored redundantly . Redundancy means storing the same piece of information in
multiple places, which can lead to inconsistencies when data is added, updated, or deleted .
1. Update Anomaly
What it is: This occurs when an update to a single piece of data is not applied to all of its repeated instances. If a fact is stored in multiple rows and
you update only some of them, the database becomes inconsistent .
Example: Imagine an employee's name ( Ename ) is stored in three different tables: Employee , Emp_Salary , and Emp_Designation . If the
employee's name changes, you must update it in all three tables. If you miss one, the data will be inconsistent .
2. Insertion Anomaly
What it is: This anomaly happens when you cannot add certain information to the database unless some other, unrelated information is also present .
Example: You cannot add a new employee to the Emp_Salary table until that employee has been added to the main Employee table first. This
prevents you from storing salary information for a newly hired employee who is not yet in the main system .
3. Deletion Anomaly
What it is: This occurs when deleting one piece of data unintentionally causes the loss of other, different data. Deleting data from one table may
require deleting related data from other tables to maintain consistency .
Example: If you delete an employee record from the main Employee table, you must also delete their corresponding records from the Emp_Salary
and Emp_Designation tables. If you don't, those tables will contain information about a non-existent employee, leading to inconsistency .
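The following is a minimal sketch, using Python's built-in sqlite3 module and a denormalized Emp_Project table (illustrative, modeled on the employee-project example used later in these notes), of how redundancy produces an update anomaly:
```python
import sqlite3

# Hypothetical denormalized table: employee data repeated once per project assignment.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Emp_Project (Emp_Id INT, Ename TEXT, Address TEXT, Project_Id TEXT)")
con.executemany(
    "INSERT INTO Emp_Project VALUES (?, ?, ?, ?)",
    [(1, "Sachin", "Malad", "P1"),
     (1, "Sachin", "Malad", "P2")],   # the same employee is stored twice
)

# Update anomaly: the address is changed in only one of the two rows...
con.execute("UPDATE Emp_Project SET Address = 'Andheri' WHERE Emp_Id = 1 AND Project_Id = 'P1'")

# ...so the database now holds two conflicting addresses for the same employee.
print(con.execute("SELECT DISTINCT Address FROM Emp_Project WHERE Emp_Id = 1").fetchall())
```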
b) Explain relational model constraints.
[Important Definition]
Relational model constraints, also known as integrity constraints, are rules that the database management system (DBMS) enforces to ensure the correctness,
consistency, and completeness of data .
1. Domain Integrity
This ensures that all values in a column are of the correct type and are within a valid range .
NOT NULL: Ensures that a column cannot have a NULL (unknown or missing) value. For example, every student must have a name .
CHECK: Restricts the values that can be entered into a column by a specific condition. For example, CHECK(Age > 18) ensures that the value for age must be
greater than 18 .
DEFAULT: Assigns a default value to a column if no value is specified during an insert operation. For example, DEFAULT 'NA' could be used for a customer's
name if it's not provided .
2. Entity Integrity
This ensures that each row (or entity) in a table is uniquely identifiable .
UNIQUE: Guarantees that all values in a column are unique. No two rows can have the same value in that column. However, it allows for one NULL value .
PRIMARY KEY: A primary key is a column (or set of columns) that uniquely identifies every row in a table. It is similar to a UNIQUE constraint but does not
allow NULL values .
3. Referential Integrity
This preserves the defined relationships between tables when records are entered or deleted .
FOREIGN KEY: This is the main mechanism for enforcing referential integrity. A foreign key is a column in one table that refers to the primary key of another
table . This creates a link between the two tables.
Example: An Emp table might have a Did (Department ID) column that is a foreign key referencing the Did primary key in the Department table. This
ensures that you cannot assign an employee to a department that does not exist .
Dealing with Foreign Key Violations: When a record in a parent table (the one with the primary key) is deleted or updated, the DBMS can take several actions on
the dependent records in the child table:
RESTRICT / NO ACTION: Prevents the deletion or update on the parent table if there are matching records in the child table. This is the default behavior.
CASCADE: If a record in the parent table is deleted, all corresponding records in the child table are also automatically deleted .
SET NULL: If a record in the parent table is deleted, the foreign key values in the corresponding child table records are set to NULL .
SET DEFAULT: The foreign key values in the child table are set to their specified default value .
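A hedged sketch of these constraints in SQL, run here through Python's sqlite3 module (table and column names are illustrative; note that SQLite only enforces foreign keys when PRAGMA foreign_keys is switched on):
```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")      # SQLite enforces FOREIGN KEY only when enabled

con.execute("""
CREATE TABLE Department (
    Did   INTEGER PRIMARY KEY,               -- entity integrity: unique, non-NULL identifier
    Dname TEXT NOT NULL                      -- domain integrity: value required
)""")

con.execute("""
CREATE TABLE Emp (
    Eid    INTEGER PRIMARY KEY,
    Ename  TEXT NOT NULL,
    Age    INTEGER CHECK (Age > 18),         -- domain integrity: valid range
    Status TEXT DEFAULT 'Active',            -- domain integrity: default value
    Email  TEXT UNIQUE,                      -- entity integrity: no duplicates
    Did    INTEGER REFERENCES Department(Did)
              ON DELETE SET NULL             -- referential action when the parent row is deleted
              ON UPDATE CASCADE
)""")

con.execute("INSERT INTO Department (Did, Dname) VALUES (10, 'Sales')")
con.execute("INSERT INTO Emp (Eid, Ename, Age, Did) VALUES (1, 'Asha', 25, 10)")

# Referential integrity: inserting an employee for a non-existent department is rejected.
try:
    con.execute("INSERT INTO Emp (Eid, Ename, Age, Did) VALUES (2, 'Ravi', 30, 99)")
except sqlite3.IntegrityError as e:
    print("Rejected:", e)
```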
c) Explain the concept of polymorphism in detail.
[Important Definition]
Polymorphism is an object-oriented concept that gives an object the ability to take on multiple forms or patterns . It allows the same programming code, operator, or
symbol to have different implementations and work with different data types .
Core Idea: The core idea is "one interface, multiple functions." An action is defined in a general way, and the specific implementation of that action changes
based on the object that performs it.
Example: The textbook uses the example of vehicles. A car, a truck, and a bike all have a "brake" function. The action vehicle.brake() is the same interface for all of them. However, the internal mechanism for braking is different for each vehicle (e.g., disc brakes vs. drum brakes). The brake operation is therefore polymorphic: its behavior changes depending on which type of vehicle performs it.
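A short Python sketch of this idea (class names are illustrative): one brake() interface, with a different implementation for each vehicle type:
```python
class Vehicle:
    def brake(self):
        raise NotImplementedError("each vehicle supplies its own braking mechanism")

class Car(Vehicle):
    def brake(self):
        return "Car: disc brakes engage on all four wheels"

class Bike(Vehicle):
    def brake(self):
        return "Bike: drum brake engages on the rear wheel"

# One interface, multiple forms: the call is identical for every vehicle,
# but the behaviour depends on the object's actual type.
for vehicle in (Car(), Bike()):
    print(vehicle.brake())
```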
Question 2
a) Explain basic properties of OOP.
Object-Oriented Programming (OOP) concepts are used in Object-Oriented Database Management Systems (OODBMS) to handle complex data. The basic properties
are:
1. Identity (Object Identity)
Every object is given a unique, system-generated Object Identifier (OID) that is permanent and independent of the object's attribute values (its state).
This is different from a relational primary key, which is based on data values.
2. Encapsulation
This is the bundling of data (attributes) and the operations (methods) that work on that data into a single unit, called an object . It is related to data
hiding , which protects an object's internal state from outside interference .
3. Inheritance
This allows a new type (a subtype or child class) to be created from an existing type (a supertype or parent class). The new type inherits all the
attributes and operations of the existing type and can add its own unique characteristics .
4. Complexity
Objects in an OODBMS can have complex internal structures. An object's state can contain other objects, allowing for nested and multi-level data
models .
5. Persistency
This is the ability of an object to exist in memory even after the program that created it has finished executing. Persistent objects are stored
permanently in the database and can be retrieved later .
6. Relationship
Relationships between two objects are maintained using their unique OIDs. One object can store the OID of another object to create a reference or link
between them .
[Important Definition]
A database schema is the logical structure, design, or blueprint of a database . It defines the database entities (like tables), their attributes (columns), the relationships
between them, and all the constraints that apply to the data . The schema is defined during database design and is not expected to change frequently .
Schema: The definition of the database structure (the empty tables and their columns). It's the constant framework.
Instance: The actual data stored in the database at a specific point in time. An instance changes whenever data is added, updated, or deleted .
Physical (Internal) Level: This is the lowest level and describes how the data is physically stored on storage devices like disks. It deals with file organization, data structures, and access paths (like indexes).
Logical (Conceptual) Level: This level describes the logical structure of the entire database for all users. It shows the tables, columns, data types, relationships, and constraints without getting into the physical storage details. This is the level most database administrators and developers work with.
Data abstraction and encapsulation are related but distinct OOP concepts. The textbook states that the idea of encapsulation comes from abstract data types
(ADTs), which is the principle behind data abstraction .
Data Abstraction is a design concept focused on hiding complex implementation details while showing only the essential features of an object. It answers the
question, "What does the object do?". For example, you know a car can accelerate, but you don't need to know the details of the engine's combustion cycle to
do it.
Encapsulation is the implementation mechanism for abstraction and data hiding. It is the process of bundling an object's data (attributes) and the methods
that operate on that data into a single unit . It answers the question, "How is it implemented?". Encapsulation controls access to the data, often by making it
private and only allowing it to be modified through public methods.
Difference between Data Abstraction and Encapsulation:
Main Focus: Data Abstraction focuses on hiding implementation complexity; Encapsulation focuses on bundling data and methods together.
Goal: Data Abstraction aims to simplify the view of a system by exposing only essential functionalities; Encapsulation aims to achieve data hiding and protect the internal state of an object from direct outside access.
Implementation: Data Abstraction is achieved using abstract classes and interfaces (conceptual); Encapsulation is achieved by making data members private and providing public methods to access them (practical).
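A brief Python sketch of the distinction (the Account class is illustrative, not from the textbook): the abstract interface expresses what an object can do, while encapsulation keeps its state private behind methods:
```python
from abc import ABC, abstractmethod

class Account(ABC):                  # abstraction: only the essential operations are exposed
    @abstractmethod
    def deposit(self, amount): ...
    @abstractmethod
    def balance(self): ...

class SavingsAccount(Account):       # encapsulation: the state is hidden behind methods
    def __init__(self):
        self.__balance = 0.0         # name-mangled "private" attribute

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.__balance += amount

    def balance(self):
        return self.__balance

acc = SavingsAccount()
acc.deposit(100.0)
print(acc.balance())                 # 100.0 -- callers use the interface, never the hidden field
```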
Question 4
a) Explain architecture of distributed database.
A distributed database is a collection of logically related databases that are physically distributed across multiple sites connected by a computer network . The
architecture of a Distributed DBMS (DDBMS) defines how functionality is split across different processes.
1. Client-Server Systems
Architecture: This model consists of multiple client processes and one or more server processes . Clients are responsible for the user interface and
application logic, while servers are responsible for data management, query processing, and transaction execution . A client sends a query to a server,
which processes it and returns the result .
Limitation: A standard client-server architecture does not allow a single query to transparently access data from multiple servers at once .
2. Collaborating Server Systems
Architecture: This is an extension of the client-server model where there is no longer a strict distinction between clients and servers. It is a collection of database servers, each capable of managing its local data.
Functionality: When a server receives a query that requires data from other servers, it takes on the role of a coordinator. It decomposes the query into
sub-queries, sends them to the appropriate servers for execution, and then combines their results to answer the original query .
3. Middleware Systems
Architecture: This approach uses a special software layer called middleware to manage queries and transactions across multiple, independent, and
often heterogeneous database servers .
Functionality: The middleware layer handles the complex task of distributed query processing, but it generally does not store any data itself . The
individual database servers only need to be capable of handling local requests. This architecture is useful for integrating existing legacy database
systems .
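A toy Python sketch of the collaborating-server idea (server names and data are invented): the server that receives a query acts as coordinator, sends sub-queries to the other servers, and combines their partial results:
```python
# Each "server" manages only its local data (here, rows of a Sales relation).
SERVERS = {
    "pune_server":   [("Pune", 1200), ("Pune", 800)],
    "mumbai_server": [("Mumbai", 1500)],
}

def local_total(server_name):
    """Sub-query executed locally at one server."""
    return sum(amount for _, amount in SERVERS[server_name])

def coordinator_total_sales():
    """The receiving server coordinates: decompose, fan out sub-queries, combine results."""
    return sum(local_total(name) for name in SERVERS)

print(coordinator_total_sales())   # 3500
```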
Parallelism in databases is used to improve performance by executing operations simultaneously. This is achieved in two ways:
1. Inter-query Parallelism
Important Definition: This form of parallelism involves executing different, independent queries at the same time on different processors .
Goal: The main goal of inter-query parallelism is to increase the overall throughput of the system, which is the number of transactions or queries that can be
completed in a given period .
Example: If you have 10 separate queries that each take 2 seconds to run sequentially (totaling 20 seconds), a system with 10 processors can run all of them
in parallel, completing all 10 queries in just 2 seconds .
2. Intra-query Parallelism
Important Definition: This form of parallelism involves taking a single large or complex query, breaking it down into smaller sub-tasks , and executing those
tasks simultaneously on multiple processors .
Goal: The main goal of intra-query parallelism is to reduce the response time of a single, long-running query .
How it works: This can be achieved in two ways:
1. Executing different operations within the same query in parallel (e.g., performing two separate joins at the same time).
2. Parallelizing a single operation (e.g., using multiple processors to sort one large table) .
Unit of Parallelism: Inter-query parallelism works on multiple different queries; intra-query parallelism works on a single query.
Best For: Inter-query parallelism suits systems with many small, independent transactions; intra-query parallelism suits systems with large, complex queries (like in data warehousing).
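An illustrative Python sketch of the two forms using a thread pool (data and queries are invented; a real parallel DBMS would use separate processors, the threads here only show the structure):
```python
from concurrent.futures import ThreadPoolExecutor

sales = list(range(1, 1_000_001))            # one large relation (illustrative)

def query_a():                               # independent query 1
    return sum(sales)

def query_b():                               # independent query 2
    return max(sales)

with ThreadPoolExecutor() as pool:
    # Inter-query parallelism: two whole queries run concurrently (improves throughput).
    fa, fb = pool.submit(query_a), pool.submit(query_b)
    total, largest = fa.result(), fb.result()

    # Intra-query parallelism: one SUM query split into partial sums (improves response time).
    chunks = [sales[i:i + 250_000] for i in range(0, len(sales), 250_000)]
    total_parallel = sum(pool.map(sum, chunks))

print(total == total_parallel)               # True
```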
[Important Definition]
A database catalog is a system-owned repository that stores metadata—that is, data about the data. This includes table and column definitions, constraints, indexes,
and user permissions.
In a distributed database system, the catalog must also store information about how the data is distributed across the network. Distributed Catalog Management
refers to the strategies for storing, managing, and accessing this distributed metadata.
Data Fragmentation: The catalog must store the fragmentation schema, which describes how each global relation is broken down into fragments (e.g., the
conditions for horizontal fragments or the column lists for vertical fragments) .
Data Replication: It needs to maintain a replication schema, which details how many copies (replicas) of each fragment exist and where they are located .
Data Allocation: The catalog holds the allocation schema, which describes the specific site where each fragment or replica is physically stored .
This information is critical for the DDBMS, especially for the query optimizer, which needs to know the location and characteristics of data fragments to create an
efficient execution plan that minimizes costs like data transfer over the network .
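A small Python sketch of the kind of metadata such a catalog might hold (relation, fragment, and site names are hypothetical):
```python
# Hypothetical catalog entries for one global relation "Customer".
catalog = {
    "Customer": {
        "fragmentation": {                      # how the global relation is split
            "Customer_Pune":   "rows WHERE city = 'Pune'",
            "Customer_Mumbai": "rows WHERE city = 'Mumbai'",
        },
        "replication": {                        # which sites hold a copy of each fragment
            "Customer_Pune":   ["site_A", "site_C"],
            "Customer_Mumbai": ["site_B"],
        },
        "allocation": {                         # the primary site of each fragment
            "Customer_Pune":   "site_A",
            "Customer_Mumbai": "site_B",
        },
    }
}

def sites_holding(relation, fragment):
    """The query optimizer consults the catalog to learn where a fragment can be read."""
    return catalog[relation]["replication"][fragment]

print(sites_holding("Customer", "Customer_Pune"))   # ['site_A', 'site_C']
```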
[PYQ]
Question 1 (2023)
a) Explain referential integrity and domain integrity constraint with suitable example.
[Important Definition]
Integrity constraints are rules that ensure the correctness and consistency of data in a database. Domain and referential integrity are two fundamental types.
1. Domain Integrity
Definition: Domain integrity ensures that all values entered into a column are valid. It specifies the set of permissible values for an attribute . This is enforced
using three main constraints:
NOT NULL: This constraint ensures that a column cannot store a NULL value . For example, every student in a Student table must have a name.
CHECK: This constraint validates that a value meets a specific condition before being inserted . For example, in an Employee table, a constraint
CHECK(Age > 18) would ensure no one under 18 can be registered.
DEFAULT: This constraint provides a default value for a column when no value is specified by the user during an insert operation . For instance, a
column Status could have a default value of 'Active' .
2. Referential Integrity
Definition: Referential integrity is a rule that preserves the defined relationships between tables. It ensures that a foreign key value in one table always refers to
a valid, existing primary key in another table .
Example: Consider two tables, Employee and Department .
Department table has a primary key Did (Department ID).
Employee table has a foreign key Did that refers to the Department table's primary key.
This constraint ensures that you cannot add an employee to the Employee table with a Did that does not exist in the Department table. This
prevents having "orphan" records and maintains data consistency .
Object-Oriented Database Management Systems (OODBMS) are built on the principles of Object-Oriented Programming to handle complex data structures. The basic
properties are:
1. Identity (Object Identity): Every object has a unique, system-generated Object Identifier (OID) that is immutable and independent of the object's attribute
values. This allows two objects to have the same state but still be distinct entities .
2. Encapsulation: This is the bundling of an object's data (attributes) and the methods (operations) that act on that data into a single unit. It hides the internal
implementation details from the outside world .
3. Inheritance: This allows a new class (subtype) to be created from an existing class (supertype), inheriting all its attributes and methods. This promotes code
reuse and the creation of class hierarchies .
4. Complexity: Objects can represent complex real-world entities with nested structures. An object's state can be built from other objects, allowing for multi-level
complexity .
5. Persistency: An object is persistent if it continues to exist in the database even after the program that created it has terminated. This is a core feature that
separates a database object from a transient program variable .
The Relational Data Model, first proposed by E. F. Codd, is a way of logically structuring data using tables, also known as relations . It is the foundation of most
modern database systems.
Relation: A relation is a table with columns and rows . Each table in a database has a unique name.
Attribute: An attribute is a named column of a relation. It represents a property of the entity (e.g., Name , Age , Address for a Student entity) .
Tuple: A tuple is a row in a relation. It represents a single record or a single instance of an entity (e.g., all the information for one specific student) .
Domain: A domain is the set of all permissible values for a given attribute. For example, the domain for an Age attribute might be integers between 0 and 120 .
Question 2
a) Explain Database schema with suitable example.
[Important Definition]
A database schema is the logical blueprint or structure of a database . It is defined during the design phase and specifies the tables, the columns in each table, the
data types for each column, the relationships between tables, and the constraints on the data . The schema represents the logical view of the entire database and is
not expected to change frequently .
Schema: The design or structure of the database (e.g., the CREATE TABLE statements).
Instance: The actual data contained in the database at a specific moment in time. The instance changes as data is added, updated, or deleted .
Example of a Database Schema: An example is the HR Schema for a company database, which might include several related tables :
EMPLOYEE Table: With columns like employee_id (Primary Key), first_name , last_name , job_id (Foreign Key), department_id (Foreign Key).
DEPARTMENTS Table: With columns department_id (Primary Key), department_name , location_id (Foreign Key).
JOBS Table: With columns job_id (Primary Key), job_title , min_salary .
LOCATIONS Table: With columns location_id (Primary Key), street_address , city .
This structure, showing the tables, their columns, and the primary/foreign key relationships, represents the database schema.
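A minimal sqlite3 sketch of the schema-versus-instance distinction, using a single illustrative table rather than the full HR schema:
```python
import sqlite3

con = sqlite3.connect(":memory:")

# Schema: the fixed structure, defined once at design time.
con.execute("CREATE TABLE EMPLOYEE (employee_id INTEGER PRIMARY KEY, first_name TEXT NOT NULL)")

# Instance: the data at one point in time; it changes with every insert, update, or delete.
con.execute("INSERT INTO EMPLOYEE VALUES (1, 'Asha')")
print(con.execute("SELECT * FROM EMPLOYEE").fetchall())   # instance now: [(1, 'Asha')]

con.execute("INSERT INTO EMPLOYEE VALUES (2, 'Ravi')")
print(con.execute("SELECT * FROM EMPLOYEE").fetchall())   # a new instance: [(1, 'Asha'), (2, 'Ravi')]
```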
[Important Definition]
An OID (Object Identifier) is a unique identifier that is assigned by the Object-Oriented Database system to each object when it is created . Unlike a primary key in a
relational database, an OID is not based on the attribute values of the object. Its purpose is to provide a permanent and unique identity for an object throughout its
lifetime .
Uniqueness: An OID is unique system-wide. Once assigned to an object, it can never be assigned to another object .
System Generated: OIDs are generated automatically by the database system; they cannot be manually assigned or changed by the user .
Invariant: The OID of an object cannot be altered during its entire lifetime .
Immutable: Once an OID is used for an object, it cannot be regenerated or reused for another object, even after the original object is deleted .
Invisible: The OID is generally not visible to the end-user .
Independent: The value of an OID is independent of the state (attribute values) of the object. Two objects can have identical attribute values but will always
have different OIDs .
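A short Python analogy (not how an OODBMS actually generates OIDs, only an illustration of identity versus state): two objects with identical attribute values remain distinct identities:
```python
class Employee:
    def __init__(self, name, age):
        self.name, self.age = name, age

    def __eq__(self, other):
        # Equality compares state (attribute values), the value-based notion of "sameness".
        return (self.name, self.age) == (other.name, other.age)

e1 = Employee("Asha", 30)
e2 = Employee("Asha", 30)

print(e1 == e2)   # True  -- identical state
print(e1 is e2)   # False -- still two distinct identities (analogous to two different OIDs)
```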
c) Compare Relational and Object Oriented Databases.
Relational Database Management Systems (RDBMS) and Object-Oriented Database Management Systems (OODBMS) represent different approaches to storing and
managing data.
Data Model: In an RDBMS, data is stored in two-dimensional tables consisting of rows and columns; in an OODBMS, data is stored as objects, which encapsulate both data (attributes) and behavior (methods).
Data Structure: An RDBMS is best suited for simple, structured data that fits well into tables; an OODBMS is designed to handle complex, semi-structured, or unstructured data (e.g., multimedia, nested objects).
Relationships: In an RDBMS, relationships are represented using foreign keys to link rows in different tables; in an OODBMS, relationships are represented by direct object references (OIDs), where one object can contain a reference to another object.
Query Language: An RDBMS uses the standardized Structured Query Language (SQL); an OODBMS uses Object Query Language (OQL), which is designed to work with objects and their methods.
Question 3
a) Explain Shared Disk and Shared Memory Parallel Database Architecture.
These are two common architectures for building parallel database systems, where multiple processors work together to improve performance.
1. Shared Memory Architecture
Architecture: In this model, multiple CPUs are connected via a high-speed interconnection network to a single, global shared memory and a single set of shared disks. All processors can access all memory and all disks.
How it Works: Processors communicate with each other by reading and writing to the shared memory, which is very fast. Each processor typically has a local
cache to reduce contention for the shared memory bus .
Advantages:
Very efficient and fast communication between processors .
Simple to implement load balancing.
Disadvantages:
The shared memory bus or network can become a bottleneck as more processors are added .
It is not easily scalable beyond a certain number of processors (e.g., 32 or 64) .
2. Shared Disk Architecture
Architecture: In this model, each processor has its own private memory, but all processors can access a single, shared set of disks through an interconnection network. These systems are often called clusters.
How it Works: Since each processor has its own memory, there is no memory bus bottleneck. Processors coordinate access to the shared data on the disks.
Advantages:
Provides good fault tolerance. If one processor fails, the other processors can still access all the data on the shared disks and take over its tasks .
It is more scalable than a shared memory architecture.
Disadvantages:
The interconnection network to the disks can become a bottleneck as a large amount of data may need to be shipped between the processors'
memories .
[Important Definition]
Parallel Query Optimization is the process of finding the most efficient query execution plan in a parallel database environment . Since data and processing are
distributed across multiple nodes, the optimizer must consider not only traditional costs like disk I/O but also communication costs, data partitioning, and how to best
parallelize each step of the query.
Two-Phase Approach to Optimization: A common strategy for parallel query optimization involves two steps :
1. Join Ordering (Uni-processor Optimization): In the first phase, a traditional query optimizer is used to determine the best sequence of operations (like joins)
without considering parallelism. This produces a sequential query plan .
2. Parallel Resource Allocation: In the second phase, this sequential plan is adapted for a parallel environment. The optimizer decides how to partition the data,
which operations to run in parallel, and how to allocate processors and memory to each task in the plan .
Fragmentation and replication are two key strategies used in distributed database design to manage how data is stored across multiple sites.
1. Fragmentation
Definition: Fragmentation is the process of breaking a single logical database or table into smaller pieces called fragments. These fragments can then be
stored on different computers in the network . The key rule is that the original table must be reconstructible from its fragments .
Types of Fragmentation:
Horizontal Fragmentation: The table is split into groups of rows (tuples). Each fragment contains a subset of the rows based on a specific condition
(e.g., all customers from 'Pune' in one fragment, 'Mumbai' in another) . The original table is reconstructed using UNION .
Vertical Fragmentation: The table is split into groups of columns (attributes). Each fragment contains a subset of columns, but the primary key must be
included in all fragments to link them back together . The original table is reconstructed using JOIN .
2. Replication
Definition: Replication is the process of creating and maintaining multiple copies of data or data fragments on different sites in the network .
Goals of Replication:
Increased Availability: If one site containing the data fails, users can still access the data from another site that has a replica. This makes the system
more fault-tolerant .
Faster Query Evaluation: Queries can be performed on a local copy of the data instead of accessing a remote site, which reduces network delays and
improves response time .
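A minimal Python/sqlite3 sketch (table and fragment names illustrative) of horizontal and vertical fragments and their reconstruction with UNION and JOIN:
```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- Horizontal fragments: rows split by city, one fragment per site.
CREATE TABLE Customer_Pune   (Cid INTEGER PRIMARY KEY, Name TEXT, City TEXT);
CREATE TABLE Customer_Mumbai (Cid INTEGER PRIMARY KEY, Name TEXT, City TEXT);
INSERT INTO Customer_Pune   VALUES (1, 'Asha', 'Pune');
INSERT INTO Customer_Mumbai VALUES (2, 'Ravi', 'Mumbai');

-- Vertical fragments: columns split, with the primary key repeated in both fragments.
CREATE TABLE Emp_Personal (EmpID INTEGER PRIMARY KEY, Name TEXT, Address TEXT);
CREATE TABLE Emp_Payroll  (EmpID INTEGER PRIMARY KEY, Salary INTEGER);
INSERT INTO Emp_Personal VALUES (1, 'Asha', 'Pune');
INSERT INTO Emp_Payroll  VALUES (1, 50000);
""")

# Reconstruct the horizontal fragmentation with UNION.
print(con.execute(
    "SELECT * FROM Customer_Pune UNION SELECT * FROM Customer_Mumbai").fetchall())

# Reconstruct the vertical fragmentation with a JOIN on the shared primary key.
print(con.execute(
    "SELECT p.EmpID, p.Name, p.Address, s.Salary "
    "FROM Emp_Personal p JOIN Emp_Payroll s ON p.EmpID = s.EmpID").fetchall())
```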
Question 4
a) Explain parallel join operation.
A parallel join is a technique used in parallel databases to speed up the process of joining two large tables (relations). The fundamental strategy is to decompose the
large join into a collection of smaller, independent joins that can be executed simultaneously on different processors .
1. Partitioning: Both relations that need to be joined (say R1 and R2) are partitioned across the available processors using the exact same partitioning function
on their join attributes . This is the most crucial step.
2. Data Distribution: This partitioning ensures that all the tuples from R1 and R2 that could possibly match each other (i.e., have the same join attribute value) are
sent to the same processor .
3. Local Join: Each processor then performs a join operation locally on the subset of data it has received. Since all matching tuples are guaranteed to be on the
same processor, no further communication between processors is needed for the join itself.
4. Result Aggregation: The results from the local joins on all processors are then combined to produce the final result of the overall join.
Types of Parallel Joins: The partitioning method determines the type of parallel join:
Parallel Hash Join: If hash partitioning is used on the join attribute, it results in a parallel hash join algorithm .
Parallel Sort-Merge Join: If range partitioning is used on the join attribute, it results in a parallel sort-merge join algorithm, with the added benefit that the final
output is already sorted .
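A toy Python sketch of a parallel hash join (relation contents and the number of partitions are invented): both relations are hash-partitioned on the join attribute, each partition pair is joined independently, and the partial results are combined:
```python
from concurrent.futures import ThreadPoolExecutor

N_PARTITIONS = 4                       # one partition per (hypothetical) processor

emp  = [(1, "Asha", 10), (2, "Ravi", 20), (3, "Meera", 10)]   # (Eid, Ename, Did)
dept = [(10, "Sales"), (20, "HR")]                            # (Did, Dname)

def partition(rows, key_index):
    """Hash-partition a relation on its join attribute."""
    parts = [[] for _ in range(N_PARTITIONS)]
    for row in rows:
        parts[hash(row[key_index]) % N_PARTITIONS].append(row)
    return parts

def local_join(args):
    """Join one pair of partitions; no other partition can contain matching tuples."""
    emp_part, dept_part = args
    dept_by_id = {d[0]: d for d in dept_part}
    return [(e[0], e[1], dept_by_id[e[2]][1]) for e in emp_part if e[2] in dept_by_id]

emp_parts, dept_parts = partition(emp, 2), partition(dept, 0)

with ThreadPoolExecutor(max_workers=N_PARTITIONS) as pool:
    results = pool.map(local_join, zip(emp_parts, dept_parts))

# Every matching (Eid, Ename, Dname) row, assembled partition by partition.
print([row for part in results for row in part])
```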
The Two-Phase Commit (2PC) protocol is a distributed algorithm that ensures all parts of a transaction are either committed or aborted across all participating sites,
thus guaranteeing atomicity.
Roles:
Coordinator: The transaction manager at the site where the transaction originated.
Subordinates (or Cohorts): The transaction managers at the other sites involved in the transaction.
Phase 1: The Voting (Prepare) Phase
1. The Coordinator sends a PREPARE message to all Subordinates, asking them if they are ready to commit.
2. Upon receiving the PREPARE message, each Subordinate determines if it can successfully commit its part of the transaction.
If it can, it writes a prepare record to its log, becomes "prepared," and sends a YES vote to the Coordinator .
If it cannot (due to an error), it writes an abort record to its log and sends a NO vote to the Coordinator .
Phase 2: The Commit (Decision) Phase
3. The Coordinator collects the votes from all Subordinates.
4. The Coordinator makes a final decision:
If all Subordinates voted YES: The Coordinator decides to COMMIT. It writes a commit record to its own log and sends a COMMIT message to all Subordinates.
If even one Subordinate voted NO (or timed out): The Coordinator decides to ABORT. It writes an abort record to its own log and sends an ABORT message to all Subordinates.
5. Each Subordinate receives the final decision from the Coordinator. It writes a commit or abort record to its log as instructed, performs the required action, and releases any locks.
6. Each Subordinate sends an ACK (acknowledgment) message back to the Coordinator to confirm it has completed the action.
7. Once the Coordinator receives ACK from all Subordinates, it considers the transaction complete and writes an end record to its log.
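A simplified Python simulation of the 2PC decision logic (no real logging, locking, or network messages):
```python
class Subordinate:
    def __init__(self, name, can_commit):
        self.name, self.can_commit = name, can_commit
        self.log = []                                    # stands in for the site's local log

    def prepare(self):
        """Phase 1: force-write a prepare/abort record and vote YES or NO."""
        self.log.append("prepare" if self.can_commit else "abort")
        return "YES" if self.can_commit else "NO"

    def decide(self, decision):
        """Phase 2: record the coordinator's decision, act on it, and acknowledge."""
        self.log.append(decision)
        return "ACK"

def two_phase_commit(subordinates):
    votes = [s.prepare() for s in subordinates]          # Phase 1: collect votes
    decision = "commit" if all(v == "YES" for v in votes) else "abort"
    acks = [s.decide(decision) for s in subordinates]    # Phase 2: broadcast the decision
    assert all(a == "ACK" for a in acks)                 # coordinator then writes its end record
    return decision

print(two_phase_commit([Subordinate("site_A", True), Subordinate("site_B", True)]))   # commit
print(two_phase_commit([Subordinate("site_A", True), Subordinate("site_B", False)]))  # abort
```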
[Important Definition]
Distributed transaction processing is the mechanism that manages transactions in a distributed database system to ensure they maintain the database's consistency
and integrity, especially in the face of concurrent operations and system failures .
Key Concepts:
[PYQ]
Question 1 (2024)
a) Explain Entity Integrity Constraints & Referential Integrity Constraints with example.
[Important Definition]
Integrity constraints are rules used to maintain the accuracy and consistency of data in a relational database. Entity integrity and referential integrity are two of the
most critical types.
1. Entity Integrity
Definition: This constraint ensures that each row (or entity) in a table is uniquely identifiable. It is enforced primarily by the PRIMARY KEY . A primary key is a
column or a set of columns whose values uniquely identify a single row in the table .
Rules: The entity integrity constraint has two main rules:
1. A primary key's value must be unique for each row.
2. A primary key cannot contain NULL values .
Example: In a STUDENT table with columns (ID, Name, Email) , the ID column would be the primary key. Entity integrity ensures that every student has a
unique ID and that no student record can exist without an ID. An attempt to insert a new student with an existing ID or a NULL ID would be rejected by the
database.
2. Referential Integrity
Definition: This constraint is used to maintain consistency between related tables. It ensures that a relationship between two tables remains valid. It is
enforced using a FOREIGN KEY . A foreign key is a column in one table that points to the primary key of another table .
Rules: Referential integrity ensures that a value in the foreign key column must either match an existing value in the primary key of the referenced table or be
NULL .
Example: Consider an EMPLOYEE table with a Did column (Department ID) and a DEPARTMENT table where Did is the primary key. The Did in the
EMPLOYEE table is a foreign key referencing DEPARTMENT . Referential integrity ensures that you cannot assign an employee to a department that does not
exist (i.e., the Did value you insert into EMPLOYEE must already be present in the DEPARTMENT table) .
b) Write a short note on i) Class Inheritance & its types ii) Polymorphism
i) Class Inheritance & its types
Definition: Inheritance is a fundamental concept in object-oriented systems where a new class (the subtype) can be created based on an existing class (the
supertype ) . The subtype automatically inherits all the attributes and functions (methods) of its supertype, and can also have its own additional, specific
functions . This concept is similar to specialization in the ER model and promotes reusability .
Type Hierarchy: Inheritance leads to the creation of a type hierarchy (or class hierarchy), which shows the supertype/subtype relationships among all the
classes in the system .
Example: A PERSON class can be defined as a supertype with general attributes like Name , Address , and Birthdate . From this, two subtypes can be
created:
STUDENT subtype-of PERSON : Inherits all attributes from PERSON and adds its own specific attributes like Standard , Fees , Roll_No .
TEACHER subtype-of PERSON : Inherits all attributes from PERSON and adds Salary , Major , Grade .
Types of Inheritance: The provided textbook explains the general concept of creating subtypes from supertypes to form a hierarchy but does not detail
different types of inheritance (like single, multiple, or multi-level).
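A small Python sketch of the PERSON type hierarchy described above, using a subset of the listed attributes:
```python
class Person:                                   # supertype: general attributes only
    def __init__(self, name, address, birthdate):
        self.name, self.address, self.birthdate = name, address, birthdate

class Student(Person):                          # STUDENT subtype-of PERSON
    def __init__(self, name, address, birthdate, standard, roll_no):
        super().__init__(name, address, birthdate)   # inherit the supertype's attributes
        self.standard, self.roll_no = standard, roll_no

class Teacher(Person):                          # TEACHER subtype-of PERSON
    def __init__(self, name, address, birthdate, salary, major):
        super().__init__(name, address, birthdate)
        self.salary, self.major = salary, major

s = Student("Asha", "Pune", "2005-06-01", "XII", 17)
print(s.name, s.roll_no)    # inherited attribute + subtype-specific attribute
```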
ii) Polymorphism
Definition: Polymorphism is an object-oriented property that allows an object to take on multiple forms . It means that the same operation or function name
can have different implementations depending on the object it is called on .
Core Idea: The main idea is "one name, multiple forms." This allows for writing more generic and flexible code.
Example: A Vehicle class could have a method called move() . Subtypes like Car , Boat , and Airplane would all inherit this method but implement it
differently. Car.move() would implement driving, Boat.move() would implement sailing, and Airplane.move() would implement flying. The action is
polymorphic because the same method name ( move() ) results in different behaviors. The textbook uses a similar example of a break operation on different
vehicles .
Question 2
a) Explain insert, update & Delete anomalies with suitable example.
[Frequently Asked]
Anomalies are data inconsistencies that occur in a database, typically due to data redundancy in a poorly designed (non-normalized) schema . Redundant data leads
to problems when modifying the database.
1. Insertion Anomaly
Definition: This occurs when you are unable to insert a new piece of information into the database because another, unrelated piece of information is
missing .
Example: Consider a single table that stores both employee details and the project they are assigned to: (Emp_Id, Ename, Address, Project_Id,
Project_Name) . If a new employee is hired but not yet assigned to a project, you cannot add their details to the table because the Project_Id would
be null, and if the project ID is part of the primary key, this is not allowed. You are blocked from inserting a valid fact (a new employee) because another
fact (a project assignment) is not yet available.
2. Update Anomaly
Definition: This occurs due to data redundancy. If a piece of information is stored in multiple rows, and an update is made to only some of them, the
database becomes inconsistent .
Example: Using the same employee-project table, if an employee like 'Sachin' works on two different projects, his Address ('Malad') will be listed in
two separate rows . If 'Sachin' moves to a new address, you must update this information in both rows. If you only update one, the database will have
conflicting information about where 'Sachin' lives, leading to an inconsistent state.
3. Deletion Anomaly
Definition: This occurs when the deletion of one piece of information unintentionally leads to the loss of another, different piece of information that
you wanted to keep .
Example: In the employee-project table, if an employee works on only one project, and that project is completed and its record is deleted, you might
lose all information about that employee. For instance, if 'Suhas' only works on project 'FID', deleting the row for project 'FID' would also remove
'Suhas's Ename and Address from the database entirely, even though 'Suhas' is still an employee of the company.
The Object-Oriented paradigm was developed to better handle the complexity of modern applications. In the context of databases, Object-Oriented Database Systems
(OODBS) were proposed as an alternative to relational systems, particularly for domains where complex objects play a major role .
Handling Complex Data: Traditional relational databases struggle to efficiently store and manage complex data types like multimedia files, geographic
information systems (GIS) data, and computer-aided design (CAD) models. OODBS are designed to handle these complex, nested, and non-structured data
types naturally as objects.
Overcoming Impedance Mismatch: In application development, programmers work with objects. When using a relational database, these objects must be
translated into a tabular format (rows and columns), a process which is often complex and inefficient. This is called the "object-relational impedance
mismatch." OODBS eliminate this by allowing objects to be stored directly, creating a more seamless integration between the application and the database.
1. Identity (Object Identity): Each object has a unique Object Identifier (OID) that is independent of its values .
2. Encapsulation: Data (attributes) and operations (methods) are bundled together into a single object, hiding the internal details .
3. Inheritance: New classes (subtypes) can be created from existing classes (supertypes), inheriting their properties and promoting code reuse .
4. Complexity: Objects can have complex internal structures, containing other objects or collections of objects .
5. Persistency: Objects can be stored permanently in the database and exist even after the program that created them terminates .
6. Polymorphism: The ability for an operation to be implemented in different ways depending on the object it is acting upon .
Question 3
a) Explain in details architectures for parallel database.
[Frequently Asked]
A parallel database system uses multiple processors and disks in parallel to improve performance. There are three main architectures:
1. Shared Memory Architecture
Architecture: Multiple CPUs are connected to a single, global shared memory and a shared set of disks through a common communication bus or interconnection network. All processors have equal access to all memory and disk space.
Functionality: Processors communicate by writing to and reading from the shared memory, which is extremely fast. To reduce contention, each processor has
a private cache .
Advantages:
Efficient and fast inter-processor communication .
Simplified load balancing.
Disadvantages:
The shared memory bus becomes a bottleneck as the number of CPUs increases, limiting scalability .
Not scalable beyond 32 or 64 processors.
2. Shared Disk Architecture
Architecture: In this model, each processor has its own private (local) memory, but all processors can access a common set of shared disks through an interconnection network. These systems are often called clusters.
Functionality: Since memory is not shared, the memory bus is not a bottleneck. Processors communicate by passing messages over the network, and all have
access to the entire database on the shared disks.
Advantages:
High degree of fault tolerance. If one processor fails, others can take over its tasks since the database is accessible to all .
Disadvantages:
The network connecting the processors to the shared disks can become a bottleneck as data is shipped between processors .
3. Shared Nothing Architecture
Architecture: Each processor (or node) is part of a self-contained unit with its own local memory and its own local disk(s). The nodes communicate with each other exclusively by sending messages over a high-speed network. Data is partitioned across all the nodes.
Functionality: Queries are processed locally on a node as much as possible. If data from another node is needed, a message is sent to that node to request it.
Advantages:
Highly scalable. Thousands of processors can be added, as there are no shared resources to create a bottleneck .
Eliminates I/O and memory contention found in other architectures.
Disadvantages:
The cost of accessing data on a non-local node is high, as it requires network communication .
System performance is highly dependent on how well the data is partitioned across the nodes.
A distributed transaction management system is responsible for ensuring the properties of transactions (like atomicity and consistency) are maintained in a
distributed database system, where a single transaction can span multiple physical sites.
Key Concepts:
A distributed transaction is a transaction that accesses data on more than one site. It is treated as a single logical unit but is composed of several sub-
transactions , each executing on a different site .
The main challenge is ensuring atomicity, meaning the transaction must either commit at all participating sites or abort at all sites. A partial commit would
leave the database in an inconsistent state.
The Two-Phase Commit (2PC) Protocol: This is the standard protocol used to manage distributed transactions and ensure atomicity.
Roles:
Coordinator: The transaction manager at the site where the transaction originated .
Subordinates: The transaction managers at all other sites involved .
Step-by-Step Process:
Phase 1: The Voting (Prepare) Phase
1. The Coordinator sends a PREPARE message to all Subordinates.
2. Each Subordinate that can commit its sub-transaction force-writes a prepare record to its log and votes YES; otherwise it writes an abort record and votes NO.
3. Each Subordinate sends its vote back to the Coordinator.
Phase 2: The Decision Phase
4. The Coordinator collects votes from all Subordinates.
5. It makes a global decision:
If ALL Subordinates voted YES, the Coordinator decides to COMMIT.
If ANY Subordinate voted NO (or failed to respond), the Coordinator decides to ABORT.
6. The Coordinator force-writes its final decision (commit or abort) to its own log.
7. The Coordinator sends the final decision (COMMIT or ABORT message) to all Subordinates.
8. Each Subordinate receives the final decision, writes it to its log, and then either commits or aborts its sub-transaction accordingly.
9. Each Subordinate sends an ACK (acknowledgment) message to the Coordinator. The transaction is complete once the Coordinator receives all acknowledgments.
Question 4
a) Differentiate between Interquery Parallelism & Intraquery Parallelism.
Inter-query and Intra-query parallelism are two distinct strategies for improving performance in a parallel database system. The key difference lies in what is being
parallelized: multiple queries or a single query.
Definition: Inter-query parallelism means multiple, independent queries are executed simultaneously on different processors; intra-query parallelism means a single, large query is broken down into smaller sub-tasks, which are then executed simultaneously on different processors.
Primary Goal: Inter-query parallelism aims to increase system throughput (the number of queries completed per unit of time); intra-query parallelism aims to reduce the response time (the execution time) of a single, complex query.
Unit of Work: For inter-query parallelism, the unit of parallelism is an entire query; for intra-query parallelism, it is a sub-task or an operation within a query.
Example: Inter-query parallelism is running a query for Customer A on CPU 1 at the same time as a query for Customer B runs on CPU 2. Intra-query parallelism is breaking down a query that calculates total sales across all regions so that CPU 1 calculates sales for the North while CPU 2 calculates sales for the South in parallel.
b) Explain distributed database system architecture and how data is stored in distributed database system.
A distributed database system architecture defines how different DBMS-related processes and components are structured across a network. There are three main
approaches:
Client-Server Systems: This architecture separates functionality between client processes (which handle user interaction) and server processes (which
manage data and transactions). A client sends a query to a server, which executes it and returns the result .
Collaborating Server Systems: This is an evolution where multiple database servers can work together to execute a single query. When a server receives a
query that needs data from other sites, it coordinates with those other servers to generate the final result .
Middleware Systems: This architecture uses a special software layer (middleware) to integrate several independent, and possibly heterogeneous, database
systems. The middleware handles the complexity of executing a query across these different systems, while each system only needs to manage its local data .
Data is physically stored across multiple sites using two primary techniques: fragmentation and replication.
Data Fragmentation: This is the process of splitting a database into smaller logical units called fragments, which can be stored at different sites .
Horizontal Fragmentation: The table is divided by rows. For example, an Orders table could be fragmented by year, with all 2023 orders at Site A and
all 2024 orders at Site B .
Vertical Fragmentation: The table is divided by columns. For example, an Employee table could be split into a fragment with personal info (EmpID,
Name, Address) at one site and another with confidential info (EmpID, Salary, SSN) at a more secure site .
Data Replication: This is the process of storing multiple copies of data fragments at different sites .
Need for Replication: Replication is done to increase availability (if one site fails, data is still accessible) and improve performance (queries can be
executed on a local copy, reducing network delay) .