DBMS Notes


UNIT 1

Data Sharing and Accessibility: DBMS allows multiple users and
applications to access the same data simultaneously. This enables
collaboration and sharing of data across departments, teams, and
geographical locations.
Data Security: DBMS provides mechanisms for controlling access to
data, ensuring that only authorized users can view or modify sensitive
information. This includes user authentication, authorization, and
encryption features to safeguard data from unauthorized access and
data breaches.
Data Integrity: DBMS enforces data integrity constraints to maintain
the accuracy and consistency of data stored in the database.
Constraints such as unique keys, foreign key relationships, and data
validation rules help prevent invalid or inconsistent data from being
entered into the database.
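These constraints can be seen in action with a small sketch using Python's built-in sqlite3 module; the table and column names here are invented for illustration, not taken from the notes.

```python
# Sketch: integrity constraints (UNIQUE, CHECK, FOREIGN KEY) rejecting bad data.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("""CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,
    name    TEXT UNIQUE NOT NULL)""")
conn.execute("""CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    age     INTEGER CHECK (age >= 18),           -- data validation rule
    dept_id INTEGER REFERENCES department(dept_id))""")

conn.execute("INSERT INTO department VALUES (1, 'Sales')")
conn.execute("INSERT INTO employee VALUES (100, 'Asha', 30, 1)")

# Each of the following violates one constraint and is rejected by the DBMS.
violations = 0
for stmt in [
    "INSERT INTO department VALUES (2, 'Sales')",        # UNIQUE violated
    "INSERT INTO employee VALUES (101, 'Ravi', 15, 1)",  # CHECK violated
    "INSERT INTO employee VALUES (102, 'Meena', 25, 9)", # FK violated (no dept 9)
]:
    try:
        conn.execute(stmt)
    except sqlite3.IntegrityError:
        violations += 1
print(violations)  # 3
```

All three invalid rows are refused, so the stored data stays consistent without any application-side checks.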
Data Consistency: With DBMS, changes made to data are
immediately reflected in all related data items and data structures.
This ensures that data remains consistent across the database,
reducing the risk of data discrepancies or errors.
Data Independence: DBMS provides a layer of abstraction between
the physical storage of data and the logical representation of data.
This allows changes to the database schema or storage structures
without affecting the applications that use the data, providing both
logical and physical data independence.
Concurrent Access and Transaction Management: DBMS supports
concurrent access to data by multiple users and applications while
ensuring data integrity and consistency. Transaction management
features, such as ACID properties (Atomicity, Consistency, Isolation,
Durability), ensure that database transactions are executed reliably
and are recoverable in case of failures.
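Atomicity can be demonstrated with a short sqlite3 sketch: a two-step transfer is wrapped in one transaction, and a failure midway rolls back both steps. The account data is made up for the example.

```python
# Sketch: all-or-nothing transaction behavior (Atomicity) with sqlite3.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 100)])
conn.commit()

try:
    with conn:  # one transaction: commits on success, rolls back on exception
        conn.execute("UPDATE account SET balance = balance - 150 WHERE id = 1")
        conn.execute("UPDATE account SET balance = balance + 150 WHERE id = 2")
        (bal,) = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()
        if bal < 0:                      # simulated business rule: no overdrafts
            raise ValueError("overdraft")
except ValueError:
    pass  # the with-block already rolled the transaction back

balances = [row[0] for row in conn.execute("SELECT balance FROM account ORDER BY id")]
print(balances)  # [100, 100] — the failed transfer left no partial update
```

Neither account changes: the partial debit never becomes visible, which is exactly what the Atomicity property guarantees.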
Backup and Recovery: DBMS includes built-in mechanisms for data
backup and recovery, allowing organizations to create regular
backups of their database and restore data in the event of system
failures, disasters, or data corruption.
Scalability and Performance Optimization: DBMS can scale to
accommodate growing volumes of data and users by providing
features such as partitioning, indexing, and query optimization. This
ensures that database performance remains optimal even as the
workload increases.
Reduced Data Redundancy and Duplication: DBMS helps minimize
data redundancy and duplication by centralizing data storage and
providing mechanisms for data normalization. This results in efficient
use of storage space and reduces the likelihood of inconsistencies or
conflicts in data.
Data Analysis and Reporting: DBMS often includes tools and features
for data analysis, reporting, and business intelligence, allowing
organizations to gain insights from their data and make informed
decisions.
Overall, the adoption of a DBMS offers significant advantages in
terms of data management, security, integrity, and efficiency, making
it an essential component of modern information systems.

Hierarchical Model:
Data is organized in a tree-like structure with parent-child
relationships.
Each child can have only one parent.
Commonly used in older systems, such as IMS (Information
Management System) databases.
Network Model:
Data is organized as a collection of records and relationships.
Supports many-to-many relationships between entities.
Records are connected through pointers.
Used in systems like CODASYL databases.
Relational Model:
Data is organized into tables (relations) consisting of rows (tuples)
and columns (attributes).
Relationships between tables are established using foreign keys.
One of the most widely used data models, implemented in relational
database management systems (RDBMS) such as MySQL,
PostgreSQL, Oracle, and SQL Server.
Object-oriented Model:
Data is represented as objects containing attributes and methods.
Supports inheritance, encapsulation, and polymorphism.
Popular in object-oriented databases (OODBMS) and object-
relational mapping (ORM) frameworks.
Entity-Relationship (ER) Model:
Represents data as entities, attributes, and relationships.
Entities are objects or concepts with attributes that describe them.
Relationships depict associations between entities.
Often used for conceptual modeling before implementing a database
in a relational or other model.
Document Model:
Data is stored as documents, typically in JSON or XML format.
Documents can contain nested structures and arrays.
Widely used in NoSQL databases like MongoDB, Couchbase, and
Elasticsearch for flexible and schema-less data storage.
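A single "document" of the kind such stores hold can be sketched with plain JSON using Python's standard library; the field names are invented for illustration.

```python
# Sketch: one document with nested structure and an array, as a document
# store like MongoDB would hold it — no table schema is declared anywhere.
import json

doc = {
    "_id": "order-1001",
    "customer": {"name": "Asha", "city": "Pune"},   # nested structure
    "items": [                                       # array of sub-documents
        {"sku": "A1", "qty": 2},
        {"sku": "B7", "qty": 1},
    ],
}

text = json.dumps(doc)        # serialize for storage or transport
restored = json.loads(text)   # round-trips without any schema definition
print(restored["items"][0]["sku"])  # A1
```

Because the structure lives inside each document rather than in a schema, two documents in the same collection may have entirely different fields.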
Graph Model:
Represents data as nodes (vertices) and edges (relationships) between
them.
Used to model complex relationships and networks, such as social
networks, recommendation systems, and network infrastructure.
Graph databases like Neo4j and Amazon Neptune implement this
model for efficient traversal and analysis of graph data.
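The node-and-edge idea can be sketched with an adjacency list and a breadth-first traversal; the user names and "follows" edges are invented for the example.

```python
# Sketch: a tiny directed graph (users, "follows" edges) and a traversal,
# the core operation graph databases optimize.
from collections import deque

edges = [("amy", "bob"), ("bob", "cat"), ("amy", "dan"), ("dan", "cat")]
adj = {}
for src, dst in edges:
    adj.setdefault(src, []).append(dst)
    adj.setdefault(dst, [])          # ensure every node has an entry

def reachable(start):
    """Breadth-first traversal: the set of nodes reachable from `start`."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

print(sorted(reachable("amy")))  # ['amy', 'bob', 'cat', 'dan']
```

A graph database performs this kind of neighbor-hopping natively, without the joins a relational system would need for the same query.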
These data models provide different ways of structuring and
organizing data, each with its own strengths and weaknesses
depending on the requirements of the application and the nature of the
data being stored.

A database stores a lot of critical information that must be accessed
quickly and securely, so it is important to select the correct
architecture for efficient data management. DBMS architecture
determines how users' requests reach the database. We choose a
database architecture depending on several factors such as the size of
the database, the number of users, and the relationships between the
users. There are two types of database models that we generally use:
the logical model and the physical model. The types of architecture
are discussed in the next section.

Types of DBMS Architecture


There are several types of DBMS Architecture that we use according
to the usage requirements. Types of DBMS Architecture are discussed
here.
1-Tier Architecture
2-Tier Architecture
3-Tier Architecture
1-Tier Architecture
In 1-Tier Architecture the database is directly available to the user:
the client, server, and database are all present on the same machine.
For example, to learn SQL we set up an SQL server and the database
on the local system, which lets us interact with the relational database
directly and execute operations. Industry rarely uses this architecture;
it generally goes for 2-Tier or 3-Tier Architecture.

Advantages of 1-Tier Architecture


Below mentioned are the advantages of 1-Tier Architecture.
Simple Architecture: 1-Tier Architecture is the simplest architecture
to set up, as only a single machine is required to maintain it.
Cost-Effective: No additional hardware is required for implementing
1-Tier Architecture, which makes it cost-effective.
Easy to Implement: 1-Tier Architecture can be easily deployed, and
hence it is mostly used in small projects.
2-Tier Architecture
The 2-tier architecture is similar to a basic client-server model. The
application at the client end directly communicates with the database
on the server side. APIs like ODBC and JDBC are used for this
interaction. The server side is responsible for providing query
processing and transaction management functionalities. On the client
side, the user interfaces and application programs are run. The
application on the client side establishes a connection with the server
side to communicate with the DBMS.
Advantages of this type are that it is easier to maintain and
understand, and it is compatible with existing systems. However, this
model performs poorly when there is a large number of users.
DBMS 2-Tier Architecture
Advantages of 2-Tier Architecture
Easy to Access: 2-Tier Architecture gives easy access to the
database, which makes data retrieval fast.
Scalable: We can scale the database easily, by adding clients or
upgrading hardware.
Low Cost: 2-Tier Architecture is cheaper than 3-Tier Architecture
and Multi-Tier Architecture.
Easy Deployment: 2-Tier Architecture is easier to deploy than 3-Tier
Architecture.
Simple: 2-Tier Architecture is easily understandable as well as simple
because of only two components.
3-Tier Architecture
In 3-Tier Architecture, there is another layer between the client and
the server. The client does not directly communicate with the server.
Instead, it interacts with an application server which further
communicates with the database system and then the query processing
and transaction management takes place. This intermediate layer acts
as a medium for the exchange of partially processed data between the
server and the client. This type of architecture is used in the case of
large web applications.

DBMS 3-Tier Architecture


Advantages of 3-Tier Architecture
Enhanced scalability: Scalability is enhanced due to the distributed
deployment of application servers. Now, individual connections need
not be made between the client and server.

Data Integrity: 3-Tier Architecture maintains Data Integrity. Since
there is a middle layer between the client and the server, data
corruption can be avoided/removed.
Security: 3-Tier Architecture Improves Security. This type of model
prevents direct interaction of the client with the server thereby
reducing access to unauthorized data.
Disadvantages of 3-Tier Architecture
More Complex: 3-Tier Architecture is more complex in comparison
to 2-Tier Architecture. Communication Points are also doubled in 3-
Tier Architecture.
Difficult to Interact: Direct interaction with the database becomes
difficult due to the presence of the middle layer.
Data independence is the ability to modify the schema without
requiring the programs and applications that use the data to be
rewritten. Data is separated from the programs, so changes made to
the data will not affect program execution or the application.
We know the main purpose of the three levels of data abstraction is to
achieve data independence. If the database changes and expands over
time, it is very important that the changes in one level should not
affect the data at other levels of the database. This would save time
and cost required when changing the database.
There are two levels of data independence based on three levels of
abstraction. These are as follows −
Physical Data Independence
Logical Data Independence
Physical Data Independence
Physical Data Independence means changing the physical level
without affecting the logical level or conceptual level. Using this
property, we can change the storage device of the database without
affecting the logical schema.
The changes in the physical level may include changes using the
following −
A new storage device like magnetic tape, hard disk, etc.
A new data structure for storage.
A different data access method or using an alternative files
organization technique.
Changing the location of the database.
Logical Data Independence
Logical view of data is the user view of the data. It presents data in
the form that can be accessed by the end users.
Codd’s Rule of Logical Data Independence says that users should be
able to manipulate the Logical View of data without any information
of its physical storage. Software or the computer program is used to
manipulate the logical view of the data.
Database administrator is the one who decides what information is to
be kept in the database and how to use the logical level of abstraction.
It provides the global view of Data. It also describes what data is to be
stored in the database along with the relationship.
Data independence gives the database a simple structure. The logical
level is based on application-domain entities and provides an
abstraction of the system's functional requirements; its static
structure is defined in class/object diagrams. End users can view and
query data through the logical level, but they cannot change the
logical structure of the database itself.
Introduction of ER Model

The Entity Relationship Model is a model for identifying entities to
be represented in the database and for representing how those entities
are related. The ER data model specifies an enterprise schema that
represents the overall logical structure of a database graphically.
The Entity Relationship Diagram explains the relationship among the
entities present in the database. ER models are used to model real-
world objects like a person, a car, or a company and the relation
between these real-world objects. In short, the ER Diagram is the
structural format of the database.
Why Use ER Diagrams In DBMS?
ER diagrams are used to represent the E-R model in a database, which
makes them easy to convert into relations (tables).
ER diagrams serve the purpose of modeling real-world objects, which
makes them eminently useful.
ER diagrams require no technical knowledge and no hardware
support.
These diagrams are very easy to understand and easy to create even
for a naive user.
It gives a standard solution for visualizing the data logically.
Symbols Used in ER Model
ER Model is used to model the logical view of the system from a data
perspective which consists of these symbols:
Rectangles: Rectangles represent Entities in the ER Model.
Ellipses: Ellipses represent Attributes in the ER Model.
Diamond: Diamonds represent Relationships among Entities.
Lines: Lines link attributes to entities and connect entity sets to
relationship types.
Double Ellipse: Double Ellipses represent Multi-Valued Attributes.
Double Rectangle: Double Rectangle represents a Weak Entity.
Symbols used in ER Diagram
Components of ER Diagram
ER Model consists of Entities, Attributes, and Relationships among
Entities in a Database System.
Components of ER Diagram
Entity
An Entity may be an object with a physical existence – a particular
person, car, house, or employee – or it may be an object with a
conceptual existence – a company, a job, or a university course.
Entity Set: An Entity is an object of Entity Type and a set of all
entities is called an entity set. For Example, E1 is an entity having
Entity Type Student and the set of all students is called Entity Set. In
ER diagram, Entity Type is represented as:

Entity Set
1. Strong Entity
A Strong Entity is a type of entity that has a key attribute and does
not depend on any other entity in the schema. It has a primary key
that identifies it uniquely, and it is represented by a rectangle. These
are called Strong Entity Types.
2. Weak Entity
Usually, an entity type has a key attribute that uniquely identifies
each entity in the entity set. But there exist entity types for which a
key attribute cannot be defined. These are called Weak Entity Types.
For Example, A company may store the information of dependents
(Parents, Children, Spouse) of an Employee. But the dependents can’t
exist without the employee. So Dependent will be a Weak Entity
Type and Employee will be Identifying Entity type for Dependent,
which means it is Strong Entity Type.
A weak entity type is represented by a Double Rectangle. The
participation of weak entity types is always total. The relationship
between the weak entity type and its identifying strong entity type is
called identifying relationship and it is represented by a double
diamond.

Strong Entity and Weak Entity


Attributes
Attributes are the properties that define the entity type. For example,
Roll_No, Name, DOB, Age, Address, and Mobile_No are the
attributes that define entity type Student. In ER diagram, the attribute
is represented by an oval.
Attribute
1. Key Attribute
The attribute which uniquely identifies each entity in the entity set is
called the key attribute. For example, Roll_No will be unique for each
student. In ER diagram, the key attribute is represented by an oval
with its name underlined.

Key Attribute
2. Composite Attribute
An attribute composed of many other attributes is called a composite
attribute. For example, the Address attribute of the Student entity type
consists of Street, City, State, and Country. In ER diagram, the
composite attribute is represented by an oval comprising other ovals.
Composite Attribute
3. Multivalued Attribute
An attribute consisting of more than one value for a given entity. For
example, Phone_No (can be more than one for a given student). In ER
diagram, a multivalued attribute is represented by a double oval.

Multivalued Attribute
4. Derived Attribute
An attribute that can be derived from other attributes of the entity type
is known as a derived attribute. e.g.; Age (can be derived from DOB).
In ER diagram, the derived attribute is represented by a dashed oval.
Derived Attribute
The Complete Entity Type Student with its Attributes can be
represented as:
Entity and Attributes
Relationship Type and Relationship Set
A Relationship Type represents the association between entity types.
For example, ‘Enrolled in’ is a relationship type that exists between
entity type Student and Course. In ER diagram, the relationship type
is represented by a diamond and connecting the entities with lines.

Entity-Relationship Set
A set of relationships of the same type is known as a relationship set.
The following relationship set depicts S1 as enrolled in C2, S2 as
enrolled in C1, and S3 as registered in C3.

Relationship Set
Degree of a Relationship Set
The number of different entity sets participating in a relationship set is
called the degree of a relationship set.
1. Unary Relationship: When there is only ONE entity set
participating in a relation, the relationship is called a unary
relationship. For example, one person is married to only one person.

Unary Relationship
2. Binary Relationship: When there are TWO entity sets participating
in a relationship, the relationship is called a binary relationship. For
example, a Student is enrolled in a Course.

Binary Relationship
3. Ternary Relationship: When there are THREE entity sets
participating in a relationship, it is called a ternary relationship. In
general, a relationship among n entity sets is called an n-ary
relationship.
Cardinality
The number of times an entity of an entity set participates in a
relationship set is known as cardinality. Cardinality can be of different
types:
1. One-to-One: When each entity in each entity set can take part only
once in the relationship, the cardinality is one-to-one. Let us assume
that a male can marry one female and a female can marry one male.
So the relationship will be one-to-one.
The total number of tables that can be used in this is 2.

one to one cardinality


Using Sets, it can be represented as:

Set Representation of One-to-One


2. One-to-Many: In a one-to-many mapping, an entity on one side can
be related to more than one entity on the other side. Let us assume
that one surgeon department can accommodate many doctors. So the
cardinality will be 1 to M: one department has many doctors. The
relationship can be stored either by a foreign key in the "many" table
(2 tables in total) or in a separate relationship table (3 tables in total).

one to many cardinality


Using sets, one-to-many cardinality can be represented as:

Set Representation of One-to-Many


3. Many-to-One: When entities in one entity set can take part only
once in the relationship set and entities in other entity sets can take
part more than once in the relationship set, cardinality is many to one.
Let us assume that a student can take only one course but one course
can be taken by many students. So the cardinality will be n to 1. It
means that for one course there can be n students but for one student,
there will be only one course.
The total number of tables that can be used in this is 3.
many to one cardinality
Using Sets, it can be represented as:

Set Representation of Many-to-One


In this case, each student is taking only 1 course but 1 course has been
taken by many students.
4. Many-to-Many: When entities in all entity sets can take part more
than once in the relationship, the cardinality is many-to-many. Let us
assume that a student can take more than one course and one course
can be taken by many students. So the relationship will be many to
many.
The total number of tables that can be used in this is 3.

many to many cardinality


Using Sets, it can be represented as:
Many-to-Many Set Representation
In this example, student S1 is enrolled in C1 and C3, and course C3 is
taken by S1, S3, and S4. So it is a many-to-many relationship.
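The three-table mapping of a many-to-many relationship can be sketched with sqlite3, using the Student/Course example above; the junction table holds the 'Enrolled in' relationship set itself.

```python
# Sketch: many-to-many cardinality implemented with 3 tables —
# student, course, and a junction table for the relationship set.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (sid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE course  (cid TEXT PRIMARY KEY, title TEXT);
CREATE TABLE enrolled_in (              -- the third (junction) table
    sid INTEGER REFERENCES student(sid),
    cid TEXT    REFERENCES course(cid),
    PRIMARY KEY (sid, cid));
INSERT INTO student VALUES (1, 'S1'), (3, 'S3'), (4, 'S4');
INSERT INTO course  VALUES ('C1', 'DBMS'), ('C3', 'Networks');
INSERT INTO enrolled_in VALUES (1, 'C1'), (1, 'C3'), (3, 'C3'), (4, 'C3');
""")

# Who takes C3? Many students per course, and S1 also takes many courses.
c3 = [r[0] for r in conn.execute(
    "SELECT s.name FROM student s JOIN enrolled_in e ON s.sid = e.sid "
    "WHERE e.cid = 'C3' ORDER BY s.sid")]
print(c3)  # ['S1', 'S3', 'S4']
```

Each row of the junction table is one relationship instance, so both directions of the many-to-many mapping are queried with plain joins.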
Participation Constraint
Participation Constraint is applied to the entity participating in the
relationship set.
1. Total Participation – Each entity in the entity set must participate in
the relationship. If each student must enroll in a course, the
participation of students will be total. Total participation is shown by
a double line in the ER diagram.
2. Partial Participation – The entity in the entity set may or may NOT
participate in the relationship. If some courses are not enrolled by any
of the students, the participation in the course will be partial.
The diagram depicts the ‘Enrolled in’ relationship set with Student
Entity set having total participation and Course Entity set having
partial participation.

Total Participation and Partial Participation


Using Set, it can be represented as,

Set representation of Total Participation and Partial Participation


Every student in the Student Entity set participates in a relationship
but there exists a course C4 that is not taking part in the relationship.
How to Draw ER Diagram?
The very first step is to identify all the entities, place each in a
rectangle, and label them accordingly.
The next step is to identify the relationships between them, place
them using diamonds, and make sure that relationships are not
connected to each other.
Attach attributes to the entities properly.
Remove redundant entities and relationships.
Add proper colors to highlight the data present in the database.

An entity is a “thing” or “object” in the real world. An entity contains
attributes, which describe that entity. So anything about which we
store information is called an entity. Entities recorded in the database
must be distinguishable, i.e., easily recognized from the group. Here
we will see the difference between strong and weak entities.
What is Strong Entity?
A strong entity is not dependent on any other entity in the schema. A
strong entity will always have a primary key. Strong entities are
represented by a single rectangle. The relationship of two strong
entities is represented by a single diamond. Various strong entities,
when combined together, create a strong entity set.
What is Weak Entity?
A weak entity is dependent on a strong entity to ensure its existence.
Unlike a strong entity, a weak entity does not have any primary key. It
instead has a partial discriminator key. A weak entity is represented
by a double rectangle. The relation between one strong and one weak
entity is represented by a double diamond. This relationship is also
known as identifying relationship.
ENHANCED ER MODEL
Today the complexity of the data is increasing so it becomes more
and more difficult to use the traditional ER model for database
modeling. To reduce this complexity of modeling we have to make
improvements or enhancements to the existing ER model to make it
able to handle the complex application in a better way.
Enhanced entity-relationship diagrams are advanced database
diagrams very similar to regular ER diagrams which represent the
requirements and complexities of complex databases.
It is a diagrammatic technique for displaying the Sub Class and Super
Class; Specialization and Generalization; Union or Category;
Aggregation etc.
Generalization and Specialization: These are very common
relationships found in real entities. However, this kind of relationship
was added later as an enhanced extension to the classical ER
model. Specialized classes are often called subclass while
a generalized class is called a superclass, probably inspired by object-
oriented programming. A sub-class is best understood by “IS-A
analysis”. The following statements hopefully make some sense to
your mind “Technician IS-A Employee”, and “Laptop IS-A
Computer”.
An entity can be a specialized type/class of another entity. For
example, in a university system a Technician is a special Employee,
and Faculty is a special class of Employee. We call this phenomenon
generalization/specialization. In the example here, Employee is a
generalized entity class while Technician and Faculty are specialized
classes of Employee.
Example:
This example shows instances of “sub-class” relationships. Here we
have four sets of employees: the Employee super-class and its three
sub-classes Secretary, Technician, and Engineer. Each sub-class is a
subset of the Employee set.
An entity belonging to a sub-class is related to some super-class
entity. For instance, emp no 1001 is a secretary, and his typing speed
is 68. Emp no 1009 is an engineer (sub-class) and her trade is
“Electrical”, and so forth.
Sub-class entity “inherits” all attributes of super-class; for example,
employee 1001 will have attributes eno, name, salary, and typing
speed.
Enhanced ER model of above example

Constraints – There are two types of constraints on the “Sub-class”


relationship.
Total or Partial – A sub-classing relationship is total if every super-
class entity must be associated with some sub-class entity; otherwise
it is partial. The sub-classing “job-type based employee category” is
partial: not every employee is necessarily one of (secretary, engineer,
technician), i.e., the union of these three types is a proper subset of
all employees. Whereas the sub-classing “Salaried Employee AND
Hourly Employee” is total: the union of entities from the sub-classes
equals the total employee set, i.e., every employee necessarily has to
be one of them.
Overlapped or Disjoint – If an entity from a super-set can be related
(can occur) in multiple sub-class sets, then it is overlapped sub-
classing, otherwise disjoint. Both the examples: job-type based and
salaries/hourly employee sub-classing are disjoint.

Multiple Inheritance (sub-class of multiple superclasses) –


An entity can be a sub-class of multiple entity types; such entities
have multiple super-classes. A Teaching Assistant can be a subclass
of both Employee and Student. A faculty member in a university
system can be a subclass of both Employee and Alumnus. With
multiple inheritance, the attributes of the sub-class are the union of
the attributes of all its super-classes.
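The attribute-union behavior maps directly onto multiple inheritance in an object-oriented language; the class and attribute names below are invented for the sketch.

```python
# Sketch: a TeachingAssistant sub-class of both Employee and Student —
# its attribute set is the union of both super-classes, plus its own.
class Employee:
    def __init__(self, eno, salary):
        self.eno = eno
        self.salary = salary

class Student:
    def __init__(self, roll_no, gpa):
        self.roll_no = roll_no
        self.gpa = gpa

class TeachingAssistant(Employee, Student):
    def __init__(self, eno, salary, roll_no, gpa, stipend):
        Employee.__init__(self, eno, salary)   # inherit Employee attributes
        Student.__init__(self, roll_no, gpa)   # inherit Student attributes
        self.stipend = stipend                 # attribute specific to the sub-class

ta = TeachingAssistant(eno=1009, salary=40000, roll_no=77, gpa=8.5, stipend=5000)
print(sorted(vars(ta)))  # ['eno', 'gpa', 'roll_no', 'salary', 'stipend']
```

The instance carries every attribute of both super-classes, which is exactly the union rule the EER model states for multiple inheritance.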
Union –
Set of Library Members is UNION of Faculty, Student, and Staff. A
union relationship indicates either type; for example, a library
member is either Faculty or Staff or Student.
Below are two examples that show how UNION can be depicted in
ERD – Vehicle Owner is UNION of PERSON and Company, and
RTO Registered Vehicle is UNION of Car and Truck.
You might see some confusion between sub-class and UNION.
Consider the example in the figure above: Vehicle is the super-class
of Car and Truck, and that is a perfectly correct example of sub-
classing. But UNION is used differently: when we say RTO
Registered Vehicle is the UNION of Car and Truck, the members do
not inherit any attributes of Vehicle; the attribute sets of Car and
Truck stay altogether independent, whereas in the sub-classing
situation Car and Truck would inherit the attributes of the Vehicle
class.
An Enhanced Entity-Relationship (EER) model is an extension of the
original Entity-Relationship (ER) model that includes additional
concepts and features to support more complex data modeling
requirements. The EER model includes all the elements of the ER
model and adds new constructs, such as subtypes and supertypes,
generalization and specialization, and inheritance.
Here are some of the key features of the EER model:
Subtypes and Supertypes: The EER model allows for the creation of
subtypes and supertypes. A supertype is a generalization of one or
more subtypes, while a subtype is a specialization of a supertype. For
example, a vehicle could be a supertype, while car, truck, and
motorcycle could be subtypes.
Generalization and Specialization: Generalization is the process of
identifying common attributes and relationships between entities and
creating a supertype based on these common features. Specialization
is the process of identifying unique attributes and relationships
between entities and creating subtypes based on these unique features.
Inheritance: Inheritance is a mechanism that allows subtypes to inherit
attributes and relationships from their supertype. This means that any
attribute or relationship defined for a supertype is automatically
inherited by all its subtypes.
Constraints: The EER model allows for the specification of
constraints that must be satisfied by entities and relationships.
Examples of constraints include cardinality constraints, which specify
the number of relationships that can exist between entities, and
participation constraints, which specify whether an entity is required
to participate in a relationship.
Overall, the EER model provides a powerful and flexible way to
model complex data relationships, making it a popular choice for
database design.
To summarize, the EER model extends the traditional ER model with
additional features to represent complex relationships between
entities more accurately. Some of its main features are:
Subclasses and Superclasses: EER model allows for the creation of a
hierarchical structure of entities where a superclass can have one or
more subclasses. Each subclass inherits attributes and relationships
from its superclass, and it can also have its unique attributes and
relationships.
Specialization and Generalization: EER model uses the concepts of
specialization and generalization to create a hierarchy of entities.
Specialization is the process of defining subclasses from a superclass,
while generalization is the process of defining a superclass from two
or more subclasses.
Attribute Inheritance: EER model allows attributes to be inherited
from a superclass to its subclasses. This means that attributes defined
in the superclass are automatically inherited by all its subclasses.
Union Types: EER model allows for the creation of a union type,
which is a combination of two or more entity types. The union type
can have attributes and relationships that are common to all the entity
types that make up the union.
Aggregation: EER model allows for the creation of an aggregate
entity that represents a group of entities as a single entity. The
aggregate entity has its unique attributes and relationships.
Multi-valued Attributes: EER model allows an attribute to have
multiple values for a single entity instance. For example, an entity
representing a person may have multiple phone numbers.
Relationships with Attributes: EER model allows relationships
between entities to have attributes. These attributes can describe the
nature of the relationship or provide additional information about the
relationship.
Overall, these features make the EER model more expressive and
powerful than the traditional ER model, allowing for a more accurate
representation of complex relationships between entities.

Indexing in DBMS
Indexing is used to optimize the performance of a database by
minimizing the number of disk accesses required when a query is
processed.
The index is a type of data structure. It is used to locate and access
the data in a database table quickly.
Index structure:
Indexes can be created using some database columns.

The first column of the index is the search key. It contains a copy of
the primary key or candidate key of the table. These values are stored
in sorted order so that the corresponding data can be accessed easily.
The second column of the index is the data reference. It contains a
set of pointers holding the address of the disk block where the value
of the particular key can be found.
Indexing Methods
Ordered indices
The indices are usually sorted to make searching faster. The indices
which are sorted are known as ordered indices.
Example: Suppose we have an employee table with thousands of
records, each of which is 10 bytes long. If their IDs start with 1, 2,
3, ... and so on, and we have to search for the employee with ID 543:
In the case of a database with no index, we have to search the disk
blocks from the start until we reach 543. The DBMS will find the
record after reading 543*10 = 5430 bytes.
In the case of an index, we will search using the index, and the DBMS
will find the record after reading 542*2 = 1084 bytes, which is far
less than in the previous case.
Primary Index
If the index is created on the basis of the primary key of the table,
then it is known as primary indexing. Primary keys are unique to each
record and have a 1:1 relation with the records.
As primary keys are stored in sorted order, the performance of the
searching operation is quite efficient.
The primary index can be classified into two types: Dense index and
Sparse index.
Dense index
The dense index contains an index record for every search key value
in the data file. It makes searching faster.
In this, the number of records in the index table is the same as the
number of records in the main table.
It needs more space to store the index records themselves. Each index
record holds the search key and a pointer to the actual record on the disk.
Sparse index
In a sparse index, index records appear only for some of the search
key values, and each entry points to a block.
Instead of pointing to every record in the main table, the index
points to records in the main table at intervals, leaving gaps.
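The difference matters for lookups: a sparse index finds the largest indexed key less than or equal to the target, then scans that block sequentially. A minimal sketch of this, with made-up blocks of three records each:

```python
import bisect

# Hypothetical data file: records grouped into blocks of 3, sorted by key.
blocks = [[(1, "A"), (2, "B"), (3, "C")],
          [(4, "D"), (5, "E"), (6, "F")],
          [(7, "G"), (8, "H"), (9, "I")]]

# Sparse index: one (first_key_in_block, block_number) entry per block.
sparse_index = [(block[0][0], i) for i, block in enumerate(blocks)]

def sparse_lookup(key):
    """Find the last index entry whose key <= target, then scan its block."""
    keys = [k for k, _ in sparse_index]
    pos = bisect.bisect_right(keys, key) - 1   # largest indexed key <= target
    if pos < 0:
        return None
    _, block_no = sparse_index[pos]
    for k, value in blocks[block_no]:          # sequential scan within block
        if k == key:
            return value
    return None

print(sparse_lookup(5))  # E
```

A dense index would instead hold one entry per record, trading the sequential scan for extra index space.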

Clustering Index
A clustered index can be defined as an ordered data file. Sometimes
the index is created on non-primary key columns which may not be
unique for each record.
In this case, to identify the record faster, we will group two or more
columns to get the unique value and create index out of them. This
method is called a clustering index.
The records which have similar characteristics are grouped, and
indexes are created for these groups.
Example: suppose a company contains several employees in each
department. Suppose we use a clustering index, where all employees
which belong to the same Dept_ID are considered within a single
cluster, and index pointers point to the cluster as a whole. Here
Dept_Id is a non-unique key.
The scheme above can be a little confusing because one disk block may
be shared by records belonging to different clusters. Using a separate
disk block for each cluster is considered the better technique.
Secondary Index
In sparse indexing, as the size of the table grows, the size of the
mapping also grows. These mappings are usually kept in primary memory
so that address fetches are faster; secondary memory is then searched
for the actual data using the address obtained from the mapping. If
the mapping grows too large, fetching the address itself becomes
slower, and the sparse index is no longer efficient.
To overcome this problem, secondary indexing is introduced.
In secondary indexing, to reduce the size of mapping, another level
of indexing is introduced. In this method, the huge range for the
columns is selected initially so that the mapping size of the first level
becomes small. Then each range is further divided into smaller
ranges. The mapping of the first level is stored in the primary
memory, so that address fetch is faster. The mapping of the second
level and actual data are stored in the secondary memory (hard disk).

For example:
If you want to find the record of roll 111, the search first looks
for the highest first-level entry that is smaller than or equal to
111, obtaining 100 at this level.
Then, in the second-level index, it again finds the highest entry
smaller than or equal to 111 and gets 110. Using the address stored
against 110, it goes to the data block and scans each record until it
finds 111.
This is how a search is performed in this method. Insertion, updating,
and deletion are done in the same manner.
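The two-level lookup described above can be sketched in Python; the ranges, table numbers, and block names below are made up for illustration:

```python
import bisect

# First level (kept in primary memory): one entry per large key range.
first_level = [(1, 0), (100, 1), (200, 2)]         # (range start, second-level table no.)

# Second level (on disk): finer ranges pointing at data blocks.
second_level = [
    [(1, "blk0"), (50, "blk1")],                   # table 0: keys 1..99
    [(100, "blk2"), (110, "blk3"), (150, "blk4")], # table 1: keys 100..199
    [(200, "blk5")],                               # table 2: keys 200 and up
]

def locate_block(roll):
    """Return the data block to scan for a given roll number."""
    # Level 1: highest entry <= roll.
    starts = [s for s, _ in first_level]
    i = bisect.bisect_right(starts, roll) - 1
    table_no = first_level[i][1]
    # Level 2: again the highest entry <= roll.
    table = second_level[table_no]
    starts2 = [s for s, _ in table]
    j = bisect.bisect_right(starts2, roll) - 1
    return table[j][1]

print(locate_block(111))  # blk3, which is then scanned record by record
```

For roll 111 this reproduces the walk in the text: the first level yields 100, the second level yields 110, and only that one block needs scanning.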
Dynamic multilevel indexes are a data structure used in database
systems to efficiently organize and search large volumes of data. B+
trees and B- trees are two types of tree structures commonly used to
implement indexes in databases.
A B+ tree is a balanced tree data structure where each node contains a
sorted list of keys and pointers to child nodes. In a B+ tree, data
entries are stored only in leaf nodes, and non-leaf nodes are used for
navigation purposes. B+ trees are often used for indexing in databases
because they provide efficient range queries and support ordered
traversal of data.
A B-tree is similar to a B+ tree, but in a B-tree data entries can be
stored in both leaf and non-leaf nodes, unlike B+ trees where only
leaf nodes contain data entries. Because data can sit in internal
nodes, a lookup may terminate before reaching a leaf, although fewer
keys then fit in each internal node.
To implement dynamic multilevel indexes using B+ and B- trees, you
can use a combination of these tree structures at different levels. For
example, you can use a B+ tree as the top-level index to quickly
locate the appropriate B- tree for a given range of keys. Each B- tree
can then contain pointers to the actual data entries or further levels of
B+ or B- trees, depending on the size and distribution of the data.
This multilevel approach allows for efficient indexing and searching
of large datasets by reducing the height of the tree and minimizing the
number of disk accesses required to locate data entries. Additionally,
the use of both B+ and B- trees provides flexibility in handling
different types of queries and data distributions.
UNIT 2
Types of models(Already in UNIT 1)
Constraints on Relational Database Model

In modeling the design of the relational database we can put some
restrictions like what values are allowed to be inserted in the relation,
and what kind of modifications and deletions are allowed in the
relation. These are the restrictions we impose on the relational
database.
In models like Entity-Relationship models, we did not have such
features. Database Constraints can be categorized into 3 main
categories:
Constraints that are applied in the data model are called Implicit
Constraints.
Constraints that are directly applied in the schemas of the data model,
by specifying them in the DDL(Data Definition Language). These are
called Schema-Based Constraints or Explicit Constraints.
Constraints that cannot be directly applied in the schemas of the data
model. We call these Application-based or Semantic Constraints.
So here we are going to deal with Implicit constraints.
Relational Constraints
These are the restrictions or sets of rules imposed on the database
contents. It validates the quality of the database. It validates the
various operations like data insertion, updation, and other processes
that have to be performed without affecting the integrity of the data. It
protects us against threats/damages to the database. Mainly
Constraints on the relational database are of 4 types
Domain constraints
Key constraints or Uniqueness Constraints
Entity Integrity constraints
Referential integrity constraints

Types of Relational Constraints


Let’s discuss each of the above constraints in detail.
1. Domain Constraints
Every domain must contain atomic values(smallest indivisible units)
which means composite and multi-valued attributes are not allowed.
We perform a datatype check here, which means when we assign a
data type to a column we limit the values that it can contain. Eg. If we
assign the datatype of attribute age as int, we can’t give it values other
than int datatype.
Example:

EID    Name            Phone
01     Bikash Dutta    123456789, 234456678

Explanation: In the above relation, Name is a composite attribute and
Phone is a multi-valued attribute, so it is violating the domain
constraint.
2. Key Constraints or Uniqueness Constraints
These are called uniqueness constraints since it ensures that every
tuple in the relation should be unique.
A relation can have multiple keys or candidate keys(minimal
superkey), out of which we choose one of the keys as the primary key,
we don’t have any restriction on choosing the primary key out of
candidate keys, but it is suggested to go with the candidate key with
less number of attributes.
Null values are not allowed in the primary key, hence Not Null
constraint is also part of the key constraint.
Example:

EID Name Phone

01 Bikash 6000000009

02 Paul 9000090009

01 Tuhin 9234567892
Explanation: In the above table, EID is the primary key, and the first
and the last tuple have the same value in EID ie 01, so it is violating
the key constraint.
3. Entity Integrity Constraints
Entity Integrity constraints say that no primary key can take a NULL
value, since using the primary key we identify each tuple uniquely in
a relation.
Example:

EID Name Phone

01 Bikash 9000900099

02 Paul 600000009

NULL Sony 9234567892

Explanation: In the above relation, EID is made the primary key, and
the primary key can’t take NULL values but in the third tuple, the
primary key is null, so it is violating Entity Integrity constraints.
4. Referential Integrity Constraints
The Referential integrity constraint is specified between two relations
or tables and used to maintain the consistency among the tuples in
two relations.
This constraint is enforced through a foreign key, when an attribute in
the foreign key of relation R1 has the same domain(s) as the primary
key of relation R2, then the foreign key of R1 is said to reference or
refer to the primary key of relation R2.
The values of the foreign key in a tuple of relation R1 can either take
the values of the primary key for some tuple in relation R2, or can
take NULL values, but can’t be empty.
Example:

EID Name DNO

01 Divine 12

02 Dino 22

04 Vivian 14

DNO Place

12 Jaipur

13 Mumbai

14 Delhi

Explanation: In the above tables, the DNO of Table 1 is the foreign
key, and DNO in Table 2 is the primary key. DNO = 22 in the foreign
key of Table 1 is not allowed because DNO = 22 is not defined in the
primary key of table 2. Therefore, Referential integrity constraints are
violated here.
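The key and referential integrity violations above can be reproduced with a small sketch using Python's built-in sqlite3 module (the Emp/Dept names follow the examples above; note that SQLite only enforces foreign keys when the foreign_keys pragma is switched on):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite needs this to enforce FKs

conn.execute("CREATE TABLE Dept (DNO INTEGER PRIMARY KEY, Place TEXT)")
conn.execute("""CREATE TABLE Emp (
                    EID INTEGER PRIMARY KEY,          -- key constraint
                    Name TEXT,
                    DNO INTEGER REFERENCES Dept(DNO)  -- referential integrity
                )""")
conn.executemany("INSERT INTO Dept VALUES (?, ?)",
                 [(12, "Jaipur"), (13, "Mumbai"), (14, "Delhi")])

conn.execute("INSERT INTO Emp VALUES (1, 'Divine', 12)")   # OK: 12 exists in Dept

violations = []
try:
    conn.execute("INSERT INTO Emp VALUES (2, 'Dino', 22)") # 22 not in Dept
except sqlite3.IntegrityError:
    violations.append("referential")
try:
    conn.execute("INSERT INTO Emp VALUES (1, 'Tuhin', 14)")  # duplicate EID
except sqlite3.IntegrityError:
    violations.append("key")

print(violations)  # ['referential', 'key']
```

Both offending inserts are rejected by the DBMS itself, which is the point of declaring the constraints in the schema rather than checking them in application code.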
Advantages of Relational Database Model
It is simpler than the hierarchical model and network model.
It is easy and simple to understand.
Its structure can be changed anytime upon requirement.
Data Integrity: The relational database model enforces data integrity
through various constraints such as primary keys, foreign keys, and
unique constraints. This ensures that the data in the database is
accurate, consistent, and valid.
Flexibility: The relational database model is highly flexible and can
handle a wide range of data types and structures. It also allows for
easy modification and updating of the data without affecting other
parts of the database.
Scalability: The relational database model can scale to handle large
amounts of data by adding more tables, indexes, or partitions to the
database. This allows for better performance and faster query
response times.
Security: The relational database model provides robust security
features to protect the data in the database. These include user
authentication, authorization, and encryption of sensitive data.
Data consistency: The relational database model ensures that the data
in the database is consistent across all tables. This means that if a
change is made to one table, the corresponding changes will be made
to all related tables.
Query Optimization: The relational database model provides a query
optimizer that can analyze and optimize SQL queries to improve their
performance. This allows for faster query response times and better
scalability.
Disadvantages of the Relational Model
Few database relations have certain limits which can’t be expanded
further.
It can be complex and it becomes hard to use.
Complexity: The relational model can be complex and difficult to
understand, particularly for users who are not familiar with SQL and
database design principles. This can make it challenging to set up and
maintain a relational database.
Performance: The relational model can suffer from performance
issues when dealing with large data sets or complex queries. In
particular, joins between tables can be slow, and indexing strategies
can be difficult to optimize.
Scalability: While the relational model is generally scalable, it can
become difficult to manage as the database grows in size. Adding new
tables or indexes can be time-consuming, and managing relationships
between tables can become complex.
Cost: Relational databases can be expensive to license and maintain,
particularly for large-scale deployments. Additionally, relational
databases often require dedicated hardware and specialized software
to run, which can add to the cost.
Limited flexibility: The relational model is designed to work with
tables that have predefined structures and relationships. This can
make it difficult to work with data that does not fit neatly into a table-
based format, such as unstructured or semi-structured data.
Data redundancy: In some cases, the relational model can lead to data
redundancy, where the same data is stored in multiple tables. This can
lead to inefficiencies and can make it difficult to ensure data
consistency across the database.

Introduction of Relational Algebra in DBMS

Pre-Requisite: Relational Model in DBMS


Relational Algebra is a procedural query language. Relational algebra
mainly provides a theoretical foundation for relational databases
and SQL. The main purpose of using Relational Algebra is to define
operators that transform one or more input relations into an output
relation. Given that these operators accept relations as input and
produce relations as output, they can be combined and used to express
potentially complex queries that transform potentially many input
relations (whose data are stored in the database) into a single output
relation (the query results). As it is pure mathematics, there is no use
of English Keywords in Relational Algebra and operators are
represented using symbols.
Fundamental Operators
These are the basic/fundamental operators used in Relational Algebra.
Selection(σ)
Projection(π)
Union(U)
Set Difference(-)
Set Intersection(∩)
Rename(ρ)
Cartesian Product(X)
1. Selection(σ): It is used to select required tuples of the relations.
Example:

A B C

1 2 4

2 2 3

3 2 3

4 3 4

For the above relation, σ(c>3)R will select the tuples which have c
more than 3.
A B C

1 2 4

4 3 4

Note: The selection operator only selects the required tuples but does
not display them. For display, the data projection operator is used.
2. Projection(π): It is used to project required column data from a
relation.
Example: Consider Table 1. Suppose we want columns B and C from
Relation R.
π(B,C)R will show following columns.

B C

2 4

2 3

3 4

Note: By Default, projection removes duplicate data.
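Selection and projection are easy to model in Python over a relation stored as a list of tuples; the sketch below (which assumes positional columns A=0, B=1, C=2) reproduces both result tables above, including duplicate removal in the projection:

```python
# Relation R from the tables above, as a list of (A, B, C) tuples.
R = [(1, 2, 4), (2, 2, 3), (3, 2, 3), (4, 3, 4)]

def select(relation, predicate):
    """Selection sigma: keep only the tuples satisfying the predicate."""
    return [t for t in relation if predicate(t)]

def project(relation, *cols):
    """Projection pi: keep the named columns, removing duplicate rows."""
    seen, out = set(), []
    for t in relation:
        row = tuple(t[c] for c in cols)
        if row not in seen:
            seen.add(row)
            out.append(row)
    return out

# sigma(C > 3)(R): column C is index 2
print(select(R, lambda t: t[2] > 3))   # [(1, 2, 4), (4, 3, 4)]
# pi(B, C)(R): columns B and C are indices 1 and 2
print(project(R, 1, 2))                # [(2, 4), (2, 3), (3, 4)]
```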


3. Union(U): Union operation in relational algebra is the same as
union operation in set theory.
Example:

FRENCH
Student_Name Roll_Number

Ram 01

Mohan 02

Vivek 13

Geeta 17

GERMAN

Student_Name Roll_Number

Vivek 13

Geeta 17

Shyam 21

Rohan 25

Consider the above tables of Students having different optional
subjects in their course.
π(Student_Name)FRENCH U π(Student_Name)GERMAN

Student_Name

Ram
Mohan
Vivek
Geeta
Shyam
Rohan

Note: The only constraint in the union of two relations is that both
relations must have the same set of Attributes.
4. Set Difference(-): Set Difference in relational algebra is the same
set difference operation as in set theory.
Example: From the above table of FRENCH and GERMAN, Set
Difference is used as follows
π(Student_Name)FRENCH - π(Student_Name)GERMAN

Student_Name

Ram

Mohan

Note: The only constraint in the Set Difference between two relations
is that both relations must have the same set of Attributes.
5. Set Intersection(∩): Set Intersection in relational algebra is the
same set intersection operation in set theory.
Example: From the above table of FRENCH and GERMAN, the Set
Intersection is used as follows
π(Student_Name)FRENCH ∩ π(Student_Name)GERMAN

Student_Name

Vivek
Geeta

Note: The only constraint in the Set Intersection between two
relations is that both relations must have the same set of Attributes.
6. Rename(ρ): Rename is a unary operation used for renaming attributes
of a relation.
ρ(a/b)R will rename the attribute 'b' of the relation by 'a'.
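The union, set difference, and set intersection operators above behave exactly like their set-theory counterparts, which a short Python sketch makes concrete (using the Student_Name projections of the FRENCH and GERMAN tables):

```python
# Student_Name projections of the FRENCH and GERMAN relations above.
FRENCH = {"Ram", "Mohan", "Vivek", "Geeta"}
GERMAN = {"Vivek", "Geeta", "Shyam", "Rohan"}

union = FRENCH | GERMAN           # students taking either subject
difference = FRENCH - GERMAN      # students taking only French
intersection = FRENCH & GERMAN    # students taking both subjects

print(sorted(union))         # ['Geeta', 'Mohan', 'Ram', 'Rohan', 'Shyam', 'Vivek']
print(sorted(difference))    # ['Mohan', 'Ram']
print(sorted(intersection))  # ['Geeta', 'Vivek']
```

The "same set of attributes" constraint corresponds to the fact that Python set operations only make sense over elements of the same shape.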
7. Cross Product(X): Cross-product between two relations. Let’s say
A and B, so the cross product between A X B will result in all the
attributes of A followed by each attribute of B. Each record of A will
pair with every record of B.
Example:

A
Name Age Sex

Ram 14 M

Sona 15 F

Kim 20 M

B
ID Course

1 DS

2 DBMS

AXB

Name Age Sex ID Course

Ram 14 M 1 DS

Ram 14 M 2 DBMS

Sona 15 F 1 DS

Sona 15 F 2 DBMS

Kim 20 M 1 DS

Kim 20 M 2 DBMS

Note: If A has ‘n’ tuples and B has ‘m’ tuples then A X B will have ‘
n*m ‘ tuples.
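The cross product is exactly itertools.product over the two relations; a sketch with the A and B tables above confirms the n*m rule:

```python
from itertools import product

A = [("Ram", 14, "M"), ("Sona", 15, "F"), ("Kim", 20, "M")]
B = [(1, "DS"), (2, "DBMS")]

# A X B: every tuple of A concatenated with every tuple of B.
AxB = [a + b for a, b in product(A, B)]

print(len(AxB))   # 6, i.e. n*m = 3*2 tuples
print(AxB[0])     # ('Ram', 14, 'M', 1, 'DS')
```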
Derived Operators
These are some of the derived operators, which are derived from the
fundamental operators.
Natural Join(⋈)
Conditional Join
1. Natural Join(⋈): Natural join is a binary operator. Natural join
between two or more relations will result in a set of all combinations
of tuples where they have an equal common attribute.
Example:

EMP

Name ID Dept_Name

A 120 IT

B 125 HR

C 110 Sales

D 111 IT

DEPT

Dept_Name Manager

Sales Y

Production Z

IT A

Natural join between EMP and DEPT with condition :


EMP.Dept_Name = DEPT.Dept_Name
EMP ⋈
DEPT

Name ID Dept_Name Manager

A 120 IT A

C 110 Sales Y

D 111 IT A
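The natural join above can be sketched as a nested loop that keeps only the tuple pairs agreeing on the common attribute, emitting the shared column once (the EMP and DEPT data are taken from the tables above):

```python
# EMP(Name, ID, Dept_Name) and DEPT(Dept_Name, Manager) from above.
EMP = [("A", 120, "IT"), ("B", 125, "HR"), ("C", 110, "Sales"), ("D", 111, "IT")]
DEPT = [("Sales", "Y"), ("Production", "Z"), ("IT", "A")]

def natural_join(emp, dept):
    """Combine tuples whose common attribute Dept_Name is equal,
    keeping the shared column only once in the result."""
    result = []
    for name, emp_id, dname in emp:
        for dept_name, manager in dept:
            if dname == dept_name:
                result.append((name, emp_id, dname, manager))
    return result

for row in natural_join(EMP, DEPT):
    print(row)
# ('A', 120, 'IT', 'A')
# ('C', 110, 'Sales', 'Y')
# ('D', 111, 'IT', 'A')
```

Employee B drops out because HR has no matching tuple in DEPT, just as in the result table above.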

2. Conditional Join: Conditional join works similarly to natural join.


In natural join, by default condition is equal between common
attributes while in conditional join we can specify any condition such
as greater than, less than, or not equal.
Example:

R
ID Sex Marks

1 F 45

2 F 55

3 F 60

S
ID Sex Marks

10 M 20

11 M 22

12 M 59

Join between R and S with condition R.marks >= S.marks

R.ID R.Sex R.Marks S.ID S.Sex S.Marks

1 F 45 10 M 20

1 F 45 11 M 22

2 F 55 10 M 20

2 F 55 11 M 22

3 F 60 10 M 20

3 F 60 11 M 22

3 F 60 12 M 59
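A conditional (theta) join can be sketched as a Cartesian product filtered by an arbitrary predicate; the relations below reproduce R and S from the example:

```python
# R(ID, Sex, Marks) and S(ID, Sex, Marks) from the example above.
R = [(1, "F", 45), (2, "F", 55), (3, "F", 60)]
S = [(10, "M", 20), (11, "M", 22), (12, "M", 59)]

def theta_join(left, right, condition):
    """Conditional (theta) join: keep every combination of tuples
    that satisfies the given predicate."""
    return [l + r for l in left for r in right if condition(l, r)]

# Join with condition R.Marks >= S.Marks (Marks is at index 2).
joined = theta_join(R, S, lambda l, r: l[2] >= r[2])
print(len(joined))   # 7 rows, matching the result table above
```

With an equality predicate on a common attribute this reduces to an equi-join, which is why natural join is described as a special case.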

Relational calculus, a non-procedural query language in database
management systems, guides users on what data is needed without
specifying how to obtain it. Commonly utilized in commercial
relational languages like SQL-QBE and QUEL, relational calculus
ensures a focus on desired data without delving into procedural
details, promoting a more efficient and abstract approach to querying
in relational databases.
What is Relational Calculus?
Before understanding Relational Calculus in DBMS, we need to
understand Procedural Language and Declarative Language.
Procedural Language - Those languages which clearly define how to get
the required results from the database are called Procedural
Languages. Relational algebra is a Procedural Language.
Declarative Language - Those languages that only care about what to
get from the database, without getting into how to get the results,
are called Declarative Languages. Relational Calculus is a Declarative
Language.
In the context of Database Management Systems (DBMS), TRC
(Tuple Relational Calculus) serves as a formal language for
specifying queries over relational databases. While SQL (Structured
Query Language) is the most widely used query language in practical
database systems, TRC is often used in theoretical contexts and for
formal database design.
In TRC, queries are expressed as formulas over sets of tuples, where
the result of a query is a set of tuples that satisfy certain conditions.
TRC provides a declarative way to specify what data should be
retrieved, without specifying how to obtain it.
Here's a basic overview of TRC in DBMS:
Variables: TRC uses variables to represent tuples in relations. These
variables range over the tuples in the relations being queried.
Formulas: TRC queries are formulated as logical formulas using
quantifiers (existential and universal), conjunctions, disjunctions, and
negations. These formulas specify conditions that the desired tuples
must satisfy.
Quantifiers: TRC supports existential (∃) and universal (∀)
quantifiers. These quantifiers are used to express conditions involving
variables.
Operators: TRC queries can use comparison operators (=, ≠, <, >, ≤,
≥) and other logical operators (AND, OR, NOT) to build complex
conditions.
Examples:
{T | ∃S (R(T, S) ∧ S > 100)}: This query retrieves all tuples T from
relation R such that there exists a tuple S where R(T, S) is true and S
is greater than 100.
{T | R(T) ∧ ¬(∃S (S < 50 ∧ R(T, S)))}: This query retrieves all tuples
T from relation R such that there is no tuple S where R(T, S) is true
and S is less than 50.
TRC is valuable for understanding the theoretical foundations of
relational databases, as it provides a formal way to reason about
queries and their results. However, in practical database systems, SQL
is the standard language for interacting with relational databases due
to its widespread adoption and expressive power.
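The two TRC formulas above map naturally onto set comprehensions. The sketch below uses a small made-up binary relation R(T, S), stored as a set of pairs, to evaluate both queries:

```python
# Hypothetical binary relation R(T, S) as a set of (t, s) pairs.
R = {("a", 50), ("b", 150), ("c", 200), ("b", 30)}

# { T | exists S ( R(T, S) and S > 100 ) }
result = {t for (t, s) in R if s > 100}
print(sorted(result))   # ['b', 'c']

# { T | R(T, S) holds for some S, and there is no S < 50 with R(T, S) }
result2 = {t for (t, s) in R} - {t for (t, s) in R if s < 50}
print(sorted(result2))  # ['a', 'c']
```

The existential quantifier becomes membership in a comprehension, and the negated existential becomes a set subtraction, which mirrors how such formulas are typically evaluated.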

Domain Relational Calculus (DRC) is a formal query language used
to retrieve data from relational databases. Unlike Tuple Relational
Calculus (TRC), which specifies what data to retrieve (declarative
approach), DRC specifies what data to retrieve based on the domain
of values (also declarative but from a different perspective).
In DRC, queries are expressed in terms of variables ranging over
domain elements rather than tuples. It describes the desired result
without specifying how to obtain it, leaving the database management
system to figure out the most efficient way to execute the query.
A typical query in DRC might look like:
{ x | R(x) ∧ P(x) }
Here, R(x) and P(x) are predicates over the domain, and x is a
variable ranging over the domain elements. This query retrieves all
elements x from the domain for which both R(x) and P(x) are true.
DRC is often used in theoretical discussions of relational databases
and query languages, but in practice, SQL (Structured Query
Language) is more commonly used to interact with relational
databases due to its widespread adoption and practicality.

DDL (Data Definition Language) commands are used to define,
modify, and delete database structures such as tables, indexes, and
views. Here are some common DDL commands in SQL:
CREATE: Used to create new database objects like tables, views,
indexes, etc.
CREATE TABLE table_name ( column1 datatype, column2 datatype,
... );
ALTER: Modifies existing database objects.
ALTER TABLE table_name ADD column_name datatype; ALTER
TABLE table_name DROP COLUMN column_name;
DROP: Deletes existing database objects.
DROP TABLE table_name; DROP INDEX index_name;
TRUNCATE: Removes all records from a table, but keeps the
structure intact.
TRUNCATE TABLE table_name;
COMMENT: Adds comments to the data dictionary.
COMMENT ON TABLE table_name IS 'Description of the table';
COMMENT ON COLUMN table_name.column_name IS
'Description of the column';
RENAME: Renames an existing database object.
ALTER TABLE table_name RENAME TO new_table_name;
These commands are essential for database administrators and
developers to manage the structure and integrity of the database. They
allow for creating, altering, and dropping database objects, ensuring
that the database schema meets the requirements of the application.
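A quick way to experiment with these DDL commands is Python's built-in sqlite3 module. Note that SQLite's DDL dialect differs slightly from the general syntax above (it has no TRUNCATE or COMMENT, and spells the rename as ALTER TABLE ... RENAME TO); the table and column names below are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# CREATE, ALTER (add column), RENAME, and DROP against a throwaway schema.
conn.execute("CREATE TABLE staff (id INTEGER, name TEXT)")
conn.execute("ALTER TABLE staff ADD COLUMN salary REAL")
conn.execute("ALTER TABLE staff RENAME TO employees")

# Inspect the resulting schema via the data dictionary.
cols = [row[1] for row in conn.execute("PRAGMA table_info(employees)")]
print(cols)  # ['id', 'name', 'salary']

conn.execute("DROP TABLE employees")
tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table'")]
print(tables)  # []
```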

DML (Data Manipulation Language) commands are used to manipulate the
data stored in database tables. Here are some common DML commands in
SQL:
INSERT: Adds new records into a table.


INSERT INTO table_name (column1, column2, ...) VALUES
(value1, value2, ...);
UPDATE: Modifies existing records in a table.
UPDATE table_name SET column1 = value1, column2 = value2, ...
WHERE condition;
DELETE: Removes records from a table.
DELETE FROM table_name WHERE condition;
MERGE: Combines INSERT, UPDATE, and DELETE operations
into one statement, typically used for data synchronization between
tables.
MERGE INTO target_table USING source_table ON condition
WHEN MATCHED THEN UPDATE SET column1 = value1,
column2 = value2 WHEN NOT MATCHED THEN INSERT
(column1, column2, ...) VALUES (value1, value2, ...);
CALL: Invokes a stored procedure or a user-defined function.
CALL procedure_name(arguments);
These commands are fundamental for manipulating data within a
database. They allow for retrieving, adding, updating, and deleting
records, enabling the manipulation and maintenance of data to meet
the requirements of the application.
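The INSERT, UPDATE, and DELETE commands above can be exercised end to end with Python's sqlite3 module; the accounts schema below is made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL)")

# INSERT: add two new records.
conn.executemany("INSERT INTO accounts (id, balance) VALUES (?, ?)",
                 [(1, 100.0), (2, 250.0)])
# UPDATE: modify an existing record.
conn.execute("UPDATE accounts SET balance = balance + 50 WHERE id = 1")
# DELETE: remove a record.
conn.execute("DELETE FROM accounts WHERE id = 2")

rows = list(conn.execute("SELECT id, balance FROM accounts"))
print(rows)  # [(1, 150.0)]
```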

DCL (Data Control Language) commands are used to manage access
permissions and control the security aspects of a database system. The
two primary DCL commands in SQL are:
GRANT: Allows users to perform specified tasks or privileges on
database objects.
GRANT privilege(s) ON object TO user;
For example:
GRANT SELECT, INSERT ON table_name TO user1;
REVOKE: Removes privileges from users.
REVOKE privilege(s) ON object FROM user;
For example:
REVOKE SELECT ON table_name FROM user1;
These commands are crucial for managing the security and access
control within a database system. They enable administrators to grant
specific permissions to users or roles and revoke those permissions
when necessary, ensuring that the database remains secure and only
authorized users can access and manipulate the data.
What Are Constraints In DBMS?
Constraints in DBMS (Database Management Systems) encompass
rules or conditions applied to database data, ensuring data integrity,
consistency, and adherence to business rules. They define limitations
and requirements that data must meet, preventing the entry of invalid
or inconsistent data. Constraints serve as pre-established rules
governing the behavior and relationships of data in a database,
contributing to the maintenance of accuracy and reliability.
The purpose of constraints is to enforce data quality and thwart data
inconsistencies, thereby bolstering the overall data integrity and
reliability of the database. Constraints set boundaries for data values,
relationships between entities, uniqueness requirements, and more.
Through constraint enforcement, DBMS ensures data conforms to
predefined standards and business rules, fortifying the database’s
robustness and reliability.
Types Of Constraints In DBMS
Within relational databases, there are primarily five types of
constraints, commonly referred to as relational constraints. These
include:
Domain Constraints
Key Constraints
Entity Integrity Constraints
Referential Integrity Constraints
Tuple Uniqueness Constraints
Domain Constraints
Define the valid values that an attribute (column) can hold.
Specify the data type (e.g., integer, string, date) and any additional
restrictions (e.g., range of values, allowed patterns).
Types of Domain Constraints
Check Constraint: A check constraint in a Database Management
System (DBMS) is a way to enforce certain conditions on the values
that are stored in a database. It is a rule or condition that is specified
at the time of table creation or alteration to restrict the values that can
be inserted or updated in a column.
Here’s a basic syntax for creating a check constraint in a table using
SQL (Structured Query Language):
CREATE TABLE TableName (
Column1 DataType,
Column2 DataType,
-- Other columns
CONSTRAINT ConstraintName CHECK (condition)
);
In the above syntax, the condition is the expression or condition that
must be satisfied for the data in the specified column(s).
Example: Here’s an example to illustrate the concept. Let’s say we
have a table for storing employee information, and we want to ensure
that an employee’s Salary is always greater than zero and the
Department is one of the specified values.
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY,
Salary DECIMAL(10, 2) CHECK (Salary > 0),
Department VARCHAR(50) CHECK (Department IN ('HR', 'IT',
'Finance'))
);
With this check constraint, any attempt to insert or update a record
in the Employees table with a Salary of 0 or less, or a Department
other than HR, IT, or Finance, will result in a constraint violation
error.
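The same behaviour can be checked with Python's built-in sqlite3 module, which also supports CHECK constraints; the sketch below reuses the Employees schema above with made-up salary values:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Employees (
                    EmployeeID INTEGER PRIMARY KEY,
                    Salary REAL CHECK (Salary > 0),
                    Department TEXT CHECK (Department IN ('HR', 'IT', 'Finance'))
                )""")

conn.execute("INSERT INTO Employees VALUES (1, 55000, 'IT')")   # passes both checks

rejected = 0
for row in [(2, -10, 'HR'),          # violates Salary > 0
            (3, 40000, 'Sales')]:    # violates the Department list
    try:
        conn.execute("INSERT INTO Employees VALUES (?, ?, ?)", row)
    except sqlite3.IntegrityError:
        rejected += 1

print(rejected)  # 2 rows rejected by the CHECK constraints
```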
2. Not Null Constraint: The NOT NULL constraint in a Database
Management System (DBMS) is used to ensure that a column in a
table cannot contain any NULL values. NULL is a special marker
used in databases to indicate that a data value does not exist in the
database. The NOT NULL constraint ensures that a column always
has a value, and it cannot be left empty.
Here’s the basic syntax for creating a NOT NULL constraint while
defining a table using SQL (Structured Query Language):
CREATE TABLE TableName (
Column1 DataType NOT NULL,
Column2 DataType,
-- Other columns
);
In this syntax:
TableName is the name of the table.
Column1, Column2, and so on are the columns in the table.
DataType is the data type of the column.
The NOT NULL constraint is used after the data type to indicate that
the column cannot contain NULL values.
Example: Here’s a simple example using an “Employees” table where
the “FirstName” column cannot be NULL:
CREATE TABLE Employees (
EmployeeID INT,
FirstName VARCHAR(50) NOT NULL,
LastName VARCHAR(50),
-- Other columns
);
With this NOT NULL constraint, every record in the “Employees”
table must have a non-null value in the “FirstName” column. If an
attempt is made to insert a record without specifying a value for
“FirstName” or updating an existing record to set “FirstName” to
NULL, it will result in a constraint violation error.
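A sketch of the NOT NULL behaviour using Python's sqlite3 module and the Employees table above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Employees (
                    EmployeeID INTEGER,
                    FirstName TEXT NOT NULL,
                    LastName TEXT
                )""")

conn.execute("INSERT INTO Employees VALUES (1, 'John', 'Doe')")  # OK

try:
    conn.execute("INSERT INTO Employees VALUES (2, NULL, 'Smith')")
    failed = False
except sqlite3.IntegrityError:
    failed = True   # NOT NULL constraint failed: Employees.FirstName

print(failed)  # True: the NULL FirstName was rejected
```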
Key Constraints
A key constraint in a Database Management System (DBMS) refers
to a set of rules applied to one or more columns in a database table to
ensure the uniqueness and integrity of data. Keys are used to uniquely
identify rows in a table, and they play a fundamental role in
establishing relationships between tables. There are several types of
key constraints, each serving a specific purpose:
Primary Key Constraint:
A primary key is a column or a set of columns that uniquely identifies
each row in a table.
The primary key constraint ensures that the values in the specified
columns are unique and not NULL.
There can be only one primary key in a table.
Example: Let’s consider a simple example of a primary key constraint
in a relational database using a Students table:
CREATE TABLE Students (
StudentID INT PRIMARY KEY,
FirstName VARCHAR(50),
LastName VARCHAR(50),
Age INT,
-- Other columns
);
In this example:
StudentID is the primary key for the Students table.
FirstName, LastName, Age, and other columns are attributes
associated with each student.
Here’s how this primary key constraint works:
a. Inserting Data:
-- Inserting two students with unique StudentID values
INSERT INTO Students (StudentID, FirstName, LastName, Age)
VALUES (1, 'John', 'Doe', 20);

INSERT INTO Students (StudentID, FirstName, LastName, Age)
VALUES (2, 'Jane', 'Smith', 22);
This is valid because each StudentID is unique.
-- Trying to insert a student with a duplicate StudentID
INSERT INTO Students (StudentID, FirstName, LastName, Age)
VALUES (1, 'Bob', 'Johnson', 21);
This would result in a constraint violation error since the primary key
must be unique, and ‘1’ is already used.
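The duplicate-key rejection can be reproduced with Python's sqlite3 module, using the Students example above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Students (
                    StudentID INTEGER PRIMARY KEY,
                    FirstName TEXT, LastName TEXT, Age INTEGER
                )""")
conn.execute("INSERT INTO Students VALUES (1, 'John', 'Doe', 20)")
conn.execute("INSERT INTO Students VALUES (2, 'Jane', 'Smith', 22)")

try:
    conn.execute("INSERT INTO Students VALUES (1, 'Bob', 'Johnson', 21)")
    duplicate_allowed = True
except sqlite3.IntegrityError:
    duplicate_allowed = False   # primary key values must be unique

print(duplicate_allowed)  # False: the duplicate StudentID was rejected
```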
2. Unique Constraint:
A unique constraint ensures that all values in a column or a set of
columns are unique.
Unlike the primary key, a unique constraint allows NULL values.
Example: Suppose we have a Products table where we want to ensure
that each product has a unique product code:
CREATE TABLE Products (
ProductID INT PRIMARY KEY,
ProductCode VARCHAR(20) UNIQUE,
ProductName VARCHAR(100),
Price DECIMAL(10, 2),
-- Other columns
);
In this example:
ProductID is the primary key that uniquely identifies each product.
ProductCode is a column with a unique key constraint.
The UNIQUE constraint on the ProductCode column ensures that
each value in this column must be unique across all rows in the
Products table. It means that we cannot have two products with the
same product code.
Here’s how this unique key constraint works:
a. Inserting Data:
-- Inserting two products with different product codes
INSERT INTO Products (ProductID, ProductCode, ProductName, Price)
VALUES (1, 'P001', 'Product A', 49.99);

INSERT INTO Products (ProductID, ProductCode, ProductName, Price)
VALUES (2, 'P002', 'Product B', 29.99);
This is valid because the product codes (‘P001’ and ‘P002’) are
unique.
-- Trying to insert a product with a duplicate product code
INSERT INTO Products (ProductID, ProductCode, ProductName, Price)
VALUES (3, 'P001', 'Product C', 19.99);
This would result in a constraint violation error since ‘P001’ is
already used as a product code.
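The difference from a primary key — that a unique column may hold NULLs — can be checked directly. A sketch assuming SQLite via Python's sqlite3; note that the treatment of multiple NULLs varies by engine (SQL Server, for example, permits only one NULL in a unique index, while SQLite and most others allow several):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Products (
        ProductID INTEGER PRIMARY KEY,
        ProductCode TEXT UNIQUE,
        ProductName TEXT,
        Price REAL
    )
""")

conn.execute("INSERT INTO Products VALUES (1, 'P001', 'Product A', 49.99)")

# A duplicate product code is rejected...
try:
    conn.execute("INSERT INTO Products VALUES (2, 'P001', 'Product B', 29.99)")
    duplicate_ok = True
except sqlite3.IntegrityError:
    duplicate_ok = False

# ...but NULL codes are accepted, even more than once (in SQLite)
conn.execute("INSERT INTO Products VALUES (3, NULL, 'Product C', 19.99)")
conn.execute("INSERT INTO Products VALUES (4, NULL, 'Product D', 9.99)")
rows = conn.execute("SELECT COUNT(*) FROM Products").fetchone()[0]

print(duplicate_ok, rows)  # False 3
```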
3. Foreign Key Constraint:
A foreign key is a column or a set of columns in a table that refers to
the primary key of another table.
It establishes a relationship between the two tables, enforcing
referential integrity.
The foreign key constraint requires that every value in the foreign key
column(s) match a value in the referenced primary key column(s), or be
NULL if the column allows it. This enforces referential integrity:
relationships between tables are maintained and inconsistent references
are prevented, making the foreign key a powerful tool for linking tables
in a relational database.
Example: Let’s consider a simple example to illustrate the use of a
foreign key constraint in a relational database. Suppose we have two
tables: Customers and Orders.
1. Customers Table:
CREATE TABLE Customers (
CustomerID INT PRIMARY KEY,
FirstName VARCHAR(50),
LastName VARCHAR(50),
Email VARCHAR(100) UNIQUE,
-- Other columns
);
2. Orders Table:
CREATE TABLE Orders (
OrderID INT PRIMARY KEY,
CustomerID INT,
OrderDate DATE,
TotalAmount DECIMAL(10, 2),
-- Other columns
FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID)
);
In this example:
In the Customers table, CustomerID is the primary key that uniquely
identifies each customer. It is also referenced by the foreign key in the
Orders table.
In the Orders table, OrderID is the primary key that uniquely
identifies each order. The CustomerID column is a foreign key that
establishes a relationship with the Customers table.
The FOREIGN KEY (CustomerID) REFERENCES
Customers(CustomerID) statement in the Orders table specifies that
the CustomerID column in the Orders table is a foreign key, and it
must refer to a valid CustomerID in the Customers table.
Here’s how this foreign key constraint works:
a. Inserting Data:
-- Insert a customer
INSERT INTO Customers (CustomerID, FirstName, LastName, Email)
VALUES (1, 'John', 'Doe', 'john.doe@example.com');

-- Insert an order referencing the customer
INSERT INTO Orders (OrderID, CustomerID, OrderDate, TotalAmount)
VALUES (101, 1, '2024-01-06', 150.00);
This ensures that when you insert an order into the Orders table, the
CustomerID must exist in the Customers table.
b. Updating Data:
-- Update the CustomerID in the Orders table
UPDATE Orders
SET CustomerID = 2
WHERE OrderID = 101;
This update succeeds only if a customer with CustomerID 2 already exists
in the Customers table. If you try to set the CustomerID in the Orders
table to a value that does not exist in the Customers table, it violates
the foreign key constraint.
c. Deleting Data:
-- Delete a customer
DELETE FROM Customers WHERE CustomerID = 1;
If you try to delete a customer who has orders in the Orders table, it
will violate the foreign key constraint. Typically, you need to handle
such cases by either preventing deletion or cascading the deletion to
related records.
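The cascading behaviour mentioned above can be declared on the constraint itself with a referential action. A minimal sketch, assuming SQLite via Python's sqlite3 (SQLite enforces foreign keys only after PRAGMA foreign_keys = ON):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite disables FK checks by default

conn.execute("CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY)")
conn.execute("""
    CREATE TABLE Orders (
        OrderID INTEGER PRIMARY KEY,
        CustomerID INTEGER,
        FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID)
            ON DELETE CASCADE
    )
""")

conn.execute("INSERT INTO Customers VALUES (1)")
conn.execute("INSERT INTO Orders VALUES (101, 1)")

# Deleting the customer now cascades to the referencing order
conn.execute("DELETE FROM Customers WHERE CustomerID = 1")
remaining = conn.execute("SELECT COUNT(*) FROM Orders").fetchone()[0]
print(remaining)  # 0 -- the order was deleted along with its customer
```

With ON DELETE SET NULL instead, the order row would survive with its CustomerID set to NULL rather than being removed.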
Entity Integrity Constraint
The Entity Integrity Constraint is a specific part of the key constraint:
it requires that no attribute of a primary key contain a NULL value. If
NULLs were allowed in primary key attributes, several tuples could carry
NULL entries there, making them indistinguishable and defeating the
requirement that the primary key uniquely identify each tuple. The
constraint therefore reinforces non-null values within the primary key so
that every record in a relational database remains uniquely identifiable.
Example: Let’s consider an example to illustrate entity constraints in
a database. Assume we have a table named Employees with the
following structure:
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY,
FirstName VARCHAR(50) NOT NULL,
LastName VARCHAR(50) NOT NULL,
Email VARCHAR(100) UNIQUE,
-- Other columns
);
Now, let’s insert some data to demonstrate the entity integrity
constraints:
-- Valid data insertion
INSERT INTO Employees (EmployeeID, FirstName, LastName, Email)
VALUES (1, 'John', 'Doe', 'john.doe@example.com');

-- Invalid data insertion (violates NOT NULL constraint)
INSERT INTO Employees (EmployeeID, FirstName, LastName, Email)
VALUES (2, NULL, 'Smith', 'smith@example.com');
-- This would result in an error since FirstName cannot be NULL

-- Invalid data insertion (violates PRIMARY KEY constraint)
INSERT INTO Employees (EmployeeID, FirstName, LastName, Email)
VALUES (1, 'Jane', 'Doe', 'jane.doe@example.com');
-- This would result in an error since EmployeeID must be unique

-- Valid data insertion
INSERT INTO Employees (EmployeeID, FirstName, LastName, Email)
VALUES (2, 'Alice', 'Johnson', 'alice@example.com');
In this example, the entity integrity constraint ensures that the
EmployeeID attribute, serving as the primary key, cannot contain null
values. Attempting to insert data with a null FirstName or violating
the uniqueness of the primary key results in constraint violation
errors, thereby enforcing the entity integrity of the database.

Referential Integrity Constraint


Referential integrity in a database is a crucial concept ensuring data
consistency among related tables through primary and foreign keys.
The referential integrity constraint is established when a foreign key
references the primary key of another table: the set of values appearing
in the referencing attribute must be a subset of the values in the
referenced attribute.
As a result, a record cannot be inserted into the referencing relation
unless the value it refers to already exists in the referenced relation,
and a record in the referenced relation cannot be updated or deleted
while records in the referencing relation still point to it. This
maintains the accuracy and coherence of the relational database.
Example: Let’s consider an example with two tables: Orders and
Customers.
-- Customers table
CREATE TABLE Customers (
CustomerID INT PRIMARY KEY,
FirstName VARCHAR(50),
LastName VARCHAR(50),
Email VARCHAR(100) UNIQUE,
-- Other columns
);

-- Orders table
CREATE TABLE Orders (
OrderID INT PRIMARY KEY,
CustomerID INT,
OrderDate DATE,
TotalAmount DECIMAL(10, 2),
FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID)
);
Now, let’s perform some operations to demonstrate referential
integrity:
-- Valid data insertion
INSERT INTO Customers (CustomerID, FirstName, LastName, Email)
VALUES (1, 'John', 'Doe', 'john.doe@example.com');

INSERT INTO Orders (OrderID, CustomerID, OrderDate, TotalAmount)
VALUES (101, 1, '2024-01-06', 150.00);

-- Invalid data insertion (violates foreign key constraint)
INSERT INTO Orders (OrderID, CustomerID, OrderDate, TotalAmount)
VALUES (102, 2, '2024-01-07', 100.00);
-- This would result in an error since CustomerID 2 does not exist in
-- the Customers table

-- Valid data update
UPDATE Customers
SET FirstName = 'Jane'
WHERE CustomerID = 1;
-- This is valid because it does not change the referenced key

-- Invalid data update (violates foreign key constraint)
UPDATE Customers
SET CustomerID = 3
WHERE CustomerID = 1;
-- This would result in an error since it would orphan the order
-- (OrderID 101) that still references CustomerID 1

-- Invalid data deletion (violates foreign key constraint)
DELETE FROM Customers WHERE CustomerID = 1;
-- This would result in an error since Order 101 still references
-- CustomerID 1. The deletion succeeds only if the constraint defines a
-- referential action such as ON DELETE CASCADE or ON DELETE SET NULL,
-- or if the referencing orders are deleted first

-- Valid data deletion
DELETE FROM Orders WHERE OrderID = 101;
DELETE FROM Customers WHERE CustomerID = 1;
-- Deleting the referencing order first makes the customer deletion valid
In this example, the referential integrity constraint ensures that the
relationship between the Orders and Customers tables is maintained.
It prevents inserting, updating, or deleting data in a way that would
create inconsistencies in the relationships between these tables.
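These insert and delete failures can be reproduced end to end. A minimal sketch, assuming SQLite via Python's sqlite3 (foreign key enforcement must be switched on explicitly in SQLite; other engines enforce it by default):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY)")
conn.execute("""
    CREATE TABLE Orders (
        OrderID INTEGER PRIMARY KEY,
        CustomerID INTEGER,
        FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID)
    )
""")

conn.execute("INSERT INTO Customers VALUES (1)")
conn.execute("INSERT INTO Orders VALUES (101, 1)")  # valid: customer 1 exists

# Inserting an order for a non-existent customer is rejected
try:
    conn.execute("INSERT INTO Orders VALUES (102, 2)")
    insert_ok = True
except sqlite3.IntegrityError:
    insert_ok = False

# Deleting a customer that still has orders is rejected
try:
    conn.execute("DELETE FROM Customers WHERE CustomerID = 1")
    delete_ok = True
except sqlite3.IntegrityError:
    delete_ok = False

print(insert_ok, delete_ok)  # False False
```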
Define Integrity Constraints for Tables
To maintain data integrity in a database, you can either define rules,
defaults, indexes, and triggers, or you can define integrity constraints
in the create table statement.
The method you select depends on your requirements. Integrity constraints
offer the advantages of defining integrity controls in one step during
the table creation process (as defined by the SQL standards) and of
simplifying the process to create those integrity controls. However,
integrity constraints are more limited in scope and less comprehensive
than defaults, rules, indexes, and triggers.
For example, triggers provide more complex handling of referential
integrity than those declared in create table. The integrity constraints
defined by a create table are specific to that table; you cannot bind
them to other tables, and you can only drop or change them
using alter table. Constraints cannot contain subqueries or aggregate
functions, even on the same table.
The two methods are not mutually exclusive. You can use integrity
constraints along with defaults, rules, indexes, and triggers. This gives
you the flexibility to choose the best method for your application.
This section describes the create table integrity constraints. Defaults,
rules, indexes, and triggers are described in later chapters.
You can create these types of constraints:
unique and primary key constraints require that no two rows in a table
have the same values in the specified columns. In addition, a primary
key constraint does not allow a null value in any row of the column.
Referential integrity (references) constraints require that data being
inserted in specific columns already has matching data in the
specified table and columns. Use sp_helpconstraint to find a table’s
referenced tables.
check constraints limit the values of data inserted into columns.
In SQL, integrity constraints ensure the accuracy and consistency of
data in a database. The ALTER TABLE command is used to modify
an existing table, including adding, modifying, or dropping integrity
constraints. Here's how you can define and drop integrity constraints
using the ALTER TABLE command:
Defining Integrity Constraints:
You can define integrity constraints when creating a table using the
CREATE TABLE statement, or you can add them later using the
ALTER TABLE statement. Here are some common integrity
constraints:
Primary Key Constraint: Ensures that each record in a table is
uniquely identifiable. It's usually created using one or more columns
that have unique values.
ALTER TABLE table_name ADD CONSTRAINT constraint_name
PRIMARY KEY (column1, column2, ...);
Foreign Key Constraint: Enforces a relationship between two tables,
ensuring referential integrity. It's created on a column or columns in
one table that references the primary key in another table.
ALTER TABLE table_name ADD CONSTRAINT constraint_name
FOREIGN KEY (column1, column2, ...) REFERENCES
referenced_table_name (ref_column1, ref_column2, ...);
Unique Constraint: Ensures that all values in a column or a group of
columns are unique.
ALTER TABLE table_name ADD CONSTRAINT constraint_name
UNIQUE (column1, column2, ...);
Check Constraint: Specifies a condition that must be met for the data
in a column.
ALTER TABLE table_name ADD CONSTRAINT constraint_name
CHECK (condition);
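As a concrete illustration of the check template, here is a hypothetical positive-price check, sketched with SQLite via Python's sqlite3. SQLite does not support ALTER TABLE ... ADD CONSTRAINT, so the constraint is declared at table creation here; the enforcement behaviour is the same:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: prices must be strictly positive
conn.execute("""
    CREATE TABLE Products (
        ProductID INTEGER PRIMARY KEY,
        Price REAL,
        CONSTRAINT chk_price CHECK (Price > 0)
    )
""")

conn.execute("INSERT INTO Products VALUES (1, 49.99)")  # passes the check

# A negative price violates the check constraint
try:
    conn.execute("INSERT INTO Products VALUES (2, -5.00)")
    negative_accepted = True
except sqlite3.IntegrityError:
    negative_accepted = False

print(negative_accepted)  # False
```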
Dropping Integrity Constraints:
If you want to remove an integrity constraint from a table, you can
use the ALTER TABLE statement with the DROP CONSTRAINT
clause. Make sure to replace constraint_name with the actual name of
the constraint.
ALTER TABLE table_name DROP CONSTRAINT constraint_name;
For example, if you want to drop a primary key constraint named
pk_constraint from a table named my_table, you would execute:
ALTER TABLE my_table DROP CONSTRAINT pk_constraint;
Similarly, you can drop other types of constraints by replacing
PRIMARY KEY, FOREIGN KEY, UNIQUE, or CHECK
accordingly with the appropriate constraint type.
In database management systems (DBMS), views and indexes serve
different purposes but both contribute to enhancing database
performance and managing data effectively.
View:
A view is a virtual table that is derived from one or more tables or
other views. It does not store any data on its own but is rather a saved
query that can be treated as a table. Views allow users to query and
manipulate data without directly accessing the underlying tables.
They provide a way to simplify complex queries, enforce security by
limiting access to specific columns or rows, and present data in a
customized format.
Creating a View:
CREATE VIEW view_name AS SELECT column1, column2, ...
FROM table_name WHERE condition;
Example:
CREATE VIEW customer_orders AS SELECT
customers.customer_id, customers.customer_name, orders.order_date
FROM customers INNER JOIN orders ON customers.customer_id =
orders.customer_id;
Using a View:
Once created, you can query a view like a table.
SELECT * FROM view_name;
Modifying or Dropping a View:
Views can be modified or dropped using the ALTER VIEW and
DROP VIEW statements respectively.
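Because a view stores the query rather than the data, later changes to the base table are visible through it automatically. A minimal sketch, assuming SQLite via Python's sqlite3 (table and column names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, total REAL)")
conn.execute("INSERT INTO orders VALUES (1, 150.0)")

# The view is a saved query, not a copy of the rows
conn.execute("CREATE VIEW big_orders AS SELECT * FROM orders WHERE total > 100")
assert conn.execute("SELECT COUNT(*) FROM big_orders").fetchone()[0] == 1

# A new row in the base table shows up in the view without any refresh
conn.execute("INSERT INTO orders VALUES (2, 200.0)")
count = conn.execute("SELECT COUNT(*) FROM big_orders").fetchone()[0]
print(count)  # 2
```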
Index:
An index is a data structure that improves the speed of data retrieval
operations on a database table at the cost of additional space and
slower writes. It consists of keys built from one or more columns in
the table, along with pointers to the actual rows in the table. Indexes
are used to quickly locate data without having to scan every row in a
table, especially when querying large datasets.
Creating an Index:
CREATE INDEX index_name ON table_name (column1, column2,
...);
Example:
CREATE INDEX idx_customer_name ON customers
(customer_name);
Using an Index:
Indexes are automatically used by the database optimizer when
executing queries that involve the indexed columns. They speed up
SELECT, JOIN, and WHERE clauses that reference the indexed
columns.
Modifying or Dropping an Index:
Indexes can be modified or dropped using the ALTER INDEX and
DROP INDEX statements respectively.
Types of Indexes:
Single-Column Index: An index built on a single column.
Composite Index: An index built on multiple columns.
Unique Index: Ensures that all values in the indexed column(s) are
unique.
Clustered Index: A type of index where the physical order of rows in
the table is the same as the indexed order.
Non-clustered Index: A type of index where the physical order of
rows in the table is not the same as the indexed order.
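Whether the optimizer actually picks up an index can be checked with the engine's query-plan facility. A sketch assuming SQLite via Python's sqlite3 — EXPLAIN QUERY PLAN is SQLite-specific; other systems expose EXPLAIN or a similar command:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, customer_name TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(i, f"name{i}") for i in range(1000)],
)

conn.execute("CREATE INDEX idx_customer_name ON customers (customer_name)")

# Ask the optimizer how it would run a lookup on the indexed column
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT * FROM customers WHERE customer_name = 'name500'"
).fetchall()
print(plan)
# The plan row mentions idx_customer_name: the lookup uses the index
# instead of scanning every row.
```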
Both views and indexes are important tools in database management,
each serving its own purpose in optimizing data access and
manipulation.
