Handout 1 - DBMS Concepts
Handout 1 - DBMS Concepts
doc
Haramaya University
College of Computing and Informatics
RECOMMENDED TEXTS:
1 What is a Database
A database is a collection of organized data used by the applications/systems of an organization.
A database consists of:
Data
Relationships between the data
Constraints on the data – or restrictions e.g. Quantity must be greater than 0
A schema – describes all the objects in the database (e.g. a table has columns, a column has
a data-type and a size)
So, in other words, a database is a self-describing collection of data elements, and relationships
between those elements.
2 What is a DBMS?
A DBMS (Database Management System) is a collection of interrelated data and a set of programs
to access that data.
A DBMS provides an environment that allows storage and retrieval of data & information, and
provides ways of carrying out database administration tasks.
The various functions provided by a DBMS are listed below (parentheses indicate what we will
cover in this course that will address this issue):
Shared update – to support concurrency i.e. when more than 1 user are updating the database at the
same time (transactions – will be covered in this course)
Recovery – if the db is damaged, need to be able to restore a working copy. A DBMS provides
backup and restore functions. It is usually possible to schedule a backup to occur on a regular basis
e.g. every night or every 4 hours (will look at how to backup & restore a database)
Security – access restricted to authorised users; users assigned permissions to carry out certain
actions (e.g. to update or delete data); usually password-protected access. Data can also be
encrypted for further protection (will look at how to create logins & users)
Integrity – mechanisms to ensure data integrity and referential integrity. Data types, formats, check
constraints and key constraints all used for this (will learn about data types & constraints)
Data independence – the manipulation of the data is independent of where the data is physically
stored – in other words, data manipulation works with logical view of the data and the process that
is manipulating the data does not need to know where or how the data is stored (will be learning
how to use SQL)
Utility services – provides ways to import & export data, query the data etc (will be using tools
such as Query Analyzer)
3 Database Models
You looked at a number of data models in the DBMS course – a data model defines a structure
for storing and manipulating data in a database. The relational model is the data model most in use
today.
Example
Tables:
Department (Department Number, Name, Manager)
Employee (Employee Number, Name, Salary, Department Number)
Relationships:
Department has many employees
Department has 1 Manager
Page 2 of 8
CCI – Computer Science Dept 655445324.doc
The E-R (Entity-Relationship) model is often used to design databases. The design is then translated
into the Relational Model for implementation.
Department
Manager
Models that have emerged since the relational model include the following:
Example
Department class – has attributes Department Number, Name, Manager
Employee class – has attributes Employee Number, Name, Salary
4 RDBMS
In this course, we will be working with databases that follow the relational model, and we will be
using one or more DBMS systems to work with those databases.
The RDBMS (Relational Database Management System) is now a core part of many information
systems, particularly e-Commerce/internet applications and client-server systems.
We will also be learning how to use SQL (Structured Query Language) to store, manipulate and
retrieve data in an RDBMS.
In a relational DBMS, the DDL and DML are unified into SQL (Structured Query Language). SQL
is the generic language for querying relational databases.
The standards for SQL are defined by ANSI (American National Standards Institute) and are also
supported by the ISO (International Standards Organization).
The current standard is called SQL-92 (as it was updated in 1992). There is also a newer set of
standards called SQL-99, but this has not yet been implemented in many DBMS packages.
Any RDBMS package should conform to the SQL-92 standards; future releases will conform to the
SQL-99 standards.
We will be making use of Microsoft SQL Server 2000, which is SQL-92 compliant.
Other RDBMS packages available are Oracle (current version is 9i), MySQL (an open source
package) and PostgreSQL.
Some makers of RDBMS packages implement their own additions to SQL, to support their own
product and its features. Some are as follows:
This is different to other database manipulation languages (e.g. Xbase or VB) – which use a row
processing or procedural programming style.
Page 4 of 8
CCI – Computer Science Dept 655445324.doc
In these DMLs, the programmer must tell the program exactly what to do with the data, on
record/row at a time:
Program loops through list of records
Each record processed one at a time, or logic applied to each record, one at a time.
SQL uses set processing – in a SQL query, data is always selected FROM a table.
The data is selected from a set called a table.
The programmer just has to tell the database what is required; not how to process each
individual row in the set.
This is sometimes called declarative programming – because the programmer declares what data is
necessary – e.g. “Give me all employees in the Computer Science Department”.
Criteria can be specified to narrow the data down to a more specific set e.g. “Give me all employees
in the Computer Science Department who are not Ethiopian.”
1.7 Atomicity
In the relational model, data is shown logically as a two-dimensional table that describes a single
entity e.g. a Department.
A column describes a specific attribute of the entity e.g. the name of a Department.
A row describes a specific entity (or instance of the entity) e.g. the Computer Science Department.
The specific values of each attribute should contain one, and only one, value. This is known as
atomicity
If the table is constructed in such a way that the intersection of a row and column can contain more
than one value, then the model needs to be changed so that the columns can have only one value.
This occurs when there is a many-to-many relationship between two tables. See the section on
Many-to-Many relationships below (section 1.10) – the way to change the model is to break the
relationship into two one-to-many relationships.
1.8 Domain/Datatypes
The values in a column also have rules of behaviour – the values in a given column must share a
common domain – also called datatype.
For example, the DepartmentNumber field should only contain whole numbers (integers), no alpha-
characters.
Therefore, this field would be defined as having an integer datatype.
The values in a column can also be controlled by applying rules – for example, a SQL rule could
specify that the DepartmentNumber be greater than 0 and less than 1000.
Page 5 of 8
CCI – Computer Science Dept 655445324.doc
Let us look at some of these relationships. In this diagram (created using the SQL Server diagram
wizard), the key symbols indicate a ‘1’ side of a relationship.
Figure 1 – some of the tables in the Pubs database, showing the relationships between them
In the diagram above, there is a 1-to-many relationship between Publishers and Titles. Each
Publisher produces many Titles, but a Title can come from only one Publisher. The relationship is
between the primary key of publishers.pub_id and the foreign key titles.pub_id.
A 1-to-many relationship is created when only one of the related columns is a primary key or has
unique values.
1.10 Many-to-many
In a many-to-many relationship between table A and table B, a row in table A can have many
matching rows in table B and vice verse.
In the pubs example, a Title can have many Authors. An Author can also write many Titles. So this
is a many-to-many relationship.
This type of relationship is modelled by introducing a third table between the related tables. The
primary keys from the related tables become foreign keys in the new table.
Page 6 of 8
CCI – Computer Science Dept 655445324.doc
We then have a 1-to-many relationship between each of the related tables and the third table. The
third table is called a junction or intersection table.
In this example, the junction table is titleauthor. It forms a link between the titles and authors
tables.
The primary key of the junction table is the combination of both the foreign keys – au_id and
title_id in this case.
1.11 1-to-1
In a 1-to-1 relationship between table A and table B, a row in table A can have no more than one
matching row in table B and vice verse.
In the above tables, there is a 1-to-1 relationship between Publishers.pub_id and pub_info.pub_id –
because a Publisher has one set of publisher information (pub_info). Conversely, a pub-info record
is for only one corresponding Publisher.
A 1-to-1 relationship is created if the column values are unique in each table – which means they
are primary keys or have a uniqueness constraint on them.
Consider this relationship – it would also be possible to store the logo and pr_info fields in the
publishers table.
Because this is often the case, 1-to-1 relationships are not used very frequently.
However, they can be used in the following scenarios:
Where the data applies to only some of the records in one of the tables e.g. if only some Publishers
have a logo and associated pr_info.
Where it may be necessary to be able to quickly and easily delete the extra data about the entity in
one of the tables e.g. the pr_info table could be deleted if the pr_info data was no longer required.
This is easier than deleting columns from the Publishers table.
9. Batch and end-user operations can change the database schema without having to
recreate it or the applications built upon it.
10. Integrity constraints must be available and stored in the RDB metadata, not in an
application program.
11. The DML of the relational system should not care where or how the physical data is
distributed and should not require alteration if the physical data is centralized or
distributed.
12. Any row processing done in the system must obey the same integrity rules and
constraints that set-processing operations do.
Page 8 of 8