[go: up one dir, main page]

0% found this document useful (0 votes)
126 views8 pages

Handout 1 - DBMS Concepts

This document provides an overview of database concepts and the relational database model. It discusses what a database and DBMS are, database models including relational and object-oriented, and key aspects of relational databases including SQL standards, sets and rows. The document is an introductory handout for a course on database design and management using SQL Server.

Uploaded by

Roha Cbc
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOC, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
126 views8 pages

Handout 1 - DBMS Concepts

This document provides an overview of database concepts and the relational database model. It discusses what a database and DBMS are, database models including relational and object-oriented, and key aspects of relational databases including SQL standards, sets and rows. The document is an introductory handout for a course on database design and management using SQL Server.

Uploaded by

Roha Cbc
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOC, PDF, TXT or read online on Scribd
You are on page 1/ 8

CCI – Computer Science Dept 655445324.

doc

Haramaya University
College of Computing and Informatics

Computer Science Department

Comp 221: Database with SQL Server

Handout 1 – DBMS Concepts

RECOMMENDED TEXTS:

• “SQL Server 2000 – The Complete Reference”


Jeffrey R. Shapiro,
Osborne McGraw Hill

1 What is a Database
A database is a collection of organized data used by the applications/systems of an organization.
A database consists of:
 Data
 Relationships between the data
 Constraints on the data – or restrictions e.g. Quantity must be greater than 0
 A schema – describes all the objects in the database (e.g. a table has columns, a column has
a data-type and a size)

So, in other words, a database is a self-describing collection of data elements, and relationships
between those elements.

2 What is a DBMS?
A DBMS (Database Management System) is a collection of interrelated data and a set of programs
to access that data.
A DBMS provides an environment that allows storage and retrieval of data & information, and
provides ways of carrying out database administration tasks.

The various functions provided by a DBMS are listed below (parentheses indicate what we will
cover in this course that will address this issue):

Storage & retrieval – can be done independent of internal structures of the db


Catalog – describes all the data items stored in the db, which are accessible to users – includes data
definitions e.g. for a column, what is the data-type and what is the size (we will look at SQL Server
system objects, which store catalog information)
Page 1 of 8
CCI – Computer Science Dept 655445324.doc

Shared update – to support concurrency i.e. when more than 1 user are updating the database at the
same time (transactions – will be covered in this course)
Recovery – if the db is damaged, need to be able to restore a working copy. A DBMS provides
backup and restore functions. It is usually possible to schedule a backup to occur on a regular basis
e.g. every night or every 4 hours (will look at how to backup & restore a database)
Security – access restricted to authorised users; users assigned permissions to carry out certain
actions (e.g. to update or delete data); usually password-protected access. Data can also be
encrypted for further protection (will look at how to create logins & users)
Integrity – mechanisms to ensure data integrity and referential integrity. Data types, formats, check
constraints and key constraints all used for this (will learn about data types & constraints)
Data independence – the manipulation of the data is independent of where the data is physically
stored – in other words, data manipulation works with logical view of the data and the process that
is manipulating the data does not need to know where or how the data is stored (will be learning
how to use SQL)
Utility services – provides ways to import & export data, query the data etc (will be using tools
such as Query Analyzer)

3 Database Models
You looked at a number of data models in the DBMS course – a data model defines a structure
for storing and manipulating data in a database. The relational model is the data model most in use
today.

1.1 Relational Model


 As the name implies, a database based on the relational model is collection of relations.
 A relation is a table; a table has attributes (columns) and rows (records).
 A relation corresponds to an entity in the application e.g. a Department, an Employee. Each
row in the table represents an instance of the entity.
 Each row is uniquely identified by one or more of the attribute values in it – this is called the
primary key.
 Each column in a table has a unique name.
 A column has a defined data type e.g. integer, character
 Relationships link rows in different tables, based on key values e.g. a Department has
Employees.
 Relationships can be 1-to-1, 1-to-many or many-to-many.
 Tables can also be used to represent relationships (e.g. a many-to-many relationship).
 SQL (Structured Query Language) provides a
way of manipulating tables, using the relationships between them, and of storing and
retrieving data from the database

Example
Tables:
Department (Department Number, Name, Manager)
Employee (Employee Number, Name, Salary, Department Number)
Relationships:
Department has many employees
Department has 1 Manager

Page 2 of 8
CCI – Computer Science Dept 655445324.doc

The E-R (Entity-Relationship) model is often used to design databases. The design is then translated
into the Relational Model for implementation.

The Relational Model was preceded by these models:

1.2 Network Model


 Records are organised into tables.
 Relationships are modelled by linking records together in chains e.g. Department ->
Employee1 -> Employee2 -> Employee3.
 Data is stored in files that contain records.

1.3 Hierarchical Model


 The hierarchical model follows a tree structure.
 Data is stored in files that contain records.

Department

Manager

Employee 1 Employee 2 Employee 3

Models that have emerged since the relational model include the following:

1.4 Object-oriented Model


 The object-oriented data model has emerged as OO has gained popularity as a design
technique.
 This model embodies concepts similar to those of OO programming – the database is seen
as a collection of objects; objects that have similar properties are grouped into a class
 So, an entity is represented as a class.
 A class has attributes and behaviour.
 The value of an attribute can be an object that is an instance of another class in the model
 An object is an instance of class.
 Subclasses inherits attributes from the parent class.

Example
Department class – has attributes Department Number, Name, Manager
Employee class – has attributes Employee Number, Name, Salary

An instance of the Department class has one or more Employee members.


Manager is a sub-class of Employee – it inherits the attributes of an Employee.
Page 3 of 8
CCI – Computer Science Dept 655445324.doc

4 RDBMS
In this course, we will be working with databases that follow the relational model, and we will be
using one or more DBMS systems to work with those databases.
The RDBMS (Relational Database Management System) is now a core part of many information
systems, particularly e-Commerce/internet applications and client-server systems.
We will also be learning how to use SQL (Structured Query Language) to store, manipulate and
retrieve data in an RDBMS.

This section looks at some of the aspects of relational databases.

1.5 SQL – Standards


Any DBMS should have a language for defining the structures in the database – this is called the
DDL (Data Definition Language). This is used by the database designer/administrator to build the
database.
It should also have a language for manipulating the data in the database – this is called the DML
(Data Manipulation Language). This language is used by users of the database to insert, retrieve,
modify and delete data in the database.

In a relational DBMS, the DDL and DML are unified into SQL (Structured Query Language). SQL
is the generic language for querying relational databases.

The standards for SQL are defined by ANSI (American National Standards Institute) and are also
supported by the ISO (International Standards Organization).
The current standard is called SQL-92 (as it was updated in 1992). There is also a newer set of
standards called SQL-99, but this has not yet been implemented in many DBMS packages.

Any RDBMS package should conform to the SQL-92 standards; future releases will conform to the
SQL-99 standards.

We will be making use of Microsoft SQL Server 2000, which is SQL-92 compliant.

Other RDBMS packages available are Oracle (current version is 9i), MySQL (an open source
package) and PostgreSQL.

Some makers of RDBMS packages implement their own additions to SQL, to support their own
product and its features. Some are as follows:

Microsoft SQL Server has T-SQL – Transact-SQL


Oracle has PL/SQL – Procedural Language/SQL
PostgreSQL has PL/pgSQL – Procedural Language/postgreSQL

1.6 Sets & Rows


In the relational data model, SQL programs operate on logical sets of data.

This is different to other database manipulation languages (e.g. Xbase or VB) – which use a row
processing or procedural programming style.
Page 4 of 8
CCI – Computer Science Dept 655445324.doc

In these DMLs, the programmer must tell the program exactly what to do with the data, on
record/row at a time:
Program loops through list of records
Each record processed one at a time, or logic applied to each record, one at a time.

SQL uses set processing – in a SQL query, data is always selected FROM a table.
The data is selected from a set called a table.
 The programmer just has to tell the database what is required; not how to process each
individual row in the set.
This is sometimes called declarative programming – because the programmer declares what data is
necessary – e.g. “Give me all employees in the Computer Science Department”.

Criteria can be specified to narrow the data down to a more specific set e.g. “Give me all employees
in the Computer Science Department who are not Ethiopian.”

1.7 Atomicity
In the relational model, data is shown logically as a two-dimensional table that describes a single
entity e.g. a Department.

Data is displayed in columns and rows.

A column describes a specific attribute of the entity e.g. the name of a Department.
A row describes a specific entity (or instance of the entity) e.g. the Computer Science Department.

The specific values of each attribute should contain one, and only one, value. This is known as
atomicity
If the table is constructed in such a way that the intersection of a row and column can contain more
than one value, then the model needs to be changed so that the columns can have only one value.
This occurs when there is a many-to-many relationship between two tables. See the section on
Many-to-Many relationships below (section 1.10) – the way to change the model is to break the
relationship into two one-to-many relationships.

1.8 Domain/Datatypes
The values in a column also have rules of behaviour – the values in a given column must share a
common domain – also called datatype.
For example, the DepartmentNumber field should only contain whole numbers (integers), no alpha-
characters.
Therefore, this field would be defined as having an integer datatype.

The values in a column can also be controlled by applying rules – for example, a SQL rule could
specify that the DepartmentNumber be greater than 0 and less than 1000.

5 Relationships between Relational Tables

Relationships link the rows in different tables, based on key values.


Consider some of the tables in the Pubs database, as shown in Figure 1 below (this is one of the
sample databases on the SQL Server 2000 server).

Page 5 of 8
CCI – Computer Science Dept 655445324.doc

Let us look at some of these relationships. In this diagram (created using the SQL Server diagram
wizard), the key symbols indicate a ‘1’ side of a relationship.

Figure 1 – some of the tables in the Pubs database, showing the relationships between them

1.9 5.1 1-to-many


In a 1-to-many relationship between table A and table B, a row in table A can have many matching
rows in table B. A row in table B can have only one matching row in table A.

In the diagram above, there is a 1-to-many relationship between Publishers and Titles. Each
Publisher produces many Titles, but a Title can come from only one Publisher. The relationship is
between the primary key of publishers.pub_id and the foreign key titles.pub_id.
A 1-to-many relationship is created when only one of the related columns is a primary key or has
unique values.

This type of relationship is most commonly used in relational databases.

1.10 Many-to-many
In a many-to-many relationship between table A and table B, a row in table A can have many
matching rows in table B and vice verse.
In the pubs example, a Title can have many Authors. An Author can also write many Titles. So this
is a many-to-many relationship.
This type of relationship is modelled by introducing a third table between the related tables. The
primary keys from the related tables become foreign keys in the new table.

Page 6 of 8
CCI – Computer Science Dept 655445324.doc

We then have a 1-to-many relationship between each of the related tables and the third table. The
third table is called a junction or intersection table.
In this example, the junction table is titleauthor. It forms a link between the titles and authors
tables.
The primary key of the junction table is the combination of both the foreign keys – au_id and
title_id in this case.

1.11 1-to-1
In a 1-to-1 relationship between table A and table B, a row in table A can have no more than one
matching row in table B and vice verse.
In the above tables, there is a 1-to-1 relationship between Publishers.pub_id and pub_info.pub_id –
because a Publisher has one set of publisher information (pub_info). Conversely, a pub-info record
is for only one corresponding Publisher.
A 1-to-1 relationship is created if the column values are unique in each table – which means they
are primary keys or have a uniqueness constraint on them.

Consider this relationship – it would also be possible to store the logo and pr_info fields in the
publishers table.
Because this is often the case, 1-to-1 relationships are not used very frequently.
However, they can be used in the following scenarios:
Where the data applies to only some of the records in one of the tables e.g. if only some Publishers
have a logo and associated pr_info.
Where it may be necessary to be able to quickly and easily delete the extra data about the entity in
one of the tables e.g. the pr_info table could be deleted if the pr_info data was no longer required.
This is easier than deleting columns from the Publishers table.

6 Codd’s Rules for a Truly Relational Database System

E.F. Codd was one of the originators of the Relational Model.


He developed Twelve Principles of Relational Databases. If a DBMS package does not meet all of
these criteria, then it is not fully relational.
These rules do not apply to applications development, but they do determine if the database engine
is truly relational.
A good understanding of these principles helps a database programmer in the proper development
and design of relational databases (RBDs).

1. Information is represented logically in tables.


2. Data must be logically accessible by table, primary key, and column.
3. Null values must be uniformly treated as ‘missing information’, not as empty strings,
blanks or zeroes.
4. Metadata (data about the database) must be stored in the database just as regular data is.
5. A single language must be able to define data, views, integrity constraints, authorization,
transactions and data manipulation (DDL – Data Definition Language, DML – Data
Manipulation Language).
6. Views must show the updates of their base tables and vice versa.
7. A single operation must be able to retrieve, insert, update or delete data.
8. Batch and end-user operations are logically separate from physical storage and access
methods.
Page 7 of 8
CCI – Computer Science Dept 655445324.doc

9. Batch and end-user operations can change the database schema without having to
recreate it or the applications built upon it.
10. Integrity constraints must be available and stored in the RDB metadata, not in an
application program.
11. The DML of the relational system should not care where or how the physical data is
distributed and should not require alteration if the physical data is centralized or
distributed.
12. Any row processing done in the system must obey the same integrity rules and
constraints that set-processing operations do.

Notes prepared by: Terri O'Sullivan, FBE Computer Science Department.

Page 8 of 8

You might also like