Uc 9
Uc 9
Uc 9
Learning outcomes:
LO1 : Identify entities and relationships
1
LO1 Identify Entities and Relationships
1.1 Understanding and analyzing business operations
Entity Types, Entity Sets, Attributes and Keys
ER Model Overview
ER model describes data in terms of:
Entities and entity sets
Objects
Relationships and relationship sets
Connections between objects
Attributes
Properties that characterize or describe entities or relationships
2
Categories of Attributes
Simple/Atomic (one value) or composite (divided into sub-parts) e.g. Name = surname +
first_name
Single- or multi-valued
Single-valued: one value for a particular entity
Multi-valued: a set of values possible for 1 entity
E.g. Student has a number of telephone-numbers
Optional attributes: unknown, not applicable or missing values are possible for this
attribute
Stored or Derived: values of attributes can be derived from other attributes (not stored;
computed when required)
E.g. net pay (gross – tax), # employees on a project
Relationship Sets
Associations between entity sets which express some real world relationship
Represented as diamonds in an E-R diagram
3
The function that an entity plays in a relationship is called that entity’s role
Relationships
Can have more than one
relationship between entities.
Mapping Cardinalities
Express the number of entities to which another entity can be associated via a relationship. We
will use this
convention:
– one to one
– One to many
– Many to one
– Many to many
Cardinality Examples
• A person has only one ID book and an
ID book belongs to only 1 person
4
• Entity set E has total participation in relationship set R: every entity in E participates in at least
one relationship in R
E.g. an ID book must have an owner
E.g. a house must have an owner
• Partial participation: only some entities in E participate in relationships in R
E.g. not every person owns a house
• We will use a double-line to show total participation and a single-line to show partial
participation (NB there are many styles of ER notation, but the ideas are the same).
Cardinality Examples
A person has only one ID book and an
ID book belongs to only 1 person
Keys
• A super key of an entity set is one or more attributes whose values uniquely identify a
particular entity.
• A candidate key is a minimal super key (i.e. no subset of the candidate key is a super key).
• A primary key: the candidate key selected by the database designer to uniquely identify
entities (chosen carefully such that attribute values rarely change).
• Keys of relationship sets are formed from the primary keys of participating entities (cardinality
must be considered).
5
1.2 Identify the Scope of the System
1. Database
What is DB Scope?
DB Scope is a tool for developers, used to view and edit databases, application information blocks, and
saved and unsaved preferences.
What's so special about DB Scope?
DB Scope uses the concept of "layouts" for your application's databases. You create a layout that
describes the format of each of your databases, using the 14 data types supported by DB Scope.
Now about the scope; well if you have good database then
Easy to locate the data or information in no time.
No redundant data.
No repetition.
More security. Like if one is accessing or changing the data other cannot change the same
data at that time.
Table references (keys like : Primary and foreign keys) are easy to maintain.
Example:- Define Scope of the Database Project
Identify which organizational subdivisions will be served by the database
Define which functions within these organizations will utilize the database
Identify which existing and planned applications will be converted to the database system
Prepare proposal for management and obtain go-ahead
6
2. Application
To determine the boundaries of an application system, it is important to examine the application
system from both functional and technical perspectives. A single application system will
normally have the following characteristics:
Consistent user interface design with and application behavior
Common architecture (e.g. program language, program design)
Shared application system components (e.g. visual objects, programs, database tables)
When undertaking an analysis of a computer systems, it is important to verify the presumed
scope of the application system and to identify all interface or integration points it has with other
application systems.
The scope of the application system affects the systems analysis effort. Scope is also significant
when formulating system maintenance, enhancement or replacement options.
Since system size is a measure of the magnitude of all components of a system that are within the
current scope, the system scope should be documented in the project plan before the system size
is estimated. The scope statement defines what the project will and will not include, in enough
detail to clearly communicate to all participants.
The external business requirements are generally the most obvious requirements and for which
the definition of scope is the easiest.
The system design may imply requirements that are not specified. For example, the design of a
client/server system may have the need for a fire wall between data moving in and out of the
environment. This may add the need for user exits or other components integrated with the data
propagation software.
7
Other components are often implied but not clearly defined, such as, performance, interfaces,
operations and implementation. These components should be included if they are within the
scope of the system being sized. If there is any question regarding whether something is
included, it should be assumed to be within the scope of the sizing (refer to Document
Assumptions) until the system scope specifically excludes it.
In addition to the scope, it is important that the system boundaries are clearly understood before
the system size is estimated. The boundaries identify where the system to be sized starts and
ends. The sizing should include everything for which the team is responsible. Items or areas
that will be excluded should be clearly stated.
There are two primary reasons to exclude something from the sizing:
The component is another team's responsibility. For example, a business to business system may
provide a standard format for suppliers to interface with their systems. The interface
components that provide the standard interface would be in scope, but the supplier interfaces
may be excluded.
The component is assumed to be already implemented. For example, if the system will use
standard reusable components, such as standard date routines or file access routines that will not
be modified, then these may be excluded from the scope while the new interfaces to these
routines may be in scope.
1.3 Reviewing business rules to determine impact
Over view of business rules
Use business rules to control the behavior of a business practice.
Once you've established the Business Rules you believe to be appropriate, review their
specification sheets. Carefully examine the specification sheet and make certain that the rule has
been properly established and that all the appropriate areas on the sheet are clearly marked. If
you find an error, make the necessary modifications and review it once more. Repeat this process
until you've reviewed every Business Rule.
Business Rules are an important component of the database. Along with contributing to overall
data integrity, Business Rules impose integrity constraints that are specific to the organization.
As you've seen, these rules help to ensure the validity and consistency of the data within the
8
context of the manner in which the organization functions or conducts its business. Furthermore,
these rules will affect the manner in which the database is implemented in an RDBMS and how
it works with the application program used to work with the database.
1.4 Relationships
A relationship works by matching data in key columns — usually columns with the same name
in both tables. In most cases, the relationship matches the primary key from one table, which
provides a unique identifier for each row, with an entry in the foreign key in the other table. For
example, book sales can be associated with the specific titles sold by creating a relationship
between the title_id column in the titles table (the primary key) and the title_id column in
the sales table (the foreign key).
There are three types of relationships between tables. The type of relationship that is created
depends on how the related columns are defined.
One-to-Many Relationships
A one-to-many relationship is the most common type of relationship. In this type of relationship,
a row in table A can have many matching rows in table B, but a row in table B can have only one
matching row in table A. For example, the publishers and titles tables have a one-to-many
relationship: each publisher produces many titles, but each title comes from only one publisher.
Make a one-to-many relationship if only one of the related columns is a primary key or has a
unique constraint.
The primary key side of a one-to-many relationship is denoted by a key symbol. The foreign key
side of a relationship is denoted by an infinity symbol.
Many-to-Many Relationships
In a many-to-many relationship, a row in table A can have many matching rows in table B, and
vice versa. You create such a relationship by defining a third table, called a junction table, whose
primary key consists of the foreign keys from both table A and table B. For example,
the authors table and the titles table have a many-to-many relationship that is defined by a one-
to-many relationship from each of these tables to the titleauthors table. The primary key of
9
the titleauthors table is the combination of the au_id column (the authors table's primary key)
and the title_id column (the titles table's primary key).
One-to-One Relationships
In a one-to-one relationship, a row in table A can have no more than one matching row in table
B, and vice versa. A one-to-one relationship is created if both of the related columns are primary
keys or have unique constraints.
This type of relationship is not common because most information related in this way would be
all in one table. You might use a one-to-one relationship to:
Divide a table with many columns.
Isolate part of a table for security reasons.
Store data that is short-lived and could be easily deleted by simply deleting the table.
Store information that applies only to a subset of the main table.
The primary key side of a one-to-one relationship is denoted by a key symbol. The foreign key
side is also denoted by a key symbol.
Business Rules
Business rules are used to describe the properties of an application, e.g., the fact that an employee
cannot earn more than his or her manager.
A business rule can be:
the description of a concept relevant to the application (also known as a business object),
an integrity constraint on the data of the application,
a derivation rule, whereby information can be derived from other information within a
schema.
Documentation Techniques
Descriptive business rules can be organized as a data dictionary. This is made up of two tables:
the first describes the entities of the schema, the others describes the relationships.
Business rules that describe constraints can be expressed in the following form:
<concept> must/must not <expression on concepts>
Business rules that describe derivations can be expressed in the following form:
<concept> is obtained by <operations on concepts>
10
11
LO2 Develop Normalization
2.1 Introduction to Normalization
Normalization: Definitions
Normalization is a method used to validate and improve a logical design so that it satisfies
certain constraints that avoid unnecessary duplication of data. The process of decomposing
relations with anomalies to produces smaller, well-structured relations.
Well-Structured Relations
A relation that contains minimal data redundancy and allows users to insert, delete, and update
rows without causing data inconsistencies.
Modification Anomaly–changing data in a row forces changes to other changing data in a row
forces changes to other rows because of duplication
12
Why do these anomalies exist?
Because there are two themes (entity types) in this one relation. This results in data
duplication and an unnecessary dependency between the entities.
13
--- Example of Bad Design
If any anomalies are present, note them clearly and make sure that the programs that update the
database will operate correctly. Due to improper grouping of attributes into a relation schema,
the following problems are encountered.
Storage wastage +
Insert anomalies +
Delete anomalies +
Modification anomalies +
14
Storage Wastage
Insert Anomalies
Delete Anomalies
15
Modification Anomalies
16
The relations should be designed to satisfy the lossless join condition. No spurious tuples
should be generated by doing a natural-join of any relations
Design relational schemas so that they can be joined with equality conditions of attributes that
are easier primary keys or foreign keys in a way that guarantees that no spurious tuples are
generated. Do not have relations that contain matching attributes other than foreign key-primary
key combination. If such relations are unavoidable, do not join them on such attributes, because
the join may produce spurious tuples.
17
Functional dependencies (FDs) are used to specify formal measures of the "goodness" of
relational designs
FDs and keys are used to define normal forms for relations
FDs are constraints that are derived from the meaning and interrelationships of the data
attributes
A particular relationship between two attributes. For a given relation, attribute B is functionally
dependent on attribute A if, for every valid value of A, that value of A uniquely determines the
value of B. or
A set of attributes X functionally determines a set of attributes Y if the value of X
determines a unique value for Y
X àY holds if whenever two tuples have the same value for X, they must have the same
value for Y
If t1[X]=t2[X], then t1[Y]=t2[Y] in any relation instance r(R)
X à Y in R specifies a constraint on all relation instances r(R)
FD Constraints
FDs are derived from the real-world constraints on the attributes
An FD is a property of the attributes in the schema R
The constraint must hold on every relation instance r(R)
If K is a key of R, then K functionally determines all attributes in R (since we never have
two distinct tuples with t1[K]=t2[K])
Examples of FD constraints
Social Security Number determines employee name
SSN à ENAME
Project Number determines project name and location
PNUMBER à {PNAME, PLOCATION}
Employee SSN and project number determines the hours per week that the employee
works on the project
{SSN, PNUMBER} à HOURS
Inference Rules for FDs
Given a set of FDs F, we can infer additional FDs that hold whenever the FDs in F hold
Armstrong's inference rules
A1. (Reflexive) If Y subset-of X, then X à Y
A2. (Augmentation) If X à Y, then XZ à YZ
(Notation: XZ stands for X U Z)
A3. (Transitive) If X à Y and Y à Z, then X à Z
A1, A2, A3 form a sound and complete set of inference rules
Additional Useful Inference Rules
Decomposition
If X à YZ, then X à Y and X à Z
18
Union
If X à Y and X à Z, then X à YZ
Psuedotransitivity
If X à Y and WY à Z, then WX à Z
19
2.5 General definition of second and third normal forms
Second Normal Form
A relation schema R is in second normal form (2NF) if every non-prime attribute A in R
is fully functionally dependent on the primary key
A functional dependency X->Y is a partial dependency if some attribute A belong X can
be removed from X and the dependency still holds
Uses the concepts of FDs, primary key
Definitions:
Prime attribute - attribute that is member of the primary key K
Full functional dependency - a FD Y à Z where removal of any attribute from Y
means the FD does not hold any more
{SSN, PNUMBER} à HOURS is a full FD since neither SSN à HOURS nor
PNUMBER à HOURS hold
{SSN, PNUMBER} à ENAME is not a full FD (it is called a partial dependency ) since
SSN à ENAME also holds
20
A relation schema R is in second normal form (2NF) if every non-prime attribute A in R
is fully functionally dependent on the primary key
R can be decomposed into 2NF relations via the process of 2NF normalization
21
Every BCNF relation is in 3NF
There exist relations that are in 3NF but not in BCNF
The goal is to have each relation in BCNF (or 3NF)
Rule for schema not in BCNF
Let R be a schema not in BCNF, then there is at least one nontrivial functional
dependency a®b such that a is not a superkey
Example of not BCNF
bor_loan = (customer_id, loan_number, amount)
loan_numberàamount
but loan_number is not a superkey
If a relation is not in BCNF it can be decomposed to create relations that are in BCNF
borrower = (customer_id, load_number)
Is BCNF because no nontrivial functional dependency hold onto it
loan = (loan_number, amount)
Has one nontrivial functional dependency that holds, loan_numberàamount, but loan_number
is a superkey so loan is in BCNF
22
LO3 Validate Data Model using Normalization
3.1 Validating the Data Model
Even when you have completed your data model it may not completely reflect the system being
developed. You need to review your business rules once more to see if you have created an
accurate picture.
Look at the example data model answer for the Painting Hire system. The normalization process
resulted in an entity called Portfolio, with a key of Artist No, Painting No). This would allow for
a painter to have painted several paintings. However it would also allow for the same painting to
be painted by several painters. But, a requirement of the system was 'each painting can only have
one artist associated with it'.
Therefore, the current data model needs reviewing to prevent multiple artists being associated
with the same painting.
This has been achieved in the revised data model. The entity Portfolio has been deleted and
replaced by a foreign key of Artist No in the Painting entity.
Should changes be requested after Aleberry receives a signed Final Design Approval form,
Client will be charged at an hourly rate.
By submitting this document the client declares satisfaction with the design and relinquishes the
right to any remaining uncharged design revisions of the project as it stands on:
After receiving this form, Aleberry Creative will contact you with any final design files.
23