M02 Modeling Data Object
M02 Modeling Data Object
M02 Modeling Data Object
November, 2023
Addis Ababa, Ethiopia
Ministry of Labor and Skills wish to extend thanks and appreciation to the many
representatives of TVET instructors and respective industry experts who donated their time and
expertise to the development of this Teaching, Training and Learning Materials (TTLM).
This unit is developed to provide you the necessary information regarding the following content
coverage and topics
Text Analysis is also referred to as Data Mining. It is one of the methods of data analysis to
discover a pattern in large data sets using databases or data mining tools. It used to transform raw
data into business information. Business Intelligence tools are present in the market which is
used to take strategic business decisions. Overall it offers a way to extract and examine data and
deriving patterns and finally interpretation of the data.
b) Statistical Analysis
Diagnostic Analysis shows "Why did it happen?" by finding the cause from the insight found in
Statistical Analysis. This Analysis is useful to identify behavior patterns of data. If a new
problem arrives in into the business process, then you can look into this Analysis to find similar
patterns of that problem. And it may have chances to use similar prescriptions for the new
problems.
d) Predictive Analysis
Predictive Analysis shows "what is likely to happen" by using previous data. The simplest data
analysis example is like if last year I bought two dresses based on my savings and if this year my
salary is increasing double then I can buy four dresses.
This Analysis makes predictions about future outcomes based on current or past data.
Forecasting is just an estimate. Its accuracy is based on how much detailed information you have
and how much you dig in it.
e) Prescriptive Analysis
The conceptual or scoping model defines the boundaries of the system (i.e., what is in scope and
what is out of scope). It identifies:
Events outside the system that cause the system to react,
Actors outside the system that interact with the system,
Information that flows between the system and the actors outside the system,
Major functions included in the system,
User population.
I. Clarify System Boundaries
In addition to the scope, it is important that the system boundaries are clearly understood. The
boundaries identify where the system to be sized starts and ends. The sizing should include
everything for which the team is responsible.
A scope of a system is identified based on the following
Databases
Applications
Servers
Example: The diagram below shows a conceptual model with three entity types: Book,
Publisher, andAuthor:
Customer: the set of all people having an account at the bank. Attributes are ID,
name, Gender, Gender and Phone-number.
One-to-many (1: N): is a hierarchical relationship created or viewed from the primary
entity. Any one entity instance from the primary entity can be referenced by many entity
instances from the related entity. One instance of an entity (A) is associated with zero,
one or many instances of another entity (B), but for one instance of entity B there is only
one instance of entity (A). Example, for a company with all employees working in one
building, the building name (A) is associated with many different employees (B), but
those employees all share the same singular association with entity
Many-to-many (N: N): A many-to-many relationship lets users relate one or more entity
instances from another entity to an entity instance of the current entity. A many-to-many
relationship is reciprocal. Therefore, entity instances can be related from either entity.
One instance of an entity (A) is associated with one, zero or many instances of another
entity (B), and one instance of entity B is associated with one, zero or many instances of
entity A. For example, for a company in which all of its employees work on multiple
projects, each instance of an employee (A) is associated with many instances of a project
(B), and at the same time, each instance of a project (B) has multiple employees (A)
associated with it.
4. ER Diagrams Usage
ER is able to describe just about any system, ER diagrams are most often associated with
complex databases that are used in software engineering and IT networks.
In particular, ER diagrams are frequently used during the design stage of a development process
in order to identify different system elements and their relationships with each other. For
example: an inventory software used in a retail shop will have a database that monitors elements
Secti
represented by oval shapes. (For example: a student entity may have attributes such as
Name, Roll no and Age)
Multi-valued Attribute: If an attribute can have more than one value. It is important to
note that this is different to an attribute having its own attributes. For example, a teacher
entity can have multiple subject values.
a) b)
c) d)
6. The following can be a one-to-one relation ship
a) Relation between Department and employees
b) Relation between a president and a country
c) Relation between Employee and project
d) Relation between Employees and department
7. Which one of the following is an example of One-to-Many relationship?
a) Student – to – course c) Student – to – Instructor
b) Department – to – Student d) Husband – to – Wife
8. In ERD, entities are represented by
Database schema
Employee Table
Column name Data type Size Constraints Key
Empid Char 10 Not null Primary key
FullName Char 50
Sex Char 10
Address Char 5
To change the size of data type of employee “FullName” from default one (10) to 50
please
Click on Data type of required data (Full Name)Editselect data type Length then
write number of sizeok
Step 2: Consider or identify the attributes that you need to describe each entity.draw them inside
ovals. Connect these to the relevant entity and position your attributes to the outside of your
diagram, which leaves room for relationships.
Step 3: Think through the relationships or verbs taking place within the system. customer
purchases the phone. The cell service maintains the phone. The cell service creates a bill. The
customer pays the bill.
Lap Test
Task 2: Hospital database with a set of patients and a set of medical doctors.
Task 3: Simple library management system.
Task 4: Simple car rental system.
This unit will also assist you to attain the learning outcomes stated in the cover page.
Specifically, upon completion of this learning guide, you will be able to:
Understand and analyze business data
Understandrules of normalization
Understand benefits of normalization
Contrast normalization results with ER diagram
Understand how to conform differences between data
The process of normalization is a formal method that identifies relations based on their
primary key.Primarily it is a tool to validate and improve a logical design so that it satisfies
certain constraints that avoid unnecessary redundancy of data.
Example
We’ll be using a student database as an example in this article, which records student, class, and
teacher information.
Ministry of Labor and Skills Model Data Object Version-I
Page 46 of 79 Author/Copyright November, 2023
Level III
Insert Anomaly
For example, if we wanted to add a new student but did not know their course name this will be
how
Update Anomaly
For example, let’s say the class Biology 1 was changed to “Intro to Biology”. We would have to
query all of the columns that could have this Class field and rename each one that was found.
Normalization stages
1NF - First normal form
2NF - Second normal form
3NF - Third normal form
3.5NF - Boyce Codd Normal Form (BCNF)
4NF - Fourth normal form
5NF - Fifth normal form
The Normal Forms
The above table Student Details, Course Details and Result Details can be further divided.
Student Details attribute is divided into Student#(Student Number), Student Name and
date of birth.
Course Details is divided into Course#, Course Name and duration.
Results attribute is divided into Date ofexam, Marks and Grade.
II. Second normal form (2NF): Eliminating Redundant Data
Second normal form (2NF) requires that all non-key columns are fully dependent on the
entire primary key. If the table has only a single-column primary key, this requirement is
easily met.
At this level of normalization, each column in a table that is not a determiner of the
contents of another column must itself be a function of the other columns in the table. For
example, in a table with three columns containing customer ID, product sold, and price of
the product when sold, the price would be a function of the customer ID (entitled to a
discount) and the specific product.
COURSE TABLE
Course# CourseName Duration of days
C3 Bio Chemistry 3
B3 Botany 8
P3 Nuclear Physics 1
M4 Applied Mathematics 4
H6 American History 5
B4 Zoology 9
RESULT TABLE
Student# Course# Marks Grade
1001 M4 89 A
M4 Some value
H6 Some value
C3 Some value
B3 Some value
P3 Some value
B4 Some value
In STUDENT table, the key attribute is Student# and all other non-key attributes, Student
name and Date ofBirth are fully functionally dependent on the key attribute.
In COURSE table, the key attribute is Course# and all the non-key attributes, Course
name, Duration in days are fully functional dependent on the key attribute.
In RESULT table, the key attributes are #StudentCourse# together and all other non-key
attributes, Marks and Grade are fully functional dependent on the key attributes.
In EXAM DATE table, the key attribute is Course# and the non key attribute Date
ofExam is fully functionally dependent on the key attribute.
At first look it appears like all our anomalies are taken away! Now we are storing Student
1003 and M4 record only once. We can insert prospective students and courses at our will.
III. Third normal form (3NF): Eliminating Columns Not Dependent on Keys
Requires that there are no transitive dependencies, where one column depends on another
column which depends on the primary key.At the 2NF, modifications are still possible because a
change to one row in a table may affect data that refers to this information from another table.
For example, using the customer table just cited, removing a row describing a customer purchase
(because of a return perhaps) will also remove the fact that the product has a certain price. In the
third normal form, these tables would be divided into two tables so that product pricing would be
tracked separately.
In order to perform first normalizationrule, we have to consider the following concepts
1001 M4 89
1002 M4 78
1001 H6 87
1003 C3 90
1004 B3 78
1002 P3 67
1005 P3 78
1003 B4 67
1005 H6 56
1004 M4 78
100 95 A+
94 90 A
89 85 B+
84 80 B
79 75 B-
74 70 C
69 65 C-
After normalizing tables to 3NF, we got rid of all the anomalies and inconsistencies. Now we
can add new grade systems, update the existing one and delete the unwanted ones. Hence the
Third Normal form is the most optimal normal form and 99% of the databases which require
efficiency in
Functional Dependency
Functional dependency is a relationship between two sets of attributes in a database table. It
describes the dependency of one attribute (or a set of attributes) on another attribute (or a set of
attributes). In other words, if changing the value of one attribute determines the value of another
attribute(s), then a functional dependency exists.
Example: In a table called "Employees," if the attribute "EmployeeID" determines the attribute
"EmployeeName," it can be represented as: EmployeeID -> EmployeeName. This means that for
every value of EmployeeID, there is a unique value of EmployeeName associated with it.
Partial Dependency
Partial dependency occurs when an attribute is functionally dependent on only a part of the
primary key, rather than the entire primary key. It means that a non-key attribute depends on
only a subset of the primary key, and not on the entire primary key.
Example: In a table called "Orders," if the primary key is "OrderID," and the attributes
"CustomerName" and "CustomerAddress" depend on only the attribute "OrderID," it indicates a
partial dependency. This can be represented as: Ordered ->CustomerName, CustomerAddress.
To remove the partial dependency, the table can be split into two separate tables: "Orders" and
"Customers," where the customer details are stored separately.
Transitive Dependency
Identifying and eliminating partial and transitive dependencies are crucial in achieving higher
levels of normalization (such as 3NF or BCNF) to ensure data integrity, reduce redundancy, and
avoid anomalies in a database.
1. Analyzing the data: The first step is to analyze the existing data in the database. This
involves identifying the various entities, attributes, and relationships between them.
2. Applying normalization rules: Next, the data is normalized by applying normalization rules,
specifically the rules outlined in normal forms, such as First Normal Form (1NF), Second
Normal Form (2NF), and so on. Each normalization form has specific criteria that need to be
met.
3. Breaking down tables: In order to meet the criteria for normalization forms, it may be
necessary to break down existing tables into multiple tables, with each table focusing on a
specific entity or relationship.
Additionally, we would verify if the attributes in the normalized tables align with the attributes
specified in the ER diagram. We would ensure that no redundant data exists and that the data is
properly organized according to normalization rules.
By comparing the normalization results with the ER diagram, we can validate the accuracy and
consistency of the database design, ensuring that the normalized tables effectively capture the
structure and relationships depicted in the ER diagram.
In apply the first normal form, There is no unique field, we need to create a new field.
This is our new table in its first normal form with the attributes
Student (student ID, student name, fees paid, date of birth, address, subject 1, subject 2,
subject 3, subject 4, teacher name, teacher address, course name)
Step 2: The student ID (primary key), represents the student but not subject, teacher and course.
To resolve this issue, we have to create separate table for subject, teacher and course by creating
a primary key column, just like we did for student table.
Student Table
Subject Table
Teacher Table
We have four separate tables, capturing different pieces of information. We need to capture
students are taking certain courses, have teachers, and subjects. But the data is in different tables.
So, we use the concept foreign key.
To link the student andcoursetables using a foreign key, we need to put the primary key (the
underlined column) from one table into the other table.
So, we’ve linked the course, teacher, and student tables together so far. What about the subject
table?
Step 3: In applying the third normal form, we will analyze the following concepts, this student
table,
Student (student ID, course ID, student name, fees paid, date of birth, address)
we see address it might depend on zip codeSo,We can move the ZIP code to another table, along
with everything it identifies, and link to it from the student table.
Student Table: Student (student ID, course ID, student name, fees paid, date of birth, street
address, address code ID)
Address Table: Address Code (address code ID, ZIP code, suburb, city, state)
So our table would look like this:Teacher Table: Teacher (teacher ID, teacher name, street
address, address code ID)
Address Table: Address Code (address code ID, ZIP code, suburb, city, state)
This unit is developed to provide you the necessary information regarding the following
content coverage and topics
This unit will also assist you to attain the learning outcomes stated in the cover page.
Specifically, upon completion of this learning guide, you will be able to:
Validate data model with client
Understand problem resolution and recommendation skills
Document completed data model
Have effective client approval submission skills
Identify the Issue: The first step is to identify and understand the specific issue or challenge
at hand. This could be related to performance, security, scalability, user experience, or any
other aspect of web development or database administration.
Gather Information: Collect relevant data and information related to the issue. This may
include analyzing system logs, monitoring metrics, user feedback, or conducting
performance tests. The more information you have, the better you can diagnose the problem
or make informed recommendations.
Analyze the Root Cause: Conduct a thorough analysis to determine the underlying cause of
the issue. This may involve examining the codebase, database design, network configuration,
or any other relevant factors. The goal is to identify the root cause so that appropriate
measures can be taken.
Resolving issues and making recommendations in web development and database administration
is an ongoing process. It requires a combination of technical expertise, analytical skills, and
effective communication to identify and address challenges effectively, leading to improved
performance, security, and user satisfaction.
Documenting the completed data model is crucial for maintaining a clear and comprehensive
record of the data model's structure, relationships, and constraints. This documentation serves as
a valuable reference for developers, database administrators, and other stakeholders involved in
the project. Here are some key components to include when documenting a completed data
model:
Entity Relationship Diagram (ERD): Start by including an ERD that visually represents the
entities, attributes, and relationships in the data model. This diagram provides a clear
By documenting the completed data model thoroughly, you provide a valuable resource for
understanding and working with the system. The documentation helps facilitate effective
communication, collaboration, and maintenance of the data model throughout the web
development and database administration lifecycle.
Reference
Books
Database Design - 2nd Editionby Adrienne Watt and Nelson
Database Management Systems by Raghu Ramakrishnan and Johannes Gehrke
Open-Source Database Design by Max Ortiz Catalan: at
Data Modeling and Relational Database Design by University of California, Berkeley.
URL
https://opentextbc.ca/database/.
https://www.db-book.com/.
https://www.dcs.bbk.ac.uk/~ptw/teaching/DBM/er.pdf
https://www2.eecs.berkeley.edu/Courses/CS186/
https://www.geeksforgeeks.org
Qualificati Organization/
No Name Field of Study Mobile number E-mail
on Institution
1 Frew Atkilt M-Tech Network & Bishoftu Polytechnic 0911787374 frew.frikii@gmail.com
Information Security
College
2 Gari Lencha MSc ICT Managment Gimbi Polytechnic 0917819599 Garilencha12@gmail.com
3 Kalkidan Daniel BSc Computer Science Entoto Polytechnic 0978336988 kalkidaniel08@gmail.com