Elec A1-5 - Advanced Databases
Elec A1-5 - Advanced Databases
▪ Constraints
Keyword Constraints
UNIQUE No repeated data is allowed in the field
NOT NULL The field cannot be empty
DEFAULT Adding default value to the field
• Will be applied if no value specified when inserting record
CHECK Adding validation check to the field
• Using scope operator such as BETWEEN … AND, IN, LIKE, >, <, …
Exercise
1. Write an SQL statement to create a table called Teacher with two columns named Name and Age respectively.
ICT (Elective) – 2A Databases P. 2
2. Create a table called Book that contains fields named BookID, Title and Type such that a value for BookID
must be entered for each row and its value is unique within the table.
4. Create a table “LoanRecord” with a primary key “LoanRecID” and two foreign keys “StdID” and “BookID”
that references tables “Student” and “Book” respectively.
ICT (Elective) – 2A Databases P. 3
Exercise
5. A table Student has the following structure:
Student
Field Name Type Size Contents
Sid Character 4 Student ID
Name Character 50 Name of the student
Class Character 20 Class of the student
Age Integer -- Student age
- View
▪ It is a virtual table that links to part of the data in the database, not necessarily in the same table.
▪ We can perform calculation and query with the views.
▪ Views are read only, so insert, delete and update to a view are not allowed.
▪ It is in the form of a saved query in Microsoft Access.
Advantage:
Disadvantage:
- Index
▪ It is an extra table added in the database, sorted in a frequently used field.
▪ When running query on the table, instead of sequential search, binary search can be done on index table.
Advantage:
Disadvantage:
Exercise
6. An organisation manages a professional examination which includes several subjects. Candidates enroll in the
examination through training centres. Each candidate enrolls in at least one subject. Markers can mark one
subject only. The organisation uses the following database tables to store the information on the examination.
CAND
Field name Type Description
CNO Integer Identity number of the candidate
CNAME Character Candidate name
SEX Boolean Sex
CDOB Date Date of birth
TNO Integer Identity number of the training centre
SBJCODE Integer Identity number of the subject that the candidate enrolls in
SBJNAME Character Name of the subject that the candidate enrolls in
(2 marks)
(iii) A simple SQL command has the three parts, SELECT, FROM and WHERE. Which part will the index file
facilitate? Explain briefly.
(1 mark)
[2014-DSE-ICT 2A-Q2]
ICT (Elective) – 2A Databases P. 6
Exercise
7. A company is using a table employees to store the information of its employees. Below are some data stored.
Employees
employee_id first_name last_name hire_date salary
001 John Chan 2023-07-15 60000
002 Jane Mak 2023-07-20 55000
003 Michael Leung 2023-07-25 65000
(b) The company is making a salary adjustment, and all the employees will have an increment of $500.
▪ String functions
Function Meaning Example Output
LEN String length select LEN('Munsang') 7
LOWER Lower case select LOWER('Munsang') Munsang
UPPER Upper case select UPPER('Munsang') MUNSANG
LEFT Substring from the left select LEFT('Munsang',3) Mun
RIGHT Substring from the right select RIGHT('Munsang',3) ang
MID Substring from the middle select MID('Munsang',3,3) Nsa
INSTR Check the starting position of the select INSTR('Munsang','n') 3
search string in a given string select INSTR('Munsang','xx') 0
(with an optional start searching
select INSTR(4,'Munsang','n') 6
location)
TRIM Remove leading and trailing spaces select TRIM(' IM SC ') IM SC
SPACE Return a string with the number of select SPACE(10)
spaces specified in the arguement
(10 spaces)
CHR Return the character with the given select CHR(38) &
ASCII code
VALUE Convert numerical characters into a select VAL('2457') 2457
number select VAL(' 2 45 7') 2457
(Blanks and tabs are skipped)
select VAL('24 and 57') 24
▪ Date functions
Function Meaning Example Output
DATE Create a date select DATE() (The date today)
DAY Return the day of a given date select DAY('2024/9/1') 1
MONTH Return the month of a given date select MONTH('2024/9/1') 9
YEAR Return the year of a given date select YEAR('2024/9/1') 2024
Table: student
field type width contents
id Number Integer Student identification number
name Text 10 Name
dob Date -- Date of birth
sex Text 1 Sex: (M = Male; F = Female)
class Text 2 Class
hcode Text 1 House of the student. (R = Red; Y = Yellow; B = Blue; G = Green)
dcode Text 3 District code (e.g. TST = Tsim Sha Tsui; SSP = Shum Shui Po)
remission Boolean -- Whether the student is receiving fee remission (學費減免)
mtest Number Integer Math test score
▪ Examples:
List the record of the student whose birthday is in January.
SELECT class, name, dob FROM student
WHERE MONTH(dob)=1
name age
Result Wendy 12.1
Kitty 11.5
Janet 12.4
Sandy 12.3
Mimi 12.2
form cnt
Result 1 28
2 22
ICT (Elective) – 2A Databases P. 9
Combine
Tables
Join Set
(Combine by (Combine by
columns) rows)
Natural Join
Sample data:
id sname dob sex class ClassNo hcode dcode remission id type
9801 Peter 2012/6/14 M 1A 1 R SSP FALSE 9802 Flute
9802 Mary 2012/1/10 F 1A 2 Y HHM TRUE 9803 Violin
9803 Johnny 2012/3/16 M 1A 3 G SSP FALSE 9804 Piano
9804 Wendy 2012/7/9 F 1B 1 B YMT FALSE 9806 Recorder
ICT (Elective) – 2A Databases P. 10
▪ Cross Join
Example:
SELECT student.name, student.id, music.id, music.type
FROM student, music
▪ Inner Join
Equi-join
Example:
SELECT *
FROM student, music
WHERE student.id = music.id
or
SELECT *
FROM student INNER JOIN music
ON student.id = music.id
student.id sname dob sex class ClassNo hcode dcode remission music.id type
9802 Mary 2012/1/10 F 1A 2 Y HHM TRUE 9802 Flute
9803 Johnny 2012/3/16 M 1A 3 G SSP FALSE 9803 Violin
9804 Wendy 2012/7/9 F 1B 1 B YMT FALSE 9804 Piano
Natural Join
Natural join is similar to equi-join, but the common field is appeared once only in the result.
Example:
SELECT *
FROM Student NATURAL JOIN Club
# Not supported in Microsoft Access
▪ Outer Join
Left Outer Join
◼ All the records on the left table will be kept, even if there is no match from the right table.
Example:
SELECT student.name, student.id, music.id, music.type
FROM student LEFT OUTER JOIN music
ON student.id = music.id
student.id sname dob sex class ClassNo hcode dcode remission music.id type
9801 Peter 2012/6/14 M 1A 1 R SSP FALSE (null) (null)
9802 Mary 2012/1/10 F 1A 2 Y HHM TRUE 9802 Flute
9803 Johnny 2012/3/16 M 1A 3 G SSP FALSE 9803 Violin
9804 Wendy 2012/7/9 F 1B 1 B YMT FALSE 9804 Piano
student.id sname dob sex class ClassNo hcode dcode remission music.id type
9802 Mary 2012/1/10 F 1A 2 Y HHM TRUE 9802 Flute
9803 Johnny 2012/3/16 M 1A 3 G SSP FALSE 9803 Violin
9804 Wendy 2012/7/9 F 1B 1 B YMT FALSE 9804 Piano
(null) (null) (null) (null) (null) (null) (null) (null) (null) 9806 Recorder
or
SELECT student.name, student.id, music.id, music.type
FROM student FULL OUTER JOIN music
ON student.id = music.id
student.id sname dob sex class ClassNo hcode dcode remission music.id type
9801 Peter 2012/6/14 M 1A 1 R SSP FALSE (null) (null)
9802 Mary 2012/1/10 F 1A 2 Y HHM TRUE 9802 Flute
9803 Johnny 2012/3/16 M 1A 3 G SSP FALSE 9803 Violin
9804 Wendy 2012/7/9 F 1B 1 B YMT FALSE 9804 Piano
(null) (null) (null) (null) (null) (null) (null) (null) (null) 9806 Recorder
ICT (Elective) – 2A Databases P. 12
▪ Set operations
Union
◼ Records will be in the combined result set if it is in either set.
Example:
SELECT id FROM student
UNION
SELECT id FROM music
id
9801
9802
9803
9804
9806
Intersect
◼ Records will be in the combined result set if it is in both sets.
Example:
SELECT id FROM student
INTERSECT
SELECT id FROM music
# Not supported in Microsoft Access
id
9802
9803
9804
Minus
◼ Records will be in the combined result set if it is in the first set but not in the second set.
Example:
SELECT id FROM student
MINUS
SELECT id FROM music
# Not supported in Microsoft Access
id
9801
Example:
SELECT id FROM music
MINUS
SELECT id FROM student
id
9806
ICT (Elective) – 2A Databases P. 13
- Subquery
Another common type of query is by nested queries with multiple SELECT statements. It can be used to solve multi-
table questions.
▪ Examples:
List the record of the oldest student.
SELECT * FROM student WHERE dob =
(SELECT min(dob) FROM student)
or
or
List the record of student who did not learn any musical instrument.
SELECT * FROM student WHERE id NOT IN
(SELECT id FROM music)
Or
or
id sname dob sex class ClassNo hcode dcode remission id type EXISTS?
9801 Peter 2012/6/14 M 1A 1 R SSP FALSE 9802 Flute
9803 Violin
NO
9804 Piano
9806 Recorder
student x student y
class sname dob class sname dob max(dob)
1A Peter 2012/6/14 1A Peter 2012/6/14
1A Mary 2012/1/10 2012/6/14
1A Johnny 2012/3/16
Entity
Tuple
Field (Attribute)
Key
StudentID EnglishName Sex Class ClassNumber Class Teacher Room
(Relationship)
015034 Peter Chan M 1D 1 1A Mr Leung 101
015035 Jenny Hung F 1A 16 1B Ms Wong 102
015036 Jimmy Hung M 1C 15 1C Mr Yau 103
015037 Jason Lim M 1D 23 Record 1D Ms Lam 104
015038 Betty Lam F 1C 22 (Tuple)
ICT (Elective) – 2A Databases P. 16
Inconsistent updating and deleting may cause data error and therefore it is not recommended.
Terms Meanings
Candidate Key ⚫ Since every tuple in a relational database is unique, we can identify certain attributes to
distinguish different tuples in a table. These attributes are called candidate keys.
⚫ The values of a candidate key must be unique and cannot be null.
⚫ A candidate key can consist of a single attribute or a combination of multiple attributes.
Such candidate keys are called composite keys.
Primary Key ⚫ A candidate key selected as the identifier in a table is called a primary key.
⚫ Each table can have multiple candidate keys, but only one primary key.
Foreign Key ⚫ A foreign key is an attribute in one table that is also the primary key of another table.
⚫ E.g. In a student academic record, the course number of the courses taken is stored, and
the course number is the primary key of another table. Foreign keys are typically used to
represent relationships between tables.
Schema ⚫ A relational schema consists of the relation name, attribute names, and primary key.
⚫ E.g. Employee (empID, name, department, supervisor)
or
Employee empID name department supervisor
ICT (Elective) – 2A Databases P. 17
◆ One-to-many
1 M
◆ Many-to-many
M M
◆ Optional
⚫
ICT (Elective) – 2A Databases P. 18
- Relationship
▪ A relationship is used to express the connection between entities.
▪ In an ER diagram, we sometimes omit attributes due to spatial constraints.
Exercise
8. Based on the following situational description, fill out the entity-relationship diagram (ER diagram).
1 1
1 M
▪ Multiple relationships
More than a single relationship is present between the same pair of entities.
award
1 M
Student Scholarship
M M
apply
- Attributes
▪ In an entity-relationship diagram, we use ovals to represent attributes, connected to the entity by a straight line.
▪ Since the foreign key is already represented by the relationships between entities, there is no need to draw it
again in the ER diagram.
▪ Note the following points when drawing ER diagram:
Attributes must be unique.
Each entity must include a primary key. The attribute name of the primary key is underlined.
Foreign keys do not need to be drawn.
Derived data (such as levels calculated from classes) do not need to be drawn.
If there are multivalued attributes, they should be represented by double ovals.
ICT (Elective) – 2A Databases P. 20
▪ Example:
In a school, each teacher teaches at least one class. Each class is taught by one or more teachers. Each teacher
can have multiple qualifications. On the other hand, each class must have one form teacher.
Name M M
teach
ClassID
serve as
form teacher
Qualification 1 1
Exercise
9. Based on the following situational description, fill out the entity-relationship diagram (ER diagram) (including
attributes and cardinality).
A school collects and stores student information, including student ID, name, address, and date of birth.
Students can be assigned one or more positions, such as prefect, monitor, chairperson, etc.
ICT (Elective) – 2A Databases P. 21
Section 6: Normalization
- Data Inconsistency
▪ Data anomaly
Poorly designed database with data redundancy may lead to data inconsistency.
Type Description
Insertion anomaly Inserting a new record with a foreign key not yet exist
Deletion anomaly Deleting a record which is a foreign key of other records
Update anomaly Changes are updated on part but not all of the affected records.
▪ Data integrity
We should consider the following database design factors to minimize data inconsistency.
Type Description
Entity integrity The primary key of each record should be unique and not null.
Domain integrity Data types and constraints of the fields is valid.
E.g. Mark should be in a range of 0 to 100; Name should be text; Only M or F
is accepted in gender.
Referential integrity Foreign key is present so that the relationship between tables is valid.
Exercise
10. A charity organization recruits students to organize social service groups in each district. A student may have a leader
in each district.
Field name Description Data type Example
SID Identity code of student String 012345
RDATE Date of registration Date 1/10/2020
SNO Number of services participated in Integer 6
LEADER Identity code of leader String 098765
Primary key: SID
D1
SID RDATE SNO LEADER
012345 1/10/2020 6 098765
010040 2/9/2023 2 098765
011564 8/8/2018 7 012011
012011 15/8/2022 4
098765 2/2/2017 4
ICT (Elective) – 2A Databases P. 22
(c) In each of the following parts, there is a new record to be inserted into D1, but it will cause an integrity problem
(such as entity integrity, referential integrity and domain integrity). State the integrity problem and explain your
answer briefly.
(2 marks)
(2 marks)
(2 marks)
[2024-DSE-ICT 2A-Q4]
- Normalization
▪ Due to the limitations of relational databases, even after we establish a data model, we may not be able to create
the corresponding database.
▪ If we can normalize the data model into a specific schema, the data transformation can succeed. For example,
entities in the data model will be converted into tables, and relationships will be converted into foreign keys
within those tables. However, many-to-many relationships cannot be represented by foreign keys.
ICT (Elective) – 2A Databases P. 23
▪ Steps to normalization
Handling Relationships with Attributes
◼ Most attributes will be placed on the entity side.
◼ If it is unavoidable to place attributes on the relationship, this typically occurs only in many-to-many
relationships.
◼ Since we will ultimately convert entities into tables, relationships with attributes must be
transformed into entities if they contain attributes.
◼ Please refer to the examples in the next section.
CustomerID GoodsID
M M
Customer Order Goods
Name Details
Date Quantity
CustomerID GoodsID
號
1 M M 1
Customer place Order include Goods
◼ These entities produced from relationships are also known as associative entities.
➢ Associative entities are formed from the relationship between two entities, so their primary
keys are usually composed of the foreign keys of the related entities.
➢ Therefore, the primary keys of associative entities are often not shown in the ER diagram
ICT (Elective) – 2A Databases P. 24
EmpID M
Employee cooperate
Name
M
EmpID GroupID
M M
Employee cooperate Group
Name
More to Learn:
The transformed relationship may still be a binary many-to-many relationship. Try using the techniques from the
previous section to break down the binary many-to-many relationship in the example above.
ContactPerson ContactPerson1
CompID CompID
Company Company ContactPerson2
CompName CompName
ContactPerson3
◼ To resolve this issue, we can create a new entity to represent the multivalued attribute.
ContactPerson CompID
◼ A database table design which contains no multiple value attributes and repeating attributes is said
to be in First Normal Form (1NF).
ICT (Elective) – 2A Databases P. 25
Database schema:
Invoice Invoice_No Order_Date Product_ID Product_Name Unit_Price Quantity
Invoice
Invoice_No Order_Date Product_ID Product_Name Unit_Price Quantity
0307 13/8/2011 011 Obobe 1200 5
0307 13/8/2011 012 Piccolo 1300 4
0308 14/8/2011 011 Obobe 1200 6
0309 16/8/2011 020 Violin 7100 3
0309 16/8/2011 021 Cello 12000 2
0309 16/8/2011 030 Piano 9000 1
0310 18/8/2011 020 Violin 7100 2
We know that:
➢ Order_Date is only related to Invoice_No.
➢ Product_Name and Unit_Price are only related to Product_ID.
➢ Quantity is related to both Invoice_No and Product_ID.
Therefore, this database design has partial dependency.
In this situation, anomalies may occur when adding, modifying, or deleting data, such as:
➢ A new product cannot be added to the database without an order.
➢ If a product only appears in one order and that order is cancelled, the product's information
will be lost.
➢ If a product's information is changed only in one order, data inconsistency will arise.
◼ A database design which is in 1NF with no partial dependency is said to be in Second Normal Form
(2NF).
ICT (Elective) – 2A Databases P. 26
Database schema:
Product Product_ID Product_Name Godown_ID Address Extension
Product
Product_ID Product_Name Godown_ID Address Extension
001 Clarinet 001 2/F 8 Kerfule Road 52
002 Trombone 001 2/F 8 Kerfule Road 52
003 Bugle 001 2/F 8 Kerfule Road 52
004 Trumpet 001 2/F 8 Kerfule Road 52
005 Saxophone 002 5/F 8 Kerfule Road 23
006 Tuba 002 5/F 8 Kerfule Road 23
007 French Horn 003 8/F 8 Kerfule Road 12
In the above example, the non-key fields Address and Extension are related to another non-key field
Godown_ID, which leads to the following issues:
➢ We cannot add new warehouse data without existing products.
➢ Deleting product records may result in the loss of warehouse data.
➢ When warehouse data is changed, all records must be updated; otherwise, data inconsistency
will occur.
To resolve this issue, we can split this table into smaller tables so that no fields are related to non-
key fields.
◼ A database design which is in 2NF with no transitive dependency is said to be in Third Normal Form
(3NF).
ICT (Elective) – 2A Databases P. 27
Exercise – Normalization
11. A teacher is using the following database table to store his students’ information and their results:
Stud_ID is a unique student identification number. Name , Addr and Tel are the personal information of a student.
Class is the class of a student. Each student will be assigned into one class. Class_Teacher is the form teacher
of the class. Test keeps marks of many tests. Since each student takes multiple tests, therefore Test is a multi-
value attribute.
(a) Is the table above in 1st normal form (1NF)? If yes, explain why it is. Otherwise, convert it into 1NF.
(b) Is the table in (a) in 2nd normal form (2NF)? If yes, explain why it is. Otherwise, convert it into 2NF.
(c) Is the table in (b) in 3rd normal form (3NF)? If yes, explain why it is. Otherwise, convert it into 3NF.
12. The organizing committee of a joint-school competition is using the following tables to record the information of
athletes and events:
In table ATHLETE, Stud_Name and Stud_Addr are the name and address of an athlete. Sch_Name and
Sch_Addr are the school name and school address of their representing school.
In table EVENT, Event_ID is the event identification code. Event_Name is the event name, e.g. Shot Put, Javelin,
etc. Result is the measured results of an athlete in an event. For example, if Peter Wong got a result 11.82s in his
100m event, Result stores 11.82. Position is the rank of an athlete in an event.
(f) Redesign the tables, so as to remove data redundancy. Write down all the foreign key(s).