[go: up one dir, main page]

0% found this document useful (0 votes)
9 views48 pages

INFO90002 Week1-2 DatabasesIntro

Uploaded by

Mohamed Shahin
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
9 views48 pages

INFO90002 Week1-2 DatabasesIntro

Uploaded by

Mohamed Shahin
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 48

Warning

This material has been reproduced and communicated to you by or on behalf of the
University of Melbourne pursuant to Part VB of the Copyright Act 1968 (the Act).

The material in this communication may be subject to copyright under the Act.

Any further copying or communication of this material by you may be the subject
of copyright protection under the Act.

Do not remove this notice


Introduction to
Databases
Database Systems & Information Modelling
INFO90002

Week 1 – Database basics


Dr Tanya Linden
David Eccles

2
Learning Objectives
Difference between data and information
Unstructured/Semi-structured/Structured data
Problems with flat files
Database and DBMS – not the same!
Relational DBMS
Client server architecture
Database development lifecycle and importance of database design

3
Context: software and data

• Computer systems consist of software


working to process data.

• You will learn about creating software and


algorithms in other subjects (COMP90041)

• This subject is about data.

• Focus away from computation to data


Processed data = information

4
The Modern Data Challenge
Modern organisations need to store and retrieve
large amounts of data

Data can be divided into three major categories


• Structured Data
• Semi-Structured Data
• Unstructured Data

Inside Google’s Giant Data Centers


Structured Data is typically used by https://markosun.wordpress.com/2012/10/21
/inside-googles-giant-data-centers/
Relational Database Management Systems (RDBMSs)
such as Access, Oracle, SQL Server, MySQL.
5
Unstructured data

• handwritten notes
• printed books
• spreadsheets
(semi-structured?)
etc.

6
Unstructured Data
Unstructured data is not organised in a pre-defined manner.
The organisation does not know the format, nor the content of the data in advance.

Consider data sourced from social media, email, etc. The contents are unpredictable.
• The data may contain text, audio, video, links, images. One item may include many data
about many organisational functions.

How do organisations store such data so that it can be retrieved, collated, analysed?

7
Semi-Structured Data
Semi-structured data is information that doesn't match the requirements of a relational
database.

The data is organised / arranged that makes it easier to analyse.

Examples of semi-structured data include XML documents and NoSQL databases.

We will briefly deal with the topic of semi-structured data in future weeks

8
Structured Data
Relational Database Management Systems require data to be stored in a very structured
way.
These systems deal with data that has a repetitive pattern or format.

Consider Student data stored by a University. While every student is different, the
university wants to store data in the same format for every student. Data Types are also
specified for each piece of information.
• Student ID – Numeric/Digits HomeAddress - Alphanumeric
• Student Name – Alpha PhoneNo – Numeric or Digits+space+brackets
• Gender – Alpha NextOfKin – Alpha
• DateOfBirth – Date
9
File Processing

Problems with flat-files:


• Data access routines must be programmed in detail
• Each program must include full detail of data structure
• Multiple users cannot simultaneously access data
• Multiple copies of data - not centrally managed
• Lack of security 10
Why use a DBMS?
DBMS = DataBase Management System
Data independence
• Applications should not be exposed to data representation and storage
• DBMS provides an abstract view that hides representation and storage
Efficient access
• More efficient data storage and retrieval than flat files
Data integrity and security
• DBMS enforces data integrity constraints, access controls and govern user access
• Not reliant on just the operating system

Ramakrishnan and Gehrke 3rd Edition, p.8 11


Why use a DBMS?
Uniform data administration
• Specialist skills in data management and administration
• Layer of expertise reduces risk to data and data owners
Concurrent access and crash recovery
• Schedules concurrent access. Protects data from system failures
Datasets increasing in diversity and volume
Data Independence
• Logical data independence: the ability to modify the logical schema without causing application programs to be
rewritten (e.g. Add/Modify/Delete a new attribute, entity or relationship)
• Physical data independence: e.g. using a new storage (Cloud instead of local server Hard Drive; changes to
compression techniques)

Reduced Application Development time*


12
Ramakrishnan and Gehrke 3rd Edition, p.9
Relational Databases
• The first relational databases from Oracle and IBM appear around 1980

• Others appear later

13
Client-
Server
Architecture
Introduction to MySQL Server
and MySQL Workbench
Identifier first line
Second line

14
RDBMS and Database Servers
A database server is a computer that is networked to other computers
The database server stores databases
Users on the network can access the data stored in the databases

There is only one copy of the data (excluding backups, etc.)

RDBMSs allow multiple users on the network to update data in database tables.
• Many people can check the price of product 20
• Many people may choose to enroll in INFO90002 simultaneously
• Many users may purchase tickets for a flight at the same time
15
Client Server – In Industry
Server computer
DBMS application / web server
(database management
system)

Client computers
Database DBMS Web
client admin tool browser,
SAP client

16
… in teaching (and development)
info90002db.eng.unimelb.edu.au 127.0.0.1 (local host)

MySQL Server MySQL Server

VPN

MySQL MySQL MySQL


Workbench Workbench Workbench
Client Client Client

Labs or own laptop Your PC 17


Relational databases
Basic concepts

18
Relational Data Model
RDBMSs are based on the Relational Data Model
• Developed by Ted Codd in 1970.
• Data is represented in the form of two-dimensional tables.

Each two-dimensional table has the following properties:


• A set of uniquely named Columns / Attributes
• A list of unnamed/unnumbered Rows
• The order of the rows is irrelevant.

A Row consists of a sequence of Attributes


• One cell for each Attribute
• Only one value per cell is allowed.

All Relational Database Management Systems are based on the Relational Data Model. 19
Relational Data Model (cont.)
Table

Uniquely named columns called Attributes


Attribute

ID StName Degree Enrolled


Row 1001 Josh Blogs 615 20/01/2024
1004 Emma Lewis 308 03/02/2023
1006 Dave Rossi 234 10/07/2024

20
Relational Databases
A relational database is a collection of related tables
The relational model’s foundation is a mathematical concept known as a relation
• A relation is a two-dimensional structure composed of intersecting rows and columns
– Imagine it as a table
• Each row in a relation is called a tuple and each column represents an attribute

Example:
• Student Table (stores data about students)
• Subject Table (stores data about university subjects)
• Enrolment Table (stores data about a student’s enrolment into a specific subject in a
specified year and semester)
21
RDBMS
A RDBMS is a collection of programs that allow developers / users to store
and retrieve data from relational databases

It allows users to perform CRUD (create, read, update and delete) operations
on data in the tables, e.g.
• Create a student record
• Retrieve the student's details
• Update the student's details
• Delete the student from a table
22
Setting up a RDBMS
Tables
• Follow a 2-dimensional structure
• Each row of data is identified by a unique Primary Key
• No duplicates, e.g. Student ID

Constraints (based on business rules) can be added to validate data


• Student ID is correct length
• Student type is PG or UG (postgrad or undergraduate)
• Student is enrolled in a subject that is actually offered in the current semester

23
Case Study - Overview
Let's consider a business called Pizza OnLine

• It allows customers to order pizzas via the


online portal.
• Customers have pizzas delivered.
• Only people who register with a credit card
number can be an on-line customer.

24
Case Study - Customer Details
Customers need to supply personal details:

• Name
• Address
• Credit Card No

25
Case Study – Order Details
The Order Form
• This is One order.
• The customer may add multiple items and
different quantities.

26
Case Study - Customer Details
Add any delivery instructions that are required for the order:

27
Case Study – The completed Order
This Order is for
Customer C2045.

Order Date and Time is


Sept 1, 2020 17:35

This order has special


delivery instructions

The order is for many


different Items.

Each item ordered


has a Quantity value
28
Pizza Online Database
Customer ID Surname First name Address Postcode Credit card

C2045 Smith Fred 1 John St, 3122 1234234534564567


Hawthorn

Order No Date Time Delivery Total Code Pizza name Price Quan
Instructions tity
3224 1/09/2021 17:35 The bell does 59.55 P1 Mario’s 6.95 2
not work, Supreme Pizza
3224 1/09/2021 17:35 The bell does 59.55 P3 Hawaiian Pizza 6.95 1
not work,
3224 1/09/2021 17:35 The bell does 59.55 P4 Hot ‘n’ spicy 6.95 1
not work, Pizza

Problems:
• Order details repeated for every pizza ordered
• How do we link the order to the customer? 29
Tables in the Pizza Online Database
Customer ID Surname First name Address Postcode Credit card
C2045 Smith Fred 1 John St, Hawthorn 3122 1234234534564567
C2048 Nguyen Vincent 2/7 Oak Ave, Altona 3018 4554123423457899
C2146 Davis Liz 32 Lyle St, Toorak 3142 4564564578970022

Order No Line Item Quan Code Pizza name Price


tity P1 Mario’s Supreme Pizza 6.95
3224 P1 2 P2 Vegetarian Pizza 6.95
3224 P3 1 P3 Hawaiian Pizza 6.95
3224 P4 1 P4 Hot ‘n’ spicy Pizza 6.95
3224 B1 4 B1 Garlic Bread 4.95
3224 D1 2 B2 Herb Bread 4.95
D1 2 Litre Cola 2.50
D2 2 Litre Lemonade 2.50
Order No Customer ID Date Time Delivery Instructions Total
3224 C2045 1/09/2021 17:35 The bell does not work, 59.55 30
Manipulating Tables Structure
Customer ID Surname First name Address Postcode Credit card
C2045 Smith Fred 1 John St, Hawthorn 3122 1234234534564567
C2048 Nguyen Vincent 2/7 Oak Ave, Altona 3018 4554123423457899
C2146 Davis Liz 32 Lyle St, Toorak 3142 4564564578970022

Code Pizza name Price


P1 Mario’s Supreme Pizza 6.95
P2 Vegetarian Pizza 6.95 Working with tables is like working with files
P3 Hawaiian Pizza 6.95 – there are 4 things you can do:
P4 Hot ‘n’ spicy Pizza 6.95
B1 Garlic Bread 4.95 CREATE a table
B2 Herb Bread 4.95 DROP (i.e. delete) a table
D1 2 Litre Cola 2.50 ALTER a table (e.g. add a column)
D2 2 Litre Lemonade 2.50 RENAME a table

31
Manipulating Table Contents
Customer ID Surname First name Address Postcode Credit card
C2045 Smith Fred 1 John St, Hawthorn 3122 1234234534564567
C2048 Nguyen Vincent 2/7 Oak Ave, Altona 3018 4554123423457899
C2146 Davis Liz 32 Lyle St, Toorak 3142 4564564578970022

Code Pizza name Price


P1 Mario’s Supreme Pizza 6.95
P2 Vegetarian Pizza 6.95 For each table you need to be able to:
P3 Hawaiian Pizza 6.95 SELECT, or read, data from the table
P4 Hot ‘n’ spicy Pizza 6.95
B1 Garlic Bread 4.95
INSERT new rows into the table
B2 Herb Bread 4.95 DELETE existing rows from the table
D1 2 Litre Cola 2.50 UPDATE existing rows in the table
D2 2 Litre Lemonade 2.50

32
Database lifecycle

• Design the database


• data modelling, E-R diagrams
•CREATE
• Implement the database •DROP
•ALTER
• data definition language DDL •RENAME
•SELECT
• Data access / programming •INSERT
• data manipulation language DML •UPDATE
•DELETE

• Database administration •GRANT


• data control language DCL •REVOKE
33
Database
Development
Lifecycle
Part of system development
lifecycle
Identifier first line
Second line

34
Database Development Lifecycle
Database Design
Database Planning • Planning how to do the
Conceptual Design project.
Systems Definition • How does the
Logical Design enterprise work
Requirements Definition
and Analysis
• Enterprise data
Physical Design model
• How can the stages be
Application Design
completed efficiently and
effectively.

Operational • Outside scope of the


Implementation
Maintenance course
Data Conversion and
Testing 35
Loading
Database Development Lifecycle
Database Planning
Database Design • Specifying scope and
Conceptual Design boundaries
Systems Definition • Users
Logical Design • Major user views
Requirements Definition • Application areas
and Analysis
Physical Design • How does it interact with
other systems
• User views – how the
Application Design
system operates from
differing perspectives
Operational
Implementation
Maintenance • Outside scope of the
course (slightly)
Data Conversion and
Testing 36
Loading
Database Development Lifecycle
Database Planning
Database Design • Collection and analysis of
Conceptual Design requirements for the new
Systems Definition system
Logical Design

Requirements Analysis
Physical Design

Application Design

Operational
Implementation
Maintenance

Data Conversion and 37


Testing
Loading
Database Development Lifecycle
Database Design
Database Planning • High-level, first-pass
Conceptual Design model of entities and
Systems Definition their connections
Logical Design • Typically omits attributes*
Requirements Definition
• Could potentially be
and Analysis
Physical Design implemented in a non-
relational database
• Thus can include many-to-
Application Design
many relationships,
repeating groups,
Operational composite attributes
Implementation
Maintenance

Data Conversion and


Testing 38
Loading
* Typically we list only attributes in the case study
Database Development Lifecycle
Database Design
Database Planning • Builds on the conceptual
Conceptual Design design
Systems Definition • Designing now for a
Logical Design relational database
Requirements Definition • Includes columns and keys
and Analysis
Physical Design • Independent of a specific
vendor and other physical
considerations
Application Design

Operational
Implementation
Maintenance

Data Conversion and


Testing 39
Loading
Database Development Lifecycle
Database Design • Implements the logical
Database Planning
Conceptual Design design for a specific
Systems Definition
DBMS.
Logical Design • Describes:
Requirements Definition • Base tables
and Analysis
Physical Design • Data types
• Indexes
• Integrity constraints
Application Design
• File organisation
• Security measures
• We will cover some
Operational
Implementation aspects of physical design
Maintenance

Data Conversion and 40


Testing
Loading
Database Development Lifecycle
Database Design
Database Planning • Done in conjunction with
Conceptual Design database design
Systems Definition • Design of the interface
Logical Design and application programs
Requirements Definition
and Analysis
that use and process the
Physical Design database

Application Design
• Mostly outside scope of
the course, but briefly
discussed in the Web
Operational Apps lecture
Implementation
Maintenance

Data Conversion and -41-


Testing
Loading
Database Development Lifecycle
Database Design
Database Planning • Implementation of the
Conceptual Design design as a working
Systems Definition database
Logical Design
Requirements Definition
and Analysis
Physical Design

Application Design

Operational
Implementation
Maintenance

Data Conversion and 42


Testing
Loading
Database Development Lifecycle
Database Design
Database Planning • Transfer existing data into
Conceptual Design the database
Systems Definition • Conversion from old
Logical Design systems
Requirements Definition
and Analysis
• Non trivial task
Physical Design • Mostly outside scope of
the course (concepts
Application Design
covered in the Data
Warehouse lecture)

Operational
Implementation
Maintenance

Data Conversion and 43


Testing
Loading
Database Development Lifecycle
Database Design • Running the database to
Database Planning
Conceptual Design find errors in the design /
Systems Definition setup
Logical Design • Other issues also
Requirements Definition • Performance
and Analysis
Physical Design • Robustness
• Recoverability
• Adaptability
Application Design
• Security
• Mostly outside scope of
Operational
the course (see ISYS90086
Implementation
Maintenance Data Warehousing)

Data Conversion and 44


Testing
Loading
Database Development Lifecycle
Database Planning
Database Design • The process of monitoring
Conceptual Design and maintaining the
Systems Definition database following its
Logical Design commissioning
Requirements Definition • Monitoring and improving
and Analysis
Physical Design performance
• Handling changes to
requirements
Application Design

• We will touch on some of


Operational
these topics later in
Implementation
Maintenance lectures 15, 20

Data Conversion and 45


Testing
Loading
Database Lifecycle (Summary)
Database Design
Database Planning
Conceptual Design

Systems Definition
Logical Design

Requirements Definition and


Analysis
Physical Design

Application Design

Operational Maintenance Implementation

Testing Data Conversion and Loading


46
Thank you
COMMONWEALTH OF AUSTRALIA

Copyright Regulations 1969

Warning

This material has been reproduced and communicated to you by or on behalf of the
University of Melbourne pursuant to Part VB of the Copyright Act 1968 (the Act).

The material in this communication may be subject to copyright under the Act. Any
further copying or communication of this material by you may be the subject of
copyright protection under the Act.

Do not remove this notice

You might also like