0% found this document useful (0 votes)

4 views6 pages

01 Relationalmodel

The document provides an overview of databases and database management systems (DBMS), explaining the relational model and its advantages over flat file systems. It discusses the structure, integrity, and manipulation of data within a relational database, as well as the fundamental operations of relational algebra. Additionally, it introduces other data models, such as document and vector data models, and briefly mentions the concept of a Skip List as a data structure for ordered sets.

Uploaded by

Darion Yaphet

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

4 views6 pages

01 Relationalmodel

Uploaded by

Darion Yaphet

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 6

Lecture #01: Relational Model & Algebra

15-445/645 Database Systems (Spring 2025)

https://15445.courses.cs.cmu.edu/spring2025/
Carnegie Mellon University
Jignesh Patel

1 Databases
A database is an organized collection of inter-related data that models some aspect of the real-world (e.g
modeling the students in a class or a digital music store). Databases are the core component of most
computer applications.
People often confuse ”databases” with ”database management systems” (e.g. MySQL, Oracle, MongoDB,
Snowflake). A database management system (DBMS) is the software that manages a database. Among
many things, a DBMS is responsible for inserting, deleting, and retrieving data from a database.

2 Flat File Strawman

Consider that we want to store data for a music store like Spotify. We want to hold information about the
artists and which albums those artists have released. This database has two entities: an Artists entity and
an Albums entity.
We choose to store the database’s records as comma-separated value (csv) files that the DBMS manages,
and each entity is stored in its own file (an artists.csv and albums.csv). The application has to parse
these files each time it wants to read or update any records.
In this example, artists each have a name, year, and country attribute. Albums have name, artist, and
year attributes.
The following is an example CSV file for information about artists with the schema (name, year, country):
"Wu-Tang Clan", 1992, "USA"
"Notorious BIG", 1992, "USA"
"GZA", 1990, "USA"

Issues with Flat Files

There are several problems with using flat files as our database. Here are some questions to consider:
• Data Integrity How do we ensure that the artist is the same for each album entry? What if some-
body overwrites the album year with an invalid string? What if there are multiple artists on an
album? What happens if we delete an artist that has albums?
• Implementation How do you find a particular record? What if we now want to create a new
application that uses the same database? What if that application is running on a different machine?
What if two threads try to write to the same file at the same time?
• Durability What if the machine crashes while our program is updating a record? What if we want
to replicate the database on multiple machines for high availability?
Spring 2025 – Lecture #01 Relational Model & Algebra

3 Database Management System

A DBMS is software that allows applications to store and analyze information in a database.
A general-purpose DBMS is designed to allow the definition, creation, querying, update, and administra-
tion of databases in accordance with some data model.
A data model is a collection of concepts for describing the data in database. Some examples include:
• Relational (most common)
• NoSQL (key/value, document, graph)
• Array / Matrix / Vector (for machine learning)
A schema is a description of a particular collection of data based on a data model. This defines the structure
of data for a data model.

Common Data Models

• Relational (Most DBMSs)
• Key/Value (NoSQL)
• Graph (NoSQL)
• Document/XML/Object (NoSQL)
• Wide-Column/Column-family (NoSQL)
• Array / Matrix / Vector (Machine Learning)
• Hierarchical (Obsolete/Legacy/Rare)
• Network (Obsolete/Legacy/Rare)
• Multi-Value (Obsolete/Legacy/Rare)

Early DBMSs
Early database management systems were difficult to build and maintain because there was a tight coupling
between logical and physical layers.
The logical layer describes which entities and attributes the database has, while the physical layer is how
those entities and attributes are actually being stored (file layout etc). In early DBMSs, the physical layer
was defined in the application code (as opposed to being abstracted away), so if database administrators
wanted to change how data was stored, they would need to change all of the application code to match
the new physical layer.
For example, if we wanted to maintain a key-value store that supported equality lookups (find the value
that matches this key), a logical data structure to use would be a hash map. However, if an application
developer comes along and asks to perform a range scan over the keys (find all of the values with keys in
between a and b), a tree or B-tree would be a much better solution. In order to support both range scans
and equality lookups, the database would either need to provide multiple interfaces, or the application
code would need to be implemented twice, neither of which is appealing.

15-445/645 Database Systems

Page 2 of 6
Spring 2025 – Lecture #01 Relational Model & Algebra

4 Relational Model
Ted Codd at IBM Research in the late 1960s noticed that people were rewriting DBMSs every time they
wanted to change the physical layer. In 1969, he proposed the relational model as a potential solution to
this.
The relational model defines a database abstraction based on relations to avoid maintenance overhead. It
has three key ideas:
• Store database in simple data structures (relations)
• Physical storage left up to the DBMS implementation
• Access data through a high-level (declarative) language, where the DBMS figures out best execution
strategy
The relational data model defines three concepts:
• Structure: The definition of relations and their contents independent of their physical representa-
tion. Each relation has a set of attributes. Each attribute has a domain of values.
• Integrity: Ensure the database’s contents satisfy certain constraints. An example of a constraint
would be that the age of a person cannot be a negative number.
• Manipulation: An interface for accessing and modifying a database’s contents (e.g. SQL, Dataframes)
A relation is an unordered set that contains the relationship of attributes that represent entities. Since the
relationships are unordered, the DBMS can store them in any way it wants, allowing for optimization. It
is possible to have repeated / duplicated elements in a relation as long as their primary key is different.
A tuple is a set of attribute values (also known as its domain) in the relation. In the past, values had to be
atomic or scalar, but now values can also be lists or nested data structures. Every attribute can be a special
value, NULL, which means for a given tuple the attribute is undefined.
A relation with n attributes is called an n-ary relation. We will interchangeably use relation and table in
this course. An n-ary relation is equivalent to a table with n columns.
A relation’s primary key uniquely identifies a single tuple in a table. Some DBMSs automatically create an
internal primary key if you do not define one.
A foreign key specifies that an attribute from one relation maps to a tuple in another relation. Generally,
the foreign key will point / be equal to a primary key in another table.
A constraint is a user-defined condition that must hold for any instance of the database. Unique key and
referential (foreign key) constraints are the most common. Another common example of constraint would
be to prohibit a column from having NULL values.

15-445/645 Database Systems

Page 3 of 6
Spring 2025 – Lecture #01 Relational Model & Algebra

5 Data Manipulation Languages (DMLs)

DMLs refer to methods to store and retrieve information from a database. There are two classes of lan-
guages for Manipulating a database:
• Procedural: The query specifies the (high-level) execution strategy the DBMS should use to find
the desired result based on sets / bags. For example, use a for loop to scan all records and count
how many records are there to retrieve the number of records in the table.
• Non-Procedural (Declarative): The query specifies only what data is wanted and not how to find
it. For example, we can use SQL SELECT COUNT(*) FROM artist to count how many records are
there in the table.

6 Relational Algebra
Relational Algebra is a set of fundamental operations to retrieve and manipulate tuples in a relation. Each
operator takes in one or more relations as inputs, and outputs a new relation. To write queries we can
”chain” these operators together.

Select
Select takes in a relation and outputs a subset of the tuples from that relation that satisfy a selection
predicate. The predicate acts as a filter, and we can combine multiple predicates using conjunctions and
disjunctions.
Syntax: σpredicate (R).
Example: σa id=’a2’ (R)
SQL: SELECT * FROM R WHERE a_id = 'a2'

Projection
Projection takes in a relation and outputs a relation with tuples that contain only specified attributes. You
can rearrange the ordering of the attributes in the input relation as well as manipulate the values.
Syntax: πA1,A2,. . . ,An (R).
Example: πb id-100, a id (σa id=’a2’ (R))
SQL: SELECT b_id-100, a_id FROM R WHERE a_id = 'a2'

Union
Union takes in two relations and outputs a relation that contains all tuples that appear in at least one of
the input relations. Note: The two input relations have to have the exact same attributes.
Syntax: (R ∪ S).
SQL: (SELECT * FROM R) UNION ALL (SELECT * FROM S)

Intersection
Intersection takes in two relations and outputs a relation that contains all tuples that appear in both of the
input relations. Note: The two input relations have to have the exact same attributes.
Syntax: (R ∩ S).

15-445/645 Database Systems

Page 4 of 6
Spring 2025 – Lecture #01 Relational Model & Algebra

SQL: (SELECT * FROM R) INTERSECT (SELECT * FROM S)

Difference
Difference takes in two relations and outputs a relation that contains all tuples that appear in the first
relation but not the second relation. Note: The two input relations have to have the exact same attributes.
Syntax: (R − S).
SQL: (SELECT * FROM R) EXCEPT (SELECT * FROM S)

Product
Product takes in two relations and outputs a relation that contains all possible combinations for tuples
from the input relations.
Syntax: (R × S).
SQL: (SELECT * FROM R) CROSS JOIN (SELECT * FROM S), or simply SELECT * FROM R, S

Join
Join takes in two relations and outputs a relation that contains all the tuples that are a combination of two
tuples where for each attribute that the two relations share, the values for that attribute of both tuples is
the same.
Syntax: (R ▷◁ S).
SQL: SELECT * FROM R JOIN S USING (ATTRIBUTE1, ATTRIBUTE2...)

Observation
Relational algebra defines the fundamental operations to retrieve and manipulate tuples in a relation. It
also defines an ordering of the high-level steps to compute a query.
For example, σb id=102 (R ▷◁ S) represents joining R and S and then selecting / filtering the result. However,
(R ▷◁ (σb id=102 (S))) will do the selection on S first, and then join the result of the selection with R.
These two statements will always produce the same answer. However, if S has 1 billion tuples and there is
only 1 tuple in S with b id=102, then (R ▷◁ (σb id=102 (S))) will be significantly faster than σb id=102 (R ▷◁ S).
How would you know this if you were using Pandas or another procedural DML?
A better approach is to state the high-level result you want (retrieve the joined tuples from R and S where
bid equals 102), and let the DBMS decide the steps it should take to compute the query.
In SQL (a declarative language) we only express what we want to be computed and we do not specify how
to compute the result. The DBMS is responsible for finding the best strategy to execute the query (through
Query Optimization). This powerfull abstraction has made SQL the de facto standard for writing queries
on a relational DBMS since the user of the DBMS does not need to know anything about the internals and
can query the database in the most efficient way.

7 Other Data Models

Document Data Model
The document data model is a collection of record documents containing a hierarchy of named field/value
pairs.

15-445/645 Database Systems

Page 5 of 6
Spring 2025 – Lecture #01 Relational Model & Algebra

A field’s value can be either a scalar type, an array of values, or a pointer to another document.
Modern implementations use JSON. Older systems use XML or custom object representations.
The Document Model has some use cases but still runs into many of the problems discussed in the flat flie
strawman example.

Vector Data Model

The vector data model represents one-dimensional arrays used for nearest-neighbor search (exact or ap-
proximate).
Vector databases are generally used for semantic search on embeddings generated by ML-trained trans-
former models (think ChatGPT), and native integration with modern ML tools and APIs (e.g., LangChain,
OpenAI). At their core, these systems use specialized indexes to perform NN searches quickly.
Recently, many relational DBMSs have shipped vector index features or extensions (pgvector) that allow
NN search within the relational model.

8 P0 Intro: Skip List

A Skip List is a probabilistic data structure that can be used to implment an ordered set. A Skip List
supports inserting, deleting and searching for elements in logarithmic complexity (average case). Other
ways to implement an ordered set include a Binary Search Tree (BST), an AVL Tree , or a Red Black Tree
(most common).
The basic idea behind the Skip List is that it acts like a linked list but has few more additions. In a Skip List
each node that has a link to the next node (e.g 1− > 2 ) might also have links to other nodes after the next
node (i.e to node 1− > 3, 1− > 5, 1− > 9). This allows for efficient search since we might be able to skip
traversing all the nodes and immediately find the desired node after a few pointer chases.

15-445/645 Database Systems

Page 6 of 6

01 Relationalmodel
No ratings yet
01 Relationalmodel
5 pages
Lecture #01: Relational Model & Relational Algebra: 1 Databases
No ratings yet
Lecture #01: Relational Model & Relational Algebra: 1 Databases
4 pages
01 Introduction
No ratings yet
01 Introduction
4 pages
M1 - Intro
No ratings yet
M1 - Intro
56 pages
Slides1 Introduction-Merged
No ratings yet
Slides1 Introduction-Merged
96 pages
Relational Database Management Systems
No ratings yet
Relational Database Management Systems
55 pages
01 Introduction
No ratings yet
01 Introduction
63 pages
Unit 4 Data Models
No ratings yet
Unit 4 Data Models
34 pages
Relational Databases for CS Students
No ratings yet
Relational Databases for CS Students
30 pages
DBMS3 - Relational Model
No ratings yet
DBMS3 - Relational Model
47 pages
Database Basics for Beginners
No ratings yet
Database Basics for Beginners
30 pages
Bca 4th Sem Rdbms Well
No ratings yet
Bca 4th Sem Rdbms Well
31 pages
Lect 1-2pdf
No ratings yet
Lect 1-2pdf
55 pages
1904001-DBMS Notes 5 Units
100% (2)
1904001-DBMS Notes 5 Units
70 pages
Lab Internal Questions
No ratings yet
Lab Internal Questions
11 pages
Database Concepts: Need For A Database
No ratings yet
Database Concepts: Need For A Database
3 pages
CS4416 Lectures 1-7
No ratings yet
CS4416 Lectures 1-7
148 pages
DBMS Concepts for CSE Students
No ratings yet
DBMS Concepts for CSE Students
55 pages
Database Concepts Till Features of MySQL
No ratings yet
Database Concepts Till Features of MySQL
13 pages
Unit 1-DBMS
No ratings yet
Unit 1-DBMS
100 pages
CPE 313 Database Management Systems: Fall 2021/2022
No ratings yet
CPE 313 Database Management Systems: Fall 2021/2022
24 pages
It Faq
No ratings yet
It Faq
12 pages
Database Concepts Final
No ratings yet
Database Concepts Final
18 pages
Lec Database
No ratings yet
Lec Database
57 pages
UE20CS301 Unit2 Slides
No ratings yet
UE20CS301 Unit2 Slides
266 pages
HND Computing UNIT 38: Database Management: Faisal Saghir
No ratings yet
HND Computing UNIT 38: Database Management: Faisal Saghir
38 pages
DBMS 2nd Semester
No ratings yet
DBMS 2nd Semester
74 pages
CH 1
No ratings yet
CH 1
130 pages
Unit I REVIEW OF RELATIONAL DATA MODEL AND RELATIONAL DATABASE CONSTRAINTS
No ratings yet
Unit I REVIEW OF RELATIONAL DATA MODEL AND RELATIONAL DATABASE CONSTRAINTS
20 pages
DBMS W 1-4
No ratings yet
DBMS W 1-4
68 pages
CH - 11 - Data Base Concepts
No ratings yet
CH - 11 - Data Base Concepts
7 pages
DBMS Unit 3 Notes
No ratings yet
DBMS Unit 3 Notes
29 pages
Jameel - 1995 - 3521 - 1 - ITB-Lecture 9 Database-24
No ratings yet
Jameel - 1995 - 3521 - 1 - ITB-Lecture 9 Database-24
69 pages
Adbms Unit1
No ratings yet
Adbms Unit1
27 pages
Rdbms
100% (7)
Rdbms
60 pages
الوحدة الرابعة -
No ratings yet
الوحدة الرابعة -
9 pages
Lecture 1-Introduction and Relational Model
No ratings yet
Lecture 1-Introduction and Relational Model
33 pages
UCI 204 Notes (Education)
No ratings yet
UCI 204 Notes (Education)
33 pages
Database Concepts
No ratings yet
Database Concepts
32 pages
Lecture 1.2.5 - Relational Model (RM)
No ratings yet
Lecture 1.2.5 - Relational Model (RM)
15 pages
Dbs Unit I Notes
No ratings yet
Dbs Unit I Notes
15 pages
Pythonlearn 15 Databases
No ratings yet
Pythonlearn 15 Databases
96 pages
DBS Unit 1
No ratings yet
DBS Unit 1
23 pages
Database
No ratings yet
Database
47 pages
All in One Mysql Notes
No ratings yet
All in One Mysql Notes
24 pages
Advanced Data Base Note
No ratings yet
Advanced Data Base Note
62 pages
DBMS English
No ratings yet
DBMS English
4 pages
Dbms Ptu
No ratings yet
Dbms Ptu
245 pages
Dbms
No ratings yet
Dbms
95 pages
Chapter 3 (DBMS)
No ratings yet
Chapter 3 (DBMS)
30 pages
Introduction to Relational Databases
No ratings yet
Introduction to Relational Databases
116 pages
CS121Lec01 PDF
No ratings yet
CS121Lec01 PDF
37 pages
Class 6
No ratings yet
Class 6
29 pages
Database2024-Not para
No ratings yet
Database2024-Not para
95 pages
XII 1st PRE BOARD QP Withsolution 2023
No ratings yet
XII 1st PRE BOARD QP Withsolution 2023
13 pages
Text Mining
No ratings yet
Text Mining
16 pages
Hashing in DBMS
No ratings yet
Hashing in DBMS
4 pages
CSE6242 010 AnalyticsBuildingBlocks
No ratings yet
CSE6242 010 AnalyticsBuildingBlocks
24 pages
Oracle-Key-Vault-Administrators-Guide v21.1
No ratings yet
Oracle-Key-Vault-Administrators-Guide v21.1
533 pages
Saurabh Rathore CV
No ratings yet
Saurabh Rathore CV
2 pages
Weatherapp
No ratings yet
Weatherapp
46 pages
Career Insights and Job Skills Guide
No ratings yet
Career Insights and Job Skills Guide
3 pages
MS Access Notes: 9-25-2020 Dit-Ii Arshad Ali Soomro CS Instructor at IBA ITC Gambat
No ratings yet
MS Access Notes: 9-25-2020 Dit-Ii Arshad Ali Soomro CS Instructor at IBA ITC Gambat
18 pages
CMDB Schema Design White Paper
100% (1)
CMDB Schema Design White Paper
26 pages
Microsoft Azure Exam Az-900 Certification Whizcard
No ratings yet
Microsoft Azure Exam Az-900 Certification Whizcard
29 pages
Chapter 9 Business Analytics Assignements
No ratings yet
Chapter 9 Business Analytics Assignements
7 pages
Addmrpt - Jftech01 Snap 56453 Thru 56455
No ratings yet
Addmrpt - Jftech01 Snap 56453 Thru 56455
9 pages
BDA Unit 2
No ratings yet
BDA Unit 2
31 pages
Core Java Question
No ratings yet
Core Java Question
4 pages
Spring Cloud Data Flow - Animated
No ratings yet
Spring Cloud Data Flow - Animated
33 pages
SAP - Data Migration
No ratings yet
SAP - Data Migration
6 pages
Library Management System Adebesin Tobi Tosin-with-cover-page-V2
No ratings yet
Library Management System Adebesin Tobi Tosin-with-cover-page-V2
71 pages
DBA Task
No ratings yet
DBA Task
3 pages
SQL Server Logs & Monitoring Guide
No ratings yet
SQL Server Logs & Monitoring Guide
2 pages
Sample Paper
No ratings yet
Sample Paper
5 pages
Unit 5
No ratings yet
Unit 5
46 pages
Data Science Curriculum
No ratings yet
Data Science Curriculum
3 pages
402 Information Tech MS
No ratings yet
402 Information Tech MS
7 pages
Java Developer Profile
No ratings yet
Java Developer Profile
1 page
Akbar Gadhiya-Profile
No ratings yet
Akbar Gadhiya-Profile
6 pages
University Database Design Guide
0% (1)
University Database Design Guide
22 pages
E4C Administration and Monitoring Guide
No ratings yet
E4C Administration and Monitoring Guide
48 pages
Veritas NetBackup v10 Advanced Administration
No ratings yet
Veritas NetBackup v10 Advanced Administration
5 pages

01 Relationalmodel

Uploaded by

01 Relationalmodel

Uploaded by

Lecture #01: Relational Model & Algebra

15-445/645 Database Systems (Spring 2025)

2 Flat File Strawman

Issues with Flat Files

3 Database Management System

Common Data Models

15-445/645 Database Systems

15-445/645 Database Systems

5 Data Manipulation Languages (DMLs)

15-445/645 Database Systems

SQL: (SELECT * FROM R) INTERSECT (SELECT * FROM S)

7 Other Data Models

15-445/645 Database Systems

Vector Data Model

8 P0 Intro: Skip List

15-445/645 Database Systems

You might also like