Lecture #08

ADVANCED DATABASE SYSTEMS
Storage Models & Data Layout
@Andy_Pavlo // 15-721 // Spring 2020

DATA ORGANIZATION

[Figure: an index points to fixed-length data blocks, which point to variable-length data blocks. Annotations: block id + offset; C++11 alignas; a 64-bit address split into a block pointer (44 bits) and an offset (20 bits).]
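
The 44-bit/20-bit split shown here can be illustrated with a little bit arithmetic. The sketch below is only an assumption about how such a packed address might be built: `PackSlot`, `BlockOf`, and `OffsetOf` are hypothetical helper names, and a real system may pack a raw, aligned block pointer rather than a small block id.

```cpp
#include <cassert>
#include <cstdint>

// Split taken from the slide: 44 bits identify the block, 20 bits the offset.
constexpr unsigned kBlockBits = 44;
constexpr unsigned kOffsetBits = 20;
constexpr std::uint64_t kOffsetMask = (std::uint64_t{1} << kOffsetBits) - 1;

// Hypothetical helper: pack a block id and an in-block offset into 64 bits.
inline std::uint64_t PackSlot(std::uint64_t block_id, std::uint32_t offset) {
  assert(block_id < (std::uint64_t{1} << kBlockBits));  // must fit in 44 bits
  assert(offset <= kOffsetMask);                         // must fit in 20 bits
  return (block_id << kOffsetBits) | offset;
}

// Upper 44 bits: which block the tuple lives in.
inline std::uint64_t BlockOf(std::uint64_t slot) { return slot >> kOffsetBits; }

// Lower 20 bits: where the tuple sits inside that block.
inline std::uint32_t OffsetOf(std::uint64_t slot) {
  return static_cast<std::uint32_t>(slot & kOffsetMask);
}
```

For example, `PackSlot(5, 1234)` round-trips to block 5 and offset 1234. With 20 offset bits, one block can address 2^20 positions (1 MiB if the offset counts bytes).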

15-721 (Spring 2020)

DATA ORGANIZATION

One can think of an in-memory database as just a large array of bytes.
→ The schema tells the DBMS how to convert the bytes into the appropriate type.
→ Each tuple is prefixed with a header that contains its meta-data.

Storing tuples as fixed-length data makes it easy to compute the starting point of any tuple.

Type Representation
Data Layout / Alignment
Storage Models
System Catalogs


DATA REPRESENTATION

INTEGER/BIGINT/SMALLINT/TINYINT
→ C/C++ Representation
FLOAT/REAL vs. NUMERIC/DECIMAL
→ IEEE-754 Standard / Fixed-point Decimals
TIME/DATE/TIMESTAMP
→ 32/64-bit int of (micro/milli)seconds since Unix epoch
VARCHAR/VARBINARY/TEXT/BLOB
→ Pointer to other location if type is ≥64-bits
→ Header with length and address to next location (if segmented), followed by data bytes.

VARIABLE PRECISION NUMBERS

Inexact, variable-precision numeric type that uses the “native” C/C++ types.

Store directly as specified by IEEE-754.

Typically faster than arbitrary precision numbers.
→ Example: FLOAT, REAL/DOUBLE

VARIABLE PRECISION NUMBERS

Rounding Example

    #include <stdio.h>

    int main(int argc, char* argv[]) {
        /* 0.1 and 0.2 are rounded to the nearest representable float. */
        float x = 0.1;
        float y = 0.2;
        /* printf promotes the float sum to double before printing. */
        printf("x+y = %.20f\n", x+y);
        printf("0.3 = %.20f\n", 0.3);
    }

Output

    x+y = 0.30000001192092895508
    0.3 = 0.29999999999999998890

FIXED PRECISION NUMBERS

Numeric data types with arbitrary precision and scale. Used when rounding errors are unacceptable.
→ Example: NUMERIC, DECIMAL

Typically stored in an exact, variable-length binary representation with additional meta-data.
→ Like a VARCHAR but not stored as a string
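
To see why an exact representation avoids the rounding seen with FLOAT, here is a minimal sketch of a fixed-point decimal. It is a simplification and not how any particular DBMS stores NUMERIC: the `Decimal` struct, `Add`, and `ToString` are hypothetical names, and the value lives in a single 64-bit scaled integer rather than a variable-length encoding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical fixed-point decimal: the value is units / 10^scale.
struct Decimal {
  std::int64_t units;  // scaled integer, e.g. 30 with scale 2 means 0.30
  int scale;           // number of digits after the decimal point
};

// Add two decimals that share a scale; integer addition is exact
// (overflow of the 64-bit integer is not handled in this sketch).
inline Decimal Add(Decimal a, Decimal b) {
  assert(a.scale == b.scale);
  return Decimal{a.units + b.units, a.scale};
}

// Render as text, e.g. {30, 2} becomes "0.30".
inline std::string ToString(Decimal d) {
  assert(d.scale >= 0);
  const bool negative = d.units < 0;
  // Negate in unsigned arithmetic so INT64_MIN does not overflow.
  const std::uint64_t magnitude =
      negative ? 0 - static_cast<std::uint64_t>(d.units)
               : static_cast<std::uint64_t>(d.units);
  std::string digits = std::to_string(magnitude);
  const std::size_t scale = static_cast<std::size_t>(d.scale);
  if (scale > 0) {
    // Left-pad with zeros so at least one digit precedes the point.
    if (digits.size() <= scale) {
      digits.insert(0, scale + 1 - digits.size(), '0');
    }
    digits.insert(digits.size() - scale, 1, '.');
  }
  return negative ? "-" + digits : digits;
}
```

With scale 2, adding 0.10 and 0.20 is `Add(Decimal{10, 2}, Decimal{20, 2})`, which yields exactly 30 units and prints as `0.30`.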


DATA LAYOUT

CREATE TABLE AndySux (
  id INT PRIMARY KEY,
  value BIGINT
);

char[]: header | id | value

reinterpret_cast<int32_t*>(address)
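
The cast above reads an attribute straight out of the byte array. A minimal sketch of the same idea follows, assuming an 8-byte header and densely packed attributes; the header size and the names `TupleStart`, `WriteTuple`, `ReadId`, and `ReadValue` are illustrative and not from the lecture. It uses `std::memcpy` instead of `reinterpret_cast` because the packed `value` field in this layout is not 8-byte aligned, and dereferencing a misaligned pointer is undefined behavior in C++.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Assumed layout: 8-byte header, then id (INT), then value (BIGINT).
constexpr std::size_t kHeaderSize = 8;
constexpr std::size_t kIdOffset = kHeaderSize;
constexpr std::size_t kValueOffset = kIdOffset + sizeof(std::int32_t);
constexpr std::size_t kTupleSize = kValueOffset + sizeof(std::int64_t);

// Fixed-length tuples: tuple i starts at byte i * kTupleSize.
inline std::size_t TupleStart(std::size_t i) { return i * kTupleSize; }

inline void WriteTuple(std::vector<char>& table, std::size_t i,
                       std::int32_t id, std::int64_t value) {
  assert(TupleStart(i) + kTupleSize <= table.size());
  char* tuple = table.data() + TupleStart(i);
  std::memset(tuple, 0, kHeaderSize);  // empty header in this sketch
  std::memcpy(tuple + kIdOffset, &id, sizeof(id));
  std::memcpy(tuple + kValueOffset, &value, sizeof(value));
}

// The schema says bytes [kIdOffset, kIdOffset + 4) hold a 32-bit integer.
inline std::int32_t ReadId(const std::vector<char>& table, std::size_t i) {
  assert(TupleStart(i) + kTupleSize <= table.size());
  std::int32_t id;
  std::memcpy(&id, table.data() + TupleStart(i) + kIdOffset, sizeof(id));
  return id;
}

// The schema says the next 8 bytes hold a 64-bit integer.
inline std::int64_t ReadValue(const std::vector<char>& table, std::size_t i) {
  assert(TupleStart(i) + kTupleSize <= table.size());
  std::int64_t value;
  std::memcpy(&value, table.data() + TupleStart(i) + kValueOffset, sizeof(value));
  return value;
}
```

Because every tuple has the same length (20 bytes under these assumptions), locating tuple `i` is a single multiplication.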


VARIABLE-LENGTH FIELDS

CREATE TABLE AndySux (
  value VARCHAR(1024)
);

INSERT INTO AndySux
  VALUES ("Andy has the worst hygiene that I have ever seen. I hate him so much.");

char[]: header | id | 64-bit pointer
(a later build of the slide shows: header | id | Andy | 64-bit pointer)

Variable-Length Data Blocks
LENGTH | NEXT | Andy has the worst hygiene that I have ever seen. I hate
LENGTH | NEXT | him so much.

NULL DATA TYPES

Choice #1: Special Values
→ Designate a value to represent NULL for a data type (e.g., INT32_MIN).

Choice #2: Null Column Bitmap Header
→ Store a bitmap in the tuple header that specifies what attributes are null.

Choice #3: Per Attribute Null Flag
→ Store a flag that marks that a value is null.
→ Have to use more space than just a single bit because this messes up word alignment.
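
Choice #2 can be sketched as a bitmap kept in the tuple header, one bit per attribute. This is an illustration under assumptions: `TupleHeader`, `SetNull`, and `IsNull` are hypothetical names, and a single 64-bit word limits the sketch to 64 attributes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical tuple header: bit i set means attribute i is NULL.
struct TupleHeader {
  std::uint64_t null_bitmap = 0;
};

// Mark or clear the NULL bit for one attribute.
inline void SetNull(TupleHeader& header, unsigned attr, bool is_null) {
  assert(attr < 64);
  const std::uint64_t bit = std::uint64_t{1} << attr;
  if (is_null) {
    header.null_bitmap |= bit;
  } else {
    header.null_bitmap &= ~bit;
  }
}

// Test the NULL bit without touching the attribute's own bytes.
inline bool IsNull(const TupleHeader& header, unsigned attr) {
  assert(attr < 64);
  return ((header.null_bitmap >> attr) & 1u) != 0;
}
```

Compared with a per-attribute flag (Choice #3), the bitmap costs one bit per attribute and leaves the attribute values themselves aligned.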

DISCLAIMER

The truth is that you only need to worry about word-alignment for cache lines (e.g., 64 bytes).

I’m going to show you the basic idea using 64-bit words since it’s easier to see…

WORD-ALIGNED TUPLES

All attributes in a tuple must be word aligned to enable the CPU to access them without any unexpected behavior or additional work.

CREATE TABLE AndySux (
  id INT PRIMARY KEY,  -- 32-bits
  cdate TIMESTAMP,     -- 64-bits
  color CHAR(2),       -- 16-bits
  zipcode INT          -- 32-bits
);

char[], split into 64-bit words: id | cdate | c | zipc
(packed back to back, so cdate straddles the first two words)

WORD-ALIGNED TUPLES

Approach #1: Perform Extra Reads
→ Execute two reads to load the appropriate parts of the data word and reassemble them.

Approach #2: Random Reads
→ Read some unexpected combination of bytes assembled into a 64-bit word.

Approach #3: Reject
→ Throw an exception and hope app handles it.

Source: Levente Kurusa

WORD-ALIGNMENT: PADDING

Add empty bits after attributes to ensure that the tuple is word aligned.

CREATE TABLE AndySux (
  id INT PRIMARY KEY,  -- 32-bits
  cdate TIMESTAMP,     -- 64-bits
  color CHAR(2),       -- 16-bits
  zipcode INT          -- 32-bits
);

char[], split into 64-bit words: id, padding | cdate | c, zipc, padding

WORD-ALIGNMENT: REORDERING

Switch the order of attributes in the tuples' physical layout to make sure they are aligned.
→ May still have to use padding.

CREATE TABLE AndySux (
  id INT PRIMARY KEY,  -- 32-bits
  cdate TIMESTAMP,     -- 64-bits
  color CHAR(2),       -- 16-bits
  zipcode INT          -- 32-bits
);

char[], split into 64-bit words: id, zipc | cdate | c, padding
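
The padding and reordering layouts for this table can be reproduced with a small offset calculation. The sketch below assumes the rule these slides illustrate: an attribute may share a 64-bit word with its neighbors but must not cross a word boundary. `LayOut` is a hypothetical helper, not code from the lecture.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kWord = 8;  // bytes in a 64-bit word

// Place attributes in the given order. An attribute that would cross a word
// boundary is pushed to the start of the next word. Returns each attribute's
// byte offset and stores the padded tuple size in *tuple_size.
inline std::vector<std::size_t> LayOut(const std::vector<std::size_t>& sizes,
                                       std::size_t* tuple_size) {
  std::vector<std::size_t> offsets;
  std::size_t pos = 0;
  for (std::size_t size : sizes) {
    assert(size > 0 && size <= kWord);
    const std::size_t room = kWord - pos % kWord;  // bytes left in this word
    if (size > room) {
      pos += room;  // pad out the rest of the current word
    }
    offsets.push_back(pos);
    pos += size;
  }
  *tuple_size = (pos + kWord - 1) / kWord * kWord;  // pad the tail
  return offsets;
}
```

In declared order (4, 8, 2, 4 bytes) the offsets come out as 0, 8, 16, 18; reordered as id, zipcode, cdate, color (4, 4, 8, 2 bytes) they are 0, 4, 8, 16. Both layouts occupy 24 bytes for this table, which matches the note that reordering may still need padding. The rule here is word-level only: zipcode at offset 18 stays inside one word but is not 4-byte aligned.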


CMU-DB ALIGNMENT EXPERIMENT

Processor: 1 socket, 4 cores w/ 2×HT
Workload: Insert Microbenchmark

                     Avg. Throughput
No Alignment         0.523 MB/sec
Padding              11.7 MB/sec
Padding + Sorting    814.8 MB/sec

Source: Tianyu Li

STORAGE MODELS

N-ary Storage Model (NSM)
Decomposition Storage Model (DSM)
Hybrid Storage Model

COLUMN-STORES VS. ROW-STORES: HOW DIFFERENT ARE THEY REALLY?
SIGMOD 2008

N-ARY STORAGE MODEL (NSM)

The DBMS stores all of the attributes for a single tuple contiguously.

Ideal for OLTP workloads where txns tend to operate only on an individual entity and insert-heavy workloads.

Use the tuple-at-a-time iterator model.

N-ARY STORAGE MODEL (NSM)

Advantages
→ Fast inserts, updates, and deletes.
→ Good for queries that need the entire tuple.
→ Can use index-oriented physical storage.

Disadvantages
→ Not good for scanning large portions of the table and/or a subset of the attributes.

DECOMPOSITION STORAGE MODEL (DSM)

The DBMS stores a single attribute for all tuples contiguously in a block of data.

Ideal for OLAP workloads where read-only queries perform large scans over a subset of the table’s attributes.

DECOMPOSITION STORAGE MODEL (DSM)

Advantages
→ Reduces the amount of wasted work because the DBMS only reads the data that it needs.
→ Better compression.

Disadvantages
→ Slow for point queries, inserts, updates, and deletes because of tuple splitting/stitching.
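
The trade-off between the two models can be made concrete with a row struct and a set of per-attribute arrays. This is a sketch under assumptions: the three-attribute schema and the names `Row`, `ColumnTable`, `ToColumns`, `Stitch`, `SumZipcodeNsm`, and `SumZipcodeDsm` are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// NSM (row store): all attributes of one tuple sit next to each other.
struct Row {
  std::int32_t id;
  std::int64_t cdate;
  std::int32_t zipcode;
};

// DSM (column store): one contiguous array per attribute.
// Position i in every array belongs to tuple i.
struct ColumnTable {
  std::vector<std::int32_t> id;
  std::vector<std::int64_t> cdate;
  std::vector<std::int32_t> zipcode;
};

// Splitting: every inserted tuple is broken up across the columns.
inline ColumnTable ToColumns(const std::vector<Row>& rows) {
  ColumnTable table;
  for (const Row& row : rows) {
    table.id.push_back(row.id);
    table.cdate.push_back(row.cdate);
    table.zipcode.push_back(row.zipcode);
  }
  return table;
}

// Stitching: a point query has to gather tuple i from every column.
inline Row Stitch(const ColumnTable& table, std::size_t i) {
  assert(i < table.id.size());
  return Row{table.id[i], table.cdate[i], table.zipcode[i]};
}

// Scanning one attribute in NSM walks over whole tuples.
inline std::int64_t SumZipcodeNsm(const std::vector<Row>& rows) {
  std::int64_t sum = 0;
  for (const Row& row : rows) sum += row.zipcode;
  return sum;
}

// The same scan in DSM reads only the zipcode array.
inline std::int64_t SumZipcodeDsm(const ColumnTable& table) {
  std::int64_t sum = 0;
  for (std::int32_t zipcode : table.zipcode) sum += zipcode;
  return sum;
}
```

Both scans return the same sum, but the DSM version never loads `id` or `cdate`, which is the "only reads the data that it needs" advantage. `ToColumns` and `Stitch` are the splitting and stitching work that makes point operations slower in DSM.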


DSM SYSTEM HISTORY

1970s: Cantor DBMS
1980s: DSM Proposal
1990s: SybaseIQ (in-memory only)
2000s: Vertica, VectorWise, MonetDB
2010s: Everyone

DSM: DESIGN DECISIONS

Tuple Identification
Data Organization
Update Policy
Buffering Location

OPTIMAL COLUMN LAYOUT FOR HYBRID WORKLOADS
VLDB 2019

DSM: TUPLE IDENTIFICATION

Choice #1: Fixed-length Offsets
→ Each value is the same length for an attribute.

Choice #2: Embedded Tuple Ids
→ Each value is stored with its tuple id in a column.

[Figure: two copies of a table with columns A, B, C, D and rows 0 to 3. Under "Offsets", the row position identifies the tuple across all columns. Under "Embedded Ids", every column stores the tuple id (0 to 3) next to each value.]

DSM: DATA ORGANIZATION

Choice #1: Insertion Order
→ Tuples are inserted into any free slot that is available in existing blocks.

Choice #2: Sorted Order
→ Tuples are inserted into a slot according to some ordering scheme.

Choice #3: Partitioned
→ Assign tuples to blocks according to their attribute values and some partitioning scheme (e.g., hashing, range).

DSM: DATA ORGANIZATION

INSERT INTO xxx VALUES (a2, b1, c5);

Data Table              Sorted Table
     A   B   C               A   B   C
0    a1  b1  c1         2    a1  b2  c8
1    a3  b2  c9         5    a1  b2  c9
2    a1  b2  c8         0    a1  b1  c1
3    a2  b2  c7         3    a2  b2  c7
4    a3  b1  c6         1    a3  b2  c9
5    a1  b2  c9         6    a3  b1  c1
6    a3  b1  c1         4    a3  b1  c6
7    a2  b1  c5         7

Sort Order: (A↑, B↓, C↑)
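
A mixed sort order like (A↑, B↓, C↑) can be written as a single comparator. The sketch below is one way to do it in standard C++; the `Tuple` struct and the function names are hypothetical. It swaps the two B operands inside the `std::tie` comparison so that only B is ordered descending.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <tuple>
#include <vector>

struct Tuple {
  int id;  // position in the unsorted data table
  std::string a, b, c;
};

// Orders by A ascending, then B descending, then C ascending.
inline bool SortOrderLess(const Tuple& x, const Tuple& y) {
  // y.b on the left and x.b on the right reverses the order for B only.
  return std::tie(x.a, y.b, x.c) < std::tie(y.a, x.b, y.c);
}

// Returns the tuple ids in sorted-table order.
inline std::vector<int> SortedIds(std::vector<Tuple> tuples) {
  std::stable_sort(tuples.begin(), tuples.end(), SortOrderLess);
  assert(std::is_sorted(tuples.begin(), tuples.end(), SortOrderLess));
  std::vector<int> ids;
  for (const Tuple& tuple : tuples) ids.push_back(tuple.id);
  return ids;
}
```

Under this comparator the newly inserted tuple (a2, b1, c5) sorts directly after (a2, b2, c7), so a sorted layout has to shift every later tuple down by one slot to make room for it.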

CASPER DELTA STORE

Range-partitioned column store with a "shallow" order-preserving index above it.
→ Shallow index maps value ranges to partitions.
→ Index keys are sorted but the individual columns are not.

DBMS runs an offline optimization algorithm to determine the optimal partitioning of data.

OPTIMAL COLUMN LAYOUT FOR HYBRID WORKLOADS
VLDB 2019

CASPER DELTA STORE

INSERT INTO xxx VALUES (a2, b1, c5);
INSERT INTO xxx VALUES (a2, b2, c6);

Shallow Index (key→partition)

Data Table
     A   B   C
0    a1  b2  c8
1    a1  b2  c9
2    a1  b1  c1
3    a2  b2  c7
4    a2  b1  c5
5    a2  b2  c6
6    a3  b1  c1
7    a3  b2  c9
8    a3  b1  c6
9

OBSERVATION

Data is “hot” when it enters the database.
→ A newly inserted tuple is more likely to be updated again in the near future.

As a tuple ages, it is updated less frequently.
→ At some point, a tuple is only accessed in read-only queries along with other tuples.

HYBRID STORAGE MODEL

Single logical database instance that uses different storage models for hot and cold data.

Store new data in NSM for fast OLTP.

Migrate data to DSM for more efficient OLAP.

HYBRID STORAGE MODEL

Choice #1: Separate Execution Engines
→ Use separate execution engines that are optimized for either NSM or DSM databases.

Choice #2: Single, Flexible Architecture
→ Use single execution engine that can efficiently operate on both NSM and DSM databases.

SEPARATE EXECUTION ENGINES

Run separate “internal” DBMSs that each only operate on DSM or NSM data.
→ Need to combine query results from both engines to appear as a single logical database to the application.
→ Must use a synchronization method (e.g., 2PC) if a txn spans execution engines.

Two approaches to do this:
→ Fractured Mirrors (Oracle, IBM)
→ Delta Store (SAP HANA)

FRACTURED MIRRORS

Store a second copy of the database in a DSM layout that is automatically updated.
→ All updates are first entered in NSM then eventually copied into DSM mirror.

[Figure: transactions go to the NSM primary; analytical queries go to the DSM mirror.]

A CASE FOR FRACTURED MIRRORS
VLDB 2002

DELTA STORE

Stage updates to the database in an NSM table.

A background thread migrates updates from delta store and applies them to DSM data.

[Figure: transactions write to the NSM delta store, which is migrated into the DSM historical data.]
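
A toy version of the delta-store idea is sketched below. It is single-threaded and makes several assumptions not stated on the slide: the two-attribute schema, the class name `DeltaStoreTable`, and an explicit `Migrate()` call standing in for the background thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Row {
  std::int32_t id;
  std::int64_t value;
};

// Hypothetical table with an NSM delta store in front of DSM historical data.
class DeltaStoreTable {
 public:
  // New tuples are staged row-wise, which keeps inserts cheap.
  void Insert(Row row) { delta_.push_back(row); }

  // The step a background thread would run: split staged rows into columns.
  void Migrate() {
    for (const Row& row : delta_) {
      ids_.push_back(row.id);
      values_.push_back(row.value);
    }
    delta_.clear();
    assert(ids_.size() == values_.size());  // columns stay position-aligned
  }

  // A query has to combine both stores to see one logical table.
  std::int64_t SumValues() const {
    std::int64_t sum = 0;
    for (std::int64_t value : values_) sum += value;  // DSM historical data
    for (const Row& row : delta_) sum += row.value;   // NSM delta store
    return sum;
  }

  std::size_t DeltaSize() const { return delta_.size(); }
  std::size_t ColumnSize() const { return values_.size(); }

 private:
  std::vector<Row> delta_;            // NSM: recent updates
  std::vector<std::int32_t> ids_;     // DSM: one array per attribute
  std::vector<std::int64_t> values_;
};
```

`SumValues()` returns the same result before and after `Migrate()`; only where the data lives changes. A real system would also need concurrency control between the migration thread and running transactions, which this sketch leaves out.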


PELOTON ADAPTIVE STORAGE

Employ a single execution engine architecture that can operate on both NSM and DSM data.
→ Don’t need to store two copies of the database.
→ Don’t need to sync multiple database segments.

Note that a DBMS can still use the delta-store approach with this single-engine architecture.

BRIDGING THE ARCHIPELAGO BETWEEN ROW-STORES AND COLUMN-STORES FOR HYBRID WORKLOADS
SIGMOD 2016

PELOTON ADAPTIVE STORAGE

UPDATE AndySux
   SET A = 123,
       B = 456,
       C = 789
 WHERE D = “xxx”

SELECT AVG(B)
  FROM AndySux
 WHERE C = “yyy”

[Figure: "Original Data" and "Adapted Data", each a table with columns A, B, C, D; the adapted data is divided into a hot region and a cold region.]

PELOTON ADAPTIVE STORAGE

[Figure: execution time (ms, axis 0 to 1600) for alternating Scan and Insert phases, labeled Sep-15 through Sep-20, comparing Row Layout, Column Layout, and Adaptive Layout.]

SYSTEM CATALOGS

Almost every DBMS stores its database's catalogs the same way that it stores regular data.
→ Wrap object abstraction around tuples.
→ Specialized code for "bootstrapping" catalog tables.

The entire DBMS should be aware of transactions in order to automatically provide ACID guarantees for DDL commands and concurrent transactions.

SCHEMA CHANGES

ADD COLUMN:
→ NSM: Copy tuples into new region in memory.
→ DSM: Just create the new column segment.

DROP COLUMN:
→ NSM #1: Copy tuples into new region of memory.
→ NSM #2: Mark column as "deprecated", clean up later.
→ DSM: Just drop the column and free memory.

CHANGE COLUMN:
→ Check whether the conversion can happen. Depends on default values.

INDEXES

CREATE INDEX:
→ Scan the entire table and populate the index.
→ Must record changes made by txns that modified the table while another txn was building the index.
→ When the scan completes, lock the table and resolve changes that were missed after the scan started.

DROP INDEX:
→ Just drop the index logically from the catalog.
→ It only becomes "invisible" when the txn that dropped it commits. All existing txns will still have to update it.

SEQUENCES

Typically stored in the catalog. Used for maintaining a global counter.
→ Also called "auto-increment" or "serial" keys

Sequences are not maintained with the same isolation protection as regular catalog entries.
→ Rolling back a txn that incremented a sequence does not roll back the change to that sequence.
→ Otherwise, all INSERT queries would incur write-write conflicts.
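
The rollback behavior can be shown with a counter that lives outside transactional state. This is a sketch under assumptions: `Sequence` and `Txn` are hypothetical classes, the "table" is just a vector of keys, and an aborted transaction simply discards its buffered inserts.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical sequence: a global counter that is never rewound.
class Sequence {
 public:
  // fetch_add returns the old value, so the first key handed out is 1.
  std::int64_t NextVal() { return issued_.fetch_add(1) + 1; }
  std::int64_t LastIssued() const { return issued_.load(); }

 private:
  std::atomic<std::int64_t> issued_{0};
};

// Hypothetical transaction that buffers inserted keys until commit.
class Txn {
 public:
  Txn(Sequence& seq, std::vector<std::int64_t>& table)
      : seq_(seq), table_(table) {}

  // Draw a key from the sequence and buffer the new row.
  std::int64_t Insert() {
    assert(!finished_);
    pending_.push_back(seq_.NextVal());
    return pending_.back();
  }

  // Make the buffered rows visible in the table.
  void Commit() {
    table_.insert(table_.end(), pending_.begin(), pending_.end());
    Finish();
  }

  // Discard the buffered rows; the sequence is deliberately left alone.
  void Abort() { Finish(); }

 private:
  void Finish() {
    pending_.clear();
    finished_ = true;
  }

  Sequence& seq_;
  std::vector<std::int64_t>& table_;
  std::vector<std::int64_t> pending_;
  bool finished_ = false;
};
```

If one transaction draws key 1 and aborts, the next transaction gets key 2, leaving a gap in the committed keys. Because the counter is bumped atomically and never restored, concurrent inserters do not have to wait on each other's commit or abort.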


PARTING THOUGHTS

We abandoned the hybrid storage model.
→ Significant engineering overhead.
→ Delta version storage + column store is almost equivalent.

Catalogs are hard.
