CREATE
DATABASE
[ behind the curtain ]
Ever wondered what does CREATE DATABASE
[FooBar]; command do and which files get created?
This graph will give you a quick overview!
.mdf and .ldf FooBar.mdf and
files get
FooBar.ldf are two
binary files that store
created! EVERYTHING that
your DB contains!
FooBar.mdf contains all
the data that you have in .mdf stands
your DB. Schemas, for
Main Database
Tables, Columns, Rows,
Indexes, ... Pretty much
anything you'd call 'data'! File
.ldf stands FooBar.ldf contains log
for Log Data
of all WRITE commands
(e.g. INSERT, UPDATE,
File DELETE) that you
executed against a
Database
and that's it!
We will discuss Pages
and Data rows next
time!
C:\BitesizedEngineering
Pages &
Extents
Ever wondered what the content inside the .mdf file
is? SQL Server manages memory by splitting it into
chunks called "Pages" and then groups them into
sequences called "Extents". It's actually quite
interesting, so let's dive deeper into it!
Page (8KB)
Page is a Unit
Page Header of Memory
Mgmt!
Data row 1
SQL Server stores all of your
Data row 2 data (tables, rows, indexes,
etc.) using units of storage
called "Pages". You can think
of them as Book Pages -
Data row n single page contains
paragraphs, headings,
Free
images, etc.
space
slot array
Page Header Page Header Page Header Page Header
8 consecutive
Data row 1 Data row 1 Data row 1 Data row 1
Data row 2 Data row 2 Data row 2 Data row 2
Data row n Data row n Data row n Data row n
Free Free Free Free
space space space space
pages form Page Header
Data row 1
Page Header
Data row 1
Page Header
Data row 1
Page Header
Data row 1
"Extent"
Data row 2 Data row 2 Data row 2 Data row 2
Data row n Data row n Data row n Data row n
Free Free Free Free
space space space space
Extent (64KB)
Just like in books, you There are
have pages that contain DIFFERENT
Table of Content,
types of
Pages
Chapters & Text, Index of
C:\BitesizedEngineering
Words used, etc. It's the
same in SQL Server -
different pages are for
different purpose.
Data, Index,
LOB & GAM
are most
Data pages contain data itself,
Index pages contain info about
interesting Indexes, LOB pages contain
types Large Objects and GAM is a
Global Allocation Map which
tells you which extents are free
for further use!
You Should
You Should
Know!
Know!
Concept of Pages is not unique to
databases. It has it's origins in Virtual Pages are not unique to SQL Server
Memory which is how OS represents but are used in other relational DB
the memory you have available! solutions as well
BUFFER POOL
[ explained ]
Buffer Pool represents an in-memory cache storage
that stores all of the recently used Pages! As such, it is
one of the most important parts of the SQL Server's
DB Engine. Let's see how it works.
Gimme
data row
with ID 5
SQL Server Buffer Pool
Data is always
fetched from Same for
Buffer Pool! WRITING!
Whenever you want to read a It's the same! Every time you
row, SQL Server first checks if update a row, it's immediately
the Page with given row exists stored to Buffer Pool. In
in Buffer Pool. If yes - it's parallel, you have async process
returned immediately. If not - it running that once in a while
is read from HDD and stored in flushes the Buffer Pool content
buffer pool immediately! to disk.
Update
rows with
IDs 1, 2, 3
SQL Server
Buffer Pool
But, why?
Because SQL Server is I/O heavy and
Ideal size?
writing to RAM is order of magnitude
faster than writing to HDD! By using in-
memory cache (i.e. Buffer Pool), SQL
The bigger the better!
Server makes both reads and writes
super fast!
Enterprise version allows 32
times max_server_memory,
whereas Standard version
allows 4 times
What if data
max_server_memory.
is lost?
As we all know, RAM is volatile and is prone
to losing it's data! Luckily, SQL Server is
aware of it and this issue is solved using a
technique called "write-ahead logging!".
We'll go into more details on what it is in
the next article!
C:\BitesizedEngineering
WRITE-AHEAD
LOG
[ and WHYs behind it ]
Write-ahead logging represents one of the most
fundamental techniques employed in any DB engine.
The idea is simple - immediately write down every
WRITE transaction that is received, and THEN and
ONLY THEN do the rest of the mambo-jumbo (e.g.
updating Pages, Buffers Pools, etc.).
But,
why? Because
Durability!!
You wouldn't expect your DB to
lose your data, would you? Heck,
ensuring your data will be there no
matter WHAT happens is one of the
primary guarantees of any durable
DB engine! It's the D in ACID!
But,
how?
Save first,
process
afterwards!
As stated above idea is extremely simple,
whenever client sends a WRITE command (e.g.
INSERT, DELETE, ALTER, CREATE INDEX, etc.)
you IMMEDIATELY append it in it's row form to
Benefits?
a log file. This ensures that this file contains
chained list of commands in the order that they
were received from the client. ONLY after the
HDD confirms that write has, indeed, happened,
may DB process doing other things! And this has
MANY BENEFITS as you will see!
Crash
Recovery!
Should anything go wrong after you write
data to transaction log (e.g. SQL Server
Replication! dies abruptly), you are 100% safe that you
can RECOVER from it by simply REDOING
everything from the transaction log!
Want to make a clone of your Literally just apply the SQL statements in
current state? Just copy all the the same order they were received and,
commands that led to your current guess what - you end up in the same place
state, replay them and - voila! you were before! It's amazing really!
You're at the same state as the
Primary replica! Again - super
Auditing!
simple and extremely effective!
You can analyze the logs to see
WHO did WHAT and WHEN.
Reasons vary from regular audit to
C:\BitesizedEngineering analyzing security breaches!
EXTENTS
[ a deeper look ]
Extents are simple! They are just a name given to 8
consecutive Pages; and that's it! Their most important role is
in how SQL Server allocates space. In order to understand
that, we need to understand TYPES of Extents first ;)
8 KB 8 KB 8 KB 8 KB
Extent (64 KB)
There are TWO types:
Mixed Uniform
Extents Extents
Uniform Extents are
Mixed Extents may nothing more than 8
contain pages allocated consecutive Pages that
to different objects (e.g. belong to a SINGLE object
different tables, indexes, (e.g. single table or single
etc.). Starting from SQL index). Newer versions of
Server 2016, mixed SQL Server always tend to
extents are rarely used. allocate uniform extent
Table Table Table Table Index Index Table Table
A A B A 1 2 A B
Mixed
Extent
Table Table Table Table Table Table Table Table
A A a A A A A A
Uniform
Extent
C:\BitesizedEngineering
How is Disk Space
Being managed
[ story of Allocation Maps ]
We've so far learned that SQL Server stores everything in
Pages, which are then grouped in Extents. But how does
SQL Server know which extent are in use, and which are
free? And what belongs to what really? That's what
Allocation Maps are for! Let's dive into them.
What are
They are just
another type of
Allocation Page! Page whose
Maps? job is to do the
boring accounting
work!
There are
Global Allocation Maps
(GAMs), Shared Global
Allocation Maps Three Main
(SGAMs) and Page Types
Free Space (PFS).
GAM Pages keep
Global Allocation track of which Extents
Maps (GAM) are in use and which
ones are unallocated.
That's it!
SGAMs keep track of
which Extents are
used as Mixed Shared Global
Extents, which also Allocation Maps (SGAM)
happen to have at
least ONE free page
PFS Maps keep usage
data for each Data
Page. Specifically, they
track which ones are
Page Free Space free, which ones are
(PFS) occupied and how
much space is left on
each.
C:\BitesizedEngineering
Allocation Maps
[ from top to bottom ]
We've agreed that Extents are how SQL
Server is allocating & tracking storage. In
order to do that, it needs to have couple of
levels of overview. This article will discuss
those levels from highest to lowest one.
GAM
Global Allocation Maps are
highest level of organization.
GAM
They are simply tracking
which Extents are in-use and
which ones are free. That's it!
SGAM & PFS
SGAM & PFS
Shared Global Allocation Maps &
C:\BitesizedEngineering
Page Free Space go a bit deeper.
SGAM tracks which Extents are
Mixed and which ones are Uniform.
Page Free Space tracks how many
free pages are remaining in each
allocated Extent.
IAM
IAM
Index Allocation Maps are
most detailed ones! They exist
for each Table and keep track
of table data and allocation
units across the pages.
How are Large
Rows stored?
We said before that single table row has to fit
on a single Page. We also said that Page's
size is fixed to 8KB. So how the heck does
SQL Server store rows that are larger than
8KB? Let's find out!
How?
C:\BitesizedEngineering
8 KB
500 KB Page
Data
Turns out there are THREE types
of Data Pages out there. The basic
Three one is called IN_ROW data and
Types of
this one is limited to Page size.
Second one is OVERFLOW_DATA
Data Pages! which gets split from IN_ROW
data and stores only the overflow
1. 2OVER
. 3.
columns. Third one is Large Object
Data (LOB) that stores Texts,
IN
LOB
Images, etc.
ROW FLOW
one IN_ROW
one or more OVERFLOW_DATA
one or more LOB_DATA
500 KB
Data
Your row can't
exCeed 8kb in size.!
[ except that it can! ]
Ever tried defining VARCHAR column with
C:\BitesizedEngineering
>8000 bytes? It gives an error - column size
must be <= 8000 bytes. That's because row
has to fit into a single Page (8KB). Yet, you
can define MULTIPLE VARCHAR(8000)
columns and SQL server is happy to do it.
What's the trick?
Your FIXED-Size columns
MUST fit into 8KBs. No
exceptions! Try defining a table
Devil is in with 2 CHAR(8000) columns
the details!
and you'll get an error. But if
you change other column to
VARIABLE-sized one (e.g.
VARCHAR(800)) it will work.
The devil strikes!
From my research, initial versions
of SQL Server simply forced you
to either store your data in a Devil is in
the history!!
single page, or use "big data
storage" (TEXT or IMAGE). But
people complained so they had to
do something. And they did -
they introduced concept of
OVERFLOWS.
Fix was simple! Columns that vary
ROW in size and are larger than the
OVERFLOW
page size (8KB), get moved to
ANOTHER page (called
IS BORN! ROW_OVERFLOW), whereas the
original page with 'base data'
remains on initial IN_ROW page.
This IN_ROW page will contain a
pointer to new page so that SQL
Server knows where to find
'additional' data.
If your row spans multiple pages,
then Engine has to read all those
pages in order to fetch all the
necessary data! And that adds
unnecessary I/O which affects Mind the
performance! For max performance, Performance!
always aim at fitting your data into
IN_ROW allocations if possible!
2 GB sized
columns?
[ story of LOBs ]
8KB is max
column size you
can define But
varchar(max)
allows up to
We mentioned in previous
article that 8000 bytes is the
2gigs!
maximum size of the column
you can specify. And that's Huh would be a proper reaction,
because a single column has to right? What is
fit on a single Page (8KB in VARCHAR(MAX)? Turns out it's
size). a huge BLOB that can store up
to 2 gigs! And it can do so
because it uses storage referred
to as LOB data.
>> LOB stands for Large OBject! <<
What is LOB
data?
It's just a THIRD type of data
row, alongside with IN_ROW LOBs allow you
and ROW_OVERFLOW. It splits to store large
data. But With
your data into bunch of 8KB
chunks that get spread over
a penalty!
multiple pages. And max size is
2 gigs btw! By defining VARCHAR(MAX) or
NVARCHAR(MAX) you are
effectively defining a LOB
column that can hold up to 2
gigs of data. But that comes
with a performance cost!
Why
performance
cost?
Your data ends up being spread
over multiple pages! And given
that single page can hold up to
8KBs, that means that 2 gigs of
data would be spread accross Should I
TONS of pages! That's a LOT of avoid it?
data to iterate through to fetch
the value!
If you can - sure! It just takes
tons of space and requires
additional IAM pages to track
them. But, obviously, if you
You should know!
can't avoid them, just be sure to
know the cost that you will be
paying for it :)
LOB columns are still part of
IN_ROW data. However, the
value of a column is a pointer to
where LOB data begins and not
the data itself! C:\BitesizedEngineering
Hi, I Am
IAM!
[ the real purpose of IAM Pages ]
We briefly mentioned IAM Pages when we
talked about Allocation Units. In order to
understand them, we had to introduce
concept of three types of Data (IN_ROW,
OVERFLOW, etc.). Now that we covered all
of that, we can finally discuss what IAM
pages are. So, let's do it!
If you re-read previous
graphics, you'd notice that out
GAM, SGAM & PFS have no of four allocation maps, three
clue about your data! don't care about your tables. All
they care is free space and
which extents are in use. That
leaves the fourth one who
either cares or DB would never
know where to find table data!
Index Allocation Maps are
your savior! They are the
IAM to
ones that actually track
which extents contain data
from your table. And that's the rescue!
their only job!
There's
one IAM Each table gets its own IAM
per table Page(s).
What's more, each table will
And one IAM have one (or more IAMs) per
per Allocation Allocation Unit. What this
Unit means is that there's one IAM
chain for IN_ROW pages, one
chain for OVERFLOW and one
for LOBs.
IAMs are, after all - Pages. And
they are limited to 8KBs in size.
IAM
That means that each IAM can
track up to 4GB of Extents
Chains?
allocated to your table. If your
table is bigger than that - you'll
need additional IAM page(s).
Hence, they can be chained for
as long as needed to cover all
table data! C:\BitesizedEngineering
FileGroups &
Partitions
[ Divide & Conquer at its finest ]
Now that we know about .MDF being a file
that contains sequence of pages which in turn
store ALL your data, a question one simply
has to ask is - how much of a bottleneck can
single file be? And can we do something
about it? The answer is - YES, we can. That's
where FileGroups & Partitions come into play.
No Filegroups Yes Filegroups
.NDF .NDF
.MDF .LDF .MDF .LDF
.NDF
No Filegroups = your Filegroups split your
DB is one huge DB into smaller
binary file! chunks.
Divide&Conquer!
But, Load balancing at it's best!
Why?
Instead of having a single
huge binary blob, you split
your DB into bunch of
smaller files. Smaller file =
faster scan, easier backup,
etc. Divide & Conquery, baby!
NoPartitions.mdf YesPartitions.mdf
Table Table Table Table
Table A Table B
A A B B
Table Table
A B
Partitions are like Filegroups, but on Table-level!
They let you split your table into smaller chunks!
.MDF .NDF .NDF
Table Table Table Table Table Table Table Table
A A B B B A B B
Splitting Partitions over FileGroups gives
you the finest of load balancing!
C:\BitesizedEngineering