Table of Contents
1. Database - Choosing a database.............................................................................................................................1
2. Database - Relational – ACID and Isolation levels....................................................................................................2
3. Database - Relational – ACID in distributed environment, 2PC?..............................................................................3
4. Database - Relational – Indexes, Faster writes........................................................................................................3
5. Database - Relational – What is N+1 problem.........................................................................................................4
6. Database - Non-relational – What is NoSQL............................................................................................................5
7. Database - Non-relational – Types of NoSQL...........................................................................................................5
8. Database - Non-relational - MongoDB....................................................................................................................6
9. Database - Non-relational - Can we have ACID in NoSQL?.......................................................................................6
10. Database Optimization/Performance - how to scale DB?....................................................................................7
11. Database Optimization/Performance - Database Partitioning.............................................................................7
12. Database Optimization/Performance - Sharding.................................................................................................7
13. Database Backup – DB Replication......................................................................................................................8
14. Database - Database Performance - Caching.......................................................................................................9
15. Database - Common Databases (AWS, Azure).....................................................................................................9
16. Database - Common NoSQL databases................................................................................................................9
17. Database - No SQL Performance – Lucene Index, Inverted index?......................................................................11
18. Datawarehouse - OLAP vs OLTP........................................................................................................................11
19. Datawarehouse - Row based vs Column based..................................................................................................12
20. Datawarehouse - OLAP – AWS Redshift.............................................................................................................12
1. Database - Choosing a database
Functional considerations
Query pattern
Search by key (Redis/ DynamoDB /s3)
Search by key + sometimes other key (DynamoDB /Cassandra)
Search by fuzzy (elastic)
Search by joins (rdbms), graph
Nature of data
Structured or unstructured
Data models
Data models need joins, relationship
Null values
No fixed columns
Non-Functional considerations
Consistency or Availability
Strong Consistency(rdbms)
Eventual Consistency (NoSQL, focus on HA), multiple datacenter and compromise on SC.
Volume of growth
TB/PB per month?
Object storage if infinite growth like S3
Performance
Scalability
RW separation, sharding
Distribute data horizontally
Scalability
RW separation, sharding
Distribute data horizontally
License cost
Switch to Aurora
2. Database - 4 main differentiator (Schema, Txn, Analytics,Scaling)
Fixed schema (Fixed vs Flexibility)
RDBMS follows fixed schema
(-) NoSQL, the responsibility lies with the application to maintain data integrity
Transaction Management (ACID vs Base)
(-) NoSQL does not follow ACID strictly, rather it follows BASE model (Basically Available, Soft state, Eventual consistency)
Soft state means the state of the system can be in a flux and change over time, even without input.
However, different NoSQL db offer different transaction management.
MongoDB
- Mongo supports ACID transaction in "document level". However, in multi document level, it started giving
ACID support since v4.0, that is complex and has performance limitations. With v4.2, MongoDB extended
multi-document transaction support to sharded clusters, enabling ACID transactions across distributed data.
Cassandra:
- Cassandra supports lightweight transactions for operations like compare-and-set (CAS), but it does not
support full ACID transactions. Cassandra follows the BASE model and focuses on eventual consistency
rather than strong consistency.
Redis:
- Transactions: Redis supports transactions with commands like MULTI, EXEC, DISCARD, and WATCH.
However, Redis transactions are not fully ACID-compliant and do not support rollbacks.
Modern NoSQL databases increasingly support ACID transactions, though the scope and implementation vary
widely. While single-document ACID compliance is standard, more complex multi-document or multi-record
transactions are now possible in some systems, but they may have limitations in distributed scenarios. The
choice depends on the specific use case, required transactionality, and trade-offs in performance and scalability.
Analytics
- SQL databases are generally better suited for traditional analytics with structured data and complex queries,
JOINS, while NoSQL databases shine in scenarios requiring scalability, flexibility, and real-time analytics.
Often, a hybrid approach using both types of databases, along with specialized solutions, can provide the
best of both worlds.
2 main advantages for NoSQL
Flexibility
Scalability
3. Database - Relational – ACID and Isolation levels
ACID
4 essentials of a relational transaction i.e., a unit of work
Atomic
All or none
Consistency
One valid state to another
Correctness, maintain DB integrity and constraints
Isolation
Tx1 unaware of Tx2 local changes
Dirty Read
Txn1 cannot read of Tx2 unless changes are committed.
Changes local till committed.
Solution: READ_COMMITTED
Non repeatable Read
Txn1 reads two different values of Tx2 for the “same row” data
Txn1 reads 100EUR for CustA-> Tx2 updates CustA balance to 50EUR -> Txn1 reads 50EUR, cant find 100
Solution: REPEATABLE_READ by using versions, timestamp
Phantom Read
Reading additional or missing rows in query i.e., essentially not same rows
Txn1 reads 2 customers with balance less than 100 EUR -> Tx2 has updated CustC balance -> Txn1 reads
again and sees 3 customers.
Solution: SERIALIZABLE
While both involve inconsistencies caused by concurrent transactions, the key distinction is whether the
inconsistency relates to row count (phantom read) or data values (non-repeatable read).
4 isolation levels
READ_UNCOMMITTED
Lowest level, dirty read possible
READ_COMMITTED
Resolves dirty read issue, i.e., Txn1 will read only committed values of Tx2
REPEATABLE_READ
Resolves unrepeatable read issue
Txn1 will work on their respective “snapshot” version and modify locally
SERIALIZABLE
Highest level of isolation
Resolves all issues, but Locks and performance
ACID or BASE?
ACID
Consistency (Strong) over Availability
Focus on correctness
Feature of Relational
BASE
Basically Available, Soft state, Eventual consistency
Availability over consistency
Soft state i.e., state may change eventually
+ No blocking, + No ACID overhead
New NoSQL focusing on ACID as well
Primary vs Unique
Primary attribute in a table to identify row vs Multiple attributes
Not changed once set vs Multiple updates
Not null vs nullable
Clustered index vs non-clustered index.
4. Database - Relational – ACID in distributed environment, 2PC?
Difficult to achieve.
Careful coordination between nodes in 2 phases – Prepare + Commit
Negatives
- Blocking, records blocked until all nodes consent to execute transaction
System unavailable unless finished.
- Network latency
- Difficult to achieve scalability, 2PC is antithesis of Scalability
5. Database - Relational – Indexes, Faster writes
Table scan
Scan all rows by rows
Index Scan
Query optimizer finds a particular page.
Index -> page -> row
Select * from employee where eid<10
S1 = Find rows stored in index
S2 = Go to heap and find 10 records sequentially
Index Only Scan
More efficient, faster
Get column information from index only, avoid going to heap; No need to go to actual table data
All necessary information in index itself, no need to go to table for any additional info.
Index -> page -> row
Select * from employee where eid<10
S1 = find rows stored in index
No S2
Why write is faster in row-based database? But not aggregation?
Row based data is stored next to each other, if write one more, append it.
Read is fine, aggregation is the issue.
Select sum(salary) from employee.
If employee is in different disks, load to memory first.
Columnar database
Columns are stored next to each other
Sum(age), data stored in single disk access
OLAP, compression, aggregation
How are tables stored and how is index stored?
Row -> Page -> Heap
Rowid
Pseudo id, system maintained
Page
Rows stored in pages
Fixed size, e.g., 8kb for postgress, 16kb in MySQL
Select * from employee.
Doesn’t read rows but returns page
Heap
Data structure holding pages, actual data storage
Heap = Page0 + Page1 + Page2 + Page3 ….. + PageN
Indexes used to find what part of heap we need to read
Select * from employee where eid=10
//no index
//scan page0=no result
//scan page1=no result
//scan page3=find
//index
//scan page3=find , index tells which page to hit
Clustered vs non clustered index?
Clustered index
Index which stores data based on PK
Determines the physical order of data rows on disk.
One CI per table, always stored sorted
Usually CI is PK or the column which is most frequently used in queries
Non-Clustered index
NCI are used for query optimization purpose, e.g., employee is mostly searched using name
No affect on the physical order of data on disk.
Data Structure of index
BTree
Self-balanced tree
They automatically maintain their balance during insertions and deletions by redistributing keys between
nodes. This ensures that the height of the tree remains relatively constant, which in turn guarantees
efficient search and access operations, making them suitable for database indexing
Stores data in both internal and leaf nodes,
BTree+
Self-balanced tree
They automatically maintain their balance during insertions and deletions by redistributing keys between
nodes. This ensures that the height of the tree remains relatively constant, which in turn guarantees
efficient search and access operations, making them suitable for database indexing
Stores data in only leaf nodes, optimizing range queries and sequential access.
6. Database - Relational – What is N+1 problem
Problem issue while querying relational database
Initial query returns 1 row for all primary entities
Subsequent N query for associated values with each N primary entities.
Example
2 students, and each having 3 course each
StudentA = History,Science,Maths
StudentB = Geography,Humanities,English
N = primary entities i.e. 2.
If we want to retrieve all students and their respective courses? N+1 i.e. 3 times
- 1 time to fetch all students initially
- N time to fetch respective courses for each student, select 2 times iteratively for each student
Solution
Use eager Fetching, fetch all in one go, don’t go for second query
Use join
7. Database - Non-relational – What is NoSQL
Important properties
1. Non-relational (no select * from employee)
2. Unstructured data
3. Schema less, flexible
4. Follows CAP and not ACID
5. Usually used for narrow access pattern
Positives
Scale horizontally
Performance
Flexible schema
When to choose SQL When to choose No-SQL
Data is structured Data is un-structured; Needs flexiblity
ACID is required No ACID is required
Complex joins Scalability and performance
Volume growth stable Huge volume of data
Access patterns in database – Narrow vs wide
How query retrieves data
Narrow using some key like ID or wide using some range like in (20,30,40)
NoSQL has narrow predefined access pattern to have predictable performance.
If access pattern is not well defined – use SQL
If access patten is narrow – use NoSQL
Narrow access patten
SELECT * FROM employee WHERE eid=10;
Wide access patten
SELECT * FROM employee WHERE age BETWEEN 25 AND 35;
8. Database - Non-relational – Types of NoSQL
1. Key Value
Stored as KV, query by a field
Stores data which rarely changes
Use cases
Session management, user preferences, profile
Positives
Horizontal scalability
Example
Redis, Memcached
2. Document DB
Data represented as document e.g., json
Use cases
Content Management
Example
Elastic search, AWS DocumentDB
3. Graph DB
Stores data as entity and relationship
Use cases
Social networking, fraud management
Example
Neo4J, AWS Neptune
4. In memory
Stores data on RAM storage, no disk access
Positives
Minimum latency, no disk access
Negatives
Data loss on crash
Use cases
Realtime gaming, analytics
Example
Redis, Memcached , AWS Elasticache
5. Time Series
Collect store, process data by timestamp sequence
Use cases
Event tracking
6. Columnar DB
Traditional DB, all columns of a row stored together
Columnar, data stored in column oriented
Column wise compression, aggregation, OLAP
Use cases
Reporting, data warehouse, OLAP
Positives
Faster instead of reading rows by row; directly from column
Negatives
Slow writes vs fast writes of row based
Example
Cassandra, HBase
9. Database - Non-relational - MongoDB
Requirement of storing unstructured or semi structured data
E.g., product catalogs, descriptions, images etc
E.g., IoT data, large volume sent by IoT devices, sensor data, telemetry data
E.g., Geospatial Data: MongoDB has built-in support for geospatial data and is used in applications that require
location-based services, such as mapping and location tracking.
Negatives
Lack of Transactions: Until recent versions, MongoDB lacked support for multi-document transactions, which
made it challenging to maintain data consistency in certain scenarios. While transactions are now supported,
they may not perform as efficiently as in traditional relational databases.
Memory Usage: MongoDB relies heavily on memory for performance. Large datasets or indexes can consume a
significant amount of RAM, potentially leading to memory pressure on the system.
Complexity in Shard Configuration: Horizontal scalability in MongoDB involves sharding, which can be complex
to set up and manage, especially in large-scale deployments.
10. Database - Non-relational - Can we have ACID in NoSQL?
NoSQL prioritize horizontal scalability and availability over consistency.
Hence, not usually; BASE and eventual consistency preferred over ACID
MongoDB supports distributed multi document acid transactions starting v4
Perform multiple operations on multiple documents within a transaction, MongoDB ensures ACID maintained.
Keep in mind the performance trade-offs. Prefer non-transactional operations for simple use case.
RAFT algorithm
Raft is a distributed consensus algorithm designed to ensure that a distributed system remains consistent
even in the presence of failures.
A designated transaction coordinator manages the process, ensuring that write operations are committed
only when acknowledged by majority of nodes in the replica set, providing strong consistency and durability
even in the face of failures.
Like Master-Slave, there is a Primary node responsible to handle all W and coordinate.
Consensus among secondary nodes before commit. Elect primary in case of failure.
Choose Relational if we want to have ACID, NoSQL is not made to be ACID compliant
11. Database Optimization/Performance - how to scale DB?
1. Hardware upgrade, simplest
2. Sharding or horizontally partition data across multiple servers. Done usually for NoSQL
DB Partitioning or vertically partition data in single server. Done usually for RDBMS
3. Read/Write replicas, different DB for read and write. Not easy for RDBMS
12. Database Optimization/Performance - Database Partitioning
More common for Relational DB. Although can be applied for both.
Large tables split into several partitions based on some range like date range.
Partitioning = aim is query performance
Partitioning = single server
Vertical partitioning
Divides a database table into multiple tables, each containing a subset of columns (attributes).
Each partitioned table represents a distinct aspect or subset of the original data.
E.g employee table with 100 records divided into 2 tables => EMP_BASIC, EMP_CONTACT etc
Horizontal partitioning
Sharding
Horizontal partitioning divides a database table into multiple tables, each containing a subset of rows (records
or tuples). Each partitioned table represents a segment of the original data, often based on a defined criterion
like a range of values or a specific attribute.
E.g employee table with 100 records divided into EMP_EU, EMP_IN etc
13. Database Optimization/Performance - Sharding
More common for NoSQL. Although can be applied for both.
Sharding is a database architecture strategy used to horizontally partition data across multiple servers or nodes to
improve scalability and performance.
A large database is divided into smaller, more manageable pieces called "shards," and each shard is stored on a
separate server or node. This allows data to be distributed evenly and reduces the load on any single server,
enabling the database to handle high volumes of data and requests.
Positives
Scalability
Performance
High Availability, No SPOF
Fault tolerance
Negatives
Complexity of data distribution, data consistency etc.
Query routing
Sharding key selection
Data migration and rebalancing shards.
Some shards overutilized, some underutilized.
For which kind of Database sharding is done?
Both kind of databases
Relational database
Sharding is less common in traditional relational databases due to inherent complexity of splitting data.
Non-Relational database
Sharding is more common in NoSQL databases like MongoDB, Cassandra, Couchbase etc.
NoSQL are made with horizontal scalability in mind.
Factors to consider while sharding
Data size, is volume expected to grow?
Performance, too many RW?
Latency, is it multi geo app?
How to shard data, selection of sharding key?
Geo Sharding, if column like location, region exists in database
Create consistent hashing on some keys.
How to know which shard to query?
Each query will have keys/attributes from which SHARD key will be derived.
Key should be predictable, immutable and part of request e.g. /search/country=IN&name=abcd
Based on those keys, create hasing key
ConHash(IN, abcd) = K1
ConHash(UK, xyz) = K2
Client library request -> Master node -> Calculate hash based on request -> Redirect to respective node
14. Database Backup – DB Replication
Different types of replications
1. Master-Slave (Primary/Secondary)
Master writes
Slave copies, replicates from Master
+ No write conflicts
+ High R availability
+ Sync or Async
- Consistency
Client writes ----> Master --Unidirectional Copy--> Slave
Client reads ----> Slave
2. Master-Master (Multi Master)
Scale write operations
Complex and conflicts
+ High R/W availability
- Complex and conflicts
Client writes ----> Any Node --Bidireactional Copy --> Other Nodes
Client reads ----> Any Node
DynamoDB
RAID (Redundant Array of Independent Disks)
Object storage principle
Store same data in multiple places (HD)
Commonly used in DB to protect data from failure, performance or fault tolerance.
RAID CONTROLLER, handles the operations
Different raid levels (6+), different offers of data loss, capacity
RAID 0 (Striping)
data stripped across multiple drives.
(+) Fastest e.g., 100 rows = (50 rows) HD1 + (50 rows) HD2
(-) No redundancy, if one drive fails, data lost
RAID 1 (Mirroring)
data duplicated in 2+ drives.
(+) Redundancy, if one drive fails, data exists in another
RAID 3 (Byte level Stripping with Dedicated parity)
data is striped at the byte level across multiple data drives, and a dedicated parity drive is used to store
parity information.
(+) High Data transfer rates because of byte level stripping
(-) No redundancy for parity drive, if its lost, entire array is unusable.
Other RAID etc.
15. Database - Database Performance - Caching
Negative: Cache miss storm
Cache used for database slowness
But what if cache itself brings down application performance
Multiple clients simultaneously requesting same data
All will hit DB
Solutions
Keep stale data till cache compute is over
Sleep shortly, then refetch
Negative: In memory cache for user details, downside of it?
Scalability issue
If user details are over cache, user must access cache for authorization
During scaling, this has to be additionally considered
16. Database - Common Databases (AWS, Azure)
AWS
Object Storage S3
RDBMS Aurora
NoSQL/Key ElasticCache
NoSQL/Document ElasticSearch
NoSQL/Graph Neptune
NoSQL/Timeseries Timestream
Ledger AWS Quantum
OLAP Athena
17. Database - Common NoSQL databases
Cassandra
Columnar Database
Master Less i.e., no master slave.
CQL like SQL
CAP, highly available, partition tolerant, tunable consistency
Taken concepts from DynamoDB
Partitioning is hash based (consistent hashing)
+ Superfast Efficient Writes
Because it does not require disk read or seek. In memory or Memtables, SSTabales (sorted string tables),
Commit log
Client write -> memTable -> ssTable (no disk access)
Write written to log file
All writes are APPEND only. No insert, only update.
+ Highly Available
+ Scalable
Consistency/Quorum in Cassandra
5 nodes = N1, N2, N3, N4, N5
E.g., Replication Factor=3, Quorum = 2, N5 becomes coordinator.
Client -> N5 -> replicates to any 3 servers -> if success in 2, commit, won’t care for 3rd
Quorum = (Replication Factor/2)+1
MongoDB
Document Database
Data in binary json (bjson)
2PC on multiple documents
+ Efficient read/search
- Write overhead
DynamoDB
Key Value Database, KV less than 1MB
No Master, user can write on any node. Consistent hashing
Highly Available / Scalable
10 million R/W per second
Primary key has 2 parts
Partition Key (determine partition, ensures even data distribution)
Sort key (Optional, sorting within partition)
Example
In a table storing books, ISBN can be Partition Key to uniquely identify books, and Publication Date as Sort Key
to organize editions chronologically.
Elastic Search/Open Search
Document Database
Analytics engine + search engine
Enforce schema like RDMBS using – Mappings
POST /localhost/employee “mappings” {“properties” {id:”type:num”}}
DB -> tables -> rows
Index -> types -> document properties
Document: smallest unit of storage e.g., row of db
Type: Collection of similar documents
Index: Logical namespace pointing to 1-n shards
Place to store related data
Shard: Index divided into logical shards
Ix1 = S1(R1) + S2(R2) + S3(R3) .…. R=replica, optional
Node: server
Cluster: group of servers
Lucene
Single instance of shard, high performance search engine, in java
Lucene uses inverted index
Shard
How index is stored in cluster
Shards are container of data
Shards allocated to nodes
Shards can be primary or replica
Each document belongs to 1 primary shard
Yellow status of shard
All primary nodes are active, but NO REPLICA
Cluster is successfully capable of accepting requests
Clustering
Automatically assigning shards to nodes.
18. Database - No SQL Performance – Lucene Index, Inverted index?
Normal/Forward Index Inverted Index
A sentence is broken into terms e.g., “Hello World” has 2 terms = Stores words as index
“Hello”, “World” Hello = Doc1
Stores document as index, words are mapped reference World = Doc1, Doc2
Doc1= hello, world hey= Doc1
Doc2=good, morning
+ Indexing is faster because key remains the same, new words added - Indexing is slower, as words must be checked before preparing index
- Slow search, e.g., index at beginning of the book + Fast search, e.g., glossary at end of book
Lucene Index
Example of inverted index
Maps words/terms to the document
data structure used for full-text search in information retrieval systems
Apache Lucene, open-source search library written in Java.
19. Datawarehouse - OLAP vs OLTP
OLAP vs OLTP
Transaction vs Aggregation
Row based vs Column based
Read/Write vs Read intensive
20. Datawarehouse - Row based vs Column based
Row based vs Column based
21. Datawarehouse - OLAP – AWS Redshift
Redshift
Clustered warehouse
Each cluster can house multiple databases.
Each database contains multiple objects like tables, views, stored procedures, etc.
Nodes and Slices
Redshift is clustered services, hence multiple nodes involved
Nodes have 2 types => Leader + Compute (contain Slices)
Leader Node
Manages data distribution and query execution across compute nodes
No data is stored. Data in compute node only.
Compute Nodes
Manages data
Slices
- Slice is logical partition for disk storage.
- Multiple slices allow parallel processing across slices on each node.
- The number of slices per node depends on the node instance types.
Redshift offers 3 families of instances: Dense Compute(dc2), Dense Storage (ds2) , Managed Storage(ra3).
- Slices can range from 2 per node to 16 per node depending on the instance family and instance type
- Slice is to distribute the workload of queries evenly across all nodes to leverage the parallel compute
Columnar Storage
Data stored in columnar format
Disk I/O is reduced significantly e.g., query selecting 5 columns out of 100 column table only access 5% of the
data block space.
Each block of data contains values from a single column. This means the data type within each block is always
the same
Compression, Redshift can apply specific and appropriate compression on each block increasing the amount of
data being processed within the same disk and memory space