24-06-2025
DSC 403C-5: NoSQL
Unit - 1
Dr. Kavitha R
Mission: Christ University is a nurturing ground for an individual’s holistic development to make effective contribution to the society in a dynamic environment
Vision: Excellence and Service
Core Values: Faith in God | Moral Uprightness | Love of Fellow Beings | Social Responsibility | Pursuit of Excellence
Unit - 1
Introduction
Overview, History of NoSQL databases, Definition of four types of NoSQL database,
value of relational databases, getting at persistent data, concurrency, Integration,
Impedance mismatch, Application and integration databases, attack of clusters,
emergence of NoSQL, Key points, comparison of relational databases to new NoSQL
stores, Replication and Sharding, MapReduce on databases. Distribution models,
single server, Master Slave Replication, Peer to Peer Replication, Combining Sharding
and Replication.
Foundations
Overview
Why NoSQL?
Limitations of RDBMS:
● Scalability
(mostly vertical scaling; a tightly coupled schema and complex joins are hard to run across distributed systems)
● Flexibility
(fixed schema with predefined columns and structured data; real-time data ingestion is difficult; tight coupling with business logic)
● Performance
(large data volumes strain query performance and read/write speed; careful normalization is required)
● Unstructured and semi-structured data
(storage and querying)
● Real-time applications
● Distributed workloads
(geographically distributed data)
Overview
When NoSQL?
● need to handle large volumes of data
● require flexible and dynamic schema
● performance and scalability are critical
● data is unstructured or semi-structured
● need to build real time applications
History
1998 Carlo Strozzi coins the term "NoSQL" for his lightweight open source relational db that did not use SQL
2000 Neo4j development begins (its 1.0 release followed in 2010)
2004 Google begins developing Bigtable (paper published in 2006)
2005 CouchDB released
2007 Amazon publishes the Dynamo paper
2008 Cassandra open-sourced by Facebook
2009 Term revived for non-relational db by Johan Oskarsson and Eric Evans
Types of NoSQL DB
● Document based db
● Key Value stores
● Column Oriented db
● Graph based db
I. Key Value DB
● simplest model, easy to implement
● data is stored as key-value pairs
● uses a hash table in which each unique key points to a particular item of data
● suited to storing large amounts of data when complex queries are not needed to retrieve it
● suitable for horizontal scaling and distributed storage
Examples: Redis, DynamoDB, Riak
Each data item is identified by a unique key, and the value associated with that key can be anything, such as a string, number, object, or even another data structure.
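The key-value idea can be sketched with an in-memory hash table. This is only an illustration, not the API of Redis, DynamoDB or Riak; the class name `KeyValueStore`, its methods and the sample keys are assumptions made for this example.

```python
# Minimal in-memory key-value store sketch (illustrative only, not a real client API).
class KeyValueStore:
    def __init__(self):
        # A Python dict is a hash table: each unique key points to one value.
        self._data = {}

    def put(self, key, value):
        # The value is opaque to the store: string, number, object, nested structure.
        self._data[key] = value

    def get(self, key, default=None):
        # Lookup is by key only; there is no query on the contents of the value.
        return self._data.get(key, default)

    def delete(self, key):
        self._data.pop(key, None)


store = KeyValueStore()
store.put("session:42", {"user": "preeti", "cart": ["C1", "C3"]})  # hypothetical key
store.put("visits", 7)
cart = store.get("session:42")["cart"]
```

Because every operation needs only the key, the keys can be spread over many machines, which is why this model suits horizontal scaling.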
II. Document DB
● consistently ranked among the most popular types of NoSQL DB
● stores data as documents (typically JSON-like)
● documents are retrieved by key or by indexed field values
● a collection is a group of documents with similar contents
● documents in a collection need not share exactly the same schema
● mostly there is no direct relationship between documents
Examples: MongoDB, CouchDB
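A rough sketch of the document model, using plain Python dicts in place of a real document database. The collection contents, the `find` helper and the `_id` index are assumptions for illustration and do not reproduce the MongoDB or CouchDB APIs.

```python
# A "collection" of documents; the two documents have different fields (flexible schema).
collection = [
    {
        "_id": 1,
        "title": "Intro to NoSQL",
        "tags": ["nosql"],
        # Nested data lives inside the document instead of in a separate table.
        "comments": [{"by": "john", "text": "Nice"}],
    },
    {"_id": 2, "title": "Graph basics", "author": "preeti"},
]

# An index maps a field value to its document so lookups avoid scanning everything.
index_by_id = {doc["_id"]: doc for doc in collection}


def find(coll, **criteria):
    """Return documents whose fields equal all the given criteria (hypothetical helper)."""
    return [d for d in coll if all(d.get(k) == v for k, v in criteria.items())]


first_commenter = index_by_id[1]["comments"][0]["by"]
by_preeti = find(collection, author="preeti")
```

Note that the first document has no `author` field at all, and the query still works; a missing field simply does not match.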
III. Column Oriented DB
● stores data by columns instead of rows
● rows do not need to have the same columns
● data is organized into column families, which are groups of related columns
● each row has a unique row key, and the columns in that row are further divided into column names and values
● designed to read and retrieve data efficiently
● high scalability, efficient data compression
Examples: Cassandra, HBase, Bigtable
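The row key / column family / column layout described above can be pictured as nested dictionaries. This is a sketch under assumptions: the family names, sample values and the helpers `read_column` and `scan_column` are made up for illustration and are not the Cassandra or HBase API.

```python
# row key -> column family -> {column name: value}
table = {
    "user:101": {
        "profile": {"name": "Preeti", "city": "Bengaluru"},  # sample values
        "activity": {"last_login": "2025-06-24"},
    },
    "user:102": {
        # This row has fewer columns and no "activity" family at all.
        "profile": {"name": "John"},
    },
}


def read_column(table, row_key, family, column):
    """Fetch one cell; returns None if the row lacks that family or column."""
    return table.get(row_key, {}).get(family, {}).get(column)


def scan_column(table, family, column):
    """Read a single column across all rows that have it."""
    return {
        row_key: families[family][column]
        for row_key, families in table.items()
        if column in families.get(family, {})
    }


names = scan_column(table, "profile", "name")
missing_city = read_column(table, "user:102", "profile", "city")
```

Reading one column across many rows touches only that column's data, which is the access pattern these stores are designed for.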
IV. Graph DB
● flexible graph model
● stores data in nodes and edges, and queries highly connected data
● nodes store information about people, places, and things
● edges store information about the relationships between the nodes
● relationships are traversed to look for patterns
Examples: Neo4j, FlockDB
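Nodes, edges and traversal can be sketched in a few lines. The people, the `FOLLOWS` relationship and the function `reachable_within` are assumptions for this example; a real graph database such as Neo4j would express the same traversal in its own query language.

```python
from collections import deque

# Nodes hold information about things; edges hold the relationships between them.
nodes = {
    "preeti": {"type": "person"},
    "john": {"type": "person"},
    "asha": {"type": "person"},
    "ravi": {"type": "person"},
}
edges = [
    ("preeti", "FOLLOWS", "john"),
    ("john", "FOLLOWS", "asha"),
    ("asha", "FOLLOWS", "ravi"),
]

# Adjacency list: for each node, the outgoing (relationship, target) pairs.
adjacency = {}
for src, rel, dst in edges:
    adjacency.setdefault(src, []).append((rel, dst))


def reachable_within(start, max_hops):
    """Breadth-first traversal: nodes reachable from start in at most max_hops edges."""
    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for _rel, neighbor in adjacency.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                found.append(neighbor)
                queue.append((neighbor, hops + 1))
    return found


two_hops = reachable_within("preeti", 2)
```

In a relational database the same "friends of friends" question would need one self-join per hop.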
Types of NoSQL DB
When to use which NoSQL DB?
Activity
Explore NoSQL DB
Matching DB with use case
✓ Social media relationships
✓ Shopping cart or user sessions
✓ Blog post storage with comments
✓ Real time IoT sensor data
✓ Real time leader board in a gaming app
✓ Product catalogue with varied item description
✓ Recommendation engine based on user interaction
Activity
Explore NoSQL DB
Identify use cases for each type
Which NoSQL type is most intuitive?
Which type would you use for your own project idea?
Why NoSQL?
✓ Value of Relational DB
✓ Impedance Mismatch
✓ Application and Integration of DBs
✓ Attack of Clusters
✓ Emergence of NoSQL
I. Value of Relational DB
Critical part in computing
1. Getting at Persistent Data
2. Concurrency
3. Integration
4. Standard Model
1. Getting at Persistent Data
Memory → fast and volatile main memory, larger but slower backing store
File system
Enterprise applications → database
More flexible to access and retrieve data quickly and easily
2. Concurrency
Many users will access the same database at once (Read / Write)
Same bits of data or different
Controlling through Transactions (Rollback is possible)
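A small sketch of a transaction with rollback, using Python's built-in `sqlite3` module. The `account` table, the amounts and the `transfer` function are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts; both updates succeed or neither does."""
    try:
        # "with conn" commits if the block succeeds and rolls back on an exception.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance; the credit was rolled back.
        return False


ok = transfer(conn, 1, 2, 30)       # succeeds
failed = transfer(conn, 1, 2, 500)  # would overdraw account 1, so it is rolled back
balances = dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))
```

The second transfer credits account 2 before the debit fails, yet no partial change survives: that is the rollback the slide refers to.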
3. Integration
Enterprises run multiple apps developed by different teams
Inter-application collaboration
Apps may use the same data, and changes must be visible to all
Shared database integration
A single database is used by all apps
Concurrency control handles access by multiple apps (just as with multiple users)
4. Standard Model
follows a standard way (the relational model and SQL)
developers can apply what they learn to many projects
core mechanisms are the same, though the tools differ
II. Impedance Mismatch
Difference between the relational model and in-memory data structures
Relational Model
Tables – Relations (Tuples → Name, Value pairs)
All operations consume and return relations → mathematically, relational algebra
provides simplicity, but with limitations
values in tuples need to be simple and cannot contain a nested record or list
in-memory data structures can take richer forms
so, there are 2 different representations that require translation
Object Oriented Programming Language – Object Oriented DB
An order, which looks like a single aggregate structure in the UI, is split into many rows from many tables in a relational db.
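To make the split concrete, the sketch below takes one nested order (as the application sees it) and breaks it into rows for three tables. The order contents, the table names and the function `to_rows` are assumptions for illustration only.

```python
# The order as one aggregate, the way the UI or application code holds it in memory.
order = {
    "order_id": 99,
    "customer": {"id": 1, "name": "Preeti"},
    "lines": [
        {"product": "Book", "qty": 1, "price": 32.0},
        {"product": "Pen", "qty": 3, "price": 1.5},
    ],
}


def to_rows(order):
    """Flatten the aggregate into flat tuples, one list per (hypothetical) table."""
    customers = [(order["customer"]["id"], order["customer"]["name"])]
    # The orders row keeps only a foreign key to the customer.
    orders = [(order["order_id"], order["customer"]["id"])]
    # Each nested line becomes its own row, tied back by order_id and a line number.
    order_lines = [
        (order["order_id"], line_no, line["product"], line["qty"], line["price"])
        for line_no, line in enumerate(order["lines"], start=1)
    ]
    return {"customers": customers, "orders": orders, "order_lines": order_lines}


rows = to_rows(order)
```

Reading the order back means joining these tables and rebuilding the nested structure, which is the translation cost called impedance mismatch.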
Object Oriented vs Relational Model
Feature | Object Oriented | Relational
Data Structure | Objects | Tables (Rows)
Relationships | References | Foreign Key
Identity | Object Reference | Primary Key
Inheritance | Supported | Not natively supported
Data Representation | Objects are nested and linked | Tables are flat and relational
Access | Direct via attributes | Needs SQL Joins
Student
Stu Id | Stu Name
101 | Preeti
102 | John

Course
Course Id | Course Name
C1 | Artificial Intelligence
C2 | Data Analytics
C3 | Time Series Analysis

StuCourse
Stu Id | Course Id
101 | C1
101 | C3
102 | C2
Object Oriented as Front End and Relational Model as Back End
Object-Relational Mapping (ORM)
Tools to translate between objects and tables:
Hibernate
SQLAlchemy
Entity Framework
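What an ORM does can be sketched by hand with `sqlite3` and a dataclass, using the Student, Course and StuCourse sample data from the slides. This is a hand-written mapping for illustration, not the Hibernate, SQLAlchemy or Entity Framework API; the class `Student` and the function `load_student` are assumptions.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Student:
    # The object holds its courses directly, as a nested list.
    stu_id: int
    name: str
    courses: list = field(default_factory=list)


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE student (stu_id INTEGER PRIMARY KEY, stu_name TEXT);
    CREATE TABLE course (course_id TEXT PRIMARY KEY, course_name TEXT);
    CREATE TABLE stu_course (
        stu_id INTEGER REFERENCES student(stu_id),
        course_id TEXT REFERENCES course(course_id)
    );
    INSERT INTO student VALUES (101, 'Preeti'), (102, 'John');
    INSERT INTO course VALUES
        ('C1', 'Artificial Intelligence'),
        ('C2', 'Data Analytics'),
        ('C3', 'Time Series Analysis');
    INSERT INTO stu_course VALUES (101, 'C1'), (101, 'C3'), (102, 'C2');
    """
)


def load_student(conn, stu_id):
    """Translate rows from three flat tables into one Student object."""
    row = conn.execute(
        "SELECT stu_id, stu_name FROM student WHERE stu_id = ?", (stu_id,)
    ).fetchone()
    if row is None:
        return None
    # The join follows the foreign keys that replace object references.
    courses = [
        name
        for (name,) in conn.execute(
            "SELECT c.course_name FROM stu_course sc "
            "JOIN course c ON c.course_id = sc.course_id "
            "WHERE sc.stu_id = ? ORDER BY c.course_id",
            (stu_id,),
        )
    ]
    return Student(row[0], row[1], courses)


preeti = load_student(conn, 101)
```

An ORM tool generates this kind of SQL and object construction automatically from class definitions, so application code can work with `preeti.courses` directly.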
III. Application and Integration DB
Multiple applications developed by separate teams may store data in a common db
They may also use different DBs → integration is needed for unified access and decision making
Integration raises technical, semantic and other challenges:
1. Heterogeneous Data Source
✓ diff. DB models (Relational, Non Relational, Graph, …)
✓ diff. data formats (Excel, CSV, JSON, XML, …)
✓ diff. query languages
2. Schema Mismatch
✓ entity identification issue - diff. naming conventions (Stu ID, Reg.No, …)
✓ data type conflict (String, Integer, …)
✓ structural diff. (Normalized / denormalized data)
✓ business logic diff. semantic (discount for shopped items, hotel room tariff)
✓ inconsistent use of codes (grade)
✓ inconsistent use of units (height, weight, …)
✓ inconsistent use of data representation (date)
✓ specification of null condition (w.r.t. data type also – Aadhaar, Vehicle number)
Ex: Address
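Some of the mismatches listed above (different field names, data types, units and date formats) can be sketched as a small normalization step. The two source records, the target field names and the functions `parse_date` and `normalize` are assumptions invented for this example.

```python
from datetime import datetime

# The same kind of entity as exported by two hypothetical systems.
source_a = {"Stu ID": "101", "height_cm": 170, "dob": "24-06-2005"}
source_b = {"Reg.No": 102, "height_in": 65, "dob": "2004/12/31"}

# Naming conflict: both names identify the same attribute.
FIELD_ALIASES = {"Stu ID": "student_id", "Reg.No": "student_id"}
# Representation conflict: the sources write dates differently.
DATE_FORMATS = ("%d-%m-%Y", "%Y/%m/%d")


def parse_date(text):
    """Try each known format and return an ISO date string."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {text!r}")


def normalize(record):
    """Map one source record onto a common schema."""
    out = {}
    for key, value in record.items():
        if key in FIELD_ALIASES:
            out["student_id"] = int(value)  # data type conflict: string vs integer
        elif key == "height_in":
            out["height_cm"] = round(value * 2.54, 1)  # unit conflict: inches vs cm
        elif key == "dob":
            out["dob"] = parse_date(value)
        else:
            out[key] = value
    return out


unified = [normalize(source_a), normalize(source_b)]
```

Real integration work also has to settle semantic differences, such as what a discount or a grade code means in each system, which no mechanical conversion can decide.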
3. Security and Access Control
✓ varied authorization mechanisms
✓ data privacy regulations
4. Data Synchronization and Latency
✓ Real-time vs Batch updates
✓ data inconsistency due to time lags
5. Data Redundancy and inconsistency
✓ duplicate records, conflicting
IV. Attack of Clusters
Growth of data
scaling up → more processors, disk storage, memory (expensive)
scaling out → lots of small machines as clusters
small machines use commodity h/w and are cheaper
more resilient → the overall cluster keeps going despite the failure of a machine
RDBMS are not designed to run on clusters
clustered relational databases (Oracle RAC, MS SQL Server) rely on a shared disk subsystem, which remains a single point of failure
need to look at alternative ways to store data
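Scaling out means deciding which machine holds which piece of data. The sketch below spreads keys over a small cluster by hashing the key (sharding). The node names and the functions `shard_for`, `put` and `get` are assumptions; the modulo scheme is the simplest possible choice, not what any particular database uses.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c"]  # hypothetical commodity machines


def shard_for(key, nodes=NODES):
    """Pick a node from a stable hash of the key."""
    # hashlib gives the same result on every run, unlike Python's built-in hash().
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


# Each node stores only its own share of the data.
cluster = {node: {} for node in NODES}


def put(key, value):
    cluster[shard_for(key)][key] = value


def get(key):
    # The same hash sends the read to the node that received the write.
    return cluster[shard_for(key)].get(key)


put("user:101", "Preeti")
put("user:102", "John")
owner = shard_for("user:101")
```

With plain modulo hashing, adding or removing a node moves most keys to a different node; distributed stores commonly use consistent hashing or range partitioning to limit that movement.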
V. Emergence of NoSQL
NoSQL DBs generally don’t use SQL, though some offer SQL-like query languages, such as Cassandra’s CQL
Operate without a schema
Relational DBs handle concurrency across the whole db, which is complicated in clusters
(NoSQL DBs use different strategies)
Polyglot Persistence – using diff. data stores in diff. circumstances
modern applications handle diverse data with varied requirements
Ex. E-commerce platform
product catalog, shopping cart, recommendation engine, login, analytics
Relational DB vs NoSQL DB
Feature | Relational DB | NoSQL DB
Data Model | Tabular | Varies (Document, Key-Value, Column Family, Graph)
Schema | Fixed | Flexible, schemaless
Scalability | Vertical | Horizontal
ACID Compliance | Fully ACID compliant | Varies (often eventual consistency)
Query Language | SQL (standard) | Database-specific languages
Relationships | Handled through table joins | Often denormalized and handled in the application
Data Integrity | Strong (through constraints) | Varies
Consistency | Strong | Often eventual consistency
Performance | Excellent for complex queries | Excellent for read/write-heavy workloads
Flexibility | Less (schema changes are challenging) | Highly flexible (easy to modify data model)
Transactions | Support for complex transactions | Limited transaction support
Data Size | Small to medium | Can handle very large datasets
Standardization | High (SQL is standardized) | Low (each db has unique features)
Examples | MySQL, SQL Server, PostgreSQL, Oracle | MongoDB, Cassandra, Riak, Redis, Neo4j
Thank You….. !