Introduction to NoSQL Databases
Your presenters
Cedrick Lunven, Director of Developer Advocacy
• Apache Cassandra™ expert
• Open Source Developer
• Java Geek
• @clun @clunven
David Jones-Gilardi, Developer Advocate
• Apache Cassandra™ expert
• Experienced developer and educator
• Still have an Oracle 8 cert somewhere from the mid 90’s
• @SonicDMG @david-gilardi
Your presenters
Ryan Welford, Developer Advocate
• Apache Cassandra™ expert
• Front End Developer
• @RyanWelford @ryanwelford
David Jones-Gilardi, Developer Advocate
• Apache Cassandra™ expert
• Experienced developer and educator
• Still have an Oracle 8 cert somewhere from the mid 90’s
• @SonicDMG @david-gilardi
Your presenters
Aleks Volochnev, Developer Advocate at DataStax
• Apache Cassandra™ expert
• Experienced developer and educator
• Certified cloud architect
• @hadesarchitect
Cedrick Lunven, Director of Developer Advocacy at DataStax
• Apache Cassandra™ expert
• Kubernetes rookie
• Java Geek
• @clun @clunven
Your presenters
Aleks Volochnev, Developer Advocate at DataStax
• Apache Cassandra™ expert
• Experienced developer and educator
• Certified cloud architect
• @hadesarchitect
Ryan Welford, Developer Advocate
• Apache Cassandra™ expert
• Front End Developer
• @RyanWelford
Housekeeping
Livestream: youtube.com/DataStaxDevs (YouTube, Twitch)
Runtime: astra.new/intro-nosql
Questions: https://dtsx.io/discord (Discord, YouTube)
Quiz: menti.com
Achievement Unlocked!
dtsx.io/badges
Aron L. Prateek Jain Santosh Nepali
Marcin Roozbeh Dargahi Jorge Ortiz
Brzozowski Andrey Deryabin Ankit Bhavsar
Demre Buyuk Akshay Wakhare Aneliya Klevleeva
Muthu Krishnan Pranav Anant Martin Coronel
Arvind V. Joshi Govindasamy
Parth Trambadiya Haris Ville Kerminen
Jasbir Singh Juan Alonso Priya Jakhar
Paul Robu Sylwester Lachiewicz
Avinash Upadhyaya Ankit Bhavsar
Włodzimierz Kozłowski Jasbir Singh
Sharath Koushik
Tom Rota
Joel Reis
Francesco Abbate
AND MANY OTHERS!
menti.com
Hands-on exercise material
Get your instance here:
- http://astra.new/intro-nosql
Repository:
- https://github.com/datastaxdevs/workshop-introduction-to-nosql
Agenda
01 Definitions and objectives of NoSQL
02 Tabular Databases
03 Document Databases
04 Key/value Databases
05 Graph Databases
06 Games, Takeaways
Get Ready = Hands-on #1
Get your instance here:
- http://astra.new/intro-nosql
Repository:
- https://github.com/datastaxdevs/workshop-introduction-to-nosql
Databases
Software to save things and retrieve them later with queries
Databases
INTERFACE (format, language, transport)
EXECUTION (parser, analyzer, dispatcher)
STORAGE (indexing, disk)
OLTP / OLAP
● OLTP: OnLine Transaction Processing
● OLAP: OnLine Analytical Processing
Traditional RDBMS
OLTP
- Need answers NOW
- Simple queries
- Queries don’t change often
OLAP
- Answers can wait
- Complex queries
- Queries tend to change (ad hoc)
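To make the OLTP/OLAP contrast concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `orders` table, its columns and the two helper functions are invented for illustration and are not part of the workshop material.

```python
import sqlite3

# In-memory database with a tiny, hypothetical "orders" table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (id, customer, amount) VALUES (?, ?, ?)",
    [(1, "alice", 20.0), (2, "bob", 35.5), (3, "alice", 12.5)],
)


def oltp_lookup(order_id):
    # OLTP style: a simple, fixed query that fetches one row by its key.
    return conn.execute(
        "SELECT customer, amount FROM orders WHERE id = ?", (order_id,)
    ).fetchone()


def olap_report():
    # OLAP style: a query that scans and aggregates the whole table.
    return conn.execute(
        "SELECT customer, SUM(amount) FROM orders "
        "GROUP BY customer ORDER BY customer"
    ).fetchall()
```

Calling `oltp_lookup(2)` touches a single row, while `olap_report()` has to read every row before it can answer.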
Relational Databases
[Diagram: a relational DB at the center of four axes: IO bound (capacity, storage), throughput (transactions), CPU bound (compute) and streaming (events).]
Relational Databases Limits
At some point the data does not fit on a single machine and you need to scale out.
[Diagram: the relational DB at the center of four axes, with a scale-out technology at the end of each axis: distributed storage for IO bound (capacity, storage), XTP for throughput (transactions), GPU/parallel programming for CPU bound (compute) and streaming platforms for streaming (events).]
NoSQL (Not Only SQL) Databases
Copes with new requirements in volume (capacity), velocity (throughput) and format (variety).
3V: Volume, Velocity, Variety
[Diagram: four axes, IO bound (capacity, storage), throughput (transactions), CPU bound (compute) and streaming (events), with distributed storage, XTP, GPU/parallel programming and streaming platforms at their ends and the center labelled NOT ONLY SQL DATABASES.]
Introduction to the CAP Theorem (Eric Brewer)
● NoSQL databases are distributed systems
● Clouds like distributed systems
Main NoSQL Database Types
● Column-oriented (tabular)
● Document
● Key/value
● Graph
$25/month credit
Launch a database in the cloud with a few clicks, no credit card required.
User interface and web-based developer tools: Swagger UI, GraphQL Playground, Tools
OSS Stargate.io: a data gateway to allow multiple usages
OSS Apache Cassandra: a column-oriented NoSQL database
Tabular or Column Type
Model: tables are sharded on a key and distributed across nodes
● Tables like relational ones (with a schema)
● Data is distributed based on a key
● Data is stored sorted on disk
Query
● Request with the partition key
● Secondary indexes are possible
● Select one or more columns of the record
● No joins; denormalize instead
Use Cases
● See the Apache Cassandra™ slides that follow
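The model above can be sketched in a few lines of Python. This is a toy in-memory structure, not how any real tabular database is implemented; the `TabularTable` class and its fixed `city`/`population` columns are assumptions made up for the example.

```python
from bisect import insort
from collections import defaultdict


class TabularTable:
    """Toy table: rows are grouped by partition key and kept sorted inside it."""

    COLUMNS = ("city", "population")

    def __init__(self):
        # partition key -> sorted list of (city, population) rows
        self._partitions = defaultdict(list)

    def insert(self, country, city, population):
        # insort keeps each partition sorted by city, mimicking sorted storage.
        insort(self._partitions[country], (city, population))

    def select(self, country, columns=COLUMNS):
        # Every query names the partition key; only requested columns come back.
        rows = self._partitions.get(country, [])
        return [tuple(row[self.COLUMNS.index(c)] for c in columns) for row in rows]
```

Note that there is no way to ask for "all cities above 5 million" without either scanning every partition or keeping a second table keyed differently, which is the denormalization trade-off the slide mentions.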
Apache Cassandra™ = NoSQL Distributed Database
1 installation = 1 node
✔ Capacity = ~2-4 TB
✔ Throughput = LOTS Tx/sec/core
[Diagram: nodes arranged in a ring, labelled DataCenter | Ring]
Communication:
✔ Gossiping
Data is Distributed
Country  City         Population
USA      New York     8.000.000
USA      Los Angeles  4.000.000
FR       Paris        2.230.000
DE       Berlin       3.350.000
UK       London       9.200.000
AU       Sydney       4.900.000
DE       Nuremberg    500.000
CA       Toronto      6.200.000
CA       Montreal     4.200.000
FR       Toulouse     1.100.000
JP       Tokyo        37.430.000
IN       Mumbai       20.200.000
Partition key: Country
Data is Distributed
Rows that share a partition key are stored together:
Country  City         Population
USA      New York     8.000.000
USA      Los Angeles  4.000.000
FR       Paris        2.230.000
FR       Toulouse     1.100.000
DE       Berlin       3.350.000
DE       Nuremberg    500.000
UK       London       9.200.000
AU       Sydney       4.900.000
IN       Mumbai       20.200.000
JP       Tokyo        37.430.000
CA       Toronto      6.200.000
CA       Montreal     4.200.000
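One way to picture how a partition key decides where a row lives is a small token ring. The sketch below reuses the node tokens shown on the ring slides (0, 17, 33, 50, 67, 83); the token space of 100 and the MD5-based hash are simplifications chosen for the example (Cassandra's default partitioner is Murmur3 over a much larger range), and the function names are hypothetical.

```python
import hashlib
from bisect import bisect_left

RING_TOKENS = [0, 17, 33, 50, 67, 83]  # node tokens, as drawn on the slides
TOKEN_SPACE = 100                      # assumption: a tiny token space for readability


def token_for(partition_key: str) -> int:
    # Hash the partition key to a stable token in [0, TOKEN_SPACE).
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % TOKEN_SPACE


def owner_of(token: int) -> int:
    # The owner is the first node whose token is >= the data token,
    # wrapping around to the first node at the end of the ring.
    index = bisect_left(RING_TOKENS, token)
    return RING_TOKENS[index % len(RING_TOKENS)]
```

Because the token depends only on the partition key, every row for the same country lands on the same node, which is why a query that supplies the partition key can go straight to the right place.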
Data is Replicated
RF = ?
Replication Factor means the number of nodes used to store each partition.
[Ring diagram: nodes at tokens 0, 17, 33, 50, 67 and 83]
Data is Replicated
RF = 1
Replication Factor 1 means that every partition is stored on 1 node.
[Ring diagram: the USA partition (New York 8.000.000, Los Angeles 4.000.000) appears on 1 node]
Data is Replicated
RF = 2
Replication Factor 2 means that every partition is stored on 2 nodes.
[Ring diagram: the USA partition appears on 2 nodes]
Data is Replicated
RF = 3
Replication Factor 3 means that every partition is stored on 3 nodes.
[Ring diagram: the USA partition appears on 3 nodes]
Replication within the Ring
RF = 3
The USA partition (New York 8.000.000, Los Angeles 4.000.000) maps to token 59.
[Ring diagram: nodes at tokens 0, 17, 33, 50, 67 and 83; the data for token 59 is written to one node and then copied until three nodes hold it]
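The replica placement on these slides can be approximated with a short sketch. It assumes the simplest possible rule, where the owner of the token plus the next nodes clockwise hold the copies; the slides do not state which placement strategy is used, so treat `replicas_for` as a hypothetical illustration rather than Cassandra's actual algorithm.

```python
from bisect import bisect_left

RING_TOKENS = [0, 17, 33, 50, 67, 83]  # node tokens, as drawn on the slides


def replicas_for(token, replication_factor):
    """Return the node tokens that hold a copy of the data at `token`."""
    if not 1 <= replication_factor <= len(RING_TOKENS):
        raise ValueError("replication factor must be between 1 and the node count")
    # Start at the first node whose token is >= the data token (wrapping around),
    # then keep walking clockwise until enough replicas are collected.
    start = bisect_left(RING_TOKENS, token) % len(RING_TOKENS)
    return [
        RING_TOKENS[(start + step) % len(RING_TOKENS)]
        for step in range(replication_factor)
    ]
```

Under that assumption, data at token 59 with RF = 3 would be held by the nodes at tokens 67, 83 and 0.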
Node Failure
RF = 3
[Ring diagram: one of the three replica nodes for token 59 is down; the other two store the data and a hint is kept for the write the failed node missed]
Node Failure Recovered
RF = 3
[Ring diagram: the node is back; the hint is replayed to it, so all three replicas hold the data for token 59 again]
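The hint mechanism on the failure slides can be sketched as follows. This is a deliberately simplified, single-process model: the `Node` and `Coordinator` classes are invented for the example and ignore details such as hint expiry and consistency levels.

```python
class Node:
    """A replica that is either up or down and stores key/value data."""

    def __init__(self, token):
        self.token = token
        self.up = True
        self.data = {}


class Coordinator:
    """Sends each write to every replica and keeps hints for replicas that are down."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.hints = []  # pending (node, key, value) writes

    def write(self, key, value):
        for node in self.replicas:
            if node.up:
                node.data[key] = value
            else:
                # The replica missed the write: remember it for later.
                self.hints.append((node, key, value))

    def replay_hints(self):
        # Deliver stored hints to replicas that have come back; keep the rest.
        remaining = []
        for node, key, value in self.hints:
            if node.up:
                node.data[key] = value
            else:
                remaining.append((node, key, value))
        self.hints = remaining
```

A write still succeeds on the two healthy replicas while the third is down, and the recovered node catches up once `replay_hints()` runs.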
Data Distributed Everywhere
● Geographic distribution
● Hybrid-cloud and multi-cloud
● On-premise
Understanding Use Cases
Scalability  | High Throughput, High Volume     | Heavy Writes, Heavy Reads        | Event Streaming, Internet of Things | Log Analytics, Other Time Series
Availability | Mission-Critical                 | No Data Loss, Always-on          | Caching, Market Data                | Pricing, Inventory
Distributed  | Global Presence, Workload Mobility | Compliance / GDPR              | Banking, Tracking / Logistics       | Retail, Customer Experience
Cloud-native | Modern Cloud Applications        | API Layer, Enterprise Data Layer | Hybrid-cloud, Multi-cloud           |
Hands-on #2 Tabular Databases
Get your instance here:
- http://astra.new/intro-nosql
Repository:
- https://github.com/datastaxdevs/workshop-introduction-to-nosql
Document-Oriented Database
Model: structured objects identified by a key
● Documents are structured data but with no schema
● Multiple formats, but mostly JSON
● Documents of the same nature are grouped into “collections”
Queries
● Request by the key
● Request on other fields, by tag/path in the document
Use Cases
● Mainly reads, fewer writes
● Document storage with a structure but no schema
● Front-end development, where the stored documents match the JSON the application already uses
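The two query styles listed above, by key and by a path inside the document, can be illustrated with a small in-memory collection. `DocumentCollection` and its dotted-path syntax are assumptions made for this sketch, not the API of any particular document database.

```python
import json

_MISSING = object()  # sentinel for "this path does not exist in the document"


class DocumentCollection:
    """Toy collection: schemaless JSON documents identified by a key."""

    def __init__(self):
        self._docs = {}

    def put(self, doc_id, document):
        # Round-trip through JSON to check the document is valid JSON
        # and to store an independent copy.
        self._docs[doc_id] = json.loads(json.dumps(document))

    def get(self, doc_id):
        # Query by key.
        return self._docs.get(doc_id)

    def find(self, path, value):
        # Query by a dotted path such as "country.code".
        matches = []
        for doc_id, doc in self._docs.items():
            node = doc
            for part in path.split("."):
                if isinstance(node, dict) and part in node:
                    node = node[part]
                else:
                    node = _MISSING
                    break
            if node is not _MISSING and node == value:
                matches.append(doc_id)
        return matches
```

Documents in the same collection do not need the same fields: one city can carry a `tags` list while another has none, and both remain queryable.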
Document Shredding
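The diagrams for these slides did not survive extraction, so the following is only an assumption about what they illustrate: flattening a nested document into one row per leaf value, with each row keyed by the path to that value, so that a document can be stored in a tabular database. The `shred` function and the `[index]` notation for array positions are invented for this sketch.

```python
def shred(document, path=()):
    """Yield one (path, leaf_value) pair for every leaf in a nested document."""
    if isinstance(document, dict):
        for key, value in document.items():
            yield from shred(value, path + (key,))
    elif isinstance(document, list):
        for index, value in enumerate(document):
            # Array positions become path segments too.
            yield from shred(value, path + (f"[{index}]",))
    else:
        # A scalar (string, number, boolean or null) is a leaf.
        yield path, document
```

Each yielded pair could then become a row in a table whose partition key is the document id, which keeps all the rows of one document together.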
Hands-on #3 Document Databases
Get your instance here:
- http://astra.new/intro-nosql
Repository:
- https://github.com/datastaxdevs/workshop-introduction-to-nosql
Key/Value Database
Model: like a distributed hash table
● One key, one value
● Keys are hashed into buckets (partitions)
● Similar to tabular but with a single value
Queries
● Direct CRUD only: GET/PUT/UPDATE/DELETE
● A value can be a single value or a list
Use Cases
● Distributed cache!
● Cached user data, user sessions
● Data deduplication
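The hash-table model above fits in a few lines. This sketch keeps the buckets in one process, whereas a real key/value database would place them on different nodes; `KeyValueStore` and its `bucket_count` parameter are hypothetical names.

```python
class KeyValueStore:
    """Toy key/value store: keys are hashed into a fixed number of buckets."""

    def __init__(self, bucket_count=4):
        self._buckets = [dict() for _ in range(bucket_count)]

    def _bucket(self, key):
        # Python's hash() is stable within one run, which is enough for a sketch;
        # a distributed store would use a hash that is stable across machines.
        return self._buckets[hash(key) % len(self._buckets)]

    def put(self, key, value):
        # PUT and UPDATE are the same operation: the new value replaces the old one.
        self._bucket(key)[key] = value

    def get(self, key, default=None):
        return self._bucket(key).get(key, default)

    def delete(self, key):
        self._bucket(key).pop(key, None)
```

There is no way to search by value here, only by key, which is what makes this model a natural fit for caches and session stores.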
Hands-on #4 Key/Value Databases
Get your instance here:
- http://astra.new/intro-nosql
Repository:
- https://github.com/datastaxdevs/workshop-introduction-to-nosql
Graph Database
Model: data is structured and stored as vertices and edges
● Data is represented as a graph (vertices and edges)
● Dedicated to highly connected datasets (lots of “joins”)
● Discovering simple and complex relationships between objects
Queries
● Find data with filters on the attributes of both vertices and edges
● Traversals that follow edges (cf. Gremlin)
Use Cases
● Social networks, Customer 360
● Internet of Things
● Personalization and recommendation
● Health care, path finding, security, fraud detection
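Both query styles, filtering on attributes and traversing edges, can be shown on a tiny in-memory graph. The `Graph` class below is a hypothetical illustration written in plain Python; it is not Gremlin and does not mirror any graph database's API.

```python
from collections import deque


class Graph:
    """Toy property graph: vertices with attributes and labelled, directed edges."""

    def __init__(self):
        self.vertices = {}  # vertex id -> attribute dict
        self.edges = []     # (source id, label, target id)

    def add_vertex(self, vertex_id, **attributes):
        self.vertices[vertex_id] = attributes

    def add_edge(self, source, label, target):
        self.edges.append((source, label, target))

    def where(self, **filters):
        # Attribute filter: ids of vertices whose attributes match every filter.
        return [
            vertex_id
            for vertex_id, attributes in self.vertices.items()
            if all(attributes.get(k) == v for k, v in filters.items())
        ]

    def out(self, vertex_id, label):
        # One hop: follow outgoing edges with the given label.
        return [t for s, l, t in self.edges if s == vertex_id and l == label]

    def traverse(self, start, label, max_hops):
        # Breadth-first traversal: vertices reachable within max_hops hops.
        seen = {start}
        frontier = deque([(start, 0)])
        reached = []
        while frontier:
            vertex, hops = frontier.popleft()
            if hops == max_hops:
                continue
            for neighbour in self.out(vertex, label):
                if neighbour not in seen:
                    seen.add(neighbour)
                    reached.append(neighbour)
                    frontier.append((neighbour, hops + 1))
        return reached
```

A "friends of friends" question is a two-hop traversal here, where a relational schema would need a self-join per hop.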
Positioning graphs: scalability and flexibility
[Chart: Scalability (low to high) against Value in Relationships (low to high), positioning Key Value, Tabular, Document, Graph and Relational databases]
Hands-on #5 Graph Databases
Docker and Compose
Repository:
- https://github.com/datastaxdevs/workshop-introduction-to-nosql
menti.com
Developer Resources
LEARN
New hands-on learning at www.datastax.com/dev
Classic courses available at DataStax Academy
ASK/SHARE
Join community.datastax.com
Ask/answer community user questions - share your expertise
CONNECT
Follow us @DataStaxDevs
We are on YouTube - Twitter - Twitch!
MATERIALS
Slides and exercises for this workshop are available at
https://github.com/DataStax-Academy/workshop-crud-with-python-and-node
Homework (datastax.com/dev)
[Complete hands-on #5]
● Install Docker, play with the notebooks and show us some screenshots.
[Try other content] - https://www.datastax.com/try-it-out
● Go to datastax.com/dev and use the try-it-out
Certifications
https://www.datastax.com/dev/certifications
Vouchers ($145 each), valid for 3 months and good for 2 attempts, will be given to people who apply and register for the 3 episodes.
Weekly Workshops https://www.datastax.com/workshops
Join our 10k Discord Community https://dtsx.io/discord
The Fellowship of the RINGS
Thank you!
@hadesarchitect
@clun @clunven
@SonicDMG @david-gilardi