+
A Comparison of SQL and NoSQL Databases
Slides from: Keith W. Hare
Metadata Open Forum
More reading: http://martinfowler.com/articles/nosqlKeyPoints.html
Metadata Open Forum
+
Abstract
NoSQL databases (either no-SQL or Not Only SQL)
are currently a hot topic in some parts of
computing. In fact, one website lists over a
hundred different NoSQL databases.
This presentation reviews the features common to
the NoSQL databases and compares those
features to the features and capabilities of SQL
databases.
2
BIG DATA! 14 April 2019
3 14 April 2019
+
SQL Characteristics
n Data stored in rows, columns and tables
n Relationships represented by data
n Data Manipulation Language
n Data Definition Language
n Transactions
n Abstraction from physical layer
4 14 April 2019
+
SQL Physical Layer Abstraction
n Applications specify what, not how
n Query optimization engine
n Physical layer can change without modifying
applications
n Createindexes to support queries
n In Memory databases
5 14 April 2019
+
Data Manipulation Language (DML)
n Datamanipulated with Select, Insert, Update, &
Delete statements
n Select T1.Column1, T2.Column2 …
From Table1, Table2 …
Where T1.Column1 = T2.Column1 …
n Data Aggregation
n Compound statements
n Functions and Procedures
n Explicit transaction control
6 14 April 2019
+
Data Definition Language
n Schema defined at the start
n Create Table (Column1 Datatype1, Column2 Datatype 2, …)
n Constraints to define and enforce relationships
n Primary Key
n Foreign Key
n Etc.
n Triggers to respond to Insert, Update , & Delete
n Stored Modules
n Alter …
n Drop …
n Security and Access Control
7 14 April 2019
+
Transactions – ACID Properties
n Atomic – All of the work in a transaction completes
(commit) or none of it completes
n Consistent – A transaction transforms the database
from one consistent state to another consistent state.
Consistency is defined in terms of constraints.
n Isolated – The results of any changes made during a
transaction are not visible until the transaction has
committed.
n Durable – The results of a committed transaction
survive failures
8 14 April 2019
+
NewSQL: real-time analytics
n ) SQL as the primary mechanism for application interaction
n 2) ACID support for transactions
n 3) A non-locking concurrency control mechanism so real-time
reads will not conflict with writes, and thereby cause them to
stall.
n 4) An architecture providing much higher per-node performance
than available from the traditional "elephants”
n 5) A scale-out, shared-nothing architecture, capable of running
on a large number of nodes without bottlenecking
+
NoSQL Definition
From www.nosql-database.org:
Next Generation Databases mostly addressing
some of the points: being non-relational,
distributed, open-source and horizontal
scalable. The original intention has been
modern web-scale databases. The movement
began early 2009 and is growing rapidly. Often
more characteristics apply as: schema-free,
easy replication support, simple API, eventually
consistent / BASE (not ACID), a huge data
amount, and more.
10 14 April 2019
+
NoSQL Products/Projects
http://www.nosql-database.org/ lists 122
NoSQL Databases
nCassandra
nCouchDB
nHadoop & Hbase
nMongoDB
nStupidDB
nEtc.
11 14 April 2019
+
NoSQL Products/Projects
http://www.nosql-
database.org/ lists 122 NoSQL
Databases
nCassandra
nCouchDB
nHadoop & Hbase
nMongoDB
nStupidDB
12 nEtc. 14 April 2019
+
NoSQL Distinguishing Characteristics
n Large data volumes n Asynchronous Inserts &
n Google s big data Updates
n Scalable replication and n Schema-less
distribution
n Potentially thousands of n ACIDtransaction
machines properties are not
n Potentially distributed around needed – BASE
the world
n CAP Theorem
n Queries need to return
answers quickly n Open source
development
n Mostly query, few updates
13 14 April 2019
+
BASE Transactions
n Acronym contrived to be the opposite of
ACID
n Basically Available,
n Soft state,
n Eventually Consistent
n Characteristics
n Weak consistency – stale data OK
n Availability first
n Best effort
n Approximate answers OK
n Aggressive (optimistic)
n Simpler and faster
14 14 April 2019
+
Brewer s CAP Theorem
A distributed system can support only two of the
following characteristics:
n Consistency
n Availability
n Partition tolerance
15 14 April 2019
+
+
NoSQL Database Types
Discussing NoSQL databases is
complicated because there are a variety of
types:
nColumn Store – Each storage block
contains data from only one column
nDocument Store – stores documents
made up of tagged elements
nKey-Value Store – Hash table of keys
17 14 April 2019
+
NoSQL Example: Column Store
n Eachstorage block contains data from only
one column
n Example: Hadoop/Hbase
n http://hadoop.apache.org/
n Yahoo, Facebook
n Example: Ingres VectorWise
n Column Store integrated with an SQL database
n http://www.ingres.com/products/vectorwise
Metadata Open Forum 18
+
Column Store Comments
nMore efficient than row (or document)
store if:
n Multiple row/record/documents are inserted
at the same time so updates of column
blocks can be aggregated
n Retrievals access only some of the
columns in a row/record/document
Metadata Open Forum 19
+
Other Non-SQL Databases
nXML Databases
nGraph Databases
nCodasyl Databases
nObject Oriented Databases
nEtc…
nWill not address these today
20 14 April 2019
+
Storing and Modifying Data
nSyntax varies
n HTML
n Java Script
n Etc.
nAsynchronous – Inserts and updates
do not wait for confirmation
nVersioned
nOptimistic Concurrency
23 14 April 2019
+
Retrieving Data
nSyntax Varies
n No set-based query language
n Procedural program languages such as
Java, C, etc.
nApplication specifies retrieval path
nNo query optimizer
nQuick answer is important
nMay not be a single right answer
24 14 April 2019
+
Open Source
nSmall upfront software costs
nSuitable
for large scale distribution on
commodity hardware
25 14 April 2019
+
NoSQL Summary
nNoSQL databases reject:
n Overhead of ACID transactions
n Complexity of SQL
n Burden of up-front schema design
n Declarative query expression
n Yesterday s technology
nProgrammer responsible for
n Step-by-step procedural language
n Navigating access path
26 14 April 2019
+
Summary
n SQL Databases
n Predefined Schema
n Standard definition and interface language
n Tight consistency
n Well defined semantics
n NoSQL Database
n No predefined Schema
n Per-product definition and interface language
n Getting an answer quickly is more important
than getting a correct answer
27 14 April 2019
+
Web References
n NoSQL -- Your Ultimate Guide to the Non - Relational Universe!
http://nosql-database.org/links.html
n NoSQL (RDBMS)
http://en.wikipedia.org/wiki/NoSQL
n PODC Keynote, July 19, 2000. Towards Robust. Distributed
Systems. Dr. Eric A. Brewer. Professor, UC Berkeley. Co-Founder
& Chief Scientist, Inktomi .
www.eecs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf
n Brewer's CAP Theorem posted by Julian Browne, January 11,
2009. http://www.julianbrowne.com/article/viewer/brewers-cap-
theorem
n How to write a CV Geek & Poke Cartoon
http://geekandpoke.typepad.com/geekandpoke/2011/01/nosql.html
28 14 April 2019
+
Web References
n Exploring CouchDB: A document-oriented database for
Web applications , Joe Lennon, Software developer,
Core International.
http://www.ibm.com/developerworks/opensource/library/o
s-couchdb/index.html
n Graph Databases, NOSQL and Neo4j Posted by Peter
Neubauer on May 12, 2010 at:
http://www.infoq.com/articles/graph-nosql-neo4j
n Cassandra vs MongoDB vs CouchDB vs Redis vs Riak
vs HBase comparison , Kristóf Kovács.
http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-
redis
n Distinguishing Two Major Types of Column-Stores
Posted by Daniel Abadi onMarch 29, 2010
http://dbmsmusings.blogspot.com/2010/03/distinguishing-
two-major-types-of_29.html
29 14 April 2019
+
Web References
n MapReduce: Simplified Data Processing on Large
Clusters , Jeffrey Dean and Sanjay Ghemawat,
December 2004.
http://labs.google.com/papers/mapreduce.html
n Scalable SQL , ACM Queue, Michael Rys, April 19,
2011
http://queue.acm.org/detail.cfm?id=1971597
n a practical guide to noSQL , Posted by Denise Miura on
March 17, 2011 at
http://blogs.marklogic.com/2011/03/17/a-practical-guide-
to-nosql/
30 14 April 2019