
ECS781P

CLOUD COMPUTING

Cloud Data Management


Lecturer: Dr. Sukhpal Singh Gill
School of Electronic Engineering and Computer Science



Cloud Computing: roadmap for this module

▪ Network layer:
▪ Networking
▪ Application layer:
▪ Client/server, RPC, Web Services
▪ REST
▪ Performance:
▪ SLA
▪ Management
▪ Security
▪ Trends
▪ Monolithic applications → microservices
▪ Serverless: “hide complexity”
Contents

▪ Data at Cloud Scale


▪ Dealing with partitioning
▪ Dealing with replication
▪ Cloud Data Management Systems
Traditional Data Management Systems

▪ Table model for data, based on relational algebra


▪ Very powerful query language (SQL)
▪ Requires expertise, careful design of table structure, indices and
queries
▪ Data consistency and reliability are guaranteed by ACID
properties
▪ Atomicity, Consistency, Isolation, and Durability
▪ Single, powerful machines (vertical scalability)
SQL Tables

products
  PID  DESCRIPTION    PRICE
  1    Intel i7       £400
  2    Surface Pro    £899
  3    32Gb Ram       £200
  4    Raspberry PI+  £30
  5    Geforce 1080   £4000

orders
  OID  CID  ODATE      SDATE
  1    1    29/4/2018  NULL
  2    2    20/1/2018  24/1/2018

clients
  CID  NAME     COMPANY  ADDRESS
  1    Theresa  GOVUK    Downing st
  2    Colin    QMUL     Mile End Rd

order_details
  OID  PID  AMOUNT
  1    1    2
  1    5    2
  2    2    1
Sample SQL Queries

▪ INSERT INTO orders VALUES (1000, 1, '2018-04-29', '2018-05-01');
▪ UPDATE products SET PRICE=8000, DESCRIPTION='Geforce 1080' WHERE PID = '1';
▪ DELETE FROM orders WHERE ODATE < '2012-12-31' AND SDATE IS NOT NULL;
▪ SELECT DESCRIPTION, PRICE FROM products;
▪ SELECT * FROM products;
▪ SELECT * FROM products WHERE PRICE < 300;
▪ SELECT * FROM products WHERE PID = '1';
ACID Database Properties

▪ Atomicity
▪ All or nothing: transactions must act as a whole, or not at all
▪ No changes within a transaction can persist if any change within the transaction fails
▪ Consistency
▪ Must transform database from one valid state to another: changes made by a
transaction respect all database integrity constraints and the database remains in a
consistent state at the end of a transaction
▪ Isolation
▪ Transactions cannot see changes made by other concurrent transactions
▪ Concurrent transactions leave the database in the same state as if the transactions
were executed serially
▪ Durability
▪ Database data are persistent
▪ Changes made by a transaction that completed successfully are stored permanently
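
To make atomicity concrete, here is a minimal sketch using Python's standard sqlite3 module (an illustration, not one of the lecture's examples): a transaction containing a failing statement leaves no trace of its earlier statements.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (OID INTEGER PRIMARY KEY, CID INTEGER)")

try:
    with conn:  # one transaction: commit on success, rollback on any exception
        conn.execute("INSERT INTO orders VALUES (1, 1)")
        conn.execute("INSERT INTO orders VALUES (1, 2)")  # duplicate key -> IntegrityError
except sqlite3.IntegrityError:
    pass

# Atomicity: the first insert was rolled back together with the failing one
print(conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0])  # prints 0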
How much data is generated in the cloud?
Evolution of Cloud applications
Concerns of cloud application developers

▪ Reliability: the system should continue to work correctly in the face of adversity

▪ Scalability: as the system grows, there should be reasonable ways of dealing with that growth

▪ Maintainability: over time, it should be productive to not only maintain the current behaviour of the system, but also adapt it to new use cases
Requirements for Cloud Data management

▪ Reliability: preventing data loss, making database services highly available
▪ Preventing single points of failure
▪ Scalability: more data to store, more users requesting data
▪ Latency: Internet users from all locations
▪ Maintainability
▪ As services/applications change, so might the data model
NoSQL databases

▪ NoSQL (Not only SQL) systems


▪ Started in the 2000s, exploded in 2010s
▪ Motivation:
▪ scalability
▪ simpler data models
▪ programmer friendly
ACID vs BASE

▪ NoSQL sacrifices ACID for performance/scalability
▪ Updates are eventually propagated, but there are limited guarantees on the consistency of reads
▪ “BASE” instead of “ACID”:
▪ BASE = Basically Available, Soft state, Eventually consistent
▪ ACID = Atomicity, Consistency, Isolation, Durability
NoSQL database types and examples

▪ Key/value Databases
▪ Simple value or row, indexed by a key
▪ e.g. Voldemort, Vertica, memcached
▪ Big table Databases
▪ “a sparse, distributed, persistent multidimensional sorted map”
▪ e.g. Google BigTable, Azure Table Storage, Amazon SimpleDB, Apache
Cassandra
▪ Document Databases
▪ Multi-field documents (or objects) with JSON access
▪ e.g. MongoDB, RavenDB (.NET specific), CouchDB
▪ Graph Databases
▪ Manage nodes, edges, and properties
▪ e.g. Neo4j, TitanDB
NoSQL databases landscape

▪ Support for various interfaces for data access
▪ Logically model data using a loosely typed, extensible data schema
▪ Horizontal scaling through a data distribution model across multiple nodes
▪ Data persists on disk, in memory, or both; sometimes in pluggable custom stores
Taking advantage of cloud elasticity

▪ Vertical scaling
▪ Use more powerful VMs… up to a point
▪ Horizontal scaling
▪ Data is partitioned into multiple cloud instances
▪ Need mapping from data to partition
▪ Replication
▪ Achieve reliability
▪ Replica management (updates)
Contents

▪ Data at Cloud Scale


▪ Dealing with partitioning
▪ Dealing with replication
▪ Cloud Data Management Systems
How to scale: Partition

▪ Partition: division of a logical database or its constituent elements into distinct independent parts
▪ Critical for scalability
▪ Two main types of partitioning:
▪ Horizontal
▪ Vertical
Types of data partitioning

▪ Horizontal partitioning: different rows into different tables
▪ Vertical partitioning: different columns into different tables
(a small sketch of both follows below)

[Figure: horizontal vs vertical partitioning]
https://docs.microsoft.com/en-us/azure/architecture/best-practices/data-partitioning
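
As a quick illustration (a sketch using in-memory Python dicts, not a real database), the same products data can be split either way:

# Rows modelled as dicts, mirroring the earlier products table
rows = [
    {"PID": 1, "DESCRIPTION": "Intel i7", "PRICE": 400},
    {"PID": 4, "DESCRIPTION": "Raspberry PI+", "PRICE": 30},
]

# Horizontal partitioning: whole rows end up in different partitions (here, split by price)
cheap_rows = [r for r in rows if r["PRICE"] < 100]
expensive_rows = [r for r in rows if r["PRICE"] >= 100]

# Vertical partitioning: different columns end up in different partitions, linked by the key
descriptions = {r["PID"]: r["DESCRIPTION"] for r in rows}
prices = {r["PID"]: r["PRICE"] for r in rows}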
Horizontal partitioning

▪ Horizontally scaling data storage


▪ Known as sharding in the database world
▪ Data partitions are served from different machines
▪ Improves scalability → queries can be parallelised
▪ Clients need to know how to locate the data
Finding the data: request routing

▪ Clients need to know where the data is located


▪ Multiple approaches

[Kleppmann, Designing Data-Intensive Applications, pg 215]


Common partitioning strategies

▪ Range partitioning: split the space of keys into ranges, and allocate ranges to partitions

https://docs.oracle.com/cd/B10500_01/server.920/a96524/c12parti.htm
Common partitioning strategies

▪ Range partitioning: split the space of keys into ranges, and allocate ranges to partitions

  A-H: Ben, Dami, Felix
  I-P: Ignacio, Joseph
  Q-Z: Zico

https://docs.oracle.com/cd/B10500_01/server.920/a96524/c12parti.htm
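
A minimal Python sketch of range partitioning, assuming the A-H / I-P / Q-Z ranges from the example above:

import bisect

boundaries = ["H", "P", "Z"]                    # upper bound of each range
partitions = ["A-H", "I-P", "Q-Z"]

def range_partition(key):
    # Pick the first range whose upper bound is >= the key's first letter
    return partitions[bisect.bisect_left(boundaries, key[0].upper())]

for name in ["Ben", "Dami", "Felix", "Ignacio", "Joseph", "Zico"]:
    print(name, "->", range_partition(name))    # Ben/Dami/Felix -> A-H, etc.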
Common partitioning strategies

▪ Range partitioning: split the space of keys into ranges, and allocate ranges to partitions
▪ Hash partitioning: hash each key (essentially obtaining a random number), and allocate these numbers to partitions (using the modulo operation)

https://docs.oracle.com/cd/B10500_01/server.920/a96524/c12parti.htm
Common partitioning strategies

▪ Hash partitioning: hash each key (essentially obtaining a random number), and allocate these numbers to partitions using the modulo operation

  Key      Hash(x)  Mod 3
  Ben      18       0
  Dami     06       0
  Zico     22       1
  Ignacio  20       2
  Felix    95       2
  Joseph   19       1

https://docs.oracle.com/cd/B10500_01/server.920/a96524/c12parti.htm
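
A minimal sketch of hash partitioning in Python (md5 is used only to obtain a stable number; Python's built-in hash() is salted per process). The toy hash values on the slide are illustrative; a real hash function produces different numbers.

import hashlib

NUM_PARTITIONS = 3

def hash_partition(key):
    digest = int(hashlib.md5(key.encode()).hexdigest(), 16)  # key -> "random" number
    return digest % NUM_PARTITIONS                           # number -> partition

for name in ["Ben", "Dami", "Zico", "Ignacio", "Felix", "Joseph"]:
    print(name, "-> partition", hash_partition(name))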
Partition challenges

▪ Scalability:
▪ dealing with data growth

▪ Partition rebalancing:
▪ mapping data to nodes

▪ Querying across partitions:


▪ finding the right data
Scalability challenges: skew

▪ Data skew: some partitions might hold more data than others
▪ Some partition schemes don’t guarantee an equal number of elements
▪ Some keys might hold more values than others

▪ Load skew: partitions might not receive the same amount of requests
Scalability challenges: rebalancing partitions

▪ Changing the number of nodes might require remapping between nodes and partitions
▪ Worst-case scenario: hash mod N
▪ Adding a new node to a hash-partitioned element would change the location of every item
▪ Good strategy for most schemes:
▪ Fixed number of virtual partitions
▪ Nodes have tokens; these tokens can be handed over gracefully as elements are added to the system
Challenges in Rebalancing Partitions

Adding a new node to a hash-partitioned element would change the location of every item!

  Key      Hash(x)  Mod 3  Mod 4
  Ben      18       0      2
  Dami     06       0      2
  Zico     22       1      2
  Ignacio  20       2      0
  Felix    95       2      3
  Joseph   19       1      3

https://docs.oracle.com/cd/B10500_01/server.920/a96524/c12parti.htm
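
A quick sketch showing why hash mod N rebalances badly: going from 3 to 4 nodes moves roughly three quarters of all keys.

import hashlib

def partition(key, n):
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % n

keys = [f"user-{i}" for i in range(10_000)]
moved = sum(1 for k in keys if partition(k, 3) != partition(k, 4))
print(f"{moved / len(keys):.0%} of keys change partition")  # typically ~75%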
Common partitioning strategies

▪ Hash partitioning + ranges: hash each key, then allocate the hashed values to range partitions

  Range   Keys (Hash(x))
  0-10    Dami (06)
  11-20   Ben (18), Joseph (19), Ignacio (20)
  21-30   Zico (22)
  76-100  Felix (95)
Solution: Consistent Hashing

▪ A special type of partitioning strategy that reduces the impact of nodes entering and leaving the system
▪ Each database object is mapped to a point on a circumference (‘hash ring’) by hashing its key value (as in hash partitioning)
▪ Each available machine (node) is mapped to a point on the edge of the same circle
▪ To find the node that stores an object:
▪ Hash the object’s key to a point on the edge of the circle
▪ Walk clockwise around the circle until the first node is encountered
Consistent Hashing

[Figure: hash ring with nodes A, B, C and objects o1-o4]
▪ Objects o1 and o4 are stored on node A
▪ Object o2 is stored on node B
▪ Object o3 is stored on node C
Consistent Hashing: node changes

▪ A node leaves the network: the next node in the clockwise direction stores all new objects that would belong to the failed node
▪ The data objects of the failed node have to be redistributed to the remaining nodes
▪ Nodes added to the network:
▪ Mapped to a point on the hash ring
▪ All objects that map between the point of the new node and its first counter-clockwise neighbour map to the new node
Consistent Hashing: node changes

[Figure: hash ring with nodes A, B, C and objects o1-o4, before any change]
Consistent Hashing: node changes

Node C has left and node D has entered the network

[Figure: hash ring with nodes A, B, D; node C has been removed]
Consistent Hashing: node changes

Node C has left and node D has entered the network

[Figure: hash ring with nodes A, B, D and objects o1-o4]
▪ Object o1 is stored on node A
▪ Object o2 is stored on node B
▪ Objects o3 and o4 are stored on node D
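
A minimal Python sketch of the consistent-hashing ring described above (no virtual nodes, hypothetical node names): only the keys owned by a departing node, or falling between a new node and its counter-clockwise neighbour, are reassigned.

import bisect
import hashlib

class HashRing:
    def __init__(self, nodes=()):
        self._ring = []                                   # sorted (position, node) pairs
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _pos(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node):
        bisect.insort(self._ring, (self._pos(node), node))

    def remove_node(self, node):
        self._ring.remove((self._pos(node), node))

    def node_for(self, key):
        # Walk clockwise from the key's position to the first node (wrapping around)
        idx = bisect.bisect_right(self._ring, (self._pos(key), ""))
        return self._ring[idx % len(self._ring)][1]

ring = HashRing(["A", "B", "C"])
before = {o: ring.node_for(o) for o in ["o1", "o2", "o3", "o4"]}
ring.remove_node("C")
ring.add_node("D")
after = {o: ring.node_for(o) for o in ["o1", "o2", "o3", "o4"]}
print(before, after)                                      # only some objects move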
Query challenges: Range queries across partitions

▪ Example query: SELECT count(*) FROM Emp WHERE age > 50 AND sal > 90,000;
▪ The Emp table is split across k partitions, each held by its own query server
▪ A query coordinator sends the query to every partition and sums the partial counts:
  Ans = 62 + 440 + ... + 1,123 = 99,000
▪ Secondary indexes help find matching rows

[Figure: query coordinator fanning the query out to Partition-1 (Server-1) ... Partition-k (Server-k)]
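
A sketch of the scatter/gather pattern behind that query, with hypothetical in-memory partitions standing in for the query servers:

from concurrent.futures import ThreadPoolExecutor

def count_on_partition(partition):
    # In a real system this query runs on the server owning the partition
    return sum(1 for emp in partition if emp["age"] > 50 and emp["sal"] > 90_000)

def scatter_gather_count(partitions):
    # The coordinator fans the query out to every partition and sums the partial answers
    with ThreadPoolExecutor() as pool:
        return sum(pool.map(count_on_partition, partitions))

partitions = [
    [{"age": 55, "sal": 95_000}, {"age": 30, "sal": 50_000}],   # partition 1
    [{"age": 61, "sal": 120_000}, {"age": 52, "sal": 99_000}],  # partition 2
]
print(scatter_gather_count(partitions))  # 3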
Contents

▪ Data at Cloud Scale


▪ Dealing with partitioning
▪ Dealing with replication
▪ Cloud Data Management Systems
Data replication

▪ Horizontally scaling data storage
▪ Multiple nodes are responsible for storing the same data
▪ All replicas must be synchronised
▪ Multiple variant techniques that use replication:
▪ Memory caches
▪ Leader-based replication
▪ Clustering
Memory Caches

▪ Transient, partitioned and replicated in-memory databases


▪ Replicate most frequently requested data
▪ Benefits:
▪ Fast response to clients
▪ Off-loading database servers
Memcached

▪ High-performance distributed in-memory caching service that manages key-value pairs
▪ Very similar to a hash table or dictionary
▪ Key-value API: get and set methods
▪ Goal: serve the majority of requests for read-heavy workloads
▪ Popular: used at Facebook, LinkedIn, Flickr…
Memcached use with standard DB

[Figure: Web App → Memcached client → Memcached node, with get_from_db(key) as the fallback to the DB]

# Cache-aside read: try the cache first, fall back to the database on a miss
foo = memcached.get(fooId)
if foo is None:
    foo = get_from_db(fooId)       # cache miss: load from the database
    memcached.set(fooId, foo)      # populate the cache for future reads
Scaling Horizontally Memcached

[Figure: Web App → Memcached client hashes the key to one of Memcached Node #1 … #N, with get_from_db(key) as the fallback to the DB]

▪ Multiple Memcached nodes, oblivious that they are part of a cluster
▪ The client hashes keys, partitioning them between Memcached nodes (see the sketch below)
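
A minimal sketch of that client-side partitioning, with plain Python dicts standing in for the Memcached nodes; real Memcached client libraries implement the same idea, often with consistent hashing.

import hashlib

class ShardedCacheClient:
    def __init__(self, nodes):
        self.nodes = nodes                            # one dict per stand-in cache node

    def _node_for(self, key):
        digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
        return self.nodes[digest % len(self.nodes)]   # the client picks the node

    def get(self, key):
        return self._node_for(key).get(key)

    def set(self, key, value):
        self._node_for(key)[key] = value

client = ShardedCacheClient([{}, {}, {}])             # three independent cache nodes
client.set("fooId", {"name": "foo"})
print(client.get("fooId"))                            # served by the node that stored it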
Leader based replication

▪ One server is dedicated to writes (the leader)
▪ A number of replica servers are dedicated to serving reads (the followers)
▪ The leader propagates data updates to the followers
▪ If the leader crashes before completing replication to at least one follower, the write operation is lost
▪ Otherwise, the most up-to-date follower takes over the role of the leader
Clustering

▪ Cluster: a group of redundant computers that provide a continued service and can sustain high user workloads
▪ High-availability clusters provide continued service in the event of failures of any element of the infrastructure
▪ Redundancy eliminates single points of failure
▪ When an HA cluster detects a hardware/software fault, it immediately restarts the application on another system without requiring administrative intervention, a process known as failover
Synchronous vs Asynchronous replication

▪ Synchronous: the leader waits for confirmation that the write was received, then reports success to the user and makes the write visible

▪ Asynchronous: the leader sends the message but does not wait for the follower’s response

[Kleppmann, Designing Data-Intensive Applications, pg 154]
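
A toy sketch of the difference (hypothetical Leader/Follower classes; real systems replicate over the network, not via local calls or threads): the synchronous write returns only after every follower has applied the change, while the asynchronous one acknowledges immediately.

import threading
import time

class Follower:
    def __init__(self):
        self.data = {}
    def apply(self, key, value):
        time.sleep(0.01)                 # simulated network + apply delay
        self.data[key] = value

class Leader:
    def __init__(self, followers):
        self.data, self.followers = {}, followers
    def write_sync(self, key, value):
        self.data[key] = value
        for f in self.followers:         # wait for every follower before acknowledging
            f.apply(key, value)
        return "ok"
    def write_async(self, key, value):
        self.data[key] = value
        for f in self.followers:         # acknowledge without waiting for followers
            threading.Thread(target=f.apply, args=(key, value)).start()
        return "ok"

leader = Leader([Follower(), Follower()])
leader.write_sync("k1", "v1")            # slower, but followers are up to date on return
leader.write_async("k2", "v2")           # fast, but followers may lag behind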


The problem with synchronous updates

▪ Safe, but negatively impacts latency
▪ Some update strategies help (e.g. chain replication)
▪ High availability of a service is often characterized by low latency
▪ Amazon: 0.1 secs more response time would cost them 1% in sales
▪ Akamai: 0.1 secs more response time causes a 7% drop in traffic
▪ Google: 0.5 secs more latency caused traffic to drop by 20%
▪ Asynchronous updates improve latency, at the cost of potential side effects
Risk 1: Non-monotonic reads

▪ A user can see things “moving backward in time”
▪ Can happen if a user makes several reads from different replicas

[Figure: a user makes the same query twice against different replicas]
▪ Follower 1 has little lag and reflects the change
▪ Follower 2 has a larger lag and is not yet up to date

[Kleppmann, Designing Data-Intensive Applications]


Risk 2: Violation of causality

▪ If some partitions are replicated more slowly than others, an observer may see the answer before they see the question

[Figure: information does not propagate homogeneously across partitions; the observer sees the answer before the question]
Goal: linearisability

▪ Intuition: the system should appear as if there were only one copy of the data
▪ The usual expectation from developers
▪ Equivalent to atomic consistency in ACID databases
▪ As soon as a write operation is complete, all reads will see the updated value
▪ If one client reads an updated value, all future reads will also see that updated value (or any further update)
▪ Requires constraints on the replica update rules
Impact of network failures (partitions)

▪ A network interruption forces a choice between linearizability and availability
Brewer’s CAP Theorem

▪ Brewer’s Conjecture (2000)


▪ Symposium on Principles of Distributed Computing
▪ Formally proven in 2002 (Gilbert and Lynch, MIT)
▪ A distributed system cannot guarantee at the
same time all three of:
▪ Consistency (Linearisability)
▪ Availability
▪ Network Partitioning tolerance
Brewer’s CAP Theorem

[Figure: databases positioned on the CAP triangle of Consistency, Availability and Partition tolerance]

João Ricardo Lourenço et al. "Choosing the right NoSQL database for the job: a quality attribute evaluation." Journal of Big Data 2.1 (2015): 18.
CAP Theorem properties

▪ Network Partition tolerance: the service operates normally in the presence of network partitions
▪ A network partition occurs if two or more “islands” of network nodes cannot connect to each other, or messages are highly delayed or lost
▪ Strong Consistency: linearisability; reads either obtain the latest updated value, or an error
▪ Availability: the system is available to serve requests

Network partitioning is a network failure that causes the members to split into multiple groups such that a member in a group cannot communicate with members in other groups. In a partition scenario, all sides of the original cluster operate independently, assuming members on the other sides have failed.
Consistent systems

▪ If the application requires consistency (linearizability) and some replicas are disconnected (e.g. a network problem)
▪ those replicas cannot process requests while disconnected ➔ unavailability:
▪ wait until the network problem is fixed, or
▪ return an error
Eventually Consistent systems

▪ Asynchronous updates:
“replicas update in the background”
▪ The system converges eventually
▪ Not a single model:
▪ active research area,
▪ many consistency models: between linearizability and pure asynchronous
updates
Diversity of database features

Critique of CAP

▪ No choice: network partitions will happen (whether you like it or not)

▪ A better way of phrasing CAP would be: either Consistent or Available when Partitioned
Contents

▪ Data at Cloud Scale


▪ Dealing with partitioning
▪ Dealing with replication
▪ Cloud Data Management Systems
Cassandra

▪ A free and open-source, distributed, wide-column store, NoSQL database management system designed to handle large amounts of data across many commodity servers
▪ Initially developed at Facebook to power the Facebook inbox search feature
▪ Facebook released Cassandra as an open-source project on Google Code in July 2008
Apache Cassandra

▪ Open Source Distributed Database


▪ Based on two influential NoSQL systems:
▪ Amazon Dynamo and Google BigTable
▪ Released in 2008, used by Facebook, eBay, etc.
▪ Great example of partition AND replication features
Cassandra features

1. Elastic: read and write throughput increases linearly as new machines are added
▪ High performance for write operations
2. Supports content replication, per node and per datacentre
3. Decentralised: fault tolerant with no single point of failure; no “master” node
4. Data model: column-based, multi-dimensional hashtable, no join capabilities
5. Tunable consistency, per operation
Cassandra partitioning

▪ Consistent hashing: multiple Cassandra nodes form a ring


▪ Rings can span multiple datacentres (Cassandra is aware of topology)
▪ Virtual nodes (vnodes): every physical node is split into many
virtual nodes (256 default) for easier partition rebalancing

[Hewitt and Carpenter, Cassandra: The Definitive Guide, 2nd Ed]


Cassandra Data Replication

▪ Each key has an associated replication factor (the number of replicas across the network)
▪ The nodes in charge of storing the replicas are the ones immediately to the right (i.e., clockwise) of the node that has to store the key copy, based on the consistent hash function
Cassandra: Access to Data Replicas

[Figure: client → coordinator → ring of nodes #1–#6; the coordinator hashes the value to find the owning node. ReplicationFactor = 1]
Cassandra: Access to Data Replicas

[Figure: the same ring with ReplicationFactor = 3; the key is stored on the owning node and the following nodes clockwise]
Remember: update propagation across replicas

[Figure: a write propagating to the replicas, with different numbers of replies awaited]

▪ Which option is best?


Tunable Consistency: Write / Read Operations

▪ How many replicas need to reply for the operation to become a success?

  Level          Description
  ANY            One node (eventually consistent system)
  QUORUM         N/2 + 1 replicas
  LOCAL_QUORUM   N/2 + 1 replicas in the local data centre
  EACH_QUORUM    N/2 + 1 replicas in each data centre
  ALL            All replicas (consistent system: no response if one replica is not accessible!)

https://docs.datastax.com/en/archived/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html
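
For example, with the DataStax Python driver (cassandra-driver) the consistency level is chosen per statement; the keyspace and table below are hypothetical:

from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

session = Cluster(["127.0.0.1"]).connect("shop")   # hypothetical keyspace

query = SimpleStatement(
    "SELECT description, price FROM products WHERE pid = 1",
    consistency_level=ConsistencyLevel.QUORUM,     # N/2 + 1 replicas must reply
)
row = session.execute(query).one()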
Tunable consistency : Write / Read Operations

▪ The level of consistency depends on:
▪ N = replication factor
▪ W = number of write nodes
▪ R = number of read nodes
▪ Consistency is a spectrum!
▪ “Golden rule” (i.e., rule of thumb):
▪ R + W <= N: eventual consistency
▪ R + W > N: “strong” consistency (still not as strong as linearizability!)
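
The rule of thumb as a one-liner (a sketch using the slide's symbols N, W, R):

def consistency_mode(n, w, r):
    # Overlapping read and write quorums -> "strong" consistency; otherwise eventual
    return "strong" if r + w > n else "eventual"

print(consistency_mode(n=3, w=2, r=2))   # strong  (QUORUM writes + QUORUM reads)
print(consistency_mode(n=3, w=1, r=1))   # eventual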
Data model of Cassandra

▪ Based on Bigtable (Google):


▪ 1 single table
▪ Chang, Fay, et al. "Bigtable: A distributed storage system for
structured data." ACM Transactions on Computer Systems
(TOCS) 26.2 (2008): 1-26.
▪ Similarities with SQL databases (tables, columns…)
▪ Less powerful
▪ Similar query language
eBay Cassandra Deployment

https://www.slideshare.net/jaykumarpatel/cassandra-at-ebay-13920376
eBay Cassandra Deployment
High availability / minimise latency

2 copies in
each DC

Only 1 node
needs to reply
https://www.slideshare.net/jaykumarpatel/cassandra-at-ebay-13920376
Recommended bibliography

▪ M. Kleppmann, Designing Data-Intensive Applications.


▪ Part II: Distributed Data
▪ Chapters 5, 6, 9
▪ DeCandia et al., “Dynamo: Amazon’s Highly Available Key-value Store”, SOSP 2007
▪ W. Vogels, “Eventually Consistent”, CACM 2009
Cloud Computing: roadmap for this module

▪ Network layer:
▪ Networking
▪ Application layer:
▪ Client/server, RPC, Web Services
▪ REST
▪ Performance:
▪ SLA
▪ Management
▪ Security
▪ Trends
▪ Monolithic applications → microservices
▪ Serverless: “hide complexity”
