3 Module NOSQL Preparation

1) Explain, with a neat diagram, partitioning and combining in MapReduce.

A) Diagram - 4 Marks

Combining reduces data before sending it across the network. - 3 Marks

Sol:
❖ In the simplest form, we think of a map-reduce job as having a single reduce function.
❖ The outputs from all the map tasks running on the various nodes are concatenated together and sent into the reduce.
❖ While this will work, there are things we can do to increase the parallelism and to reduce the data transfer.
❖ The first thing we can do is to increase parallelism by partitioning the output of the mappers.
❖ Each reduce function operates on the results of a single key.
❖ A combiner function is, in essence, a reducer function; indeed, in many cases the same function can be used for combining as the final reduction.
❖ The reduce function needs a special shape for this to work: its output must match its input. We call such a function a combinable reducer.
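The combinable-reducer idea can be sketched with a hypothetical word-count job: because the reduce function's output, (word, count) pairs, matches its input, the very same function can run as a combiner on each map node before the shuffle.

```python
from collections import defaultdict

# Hypothetical word-count job: the reduce function sums counts per key.
# Its input and output are both (word, count) pairs, so the same function
# can run as a combiner on each map node before the network shuffle.
def combinable_reduce(key, counts):
    return (key, sum(counts))

def map_phase(lines):
    # Emit (word, 1) for every word in the input split.
    for line in lines:
        for word in line.split():
            yield (word, 1)

def combine_locally(pairs):
    # Combiner: shrinks a node's map output before it crosses the network.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return [combinable_reduce(k, v) for k, v in grouped.items()]

# One node's map output is combined locally...
node_output = combine_locally(map_phase(["nosql nosql store", "nosql store"]))
# ...and the final reduce applies the SAME function to the combined pairs.
final = dict(combine_locally(node_output))
print(final)  # {'nosql': 3, 'store': 2}
```

All names here are illustrative; the point is that combining the already-combined pairs gives the same answer as reducing the raw pairs, which is exactly what makes the reducer combinable.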
2) Explain basic MapReduce, with a neat diagram // Explain Mappers and Reducers with examples.
A)​
MapReduce is a programming model used for processing large datasets in a distributed and
parallel manner across many computers.

MapReduce operates in two main phases, Map and Reduce, with a Shuffle and Sort step between them. Each phase involves specific tasks:

Map Phase

● The input data is divided into chunks (splits).
● Each chunk is processed by a Mapper function, which transforms the input into intermediate key-value pairs.

Shuffle and Sort

● The intermediate key-value pairs are shuffled and grouped by key.
● Keys are sent to the appropriate Reducer based on a partitioning function.

Reduce Phase

● Each group of key-value pairs is processed by a Reducer function.
● The Reducer aggregates the values for each key to produce the final result.
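The three phases can be traced end to end in a minimal sketch, using a hypothetical orders dataset of (product, quantity) records:

```python
from itertools import groupby

# Hypothetical orders dataset; each record is (product, quantity).
orders = [("beer", 2), ("peanuts", 1), ("beer", 3), ("peanuts", 4)]

# Map phase: each record is transformed into an intermediate key-value pair.
def mapper(record):
    product, quantity = record
    return (product, quantity)

intermediate = [mapper(r) for r in orders]

# Shuffle and sort: pairs are grouped by key so each reducer sees one key's values.
intermediate.sort(key=lambda kv: kv[0])
grouped = {k: [v for _, v in g] for k, g in groupby(intermediate, key=lambda kv: kv[0])}

# Reduce phase: aggregate the values for each key.
result = {product: sum(quantities) for product, quantities in grouped.items()}
print(result)  # {'beer': 5, 'peanuts': 5}
```

In a real framework the sort and grouping happen across the network between nodes; here they are simulated in-process to keep the phases visible.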

3) Explain a two-stage MapReduce example, with a neat diagram.

A)​
1.​ As map-reduce calculations get more complex, it’s useful to break them down into
stages using a pipes-and-filters approach, with the output of one stage serving as input
to the next, rather like the pipelines in UNIX.
2.​ A first stage (Figure 7.9) would read the original order records and output a series of
key-value pairs for the sales of each product per month.
3.​ The second-stage mappers (Figure 7.10) process this output depending on the year. A
2011 record populates the current year quantity while a 2010 record populates a prior
year quantity.
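The two stages above can be sketched as a pipeline in plain Python. The records, products, and years below are made-up illustrations of the Figure 7.9/7.10 flow, not the book's actual data:

```python
from collections import defaultdict

# Hypothetical order records: (product, year, month, quantity).
orders = [
    ("puerh", 2011, 4, 6),
    ("puerh", 2010, 4, 3),
    ("dragonwell", 2011, 4, 2),
]

# Stage 1: reduce the raw orders to sales of each product per month.
stage1 = defaultdict(int)
for product, year, month, qty in orders:
    stage1[(product, year, month)] += qty

# Stage 2 mappers: route each stage-1 record by year, so a 2011 quantity
# populates "current" while a 2010 quantity populates "prior".
stage2 = defaultdict(lambda: {"current": 0, "prior": 0})
for (product, year, month), qty in stage1.items():
    slot = "current" if year == 2011 else "prior"
    stage2[(product, month)][slot] += qty

print(dict(stage2))
# {('puerh', 4): {'current': 6, 'prior': 3}, ('dragonwell', 4): {'current': 2, 'prior': 0}}
```

The output of stage 1 is exactly the input of stage 2, which is the pipes-and-filters property that makes the stages composable.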

4) How are calculations composed in MapReduce? Explain with a neat diagram.
A)​

MapReduce is designed to process large datasets by dividing the work into smaller tasks.
However, it imposes some constraints:

1.​ In the Map Phase: You can only process one piece of data (record) at a time.
2.​ In the Reduce Phase: You can only process one group of data (key) at a time.

This means you must think differently when solving problems, especially for tasks like
calculating averages, which aren’t straightforward in this model.
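The averages problem mentioned above is the standard illustration: averages cannot be combined directly (the average of two partial averages is generally wrong), so the usual trick is to carry (sum, count) pairs through the computation and divide only at the end. A minimal sketch:

```python
# Averages are not directly combinable: the average of averages is wrong.
# The standard trick is to carry (sum, count) pairs through the reduce
# and only divide at the very end.
partials = [(10, 2), (30, 4)]  # hypothetical (sum, count) from two map tasks

def combine(a, b):
    # Combining (sum, count) pairs is associative, so it can run anywhere.
    return (a[0] + b[0], a[1] + b[1])

total_sum, total_count = (0, 0)
for p in partials:
    total_sum, total_count = combine((total_sum, total_count), p)

average = total_sum / total_count  # 40 / 6 ≈ 6.67

# Naively averaging the two partial averages gives the wrong answer:
wrong = ((10 / 2) + (30 / 4)) / 2  # 6.25, not 6.67
```

Because combine is associative and commutative, it can run as a combiner on any node, which is how such calculations are composed within MapReduce's one-record, one-key constraints.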
5) What is a key-value store? Explain the single-bucket approach and list popular key-value databases.

A)​ Key-value stores are the simplest NoSQL data stores to use from an API perspective.
The client can either get the value for the key, put a value for a key, or delete a key from
the data store.
Single Bucket:

Bucket Organization in Key-Value Stores:

● Single Bucket Approach: All data (e.g., session data, shopping carts) can be stored within a single bucket under one key-value pair, creating a unified object. However, this can risk key conflicts due to different data types being stored under the same bucket.
● Separate Buckets for Data Types: By appending object names to keys, or by creating specific buckets for each data type (e.g., sessionID_userProfile), it is possible to avoid key conflicts and access only the necessary object types without needing extensive key design changes.
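The key-conflict risk and the sessionID_userProfile key design can be seen in a toy sketch, with a plain dict standing in for a store's bucket:

```python
# A plain dict stands in for a key-value store bucket in this sketch.
bucket = {}

session_id = "sess42"

# Single-bucket approach: different object types share one keyspace,
# so a session's cart and profile stored under the same raw key collide.
bucket[session_id] = {"cart": ["milk"]}
bucket[session_id] = {"name": "John"}   # overwrites the cart data!

# Appending the object type to the key avoids the conflict
# (the sessionID_userProfile key design mentioned above):
bucket[session_id + "_cart"] = {"cart": ["milk"]}
bucket[session_id + "_userProfile"] = {"name": "John"}

print(sorted(bucket))  # ['sess42', 'sess42_cart', 'sess42_userProfile']
```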

Popular Key-Value Databases:

1. Riak: Uses a "bucket" structure for segmenting keys, aiding organization.
2. Redis: Often referred to as a data structure server; supports complex structures like lists, sets, and hashes, enabling more versatile use.
3. Memcached, Berkeley DB, HamsterDB, Amazon DynamoDB, Project Voldemort.
6) Give a brief description of the features of key value stores.

A)​ Key-Value Store Features

Key-value stores are a type of NoSQL database that store data in a simple
format: a unique key associated with a value. Think of it as a dictionary where
each key points to a specific value.

Let’s explore the features with respect to the mentioned points:

1. Consistency

● Explanation: Consistency refers to whether data remains the same across all replicas in a distributed database.
●​ In key-value stores:
○​ They usually follow the CAP theorem, where they can trade off
between consistency, availability, and partition tolerance.
○​ Some systems prioritize eventual consistency: changes to a value
may take time to propagate to all replicas but will eventually become
consistent.
○​ Others may enforce strong consistency, ensuring all clients see
the same data at any given time.
●​ Example: If you update a key’s value in a distributed system, not all nodes
may show the update immediately if eventual consistency is used.
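The eventual-consistency example above can be simulated with two in-memory "replicas" where replication is deliberately delayed; this is a toy illustration, not how any particular store implements it:

```python
# Toy illustration of eventual consistency: a write lands on one replica
# first, so reads from the other replica are stale until propagation.
replicas = {"node_a": {}, "node_b": {}}
pending = []  # writes waiting to be replicated

def put(key, value):
    replicas["node_a"][key] = value   # acknowledged immediately
    pending.append((key, value))      # replication happens later

def propagate():
    # In a real store this runs asynchronously in the background.
    while pending:
        key, value = pending.pop(0)
        replicas["node_b"][key] = value

put("user123", "v2")
stale = replicas["node_b"].get("user123")   # None: node_b hasn't caught up
propagate()
fresh = replicas["node_b"].get("user123")   # "v2": replicas now agree
print(stale, fresh)  # None v2
```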

2. Transactions

● Explanation: Transactions involve ensuring that a group of operations is completed successfully or not at all (atomicity).
●​ In key-value stores:
○​ Many do not natively support ACID transactions (Atomicity,
Consistency, Isolation, Durability) like relational databases.
○​ Some advanced key-value stores (e.g., Redis) provide limited
transaction-like mechanisms (like multi-operations or optimistic
locking).
○​ They are generally designed for speed and scalability rather than
transactional integrity.
●​ Use Case: A key-value store may not handle banking transactions well
because it lacks strong transaction guarantees.
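The optimistic-locking style mentioned above can be sketched as a compare-and-set in plain Python (names and the scenario are hypothetical; Redis exposes a similar pattern through WATCH/MULTI rather than this exact API):

```python
# Sketch of optimistic locking in a key-value store: a compare-and-set
# succeeds only if the value hasn't changed underneath us.
store = {"balance": 100}

def compare_and_set(key, expected, new):
    if store.get(key) != expected:
        return False   # someone else modified the value; caller must retry
    store[key] = new
    return True

current = store["balance"]
store["balance"] = 120               # a concurrent writer sneaks in
ok = compare_and_set("balance", current, current - 30)
print(ok, store["balance"])  # False 120 -- our stale update was rejected
```

This rejects-and-retry behavior is far weaker than full ACID, which is why the banking use case above is a poor fit.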

3. Query Features

● Explanation: Query features determine how you retrieve and manipulate data in the store.
●​ In key-value stores:
○​ Queries are very simple—data is accessed using the key.
○​ There are no complex querying capabilities (e.g., SQL JOINs or
WHERE clauses).
○​ Some systems provide additional features like range queries or
secondary indexing, but these are not standard.
●​ Example:
○​ To retrieve data: GET key1
○​ To update data: SET key1 value1
○​ Advanced querying like "find all users older than 30" is not directly
supported.
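The contrast between key lookup and the unsupported "users older than 30" query can be sketched with a dict standing in for the store:

```python
# Key-value access is by key only; anything like "find all users older
# than 30" has to be done client-side by scanning every value.
store = {
    "user1": {"name": "John", "age": 30},
    "user2": {"name": "Mary", "age": 35},
}

value = store.get("user1")                    # GET key1: one lookup, fast
store["user1"] = {"name": "John", "age": 31}  # SET key1 value1

# No WHERE clause: the "query" below touches every record in the store.
over_30 = [k for k, v in store.items() if v["age"] > 30]
print(over_30)  # ['user1', 'user2']
```

The scan works here only because the whole store fits in one process; at scale this is exactly the operation key-value stores do not support efficiently without secondary indexes.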

4. Structure of the Data

● Explanation: This refers to how data is organized and stored.
●​ In key-value stores:
○​ Data is stored as a simple key-value pair.
○​ The value can be of any type—string, number, JSON, or even a
binary object (like an image).
○​ They are schema-less, meaning no predefined structure is required
for values.
● Example:
○ Key: user123
○ Value: { "name": "John", "age": 30, "city": "New York" }
5. Scaling

● Explanation: Scaling determines how well the database handles an increase in data or traffic.
●​ In key-value stores:
○​ They are designed to scale horizontally (by adding more servers to
the cluster).
○​ Distributed systems partition the data using techniques like
consistent hashing to balance the load across servers.
○​ Scaling is easy because there is no need to maintain relationships
between data (unlike relational databases).
●​ Example: When a shopping website grows and needs to handle millions of
users, a key-value store can distribute data across multiple servers
seamlessly.
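The consistent-hashing technique mentioned above can be sketched minimally: servers are placed on a hash ring and each key goes to the first server clockwise from its own hash. This is a bare illustration (real implementations add virtual nodes for balance):

```python
import hashlib
from bisect import bisect

# Minimal consistent-hashing sketch: servers sit on a hash ring and each
# key is stored on the first server clockwise from the key's hash.
def ring_hash(s):
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

servers = ["server-a", "server-b", "server-c"]
ring = sorted((ring_hash(s), s) for s in servers)

def server_for(key):
    points = [p for p, _ in ring]
    idx = bisect(points, ring_hash(key)) % len(ring)
    return ring[idx][1]

# Keys spread across the cluster deterministically; adding or removing a
# server moves only the keys in one arc of the ring, not the whole dataset.
placement = {k: server_for(k) for k in ["user123", "cart9", "sess42"]}
print(placement)
```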

7) Explain suitable use cases of key-value stores.

A)
1. Storing Session Information: every web session is unique, so the whole session can be stored as a single value under the session ID and read or written in one operation.
2. User Profiles, Preferences: a user's profile and preferences can be fetched in a single read using the user ID as the key.
3. Shopping Cart Data: the cart is stored against the user ID, so it remains available across browsers, machines, and sessions.
