3 Module NOSQL Preparation
3 Module NOSQL Preparation
Sol:
❖ In the simplest form, we think of a map-reduce job as having a single reduce
Function.
❖ The outputs from all the map tasks running on the various nodes
are concatenated together and sent into the reduce.
❖ While this will work, there are things we can do to increase the parallelism and to reduce
the data transfer
❖ The first thing we can do is increase parallelism by partitioning the output of the
mappers.
❖ Each reduce function operates on the results of a single key.
MapReduce operates in two main phases: Map and Reduce. Each phase involves specific
tasks:
Map Phase
A)
1. As map-reduce calculations get more complex, it’s useful to break them down into
stages using a pipes-and-filters approach, with the output of one stage serving as input
to the next, rather like the pipelines in UNIX.
2. A first stage (Figure 7.9) would read the original order records and output a series of
key-value pairs for the sales of each product per month.
3. The second-stage mappers (Figure 7.10) process this output depending on the year. A
2011 record populates the current year quantity while a 2010 record populates a prior
year quantity.
4) How are calculations composed in Map reduce? Explain with neat diagram
A)
MapReduce is designed to process large datasets by dividing the work into smaller tasks.
However, it imposes some constraints:
1. In the Map Phase: You can only process one piece of data (record) at a time.
2. In the Reduce Phase: You can only process one group of data (key) at a time.
This means you must think differently when solving problems, especially for tasks like
calculating averages, which aren’t straightforward in this model.
5) What Is a Key-Value Store? Single Bucket , Popular Key-Value Databases?
A) Key-value stores are the simplest NoSQL data stores to use from an API perspective.
The client can either get the value for the key, put a value for a key, or delete a key from
the data store.
Single Bucket:
Key-value stores are a type of NoSQL database that store data in a simple
format: a unique key associated with a value. Think of it as a dictionary where
each key points to a specific value.
1. Consistency
2. Transactions
3. Query Features