Approved by AICTE, New Delhi, Affiliated to Anna University Chennai, Accredited by NBA & TCS
DEPARTMENT OF COMPUTER SCIENCE AND BUSINESS SYSTEMS
Test Model (set1) Sub/Code: CCS334/Big Data Analytics
Year / SEM: III/V Date: 12.11.2024
Time: 3 hrs Marks: 100
PART – A (10 X 2 = 20 Marks)
Q.No. Question (CO, Level)
1. What is Big Data? List the four computing resources of Big Data storage. (CO1, K1)
2. What are the characteristics of unstructured data? (CO1, K1)
3. Differentiate between schema-less and schema-based databases. (CO2, K2)
4. Define a materialized view and state its uses. (CO2, K2)
5. What are the limitations of the MapReduce model? (CO3, K1)
6. List the failures in MapReduce and YARN. (CO3, K2)
7. What are the features of Cassandra's client model? (CO4, K2)
8. Compare the text file, sequence file and Avro file formats. (CO4, K2)
9. Define praxis. (CO5, K2)
10. What are the features of Pig Latin scripts? (CO5, K2)
PART – B (5 X 13 = 65 Marks)
11. (a) (CO1, K2)
i) Differentiate between cloud data analytics and Big Data analytics with neat, separate diagrams. (10)
ii) Explain the characteristics of Big Data in terms of its 6 V's, with a diagram. (3)
(b) (CO1, K2)
i) Explain inter- and trans-firewall analytics based on focus, method, benefits and limitations. (10)
ii) Define crowdsourcing analytics. Give an example. (3)
12. (a) (CO2, K2)
i) Differentiate between key-value stores and document data stores, with neat diagrams and their key features. (10)
ii) Explain the consistency model of NoSQL in relation to the CAP theorem. (3)
(b) (CO2, K2/K3)
i) Explain in detail the distribution models of NoSQL, covering the master-slave relationship and the peer-to-peer architecture. (8)
ii) How are sharding and replication combined in a distributed NoSQL database? (5)
13. (a) (CO3, K5)
Count the frequency of each word in the given input string “This is an Apple, Apple is red in color, Apple is good for health”. Explain in detail each phase of the MapReduce algorithm, with the corresponding input, processing and output format, and a neat diagram.
(b) (CO3, K1/K2)
i) Describe the YARN architecture in detail, with a diagram, its advantages and its limitations. (8)
ii) Define the MRUnit test and list its steps. (2)
iii) Define test data and explain how a local test is performed for a MapReduce problem. (3)
14. (a) (CO4, K2/K3)
i) Explain in detail the nodes in Hadoop's physical organization. (3)
ii) Explain the Hadoop data flow of a read operation between these nodes, with a suitable diagram. (5)
iii) Explain the Hadoop data flow of a write operation between these nodes, with a suitable diagram. (5)
(b) (CO4, K1)
i) Compare the various file formats (text, sequence, RC, ORC, Avro) by features, format, diagram and example. (8)
ii) Explain the various Hadoop compression techniques and their uses. (5)
15. (a) (CO4, K3)
i) Compare Hadoop and Cassandra based on architecture, data center, replication, latency, indexing and compression. (8)
ii) Explain any five Cassandra queries with examples. (5)
(b) (CO5, K2/K4)
i) Explain query optimization in the Hive query language. (5)
ii) Assume an employee table with the fields Id, Name, Salary, Designation and Dept. Write a query to retrieve the details of employees who earn a salary of more than Rs 30000. (4)
iii) Write the syntax for Left Outer Join and Right Outer Join, with an example. (4)
PART – C (1 X 15 = 15 Marks)
16. Compare the Hive, Pig and HBase tools based on features, need, architectural components and limitations, with suitable examples. (CO5, K2)
CO1: Describe Big Data and use cases from selected business domains.
CO2: Explain NoSQL Big Data management.
CO3: Perform MapReduce analytics using Hadoop.
CO4: Install, configure and run Hadoop and HDFS.
CO5: Use Hadoop-related tools such as HBase, Hive, Pig and Cassandra for Big Data analytics.
K1 – Remember, K2 – Understand, K3 – Apply, K4 – Analyze, K5 – Evaluate, K6 – Create
Prepared By Verified By Approved by
Subject In-charge Batch Coordinator HOD/CSBS PRINCIPAL