Big Data Cat Questions

SAI RAM ENGINEERING COLLEGE
An Autonomous Institution | Affiliated to Anna University & Approved by AICTE, New Delhi
Accredited by NBA and NAAC 'A+' | ISO 21001:2018 and 9001:2015 Certified and NIRF ranked institution
Sai Leo Nagar, West Tambaram, Chennai - 600 044. www.sairam.edu.in
CONTINUOUS ASSESSMENT TEST - I

Question Paper Code : 24NU51F233

Course Code - Name : 20ITPC502 / Big Data Essentials
Degree & Program : B.Tech (IT)
Date of Exam : 19-08-2024    Year / Semester : III / V
Duration : 1 ½ Hours    Max. Marks : 50

Answer ALL Questions


PART - A (10 x 1 = 10 Marks)    K-Level  CO

1. What is a primary characteristic of Big Data?    K1  CO1
   a) Structured format  b) Low velocity  c) Small volume  d) Variety and volume
2. Which industry is known for using Big Data in algorithmic trading?    K1  CO1
   a) Agriculture  b) Retail  c) Healthcare  d) Finance
3. What is a common Big Data technology used for real-time analytics?    K1  CO1
   a) Hadoop  b) SQL  c) Spark  d) Excel
4. Which characteristic of Big Data refers to the trustworthiness and accuracy of data?    K1  CO1
   a) Volume  b) Variety  c) Velocity  d) Veracity
5. Which of the following is a use case of Big Data in marketing?    K1  CO1
   a) Crop yield prediction  b) Weather monitoring  c) Customer behavior analysis  d) Space exploration
6. Which Hadoop component is responsible for distributed data storage?    K1  CO2
   a) Hadoop MapReduce  b) Hadoop YARN  c) Hadoop Distributed File System (HDFS)  d) Hadoop Hive
7. Who is credited with the creation of Hadoop?    K1  CO2
   a) Jeff Dean and Sanjay Ghemawat  b) Doug Cutting and Mike Cafarella
   c) Larry Page and Sergey Brin  d) Brendan Eich and Marc Andreessen
8. What is the default block size for files stored in HDFS?    K1  CO2
   a) 64 MB  b) 128 MB  c) 256 MB  d) 512 MB
9. What does Hadoop Streaming allow users to do?    K1  CO2
   a) Use a graphical interface to manage Hadoop clusters.
   b) Execute MapReduce jobs with scripts or programs in languages other than Java.
   c) Store data in HDFS without using compression.
   d) Visualize Hadoop data directly within the Hadoop ecosystem.
10. What is a design goal of Hadoop Distributed File System (HDFS)?    K1  CO2
    a) Low fault-tolerance  b) High throughput for data access
    c) Centralized storage model  d) Limited scalability

K1 - Remember; K2 - Understand; K3 - Apply; K4 - Analyze; K5 - Evaluate; K6 - Create    Page 1 of 2
PART - B (10 x 2 = 20 Marks)    K-Level  CO

11. Why is unstructured text data important in decision making?    K1  CO1
12. What is the recommended best practice for managing big data analytics programs?    K1  CO1
13. Define Web Analytics.    K1  CO1
14. Define Bounce Rate.    K1  CO1
15. List out some of the Data Visualization tools.    K1  CO1
16. Define Hadoop.    K1  CO2
17. What is meant by Streaming in Hadoop?    K1  CO2
18. Summarize Scaling Out in Hadoop.    K1  CO2
19. Give two example commands used in Hadoop.    K1  CO2
20. What are the different versions of Hadoop?    K1  CO2

PART - C (2 x 10 = 20 Marks)    Mark Split-up  K-Level  CO

21. a) Define Big Data and the Vs of Big Data.    10  K1  CO1
    (Or)
    b) Illustrate the Big Data Analytics process and its types. List out the benefits and challenges of Big Data Analytics.    10  K1  CO1
22. a) Discuss about the Components of Hadoop.    10  K2  CO2
    (Or)
    b) Explain briefly about the Hadoop Distributed File System.    10  K2  CO2

Course Outcomes
CO1 Illustrate various big data concepts and its use cases in various application domains.(K1)
CO2 Understand the Hadoop distributed file systems on different applications.(K2)
CO3 Infer the working of Hadoop architecture and Mapreduce Framework.(K2)
CO4 Articulate the different Hadoop ecosystem components.(K3)
CO5 Demonstrate the big data solutions using Spark Programming.(K3)
CO6 Solve the various distributed applications using the Big data technologies.(K3)

Distribution of COs (Percentage wise)

CO No.  CO1  CO2  CO3  CO4  CO5
%       50   50   --   --   --

SAI RAM ENGINEERING COLLEGE
An Autonomous Institution | Affiliated to Anna University & Approved by AICTE, New Delhi
Accredited by NBA and NAAC 'A+' | ISO 21001:2018 and 9001:2015 Certified and NIRF ranked institution
Sai Leo Nagar, West Tambaram, Chennai - 600 044. www.sairam.edu.in

CONTINUOUS ASSESSMENT TEST - II

Question Paper Code : 24NU52F233

Course Code - Name : 20ITPC502 / Big Data Essentials
Degree & Program : B.Tech / IT
Date of Exam : 30-09-2024    Year / Semester : III / V
Duration : 1 ½ Hours    Max. Marks : 50
Answer ALL Questions
PART - A (10 x 1 = 10 Marks)    K-Level  CO

1. Which tool is used to ingest data from external databases into Hadoop?    K1  CO2
   A) Apache Flume  B) Apache Sqoop  C) Apache Kafka  D) Apache Pig
2. What is the purpose of Hadoop Archives (HAR files)?    K1  CO2
   A) To compress data for efficient storage
   B) To provide a high-speed access mechanism for frequently accessed data
   C) To package and archive large numbers of small files into fewer large files for more efficient storage
   D) To facilitate real-time data processing
3. Which command-line interface tool is used to interact with HDFS?    K1  CO2
   A) hadoop fs  B) hdfs dfs  C) hadoop cli  D) hdfs admin
4. During the Shuffle phase in MapReduce, what happens to the intermediate data?    K1  CO3
   A) It is sorted and grouped by keys  B) It is partitioned and reduced
   C) It is processed by mappers  D) It is stored in HDFS
5. In the context of failures in Classic MapReduce, what mechanism is used to handle task failures?    K1  CO3
   A) TaskTracker re-execution  B) JobTracker re-execution
   C) Speculative execution  D) HDFS replication
6. Which type of MapReduce input format is suitable for handling binary data?    K1  CO3
   A) TextInputFormat  B) SequenceFileInputFormat
   C) KeyValueTextInputFormat  D) AvroInputFormat

K1 - Remember; K2 - Understand; K3 - Apply; K4 - Analyze; K5 - Evaluate; K6 - Create    Page 1 of 2


7. What is the maximum number of reducers that can be configured in a MapReduce job?    K1  CO3
   A) 1  B) 10  C) Unlimited  D) Depends on the cluster configuration
8. What is a combiner in MapReduce?    K1  CO3
   A) A function that combines multiple outputs from the reducer
   B) A mini-reducer that performs local aggregation
   C) A tool for input data validation  D) A format for output data
9. What does the term "data locality" refer to in the context of Hadoop?    K1  CO3
   A) Storing data in a centralized location  B) Processing data close to where it is stored
   C) Moving data to a different cluster  D) Encrypting data for security
10. Which of the following types of failures can occur in a MapReduce job?    K1  CO3
    A) Task failure  B) Node failure
    C) Application failure  D) All of the above
PART - B (10 x 2 = 20 Marks)    K-Level  CO

11. Write the Hadoop Archives operations.    K2  CO2
12. Define Codec. List out some of the compression algorithms.    K2  CO2
13. What is serialization and deserialization?    K2  CO2
14. What are the features of MapReduce?    K2  CO3
15. What is the use of the org.apache.hadoop.io package?    K2  CO3
16. What is YARN?    K2  CO3
17. List out the differences between Fair scheduling and Capacity scheduling.    K2  CO3
18. List 5 steps in submitting an application in YARN.    K2  CO3
19. State one Map-Side tuning property and describe it.    K2  CO3
20. What is Speculative execution?    K2  CO3

PART - C (2 x 10 = 20 Marks)    Mark Split-up  K-Level  CO

21. a) Explain briefly about Data Ingest with Flume and Sqoop.    10  K2  CO2
    (Or)
    b) Explain in detail about Hadoop I/O Compression.    10  K2  CO2
22. a) Explain in detail about the MapReduce features.    10  K2  CO3
    (Or)
    b) Explain the working of the following phases of MapReduce with one common example:    10  K2  CO3
       (i) Map Phase
       (ii) Shuffle and Sort Phase
       (iii) Reducer Phase

Course Outcomes
CO1 Illustrate various big data concepts and its use cases in various application domains.(K1)
CO2 Understand the Hadoop distributed file systems on different applications.(K2)
CO3 Infer the working of Hadoop architecture and Mapreduce Framework.(K2)
CO4 Articulate the different Hadoop ecosystem components.(K3)
CO5 Demonstrate the big data solutions using Spark Programming.(K3)
CO6 Solve the various distributed applications using the Big data technologies.(K3)

Distribution of COs (Percentage wise)

CO No.  CO1  CO2   CO3   CO4  CO5  CO6
%       --   41.5  58.5  --   --   --


SAI RAM ENGINEERING COLLEGE
An Autonomous Institution | Affiliated to Anna University & Approved by AICTE, New Delhi
Accredited by NBA and NAAC 'A+' | ISO 21001:2018 and 9001:2015 Certified and NIRF ranked institution
Sai Leo Nagar, West Tambaram, Chennai - 600 044. www.sairam.edu.in

CONTINUOUS ASSESSMENT TEST - III

Question Paper Code : 24NU53F234

Course Code - Name : 20ITPC502 / Big Data Essentials
Degree & Program : B.Tech (IT)
Date of Exam : 15-11-2024    Year / Semester : III / V
Duration : 1 ½ Hours    Max. Marks : 50

Answer ALL Questions


PART - A (10 x 1 = 10 Marks)    K-Level  CO

1. Pig Latin is a    K1  CO4
   A) Query language for databases  B) Dataflow language
   C) Markup language  D) Scripting language for web development
2. In which scenario would you choose Pig over traditional databases?    K1  CO4
   A) When real-time transactions are required  B) When handling unstructured data
   C) When ACID properties are important  D) When low-latency access is needed
3. What is the Grunt shell used for in Pig?    K1  CO4
   A) Running Pig scripts  B) Managing Hadoop clusters
   C) Querying databases  D) Creating Hive tables
4. Which shell is used to run Spark interactively?    K1  CO5
   A) Python shell  B) Scala shell  C) Spark shell  D) R
5. What is the core abstraction in Spark called?    K1  CO5
   A) RDD (Resilient Distributed Dataset)  B) DataFrame  C) DataSet  D) Task Scheduler
6. Which of the following programming models does Spark use?    K1  CO5
   A) MapReduce  B) Directed Acyclic Graph (DAG)
   C) Neural Networks  D) Linear Regression
7. Which Spark component allows for real-time stream processing?    K1  CO5
   A) Spark SQL  B) Spark Streaming  C) MLlib  D) GraphX
8. Which company developed CUDA?    K1  CO6
   A) AMD  B) Intel  C) NVIDIA  D) Microsoft
9. Which operation is commonly accelerated using GPUs in big data applications?    K1  CO6
   A) Matrix multiplication  B) String manipulation  C) File I/O  D) HTML parsing
10. Which of the following memory types is shared between threads in the same block in CUDA?    K1  CO6
    A) Global memory  B) Shared memory  C) Constant memory

K1 - Remember; K2 - Understand; K3 - Apply; K4 - Analyze; K5 - Evaluate; K6 - Create    Page 1 of 2


PART - B (10 x 2 = 20 Marks)    K-Level  CO

11. Define PIG.    K2  CO4
12. What are the major components in the Apache Pig framework?    K2  CO4
13. Compare HBase with RDBMS.    K2  CO4
14. Define NULL value in Pig Latin.    K2  CO4
15. Explain the difference between transformations and actions in Spark.    K2  CO5
16. What is the use of 'spark-shell' in Spark?    K2  CO5
17. What is DAG (Directed Acyclic Graph) in Spark?    K2  CO5
18. How does Spark handle fault tolerance in RDDs?    K2  CO5
19. What is the role of shared memory in CUDA?    K2  CO6
20. What is the purpose of thread divergence in CUDA?    K2  CO6

PART - C (2 x 10 = 20 Marks)    Mark Split-up  K-Level  CO

21. a) Explain in detail about Data Processing operators in Pig.    10  K2  CO4
    (Or)
    b) Explain about HBase concepts.    10  K2  CO4
22. a) Compare and contrast Spark's RDD, DataFrame, and Dataset APIs.    10  K2  CO5
    (Or)
    b) Describe the process of writing and running a Spark application using Scala and Python.    10  K2  CO5

Course Outcomes
CO1 Illustrate various big data concepts and its use cases in various application domains.(K1)
CO2 Understand the Hadoop distributed file systems on different applications.(K2)
CO3 Infer the working of Hadoop architecture and Mapreduce Framework.(K2)
CO4 Articulate the different Hadoop ecosystem components.(K3)
CO5 Demonstrate the big data solutions using Spark Programming.(K3)
CO6 Solve the various distributed applications using the Big data technologies.(K3)

Distribution of COs (Percentage wise)

CO No.  CO1  CO2  CO3  CO4  CO5  CO6
%       --   --   --   35%  35%  30%
