ASSESSMENT METHODS:
CAT 1, CAT 2, Model Exam, End Semester Exams, Assignments, Case Studies,
Quiz, MCQ, Projects, Seminars, Demonstration/Presentation, Open book test
OEC INTRODUCTION TO DATA ANALYTICS 3 0 0 3
Course Objectives
To provide the knowledge and expertise needed to become a proficient data scientist.
To explore the fundamental concepts of big data and data analytics.
To gain knowledge of Hadoop-related tools such as MongoDB, Cassandra, Pig,
and Hive for big data analytics.
UNIT I : Introduction to Big Data 9
Types of Digital Data – Characteristics of Data – Evolution of Big Data – Definition of
Big Data – Challenges with Big Data – 3Vs of Big Data – Non-definitional Traits of Big
Data – Business Intelligence vs. Big Data – Data Warehouse and Hadoop Environment –
Coexistence.
UNIT II : Big Data Analytics 9
Classification of Analytics – Data Science – Terminologies in Big Data – CAP Theorem
– BASE Concept. NoSQL: Types of Databases – Advantages – NewSQL – SQL vs.
NoSQL vs. NewSQL.
UNIT III : Introduction to Hadoop 9
Features – Advantages – Versions – Overview of Hadoop Ecosystem – Hadoop
Distributions – Hadoop vs. SQL – RDBMS vs. Hadoop – Hadoop Components –
Architecture – HDFS – MapReduce: Mapper – Reducer – Combiner – Partitioner –
Searching – Sorting – Compression. Hadoop 2 (YARN): Architecture – Interacting with
Hadoop Ecosystem.
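Illustrative sketch (not part of the prescribed syllabus): the MapReduce stages listed in Unit III (Mapper, Combiner, Partitioner, Reducer) can be pictured with a small in-memory word count in plain Python. This does not use the Hadoop API; the function names, the two-reducer setup, and the byte-sum partitioning rule are assumptions chosen only for illustration.

```python
from collections import Counter, defaultdict


def mapper(line):
    # Emit a (word, 1) pair for every word in the input line.
    return [(word.lower(), 1) for word in line.split()]


def combiner(pairs):
    # Pre-aggregate counts locally, before data would cross the network.
    local = Counter()
    for word, count in pairs:
        local[word] += count
    return list(local.items())


def partitioner(key, num_reducers):
    # Route each key to one reducer. A byte sum is used here (an assumption)
    # so that the assignment is stable across runs.
    return sum(key.encode("utf-8")) % num_reducers


def reducer(key, values):
    # Sum all partial counts that arrived for one key.
    return key, sum(values)


def run_job(lines, num_reducers=2):
    # One dict per reducer, mapping key -> list of partial counts.
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for line in lines:  # each line stands in for one map task
        for word, count in combiner(mapper(line)):
            partitions[partitioner(word, num_reducers)][word].append(count)
    result = {}
    for partition in partitions:
        for key in sorted(partition):  # keys reach a reducer in sorted order
            word, total = reducer(key, partition[key])
            result[word] = total
    return result


counts = run_job(["big data big analytics", "data data hadoop"])
print(counts)
```

In a real Hadoop job the framework performs the shuffle and sort between the map and reduce phases; here the nested loops in `run_job` stand in for that step.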
UNIT IV : NoSQL Databases 9
MongoDB: Introduction – Features – Data Types – MongoDB Query Language – CRUD
Operations – Arrays – Functions: Count – Sort – Limit – Skip – Aggregate –
MapReduce. Cursors – Indexes – MongoImport – MongoExport. Cassandra: Introduction –
Features – Data Types – CQLSH – Keyspaces – CRUD Operations – Collections –
Counter – TTL – Alter Commands – Import and Export – Querying System Tables.
UNIT V : Hadoop Ecosystem 9
Hive: Architecture – Data Types – File Formats – HQL – SerDe – User Defined Functions.
Pig: Features – Anatomy – Pig on Hadoop – Pig Philosophy – Pig Latin Overview – Data
Types – Running Pig – Execution Modes of Pig – HDFS Commands – Relational
Operators – Eval Functions – Complex Data Types – Piggy Bank – User Defined Functions
– Parameter Substitution – Diagnostic Operators.
TOTAL HOURS: 45
TEXT BOOKS:
1. Seema Acharya, Subhashini Chellappan, “Big Data and Analytics”, Wiley
Publication, 2015.
REFERENCE BOOKS:
1. Judith Hurwitz, Alan Nugent, Dr. Fern Halper, Marcia Kaufman, “Big Data for
Dummies”, John Wiley & Sons, Inc., 2013.
2. Tom White, “Hadoop: The Definitive Guide”, O’Reilly Publications, 2011.
3. Kyle Banker, “MongoDB in Action”, Manning Publications Company, 2012.
4. Russell Bradberry, Eric Blow, “Practical Cassandra: A Developer's Approach”,
Pearson Education, 2014.
Weblinks:
1. https://onlinecourses.nptel.ac.in/noc20_cs01/preview
COURSE OUTCOMES
CO1 Identify the need for big data K3
CO2 Interpret basic concepts of data analytics K5
CO3 Analyze the framework for storing the data K4
CO4 Examine NoSQL databases K4
CO5 Choose an appropriate framework to solve real world problems K3
Mapping of Course Outcomes to Program Outcomes
     PO1  PO2  PO3  PO4  PO5  PO6  PO7  PO8  PO9  PO10 PO11 PO12 PSO1 PSO2
CO1  3    2    2    2    -    2    2    -    -    2    2    2    3    2
CO2  2    2    2    2    -    2    2    -    -    2    2    2    3    3
CO3  3    2    2    2    -    2    2    -    -    2    2    2    2    2
CO4  2    2    3    3    -    3    3    -    -    3    3    2    3    2
CO5  2    3    2    3    -    3    2    -    -    3    2    3    1    3
ASSESSMENT METHODS:
CAT 1, CAT 2, Model Exam, End Semester Exams, Assignments, Case Studies,
Quiz, MCQ, Projects, Seminars, Demonstration/Presentation, Open book test