Data Analytics Lesson Plan
Spring 6th Semester, 2020-21
School of Computer Engineering,
KIIT Deemed to be University
(Institute of Eminence)
Course Details
Program(s): B.Tech [CSE, IT, CSCE, CSSE]
Academic Session, Semester: Spring 2020-21 (December-May), 6th Sem
Subject Name: Data Analytics
Subject Code: IT-3006
Credit: 3
Course Committee Faculties
• Dr. Hrudaya Kumar Tripathy
• Dr. Manjusha Pandey- C
• Prof. Rajat Kumar Behera
• Dr. Siddharth S. Rautaray
• Prof. Jayanti Dansana
• Dr. Mainak Bandyopadhyay
• Dr. Minakhi Rout
• Dr. Sarita Tripathy
• Prof. Manas Ranjan Biswal
Syllabus
Introduction to Big Data: Introduction to Data, Big Data Characteristics, Types of Big Data, Challenges of Traditional Systems, Web Data, Evolution of Analytic Scalability, OLTP, MPP, Grid Computing, Cloud Computing, Fault Tolerance, Analytic Processes and Tools, Analysis Versus Reporting, Statistical Concepts, Types of Analytics.
Data Analysis: Introduction to Data Analysis, Importance of Data Analysis, Data Analytics Applications, Regression Modelling Techniques: Linear Regression, Multiple Linear Regression, Non-Linear Regression, Logistic Regression, Bayesian Modelling, Bayesian Networks, Support Vector Machines, Time Series Analysis, Rule Induction, Sequential Cover Algorithm.
Mining Data Streams: Introduction to Mining Data Streams, Data Stream Management Systems, Data Stream Mining, Examples of Data Stream Applications, Stream Queries, Issues in Data Stream Query Processing, Sampling in Data Streams, Filtering Streams, Counting Distinct Elements in a Stream, Estimating Moments, Querying on Windows − Counting Ones in a Window, Decaying Windows, Real-Time Analytics Platform (RTAP).
Frequent Itemsets and Clustering: Introduction to Frequent Itemsets, Market-Basket Model, Algorithm for Finding Frequent Itemsets, Association Rule Mining, Apriori Algorithm, Introduction to Clustering, Overview of Clustering Techniques, Hierarchical Clustering, Partitioning Methods, K-Means Algorithm, Clustering High-Dimensional Data.
Frameworks and Visualization: Introduction to Framework and Visualization, Introduction to Hadoop, Core Components of Hadoop, Hadoop Ecosystem, Physical Architecture, Hadoop Limitations, Hive, MapReduce and The New Software Stack, MapReduce, Algorithms Using MapReduce, NoSQL, NoSQL Business Drivers, NoSQL Case Studies, NoSQL Data Architectural Patterns, Variations of NoSQL Architectural Patterns, Using NoSQL to Manage Big Data, Visualizations.
Books
• Text Book:
– Data Analytics, Radha Shankarmani, M. Vijayalaxmi, Wiley India Private Limited, ISBN: 9788126560639.
• Reference Books:
– Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing and Presenting Data by
EMC Education Services (Editor), Wiley, 2014
– Bill Franks, Taming the Big Data Tidal Wave: Finding Opportunities in Huge Data Streams with Advanced Analytics, John Wiley & Sons, 2012.
– Glenn J. Myatt, Making Sense of Data, John Wiley & Sons, 2007.
– Pete Warden, Big Data Glossary, O’Reilly, 2011.
– Jiawei Han, Micheline Kamber, “Data Mining Concepts and Techniques”, Second Edition, Elsevier, Reprinted 2008.
– Stephan Kudyba, Thomas H. Davenport, Big Data, Mining, and Analytics: Components of Strategic Decision Making, CRC Press, Taylor & Francis Group, 2014.
– Big Data, Black Book, DT Editorial Services, Dreamtech Press, 2015
Course Outcomes
CO1. Understand and classify the characteristics, concepts and principles of big data. Module: Introduction to Big Data
CO2. Apply the data analytics techniques and models. Module: Data Analysis
CO3. Implement and analyze the data analysis techniques for mining data streams. Module: Mining Data Streams
CO4. Examine the techniques of clustering and frequent item sets. Module: Frequent Itemsets and Clustering
CO5. Analyze and evaluate the framework and visualization for big data analytics. Module: Frameworks and Visualization
CO6. Formulate the concepts, principles and techniques focusing on the applications to industry and real world experience. Module: Applications of all modules
Lesson Plan
Modules (hours), Lecture Days: Topics/Coverage
1. Introduction to Big Data (9 hrs)
1st to 3rd: Introduction to Data, Big Data Characteristics, Types of Big Data, Challenges of Traditional Systems, Web Data, Evolution of Analytic Scalability, OLTP, MPP, Grid Computing, Cloud Computing.
4th to 8th: Fault Tolerance, Analytic Processes and Tools, Analysis Versus Reporting, Statistical Concepts, Types of Analytics.
9th: Module 1 Activities
2. Data Analysis (12 hrs)
10th to 12th: Introduction to Data Analysis, Importance of Data Analysis, Data Analytics Applications.
13th to 15th: Regression Modelling Techniques: Linear Regression, Multiple Linear Regression, Non-Linear Regression, Logistic Regression.
16th to 17th: Bayesian Modelling, Bayesian Networks, Support Vector Machines.
18th to 20th: Time Series Analysis, Rule Induction, Sequential Cover Algorithm.
21st: Module 2 Activities
3. Mining Data Streams (10 hrs)
22nd to 25th: Introduction to Mining Data Streams, Data Stream Management Systems, Data Stream Mining, Examples of Data Stream Applications, Stream Queries, Issues in Data Stream Query Processing.
26th to 28th: Sampling in Data Streams, Filtering Streams, Counting Distinct Elements in a Stream, Estimating Moments.
MID SEMESTER
29th to 30th: Querying on Windows − Counting Ones in a Window, Decaying Windows, Real-Time Analytics Platform (RTAP).
31st: Module 3 Activities
4. Frequent Itemsets and Clustering (10 hrs)
32nd to 34th: Introduction to Frequent Itemsets, Market-Basket Model, Algorithm for Finding Frequent Itemsets, Association Rule Mining.
35th to 38th: Apriori Algorithm, Introduction to Clustering, Overview of Clustering Techniques, Hierarchical Clustering, Partitioning Methods.
39th to 40th: K-Means Algorithm, Clustering High-Dimensional Data.
41st: Module 4 Activities
5. Frameworks and Visualization (8 hrs)
42nd to 45th: Introduction to Framework and Visualization, Introduction to Hadoop, Core Components of Hadoop, Hadoop Ecosystem, Physical Architecture, Hadoop Limitations, Hive, MapReduce and The New Software Stack, MapReduce, Algorithms Using MapReduce.
46th to 48th: NoSQL, NoSQL Business Drivers, NoSQL Case Studies, NoSQL Data Architectural Patterns, Variations of NoSQL Architectural Patterns, Using NoSQL to Manage Big Data, Visualizations.
49th: Module 5 Activities
Activities
(For this Semester)
• Activities have been identified in every module.
• Different focus areas have been identified as applicable across all modules.
Group formation
• At the beginning of the session, students may be divided into groups (8 to 10 members).
Real world project prototypes
• Right after the group formation, students will be asked to participate in prototype projects in different real-world domains.
• After every module, they will prepare the relevant and necessary diagrams/documents/code/algorithm/test plan as required as part of the activity deliverables.
Focus areas have been identified as:
– Real world problem identification & solution
approach.
– Analyzing the probable solution.
– Critical thinking
– Creation of design
– Interactivity Focus
– Reflection
Evaluation Scheme
EXAM: MARKS
End Semester: 50
Internal, Mid Semester: 20
Internal, Activities: 30
Total: 100