New CP - Cse2500 Data Analytics

School of CSE & IS

Dept. of Computer Science and Engineering


COURSE PLAN
Academic Year 2025-26 ODD SEMESTER

School/Department of Students COM / CBC / CEI

Name of the Program(s) of Students B. Tech in Computer Science & Engineering (CSE)

PRC Approval Ref. No. PU/AC-21.5/SoCSE2/COM/2023-2027

Semester/Year V/III

Course Code & Name CSE2500 & Data Analytics

Credit Structure (L-T-P-C) 2-0-0-2

Contact Hours 2 Sessions per week -28 Sessions

Course In-Charge (IC) Dr. Leelambika KV


Course Instructor(s) Dr. Leelambika KV, Mr. Saptasi Sanyal, Ms. Amreen Khanum
Course URL https://presidencyuniversity.linways.com

1. COURSE PRE-REQUISITES:

Nil

2. COURSE DESCRIPTION:

Data Analytics covers inspecting, cleansing, transforming, and modeling data with the goal
of discovering useful information and supporting decision-making. The course begins with
data extraction, pre-processing, and transformation. It then introduces basic statistics, taught in an
intuitive way, for analyzing data. The course will help students apply their knowledge of data
analysis to a wide range of applications.

3. COURSE OBJECTIVES:

The objective of the course is to familiarize the learners with the concepts of Data
Analytics and attain SKILL DEVELOPMENT through PROBLEM SOLVING
Methodologies.
4. COURSE OUTCOMES:
TABLE 1: COURSE OUTCOMES

| CO Number | Statement of CO: On successful completion of the course the students shall be able to | Blooms Cognitive Level |
|---|---|---|
| CO1 | Describe different types of data and variables. | Understand |
| CO2 | Explain data using appropriate statistical methods. | Understand |
| CO3 | Demonstrate the collection, processing and analysis of data for any given application and illustrate various charts using visualization methods. | Apply |
| CO4 | Examine the Data Analysis techniques by R Programming. | Apply |

5. MAPPING OF COURSE OUTCOMES WITH PROGRAM OUTCOMES AND PROGRAM SPECIFIC OUTCOMES:

5.1 PROGRAM OUTCOMES:
(A new set of POs, if any, should be used for the courses offered to the students admitted in the 2025 batch.)
On successful completion of the Program, the students will be able to:
PO1. Engineering knowledge: Apply the knowledge of mathematics, science,
engineering fundamentals, and an engineering specialization to the solution of
complex engineering problems.
PO2. Problem analysis: Identify, formulate, review research literature, and analyze
complex engineering problems reaching substantiated conclusions using first
principles of mathematics, natural sciences, and engineering sciences.
PO3. Design/development of solutions: Design solutions for complex engineering
problems and design system components or processes that meet the specified needs
with appropriate consideration for the public health and safety, and the cultural,
societal, and environmental considerations.
PO4. Conduct investigations of complex problems: Use research-based knowledge and
research methods including design of experiments, analysis and interpretation of
data, and synthesis of the information to provide valid conclusions.
PO5. Modern tool usage: Create, select, and apply appropriate techniques, resources, and
modern engineering and IT tools including prediction and modeling to complex
engineering activities with an understanding of the limitations.
PO6. The engineer and society: Apply reasoning informed by the contextual knowledge
to assess societal, health, safety, legal and cultural issues and the consequent
responsibilities relevant to the professional engineering practice.
PO7. Environment and sustainability: Understand the impact of the professional
engineering solutions in societal and environmental contexts, and demonstrate the
knowledge of, and need for sustainable development.
PO8. Ethics: Apply ethical principles and commit to professional ethics and
responsibilities and norms of the engineering practice.
PO9. Individual and team work: Function effectively as an individual, and as a member
or leader in diverse teams, and in multidisciplinary settings.
PO10. Communication: Communicate effectively on complex engineering activities with
the engineering community and with society at large, such as, being able to
comprehend and write effective reports and design documentation, make effective
presentations, and give and receive clear instructions.
PO11. Project management and finance: Demonstrate knowledge and understanding of
the engineering and management principles and apply these to one’s own work, as
a member and leader in a team, to manage projects and in multidisciplinary
environments.
PO12. Life-long learning: Recognize the need for, and have the preparation and ability to
engage in independent and life-long learning in the broadest context of
technological change.

TABLE 2a: CO-PO Mapping


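| CO. No | PO1 | PO2 | PO3 | PO4 | PO5 | PO6 | PO7 | PO8 | PO9 | PO10 | PO11 | PO12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CO1 | M | L | - | - | - | - | - | - | - | L | - | - |
| CO2 | M | L | - | - | - | - | - | - | - | M | - | - |
| CO3 | H | M | L | - | - | - | - | - | - | M | - | - |
| CO4 | H | M | L | - | - | - | - | - | - | M | - | - |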

5.2 PROGRAM SPECIFIC OUTCOMES:


On successful completion of the Program, the students will be able to:
(New Set of PSOs, if any, needs to be used)
PSO1. Problem Analysis: Identify, formulate, research literature, and analyze complex
engineering problems related to AI & ML principles and practices, Programming
and Computing technologies reaching substantiated conclusions using first
principles of mathematics, natural sciences and engineering sciences.
PSO2. Design/development of Solutions: Design solutions for complex engineering
problems related to AI & ML principles and practices, Programming and
Computing technologies and design system components or processes that meet
the specified needs with appropriate consideration for the public health and
safety, cultural, societal and environmental considerations.
PSO3. Modern Tool usage: Create, select, and apply appropriate techniques, resources,
and modern engineering and IT tools including prediction and modelling to
complex engineering activities related to AI & ML principles and practices,
Programming AI & ML Computing & analytics with an understanding of the
limitations.
TABLE 2b: CO-PSO Mapping

CO Number PSO1 PSO2 PSO3

CO1 L M

CO2 M M M

CO3 M M H

CO4 M M H

6. COURSE CONTENT:

| Module Number | Module Name and Content | Number of Sessions |
|---|---|---|
| 1 | Introduction to Data Analysis. Introducing Data, overview of data analysis: Data in the Real World, Data vs. Information, The Many “Vs” of Data, Structured Data and Unstructured Data, Types of Data, Data Analysis Defined, Types of Variables, Central Tendency of Data, Scales of Data, Sources of Data. Data preparation. R Studio: Base R-R Studio IDE-Introduction to R Projects and R Markdown. Basic R: R as a calculator-Scripts and Comments-R Variables. Data I/O: Working Directories-Importing Data-Exporting Data-More ways to save-Data I/O in Base R. | 7 |
| 2 | Data Analysis and Visualization. Data Summarization: One Quantitative and Categorical Variable - Multivariate Statistical Analysis for Engineering Data. Data Classes: One Dimensional Data Classes-Data Frames and Matrices-Lists - Nested Data Frames for Hierarchical Engineering Data - Sparse Matrices for Large-Scale Simulations. Data Cleaning: Dealing with Missing Data-Strings and Recoding Variables - Imputation for Sensor Data. Manipulating Data in R: Reshaping Data-Merging Datasets. Data Visualizations: Plotting with ggplot2 - Plotting with Base R - Interactive Dashboards for Engineering Data - Geospatial Visualizations for Infrastructure. | 7 |
| 3 | Statistical Analysis. Proportion tests-Chi squared test-Fisher exact test-Correlation-T test-Wilcoxon Rank sum tests-Wilcoxon signed rank test-one-way ANOVA test-Kruskal Wallis test. Advanced Nonparametric Methods - Robust Rank-Based Methods - Nonparametric Regression - Permutation-Based ANOVA. | 7 |
| 4 | Predictive Analysis. Linear least-squares – implementation – the goodness of fit – testing a linear model – weighted resampling. Regression using Stats models – multiple regression – nonlinear relationships – logistic regression – estimating parameters – accuracy. Time series analysis – moving averages – missing values – serial correlation – autocorrelation. Introduction to survival analysis - Generalized Linear Mixed Models (GLMMs) - Ridge, Lasso, and Elastic Net Regression - Monte Carlo Simulation and Sensitivity Analysis. | 7 |

REFERENCE MATERIALS:
Text Books:
T1. Glenn J. Myatt and Wayne P. Johnson, “Making Sense of Data I: A Practical Guide to
Exploratory Data Analysis and Data Mining”, Wiley, 2014.
T2. Christian H. and Michael S., “Introduction to Statistics and Data Analytics”, Springer, 2016.
T3. Robert Parker, John Muschelli and Andrew Jaffe, “Introduction to R”, Johns Hopkins
University, 2020 (E-resource).
T4. Peter Brockwell and Richard A. Davis, “Introduction to Time Series and Forecasting”
(Springer Texts in Statistics), Springer, 2016.

Reference Books:
R1. Glenn J. Myatt and Wayne P. Johnson, “Making Sense of Data I: A Practical Guide to
Exploratory Data Analysis and Data Mining”, Wiley, 2014.
R2. Pierre Lafaye de Micheaux, Remy Drouilhet and Benoit Liquet, “The R Software:
Fundamentals of Programming and Statistical Analysis”, Springer, 2013.

Online Resources
http://www.modernstatisticswithr.com/solutions.html#solutionsch3
https://johnmuschelli.com/intro_to_r/
https://users.phhp.ufl.edu/rlp176/Courses/PHC6089/R_notes/

7. DETAILED SCHEDULE OF INSTRUCTION

TABLE 3: LESSON PLAN


| Session Number | Topic | Sub-Topic | CO Number | Reference |
|---|---|---|---|---|
| 1 | Program Integration & Course Integration | Overview of the Course, Course Integration | CO1 | T1, Chapter 1, Page no. 1-3 |
| Module 1 | | | | |
| 2 | Introduction to Data Analysis | Overview of the course. Introducing Data, overview of data analysis: Data in the Real World, Data vs. Information | CO1 | R3, Chapter 16, Page no. 178-179 |
| 3 | Introduction to Data Analysis | The Many “Vs” of Data, Structured Data and Unstructured Data, Types of Data | CO1 | T1, Chapter 2, Page no. 17-20 |
| 4 | Introduction to Data Analysis | Data Analysis Defined, Types of Variables, Central Tendency of Data, Scales of Data | CO1 | T1, Chapter 2, Page no. 21-24 |
| 5 | Introduction to Data Analysis | Sources of Data, Data preparation | CO1 | T1, Chapter 1, Page no. 2-3; Chapter 3, Page no. 47-49 |
| 6 | Introduction to Data Analysis | R Studio: Base R-R Studio IDE-Introduction to R | CO1 | Labsheet 1 |
| 7 | Introduction to Data Analysis | Basic R: R as a calculator-Scripts and Comments-R Variables | CO1 | T2, Chapter 1, Page no. 12; Appendix A, Page no. 323 |
| 8 | Introduction to Data Analysis | Data I/O: Working Directories-Importing Data-Exporting Data-More ways to save-Data I/O in Base R | CO1 | T2, Appendix A, Page no. 299 |
| Module 2 | | | | |
| 9 | Data Analysis and Visualization | Course integration for Module 2, Data Summarization: One Quantitative and Categorical Variable - Multivariate Statistical Analysis for Engineering Data | CO2 | T2, Chapter 1, Page no. 5-7 |
| 10 | Data Analysis and Visualization | Data Classes: One Dimensional - Data Frames | CO2 | T2, Appendix A, Page no. 300, 302, 311 |
| 11 | Data Analysis and Visualization | Matrices-Lists - Nested Data Frames for Hierarchical Engineering Data - Sparse Matrices for Large-Scale Simulations | CO2 | T2, Appendix A, Page no. 300, 302, 311 |
| 12 | Data Analysis and Visualization | Data Cleaning: Dealing with Missing Data-Strings | CO2 | T1, Chapter 3, Page no. 48-49 |
| 13 | Data Analysis and Visualization | Recoding Variables - Imputation for Sensor Data | CO2 | T1, Chapter 3, Page no. 50-55 |
| 14 | Data Analysis and Visualization | Manipulating Data in R: Reshaping Data-Merging Datasets | CO2 | R1, Chapter 5, Page no. 85-113 |
| 15 | Data Analysis and Visualization | Data Visualizations: Plotting with ggplot2 - Plotting with Base R | CO2 | R1, Chapter 7, Page no. 187 |
| 16 | Data Analysis and Visualization | Interactive Dashboards for Engineering Data - Geospatial Visualizations for Infrastructure | CO2 | R1, Chapter 7, Page no. 187 |
| Midterm Exam Question Paper and Scheme of Evaluation – Discussion | | | | |
| Module 3 | | | | |
| 17 | Statistical Analysis | Course integration for Module 3, Proportion tests-Chi squared test | CO3 | T1, Chapter 4, Page no. 79-81 |
| 18 | Statistical Analysis | Fisher exact test - Correlation - T test | CO3 | R2, Chapter 12, Page no. 288-289 |
| 19 | Statistical Analysis | Wilcoxon Rank sum tests - Wilcoxon signed rank test | CO3 | T1, Chapter 4, Page no. 74-75 |
| 20 | Statistical Analysis | One-way ANOVA test | CO3 | R2, Chapter 12, Page no. 291-298 |
| 21 | Statistical Analysis | Kruskal Wallis test | CO3 | R2, Chapter 12, Page no. 293 |
| 22 | Statistical Analysis | Advanced Nonparametric Methods - Robust Rank-Based Methods | CO3 | T1, Chapter 4, Page no. 76-79 |
| 23 | Statistical Analysis | Nonparametric Regression - Permutation-Based ANOVA | CO3 | R2, Chapter 12, Page no. 298-300 |
| Module 4 | | | | |
| 24 | Predictive Analysis | Course integration for Module 4, Linear least-squares - Implementation | CO4 | T2, Chapter 11, Page no. 252-256 |
| 25 | Predictive Analysis | The goodness of fit | CO4 | T2, Chapter 11, Page no. 256-259 |
| 26 | Predictive Analysis | Testing a linear model - weighted resampling | CO4 | T1, Chapter 6, Page no. 149-153 |
| 27 | Predictive Analysis | Regression using Stats models – multiple regression, nonlinear relationships | CO4 | T1, Chapter 6, Page no. 153-154 |
| 28 | Predictive Analysis | Logistic regression – estimating parameters – accuracy | CO4 | T1, Chapter 6, Page no. 153-154 |
| 29 | Predictive Analysis | Time series analysis – moving averages – missing values – serial correlation – autocorrelation | CO4 | T1, Chapter 6, Page no. 161-167 |
| 30 | Predictive Analysis | Introduction to survival analysis - Generalized Linear Mixed Models (GLMMs) - Ridge, Lasso, and Elastic Net Regression - Monte Carlo Simulation and Sensitivity Analysis | CO4 | R4, Chapter 2, Page no. 21-31; Chapter 13, Page no. 495-502 |
| 31 | Program Integration, Revision and Conclusion of the Course | | | |

The main pedagogical methods in the course are as follows:

• Lecture mode.
• Power Point Presentation.
• Seminar by students.
• Video based learning.
• Problem based learning method.
• Simulation Practical system case study/Model Design.

TABLE 4: SPECIAL DELIVERY METHOD


| S. No | Session Number (as per lesson plan) | Subtopic | Pedagogical Method |
|---|---|---|---|
| 1 | 18 | Demonstrate Fisher's exact test using the functions in R and interpret the results of Fisher's exact test | Flipped Class pedagogy |
| 2 | 26 | Demonstrate various graphs that can be made and altered using the ggplot2 package. | Activity Based Learning – DEMO OF CODE |

8. ASSESSMENT SCHEDULE

TABLE 5: ASSESSMENT SCHEDULE


| Sl. No | Assessment Type | Coverage | CO Number(s) | Duration in Minutes | Marks | Weightage |
|---|---|---|---|---|---|---|
| 1 | Assignment-I (Quiz-I) | Module 1 & 2 | CO1 & CO2 | NA | 25 | 15% |
| 2 | Midterm Exam | Module 1 & 2 | CO1 & CO2 | 120 Minutes | 50 | 25% |
| 3 | Assignment-II (Problem Solving Assignment) | Module 3 & 4 | CO3 & CO4 | NA | 25 | 10% |
| 4 | End Term Examination | Module 1, 2, 3 & 4 | CO1, CO2, CO3 & CO4 | 180 Minutes | 100 | 50% |
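
The weightages in Table 5 add up to 100%. As an illustration only, the sketch below shows one plausible way to combine component marks into an overall percentage, assuming each component is scaled linearly from its maximum marks to its weightage. That scaling rule is an assumption and is not stated in this course plan; the function name `weighted_total` and the example scores are hypothetical.

```python
# Assessment components taken from Table 5: (name, maximum marks, weightage in %).
COMPONENTS = [
    ("Assignment-I (Quiz-I)", 25, 15),
    ("Midterm Exam", 50, 25),
    ("Assignment-II (Problem Solving Assignment)", 25, 10),
    ("End Term Examination", 100, 50),
]


def weighted_total(scores):
    """Return the overall percentage for a list of component scores.

    Assumption (not from the course plan): each score is scaled
    linearly from its maximum marks to its weightage.
    """
    if len(scores) != len(COMPONENTS):
        raise ValueError("expected one score per assessment component")
    total = 0.0
    for (name, max_marks, weight), score in zip(COMPONENTS, scores):
        if not 0 <= score <= max_marks:
            raise ValueError(f"score for {name} must be between 0 and {max_marks}")
        total += score * weight / max_marks
    return total


# Hypothetical scores, in the same order as COMPONENTS.
example_scores = [20, 35, 20, 70]
print(f"Overall percentage: {weighted_total(example_scores):.1f}")
```

Under this assumption, a full score in every component gives 100, and each component contributes at most its weightage to the total.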

9. COURSE CLEARANCE CRITERIA:


This is in accordance with the Academic Regulations of the University and the Program
Regulations and Curriculum of the respective program.
10. SAMPLE QUESTIONS:
TABLE 6: SAMPLE QUESTIONS
| Sl. No | Question | Marks | CO Number | Blooms Cognitive Level |
|---|---|---|---|---|
| 1 | Describe nominal, ordinal, and interval scales of measurement in data analysis. Recognize a categorical variable and a continuous variable, and briefly explain the difference between them. | 10 Marks | CO1 | Understand |
| 2 | Explain how the principles of good design can improve the clarity and impact of visualizations created with ggplot2. | 10 Marks | CO2 | Remember |
| 3 | Determine the scenarios in which you would choose the Fisher exact test over the chi-squared test, and explain in detail. | 10 Marks | CO3 | Apply |
| 4 | Apply the key assumptions of a linear regression model to evaluate a given dataset and determine if the model is appropriate. | 10 Marks | CO4 | Apply |
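
As a study aid for sample question 3, here is a minimal sketch of the usual reasoning for a 2x2 table. The course itself uses R; Python with only the standard library is used here purely for illustration. A widely used rule of thumb (not stated in this course plan) is that the chi-squared approximation becomes unreliable when any expected cell count is below 5, in which case Fisher's exact test is preferred. The function names and the example table are hypothetical.

```python
from math import comb


def expected_counts(table):
    """Expected cell counts of a 2x2 table under independence."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    return [[r * col / n for col in col_totals] for r in row_totals]


def fisher_exact_two_sided(table):
    """Two-sided Fisher exact p-value for a 2x2 table.

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance so equally likely tables are always included.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical small-sample table: every expected count is below 5.
table = [[3, 1], [1, 3]]
small_sample = any(e < 5 for row in expected_counts(table) for e in row)
print("Prefer Fisher exact test:", small_sample)
print("Two-sided p-value:", round(fisher_exact_two_sided(table), 4))
```

In R, the corresponding built-in functions are `fisher.test` and `chisq.test`.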

11. MAPPING WITH SUSTAINABLE DEVELOPMENT GOALS (SDGs):


TABLE 7: SDG MAPPING
| S. No | Topic | SDG Number | Justification |
|---|---|---|---|
| 1 | Module 1: R Studio: Base R-R Studio IDE, Basic R: R as a calculator-Scripts and Comments-R Variables. | SDG 4, SDG 9 | Promotes SDG 4 through foundational skills in data literacy and SDG 9 by introducing data-driven technologies and tools such as R. |
| 2 | Module 2: Data summarization, Data cleaning | SDG 11, SDG 12 | Encourages SDG 11 by using data visualization to support urban decision-making and SDG 12 through efficient data interpretation and management. |
| 3 | Module 3: Hypothesis testing (Chi-square, t-test, ANOVA, etc.), correlation analysis, R-based statistical methods | SDG 3, SDG 10 | Supports SDG 3 through data-driven health research and SDG 10 by enabling equitable data representation and unbiased statistical analysis. |
| 4 | Module 4: Regression, logistic modeling, time series analysis | SDG 1, SDG 8 | Contributes to SDG 1 by enabling financial and risk modeling and to SDG 8 through predictive insights in economic and social datasets. |

12. CRITERIA FOR COURSE OUTCOME ATTAINMENT CALCULATION:


TABLE 8: Threshold and Target Set for Course Outcomes
| Sl. No | C.O. No. | Course Outcomes | Threshold in % | Target in % |
|---|---|---|---|---|
| 1 | CO1 | Describe different types of data and variables. | 60 | 65% |
| 2 | CO2 | Explain data using appropriate statistical methods. | 55 | 60% |
| 3 | CO3 | Demonstrate the collection, processing and analysis of data for any given application and illustrate various charts using visualization methods. | 55 | 60% |
| 4 | CO4 | Apply the Data Analysis techniques by R Programming. | 55 | 60% |

13. SUMMARY:
TABLE 9: SUMMARY OF COURSE SCHEDULE
| Sl. No. | Activity | Start date | End date | Total number of Sessions |
|---|---|---|---|---|
| 1 | Program Integration & Overview of the course | 11-08-2025 | 11-08-2025 | 1 |
| 2 | Module 01: Introduction to Data Analysis | 12-08-2025 | 05-09-2025 | 7 |
| 3 | Module 02: Data Analysis and Visualization | 08-09-2025 | 06-10-2025 | 7 |
| 4 | Assignment-I (Quiz) | | | |
| 5 | Midterm Exam | 07-10-2025 | 11-10-2025 | |
| 6 | Midterm Exam question paper discussion | | | |
| 7 | Module 03: Statistical Analysis | 13-10-2025 | 03-11-2025 | 7 |
| 8 | Assignment-II (Digital Assignment) | | | |
| 9 | Module 04: Predictive Analysis | 04-11-2025 | 27-11-2025 | 7 |
| 10 | Revision | 28-11-2025 | 28-11-2025 | 1 |

CONTACT TIMINGS IN THE CHAMBER FOR DISCUSSION


Students can meet the respective course instructor during the Chamber Consultation Hour to clarify
doubts related to the course.

SPECIFIC GUIDELINES TO STUDENTS, IF ANY:

• Attend all classes regularly.
• Bring a scientific calculator to every class.
• Refer to online study materials and watch the suggested videos available on the NPTEL
website.

Name and Signature of the course In-Charge

APPROVAL:
This course has been duly verified and approved by the Departmental Academic Committee (DAC).

Name and Signature of the Chairperson - DAC
