Sanjivani Rural Education Society’s
Sanjivani College of Engineering,
Kopargaon
(An Autonomous Institute Affiliated to Savitribai Phule Pune University, Pune)
B. Tech. Honors (Data Science)
2023 Pattern
Curriculum
(B. Tech. Honors Sem- V, VI, VII & VIII with effect from Academic
Year 2023-2024)
At. Sahajanandnagar, Post. Shingnapur Tal. Kopargaon Dist. Ahmednagar,
Maharashtra State, India PIN 423603
Sanjivani College of Engineering, Kopargaon Page 1 of 25
Sanjivani College of Engineering, Kopargaon
(An Autonomous Institute affiliated to SPPU, Pune)
DECLARATION
We, the Board of Studies (Computer Engineering), hereby declare that
we have designed the Curriculum Structure and Syllabus of the
B. Tech. Honors (DS) Program for Semesters V, VI, VII & VIII of
Pattern 2023, w.e.f. A.Y. 2023-24, as per the guidelines. We are
pleased to submit and publish this FINAL copy of the curriculum for
the information of all concerned stakeholders.
Submitted by
(Dr.D.B.Kshirsagar)
BoS Chairman
Approved by
Dean Academics Director
Vision
To develop world class engineering professionals with good moral characters and make them
capable to exhibit leadership through their engineering ability, creative potential and effective
soft skills which will improve the quality of life in society.
Mission
To impart quality technical education to the students through innovative and interactive teaching
and learning process to acquire sound technical knowledge, professional competence and to have
aptitude for research and development.
Develop students as excellent communicators and highly effective team members and leaders
with full appreciation of the importance of professional, ethical and social responsibilities.
Program Educational Objectives (PEOs)
1. To prepare committed and motivated graduates by developing technical competency, research
attitude and life-long learning with the support of a strong academic environment.
2. Train graduates with strong fundamentals and domain knowledge, updated with modern technology, to
analyse, design & create novel products that provide effective solutions for social benefit.
3. Exhibit employability skills, leadership and right attitude to succeed in their professional career.
Program Outcomes (POs)
Engineering Graduates will be able to:
1. Engineering knowledge: Apply the knowledge of mathematics, science, engineering fundamentals,
and an engineering specialization to the solution of complex engineering problems.
2. Problem analysis: Identify, formulate, review research literature, and analyze complex engineering
problems reaching substantiated conclusions using first principles of mathematics, natural sciences, and
engineering sciences.
3. Design/development of solutions: Design solutions for complex engineering problems and design
system components or processes that meet the specified needs with appropriate consideration for the
public health and safety, and the cultural, societal, and environmental considerations.
4. Conduct investigations of complex problems: Use research-based knowledge and research methods
including design of experiments, analysis and interpretation of data, and synthesis of the information to
provide valid conclusions.
5. Modern tool usage: Create, select, and apply appropriate techniques, resources, and modern
engineering and IT tools including prediction and modeling to complex engineering activities with an
understanding of the limitations.
6. The engineer and society: Apply reasoning informed by the contextual knowledge to assess societal,
health, safety, legal and cultural issues and the consequent responsibilities relevant to the professional
engineering practice.
7. Environment and sustainability: Understand the impact of the professional engineering solutions in
societal and environmental contexts, and demonstrate the knowledge of, and need for sustainable
development.
8. Ethics: Apply ethical principles and commit to professional ethics and responsibilities and norms of
the engineering practice.
9. Individual and team work: Function effectively as an individual, and as a member or leader in
diverse teams, and in multidisciplinary settings.
10. Communication: Communicate effectively on complex engineering activities with the engineering
community and with society at large, such as, being able to comprehend and write effective reports and
design documentation, make effective presentations, and give and receive clear instructions.
11. Project management and finance: Demonstrate knowledge and understanding of the engineering
and management principles and apply these to one’s own work, as a member and leader in a team, to
manage projects and in multidisciplinary environments.
12. Life-long learning: Recognize the need for, and have the preparation and ability to engage in
independent and life-long learning in the broadest context of technological change.
Program Specific Outcomes (PSOs)
1. Professional Skills: The ability to apply knowledge of problem solving, algorithmic analysis, Software
Engineering, Data Structures, Networking and Databases, along with recent trends, to provide effective
solutions for Computer Engineering problems.
2. Problem-Solving Skills: The ability to inculcate best practices of software and hardware design for
delivering quality products useful for the society.
3. Successful Career: The ability to employ modern computer languages, environments, and platforms in
creating innovative career paths.
SRES’s Sanjivani College of Engineering, Kopargaon
(An Autonomous Institute Affiliated to SPPU Pune)
COURSE STRUCTURE- 2023 PATTERN
B. TECH HONORS: (AIML AND DS)
LIST OF ABBREVIATIONS
Abbreviation  Full Form                      Abbreviation  Full Form
PCC           Professional Core Courses      CIA           Continuous Internal Assessment
PEC           Professional Elective Courses  OR            End Semester Oral Examination
OEC           Open Elective Courses          PR            End Semester Practical Examination
ISE           In-Semester Evaluation         TW            Continuous Term Work Evaluation
ESE           End-Semester Evaluation        MLC           Mandatory Learning Course
PROJ          Project                        L             Lecture
LC            Laboratory Course              P             Practical
T             Tutorial                       NC            Non-Credit
Cat           Category
COURSE STRUCTURE- 2023 PATTERN
Data Science
                                                                              Hrs./Week            Evaluation Scheme
                                                                                                   Theory    Practical
Year/Sem      Cat  Code    Course Title                                       L   T   P   Credits  CIA  ESE  TW  OR  PR  Grand Total
TY Sem-I      PCC  CO1801  Mathematical Foundation for Data Science           4   -   -   4        40   60   -   -   -   100
TY Sem-II     PCC  CO1802  Advanced Python Programming                        4   -   -   4        40   60   -   -   -   100
Final Sem-I   PCC  CO1901  PostgreSQL and Data Pipeline                       4   -   -   4        40   60   -   -   -   100
Final Sem-II  PCC  CO1902  Practical Machine Learning for Data Science        4   -   -   4        40   60   -   -   -   100
Final Sem-II  PCC  CO1903  Mini Project: Data Science Project implementation  -   -   4   2        -    -    50  -   -   50
Total                                                                         16  -   04  18       160  240  50  -   -   450
HoD Dean Academics Director
Dr.D.B.Kshirsagar Dr.A.B.Pawar Dr.A.G.Thakur
Data Science
CO1801: Mathematical Foundations for Data Science
Teaching Scheme                 Evaluation Scheme
Lectures: 4 Hrs. / Week         Continuous Internal Assessment: 40 Marks
Credits: 4                      End-Sem Exam: 60 Marks
                                Total: 100 Marks
===================================================================
Prerequisite Course: Data Structures, Vector Calculus and Differential Equation
===================================================================
Course Objectives:
1. To study multidimensional, homogeneous arrays of fixed-size elements using
NumPy.
2. To study different types of data visualization using Matplotlib.
3. To study Probability and Statistics for Data Science.
4. To study Linear Algebra for Data Science.
5. To study various techniques used for optimization problems.
6. To study the representation of a dataset in mathematical and matrix form.
Course Outcomes (COs): On completion of the course, students will be able to-
Course Outcomes                                                               Bloom's Taxonomy
                                                                              Level  Descriptor
CO1  Apply homogeneous arrays of fixed-size elements using NumPy              3      Apply
CO2  Apply various data visualization methods using Matplotlib                3      Apply
CO3  Apply Probability and Statistics for Data Science                        3      Apply
CO4  Apply the concepts of Linear Algebra for Data Science                    3      Apply
CO5  Apply various optimization techniques and learn where to apply them      3      Apply
     in ML
CO6  Apply dimensionality reduction and learn where to apply it               3      Apply
Mapping of Course Outcomes to Program Outcomes (POs) & Program Specific
Outcomes(PSOs):
PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12 PSO1 PSO2 PSO3
-
CO1 2 3 - - - - - - - 2- 3 3 1 -
-
CO2 2 3 - - - - - - - 2 2 3 2 -
CO3 2 2 3 2 - - - - - - 2 3 3 1 -
CO4 2 2 3 2 - - - - - - 2 2 3 2 -
CO5 2 2 3 2 - - - - - - 2 2 3 2 -
CO6 2 2 3 2 - - - - - - 2 2 3 2 -
COURSE CONTENTS
Unit I: NumPy (No. of Hours: 6, COs: CO1)
Introduction to NumPy Arrays, NumPy N-dimensional Array, Functions to Create Arrays,
Combining Arrays. Index, Slice and Reshape NumPy Arrays: List to Arrays,
Array Indexing, Array Slicing, Array Reshaping.
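The array operations listed in Unit I can be sketched in a few lines. This is only an illustrative example with made-up values, not prescribed lab material; it assumes NumPy is installed.

```python
import numpy as np

# List to array, and a helper function that creates an array.
a = np.array([1, 2, 3, 4, 5, 6])
z = np.zeros((2, 3))

# Reshape the 1-D array into 2 rows x 3 columns.
m = a.reshape(2, 3)

# Indexing and slicing.
first_row = m[0]        # first row of m
last_col = m[:, -1]     # last column of m
element = m[1, 0]       # row 1, column 0

# Combining arrays vertically and horizontally.
v = np.vstack((m, z))   # stacks rows
h = np.hstack((m, z))   # stacks columns

print(m.shape, v.shape, h.shape)
```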
Unit II: Matplotlib & SciPy (No. of Hours: 7, COs: CO2)
Introduction to Matplotlib, Matplotlib Subplots, Important Types of Plots,
Three-dimensional Plotting, Introduction to SciPy, SciPy Sub-packages.
Unit III: Probability and Statistics (No. of Hours: 7, COs: CO3)
Introduction to Probability and Statistics, Conditional Probability and Bayes' Theorem,
Population and Sample, Population Parameters and Sample Statistics,
Gaussian/Normal Distribution, Physical Significance of Mean, Median and Mode,
Central Limit Theorem, Standard Deviation, Tensors. All the above concepts are to be
implemented and discussed in class using NumPy, Matplotlib and SciPy.
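As one possible classroom illustration of conditional probability and Bayes' theorem from Unit III, the sketch below computes a posterior probability. All numbers (prior, sensitivity, false-positive rate) are hypothetical and chosen only for the example.

```python
# Hypothetical figures for a diagnostic test, used only for illustration.
p_disease = 0.01            # prior P(D)
p_pos_given_disease = 0.95  # P(+ | D)
p_pos_given_healthy = 0.05  # P(+ | not D)

# Law of total probability: P(+) over both cases.
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Bayes' theorem: P(D | +) = P(+ | D) * P(D) / P(+)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos

print(round(p_disease_given_pos, 3))
```

Even with a fairly accurate test, the posterior stays well below the sensitivity because the prior is small, which is the usual point of this exercise.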
Unit IV: Linear Algebra (No. of Hours: 8, COs: CO4)
Introduction to Vectors (2-D, 3-D, n-D), Row Vector and Column Vector, Dot Product and
Angle between 2 Vectors, Projection and Unit Vector, Equation of a Line (2-D),
Plane (3-D) and Hyperplane (n-D), Plane Passing through Origin, Normal to a Plane,
Distance of a Point from a Plane/Hyperplane, Half-Spaces. Matrices: Diagonal
matrices, scalar matrices, identity matrices, multiplication and transpose of matrices.
Derivatives, Maxima and Minima, Understanding Distance Metrics Used in Machine Learning.
All the above concepts are to be implemented and discussed in class using NumPy,
Matplotlib and SciPy.
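Several of the Unit IV vector topics (dot product, angle, projection, distance from a hyperplane) can be tried out as below. This is a minimal sketch with arbitrary example vectors, assuming NumPy is available.

```python
import numpy as np

a = np.array([1.0, 0.0])
b = np.array([1.0, 1.0])

# Dot product and angle between two vectors.
dot = a @ b
cos_theta = dot / (np.linalg.norm(a) * np.linalg.norm(b))
angle_deg = np.degrees(np.arccos(cos_theta))   # about 45 degrees

# Unit vector along b and projection of a onto b.
unit_b = b / np.linalg.norm(b)
proj_a_on_b = (a @ unit_b) * unit_b

# Hyperplane w.x + w0 = 0 and distance of point p from it:
# distance = |w.p + w0| / ||w||
w = np.array([3.0, 4.0])
w0 = -5.0
p = np.array([3.0, 4.0])
distance = abs(w @ p + w0) / np.linalg.norm(w)

print(angle_deg, proj_a_on_b, distance)
```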
Unit V: Optimization Techniques (No. of Hours: 8, COs: CO5)
Importance of Optimization, Optimization used in DS/ML/DL, Convex Functions,
different from other mathematical problems, Types of Optimization Problems,
Various techniques used for Optimization problems.
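Gradient descent is one commonly taught technique for the optimization problems named in Unit V; the syllabus does not name a specific method, so this choice is an assumption. The sketch minimizes the convex function f(x) = (x - 3)^2 with a hypothetical learning rate and step count.

```python
def f(x):
    """A simple convex function with its minimum at x = 3."""
    return (x - 3.0) ** 2

def grad_f(x):
    """Derivative of f with respect to x."""
    return 2.0 * (x - 3.0)

x = 0.0                # starting point (arbitrary)
learning_rate = 0.1    # step size (hypothetical choice)

for _ in range(100):
    # Move against the gradient to decrease f.
    x -= learning_rate * grad_f(x)

print(x, f(x))
```

Because f is convex, any local minimum found this way is also the global minimum, which is why convexity matters in ML optimization.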
Unit VI: Dimensionality Reduction and Visualization (No. of Hours: 6, COs: CO6)
Introduction to Dimensionality Reduction, Row Vector and Column Vector, Representation of
a Dataset in Mathematical Form and Matrix, Data Preprocessing: Feature
Normalization, Mean of a Data Matrix, Column Standardization, Covariance of a
Data Matrix, PCA and SVD.
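The Unit VI chain from data matrix to PCA can be sketched with NumPy alone. The small data matrix below is invented for illustration, and computing PCA through the SVD of the centered matrix is one standard route, not necessarily the one used in class.

```python
import numpy as np

# Hypothetical data matrix: 4 samples (rows) x 2 features (columns).
X = np.array([[1.0, 2.0],
              [2.0, 4.1],
              [3.0, 5.9],
              [4.0, 8.0]])
n = X.shape[0]

# Mean of the data matrix (per column) and mean-centering.
mean = X.mean(axis=0)
Xc = X - mean

# Covariance of the data matrix (features x features).
cov = Xc.T @ Xc / (n - 1)

# SVD of the centered matrix; rows of Vt are the principal directions.
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
explained_variance = S ** 2 / (n - 1)

# Project onto the first principal component (2-D -> 1-D).
Z = Xc @ Vt[:1].T

print(explained_variance, Z.shape)
```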
Books:
Text Books(T):
T1: Basics of Linear Algebra for Machine Learning: Discover the Mathematical Language of Data in
Python, Jason Brownlee
T2: Milton. J. S. and Arnold. J.C., "Introduction to Probability and Statistics", Tata McGraw Hill,
4th Edition, 2007.
T3: Johnson. R.A. and Gupta. C.B., "Miller and Freund's Probability and Statistics for Engineers",
Pearson Education, Asia, 7th Edition, 2007.
Reference Books(R):
R1: Spiegel. M.R., Schiller. J. and Srinivasan. R.A., "Schaum's Outline of Theory and Problems of
Probability and Statistics", Tata McGraw Hill Edition, 2004.
R2: Devore. J.L., "Probability and Statistics for Engineering and the Sciences", Cengage
Learning, New Delhi, 8th Edition, 2012.
R3: Probability, Random Variables, Statistics, and Random Processes: Fundamentals &
Applications, Ali Grami, ISBN: 978-1-119-30081-6
Online Resource:
https://www.analyticsvidhya.com/blog/2020/02/4-types-of-distance-metrics-in-machine-learning/
https://www.analyticsvidhya.com/blog/2022/10/optimization-essentials-for-machine-learning/
NPTEL Courses:
Data Science for Engineers https://onlinecourses.nptel.ac.in/noc20_cs28/preview
Essential Mathematics for Machine Learning https://nptel.ac.in/courses/111107137
CO1802: Advanced Python Programming
Teaching Scheme                 Evaluation Scheme
Lectures: 4 Hrs. / Week         Continuous Internal Assessment: 40 Marks
Credits: 4                      End-Sem Exam: 60 Marks
                                Total: 100 Marks
===================================================================
Prerequisite Course: Python, Mathematical Foundations for Data Science
===================================================================
Course Objectives:
1. To understand the use of Python development environments.
2. To get well versed with built-in data structures, functions and files.
3. To implement different Python itertools, modules and packages.
4. To learn and use different modules, packages and error handling.
5. To learn and use various Python libraries like pandas.
6. To learn and use various Python libraries like NLTK for text preprocessing.
Course Outcomes:
On completion of the course, students will be able to-
Course Outcomes                                                               BTL  Bloom's Taxonomy Descriptor
CO1  Understand and configure Python IDEs and environments.                   3    Apply
CO2  Understand and implement built-in data structures, functions and files.  3    Apply
CO3  Implement itertools and modules.                                         3    Apply
CO4  Understand and apply modules and packages.                               3    Apply
CO5  Understand the Python pandas package and implement data frame            3    Apply
     manipulation.
CO6  Understand and implement Python text processing libraries.               3    Apply
Mapping of Course Outcomes to Program Outcomes (POs) & Program Specific Outcomes (PSOs):
PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12 PSO1 PSO2 PSO3
CO1 1 1 - - 2 - - - - - - 2 - - -
CO2 2 2 2 - 2 - - - - - - 2 - 2 2
CO3 2 2 3 2 2 - - - - - - 2 1 2 2
CO4 2 2 3 2 2 - - - - - - 2 1 2 2
CO5 2 2 3 3 2 - - - - - - 2 1 2 2
CO6 2 2 3 2 3 - - - - - - 2 1 2 2
COURSE CONTENTS
Unit I: Python IDEs (No. of Hrs: 06, COs: CO1)
Visual Studio Code: VS Code settings and themes.
PyCharm: Exploring all features of PyCharm, code debugging.
Jupyter Notebook: Combining code, text and images for a better user experience.
Google Colab: How to import data in Colab, how to compile and develop code in Colab.
Spyder: All features, debugging.
Creation of virtual environments using Anaconda, installation of various packages,
upgrading packages, how to access help from documentation.
Unit II: Built-in Data Structures, Functions and Files (No. of Hrs: 06, COs: CO2)
Tuples: Tuple operations. List: Adding and removing elements, concatenating and
combining lists, sorting, slicing. Built-in sequence functions: enumerate, sorted,
zip, reversed.
Dictionaries: Dictionary operations, list comprehension, if statement in list
comprehension, set and dictionary comprehension, generator expressions and functions.
Functions: Optional arguments or parameters, keyword arguments, arbitrary positional
arguments, return keyword, lambda function.
Files: Opening a file, reading a file, writing files.
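A compact illustration of the Unit II built-ins follows. The names and marks are hypothetical sample data, and the helper function `average` is invented for this sketch.

```python
names = ["asha", "ravi", "meera"]
marks = [72, 58, 91]

# enumerate, sorted and zip.
ranked = [(i, n) for i, n in enumerate(sorted(names), start=1)]
pairs = dict(zip(names, marks))

# List comprehension with an if condition.
passed = [n for n, m in zip(names, marks) if m >= 60]

# Set and dictionary comprehensions.
initials = {n[0] for n in names}
grades = {n: ("A" if m >= 90 else "B") for n, m in pairs.items()}

# Generator expression consumed by sum().
total = sum(m for m in marks)

# Arbitrary positional arguments plus an optional keyword argument.
def average(*values, ndigits=1):
    return round(sum(values) / len(values), ndigits)

# Lambda used as a key function.
top = max(pairs, key=lambda n: pairs[n])

print(ranked, passed, total, average(*marks), top)
```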
Unit III: Itertools Module (No. of Hrs: 06, COs: CO3)
count, cycle, repeat, combinations, permutations, product,
combinations_with_replacement, chain, islice, compress, filter,
groupby, decorators.
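Most of the itertools functions listed in Unit III can be demonstrated on tiny inputs. The sketch below uses only the standard library; the sample strings and lists are arbitrary.

```python
from itertools import (chain, combinations, combinations_with_replacement,
                       compress, count, cycle, groupby, islice,
                       permutations, product, repeat)

# Infinite iterators must be cut off, here with islice.
first_three = list(islice(count(10), 3))
round_robin = list(islice(cycle("AB"), 5))
threes = list(repeat(3, 2))

# Combinatoric iterators.
pairs = list(combinations("ABC", 2))
perm_count = len(list(permutations("ABC", 2)))
cwr = list(combinations_with_replacement("AB", 2))
grid = list(product([0, 1], repeat=2))

# Chaining and selecting.
joined = list(chain([1, 2], [3]))
picked = list(compress("ABCD", [1, 0, 1, 0]))

# groupby groups consecutive items, so the input is sorted first.
words = sorted(["apple", "avocado", "banana"])
groups = {k: list(g) for k, g in groupby(words, key=lambda w: w[0])}

print(first_three, pairs, groups)
```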
Unit IV: Modules, Packages, Error Handling (No. of Hrs: 06, COs: CO4)
Reusing code with modules, organizing modules into packages,
handling errors with try and except blocks.
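The Unit IV topics of reusing a module and handling errors with try/except can be sketched as below. The function name `safe_sqrt` is hypothetical and exists only for this example.

```python
import math  # reusing code from a standard-library module


def safe_sqrt(value):
    """Return the square root of value, or None if it cannot be computed."""
    try:
        return math.sqrt(float(value))
    except (TypeError, ValueError):
        # TypeError: value cannot be converted at all (for example None).
        # ValueError: non-numeric text, or a negative number passed to sqrt.
        return None


print(safe_sqrt("9"), safe_sqrt(-1), safe_sqrt("abc"), safe_sqrt(None))
```

Catching only the expected exception types, rather than a bare `except`, keeps unrelated bugs visible.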
Unit V: Python with Pandas (No. of Hrs: 06, COs: CO5)
Pandas Series introduction, create a Pandas DataFrame, select rows, select columns,
add new rows, add new column, rename columns, drop rows by label/index, drop
columns by label or index, drop rows based on column values, cast column types,
get row count, apply, group by, shuffle DataFrame rows, join DataFrames, merge
DataFrames, concat DataFrames, fill NaN with values, loc, iloc, filter, where,
reset index.
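A handful of the Unit V DataFrame operations are shown together below. The sketch assumes pandas is installed; the table contents and column names are hypothetical.

```python
import pandas as pd

# Hypothetical sample data with one missing value.
df = pd.DataFrame({
    "name": ["asha", "ravi", "meera"],
    "dept": ["CS", "IT", "CS"],
    "marks": [72.0, None, 91.0],
})

# Rename a column, fill NaN with a value, and cast the column type.
df = df.rename(columns={"marks": "score"})
df["score"] = df["score"].fillna(0).astype(int)

# Add a new column derived from an existing one.
df["passed"] = df["score"] >= 60

# Label-based and position-based selection.
cs_rows = df.loc[df["dept"] == "CS", ["name", "score"]]
first_row = df.iloc[0]

# Group by and aggregate.
dept_mean = df.groupby("dept")["score"].mean()

# Merge with another frame and concatenate frames.
heads = pd.DataFrame({"dept": ["CS", "IT"], "head": ["H1", "H2"]})
merged = df.merge(heads, on="dept", how="left")
stacked = pd.concat([df, df], ignore_index=True)

print(dept_mean)
```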
Unit VI: Python Text Processing (No. of Hrs: 06, COs: CO6)
String operations and string methods, counting tokens in paragraphs, filtering
duplicate words, tokenization, removing stopwords, word replacement, search and
match, text munging, text wrapping, RegEx library, various patterns used in RegEx.
NLTK and various functions used in NLTK.
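The text-processing steps of Unit VI can be tried without any downloads by using the standard `re` module in place of NLTK; that substitution, the sample sentence and the tiny stop-word set are all assumptions made for this sketch.

```python
import re
from collections import Counter

text = "Data science is fun. Data science is also hard work!"
STOPWORDS = {"is", "also"}   # tiny hypothetical stop-word list

# Tokenization with a regular expression (lower-cased words only).
tokens = re.findall(r"[a-z]+", text.lower())

# Remove stopwords and count the remaining tokens.
filtered = [t for t in tokens if t not in STOPWORDS]
counts = Counter(filtered)

# Filter duplicate words while keeping first-seen order.
unique_in_order = list(dict.fromkeys(filtered))

# Word replacement and search with RegEx.
replaced = re.sub(r"\bfun\b", "enjoyable", text)
match = re.search(r"hard \w+", text)

print(len(tokens), counts.most_common(2), unique_in_order)
```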
Books
Text Books(T):
T1. Prateek Joshi, Artificial Intelligence with Python: A Comprehensive Guide to Building
Intelligent Apps for Python Beginners and Developers, Packt Publishing, 2nd edition,
ISBN: 178646439X
T2. Andreas C. Müller, Sarah Guido, Introduction to Machine Learning with Python:
A Guide for Data Scientists, O'Reilly, 1st edition, ISBN: 978-1449369415
T3. Wes McKinney, Python for Data Analysis, O'Reilly, Second Edition.
Reference Books(R):
R1. Giuseppe Bonaccorso, "Machine Learning Algorithms", Packt Publishing Limited,
ISBN-10: 1785889621, ISBN-13: 978-1785889622.
R2. Steven Bird, Ewan Klein, and Edward Loper, Natural Language Processing with Python:
Analyzing Text with the Natural Language Toolkit, O'Reilly Media, Inc., ISBN:
9780596516499
E- Resources:
https://www.tutorialspoint.com/python_text_processing/index.html
https://www.tutorialspoint.com/python_pandas/python_pandas_series.html
https://sparkbyexamples.com/python-pandas-tutorial-for-beginners/
CO1901: PostgreSQL and Data Pipeline
Teaching Scheme                 Examination Scheme
Lectures: 4 Hrs. / Week         Continuous Assessment: 40 Marks
Credits: 4                      End-Sem Exam: 60 Marks
                                Total: 100 Marks
==================================================================
Prerequisite Course: Python, Mathematical Foundations for Data Science
==================================================================
Course Objectives:
1. To learn the architecture of PostgreSQL
2. To understand various data types
3. To create database tables and perform various table operations
4. To understand various database operations
5. To learn how to establish a connection using Python
Course Outcomes:
On completion of the course, students will be able to-
Course Outcomes                                                        BTL  Bloom's Taxonomy Descriptor
CO1  Understand various data types in a database system                2    Understand
CO2  Apply database and various database table operations              3    Apply
CO3  Apply data exploration methods to extract data from tables        3    Apply
CO4  Apply importing and exporting techniques on data                  3    Apply
CO5  Apply join operations in a relational database                    3    Apply
CO6  Apply a data pipeline using Psycopg2 and SQLAlchemy               3    Apply
Mapping of Course Outcomes to Program Outcomes (POs) & Program Specific Outcomes
(PSOs):
PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12 PSO1 PSO2 PSO3
CO1 3 2 - - - - - - - - - - 3 - -
CO2 2 2 3 2 2 - - - - - - 1 3 - -
CO3 2 2 3 2 3 - - - - - - - 3 - -
CO4 3 1 2 1 3 - - - - - - - 3 - -
CO5 3 - 1 - - - - - - - - - 3 - -
CO6 1 2 3 1 3 - - - - - - - 3 - -
COURSE CONTENTS
Unit I: Data Types (No. of Hrs: 06, COs: CO1)
Characters, Numbers, Choosing Data Type, Date and Time,
Using the interval Data Type in Calculations, Transforming
Values from One Type to Another with CAST.
Unit II: Database and Various Database Table Operations (No. of Hrs: 06, COs: CO2)
Creating a Table, Primary Keys and Foreign Keys, Constraints,
Insert, Update, Delete a Table, Alter Table, Drop Table, Check
Constraints.
Unit III: Data Exploration (No. of Hrs: 06, COs: CO3)
SELECT statement, SELECT DISTINCT, COUNT, Filtering Rows with
WHERE, Sorting Data with ORDER BY, LIMIT, BETWEEN, IN,
Using LIKE and ILIKE with WHERE, Combining Operators
with AND and OR, Aggregation statements, GROUP BY,
HAVING.
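The Unit III query clauses can be exercised without a database server. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in for PostgreSQL, which is an assumption made so the example runs anywhere; the `orders` table and its rows are hypothetical. ILIKE is PostgreSQL-specific and is therefore not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, city TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (city, amount) VALUES (?, ?)",
    [("Pune", 120.0), ("Pune", 80.0), ("Mumbai", 50.0), ("Nashik", 200.0)],
)

# WHERE with BETWEEN, ORDER BY and LIMIT.
big = conn.execute(
    "SELECT city, amount FROM orders "
    "WHERE amount BETWEEN 100 AND 250 ORDER BY amount DESC LIMIT 2"
).fetchall()

# SELECT DISTINCT with LIKE, IN and OR.
cities = conn.execute(
    "SELECT DISTINCT city FROM orders "
    "WHERE city LIKE 'P%' OR city IN ('Mumbai') ORDER BY city"
).fetchall()

# Aggregation with GROUP BY and HAVING.
totals = conn.execute(
    "SELECT city, COUNT(*), SUM(amount) FROM orders "
    "GROUP BY city HAVING SUM(amount) > 100 ORDER BY city"
).fetchall()

conn.close()
print(big, cities, totals)
```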
Unit IV: Importing and Exporting Techniques on Data (No. of Hrs: 06, COs: CO4)
Working with Delimited Text Files, Header Rows, Using
COPY to Import Data, Importing a Subset of Columns with
COPY, Adding a Default Value to a Column During Import,
Using COPY to Export Data, Exporting Particular Columns,
Exporting Query Results, Importing and Exporting Through
pgAdmin.
Unit V: Join Operations in Relational Database (No. of Hrs: 06, COs: CO5)
Introduction to Join, Join Types, Inner Join, Full Outer Join, Left
Outer Join, Right Join, Using NULL to Find Rows with Missing
Values, Three Types of Table Relationships, One-to-One
Relationship, One-to-Many Relationship, Many-to-Many
Relationship, Selecting Specific Columns in a Join, Joining
Multiple Tables.
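Two of the Unit V ideas, an inner join over a one-to-many relationship and using NULL after a left join to find rows with no match, are sketched below. As an assumption for portability, `sqlite3` stands in for PostgreSQL; the tables and rows are hypothetical. RIGHT and FULL OUTER joins are left out because older SQLite versions do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE departments (
    dept_id   INTEGER PRIMARY KEY,
    dept_name TEXT
);
CREATE TABLE employees (
    emp_id   INTEGER PRIMARY KEY,
    emp_name TEXT,
    dept_id  INTEGER REFERENCES departments (dept_id)
);
INSERT INTO departments VALUES (1, 'Data'), (2, 'Web'), (3, 'QA');
INSERT INTO employees VALUES (10, 'Asha', 1), (11, 'Ravi', 1), (12, 'Meera', 2);
""")

# Inner join: only employees with a matching department,
# selecting specific columns from each table.
inner = conn.execute(
    "SELECT e.emp_name, d.dept_name "
    "FROM employees e JOIN departments d ON e.dept_id = d.dept_id "
    "ORDER BY e.emp_id"
).fetchall()

# Left join plus IS NULL: departments that have no employees.
empty_depts = conn.execute(
    "SELECT d.dept_name "
    "FROM departments d LEFT JOIN employees e ON e.dept_id = d.dept_id "
    "WHERE e.emp_id IS NULL"
).fetchall()

conn.close()
print(inner, empty_depts)
```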
Unit VI: Data Pipeline (No. of Hrs: 06, COs: CO6)
Overview of Python and PostgreSQL, Psycopg2 example,
importing and exporting CSV using Psycopg2, Working with
Data via SQLAlchemy Core, Inserting Data, Querying Data,
Limiting, Built-In SQL Functions and Labels.
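A CSV-to-database-to-CSV pipeline of the kind described in Unit VI is sketched below. To keep it runnable without a PostgreSQL server, it uses the built-in `sqlite3` driver as a stand-in; this is an assumption, not the course's setup. Psycopg2 follows the same Python DB-API pattern (connect, execute, fetch), but it connects with `psycopg2.connect(...)` and uses `%s` placeholders instead of `?`. The CSV contents and table name are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical CSV input, held in memory instead of a file on disk.
csv_text = "name,score\nasha,72\nravi,58\n"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (name TEXT, score INTEGER)")

# Import step: parse the CSV and insert the rows in one transaction.
reader = csv.DictReader(io.StringIO(csv_text))
rows = [(r["name"], int(r["score"])) for r in reader]
with conn:  # commits on success, rolls back on error
    conn.executemany("INSERT INTO results (name, score) VALUES (?, ?)", rows)

# Export step: write a query result back out as CSV.
out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerow(["name", "score"])
writer.writerows(
    conn.execute("SELECT name, score FROM results WHERE score >= 60")
)
exported = out.getvalue()

conn.close()
print(exported)
```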
Books
Text Books(T):
T1. Practical SQL: A Beginner's Guide to Storytelling with Data, Anthony DeBarros, No Starch
Press, ISBN-10: 1-59327-827-6
T2. PostgreSQL: A Practical Guide to the Advanced Open Source, Regina Obe and Leo Hsu
T3. Mastering PostgreSQL 13
Reference Books(R):
R1. Essential SQLAlchemy, Jason Myers and Rick Copeland, O'Reilly, ISBN: 978-1-491-91646-9
R2. PostgreSQL To High Performance, Enrico Pirozzi
E- Resources:
https://realpython.com/python-continuous-integration/
https://www.coursera.org/learn/database-design-postgresql
https://www.coursera.org/umich
CO1902: Practical Machine Learning for Data Science
Teaching Scheme Evaluation Scheme
Lectures: 4 Hrs. / Week CIA: 40 Marks
Credits: 4 End-Sem Exam: 60 Marks
Total: 100 Marks
====================================================================
Prerequisite Course: Mathematical Foundations for Data Science, Advanced Python Programming,
PostgreSQL and Data Pipeline
====================================================================
Course Objectives:
1. To understand and apply the CRISP-ML(Q) methodology for machine learning models
2. To understand and apply Clustering and Dimensionality Reduction
3. To understand and apply various NLP strategies
4. To learn and apply different Supervised Machine Learning algorithms
5. To learn how to evaluate models and apply performance metrics
6. To understand and apply Regression and its types
Course Outcomes (COs): On completion of the course, students will be able to-
Course Outcomes                                                             Bloom's Taxonomy
                                                                            Level  Descriptor
CO1  Apply project management methodology and Exploratory data analysis     3      Apply
CO2  Apply feature engineering techniques                                   3      Apply
CO3  Apply text mining and NLP                                              3      Apply
CO4  Apply SVM, KNN, Naïve Bayes, Decision tree                             3      Apply
CO5  Apply Supervised Learning - ensemble techniques                        3      Apply
CO6  Apply Supervised Learning - Regression                                 3      Apply
Mapping of Course Outcomes to Program Outcomes (POs) & Program
Specific Outcomes (PSOs):
PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12 PSO1 PSO2 PSO3
CO1 1 1 1 1 3 - - - 3 1 3 1 1 1 1
CO2 3 1 1 1 3 1 - 1 2 - - 2 2 2 1
CO3 3 2 1 1 3 1 - 1 2 - - 2 2 2 1
CO4 3 2 1 2 3 1 - - - - - 1 1 2 2
CO5 3 1 2 1 3 1 - - 2 - - 1 1 2 2
CO6 3 1 2 1 3 1 - - 2 - - 1 1 2 2
COURSE CONTENTS
Unit I: EDA and Feature Engineering Techniques (No. of Hours: 6, COs: CO1)
Project management methodology (CRISP-ML(Q)), Exploratory data analysis (5 number
summary, boxplot, bar graph, histogram, correlation graph, scatter plots), exploring
two or more variables, data sampling and its types, various types of bias. Dummy
variable conversion techniques, standardization and normalization, outlier
identification and outlier treatment techniques, skewness identification and its
treatment. Finding null values and their treatment.
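The five-number summary, a common IQR rule for outlier identification, and standardization versus normalization from Unit I can be sketched with NumPy. The data values are invented, and the 1.5 x IQR fence is one conventional choice rather than something the syllabus prescribes.

```python
import numpy as np

# Hypothetical sample with one obvious outlier.
data = np.array([12, 15, 14, 10, 18, 13, 95, 16, 11, 14], dtype=float)

# Five-number summary: min, Q1, median, Q3, max.
minimum, q1, median, q3, maximum = np.percentile(data, [0, 25, 50, 75, 100])

# Outlier identification with the 1.5 * IQR rule.
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = data[(data < lower) | (data > upper)]

# Outlier treatment by capping values at the fences.
capped = np.clip(data, lower, upper)

# Standardization (zero mean, unit variance) and min-max normalization.
standardized = (data - data.mean()) / data.std()
normalized = (data - data.min()) / (data.max() - data.min())

print(minimum, q1, median, q3, maximum, outliers)
```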
Unit II: Unsupervised Learning - Clustering, Dimensionality Reduction (No. of Hours: 6, COs: CO2)
Supervised vs Unsupervised Learning, Clustering/Segmentation algorithms: Hierarchical,
distance metrics for categorical data, distance metrics for continuous data, distance
metrics for mixed data, distance for clusters, K-means Clustering, K-selection (Elbow
curve), drawbacks and comparison.
Need for Dimensionality Reduction, Principal Component Analysis (PCA), applications of
PCA, Singular Value Decomposition (SVD), applications of SVD.
Unit III: Text Mining - Sentiment Analysis and NLP (No. of Hours: 6, COs: CO3)
Need for text mining, Bag of Words, terminology and preprocessing, DTM and TDM,
corpus-level Word Cloud. Introduction to NLP, data preprocessing in the NLP context,
NLP terminology, feature extraction from text, topic modeling, vector representation.
Unit IV: Supervised Learning Algorithms (No. of Hours: 6, COs: CO4)
Bayesian classifier: definition, selection criteria, needs, applications,
advantages and constraints.
K-nearest neighbor classifier: definition, selection criteria, needs,
applications, advantages, constraints and controlling complexity.
Decision Trees, building of decision trees, SVM.
Unit V: Supervised Learning - Ensemble Techniques (No. of Hours: 6, COs: CO5)
Ensemble primer: Bias vs variance trade-off, Generative models vs Non-generative
models, Bagging: Random Forest trees, Voting types, Boosting: AdaBoost, gradient
boosting, XGBoost, and stacking, selection criterion, constraints of ensemble
techniques.
Unit VI: Supervised Learning Regression (No. of Hours: 6, COs: CO6)
Scatter diagrams, Correlation Analysis, Ordinary least squares,
Regression Analysis, Simple Linear Regression, LINE assumptions,
LINE criterions training, Transformation, bias vs variance, Stepwise
Regression. Overfitting in Linear Regression: Lasso Regression and
Ridge Regression.
Multiple Linear Regression, Logistic Regression: odds, need,
Logistic Regression training.
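Ordinary least squares and the shrinkage effect of Ridge Regression from Unit VI can be compared on a toy dataset. The sketch assumes NumPy; the data, the penalty value `lam`, and the choice to leave the intercept unpenalized are illustrative assumptions.

```python
import numpy as np

# Noise-free toy data on the line y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# Design matrix with an intercept column.
X = np.column_stack([np.ones_like(x), x])

# Ordinary least squares: minimizes the sum of squared residuals.
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# Ridge regression, closed form: (X^T X + lam * P)^-1 X^T y,
# where P penalizes the slope but not the intercept.
lam = 1.0
penalty = lam * np.diag([0.0, 1.0])
beta_ridge = np.linalg.solve(X.T @ X + penalty, X.T @ y)

print(beta_ols, beta_ridge)
```

The ridge slope comes out smaller than the OLS slope, which is the shrinkage that helps against overfitting when features are noisy or correlated.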
Books:
Text Books(T):
T1: Abhishek Vijayvargia, “Machine learning with Python: an approach to applied machine learning”, BPB
Publications , 1st Edition
T2: Ethem Alpaydin, “Introduction to Machine Learning”, PHI 2nd Edition-2013.
T3: Andreas C Muller and Sarah Guido, “Introduction to Machine learning with Python: Guide for data
scientists”, O’Reilly publication 1st Edition
Reference Books(R):
R1: Introduction to Data Science, A Python Approach to Concepts, Techniques and Applications, With
contributions from Jordi Vitrià, Eloi Puertas,Petia Radeva, Oriol Pujol, Sergio Escalera, Francesc Dantí and
Lluís Garrido,Springer publication, ISBN 978-3-319-50016-4
R2: Advanced Data Analytics Using Python, Sayan Mukhopadhyay, Apress publication ,ISBN-13 (pbk):
978-1-4842-3449-5
R3: C. M. Bishop, “Pattern Recognition and Machine Learning”, Springer 1st Edition-2013
R4: Hastie, Tibshirani, Friedman, “Introduction to Machine Learning”, Springer, 2nd Edition-12
Online Resource:
Introduction to Machine Learning
https://onlinecourses.nptel.ac.in/noc23_cs18/preview
Data Science for Engineers
https://onlinecourses.nptel.ac.in/noc23_cs97/preview
CO1903: Mini Project :Data Science Project implementation
Teaching Scheme                 Evaluation Scheme
Lectures: 2 Hrs. / Week         Term Work Assessment: 50 Marks
Credits: 2
                                Total: 50 Marks
====================================================================
Prerequisite Course: Mathematical Foundations for Data Science, Advanced Python Programming,
PostgreSQL and Data Pipeline , Practical Machine Learning for Data Science
====================================================================
Course Objectives:
1. To follow SDLC meticulously and meet the objectives of proposed work
2. To test rigorously before deployment of system
3. To validate the work undertaken
4. To consolidate the work as furnished report
Course Outcomes:
Course Outcomes                                                        BTL  Bloom's Taxonomy Descriptor
CO1  Show evidence of independent investigation and critically         3    Apply
     analyze the results and their interpretation.
CO2  Report and present the original results in an orderly way,        3    Apply
     placing the open questions in the right perspective.
CO3  Link techniques and results from literature, as well as           3    Apply
     actual research and future research lines, with the
     research.
CO4  Appreciate practical implications and constraints of the          3    Apply
     specialist subject.
Mini Project Teamwork Assignments
Project workstation selection, installations along with setup and preparation of the
installation report, programming of the project functions, interfaces and GUI (if any),
test tool selection and testing of various test cases for the project, and generation of
various testing result charts, graphs etc., including reliability testing.
1) T20 World Cup 2022 Analysis using Python
Every sports event generates a lot of data that can be used to analyze the performance of
players and teams and the highlights of the game. The ICC Men's T20 World Cup 2022
generated a large amount of data that can be used to summarize the event. This project
covers the task of T20 World Cup 2022 analysis using Python.
2) Job Recommendation System using Python
A recommendation system is a popular application of Data Science that recommends
personalized content based on the users' interests. Almost all popular websites today
use a recommendation system. As the name suggests, a job recommendation
system is an application that recommends jobs based on the user's skills and desired
role. This project covers creating a Job Recommendation System using the Python
programming language.
3) Netflix Recommendation System using Python
Netflix is a subscription-based streaming platform that allows users to watch movies and
TV shows without advertisements. One of the reasons behind the popularity of Netflix is
its recommendation system, which recommends movies and TV shows based on the user's
interest. This project covers how to build a Netflix recommendation system using Python.
4) Online Food Order Prediction with Machine Learning
There has been a high demand for online food orders after the introduction of Swiggy and
Zomato in the market. Food delivery companies use customers' buying habits to make the
delivery process faster. A food order prediction system is one of the useful techniques
these companies can use to make the entire delivery process fast. This project covers
the task of Online Food Order Prediction with Machine Learning using Python.
5) Flipkart Reviews Sentiment Analysis using Python
Flipkart is one of the most popular Indian companies. It is an e-commerce platform that
competes with popular e-commerce platforms like Amazon. One of the most popular use
cases of data science is sentiment analysis of reviews of products sold on e-commerce
platforms. This project covers the task of Flipkart reviews sentiment analysis using
Python.