B.tech Honors Data Science 2023 Pattern Syllabus

The document outlines the curriculum for the B. Tech. Honors (Data Science) program at Sanjivani College of Engineering, effective from the academic year 2023-2024. It includes the program's vision, mission, educational objectives, outcomes, and specific outcomes, along with detailed course structures for various semesters. Additionally, it provides course objectives and outcomes for specific subjects such as Mathematical Foundations for Data Science and Advanced Python Programming.

Uploaded by

Sachin Borade

Sanjivani Rural Education Society’s

Sanjivani College of Engineering,


Kopargaon
(An Autonomous Institute Affiliated to Savitribai Phule Pune University, Pune)

B. Tech. Honors (Data Science)


2023 Pattern

Curriculum

(B. Tech. Honors Sem- V, VI, VII & VIII with effect from Academic
Year 2023-2024)

At. Sahajanandnagar, Post. Shingnapur Tal. Kopargaon Dist. Ahmednagar,


Maharashtra State, India PIN 423603

Sanjivani College of Engineering, Kopargaon Page 1 of 25


Sanjivani College of Engineering, Kopargaon
(An Autonomous Institute affiliated to SPPU, Pune)

DECLARATION

We, the Board of Studies (Computer Engineering), hereby declare that we have
designed the Curriculum Structure and Syllabus of the B. Tech. Honors (DS)
Program for Semesters V, VI, VII & VIII of Pattern 2023, w.e.f. A.Y. 2023-24,
as per the guidelines. We are pleased to submit and publish this FINAL copy of
the curriculum for the information of all concerned stakeholders.

Submitted by

(Dr.D.B.Kshirsagar)
BoS Chairman

Approved by

Dean Academics Director



Vision

• To develop world-class engineering professionals with good moral character and make them
capable of exhibiting leadership through their engineering ability, creative potential and effective
soft skills, which will improve the quality of life in society.

Mission

• To impart quality technical education to the students through innovative and interactive teaching
and learning processes, so that they acquire sound technical knowledge, professional competence and
an aptitude for research and development.
• To develop students as excellent communicators and highly effective team members and leaders
with full appreciation of the importance of professional, ethical and social responsibilities.



Program Educational Objectives (PEOs)

1. To prepare committed and motivated graduates by developing technical competency, a research
attitude and life-long learning with the support of a strong academic environment.
2. To train graduates with strong fundamentals and domain knowledge, updated with modern technology, to
analyse, design and create novel products that provide effective solutions for social benefit.
3. To exhibit employability skills, leadership and the right attitude to succeed in their professional careers.

Program Outcomes (POs)

Engineering Graduates will be able to:


1. Engineering knowledge: Apply the knowledge of mathematics, science, engineering fundamentals,
and an engineering specialization to the solution of complex engineering problems.
2. Problem analysis: Identify, formulate, review research literature, and analyze complex engineering
problems reaching substantiated conclusions using first principles of mathematics, natural sciences, and
engineering sciences.
3. Design/development of solutions: Design solutions for complex engineering problems and design
system components or processes that meet the specified needs with appropriate consideration for the
public health and safety, and the cultural, societal, and environmental considerations.
4. Conduct investigations of complex problems: Use research-based knowledge and research methods
including design of experiments, analysis and interpretation of data, and synthesis of the information to
provide valid conclusions.
5. Modern tool usage: Create, select, and apply appropriate techniques, resources, and modern
engineering and IT tools including prediction and modeling to complex engineering activities with an
understanding of the limitations.
6. The engineer and society: Apply reasoning informed by the contextual knowledge to assess societal,
health, safety, legal and cultural issues and the consequent responsibilities relevant to the professional
engineering practice.
7. Environment and sustainability: Understand the impact of the professional engineering solutions in
societal and environmental contexts, and demonstrate the knowledge of, and need for sustainable
development.



8. Ethics: Apply ethical principles and commit to professional ethics and responsibilities and norms of
the engineering practice.
9. Individual and team work: Function effectively as an individual, and as a member or leader in
diverse teams, and in multidisciplinary settings.
10. Communication: Communicate effectively on complex engineering activities with the engineering
community and with society at large, such as, being able to comprehend and write effective reports and
design documentation, make effective presentations, and give and receive clear instructions.
11. Project management and finance: Demonstrate knowledge and understanding of the engineering
and management principles and apply these to one’s own work, as a member and leader in a team, to
manage projects and in multidisciplinary environments.
12. Life-long learning: Recognize the need for, and have the preparation and ability to engage in
independent and life-long learning in the broadest context of technological change.

Program Specific Outcomes (PSOs)

1. Professional Skills: The ability to apply knowledge of problem solving, algorithmic analysis, software
Engineering, Data Structures, Networking, Database with modern recent trends to provide the effective
solutions for Computer Engineering Problems.

2. Problem-Solving Skills: The ability to inculcate best practices of software and hardware design for
delivering quality products useful for the society.

3. Successful Career: The ability to employ modern computer languages, environments, and platforms in
creating innovative career paths.



SRES’s Sanjivani College of Engineering, Kopargaon
(An Autonomous Institute Affiliated to SPPU Pune)
COURSE STRUCTURE- 2023 PATTERN
B. TECH HONORS: (AIML AND DS)

LIST OF ABBREVIATIONS

Abbreviation  Full Form                      Abbreviation  Full Form
PCC           Professional Core Courses      CIA           Continuous Internal Assessment
PEC           Professional Elective Courses  OR            End Semester Oral Examination
OEC           Open Elective Courses          PR            End Semester Practical Examination
ISE           In-Semester Evaluation         TW            Continuous Term Work Evaluation
ESE           End-Semester Evaluation        MLC           Mandatory Learning Course
PROJ          Project                        L             Lecture
LC            Laboratory Course              P             Practical
T             Tutorial                       NC            Non-Credit
Cat           Category



COURSE STRUCTURE- 2023 PATTERN
Data Science

L, T, P: Hrs./Week. CIA, ESE: Theory evaluation. TW, OR, PR: Practical evaluation.

Year/Sem      Cat  Code    Course Title                                       L   T   P   Credits  CIA  ESE  TW  OR  PR  Grand Total
TY Sem-I      PCC  CO1801  Mathematical Foundation for Data Science           4   -   -   4        40   60   -   -   -   100
TY Sem-II     PCC  CO1802  Advanced Python Programming                        4   -   -   4        40   60   -   -   -   100
Final Sem-I   PCC  CO1901  PostgreSQL and Data Pipeline                       4   -   -   4        40   60   -   -   -   100
Final Sem-II  PCC  CO1902  Practical Machine Learning for Data Science        4   -   -   4        40   60   -   -   -   100
Final Sem-II  PCC  CO1903  Mini Project: Data Science Project Implementation  -   -   4   2        -    -    50  -   -   50
Total                                                                         16  -   04  18       160  240  50  -   -   450

HoD Dean Academics Director


Dr.D.B.Kshirsagar Dr.A.B.Pawar Dr.A.G.Thakur



Data Science



CO1801: Mathematical Foundations for Data Science
Teaching Scheme              Evaluation Scheme
Lectures: 4 Hrs. / Week      Continuous Internal Assessment: 40 Marks
Credits: 4                   End-Sem Exam: 60 Marks
                             Total: 100 Marks
===================================================================
Prerequisite Course: Data Structures, Vector Calculus and Differential Equation
===================================================================
Course Objectives:
1. To study multidimensional, homogeneous arrays of fixed-size elements using
NumPy.
2. To study different types of data visualization using Matplotlib.
3. To study Probability and Statistics for Data Science.
4. To study the Linear Algebra needed for Data Science.
5. To study various techniques used for optimization problems.
6. To study the representation of a dataset in mathematical form and as a matrix.

Course Outcomes (COs): On completion of the course, students will be able to:

CO   Course Outcome                                                           Bloom's Taxonomy
                                                                              Level  Descriptor
CO1  Apply homogeneous arrays of fixed-size elements using NumPy              3      Apply
CO2  Apply the various data visualization methods using Matplotlib            3      Apply
CO3  Apply Probability and Statistics for Data Science                        3      Apply
CO4  Apply the concepts of Linear Algebra for Data Science                    3      Apply
CO5  Apply various optimization techniques and learn where to apply them in ML  3    Apply
CO6  Apply dimensionality reduction and learn where to apply it               3      Apply



Mapping of Course Outcomes to Program Outcomes (POs) & Program Specific
Outcomes(PSOs):

     PO1  PO2  PO3  PO4  PO5  PO6  PO7  PO8  PO9  PO10  PO11  PO12  PSO1  PSO2  PSO3
CO1  2    3    -    -    -    -    -    -    -    -     2     3     3     1     -
CO2  2    3    -    -    -    -    -    -    -    -     2     2     3     2     -
CO3  2    2    3    2    -    -    -    -    -    -     2     3     3     1     -
CO4  2    2    3    2    -    -    -    -    -    -     2     2     3     2     -
CO5  2    2    3    2    -    -    -    -    -    -     2     2     3     2     -
CO6  2    2    3    2    -    -    -    -    -    -     2     2     3     2     -

COURSE CONTENTS

Unit I: NumPy (6 Hours, CO1)
Introduction to NumPy Arrays: NumPy N-dimensional Array, Functions to Create
Arrays, Combining Arrays.
Index, Slice and Reshape NumPy Arrays: List to Arrays, Array Indexing, Array
Slicing, Array Reshaping.
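
The Unit I topics can be tried out in a few lines. The following is a minimal sketch, assuming NumPy is installed; the array values and variable names are illustrative only and are not part of the syllabus.

```python
import numpy as np

# Functions to create arrays
a = np.array([1, 2, 3, 4, 5, 6])   # list to array
zeros = np.zeros((2, 3))           # 2x3 array filled with 0.0
steps = np.arange(0, 10, 2)        # evenly spaced values: 0, 2, 4, 6, 8

# Reshaping: view the six elements as a 2x3 matrix
m = a.reshape(2, 3)

# Indexing and slicing
first_row = m[0]                   # row 0
last_col = m[:, -1]                # last column of every row
corner = m[1, 0]                   # single element at row 1, column 0

# Combining arrays
stacked_rows = np.vstack([m, m])   # stack vertically, shape (4, 3)
stacked_cols = np.hstack([m, m])   # stack horizontally, shape (2, 6)
```

All elements of a NumPy array share one dtype, which is what "homogeneous array of fixed-size elements" in the course objectives refers to.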

Unit II: Matplotlib & SciPy (7 Hours, CO2)
Introduction to Matplotlib, Matplotlib Subplots, Important Types of Plots,
Three-dimensional Plotting, Introduction to SciPy, SciPy Sub-packages.
Unit III: Probability and Statistics (7 Hours, CO3)
Introduction to Probability and Statistics, Conditional probability and Bayes'
theorem, Population and Sample, Population parameters and sample statistics,
Gaussian/Normal distribution, Physical significance of mean, median and mode,
Central limit theorem, Standard deviation, Tensors. All the above concepts need
to be implemented and discussed in class using NumPy, Matplotlib and SciPy.
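
As an illustration of Bayes' theorem and the descriptive statistics listed in Unit III, here is a small sketch that uses only the Python standard library. The screening-test numbers and the sample values are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import statistics

def bayes_posterior(prior, likelihood, false_positive_rate):
    """P(A|B) = P(B|A) P(A) / P(B), with P(B) expanded by total probability."""
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence

# Hypothetical test: 1% prevalence, 90% sensitivity, 5% false positive rate
posterior = bayes_posterior(prior=0.01, likelihood=0.90, false_positive_rate=0.05)

# Hypothetical sample for mean, median, mode and standard deviation
sample = [2, 4, 4, 4, 5, 5, 7, 9]
mean = statistics.mean(sample)
median = statistics.median(sample)
mode = statistics.mode(sample)
population_sd = statistics.pstdev(sample)   # divides by n (population parameter)
sample_sd = statistics.stdev(sample)        # divides by n - 1 (sample statistic)
```

The two standard deviation calls show the difference between a population parameter and a sample statistic: the sample version is slightly larger because it divides by n - 1.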

Unit IV: Linear Algebra (8 Hours, CO4)
Introduction to Vectors (2-D, 3-D, n-D), Row Vector and Column Vector, Dot
Product and Angle between 2 Vectors, Projection and Unit Vector, Equation of a
line (2-D), Plane (3-D) and Hyperplane (n-D), Plane Passing through origin,
Normal to a Plane, Distance of a point from a Plane/Hyperplane, Half-Spaces,
Matrices: Diagonal matrices, scalar matrices, identity matrices, multiplication
and transpose of matrices, Derivatives, Maxima and Minima, Understanding
Distance Metrics Used in Machine Learning. All the above concepts need to be
implemented and discussed in class using NumPy, Matplotlib and SciPy.
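
Several Unit IV formulas translate directly into code. The sketch below uses plain Python lists so that each formula stays visible; the function names are illustrative, and in class the same operations would more likely be written with NumPy.

```python
import math

def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """Euclidean length of a vector."""
    return math.sqrt(dot(u, u))

def angle_between(u, v):
    """Angle in radians, from cos(theta) = u.v / (|u| |v|)."""
    return math.acos(dot(u, v) / (norm(u) * norm(v)))

def projection(u, v):
    """Vector projection of u onto v."""
    scale = dot(u, v) / dot(v, v)
    return [scale * x for x in v]

def distance_to_hyperplane(w, b, p):
    """Distance of point p from the hyperplane w.x + b = 0."""
    return abs(dot(w, p) + b) / norm(w)

def manhattan(u, v):
    """L1 distance, one of the distance metrics used in machine learning."""
    return sum(abs(a - b) for a, b in zip(u, v))
```

The sign of `dot(w, p) + b` (before taking the absolute value) tells which half-space the point `p` lies in.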
Unit V: Optimization Techniques (8 Hours, CO5)
Importance of Optimization, Optimization used in DS/ML/DL, Convex Functions,
how optimization differs from other mathematical problems, Types of
Optimization Problems, Various techniques used for Optimization problems.
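
The unit does not name specific techniques. As one commonly used example, the sketch below applies gradient descent to a convex function of one variable. The function, learning rate and step count are assumptions chosen for illustration.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x

# Convex example: f(x) = (x - 3)^2 has gradient f'(x) = 2 (x - 3)
# and a single global minimum at x = 3.
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
```

Because the function is convex, any starting point converges to the same minimum, provided the learning rate is small enough. For non-convex functions the result can depend on `x0`.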
Unit VI: Dimensionality Reduction and Visualization (6 Hours, CO6)
Introduction to Dimensionality reduction, Row Vector and Column Vector,
representation of a dataset in mathematical form and as a matrix, Data
Preprocessing: Feature Normalization, Mean of a data matrix, Column
Standardization, Covariance of a Data Matrix, PCA and SVD.
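
The Unit VI pipeline (data matrix, column standardization, covariance, PCA via SVD) can be sketched as follows, assuming NumPy is installed. The data matrix is hypothetical and deliberately has two perfectly correlated columns, so that one principal component captures all of the variance.

```python
import numpy as np

# Hypothetical data matrix: rows are samples, columns are features
X = np.array([[2.0, 4.0], [4.0, 8.0], [6.0, 12.0], [8.0, 16.0]])

# Column standardization: zero mean and unit standard deviation per feature
mean = X.mean(axis=0)
std = X.std(axis=0)
Z = (X - mean) / std

# Covariance of the standardized data matrix (dividing by n)
cov = (Z.T @ Z) / len(Z)

# PCA via SVD: rows of Vt are the principal directions
U, S, Vt = np.linalg.svd(Z, full_matrices=False)
explained = S**2 / np.sum(S**2)   # fraction of variance per component
projected = Z @ Vt[0]             # 1-D representation of each sample
```

The sign of each principal direction returned by SVD is arbitrary, so `projected` may come out negated on another machine without changing its meaning.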
Books:
Text Books(T):
T1: Jason Brownlee, "Basics of Linear Algebra for Machine Learning: Discover the Mathematical
Language of Data in Python".
T2: Milton. J. S. and Arnold. J.C., "Introduction to Probability and Statistics", Tata McGraw Hill,
4th Edition, 2007.
T3: Johnson. R.A. and Gupta. C.B., "Miller and Freund’s Probability and Statistics for Engineers",
Pearson Education, Asia, 7th Edition, 2007.

Reference Books(R):
R1: Spiegel. M.R., Schiller. J. and Srinivasan. R.A., "Schaum’s Outline of Theory and Problems of
Probability and Statistics", Tata McGraw Hill Edition, 2004.
R2: Devore. J.L., "Probability and Statistics for Engineering and the Sciences", Cengage
Learning, New Delhi, 8th Edition, 2012.
R3: Ali Grami, "Probability, Random Variables, Statistics, and Random Processes: Fundamentals &
Applications", ISBN: 978-1-119-30081-6

Online Resource:
https://www.analyticsvidhya.com/blog/2020/02/4-types-of-distance-metrics-in-machine-learning/
https://www.analyticsvidhya.com/blog/2022/10/optimization-essentials-for-machine-learning/

NPTEL Courses:
Data Science for Engineers https://onlinecourses.nptel.ac.in/noc20_cs28/preview
Essential Mathematics for Machine Learning https://nptel.ac.in/courses/111107137



CO1802: Advanced Python Programming
Teaching Scheme              Evaluation Scheme
Lectures: 4 Hrs. / Week      Continuous Internal Assessment: 40 Marks
Credits: 4                   End-Sem Exam: 60 Marks
                             Total: 100 Marks
===================================================================
Prerequisite Course: Python, Mathematical Foundations for Data Science
===================================================================
Course Objectives:
1. To understand the use of Python development environments.
2. To get well versed with built-in data structures, functions and files.
3. To implement different Python itertools, modules and packages.
4. To learn and use different modules, packages and error handling.
5. To learn and use various Python libraries like pandas.
6. To learn and use various Python libraries like NLTK for text preprocessing.

Course Outcomes:
On completion of the course, students will be able to-
CO   Course Outcome                                                           BTL  Bloom's Taxonomy Descriptor
CO1  Understand and configure Python IDEs and environments.                   3    Apply
CO2  Understand and implement built-in data structures, functions and files.  3    Apply
CO3  Implement itertools and modules.                                         3    Apply
CO4  Understand and apply modules and packages.                               3    Apply
CO5  Understand Python pandas packages and implement them for data frame
     manipulation.                                                            3    Apply
CO6  Understand and implement Python text processing libraries.               3    Apply

Mapping of Course Outcomes to Program Outcomes (POs) & Program Specific Outcomes (PSOs):

PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12 PSO1 PSO2 PSO3
CO1 1 1 - - 2 - - - - - - 2 - - -



CO2 2 2 2 - 2 - - - - - - 2 - 2 2
CO3 2 2 3 2 2 - - - - - - 2 1 2 2
CO4 2 2 3 2 2 - - - - - - 2 1 2 2
CO5 2 2 3 3 2 - - - - - - 2 1 2 2
CO6 2 2 3 2 3 - - - - - - 2 1 2 2

COURSE CONTENTS

Unit I: Python IDEs (06 Hrs, CO1)
Visual Studio Code: VS Code settings and themes.
PyCharm: Exploring all features of PyCharm, code debugging.
Jupyter Notebook: Combine code, text, and images for greater user experience.
Google Colab: How to import data in Colab, how to compile and develop code in
Colab.
Spyder: All features, debugging. Creation of virtual environment using
Anaconda, installation of various packages, upgrading of various packages, how
to access help from documentation.

Unit II: Built-in Data Structures, Functions and Files (06 Hrs, CO2)
Tuples: Tuple operations. List: Adding and removing elements, Concatenating and
combining lists, Sorting, Slicing. Built-in Sequence Functions: enumerate,
sorted, zip, reversed.
Dictionaries: Dictionary operations, List comprehension, if statement in list
comprehension, set and dictionary comprehension, generator expressions and
functions.
Functions: Optional arguments or parameters, keyword arguments, arbitrary
positional arguments, return keyword, lambda function.
Files: Opening files, reading files, writing files.
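
Many of the Unit II features are easiest to see working together. The following is a small sketch with hypothetical names and scores; it covers tuples, `zip`, `enumerate`, `sorted`, the comprehensions, a generator expression, a lambda, and a function with arbitrary positional arguments.

```python
names = ["asha", "ravi", "meena"]   # list
scores = (82, 67, 91)               # tuple (immutable)

# zip pairs the two sequences; a dictionary comprehension builds a mapping
score_by_name = {name: score for name, score in zip(names, scores)}

# List comprehension with an if clause
passed = [name for name, score in score_by_name.items() if score >= 70]

# sorted with a lambda key, then enumerate to number the results
ranked = sorted(score_by_name.items(), key=lambda item: item[1], reverse=True)
ranking = [f"{position}. {name}" for position, (name, _) in enumerate(ranked, start=1)]

# Set comprehension and generator expression
initials = {name[0] for name in names}
total = sum(score for score in scores)

def average(*values, digits=1):
    """Accept any number of positional arguments plus an optional keyword argument."""
    return round(sum(values) / len(values), digits)
```

Calling `average(*scores)` unpacks the tuple into positional arguments, which is the counterpart of `*values` in the function definition.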

Unit III: Itertools Module (06 Hrs, CO3)
count, cycle, repeat, combinations, permutations, product,
combinations_with_replacement, chain, islice, compress, filter, groupby,
decorators.
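
Each `itertools` function named in Unit III returns a lazy iterator, so the sketch below wraps the results in `list` to make them visible. The input values are arbitrary examples.

```python
from itertools import (chain, combinations, combinations_with_replacement,
                       compress, count, cycle, groupby, islice, permutations,
                       product, repeat)

# Infinite iterators must be cut off, here with islice
first_evens = list(islice(count(0, 2), 4))
round_robin = list(islice(cycle("AB"), 5))
threes = list(repeat(3, 2))

# Combinatoric iterators
pairs = list(combinations("ABC", 2))
pairs_with_repeats = list(combinations_with_replacement("AB", 2))
orders = list(permutations("AB"))
grid = list(product([0, 1], repeat=2))

# Joining and selecting
joined = list(chain([1, 2], [3]))
picked = list(compress("ABCD", [1, 0, 1, 0]))

# groupby only groups consecutive items, so the input is sorted first
words = sorted(["cherry", "apple", "banana", "avocado", "cranberry"])
by_letter = {key: list(group) for key, group in groupby(words, key=lambda w: w[0])}
```

Forgetting to sort before `groupby` is a common mistake: unsorted input produces a separate group for every run of equal keys.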

Unit IV: Modules, Packages, Error Handling (06 Hrs, CO4)
Reusing code with modules, organizing modules into packages, handling errors
with try and except blocks.

Unit V: Python with Pandas (06 Hrs, CO5)
Pandas Series introduction, create a pandas DataFrame, select rows, select
columns, add new rows, add new columns, rename columns, drop rows by
label/index, drop columns by label or index, drop rows based on column values,
cast column types, get row count, apply, group by, shuffle DataFrame rows, join
DataFrames, merge DataFrames, concat DataFrames, fill NaN with values, loc,
iloc, filter, where, reset index.

Unit VI: Python Text Processing (06 Hrs, CO6)
String operations and string methods, Counting Tokens in Paragraphs, Filtering
Duplicate Words, Tokenization, Removing stopwords, word replacement, search and
match, Text munging, text wrapping, RegEx library, Various Patterns used in
RegEx. NLTK & various functions used in NLTK.
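
Several of the Unit VI tasks can be tried with the standard library alone, before moving to NLTK. The sketch below is one such approach; the stop-word set is a tiny hand-written stand-in (not NLTK's list), and the sentence and function names are illustrative.

```python
import re
import textwrap
from collections import Counter

# Tiny illustrative stop-word set; NLTK ships a much larger list
STOPWORDS = {"the", "is", "a", "of", "and", "to"}

def tokenize(text):
    """Lowercase the text and extract word tokens with a regular expression."""
    return re.findall(r"[a-z0-9']+", text.lower())

def remove_stopwords(tokens):
    return [token for token in tokens if token not in STOPWORDS]

def unique_in_order(tokens):
    """Filter duplicate words while keeping first-seen order."""
    return list(dict.fromkeys(tokens))

text = "The cat chased the mouse, and the mouse escaped to the garden."
tokens = tokenize(text)
content = remove_stopwords(tokens)
counts = Counter(content)                       # token counts
replaced = re.sub(r"\bmouse\b", "rat", text)    # word replacement
wrapped = textwrap.wrap(text, width=30)         # text wrapping
```

The `\b` anchors in the replacement pattern match word boundaries, so a longer word that merely contains "mouse" would be left untouched.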

Books
Text Books(T):

T1. Prateek Joshi, Artificial Intelligence with Python: A Comprehensive Guide to Building
Intelligent Apps for Python Beginners and Developers, Packt Publishing, 2nd edition,
ISBN: 178646439X

T2. Andreas C. Müller, Sarah Guido, Introduction to Machine Learning with Python:
A Guide for Data Scientists, O'Reilly, 1st edition, ISBN: 978-1449369415

T3. Wes McKinney, Python for Data Analysis, O'Reilly, Second Edition.

Reference Books(R):

R1. Giuseppe Bonaccorso, "Machine Learning Algorithms", Packt Publishing Limited,
ISBN-10: 1785889621, ISBN-13: 978-1785889622.

R2. Steven Bird, Ewan Klein, and Edward Loper, Natural Language Processing with Python:
Analyzing Text with the Natural Language Toolkit, O'Reilly Media, Inc., ISBN:
9780596516499

E- Resources:
https://www.tutorialspoint.com/python_text_processing/index.html
https://www.tutorialspoint.com/python_pandas/python_pandas_series.html
https://sparkbyexamples.com/python-pandas-tutorial-for-beginners/



CO1901: PostgreSQL and Data Pipeline
Teaching Scheme              Examination Scheme
Lectures: 4 Hrs. / Week      Continuous Assessment: 40 Marks
Credits: 4                   End-Sem Exam: 60 Marks
                             Total: 100 Marks
==================================================================
Prerequisite Course: Python, Mathematical Foundations for Data Science
==================================================================
Course Objectives:
1. To learn the architecture of PostgreSQL
2. To understand various data types
3. To create database tables and various table operations
4. To understand various database operations
5. To learn how to establish connection using python
Course Outcomes:
On completion of the course, students will be able to-
CO   Course Outcome                                                  BTL  Bloom's Taxonomy Descriptor
CO1  Understand various data types in a database system              2    Understand
CO2  Apply database and table operations                             3    Apply
CO3  Apply data exploration methods to extract data from tables      3    Apply
CO4  Apply importing and exporting techniques on data                3    Apply
CO5  Apply join operations in a relational database                  3    Apply
CO6  Apply a data pipeline using Psycopg2 and SQLAlchemy             3    Apply



Mapping of Course Outcomes to Program Outcomes (POs) & Program Specific Outcomes
(PSOs):

PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12 PSO1 PSO2 PSO3
CO1 3 2 - - - - - - - - - - 3 - -
CO2 2 2 3 2 2 - - - - - - 1 3 - -
CO3 2 2 3 2 3 - - - - - - - 3 - -
CO4 3 1 2 1 3 - - - - - - - 3 - -
CO5 3 - 1 - - - - - - - - - 3 - -
CO6 1 2 3 1 3 - - - - - - - 3 - -

COURSE CONTENTS

Unit I: Data Types (06 Hrs, CO1)
Characters, Numbers, Choosing Data Type, Date and Time, Using the interval Data
Type in Calculations, Transforming Values from One Type to Another with CAST.
Unit II: Database and Various Database Table Operations (06 Hrs, CO2)
Creating a Table, Primary keys and foreign keys, constraints, Insert, Update,
Delete, Alter table, Drop table, check constraints.

Unit III: Data Exploration (06 Hrs, CO3)
SELECT statement, SELECT DISTINCT, COUNT, Filtering Rows with WHERE, Sorting
Data with ORDER BY, LIMIT, BETWEEN, IN, Using LIKE and ILIKE with WHERE,
Combining Operators with AND and OR, Aggregation statements, GROUP BY, HAVING.
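
The Unit III queries can be practised without a database server. The sketch below is a stand-in, not the course setup: it uses Python's built-in `sqlite3` with an in-memory database instead of PostgreSQL, and the `employees` table and its rows are hypothetical. The SQL shown is common to both systems; `ILIKE` is PostgreSQL-specific and is therefore left out.

```python
import sqlite3

# In-memory SQLite database as a stand-in for a PostgreSQL connection
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('Asha', 'Data', 70000), ('Ravi', 'Data', 55000),
        ('Meena', 'Web', 60000), ('Arjun', 'Web', 48000),
        ('Kiran', 'Ops', 52000);
""")

# WHERE with LIKE, AND and BETWEEN, then ORDER BY and LIMIT
top_a_names = conn.execute("""
    SELECT name FROM employees
    WHERE name LIKE 'A%' AND salary BETWEEN 40000 AND 80000
    ORDER BY salary DESC
    LIMIT 2
""").fetchall()

# Aggregation with GROUP BY, filtered by HAVING
big_depts = conn.execute("""
    SELECT dept, COUNT(*) AS headcount, AVG(salary) AS avg_salary
    FROM employees
    GROUP BY dept
    HAVING COUNT(*) >= 2
    ORDER BY dept
""").fetchall()

distinct_depts = conn.execute("SELECT COUNT(DISTINCT dept) FROM employees").fetchone()[0]
conn.close()
```

`WHERE` filters individual rows before grouping, while `HAVING` filters whole groups after aggregation, which is why the head-count condition has to go in `HAVING`.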

Unit IV: Importing and Exporting Techniques on Data (06 Hrs, CO4)
Working with Delimited Text Files, Header Rows, Using COPY to Import Data,
Importing a Subset of Columns with COPY, Adding a Default Value to a Column
During Import, Using COPY to Export Data, Exporting Particular Columns,
Exporting Query Results, Importing and Exporting Through pgAdmin.
Unit V: Join Operations in Relational Database (06 Hrs, CO5)
Introduction to joins, Join types, Inner join, Full outer join, Left outer
join, Right join, Using NULL to Find Rows with Missing Values, Three Types of
Table Relationships, One-to-One Relationship, One-to-Many Relationship,
Many-to-Many Relationship, Selecting Specific Columns in a Join, Joining
Multiple Tables.
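
The join topics in Unit V, including the use of NULL to find rows with no match, can be sketched as follows. As an assumption for portability, the example runs on Python's built-in `sqlite3` rather than PostgreSQL, and the two tables (a one-to-many relationship between departments and employees) are hypothetical.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE employees (
        emp_id INTEGER PRIMARY KEY,
        name TEXT,
        dept_id INTEGER REFERENCES departments (dept_id)
    );
    INSERT INTO departments VALUES (1, 'Data'), (2, 'Web'), (3, 'Ops');
    INSERT INTO employees VALUES (1, 'Asha', 1), (2, 'Ravi', 1), (3, 'Meena', 2);
""")

# Inner join: only rows with a match in both tables, selecting specific columns
inner = conn.execute("""
    SELECT e.name, d.dept_name
    FROM employees AS e
    JOIN departments AS d ON e.dept_id = d.dept_id
    ORDER BY e.emp_id
""").fetchall()

# Left join plus IS NULL: departments that have no employees
empty_depts = conn.execute("""
    SELECT d.dept_name
    FROM departments AS d
    LEFT JOIN employees AS e ON e.dept_id = d.dept_id
    WHERE e.emp_id IS NULL
""").fetchall()
conn.close()
```

A left join keeps every row of the left table and fills the right-hand columns with NULL where no match exists, so testing a right-hand key for NULL isolates the unmatched rows.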

Unit VI: Data Pipeline (06 Hrs, CO6)
Overview of Python and PostgreSQL, Psycopg2 example, importing and exporting
CSV using Psycopg2, Working with Data via SQLAlchemy Core, Inserting Data,
Querying Data, Limiting, Built-In SQL Functions and Labels.
Books
Text Books(T):

T1. Anthony DeBarros, Practical SQL: A Beginner’s Guide to Storytelling with Data, No Starch Press,
ISBN-10: 1-59327-827-6

T2. Regina Obe and Leo Hsu, PostgreSQL: A Practical Guide to the Advanced Open Source

T3. Mastering PostgreSQL 13

Reference Books(R):

R1. Jason Myers and Rick Copeland, Essential SQLAlchemy, O’Reilly, ISBN 978-1-491-91646-9
R2. Enrico Pirozzi, PostgreSQL To High Performance

E- Resources:

https://realpython.com/python-continuous-integration/
https://www.coursera.org/learn/database-design-postgresql
https://www.coursera.org/umich



CO1902: Practical Machine Learning for Data Science
Teaching Scheme Evaluation Scheme
Lectures: 4 Hrs. / Week CIA: 40 Marks
Credits: 4 End-Sem Exam: 60 Marks
Total: 100 Marks
====================================================================
Prerequisite Course: Mathematical Foundations for Data Science, Advanced Python Programming,
PostgreSQL and Data Pipeline
====================================================================
Course Objectives:
1. To understand and apply the CRISP-ML(Q) methodology to machine learning models
2. To understand and apply Clustering and Dimensionality Reduction
3. To understand and apply various NLP strategies
4. To learn and apply different Supervised Machine Learning algorithms
5. To learn how to evaluate models and apply performance metrics
6. To understand and apply Regression and its types
Course Outcomes (COs): On completion of the course, students will be able to:

CO   Course Outcome                                                        Bloom's Taxonomy
                                                                           Level  Descriptor
CO1  Apply project management methodology and exploratory data analysis    3      Apply
CO2  Apply feature engineering techniques                                  3      Apply
CO3  Apply text mining and NLP                                             3      Apply
CO4  Apply SVM, KNN, Naïve Bayes, Decision tree                            3      Apply
CO5  Apply Supervised Learning ensemble techniques                         3      Apply
CO6  Apply Supervised Learning Regression                                  3      Apply

Mapping of Course Outcomes to Program Outcomes (POs) & Program


Specific Outcomes (PSOs):
PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12 PSO1 PSO2 PSO3
CO1 1 1 1 1 3 - - - 3 1 3 1 1 1 1
CO2 3 1 1 1 3 1 - 1 2 - - 2 2 2 1
CO3 3 2 1 1 3 1 - 1 2 - - 2 2 2 1



CO4 3 2 1 2 3 1 - - - - - 1 1 2 2
CO5 3 1 2 1 3 1 - - 2 - - 1 1 2 2
CO6 3 1 2 1 3 1 - - 2 - - 1 1 2 2

COURSE CONTENTS

Unit I: EDA and Feature Engineering Techniques (6 Hours, CO1)
Project management methodology (CRISP-ML(Q)), Exploratory data analysis
(5-number summary, boxplot, bar graph, histogram, correlation graph, scatter
plots), exploring two or more variables, Data sampling and its types, various
types of bias. Dummy variable conversion techniques, Standardization and
normalization, outlier identification and outlier treatment techniques,
skewness identification and its treatment. Finding null values and their
treatment.
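
Two of the Unit I techniques, the 5-number summary and outlier identification, can be sketched with the standard library. The 1.5 x IQR rule used below is one common convention and is an assumption here, since the syllabus does not name a specific rule; the data values are hypothetical.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)

def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (boxplot whisker rule)."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

data = [12, 15, 14, 10, 18, 16, 13, 95]   # hypothetical sample with one extreme value
summary = five_number_summary(data)
outliers = iqr_outliers(data)
```

Different tools compute quartiles slightly differently, so the exact Q1 and Q3 values may not match other software; `method="inclusive"` treats the data as the full population.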
Unit II: Unsupervised Learning - Clustering, Dimensionality Reduction (6 Hours, CO2)
Supervised vs Unsupervised Learning, Clustering/Segmentation algorithms:
Hierarchical, Distance metrics for categorical data, distance metrics for
continuous data, distance metrics for mixed data, distance for clusters,
K-means Clustering, K-selection (Elbow curve), drawbacks and comparison.
Need for Dimensionality Reduction, Principal Component Analysis (PCA),
applications of PCA, Singular Value Decomposition (SVD), applications of SVD.

Unit III: Text Mining - Sentiment Analysis and NLP (6 Hours, CO3)
Need for text mining, Bag of words, terminology and preprocessing, DTM and
TDM, Corpus-level Word Cloud. Introduction to NLP, data preprocessing in the
NLP context, NLP terminology, feature extraction from text, topic modeling,
vector representation.
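
The bag-of-words model and the two matrix layouts named in Unit III (DTM and TDM) can be built by hand on a toy corpus. The sketch below uses hypothetical documents and whitespace tokenization for brevity.

```python
from collections import Counter

# Hypothetical three-document corpus
docs = ["good movie good plot", "bad movie", "good acting bad plot"]

tokenized = [doc.split() for doc in docs]
vocabulary = sorted({word for tokens in tokenized for word in tokens})

# Document-term matrix (DTM): one row per document, one column per term
dtm = []
for tokens in tokenized:
    counts = Counter(tokens)
    dtm.append([counts[term] for term in vocabulary])

# Term-document matrix (TDM): the transpose of the DTM
tdm = [list(column) for column in zip(*dtm)]
```

Each row of the DTM is the vector representation of one document, which is the form that classifiers and clustering algorithms take as input.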

Unit IV: Supervised Learning Algorithms (6 Hours, CO4)
Bayesian classifier: definition, selection criteria, needs, applications,
advantages and constraints.
K-nearest neighbor classifier: definition, selection criteria, needs,
applications, advantages, constraints and controlling complexity.
Decision Trees, building of decision trees, SVM.
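
Of the classifiers in Unit IV, K-nearest neighbor is the simplest to write from scratch. The following is a minimal sketch using Euclidean distance and majority voting; the training points and labels are hypothetical, and `math.dist` requires Python 3.8 or later.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of query by majority vote among its k nearest neighbors."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical two-class training set
points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
labels = ["A", "A", "A", "B", "B", "B"]
```

The value of `k` controls model complexity: a small `k` follows the training data closely, while a large `k` smooths the decision boundary.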
Unit V: Supervised Learning - Ensemble Techniques (6 Hours, CO5)
Ensemble primer: Bias vs variance trade-off, Generative models vs
non-generative models, Bagging: Random Forest trees, Voting types, Boosting:
AdaBoost, gradient boosting, XGBoost, and stacking, selection criteria,
constraints of ensemble techniques.
Unit VI: Supervised Learning Regression (6 Hours, CO6)
Scatter diagrams, Correlation Analysis, Ordinary least squares, Regression
Analysis, Simple Linear Regression, LINE assumptions, LINE criteria training,
Transformation, bias vs variance, Stepwise Regression. Overfitting in Linear
Regression: Lasso Regression and Ridge Regression.
Multiple Linear Regression, Logistic Regression: odds, need, Logistic
Regression training.
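
Ordinary least squares for simple linear regression has a closed-form solution that fits in a few lines. The sketch below is written in plain Python to keep the formulas visible; the function names and data points are illustrative.

```python
def fit_simple_linear_regression(xs, ys):
    """Ordinary least squares: slope = Sxy / Sxx, intercept = mean_y - slope * mean_x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def r_squared(xs, ys, slope, intercept):
    """Fraction of the variance in y explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Hypothetical points lying exactly on y = 2x + 1
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
slope, intercept = fit_simple_linear_regression(xs, ys)
```

Ridge and Lasso regression extend this idea by adding a penalty on the size of the coefficients, which is how they limit overfitting.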

Books:

Text Books(T):

T1: Abhishek Vijayvargia, “Machine learning with Python: an approach to applied machine learning”, BPB
Publications , 1st Edition

T2: Ethem Alpaydin, “Introduction to Machine Learning”, PHI 2nd Edition-2013.

T3: Andreas C Muller and Sarah Guido, “Introduction to Machine learning with Python: Guide for data
scientists”, O’Reilly publication 1st Edition
Reference Books(R):
R1: Introduction to Data Science, A Python Approach to Concepts, Techniques and Applications, With
contributions from Jordi Vitrià, Eloi Puertas,Petia Radeva, Oriol Pujol, Sergio Escalera, Francesc Dantí and
Lluís Garrido,Springer publication, ISBN 978-3-319-50016-4
R2: Advanced Data Analytics Using Python, Sayan Mukhopadhyay, Apress publication ,ISBN-13 (pbk):
978-1-4842-3449-5

R3: C. M. Bishop, “Pattern Recognition and Machine Learning”, Springer 1st Edition-2013

R4: Hastie, Tibshirani, Friedman, “Introduction to Machine Learning”, Springer, 2nd Edition-12

Online Resource:
Introduction to Machine Learning
https://onlinecourses.nptel.ac.in/noc23_cs18/preview
Data Science for Engineers
https://onlinecourses.nptel.ac.in/noc23_cs97/preview



CO1903: Mini Project :Data Science Project implementation
Teaching Scheme              Evaluation Scheme
Lectures: 2 Hrs. / Week      Term Work Assessment: 50 Marks
Credits: 2                   Total: 50 Marks
====================================================================
Prerequisite Course: Mathematical Foundations for Data Science, Advanced Python Programming,
PostgreSQL and Data Pipeline , Practical Machine Learning for Data Science
====================================================================
Course Objectives:
1. To follow SDLC meticulously and meet the objectives of proposed work
2. To test rigorously before deployment of system
3. To validate the work undertaken
4. To consolidate the work as furnished report
Course Outcomes:

CO   Course Outcome                                                     BTL  Bloom's Taxonomy Descriptor
CO1  Show evidence of independent investigation and critically
     analyze the results and their interpretation.                      3    Apply
CO2  Report and present the original results in an orderly way,
     placing the open questions in the right perspective.               3    Apply
CO3  Link techniques and results from literature, as well as actual
     research and future research lines, with the research.             3    Apply
CO4  Appreciate practical implications and constraints of the
     specialist subject.                                                3    Apply

Mini Project Teamwork Assignments

Project workstation selection, installation and setup, and preparation of the
installation report; programming of the project functions, interfaces and GUI
(if any); test tool selection and testing of the various test cases for the
project; and generation of testing result charts, graphs, etc., including
reliability testing.

1) T20 World Cup 2022 Analysis using Python


Every sports event generates a lot of data which we can use to analyze the performance of
players, teams, and many highlights of the game. As the ICC Men’s T20 world cup has
just finished, it has generated a lot of data we can use to summarize the event. So, if you
want to learn how to analyze a sports event like the t20 world cup, this article is for you.
This article will take you through the task of T20 World Cup 2022 analysis using Python.



2) Job Recommendation System using Python
A recommendation system is a popular application of Data Science that recommends
personalized content based on the users’ interests. Almost all the popular websites you
visit today use a recommendation system. As the name suggests, a job recommendation
system is an application that recommends jobs based on the skills and the user’s desired
role. So, if you want to learn how to recommend jobs using the Python programming
language, this article is for you. This article will help you learn about creating a Job
Recommendation System using Python.

3) Netflix Recommendation System using Python


Netflix is a subscription-based streaming platform that allows users to watch movies and
TV shows without advertisements. One of the reasons behind the popularity of Netflix is
its recommendation system. Its recommendation system recommends movies and TV
shows based on the user’s interest. If you are a Data Science student and want to learn
how to create a Netflix recommendation system, this article is for you. This article will
take you through how to build a Netflix recommendation system using Python.

4) Online Food Order Prediction with Machine Learning


There has been a high demand for online food orders after the introduction of Swiggy and
Zomato in the market. Food delivery companies use your buying habits to make the
delivery process faster. The food order prediction system is one of the useful techniques
these companies can use to make the entire delivery process fast. In this article, I will
take you through the task of Online Food Order Prediction with Machine Learning using
Python.

5) Flipkart Reviews Sentiment Analysis using Python


Flipkart is one of the most popular Indian companies. It is an e-commerce platform that
competes with popular e-commerce platforms like Amazon. One of the most popular use
cases of data science is the task of sentiment analysis of product reviews sold on e-
commerce platforms. So, if you want to learn how to analyze the sentiment of Flipkart
reviews, this article is for you. In this article, I will walk you through the task of Flipkart
reviews sentiment analysis using Python.