Introduction to Data Science Course Outline

The document outlines the course 'Introduction to Data Science' offered by Wachemo University, detailing its objectives, learning outcomes, and assessment methods. It covers essential topics such as data exploration, statistical concepts, machine learning, and ethical issues in data science, aiming to equip students with foundational skills in the field. Prerequisites include basic knowledge of algorithms and programming, and the course utilizes Python and various data science tools for practical applications.

Uploaded by

Abdulkarim Emam

WACHEMO UNIVERSITY

COLLEGE OF ENGINEERING AND TECHNOLOGY

COMPUTER SCIENCE DEPARTMENT

Course Title: Introduction to Data Science (CoSc 2042)
Credit Hours: 3 Cr. Hrs
Year: II
Semester: I
Prerequisites:
Office: 4
Email: habteshiferaw27@gmail.com
Instructor: Habtamu sh.

Continuous Assessment (100%)
• Class Participation (5%)
• Mid-exam (20%)
• Assignments (25%)
• Final exam (50%)
Course Description
Data Science is the study of the generalizable extraction of knowledge from data. Being a data
scientist requires an integrated skill set spanning mathematics, statistics, machine learning,
databases and other branches of computer science along with a good understanding of the craft of
problem formulation to engineer effective solutions. This course will introduce students to this
rapidly growing field and equip them with some of its basic principles and tools as well as its
general mindset. Students will learn concepts, techniques and tools they need to deal with various
facets of data science practice, including data collection and integration, exploratory data analysis,
predictive modeling, descriptive modeling, data product creation, evaluation, and effective
communication. The focus in the treatment of these topics will be on breadth, rather than depth,
and emphasis will be placed on integration and synthesis of concepts and their application to
solving problems. To make the learning contextual, real datasets from a variety of disciplines will
be used.

Learning Outcomes
At the conclusion of the course, students should be able to:

▪ Describe what Data Science is and the skill sets needed to be a data scientist.
▪ Explain in basic terms what Statistical Inference means. Identify probability distributions
commonly used as foundations for statistical modeling. Fit a model to data.
▪ Use Python to carry out basic statistical modeling and analysis.
▪ Explain the significance of exploratory data analysis (EDA) in data science. Apply basic
tools (plots, graphs, summary statistics) to carry out EDA.

▪ Describe the Data Science Process and how its components interact.
▪ Use APIs and other tools to scrape the Web and collect data.
▪ Apply EDA and the Data Science process in a case study.
▪ Apply basic machine learning algorithms (Linear Regression, k-Nearest Neighbors (k-NN),
k-means, Naive Bayes) for predictive modeling. Explain why Linear Regression and k-NN
are poor choices for Filtering Spam. Explain why Naive Bayes is a better alternative.
▪ Identify common approaches used for Feature Generation. Identify basic Feature Selection
algorithms (Filters, Wrappers, Decision Trees, Random Forests) and use in applications.
▪ Identify and explain fundamental mathematical and algorithmic ingredients that constitute a
Recommendation Engine (dimensionality reduction, singular value decomposition, principal
component analysis). Build their own recommendation system using existing components.
▪ Create effective visualization of given data (to communicate or persuade).
▪ Work effectively (and synergistically) in teams on data science projects.
▪ Reason around ethical and privacy issues in data science conduct and apply ethical practices.
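
As an illustration of the spam-filtering outcome above, the following is a minimal sketch of a multinomial Naive Bayes classifier with Laplace (add-one) smoothing, written in plain Python. The training messages, the labels "spam" and "ham", and the function names are assumptions made up for this example; they are not course material.

```python
import math
from collections import Counter

# Toy training data, invented purely for illustration.
TRAIN = [
    ("win money now", "spam"),
    ("free money offer", "spam"),
    ("meeting schedule today", "ham"),
    ("project meeting notes", "ham"),
]

def train_naive_bayes(examples):
    """Count words per class and documents per class."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(text.split())
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, doc_counts, vocab

def classify(text, word_counts, doc_counts, vocab):
    """Return the label with the highest log posterior."""
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        # Log prior: fraction of training documents with this label.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # Laplace-smoothed log likelihood of the word given the label.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        scores[label] = score
    return max(scores, key=scores.get)

model = train_naive_bayes(TRAIN)
print(classify("free money", *model))
print(classify("meeting today", *model))
```

Working with word counts per class is what makes Naive Bayes a natural fit for text: each word contributes independently to the score, so the model copes with a large vocabulary without computing distances between whole messages.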
Prerequisites
Students are expected to have basic knowledge of algorithms and reasonable programming
experience and some familiarity with basic linear algebra (e.g., solution of linear systems and
eigenvalue/vector computation) and basic probability and statistics. If you are interested in taking
the course, but are not sure if you have the right background, talk to the instructor. You may still
be allowed to take the course if you are willing to put in the extra effort to fill in any gaps.

Topics and course outline:

1. Introduction to Data Science
▪ What is Data Science?
▪ The need for Data Science
▪ Jobs in Data Science
▪ Types of Jobs in Data Science
▪ Components of Data Science
▪ Work Flow of Data Science
▪ Life Cycle (Process) of Data Science
▪ BI (Business Intelligence) Vs. Data Science
▪ Applications of Data Science
▪ Toolboxes for Data Scientists
➢ Introduction
➢ Why Python
➢ Fundamental Python Libraries for Data Scientists
✓ Numeric and Scientific Computation: NumPy and SciPy
✓ scikit-learn: Machine Learning in Python
✓ pandas: Python Data Analysis Library
➢ Data Science Ecosystem Installation
➢ Integrated Development Environments (IDE)
✓ Web Integrated Development Environment (WIDE): Jupyter
➢ Get Started with Python for Data Scientists
✓ Reading, Selecting Data, Filtering Data, Filtering Missing Values, Manipulating
Data, Sorting, Grouping Data, Rearranging Data, Ranking Data and Plotting
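
The data-handling operations listed under "Get Started with Python for Data Scientists" can be sketched with pandas roughly as follows. The CSV text, column names, and values are hypothetical and exist only for this example; in practice the data would be read from a file with `pd.read_csv`. Plotting is left out because it additionally requires Matplotlib.

```python
import io
import pandas as pd

# Hypothetical records, made up for this example.
CSV_TEXT = """student,dept,score
A,CS,82
B,CS,
C,IT,67
D,IT,91
E,CS,74
"""

df = pd.read_csv(io.StringIO(CSV_TEXT))          # reading

selected = df[["student", "score"]]              # selecting columns
complete = df.dropna(subset=["score"])           # filtering missing values
high = complete[complete["score"] > 70]          # filtering rows by a condition
ordered = complete.sort_values("score", ascending=False)   # sorting
dept_mean = complete.groupby("dept")["score"].mean()       # grouping
ranked = complete.assign(                        # ranking (1 = highest score)
    rank=complete["score"].rank(ascending=False)
)

print(ordered)
print(dept_mean)
print(ranked)
```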
2. Data Exploration, Cleaning and Data visualization
▪ Exploratory Data Analysis (EDA)
▪ Data cleaning and preprocessing techniques
▪ Dealing with missing data and outliers
▪ Data Visualization
▪ Tools for data visualization (e.g., Matplotlib, Seaborn, ggplot2)
▪ Creating static and interactive visualizations
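
A minimal sketch of two of the cleaning steps named above, using only the Python standard library: filling missing values with the median, and flagging outliers with the common 1.5 × IQR rule. The data values and function names are assumptions for illustration, and other imputation or outlier rules may suit a given dataset better.

```python
import statistics

def fill_missing_with_median(values):
    """Replace None entries with the median of the observed values."""
    present = [v for v in values if v is not None]
    median = statistics.median(present)
    return [median if v is None else v for v in values]

def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical measurements with two missing entries and one extreme value.
data = [12, 14, 15, None, 13, 16, 14, 95, 15, None]
filled = fill_missing_with_median(data)
print(filled)
print(iqr_outliers(filled))
```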
3. Statistical Concepts in Data Science
3.1 Descriptive statistics
▪ Introduction
▪ Descriptive statistics
▪ Exploratory Data Analysis
▪ Estimation
✓ Sample and Estimated Mean, Variance and Standard
3.2 Inferential statistics and hypothesis testing
▪ Introduction
▪ Statistical Inference
▪ Measuring the Variability in Estimates
✓ Point Estimates
✓ Confidence Intervals
▪ Hypothesis Testing
✓ Testing Hypotheses Using Confidence Intervals
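
The link between point estimates, confidence intervals, and hypothesis testing can be sketched as follows. This is a simplified illustration that uses the normal approximation (z = 1.96 for roughly 95% confidence); for a sample this small a t-based interval would be more appropriate. The sample values and function names are hypothetical.

```python
import math
import statistics

def mean_confidence_interval(sample, z=1.96):
    """Approximate confidence interval for the mean (normal approximation)."""
    n = len(sample)
    mean = statistics.mean(sample)                      # point estimate
    std_err = statistics.stdev(sample) / math.sqrt(n)   # standard error of the mean
    return mean - z * std_err, mean + z * std_err

def rejects_null(sample, mu0, z=1.96):
    """Reject H0: mean == mu0 when mu0 falls outside the interval."""
    low, high = mean_confidence_interval(sample, z)
    return not (low <= mu0 <= high)

# Hypothetical measurements, invented for illustration.
sample = [4.8, 5.1, 5.0, 4.9, 5.2, 5.3, 4.7, 5.0, 5.1, 4.9]
low, high = mean_confidence_interval(sample)
print(f"mean = {statistics.mean(sample):.2f}, CI = ({low:.2f}, {high:.2f})")
print(rejects_null(sample, 5.0))
print(rejects_null(sample, 6.0))
```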

4. Machine learning
▪ Introduction
▪ Supervised learning (e.g., decision trees, random forests, support vector machines)
▪ Unsupervised learning (e.g., clustering, dimensionality reduction)
▪ Evaluation of machine learning models
▪ Three Basic Machine Learning Algorithms
✓ Linear Regression
✓ k-Nearest Neighbors (k-NN)
✓ k-means
▪ Machine Learning Algorithm and Usage in Applications
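
As one possible illustration of the k-Nearest Neighbors algorithm listed above, here is a short sketch in plain Python. It classifies a query point by majority vote among the k closest training points under Euclidean distance. The points, labels, and function name are made up for the example.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 2-D points with class labels.
train = [
    ((1.0, 1.0), "red"), ((1.5, 2.0), "red"), ((2.0, 1.0), "red"),
    ((6.0, 6.0), "blue"), ((7.0, 5.5), "blue"), ((6.5, 7.0), "blue"),
]
print(knn_predict(train, (1.2, 1.4)))
print(knn_predict(train, (6.4, 6.1)))
```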
5. Regression Analysis
▪ Introduction
▪ Linear regression
✓ Simple linear regression
✓ Multiple & Polynomial regression
▪ Sparse models
▪ Logistic regression
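
Simple linear regression can be sketched with the closed-form least-squares estimates: slope = Sxy / Sxx and intercept = mean(y) − slope × mean(x). The code below is a plain-Python illustration; the data points and function name are assumptions chosen so that the points lie exactly on y = 2x + 1.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) minimizing the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sxy: co-variation of x and y; Sxx: variation of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data lying exactly on the line y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]
slope, intercept = fit_simple_linear_regression(xs, ys)
print(slope, intercept)
```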
6. Unsupervised learning
▪ Introduction
▪ Clustering
✓ similarity and distances
✓ quality measures of clustering
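
The clustering topics above (distances and a quality measure) can be illustrated with a small k-means sketch in plain Python. It uses Euclidean distance and reports the within-cluster sum of squared errors (SSE) as one simple quality measure. The points, the caller-supplied initial centroids, and the function names are assumptions for this example; real implementations usually choose initial centroids randomly or with a seeding scheme.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm with caller-supplied initial centroids."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(p, centroids[i]),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                new_centroids.append(
                    tuple(sum(c) / len(cluster) for c in zip(*cluster))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, clusters

def within_cluster_sse(centroids, clusters):
    """Sum of squared distances from each point to its cluster centroid."""
    return sum(
        math.dist(p, c) ** 2
        for c, cluster in zip(centroids, clusters)
        for p in cluster
    )

# Hypothetical points forming two well-separated groups.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = kmeans(points, centroids=[(1.0, 1.0), (8.0, 8.0)])
print(centroids)
print(within_cluster_sse(centroids, clusters))
```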
7. Mining Social-Network Graphs
▪ Social networks as graphs
▪ Clustering of graphs
▪ Direct discovery of communities in graphs
▪ Partitioning of graphs
▪ Neighborhood properties in graphs
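
One neighborhood property of a graph, the set of nodes reachable within a given number of hops, can be sketched with a breadth-first search. The small undirected graph below, stored as an adjacency dictionary, and the function name are hypothetical and serve only as an illustration.

```python
from collections import deque

def neighborhood(graph, source, radius):
    """Return the set of nodes within `radius` hops of `source` (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == radius:
            continue  # do not expand beyond the requested radius
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return set(dist)

# Hypothetical friendship graph: a triangle A-B-C with a tail C-D-E.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}
print(sorted(neighborhood(graph, "A", 1)))
print(sorted(neighborhood(graph, "A", 2)))
```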
8. Recommendation Systems: Building a User-Facing Data Product
▪ Algorithmic ingredients of a Recommendation Engine
▪ Dimensionality Reduction
▪ Singular Value Decomposition
▪ Principal Component Analysis
▪ Exercise: build your own recommendation system
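
As a starting point for the exercise, the sketch below shows how a truncated singular value decomposition reduces a user-by-item rating matrix to a low-rank approximation, which is one common ingredient of a recommendation engine. It assumes NumPy is installed; the rating matrix and the function name are invented for illustration and this is not a complete recommender.

```python
import numpy as np

# Hypothetical ratings (rows: users, columns: items), made up for this example.
ratings = np.array([
    [5.0, 4.0, 1.0, 1.0],
    [4.0, 5.0, 1.0, 2.0],
    [1.0, 1.0, 5.0, 4.0],
    [2.0, 1.0, 4.0, 5.0],
])

def low_rank_approximation(matrix, k):
    """Keep only the k largest singular values of `matrix` (truncated SVD)."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k] @ np.diag(s[:k]) @ vt[:k, :]

approx = low_rank_approximation(ratings, k=2)
print(np.round(approx, 2))
```

With k = 2 the approximation keeps the two dominant taste patterns in the matrix and smooths out the rest. In a real system the entries of the approximation at unrated positions would be read as predicted ratings.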

9. Data Science and Ethical Issues
▪ Discussions on privacy, security, ethics
▪ A look back at Data Science
▪ Next-generation data scientists

Books
1. "Python for Data Analysis" by Wes McKinney
2. "Data Science for Business" by Foster Provost and Tom Fawcett
3. Introduction to Data Science: A Python Approach to Concepts, Techniques and
Applications, Igual, L.; Seguí, S. Springer, ISBN: 978-3-319-50016-4
4. Data Analysis with Python: A Modern Approach, David Taieb, Packt Publishing, ISBN:
9781789950069
5. Python Data Analysis, Second Ed., Armando Fandango, Packt Publishing, ISBN:
9781787127487
Software and Tools:
• Python (Jupyter Notebooks)
• R (optional)
• Data visualization tools (e.g., Matplotlib, Seaborn, ggplot2)
• Machine learning libraries (e.g., scikit-learn, TensorFlow, PyTorch)
Additional references and books related to the course:

• Jure Leskovec, Anand Rajaraman and Jeffrey Ullman. Mining of Massive Datasets. v2.1,
Cambridge University Press. 2014. (free online)
• Kevin P. Murphy. Machine Learning: A Probabilistic Perspective. ISBN 0262018020. 2013.
• Foster Provost and Tom Fawcett. Data Science for Business: What You Need to Know about
Data Mining and Data-analytic Thinking. ISBN 1449361323. 2013.
• Trevor Hastie, Robert Tibshirani and Jerome Friedman. Elements of Statistical Learning,
Second Edition. ISBN 0387952845. 2009. (free online)
• Avrim Blum, John Hopcroft and Ravindran Kannan. Foundations of Data Science. (Note:
this is a book currently being written by the three authors. The authors have made the first
draft of their notes for the book available online. The material is intended for a modern
theoretical course in computer science.)
• Mohammed J. Zaki and Wagner Meira Jr. Data Mining and Analysis: Fundamental Concepts
and Algorithms. Cambridge University Press. 2014.
• Jiawei Han, Micheline Kamber and Jian Pei. Data Mining: Concepts and Techniques, Third
Edition. ISBN 0123814790. 2011.
