Madan Mohan Malaviya University of Technology, Gorakhpur
INFORMATION TECHNOLOGY AND COMPUTER APPLICATION
“DATA SCIENCE”
Under the supervision of: Dr. R.K. Dwivedi, Assistant Professor
Presented by: PRIYA (Roll no. 2021104039, Course: MCA)
CONTENTS
• What is data science
• Data Science applications
• Need of Data Science
• Data Science components
• Data Science Life Cycle
• Solving problems with Data Science
17-03-23 Side 2
WHAT IS DATA SCIENCE?
• “Data science, also known as data-driven science, is an
interdisciplinary field of scientific methods, processes, algorithms and
systems to extract knowledge or insights from data in various forms,
either structured or unstructured, similar to data mining.”
• “Data science intends to analyze and understand actual phenomena with
‘data’. In other words, the aim of data science is to reveal the features or the
hidden structure of complicated natural, human, and social phenomena with
data from a different point of view from the established or traditional theory
and method.”
DATA SCIENCE APPLICATIONS
• Image recognition and speech recognition
• Gaming world
• Internet search
• Transport
• Healthcare
• Recommendation systems
• Risk detection
NEED OF DATA SCIENCE
Data Science enables companies to efficiently make sense of very large volumes of data from multiple sources and derive valuable insights to make smarter data-driven decisions. Data Science is widely used in various industry domains, including marketing, healthcare, finance, banking, policy work, and more.
In today's world, almost everyone uses the internet, and by a commonly cited estimate each person generates about 1.7 MB of data every second.
Data is becoming vast: approximately 2.5 quintillion bytes of data are generated every day, which has led to a data explosion. Data Science exists to handle this kind of big data.
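To put these figures in perspective, the sketch below works through the arithmetic. It takes the 1.7 MB per person per second estimate at face value and reads the daily total as 2.5 quintillion (2.5 × 10^18) bytes. The decimal unit sizes (1 GB = 1000 MB, 1 EB = 10^18 bytes) and the function names are assumptions made for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day


def per_person_daily_gb(mb_per_second: float = 1.7) -> float:
    """Data generated by one person in a day, in gigabytes (1 GB = 1000 MB)."""
    return mb_per_second * SECONDS_PER_DAY / 1000


def bytes_to_exabytes(n_bytes: float) -> float:
    """Convert a byte count to exabytes (1 EB = 10**18 bytes)."""
    return n_bytes / 10**18


if __name__ == "__main__":
    print(f"{per_person_daily_gb():.1f} GB per person per day")
    print(f"{bytes_to_exabytes(2.5e18):.1f} EB generated per day")
```

Under these assumptions, one person accounts for roughly 147 GB per day, and the daily worldwide figure corresponds to 2.5 exabytes.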
DATA SCIENCE COMPONENTS
DATA SCIENCE LIFECYCLE
1. Discovery: The first phase is discovery, which involves asking the right questions. When you start any data science project, you need to determine the basic requirements, priorities, and project budget.
2. Data preparation: Data preparation is also known as data wrangling. In this phase, we need to perform the following tasks:
• Data cleaning
• Data reduction
• Data integration
• Data transformation
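The four data preparation tasks can be sketched in a few lines of plain Python. This is a minimal illustration, not a prescribed method: the records, field names, and helper functions below are all hypothetical, and a real project would more likely use a library such as pandas.

```python
from statistics import mean

# Hypothetical raw records from two sources (all values are made up).
customers = [
    {"id": 1, "age": 34, "city": "Gorakhpur "},
    {"id": 2, "age": None, "city": "lucknow"},
    {"id": 2, "age": None, "city": "lucknow"},  # duplicate record
    {"id": 3, "age": 52, "city": "Kanpur"},
]
orders = {1: 250.0, 2: 90.0, 3: 410.0}  # customer id -> total spend


def clean(rows):
    """Data cleaning: drop duplicate ids, fill missing ages with the mean, tidy text."""
    fill = mean(r["age"] for r in rows if r["age"] is not None)
    seen, out = set(), []
    for r in rows:
        if r["id"] in seen:
            continue
        seen.add(r["id"])
        out.append({
            "id": r["id"],
            "age": r["age"] if r["age"] is not None else fill,
            "city": r["city"].strip().title(),
        })
    return out


def integrate(rows, spend_by_id):
    """Data integration: attach each customer's spend from the second source."""
    return [{**r, "spend": spend_by_id.get(r["id"], 0.0)} for r in rows]


def reduce_columns(rows, keep=("age", "spend")):
    """Data reduction: keep only the attributes needed for the analysis."""
    return [{k: r[k] for k in keep} for r in rows]


def transform(rows, column):
    """Data transformation: min-max scale one column to the range [0, 1]."""
    values = [r[column] for r in rows]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant column
    return [{**r, column: (r[column] - lo) / span} for r in rows]


def prepare(rows, spend_by_id):
    """Run the four steps in one possible order."""
    rows = clean(rows)
    rows = integrate(rows, spend_by_id)
    rows = reduce_columns(rows)
    return transform(rows, "spend")


if __name__ == "__main__":
    for row in prepare(customers, orders):
        print(row)
```

The order used here (clean, integrate, reduce, transform) is one reasonable choice; in practice the steps are often repeated or interleaved as problems in the data are discovered.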
3. Model planning: In this phase, we determine the methods and techniques for establishing the relationships between input variables. We apply exploratory data analysis (EDA), using statistical formulas and visualization tools, to understand the relationships between variables and to see what the data can tell us. Common tools used for model planning are:
i. SQL Server Analysis Services
ii. SAS
iii. R
iv. Python
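As a small illustration of the statistical side of EDA, the sketch below summarizes two variables and measures the strength of their linear relationship with the Pearson correlation coefficient. The sample values and the variable names (`hours_studied`, `exam_score`) are hypothetical, and Pearson correlation is only one of many statistics that could be used at this stage.

```python
from math import sqrt
from statistics import mean, stdev


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length with at least 2 points")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (sx * sy)


def summarize(name, values):
    """One-line summary of a numeric variable."""
    return (f"{name}: mean={mean(values):.2f}, stdev={stdev(values):.2f}, "
            f"min={min(values)}, max={max(values)}")


# Hypothetical sample data.
hours_studied = [1, 2, 3, 4, 5, 6]
exam_score = [52, 55, 61, 64, 70, 74]

if __name__ == "__main__":
    print(summarize("hours_studied", hours_studied))
    print(summarize("exam_score", exam_score))
    print(f"correlation: {pearson(hours_studied, exam_score):.3f}")
```

A coefficient close to +1 or -1 suggests a strong linear relationship worth modelling, while a value near 0 suggests the two variables are not linearly related.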
4. Model building: In this phase, the process of model building starts. We create datasets for training and testing purposes, and apply techniques such as association, classification, and clustering to build the model. Following are some common model-building tools:
• SAS Enterprise Miner
• WEKA
• SPSS Modeler
• MATLAB
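The idea of splitting data into training and testing sets and then building a classifier can be sketched without any of the tools listed above. The example below uses a nearest-centroid classifier, chosen here only because it is short; the data points, labels, and function names are hypothetical.

```python
import random
from math import dist


def train_test_split(rows, test_ratio=0.25, seed=0):
    """Shuffle a copy of the rows and split it into (train, test)."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    n_test = max(1, int(len(rows) * test_ratio))
    return rows[n_test:], rows[:n_test]


def fit_centroids(train):
    """Training: average the feature vectors of each class."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(v / counts[label] for v in acc)
            for label, acc in sums.items()}


def predict(centroids, features):
    """Classification: pick the class whose centroid is closest."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


def accuracy(centroids, test):
    """Fraction of test rows whose predicted label matches the true label."""
    hits = sum(predict(centroids, f) == y for f, y in test)
    return hits / len(test)


# Hypothetical, well-separated two-class dataset of (features, label) pairs.
data = (
    [((x, y), "small") for x in (1.0, 1.5, 2.0) for y in (1.0, 2.0)]
    + [((x, y), "large") for x in (8.0, 8.5, 9.0) for y in (8.0, 9.0)]
)

if __name__ == "__main__":
    train, test = train_test_split(data)
    model = fit_centroids(train)
    print("test accuracy:", accuracy(model, test))
```

Evaluating on rows the model has not seen during training is the reason for keeping a separate test set: accuracy measured on the training rows alone would overstate how well the model generalizes.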
5. Operationalize: In this phase, we deliver the final reports of the project, along with briefings, code, and technical documents. This phase gives a clear overview of the complete project's performance and other components on a small scale before full deployment.
6. Communicate results: In this phase, we check whether we have reached the goal set in the initial phase. We communicate the findings and final results to the business team.
PROBLEM SOLVING IN DATA SCIENCE
These are the most common types of problems that occur in data science. In data science, problems are solved using algorithms. The diagram on this slide shows the applicable algorithms for possible questions.
UNDERSTANDING THE DIFFERENCE BETWEEN AI, ML AND DATA SCIENCE
Let's imagine we're building a self-driving car and trying to make it stop at stop signs. For that, we need all three: data science, artificial intelligence, and machine learning.
Machine learning
The car should recognize stop signs using its cameras. So we need to create a dataset with millions of photos of streetside objects and train an algorithm to recognize which ones have stop signs in them.
Artificial intelligence
As soon as the car recognizes a stop sign, it should start applying the brakes. The car should brake at the right time, not too early or too late. Plus, we should account for different road conditions, such as a slippery road. This is a control theory problem.
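To make the braking decision concrete, here is a deliberately simplified sketch. It is not a real controller: it assumes constant deceleration limited by tyre friction, so the stopping distance from speed v is v² / (2·μ·g). The friction coefficients, the safety margin, and the function names are hypothetical values chosen for illustration.

```python
G = 9.81  # gravitational acceleration in m/s^2


def stopping_distance(speed_mps: float, friction: float) -> float:
    """Distance in metres needed to stop from speed_mps, assuming constant
    deceleration friction * G (simplified model: v^2 / (2 * mu * g))."""
    if friction <= 0:
        raise ValueError("friction coefficient must be positive")
    return speed_mps ** 2 / (2 * friction * G)


def should_brake(distance_to_sign_m: float, speed_mps: float,
                 friction: float, margin_m: float = 2.0) -> bool:
    """Brake once the remaining distance is no longer than the stopping
    distance plus a safety margin."""
    return distance_to_sign_m <= stopping_distance(speed_mps, friction) + margin_m


if __name__ == "__main__":
    speed = 15.0  # m/s, about 54 km/h
    for road, mu in (("dry", 0.7), ("slippery", 0.3)):  # assumed coefficients
        print(road, round(stopping_distance(speed, mu), 1), "m to stop;",
              "brake now at 30 m:", should_brake(30.0, speed, mu))
```

With these assumed values, the same car at the same speed and distance must already be braking on the slippery road while it can still wait on the dry one, which is why road conditions have to be part of the decision.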
Data science
During all these tests, we see that sometimes our car doesn't react to stop signs. By analyzing the test data, we find that the number of false results depends on the time of day: our car tends to miss stop signs at night. Then we see that most of the training data shows objects in full daylight, so we can add some nighttime pictures and go back to training.
That's it. That's how machine learning, artificial intelligence, and data science relate to one another.
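The analysis step in this example amounts to grouping test results by a condition and comparing error rates. A minimal sketch, assuming a hypothetical test log of (time of day, detected) pairs with made-up values:

```python
from collections import defaultdict

# Hypothetical test log: (time of day, whether the stop sign was detected).
test_log = [
    ("day", True), ("day", True), ("day", True), ("day", False),
    ("night", True), ("night", False), ("night", False), ("night", False),
]


def miss_rate_by_group(log):
    """Fraction of missed stop signs for each group (here: time of day)."""
    totals, misses = defaultdict(int), defaultdict(int)
    for group, detected in log:
        totals[group] += 1
        if not detected:
            misses[group] += 1
    return {group: misses[group] / totals[group] for group in totals}


if __name__ == "__main__":
    for group, rate in miss_rate_by_group(test_log).items():
        print(f"{group}: {rate:.0%} of stop signs missed")
```

A clearly higher miss rate in one group points to where the training data is thin, which in the example leads to collecting more nighttime images.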
CONCLUSION
Data Science has become one of the most in-demand jobs of the 21st century. Many organizations are looking for candidates with knowledge of data science. Data science can add value to any business that can use its data well. From statistics and insights across workflows and hiring new candidates, to helping senior staff make better-informed decisions, data science is valuable to any company in any industry.