Applied Statistics Notes
Session 1
Statistics is defined as a group of methods used to collect, analyze, present, and
interpret data in order to make decisions. The two main types of statistics are
descriptive and inferential.
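To make the descriptive/inferential distinction concrete, here is a minimal sketch using only Python's standard library. The sample values are made up for illustration, and the confidence interval assumes a normal approximation is acceptable, which may not hold for a sample this small.

```python
import math
import statistics

# Hypothetical sample: ten daily sales counts (made-up numbers).
sample = [12, 15, 11, 18, 14, 16, 13, 17, 15, 14]

# Descriptive statistics: summarize the data we actually observed.
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)  # sample standard deviation (n - 1)

# Inferential statistics: draw a conclusion about the wider population.
# Assumption: a normal approximation is good enough for this interval.
z = statistics.NormalDist().inv_cdf(0.975)  # roughly 1.96 for a 95% interval
margin = z * sample_sd / math.sqrt(len(sample))
ci_low, ci_high = sample_mean - margin, sample_mean + margin

print(f"mean = {sample_mean:.2f}, sd = {sample_sd:.2f}")
print(f"approx. 95% CI for the population mean: ({ci_low:.2f}, {ci_high:.2f})")
```

The mean and standard deviation only describe the ten observations. The interval is an inferential statement about the population they were drawn from.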
3 phases of analytics:
1. Descriptive - uses data aggregation and data mining to provide insight
into the past and answer: “What has happened?”
2. Predictive - uses statistical models and forecasting techniques to
understand the future and answer: “What could happen?”
3. Prescriptive - uses optimization and simulation algorithms to advise
on possible outcomes and answer: “What should we do?”
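The three phases can be sketched on one toy data set. This is only an illustration: the sales figures, the profit and waste costs, and the candidate stock levels are all hypothetical, and a straight-line trend stands in for whatever statistical model a real project would use.

```python
import statistics

# Hypothetical monthly unit sales (made-up numbers).
sales = [100, 110, 120, 130, 140, 150]
months = list(range(1, len(sales) + 1))

# Descriptive: "What has happened?" -> aggregate the past.
total_sales = sum(sales)
average_sales = statistics.mean(sales)

# Predictive: "What could happen?" -> fit a least-squares trend line
# and extrapolate one month ahead.
mean_x = statistics.mean(months)
mean_y = statistics.mean(sales)
slope = (
    sum((x - mean_x) * (y - mean_y) for x, y in zip(months, sales))
    / sum((x - mean_x) ** 2 for x in months)
)
intercept = mean_y - slope * mean_x
forecast = slope * (len(sales) + 1) + intercept

# Prescriptive: "What should we do?" -> choose the stock level that
# maximizes profit under the forecast (assumed profit and waste costs).
UNIT_PROFIT = 5.0
UNIT_WASTE_COST = 2.0

def profit(stock, demand):
    """Profit from selling up to `demand` units, minus the cost of unsold stock."""
    sold = min(stock, demand)
    return UNIT_PROFIT * sold - UNIT_WASTE_COST * (stock - sold)

candidates = [140, 150, 160, 170, 180]
best_stock = max(candidates, key=lambda s: profit(s, forecast))

print(f"total = {total_sales}, average = {average_sales}")
print(f"forecast for next month = {forecast:.1f}")
print(f"recommended stock level = {best_stock}")
```

Each phase builds on the previous one: the summary describes the history, the trend line projects it forward, and the simple search over stock levels turns that projection into a recommendation.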
CRISP-DM (Cross-Industry Standard Process for Data Mining)
Step 1 (business understanding) - identify the specific problem you want to address in this
data mining project.
Step 2 (data understanding) - collect data > describe data > explore data > verify data quality.
Step 3 (data preparation) - select data > clean data > construct data > format data.
Step 4 (modeling) - select modeling technique > generate test design > build model > assess
model.
Step 5 (evaluation) - evaluate results > review process > determine next steps.
Step 6 (deployment) - reuse the discovered patterns.
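The six steps can be walked through on a toy example. The sketch below is not a real CRISP-DM implementation. The records are made up, and a mean-only baseline stands in for a real modeling technique, purely to show where each step fits.

```python
import statistics

# Step 1: hypothetical business problem -> predict weekly demand.

# Step 2: collect and describe the data, then check its quality.
raw = [20, 22, None, 21, 25, 23, None, 24, 26, 22]  # made-up records; None = missing
n_missing = sum(1 for v in raw if v is None)

# Step 3: select and clean the data (here: drop missing records).
clean = [v for v in raw if v is not None]

# Step 4: pick a technique, design the test, build the model.
# Assumption: a mean-only baseline stands in for a real model.
split = int(len(clean) * 0.75)  # first 75% for training, the rest held out
train, test = clean[:split], clean[split:]
model_prediction = statistics.mean(train)

# Step 5: evaluate the model on the held-out data (mean absolute error).
mae = statistics.mean([abs(y - model_prediction) for y in test])

# Step 6: reuse the discovered pattern on new cases.
def predict():
    """Return the baseline prediction for any new week."""
    return model_prediction

print(f"missing records: {n_missing}, usable records: {len(clean)}")
print(f"baseline prediction: {model_prediction}, MAE on test set: {mae}")
```

In practice the process is iterative: a poor result in step 5 usually sends the project back to an earlier step rather than forward to step 6.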