DATA ANALYTICS REV.
Quantitative data
● Can be expressed as numbers, counted, or compared on numerical scales
● Examples: number of customers, sales, inventory, tourist arrivals, satisfaction ratings
Qualitative data
● Can be expressed as textual descriptions, maps, pictures, transcripts
Data: raw facts that are unprocessed or unanalyzed
Information: data that has been processed
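The difference between data and information can be sketched in a few lines. This is only an illustration: the ratings and the names `raw_ratings`, `average_rating`, and `satisfied_share` are made up, and the cutoff of 4 for "satisfied" is an assumption.

```python
from statistics import mean

# Raw data: unprocessed satisfaction ratings (1-5) as collected from a survey.
# These values are made up for illustration.
raw_ratings = [5, 4, 2, 5, 3, 4, 5, 1, 4, 5]

# Processing: summarize the raw values into figures a decision-maker can use.
average_rating = mean(raw_ratings)
satisfied_share = sum(r >= 4 for r in raw_ratings) / len(raw_ratings)

print(f"Average rating: {average_rating:.1f}")
print(f"Share of satisfied customers (rating >= 4): {satisfied_share:.0%}")
```

The list on its own is data; the average and the share are information because they answer a question.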
Data analytics
● Refers to the process and practice of analyzing data to answer questions, extract
insights, and identify trends (Harvard Business School Online, n.d.)
● The science of analyzing raw data to make conclusions about that information
(Frankenfield, 2020)
Goal of data analytics
● To cultivate a culture of asking good questions and letting the data provide the answers.
4 types of data analytics
Descriptive (what happened?)
● This is the method of analyzing historical data to identify patterns and trends
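A minimal sketch of descriptive analytics, assuming a hypothetical set of monthly tourist arrivals (the figures and variable names are invented for illustration). It only describes what happened; it does not explain why.

```python
# Hypothetical monthly tourist arrivals (historical data).
arrivals = {"Jan": 1200, "Feb": 1350, "Mar": 1500, "Apr": 1450, "May": 1700}

months = list(arrivals)
# Month-over-month change: the difference from the previous month.
changes = {
    months[i]: arrivals[months[i]] - arrivals[months[i - 1]]
    for i in range(1, len(months))
}
peak_month = max(arrivals, key=arrivals.get)

for month, change in changes.items():
    direction = "up" if change > 0 else "down" if change < 0 else "flat"
    print(f"{month}: {direction} {abs(change)}")
print(f"Peak month: {peak_month}")
```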
Diagnostic (why something happened?)
● This involves more diverse data inputs and a bit of hypothesizing.
Predictive (what is likely to happen?)
● The deployment of advanced analytic techniques that allow organizations to anticipate
trends and take advantage of opportunities
● Example: the recommendation algorithms used by platforms like Netflix.
Prescriptive (what are we going to do about it?)
● This is where the organization is provided with the best possible answer given all
business constraints
● Travel agencies employ prescriptive analytics to recommend personalized travel
itineraries to customers.
(3 ERAS of data analytics)
Analytics 1.0 (BUSINESS INTELLIGENCE) (1950-2009)
● This was the era of the enterprise data warehouse, operational data store, data
marts etc.
● Data sources were relatively small and structured.
● Data had to be stored in enterprise warehouses before it could be analyzed.
● Sourced internally.
● Most of the activity involved descriptive analytics or some form of reporting.
Analytics 2.0 (BIG DATA ANALYTICS) (2010)
New technologies also began to be employed:
● NoSQL databases
● Hadoop clusters
● In-memory analytics
● In-database analytics
● Machine learning
● Visual analytics
Analytics 3.0 (FAST IMPACT FOR DATA ECONOMY) (2010)
● An environment that combines Analytics 1.0 and 2.0, that is, a blend of
traditional and big data analytics.
● Every company can create data and analytics-based products and services
● Rise of Prescriptive analytics
● This connects data generated at the edge with data that is stored in enterprise
data centers.
Attributes of Analytics 3.0 are described as:
● Combination of large and small volumes of data
● Internally and externally sourced data
● Structured and unstructured data
● Combines in-database and in-memory analytics with agile analytical methods
and machine learning techniques to produce insights faster
● Data science, analytics, and IT teams work together
● A Chief Analytics Officer role is established
(ETHICS IN HANDLING DATA)
INFORMED CONSENT
● Ensure that participants give informed consent, with a clear understanding of the purpose
of the study, their involvement, and any potential risks or benefits.
CONFIDENTIALITY
● Ensure that participants' personal data and responses are kept confidential and that any
identifying information is protected.
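One common way to protect identifying information is to replace it with a pseudonym before analysis. The sketch below is an assumption about how that might look, not a prescribed method: `SECRET_KEY`, `pseudonymize`, and the sample responses are hypothetical, and pseudonymization alone does not make data fully anonymous.

```python
import hashlib
import hmac

# Hypothetical secret key; in practice it would be stored separately from the data.
SECRET_KEY = b"replace-with-a-private-key"

def pseudonymize(identifier: str) -> str:
    """Return a keyed SHA-256 digest that stands in for the real identifier."""
    digest = hmac.new(SECRET_KEY, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:12]  # shortened for readability

# Made-up survey responses that contain an identifying field.
responses = [
    {"email": "ana@example.com", "rating": 5},
    {"email": "ben@example.com", "rating": 3},
]

# Keep the answers, drop the identifying field, and add a stable pseudonym.
protected = [
    {"participant": pseudonymize(r["email"]), "rating": r["rating"]}
    for r in responses
]
print(protected)
```

The same email always maps to the same pseudonym, so responses can still be linked without storing the email itself.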
AUTONOMY
● Respect participants' decisions to participate or withdraw from the study without any
negative consequences.
AVOIDING HARM
● Take measures to minimize any potential harm or discomfort to participants during data
collection, for example by avoiding distressing survey questions.
HONEST REPORTING
● Report findings accurately and honestly, avoiding manipulation or misrepresentation of
data.
CULTURAL SENSITIVITY
● Be culturally sensitive when working with diverse populations, respecting cultural norms,
values, and traditions.
(BIG DATA)
3 Vs:
VOLUME
VARIETY
VELOCITY
DATA PREPROCESSING
● This is an essential stage in data mining and analysis, when raw data is converted into a
format that can be comprehended and analyzed by computers and machine learning
algorithms.
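A small sketch of what preprocessing can involve, assuming hypothetical survey rows with messy text and a missing value. The steps shown (trimming text, converting types, filling a missing value with the mean, min-max scaling) are common choices, not the only ones.

```python
# Hypothetical raw survey rows: inconsistent casing, stray spaces, a missing value.
raw_rows = [
    {"city": " Manila ", "spend": "1200"},
    {"city": "cebu", "spend": ""},
    {"city": "DAVAO", "spend": "800"},
]

def clean(row):
    """Normalize text and convert spend to a number (None if missing)."""
    spend = row["spend"].strip()
    return {
        "city": row["city"].strip().title(),
        "spend": float(spend) if spend else None,
    }

cleaned = [clean(r) for r in raw_rows]

# Fill missing spend with the mean of the known values (one simple choice).
known = [r["spend"] for r in cleaned if r["spend"] is not None]
mean_spend = sum(known) / len(known)
for r in cleaned:
    if r["spend"] is None:
        r["spend"] = mean_spend

# Min-max scale spend to the range 0..1.
lo, hi = min(r["spend"] for r in cleaned), max(r["spend"] for r in cleaned)
for r in cleaned:
    r["spend_scaled"] = (r["spend"] - lo) / (hi - lo)

print(cleaned)
```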
(STATISTICAL TEST)
COMPARISON TEST
Tests for differences in the means, medians, or rankings of scores of two or more groups
T-test
Used to compare the means of two groups
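As a sketch, the t statistic for two independent groups can be computed by hand. This example uses Welch's version of the test, which does not assume equal variances; that choice, the function name `welch_t`, and the scores are assumptions made for illustration. A statistics package would also report the p-value.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Return (t, degrees of freedom) for Welch's unequal-variance t-test."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Made-up scores for two groups.
group_a = [82, 85, 88, 90, 79, 84]
group_b = [75, 78, 80, 72, 77, 74]

t, df = welch_t(group_a, group_b)
print(f"t = {t:.2f}, df = {df:.1f}")
```

A large absolute value of t suggests the two group means differ by more than chance alone would explain.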
Analysis of variance (ANOVA)
Used to compare the means of three or more groups
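A minimal sketch of the F statistic behind a one-way ANOVA, which compares variation between groups with variation within groups. The function name `one_way_anova_f` and the three small groups are hypothetical.

```python
from statistics import mean

def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each value around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Made-up scores for three groups.
f, df_b, df_w = one_way_anova_f([4, 5, 6], [6, 7, 8], [8, 9, 10])
print(f"F = {f:.2f} with ({df_b}, {df_w}) degrees of freedom")
```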
Correlation test
Tests that determine the extent to which two variables are associated.
Pearson’s r
Measures the strength and direction of a linear relationship between two numeric variables.
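The coefficient can be sketched directly from its definition: the covariance of the two variables divided by the product of their spreads. The function name `pearson_r` and the data are made up for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Return Pearson's correlation coefficient for two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Made-up data in which sales rise in a straight line with ad spend.
ad_spend = [10, 20, 30, 40, 50]
sales = [25, 45, 65, 85, 105]

r = pearson_r(ad_spend, sales)
print(f"r = {r:.2f}")
```

Values near +1 or -1 indicate a strong linear relationship; values near 0 indicate a weak one.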
Spearman's rho
Measures the strength and direction of the relationship between two variables using their
rank order, not their actual values.
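A sketch of the rank-based idea: convert each variable to ranks, then compute the usual correlation on the ranks. The helper names `ranks`, `pearson_r`, and `spearman_rho` and the data are hypothetical; tied values receive the average of their ranks.

```python
from math import sqrt

def ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(ranks(x), ranks(y))

# Made-up data: the relationship is always increasing but not a straight line.
hours = [1, 2, 3, 4, 5]
score = [1, 4, 9, 16, 25]

rho = spearman_rho(hours, score)
print(f"rho = {rho:.2f}")
```

Because only the ordering matters, a curved but consistently increasing relationship still gives a rank correlation of 1.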
Chi-square test of independence
Checks whether two categorical variables are related or independent of each other.
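A minimal sketch of the chi-square statistic, which compares the observed counts in a contingency table with the counts expected if the two variables were independent. The function name `chi_square_independence` and the table are invented for illustration.

```python
def chi_square_independence(table):
    """Return (chi2, df) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return chi2, df

# Made-up counts: rows = customer type, columns = satisfied / not satisfied.
observed = [[30, 10], [20, 40]]

chi2, df = chi_square_independence(observed)
print(f"chi-square = {chi2:.2f}, df = {df}")
```

The larger the statistic relative to its degrees of freedom, the stronger the evidence that the two variables are related.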
Regression test
Tests that estimate how changes in predictor variables relate to changes in an outcome
variable. On their own they show association; a causal claim also needs a suitable study design.
Simple linear regression
- Measures the relationship between 1 predictor and 1 outcome
- Both variables are numeric
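A sketch of fitting a line by ordinary least squares, assuming made-up data on nights stayed and amount spent. The function name `fit_line` is hypothetical.

```python
def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    intercept = my - slope * mx
    return slope, intercept

# Made-up data: nights stayed (predictor) and amount spent (outcome).
nights = [1, 2, 3, 4, 5]
spend = [120, 190, 260, 330, 400]

slope, intercept = fit_line(nights, spend)
print(f"spend = {slope:.1f} * nights + {intercept:.1f}")
```

The slope is the estimated change in the outcome for each one-unit increase in the predictor.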
Multiple Linear Regression
- Measures the relationship between 2 or more predictors and 1 outcome
- All variables are numeric
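With more than one predictor, the least-squares coefficients can be found by solving the normal equations. The sketch below does this in plain Python. The helper names `solve` and `fit_multiple` and the data are hypothetical, and a numerical library would normally be used instead.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_multiple(rows, y):
    """Return least-squares coefficients [intercept, b1, b2, ...]."""
    X = [[1.0] + list(r) for r in rows]  # leading 1.0 gives the intercept
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * t for row, t in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

# Made-up data generated from: spend = 10 + 3 * nights + 2 * group_size
rows = [(1, 1), (2, 1), (3, 2), (4, 3), (5, 2), (6, 4)]
y = [10 + 3 * n + 2 * g for n, g in rows]

coefs = fit_multiple(rows, y)
print([round(c, 3) for c in coefs])
```

Each coefficient estimates the change in the outcome for a one-unit change in that predictor while the other predictors are held constant.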