Statistical Modeling For Data Analysis
GROUP TWO
Cont.….
What is data analysis?
Data analysis is the process of
collecting, cleaning, sorting, and processing
raw data to extract relevant and valuable
information to help businesses. An in-depth
understanding of data can improve customer
experience, retention, and targeting, reduce
operational costs, and strengthen problem-solving
methods.
Cont…
Data analysis embraces a whole range of activities, both qualitative
and quantitative.
In behavioral research, quantitative analysis is used heavily, and statistical
methods and techniques are employed.
Statistical methods and techniques hold a special position in research
because they provide answers to the problems.
Statistical Modeling
a database,
clouds,
social media, or
within a plain Excel file.
Data Analysis Process
Data Collection: Guided by your identified requirements, it’s time to collect the data
from your sources. Sources include case studies, surveys, interviews, questionnaires,
direct observation, and focus groups. Make sure to organize the collected data for
analysis.
Data Cleaning: Not all of the data you collect will be useful, so it’s time to clean it
up. This process is where you remove white spaces, duplicate records, and basic
errors. Data cleaning is mandatory before sending the information on for analysis.
Data Analysis: Here is where you use data analysis software and other tools to help
you interpret and understand the data and arrive at conclusions. Data analysis tools
include Excel, Python, R, Looker, RapidMiner, Chartio, Metabase, Redash, and
Microsoft Power BI.
Data Interpretation: Now that you have your results, you need to interpret them and
come up with the best courses of action based on your findings.
Data Visualization: Data visualization is a fancy way of saying, “graphically show
your information in a way that people can read and understand it.” You can use
charts, graphs, maps, bullet points, or a host of other methods. Visualization helps
you derive valuable insights by helping you compare datasets and observe relationships.
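The Data Cleaning step above (removing white spaces, duplicate records, and basic errors) can be sketched in plain Python. This is only an illustrative sketch: the function name `clean_records`, the field names, and the sample rows are made up for the example, and "basic errors" is assumed here to mean a missing or empty required field.

```python
def clean_records(records, required_fields):
    """Strip white space, drop rows with empty required fields, remove duplicates."""
    cleaned = []
    seen = set()
    for record in records:
        # Trim surrounding white space from every string value.
        row = {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}
        # Assumption: a missing or empty required field counts as a basic error.
        if any(row.get(field) in (None, "") for field in required_fields):
            continue
        # Two rows are duplicates if all their trimmed fields match.
        key = tuple(sorted(row.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


# Hypothetical raw rows, invented purely for illustration.
raw = [
    {"name": " Alice ", "city": "Nairobi"},
    {"name": "Alice", "city": "Nairobi "},   # duplicate once trimmed
    {"name": "", "city": "Kampala"},         # empty required field
    {"name": "Bob", "city": "Kampala"},
]
cleaned = clean_records(raw, ["name", "city"])
print(cleaned)
```

Trimming before the duplicate check matters: the first two rows differ only in stray spaces, so they are recognized as the same record only after the white space is removed.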
Types of Data Analysis
Ideally, analysts find similar patterns that existed in the past
and use the solutions applied then to resolve present
challenges.
Predictive Analysis: Predictive analysis answers the question,
“What is most likely to happen?” By using patterns found in
older data as well as current events, analysts predict future
events. While there’s no such thing as 100 percent accurate
forecasting, the odds improve if the analysts have plenty of
detailed information and the discipline to research it thoroughly.
Prescriptive Analysis: Mix all the insights gained from the other
data analysis types, and you have prescriptive analysis.
Sometimes, an issue can’t be solved solely with one analysis
type, and instead requires multiple insights.
CONT…
A Hypothesis testing
Testing hypotheses and drawing generalizations about the
population from the sample data are examples of inferential
statistics. Doing so requires formulating a null hypothesis and
an alternative hypothesis, then performing a statistical test
of significance.
A hypothesis test can be left-tailed, right-tailed, or
two-tailed. The value of the test statistic, the critical value,
and confidence intervals are used to reach a conclusion. Below
are a few significant hypothesis tests that are employed in
inferential statistics.
CONT…
Z Test:
The z test is applied when the data are normally distributed and the sample
size is at least 30. With the population variance known, it tests whether the
population mean equals a hypothesized value μ0, based on the sample mean. The
following setup can be used for a right-tailed test:
Null Hypothesis: H0: μ=μ0
Alternate hypothesis: H1: μ>μ0
Test Statistic: Z = (x̄ − μ0) / (σ / √n)
where,
x̄ = sample mean
μ0 = hypothesized population mean (the value stated in H0)
σ = standard deviation of the population
n = sample size
Decision Criteria: If the z statistic > z critical value, reject the null
hypothesis.
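The right-tailed z test above can be sketched in Python using only the standard library. The function name `right_tailed_z_test`, the significance level of 0.05, and the input numbers are assumptions chosen for illustration; they do not come from real data.

```python
from math import sqrt
from statistics import NormalDist


def right_tailed_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Return (z statistic, critical value, reject H0?) for H1: mu > mu0."""
    # Test statistic: distance of the sample mean from mu0 in standard errors.
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Right-tailed critical value of the standard normal distribution.
    z_critical = NormalDist().inv_cdf(1 - alpha)
    # Decision criterion: reject H0 when z exceeds the critical value.
    return z, z_critical, z > z_critical


# Illustrative inputs only (assumed values).
z, z_crit, reject = right_tailed_z_test(sample_mean=52.0, mu0=50.0, sigma=6.0, n=36)
print(f"z = {z:.3f}, critical value = {z_crit:.3f}, reject H0: {reject}")
```

With these inputs the standard error is 6 / √36 = 1, so z = 2.0, which exceeds the 5% critical value of about 1.645 and leads to rejecting the null hypothesis.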
B Regression analysis
Regression analysis is done to calculate how one variable
will change in relation to another. Numerous regression
models can be used, including simple linear, multiple
linear, nominal, logistic, and ordinal regression.
Regression Coefficients:
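The straight-line equation is given as $y = \alpha + \beta x$, where $\alpha$
and $\beta$ are the regression coefficients.
$$\beta = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$$
$$\beta = r_{xy}\,\frac{\sigma_y}{\sigma_x}$$
$$\alpha = \bar{y} - \beta\bar{x}$$
Here, $\bar{x}$ is the mean and $\sigma_x$ is the standard deviation of the
first data set. Similarly, $\bar{y}$ is the mean and $\sigma_y$ is the
standard deviation of the second data set, and $r_{xy}$ is the correlation
coefficient between the two data sets.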
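The coefficient formulas above can be checked with a short Python sketch. It computes β from the sums of deviations, computes it again from the correlation coefficient and the two standard deviations, and then derives α. The function name `regression_coefficients` and the sample points are hypothetical; the points are chosen to lie exactly on y = 1 + 2x so the result is easy to verify by eye.

```python
from math import sqrt


def regression_coefficients(xs, ys):
    """Return (alpha, beta, beta_from_r) for the line y = alpha + beta * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of cross-deviations and squared deviations about the means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    # Slope from the sums, then intercept from the means.
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    # Same slope via the correlation coefficient and standard deviations.
    r_xy = sxy / sqrt(sxx * syy)
    sigma_x = sqrt(sxx / n)
    sigma_y = sqrt(syy / n)
    beta_from_r = r_xy * sigma_y / sigma_x
    return alpha, beta, beta_from_r


# Hypothetical points lying exactly on y = 1 + 2x.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
alpha, beta, beta_from_r = regression_coefficients(xs, ys)
print(f"alpha = {alpha:.3f}, beta = {beta:.3f}, beta via r = {beta_from_r:.3f}")
```

The two slope values agree because the factor of n in the standard deviations cancels in the ratio σy / σx, so either formula can be used.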