NORMALITY TEST
LINDA AYU RIZKA PUTRI, S.K.M., M.SC.
DATA SCALE REFRESHMENT
Normality test
• A test used to determine whether
  sample data has been drawn from a
  normally distributed population
• A number of statistical tests, such as
  the Student's t-test and the one-way
  and two-way ANOVA, assume that the
  sample was drawn from a normally
  distributed population.
Selecting normality test
• Analytically
• Graphically
Kolmogorov-Smirnov
• The Kolmogorov-Smirnov test is used to test the null hypothesis
  that a set of data comes from a normal distribution. The
  test produces a test statistic that is used (along with a
  degrees of freedom parameter) to test for normality.
• The Kolmogorov-Smirnov test is used for n ≥ 50.
• A large deviation from normality gives a low p-value. As a rule
  of thumb, we reject the null hypothesis if p < 0.05. So if
  p < 0.05, we do not believe that our variable follows a normal
  distribution in our population.
Decision-making process in the normality test with
Kolmogorov-Smirnov
1. If the Asymp. Sig. value > 0.05, then the research data are normally distributed.
2. If the Asymp. Sig. value < 0.05, then the research data are not normally distributed.
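The decision rule above can be sketched in Python. This is a minimal illustration, not the SPSS procedure: the function names `ks_normality` and `decide` are hypothetical, the normal distribution is fitted with the sample mean and SD, and the p-value comes from the asymptotic Kolmogorov distribution without the Lilliefors correction that SPSS applies in its Tests of Normality table, so the p-values will generally differ from SPSS output.

```python
import math
from statistics import NormalDist, mean, stdev


def ks_normality(data):
    """Return (D, approximate p-value) for a one-sample KS test against a normal."""
    xs = sorted(data)
    n = len(xs)
    # Assumption: the mean and SD of the reference normal are estimated from the sample.
    fitted = NormalDist(mean(xs), stdev(xs))
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = fitted.cdf(x)
        # Largest gap between the empirical CDF and the fitted CDF,
        # checked just after and just before the step at x.
        d = max(d, i / n - f, f - (i - 1) / n)
    # Asymptotic Kolmogorov distribution (no Lilliefors correction).
    lam = math.sqrt(n) * d
    series = sum((-1) ** (k - 1) * math.exp(-2 * k * k * lam * lam)
                 for k in range(1, 101))
    p = min(1.0, max(0.0, 2 * series))
    return d, p


def decide(p, alpha=0.05):
    """Apply the slide's rule: p > alpha means we keep the normality assumption."""
    return "normally distributed" if p > alpha else "not normally distributed"


if __name__ == "__main__":
    # Illustrative values only, not data from the lecture.
    sample = [4.1, 5.0, 5.3, 5.9, 6.2, 6.8, 7.1, 7.4, 8.0, 9.2]
    d, p = ks_normality(sample)
    print(f"D = {d:.3f}, p = {p:.3f}: {decide(p)}")
```

A large D (a large deviation from the fitted normal curve) pushes the p-value down, which is what the rule of thumb relies on.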
Performing KS in SPSS
Shapiro-Wilk
• The Shapiro-Wilk test is the more appropriate method for small
  sample sizes (< 50 samples), although it can also handle
  larger sample sizes, while the Kolmogorov-Smirnov test is used
  for n ≥ 50.
• The Shapiro-Wilk test checks the normality of the data.
  Its null hypothesis is that your data are normal. If the
  p-value of the test is less than 0.05, you reject the null
  hypothesis at the 5% significance level and conclude that
  your data are non-normal.
Shapiro-Wilk in SPSS
Interpreting result
1. If the Sig. value (p-value) > 0.05, then the research data are normally distributed.
2. If the Sig. value (p-value) ≤ 0.05, then the research data are not normally distributed.
Normal probability plot
• The normal probability plot (Chambers et al., 1983) is a
  graphical technique for assessing whether or not a data set
  is approximately normally distributed. If the points fall
  close to a straight line, the data are approximately normal.
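The coordinates of a normal probability plot can be computed by hand, as in the sketch below. The plotting positions (i - 0.5)/n are one common convention and are an assumption here (statistical packages may use a slightly different formula), and the helper names `normal_plot_points` and `straightness` are hypothetical.

```python
import math
from statistics import NormalDist, mean


def normal_plot_points(data):
    """Pair each sorted observation with its theoretical standard-normal quantile."""
    xs = sorted(data)
    n = len(xs)
    z = NormalDist()  # standard normal: mean 0, SD 1
    # Assumption: plotting position (i - 0.5) / n for the i-th smallest value.
    return [(z.inv_cdf((i - 0.5) / n), x) for i, x in enumerate(xs, start=1)]


def straightness(points):
    """Pearson correlation of the plotted points; near 1 means near a straight line."""
    qs = [q for q, _ in points]
    xs = [x for _, x in points]
    mq, mx = mean(qs), mean(xs)
    sxy = sum((q - mq) * (x - mx) for q, x in points)
    sqq = sum((q - mq) ** 2 for q in qs)
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / math.sqrt(sqq * sxx)


if __name__ == "__main__":
    # Illustrative values only, not data from the lecture.
    sample = [4.1, 5.0, 5.3, 5.9, 6.2, 6.8, 7.1, 7.4, 8.0, 9.2]
    pts = normal_plot_points(sample)
    for q, x in pts:
        print(f"{q:6.2f}  {x:5.1f}")
    print("correlation:", round(straightness(pts), 3))
```

Plotting the pairs as a scatter plot (theoretical quantile on one axis, observed value on the other) gives the normal probability plot; the correlation is only a rough numeric companion to the visual check.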
Histogram
Decision
In univariate analysis
• Data follow a normal distribution: report mean ± SD
• Data violate the normal distribution: report median (IQR)
Bivariate/multivariate analysis
• Data follow a normal distribution = parametric test
• Data violate the normal distribution = non-parametric test
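As a small sketch of the univariate rule, the function below reports mean ± SD or median (IQR) depending on the outcome of the normality check. The name `summarize` and the `is_normal` flag are hypothetical, and the quartiles use the default method of Python's `statistics.quantiles`, which may differ slightly from the quartiles SPSS or Excel report.

```python
from statistics import mean, median, quantiles, stdev


def summarize(data, is_normal):
    """Return the descriptive summary suggested by the normality decision."""
    if is_normal:
        # Normal data: mean ± standard deviation.
        return f"{mean(data):.2f} ± {stdev(data):.2f}"
    # Non-normal data: median with the interquartile range (Q1-Q3).
    q1, _, q3 = quantiles(data, n=4)
    return f"{median(data):.2f} ({q1:.2f}-{q3:.2f})"


if __name__ == "__main__":
    # Illustrative values only, not data from the lecture.
    values = [1, 2, 3, 4, 5]
    print(summarize(values, is_normal=True))
    print(summarize(values, is_normal=False))
```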
How to select appropriate test
Assumption tests (normality, independence, etc.)
• Normal distribution → parametric statistics
• Non-normal distribution → log transformation
  • Log-transformed distribution is normal → parametric statistics,
    then exponentiate (anti-log) the result
  • Log-transformed distribution is not normal → non-parametric statistics
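The log-transformation branch of the flowchart can be illustrated with the simplest parametric statistic, the mean: compute it on the log scale, then take the anti-log to return to the original units. This is a sketch under the assumption that all values are positive and that natural logarithms are used; the function name `back_transformed_mean` is hypothetical.

```python
import math
from statistics import mean


def back_transformed_mean(data):
    """Mean on the log scale, then anti-log (this equals the geometric mean)."""
    logs = [math.log(x) for x in data]  # log transformation (requires x > 0)
    log_mean = mean(logs)               # parametric statistic on the log scale
    return math.exp(log_mean)           # exponentiate back to the original units


if __name__ == "__main__":
    # Illustrative values only, not data from the lecture.
    print(back_transformed_mean([1, 10, 100]))
```

The same idea applies to confidence limits computed on the log scale: each limit is exponentiated separately before reporting.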
Question?
Skewness and kurtosis (next week, normal
plot in Excel)
Skewness
• Skewness is the degree of asymmetry, or departure from
  symmetry, of a distribution.
• If skewness is less than -1 or greater than 1, the distribution is highly
  skewed.
• If skewness is between -1 and -0.5 or between 0.5 and 1, the
  distribution is moderately skewed.
• If skewness is between -0.5 and 0.5, the distribution is approximately
  symmetric.
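The thresholds above can be applied in code. The sketch assumes the moment-based skewness coefficient (third central moment divided by the 1.5 power of the second central moment); SPSS and Excel use a bias-adjusted sample formula, so their values differ slightly for small samples. The function names are hypothetical, and values exactly at ±0.5 or ±1 are assigned to the milder category, which is a choice the slide leaves open.

```python
from statistics import mean


def skewness(data):
    """Moment-based skewness: m3 / m2**1.5 (assumed formula, see lead-in)."""
    n = len(data)
    m = mean(data)
    m2 = sum((x - m) ** 2 for x in data) / n  # second central moment
    m3 = sum((x - m) ** 3 for x in data) / n  # third central moment
    return m3 / m2 ** 1.5


def classify_skewness(value):
    """Apply the rule-of-thumb cut-offs at 0.5 and 1."""
    if abs(value) > 1:
        return "highly skewed"
    if abs(value) > 0.5:
        return "moderately skewed"
    return "approximately symmetric"


if __name__ == "__main__":
    # Illustrative values only, not data from the lecture.
    for sample in ([1, 2, 3, 4, 5], [1, 1, 1, 1, 10]):
        g = skewness(sample)
        print(sample, round(g, 2), classify_skewness(g))
```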
Kurtosis
• Kurtosis, the peakedness of a
  data distribution, is the degree
  or measure of how high or low the
  peak of a distribution is relative
  to the normal distribution.
• When kurtosis is equal to
  3, the distribution is
  mesokurtic (medium peak):
  its kurtosis is the same as
  that of the normal
  distribution.
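A minimal sketch of the kurtosis calculation, assuming the moment-based formula (fourth central moment divided by the squared second central moment), for which the normal distribution gives 3. Note that SPSS and Excel's KURT function report a bias-adjusted excess kurtosis, where the reference value for the normal distribution is 0 instead of 3. The function names are hypothetical.

```python
from statistics import mean


def kurtosis(data):
    """Moment-based kurtosis: m4 / m2**2 (normal distribution = 3)."""
    n = len(data)
    m = mean(data)
    m2 = sum((x - m) ** 2 for x in data) / n  # second central moment
    m4 = sum((x - m) ** 4 for x in data) / n  # fourth central moment
    return m4 / m2 ** 2


def excess_kurtosis(data):
    """Kurtosis relative to the normal distribution (normal = 0)."""
    return kurtosis(data) - 3


if __name__ == "__main__":
    # Illustrative values only, not data from the lecture.
    sample = [1, 2, 3, 4, 5]
    print(round(kurtosis(sample), 2), round(excess_kurtosis(sample), 2))
```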
Confidence interval
A confidence interval is the mean of your estimate plus and
minus the variation in that estimate. This is the range of values
you expect your estimate to fall between if you redo your test,
within a certain level of confidence. Confidence, in statistics, is
another way to describe probability.
• If the confidence interval crosses zero, there is no significant
  difference between the groups being compared.
  Example: CI = -2 to 0.5
• Conversely, if the confidence interval does not contain zero, there
  is a significant difference between the groups.
  Example: CI = 0.5 to 2.3
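The zero-crossing rule can be sketched for the difference between two group means. This is an illustration under stated assumptions: a normal (z) approximation is used for the critical value, whereas a t-based interval is more appropriate for small samples and is what statistical packages typically report. The function names `mean_difference_ci` and `significant` are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def mean_difference_ci(group_a, group_b, level=0.95):
    """Approximate CI for mean(group_a) - mean(group_b), using a z critical value."""
    diff = mean(group_a) - mean(group_b)
    # Standard error of the difference, without assuming equal variances.
    se = math.sqrt(variance(group_a) / len(group_a)
                   + variance(group_b) / len(group_b))
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return diff - z * se, diff + z * se


def significant(ci):
    """True when the interval does not contain zero."""
    low, high = ci
    return not (low <= 0 <= high)


if __name__ == "__main__":
    # The two example intervals from the slide.
    print(significant((-2, 0.5)))
    print(significant((0.5, 2.3)))
    # Illustrative values only, not data from the lecture.
    ci = mean_difference_ci([10, 11, 12, 13, 14], [1, 2, 3, 4, 5])
    print(ci, significant(ci))
```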