Basic Concepts in Research
Basic Concepts in Statistics
Statistics in Psychology
Statistics is used in psychology to conduct research in the field.
Most educational programs in psychology require a thesis as an essential
part of the degree, and these theses involve statistical analyses carried
out with the use of statistics.
Statistics is used to determine the sample size, to measure the correlation between
different variables, and to assess the role of gender and other factors.
Further, statistics enables us to measure the effect of one or more independent
variables on a dependent variable through regression analysis.
Mediation and moderation analyses are also conducted with the use of statistics.
Psychologists also develop scales and tests to measure different factors with the use of
statistics.
Measures of Central Tendency
A measure of central tendency is a single value that describes the
manner in which a group of data clusters around a central value. In
other words, one single value summarizes the behavior of the whole
data group.
The three measures of central tendency are
Mean
Median
Mode
Mean, Median & Mode
• The mean of a data series is the sum of the values of all the
observations divided by the number of observations. It is the
most commonly used measure of central tendency.
• Median is the central or the middle value of a data series. In other
words, it is the mid value of a series that divides it into two parts such
that one half of the series has the values greater than the Median
whereas the other half has values lower than the Median. For the
calculation of Median, we need to arrange the data series either in
ascending order or descending order.
• Mode refers to the value that occurs most often, that is, the maximum
number of times in a data series.
Formula
Mean = Sum of observations / Number of observations
Median = {(n+1)/2}th term when n is odd
Median = [(n/2)th term + {(n/2)+1}th term]/2 when n is even
Mode = Value repeated the maximum number of times
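The notes themselves contain no code, but as an illustration, the following minimal sketch computes the three measures on a small made-up data series using Python's standard-library statistics module (the values are hypothetical, not from the text).

```python
import statistics

scores = [4, 7, 7, 8, 10, 12, 15]  # hypothetical data series (n = 7, odd)

mean = statistics.mean(scores)      # sum of observations / number of observations
median = statistics.median(scores)  # middle value of the sorted series
mode = statistics.mode(scores)      # value repeated the maximum number of times

print(mean, median, mode)  # 9.0 8 7
```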
Measures of Dispersion
Range: the difference between the highest and lowest values
The range is calculated by subtracting the value of the lowest data point from the
value of the highest data point. For example, in a sample of children between
the ages of 2 and 6 years the range would be 4 years. When reporting the
range, researchers typically report the lowest and highest value (Range = 2 -
6 years of age).
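As a small illustrative sketch of the range calculation (the ages below are a hypothetical sample chosen to mirror the 2-6 years example):

```python
# Range = highest value minus lowest value; ages are a hypothetical sample
ages = [2, 3, 3, 4, 5, 6]

data_range = max(ages) - min(ages)
print(data_range)  # 4 (years)
print(f"Range = {min(ages)} - {max(ages)} years of age")  # reporting style described above
```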
Standard Deviation
The standard deviation measures how much the observations within a data set vary
from each other. In general, interpreting the standard deviation follows these rules of
thumb:
The lower the standard deviation, the closer the values are to the mean and
the less variability there is.
The higher the standard deviation, the farther the values are spread from the mean
and the more variability there is.
Variance
The variance is a commonly used measure of dispersion for variables. It is
calculated by squaring the standard deviation and is based on the
square of the difference between the value for each observation and the
mean value.
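To illustrate the relationship between the two measures, the sketch below (hypothetical values, using NumPy rather than SPSS) shows that the variance equals the square of the standard deviation.

```python
import numpy as np

scores = np.array([4, 7, 7, 8, 10, 12, 15])  # hypothetical data series

sd = scores.std(ddof=1)        # sample standard deviation
variance = scores.var(ddof=1)  # sample variance

print(round(sd, 3), round(variance, 3))   # 3.651 13.333
print(np.isclose(variance, sd ** 2))      # True: variance = SD squared
```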
Normality Test
Tests for normality calculate the probability that the sample was
drawn from a normal population.
The hypotheses used are:
Ho: The sample data is not significantly different from the normal
population.
Ha: The sample data is significantly different from the normal
population.
When testing for normality:
• Probabilities > 0.05 indicate that the data are normal.
• Probabilities < 0.05 indicate that the data are NOT normal.
SPSS runs two statistical tests of normality:
Kolmogorov-Smirnov and Shapiro-Wilk.
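The notes refer to SPSS output; the sketch below runs the same two tests in Python with SciPy on hypothetical data (note that SPSS applies the Lilliefors correction to the Kolmogorov-Smirnov test, which plain SciPy does not).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.normal(loc=50, scale=10, size=100)  # hypothetical scores

# Shapiro-Wilk test
sw_stat, sw_p = stats.shapiro(sample)

# Kolmogorov-Smirnov test against a normal distribution with the sample's mean and SD
ks_stat, ks_p = stats.kstest(sample, "norm", args=(sample.mean(), sample.std(ddof=1)))

print(f"Shapiro-Wilk p = {sw_p:.3f}")        # p > 0.05 -> data look normal
print(f"Kolmogorov-Smirnov p = {ks_p:.3f}")  # p < 0.05 would indicate non-normal data
```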
If the skewness is between -0.5 and 0.5, the data are nearly
symmetrical. If the skewness is between -1 and -0.5 (negatively
skewed) or between 0.5 and 1 (positively skewed), the data are slightly
skewed. If the skewness is lower than -1 (negatively skewed) or
greater than 1 (positively skewed), the data are extremely skewed.
Kurtosis is a measure of whether the distribution is too peaked (a
very narrow distribution with most of the responses in the center). A
positive value for the kurtosis indicates a distribution more peaked
than normal. In contrast, a negative kurtosis indicates a shape flatter
than normal.
Values for asymmetry and kurtosis between -2 and +2 are
considered acceptable in order to demonstrate a normal univariate
distribution (George & Mallery, 2010). Hair et al. (2010) and Byrne
(2010) argued that data are considered to be normal if skewness
is between -2 and +2 and kurtosis is between -7 and +7.
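As an illustration, the sketch below (hypothetical data, SciPy rather than SPSS) computes skewness and kurtosis and checks them against the George & Mallery (2010) rule of thumb.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
sample = rng.normal(loc=0, scale=1, size=200)  # hypothetical scores

skew = stats.skew(sample)
kurt = stats.kurtosis(sample)  # excess kurtosis: 0 corresponds to a normal curve

print(f"skewness = {skew:.2f}, kurtosis = {kurt:.2f}")
if -2 <= skew <= 2 and -2 <= kurt <= 2:
    print("Within the -2 to +2 range considered acceptable (George & Mallery, 2010)")
```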
Parametric Test
In statistics, a parametric test is a kind of hypothesis test that
makes generalizations about the mean of the original population. The
t-test, based on Student's t-statistic, is the most commonly used
parametric test.
The t-test rests on the underlying assumption that the variable is
normally distributed and that the mean is known, or is considered to be
known. The population variance is estimated from the sample. It is
also assumed that the variables of concern are measured on an
interval scale.
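As an example of a parametric test, the sketch below runs an independent-samples t-test on two hypothetical groups using SciPy (the scores are made up for illustration only).

```python
from scipy import stats

group_a = [23, 25, 28, 30, 31, 27, 26]  # hypothetical scores, group A
group_b = [20, 22, 21, 24, 23, 19, 25]  # hypothetical scores, group B

# Independent-samples t-test; equal variances are assumed by default
t_stat, p_value = stats.ttest_ind(group_a, group_b)

print(f"t = {t_stat:.2f}, p = {p_value:.3f}")  # p < 0.05 -> group means differ significantly
```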
Non-Parametric Test
A non-parametric test does not require the population
distribution to be described by specific parameters. It is also a kind of
hypothesis test, but one that is not based on distributional assumptions. In the
case of a non-parametric test, the comparison is based on differences in the
median, so this kind of test is also called a distribution-free test. The test
variables are measured at the nominal or ordinal level. If the
independent variables are non-metric, a non-parametric test is usually
performed.
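As a non-parametric counterpart to the t-test sketch above, the following example compares the same two hypothetical groups with the Mann-Whitney U test, which does not assume normality.

```python
from scipy import stats

group_a = [23, 25, 28, 30, 31, 27, 26]  # hypothetical scores, group A
group_b = [20, 22, 21, 24, 23, 19, 25]  # hypothetical scores, group B

# Mann-Whitney U test: compares two independent groups without assuming normality
u_stat, p_value = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")

print(f"U = {u_stat:.1f}, p = {p_value:.3f}")  # p < 0.05 -> the groups differ significantly
```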
Homogeneity
Homogeneity is the level of uniformity among sampling units within
a population. Homogeneity is commonly interpreted as meaning that all the
items in the sample are chosen because they have similar or identical traits
(for example, people in a homogeneous sample might share the same
age, location, or employment).
Homogeneity of variance
• Homogeneity of variance is an assumption underlying both t tests
and F tests (analyses of variance, ANOVAs) in which the
population variances (i.e., the distribution, or “spread,” of scores
around the mean) of two or more samples are considered equal.
• A homogeneity hypothesis test formally tests if the populations
have equal variances. Many statistical hypothesis tests and
estimators of effect size assume that the variances of the
populations are equal.
Levene’s test is an equal variance test. It can be used to check if our
data sets fulfill the homogeneity of variance assumption before we
perform the t-test or Analysis of Variance (ANOVA).
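As an illustration, the sketch below runs Levene's test on two hypothetical groups with SciPy before a t-test or ANOVA would be carried out.

```python
from scipy import stats

group_a = [23, 25, 28, 30, 31, 27, 26]  # hypothetical scores, group A
group_b = [20, 22, 21, 24, 23, 19, 25]  # hypothetical scores, group B

w_stat, p_value = stats.levene(group_a, group_b)

print(f"W = {w_stat:.2f}, p = {p_value:.3f}")
# p > 0.05 -> the homogeneity of variance assumption holds
# p < 0.05 -> variances differ; consider Welch's t-test instead
```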
The standard error (SE) measures how accurately a sample
represents a population, using the standard deviation of the sampling
distribution. In other words, it describes the dispersion of sample
means around the population mean, and it is obtained by dividing the
sample standard deviation by the square root of the sample size.
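A minimal sketch of the standard error of the mean (SE = SD / sqrt(n)), computed on a hypothetical data series both by hand and with scipy.stats.sem:

```python
import math
from scipy import stats

scores = [4, 7, 7, 8, 10, 12, 15]  # hypothetical data series

sd = stats.tstd(scores)                  # sample standard deviation
se_manual = sd / math.sqrt(len(scores))  # SE = SD / sqrt(n)
se_scipy = stats.sem(scores)             # same value computed directly by SciPy

print(round(se_manual, 3), round(se_scipy, 3))  # both ≈ 1.380
```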