Statistical Inference and Hypothesis Testing: Concepts, Examples, and Unified R Script
Main Takeaway: Statistical inference uses sample data to draw conclusions about a population.
Hypothesis testing formalizes decision-making by balancing Type I and Type II errors. This guide covers
key concepts and common tests, and provides a single R script that implements theory and application.
1. Population vs. Sample
A population is the entire set of entities of interest (e.g., all adult heights in a country). A sample is a
subset drawn from the population, used to estimate population parameters.
Population parameter (e.g., μ, σ) is unknown.
Sample statistic (e.g., $\bar{x}$, $s$) estimates the parameter.
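The distinction can be illustrated with a small simulation. This is a minimal sketch, assuming a hypothetical population of 100,000 adult heights that follow a normal distribution with mean 170 cm and standard deviation 10 cm; these numbers and all variable names are illustrative, not real data.

```r
set.seed(42)  # make the random draws reproducible

# Hypothetical population of adult heights in cm (all values are assumptions)
population <- rnorm(100000, mean = 170, sd = 10)

# Parameters: computable here only because the whole population was simulated.
# sd() divides by n - 1, which is negligible at this population size.
mu    <- mean(population)
sigma <- sd(population)

# Simple random sample of 50 people, drawn without replacement
heights_sample <- sample(population, size = 50)

# Sample statistics that estimate mu and sigma
x_bar <- mean(heights_sample)
s     <- sd(heights_sample)

cat("mu    =", round(mu, 2),    " x_bar =", round(x_bar, 2), "\n")
cat("sigma =", round(sigma, 2), " s     =", round(s, 2), "\n")
```

In practice only the sample is observed, so x_bar and s are all that is available for inference about μ and σ.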
2. Null and Alternative Hypotheses
Null hypothesis (H₀): The default claim (e.g., μ = μ₀).
Alternative hypothesis (H₁ or Hₐ): Contradicts H₀ (e.g., μ ≠ μ₀, μ > μ₀, or μ < μ₀).
3. Significance Level and Errors
Significance level (α): Maximum probability of a false positive (rejecting H₀ when true).
Type I error: Rejecting H₀ when it is true; its probability is at most α.
Type II error: Failing to reject H₀ when it is false; its probability is β. Power = 1 − β.
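Both error rates can be estimated by simulation. The sketch below assumes normally distributed data, n = 25 per simulated study, α = 0.05, and a true mean of 0.5 when H₀ is false; all of these settings are illustrative choices rather than values from this guide.

```r
set.seed(1)      # reproducible simulation
alpha  <- 0.05   # assumed significance level
n_sims <- 2000   # number of simulated studies per scenario
n      <- 25     # assumed sample size per study

# Scenario 1: H0 (mu = 0) is true, so every rejection is a Type I error
p_null <- replicate(n_sims, t.test(rnorm(n, mean = 0, sd = 1), mu = 0)$p.value)
type1_rate <- mean(p_null < alpha)

# Scenario 2: H0 is false (true mean 0.5), so every non-rejection is a Type II error
p_alt <- replicate(n_sims, t.test(rnorm(n, mean = 0.5, sd = 1), mu = 0)$p.value)
power_est <- mean(p_alt < alpha)
beta_est  <- 1 - power_est

cat("Estimated Type I error rate:", type1_rate, "\n")
cat("Estimated power:", power_est, " estimated beta:", beta_est, "\n")
```

The estimated Type I error rate should land near α, while the power depends on the sample size and on how far the true mean lies from the null value.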
4. Confidence Intervals
A (1 − α)·100% confidence interval for a parameter is a range that, under repeated sampling,
contains the true parameter 100·(1 − α)% of the time.
For a mean with known σ:
$$\bar{x} \pm z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
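For unknown σ: replace σ with the sample standard deviation s and use the t critical value $t_{n-1,\,1-\alpha/2}$ in place of $z_{1-\alpha/2}$.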
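Both intervals can be computed directly from summary statistics. The following sketch borrows n = 30, a sample mean of 128, and s = 15 from the blood-pressure example in Section 5; treating σ = 15 as known in the first interval is an assumption made only to show the z-based formula.

```r
n     <- 30
x_bar <- 128
s     <- 15      # sample standard deviation
sigma <- 15      # ASSUMPTION: sigma treated as known, for illustration only
alpha <- 0.05    # gives a 95% interval

# Known sigma: z critical value from the standard normal distribution
z_crit <- qnorm(1 - alpha / 2)
ci_z   <- x_bar + c(-1, 1) * z_crit * sigma / sqrt(n)

# Unknown sigma: t critical value with n - 1 degrees of freedom
t_crit <- qt(1 - alpha / 2, df = n - 1)
ci_t   <- x_bar + c(-1, 1) * t_crit * s / sqrt(n)

cat("z interval:", round(ci_z, 2), "\n")
cat("t interval:", round(ci_t, 2), "\n")
```

The t interval is slightly wider because the t critical value exceeds the z critical value for any finite number of degrees of freedom.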
5. One-Sample t-Test
Purpose: Test H₀: μ = μ₀ for a single group when σ is unknown.
Test statistic: $t = (\bar{x} - \mu_0)/(s/\sqrt{n})$.
For a two-sided test, compare $|t|$ to $t_{n-1,\,1-\alpha/2}$.
Example: Is the average systolic blood pressure of a sample (n = 30, $\bar{x} = 128$, s = 15) equal to
120?
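The example above can be worked from its summary statistics alone. This sketch assumes a two-sided test at α = 0.05, which the example does not state; R's built-in t.test() would need the raw observations, so the statistic is computed by hand here.

```r
# Summary statistics from the blood-pressure example
n     <- 30
x_bar <- 128
s     <- 15
mu0   <- 120
alpha <- 0.05   # ASSUMPTION: two-sided test at the 5% level

se      <- s / sqrt(n)              # standard error of the mean
t_stat  <- (x_bar - mu0) / se       # test statistic
df      <- n - 1                    # degrees of freedom
t_crit  <- qt(1 - alpha / 2, df)    # two-sided critical value
p_value <- 2 * pt(-abs(t_stat), df) # two-sided p-value

reject_h0 <- abs(t_stat) > t_crit

cat("t =", round(t_stat, 3), " critical value =", round(t_crit, 3), "\n")
cat("p-value =", signif(p_value, 3), " reject H0:", reject_h0, "\n")
```

With raw data in a vector, say a hypothetical `bp`, the equivalent call would be `t.test(bp, mu = 120)`.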
6. One-Sample Proportion Test
Purpose: Test H₀: p = p₀ for a proportion.
Statistic: $z = (\hat{p} - p_0)/\sqrt{p_0(1 - p_0)/n}$.
Example: In 200 voters, 120 favor candidate A. Test p = 0.5.
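The voter example can be sketched as follows, assuming a two-sided alternative (the example does not specify one). The hand-computed z statistic is compared with prop.test() without continuity correction, whose chi-square statistic equals z squared.

```r
x  <- 120   # voters favoring candidate A
n  <- 200   # voters surveyed
p0 <- 0.5   # proportion under H0

p_hat   <- x / n
z       <- (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
p_value <- 2 * pnorm(-abs(z))   # ASSUMPTION: two-sided alternative

# Built-in test; correct = FALSE turns off the continuity correction
# so that its X-squared statistic matches z^2
builtin <- prop.test(x, n, p = p0, correct = FALSE)

cat("z =", round(z, 3), " p-value =", signif(p_value, 3), "\n")
cat("prop.test X-squared =", round(unname(builtin$statistic), 3), "\n")
```

By default prop.test() applies a continuity correction, so its output differs slightly from the plain z formula unless `correct = FALSE` is set.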
7. Paired-Sample t-Test
Purpose: Compare means of two related samples (before/after).
Compute differences $d_i$. Test H₀: $\mu_d = 0$ via one-sample t-test on d.
Example: Weight before and after a diet for 20 subjects.
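The equivalence between the paired test and a one-sample test on the differences can be checked numerically. The weights below are simulated, hypothetical data for 20 subjects (an assumed average loss of about 2 kg), not measurements from any study.

```r
set.seed(2024)
n_subjects <- 20

# Hypothetical weights in kg; all distribution settings are assumptions
before <- rnorm(n_subjects, mean = 85, sd = 10)
after  <- before - rnorm(n_subjects, mean = 2, sd = 3)

d <- before - after   # per-subject differences d_i

# Paired t-test on the two related samples
paired_res <- t.test(before, after, paired = TRUE)

# One-sample t-test of H0: mu_d = 0 on the differences
diff_res <- t.test(d, mu = 0)

# Same statistic computed from the formula
t_manual <- mean(d) / (sd(d) / sqrt(n_subjects))

cat("paired t =", round(unname(paired_res$statistic), 4),
    " one-sample t on d =", round(unname(diff_res$statistic), 4),
    " manual t =", round(t_manual, 4), "\n")
```

All three values agree, and the degrees of freedom are the number of pairs minus one.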
8. Independent Samples t-Test
Purpose: Compare means of two independent groups.
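If the two population variances can be assumed equal, use the pooled-variance t-test; otherwise, use Welch’s t-test.
Pooled statistic, where $s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$ is the pooled variance:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2\,(1/n_1 + 1/n_2)}}$$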
Example: Test whether male and female test scores differ.
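A sketch of both variants on simulated, hypothetical test scores (group sizes, means, and standard deviations are assumptions). The pooled statistic is computed by hand and compared with t.test(); Welch's test is what t.test() runs by default.

```r
set.seed(123)

# Hypothetical test scores; all distribution settings are assumptions
male   <- rnorm(30, mean = 72, sd = 10)
female <- rnorm(35, mean = 76, sd = 10)

n1 <- length(male)
n2 <- length(female)

# Pooled variance and pooled-variance t statistic
sp2      <- ((n1 - 1) * var(male) + (n2 - 1) * var(female)) / (n1 + n2 - 2)
t_manual <- (mean(male) - mean(female)) / sqrt(sp2 * (1 / n1 + 1 / n2))

# Built-in tests: pooled (equal variances assumed) and Welch (default)
pooled <- t.test(male, female, var.equal = TRUE)
welch  <- t.test(male, female)

cat("manual pooled t =", round(t_manual, 4),
    " t.test pooled t =", round(unname(pooled$statistic), 4), "\n")
cat("pooled df =", unname(pooled$parameter),
    " Welch df =", round(unname(welch$parameter), 2), "\n")
```

Welch's test adjusts the degrees of freedom downward when the sample variances differ, which is why it is a common default when equal variances cannot be justified.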
9. Two-Sample Proportion Test
Purpose: Compare two independent proportions.
Pooled $\hat{p} = (x_1 + x_2)/(n_1 + n_2)$.
$z = (\hat{p}_1 - \hat{p}_2)/\sqrt{\hat{p}(1 - \hat{p})(1/n_1 + 1/n_2)}$.
Example: Compare defect rates at two factories.
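The defect-rate comparison can be sketched with made-up counts: 30 defects in 500 units at one factory and 45 in 500 at the other. These counts are hypothetical, and a two-sided alternative is assumed.

```r
# Hypothetical defect counts (assumptions, not real data)
x1 <- 30; n1 <- 500   # factory 1
x2 <- 45; n2 <- 500   # factory 2

p1_hat <- x1 / n1
p2_hat <- x2 / n2
p_pool <- (x1 + x2) / (n1 + n2)   # pooled proportion under H0: p1 = p2

z       <- (p1_hat - p2_hat) / sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
p_value <- 2 * pnorm(-abs(z))     # two-sided p-value

# Built-in equivalent; without continuity correction, X-squared equals z^2
builtin <- prop.test(c(x1, x2), c(n1, n2), correct = FALSE)

cat("z =", round(z, 4), " p-value =", signif(p_value, 3), "\n")
cat("prop.test X-squared =", round(unname(builtin$statistic), 4), "\n")
```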
10. One-Way Analysis of Variance (ANOVA)
Purpose: Test mean equality across k > 2 groups.
H₀: all group means equal.
$F = MS_{\text{between}} / MS_{\text{within}}$.
Example: Compare average yield for three fertilizer types.
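The F ratio can be built from its sums of squares and checked against aov(). The yields below are simulated, hypothetical data for three fertilizer types with 8 plots each; the data frame name `crops` and all distribution settings are assumptions.

```r
set.seed(7)

# Hypothetical yields for three fertilizer types, 8 plots each
crops <- data.frame(
  fertilizer = factor(rep(c("A", "B", "C"), each = 8)),
  yield      = c(rnorm(8, 20, 2), rnorm(8, 22, 2), rnorm(8, 25, 2))
)

k <- nlevels(crops$fertilizer)   # number of groups
N <- nrow(crops)                 # total observations

grand_mean  <- mean(crops$yield)
group_means <- tapply(crops$yield, crops$fertilizer, mean)
group_sizes <- tapply(crops$yield, crops$fertilizer, length)

# Between-group and within-group sums of squares
ss_between <- sum(group_sizes * (group_means - grand_mean)^2)
ss_within  <- sum((crops$yield - ave(crops$yield, crops$fertilizer))^2)

ms_between <- ss_between / (k - 1)
ms_within  <- ss_within / (N - k)
f_manual   <- ms_between / ms_within
p_manual   <- pf(f_manual, k - 1, N - k, lower.tail = FALSE)

# Built-in one-way ANOVA
fit       <- aov(yield ~ fertilizer, data = crops)
f_builtin <- summary(fit)[[1]][["F value"]][1]

cat("manual F =", round(f_manual, 4), " aov F =", round(f_builtin, 4), "\n")
cat("p-value =", signif(p_manual, 3), "\n")
```

A significant F only indicates that at least one group mean differs; identifying which ones requires a follow-up comparison such as `TukeyHSD(fit)`.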
11. Chi-Square Test
Purpose: Test association in a contingency table or goodness-of-fit.
Independence test:
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
Goodness-of-fit: Compare observed counts to expected.
Example: Gender vs. preference for product variants.
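Both uses of the chi-square test can be sketched with hypothetical counts. The 2 × 3 table of gender by product variant and the goodness-of-fit counts below are invented for illustration; the expected counts for the independence test are row total × column total / grand total.

```r
# Hypothetical contingency table: gender by preferred product variant
observed <- matrix(
  c(30, 20,    # variant A: Female, Male
    25, 35,    # variant B
    15, 25),   # variant C
  nrow = 2,
  dimnames = list(gender = c("Female", "Male"), variant = c("A", "B", "C"))
)

# Expected counts under independence
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

chi2_manual <- sum((observed - expected)^2 / expected)
df_manual   <- (nrow(observed) - 1) * (ncol(observed) - 1)
p_manual    <- pchisq(chi2_manual, df_manual, lower.tail = FALSE)

# Built-in test of independence
indep <- chisq.test(observed)

# Goodness-of-fit: hypothetical counts against equal expected proportions
counts <- c(18, 22, 20, 40)
gof    <- chisq.test(counts, p = rep(0.25, 4))

cat("manual chi-square =", round(chi2_manual, 4),
    " chisq.test =", round(unname(indep$statistic), 4), "\n")
cat("goodness-of-fit chi-square =", round(unname(gof$statistic), 4), "\n")
```

The chi-square approximation is usually considered reliable when expected counts are not too small; a common rule of thumb is at least 5 per cell.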