Stats Assignment 2


ADVANCED STATISTICAL ANALYSIS

Assignment # 02

Applied Statistical Analysis: A Hands-on Assignment

Submitted by

Rubab Zuhra (SP25-RCP-022)

Submitted to

Ma’am Zaeema Farooq

Dated

16th June, 2025

Department of Humanities

COMSATS University Islamabad,

Lahore Campus
1. Factorial ANOVA (Two-way ANOVA)

Question
The effective life (in hours) of batteries is compared by material type (1, 2 or 3) and

operating temperature: Low (-10˚C), Medium (20˚C) or High (45˚C). Twelve batteries are

randomly selected from each material type and are then randomly allocated to each temperature

level. The resulting life of all 36 batteries is shown in the table. Is there a difference in mean life

of the batteries for differing material type and operating temperature levels? (Table 1.1)

Table 1.1: Two-way ANOVA for Material Type, Temperature, and Battery Lifetime (N = 36)

                                  Battery Lifetime
Variable                  M         SD        F ratio     df         ηp2
Material                                      7.91**      (2, 27)    .37
  Type 1                  83.16     48.58
  Type 2                  108.33    49.47
  Type 3                  125.08    35.77
Temperature                                   28.97***    (2, 27)    .68
  Low                     144.83    31.69
  Medium                  107.58    42.88
  High                    64.17     25.67
Material * Temperature                        3.56*       (4, 27)    .35

Note. M = Mean, SD = Standard Deviation, ηp2 = partial eta squared. *p < .05. **p < .01. ***p < .001.


The assumption of equality of variances was met, F(8, 27) = .90, p = .53. There was a significant main effect of material type on battery lifetime, F(2, 27) = 7.91, p = .002. Tukey HSD revealed that type 1 was not significantly different from type 2 (M.D = -25.17, p = .06) but was significantly different from type 3 (M.D = -41.92, p = .001). Type 2 was also not significantly different from type 3 (M.D = -16.75, p = .27). Type 3 (M = 125.08, SD = 35.77) had the longest mean lifetime, followed by type 2 (M = 108.33, SD = 49.47) and type 1 (M = 83.16, SD = 48.58). The main effect of temperature on battery lifetime was also significant, F(2, 27) = 28.97, p < .001. Tukey HSD revealed that low temperature was significantly different from medium temperature (M.D = 37.25, p = .004) and from high temperature (M.D = 80.67, p < .001). Medium temperature was also significantly different from high temperature (M.D = 43.42, p < .001). Batteries lasted longest at low temperature (M = 144.83, SD = 31.69), followed by medium temperature (M = 107.58, SD = 42.88) and high temperature (M = 64.17, SD = 25.67).
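As a quick consistency check on the effect sizes in Table 1.1, partial eta squared can be recovered from each F ratio and its degrees of freedom, since ηp2 = (F × df_effect) / (F × df_effect + df_error). The sketch below is only illustrative; the function name is hypothetical, and the inputs are the F ratios and degrees of freedom reported above.

```python
def partial_eta_squared(f_ratio: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    return (f_ratio * df_effect) / (f_ratio * df_effect + df_error)


# (F ratio, df for the effect, df for the error), as reported in Table 1.1
effects = {
    "Material": (7.91, 2, 27),
    "Temperature": (28.97, 2, 27),
    "Material * Temperature": (3.56, 4, 27),
}

eta_values = {
    name: partial_eta_squared(f, df1, df2) for name, (f, df1, df2) in effects.items()
}

for name, eta in eta_values.items():
    print(f"{name}: partial eta squared = {eta:.2f}")
```

The three values round to .37, .68, and .35, which agree with the ηp2 column of the table.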

The interaction between material type and temperature was also significant, F(4, 27) = 3.56, p = .02. The interaction plot (Figure 1.1) revealed that for type 1, battery lifetime was highest at low temperature, while medium and high temperatures produced roughly equal and lower lifetimes. For type 2, lifetime was highest at low temperature and declined at medium and high temperatures. For type 3, lifetime was highest at medium temperature compared with low and high temperatures. Moving from type 1 to type 3, lifetime at low temperature first increased and then decreased, lifetime at medium temperature increased, and lifetime at high temperature first decreased and then increased. These patterns show that the effect of temperature on battery life is not consistent across material types. Type 2 performs best at low temperature but worst at high temperature. Type 3 is relatively more stable across temperatures and performs best at medium temperature. Type 1 performs poorly at medium and high temperatures.
Figure 1.1: Graphical Representation of Estimated Marginal Means of Battery Lifetime as a

function of Temperature and Material Type


2. Multivariate Analysis of Variance

Question:

A researcher randomly assigns 36 subjects to one of three groups. The first group

receives technical dietary information interactively from an on-line website. Group 2 receives

the same information from a nurse practitioner, while group 3 receives the information from a

video tape made by the same nurse practitioner. The researcher looks at three different ratings of

the presentation, difficulty, usefulness and importance, to determine if there is a difference in the

modes of presentation. In particular, the researcher is interested in whether the interactive

website is superior because that is the most cost-effective way of delivering the information

(Table 2.1).

Table 2.1: One-way MANOVA for Instructional Format, Presentation, Difficulty, and Usefulness and Importance (N = 36)

                      Mode of Receiving Technical Dietary Information
                Online (n = 12)    Nurse (n = 12)     Videotaped (n = 12)
Variable        M        SD        M        SD        M        SD          F ratio     df        ηp2
Presentation    5.75     1.71      17.50    3.42      16.75    3.20        20.91***    (2, 9)    .82
Difficulty      9.25     3.40      25.25    8.46      8.00     2.94        12.09**     (2, 9)    .73
Usefulness      15.25    1.89      18.00    3.16      21.00    1.15        6.65*       (2, 9)    .60

Note. M = Mean, SD = Standard Deviation, ηp2 = partial eta squared. *p < .05. **p < .01. ***p < .001.

Equality of covariance matrices was assumed (Box's M = 35.17, F(12, 393) = 1.43, p = .15). The multivariate main effect of mode of receiving technical dietary information (online, nursing, videotaped) on presentation, difficulty, and usefulness was significant (Wilks' λ = .01, F(6, 14) = 21.10, p < .001). Equality of variances was assumed for presentation (F(2, 9) = .83, p = .47). There was a significant univariate effect of mode of receiving technical dietary information on presentation (F(2, 9) = 20.91, p < .001). Tukey HSD revealed that the online group was significantly different from the nursing group (M.D = -11.75, p < .001) and from the videotaped group (M.D = -11.00, p = .001). The nursing group was not significantly different from the videotaped group (M.D = .75, p = .92). The nursing group (M = 17.50, SD = 3.42) scored highest in presentation, compared with the videotaped group (M = 16.75, SD = 3.20) and the online group (M = 5.75, SD = 1.71).

Equality of variances was not assumed for difficulty (F(2, 9) = 5.39, p = .03). There was a significant univariate effect of mode of receiving technical dietary information on difficulty (F(2, 9) = 12.09, p = .003). Games-Howell revealed that the online group was not significantly different from the nursing group (M.D = -16.00, p = .053) or from the videotaped group (M.D = 1.25, p = .85). The nursing group was significantly different from the videotaped group (M.D = 17.25, p = .04). The nursing group (M = 25.25, SD = 8.46) scored highest in difficulty, compared with the online group (M = 9.25, SD = 3.40) and the videotaped group (M = 8.00, SD = 2.94).

Equality of variances was assumed for usefulness (F(2, 9) = 2.62, p = .12). There was a significant univariate effect of mode of receiving technical dietary information on usefulness (F(2, 9) = 6.65, p = .02). Tukey HSD revealed that the online group was significantly different from the videotaped group (M.D = -5.75, p = .013) but not significantly different from the nursing group (M.D = -2.75, p = .24). The nursing group was also not significantly different from the videotaped group (M.D = -3.00, p = .19). The videotaped group (M = 21.00, SD = 1.15) scored highest in usefulness, compared with the nursing group (M = 18.00, SD = 3.16) and the online group (M = 15.25, SD = 1.89).
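The multivariate test reported above can be cross-checked by converting the F statistic back into Wilks' λ. The sketch below assumes Rao's F approximation, F = ((1 − λ^(1/t)) / λ^(1/t)) × (df2 / df1), with t = 2, which is the value this approximation takes for three dependent variables and three groups. The function name is hypothetical and the inputs are the reported F(6, 14) = 21.10.

```python
def wilks_lambda_from_f(f_value: float, df1: int, df2: int, t: float = 2.0) -> float:
    """Invert Rao's F approximation to recover Wilks' lambda.

    Assumes F = ((1 - L**(1/t)) / L**(1/t)) * (df2 / df1).
    """
    root = 1.0 / (1.0 + f_value * df1 / df2)  # this quantity equals L**(1/t)
    return root ** t


# Reported multivariate test: F(6, 14) = 21.10
wilks = wilks_lambda_from_f(21.10, df1=6, df2=14)
print(f"Wilks' lambda = {wilks:.4f}")
```

Under these assumptions the recovered value rounds to .01, in line with the reported Wilks' λ.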


3. Moderation Analysis

Question:

Does cardiovascular disease (CVD) health moderate the relationship between physical

fitness and mental well-being? (Table 3.1)

Table 3.1: Moderation Analysis for Physical Fitness, CVD Health, and Mental Well-being (N = 300)

                                               95% CI for B
Variable                          B            LL       UL       SE B     β          R2        ΔR2
Model 1
  Constant                        12.83                                              .23***
  Physical Fitness                2.15***      1.69     2.60     .23      .48***
Model 2
  Constant                        12.83                                              .40       .17***
  Physical Fitness                1.10***      .64      1.55     .23      .24***
  CVD Health                      2.16***      1.70     2.61     .21      .48***
Model 3
  Constant                        12.43                                              .44***    .04***
  Physical Fitness                1.15***      .71      1.59     .23      .26***
  CVD Health                      2.04***      1.60     2.48     .23      .45***
  Physical Fitness × CVD Health   .82***       .47      1.16     .18      .20***

Note. B = Unstandardized Regression Coefficient, SE = Standard Error, β = Standardized Regression Coefficient, CI = Confidence Interval, LL = Lower Limit, UL = Upper Limit. *p < .05. **p < .01. ***p < .001.
Overall, the final model explained 44% of the variance in mental well-being, F(3, 296) = 77.97, p < .001. When physical fitness was entered in the first block, Model 1 explained 23% of the variance in mental well-being, Fchange(1, 298) = 87.18, p < .001. In this block, physical fitness was a positive predictor of mental well-being (B = 2.15***, β = .48***).

When CVD health was added as the moderator in Block 2, Model 2 explained an additional 17% of the variance in mental well-being, Fchange(1, 297) = 86.33, p < .001. In this model, CVD health was also a significant predictor of mental well-being (B = 2.16***, β = .48***).

When the interaction term was added in the third block, Model 3 explained an additional 4% of the variance in mental well-being, Fchange(1, 296) = 21.63, p < .001. The interaction between physical fitness and CVD health was also a significant predictor of mental well-being (B = .82***, β = .20***), showing that CVD health moderated the association between physical fitness and mental well-being.
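The F-change statistic for each block follows from the change in R2: Fchange = (ΔR2 / number of added predictors) / ((1 − R2_full) / df_error), where df_error = N − k − 1. The sketch below applies this to the third block. It is an approximation only: the cumulative R2 of .40 for Model 2 is an assumption derived as .23 + .17, the R2 values are rounded to two decimals, and the function name is hypothetical.

```python
def f_change(r2_full: float, r2_reduced: float, n: int, k_full: int, k_added: int = 1):
    """Return (F change, error df) for predictors added to a hierarchical regression."""
    df_error = n - k_full - 1
    f_value = ((r2_full - r2_reduced) / k_added) / ((1.0 - r2_full) / df_error)
    return f_value, df_error


# Block 3: the interaction term is added to physical fitness and CVD health
f3, df3 = f_change(r2_full=0.44, r2_reduced=0.40, n=300, k_full=3)
print(f"Fchange(1, {df3}) is approximately {f3:.2f}")
```

With the rounded R2 values this gives roughly 21.1 on (1, 296) degrees of freedom, close to the reported 21.63; the small gap is what rounding R2 to two decimals would be expected to produce.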
Figure 3.1: Graphical Representation of moderating role of CVD Health in link between

Physical Fitness and Mental Well-being

Note. This graph depicts the moderation analysis examining the moderating role of CVD health
in the association between Physical Fitness and Mental well-being.
*p<0.05, **p<0.01, ***p<0.001

The above graph depicts the moderating role of CVD health in the association between physical fitness and mental well-being. At low, medium, and high levels of physical fitness alike, mental well-being was lowest for low CVD health compared with medium and high CVD health. Mental well-being was highest when both physical fitness and CVD health were high, indicating that mental well-being increased as physical fitness and CVD health increased.
4. Mediation Analysis

Question:

Does Cognitive Flexibility mediate the relationship between Dysfunctional Impulsivity

and Thrill Sensation? (Table 4.1, 4.2, 4.3)

Table 4.1: Model Fit Indices of Dysfunctional Impulsivity, Cognitive Flexibility, and Thrill Sensation (N = 224)

Model      χ2     df    p      TLI     CFI     RMSEA
Model 1    .01    1     .93    1.07    1.00    .00

Note. N = 224. Changes in the chi-square statistic are run in reference to the model. χ2 = Chi-square, df = Degrees of Freedom, RMSEA = Root Mean Square Error of Approximation, CFI = Comparative Fit Index, TLI = Tucker Lewis Index. p > .05.

The table above shows the fit indices for Model 1, which tested the hypothesized association among Dysfunctional Impulsivity, Cognitive Flexibility, and Thrill Sensation. In this model, Dysfunctional Impulsivity was treated as the exogenous variable, while Cognitive Flexibility and Thrill Sensation were defined as endogenous variables. Path analysis was conducted to assess the relationships and justify the model's assumptions. The fit indices indicated an excellent fit, as shown by the non-significant chi-square value (χ2 = .01, p > .05), RMSEA (.00, which is below .08), TLI (1.07), and CFI (1.00). To examine the mediation model, direct and indirect effects were analyzed using the bootstrapping method with a 95% confidence interval.
Table 4.2: Estimates of the Direct Effects of Dysfunctional Impulsivity and Cognitive Flexibility (N = 224)

                              Mediator (Cognitive Flexibility)    Outcome (Thrill Sensation)
Variable                      B          β          SE            B          β          SE
Dysfunctional Impulsivity     -.42***    -.29***    .09
Cognitive Flexibility                                             .39***     .34***     .07
Total Covariates
R2                            .08                                 .12

Note. *p < .05. **p < .01. ***p < .001. B = Unstandardized Regression Coefficient, β = Standardized Regression Coefficient, SE = Standard Error.

The table shows the direct effect of Dysfunctional Impulsivity (IV) on Cognitive Flexibility (M), and the direct effect of Cognitive Flexibility (M) on Thrill Sensation (DV). Results indicated that dysfunctional impulsivity significantly and negatively predicted cognitive flexibility (B = -.42***, β = -.29***); that is, higher levels of dysfunctional impulsivity were associated with lower cognitive flexibility. Cognitive flexibility, in turn, significantly and positively predicted thrill sensation (B = .39***, β = .34***), showing that greater cognitive flexibility was associated with higher levels of thrill sensation.


Table 4.3: Estimate the indirect effect of Dysfunctional Impulsivity on Thrill Sensation (N=224)

Outcome (Thrill Sensation)

Variable B β SE

Dysfunctional Impulsivity -.16** -.10** .03

Note. *p<0.05, **p<0.01, ***p<0.001, B=Unstandardized Regression Coefficient,

β=Standardized Regression Coefficient, SE=Standard Error

The table shows the indirect effect of Dysfunctional Impulsivity (IV) on Thrill Sensation (DV) through the mediation of Cognitive Flexibility. Results indicated that dysfunctional impulsivity had a significant indirect effect on thrill sensation (B = -.16**, β = -.10**).

Taken together, Tables 4.2 and 4.3 show a significant indirect effect of Dysfunctional Impulsivity on Thrill Sensation, suggesting that cognitive flexibility had a full mediating role in the relationship between Dysfunctional Impulsivity and Thrill Sensation.
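In a simple mediation model, the point estimate of the indirect effect is the product of the two path coefficients: path a (IV to mediator) multiplied by path b (mediator to DV). The short sketch below reproduces the indirect effects in Table 4.3 from the coefficients in Table 4.2. The variable names are illustrative, and it covers the point estimates only; the reported standard error comes from bootstrapping, which would require the raw data.

```python
# Path a: Dysfunctional Impulsivity -> Cognitive Flexibility (Table 4.2)
a_unstd, a_std = -0.42, -0.29
# Path b: Cognitive Flexibility -> Thrill Sensation (Table 4.2)
b_unstd, b_std = 0.39, 0.34

# The indirect effect is the product of the two paths
indirect_unstd = a_unstd * b_unstd
indirect_std = a_std * b_std

print(f"Indirect effect (B)    = {indirect_unstd:.2f}")
print(f"Indirect effect (beta) = {indirect_std:.2f}")
```

The products round to -.16 and -.10, matching the unstandardized and standardized indirect effects reported above.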

Figure 4.1: Figural Representation of Mediating Role of Cognitive Flexibility between

Dysfunctional Impulsivity and Thrill Sensation (N=224)

Note. This figure depicts the mediation analysis examining the indirect effects of Dysfunctional
Impulsivity on Thrill Sensation through the mediator Cognitive Flexibility.
*p<0.05, **p<0.01, ***p<0.001
5. Confirmatory Factor Analysis

Question:

Can the proposed factor structure of the 30-item HRQOL scale be confirmed through CFA, and do the results support the theoretical framework of HRQOL? (Table 5.1, 5.2)

Table 5.1: Model Fit Indices of CFA for HRQOL Scale

Model         χ2 (df)            IFI    CFI    RMSEA    χ2/df
Model 1a      867.86 (390)***    .90    .90    .08      2.23
Model 2b      780.51 (381)***    .91    .91    .07      2.05
Δχ2 (Δdf)     87.35 (9)***

Note. Structural equation modelling was used for analysis. CFI = comparative fit index; IFI = incremental fit index; RMSEA = root-mean-square error of approximation.
a In Model 1, 30 items were loaded on 6 factors. b In Model 2, 30 items were loaded on 6 factors.
***p < .001.
Figure 5.1: Factor Structure of HRQOL Scale Before Model Modification (N = 200)

Figure 5.2: Final Factor Structure of HRQOL Scale (N = 200)


Table 5.2: Factor Loadings of HRQOL Scale

Item       Factor                   Loading
Item 1     Hirsutism                .84
Item 2     Hirsutism                .84
Item 3     Hirsutism                .86
Item 4     Hirsutism                .91
Item 5     Hirsutism                .90
Item 6     Acne                     .85
Item 7     Weight                   .75
Item 8     Weight                   .84
Item 9     Infertility              .86
Item 10    Weight                   .86
Item 11    Weight                   .82
Item 12    Infertility              .91
Item 13    Weight                   .57
Item 14    Weight                   .83
Item 15    Weight                   .56
Item 16    Infertility              .85
Item 17    Acne                     .74
Item 18    Acne                     .77
Item 19    Emotional Disturbance    .69
Item 20    Emotional Disturbance    .87
Item 21    Emotional Disturbance    .88
Item 22    Emotional Disturbance    .84
Item 23    Emotional Disturbance    .86
Item 24    Emotional Disturbance    .84
Item 25    Menstrual Problem        .65
Item 26    Menstrual Problem        .74
Item 27    Menstrual Problem        .71
Item 28    Menstrual Problem        .69
Item 29    Menstrual Problem        .75
Item 30    Acne                     .87

Confirmatory Factor Analysis (CFA) was conducted on the HRQOL Scale using AMOS Version 24 with the Maximum Likelihood Estimation method to validate the factor structure. The scale included six hypothesized constructs: Hirsutism (Items 1–5), Weight (Items 7, 8, 10, 11, 13, 14, 15), Acne (Items 6, 17, 18, 30), Emotional Disturbance (Items 19–24), Menstrual Problems (Items 25–29), and Infertility (Items 9, 12, 16), as displayed in Table 5.2 and illustrated in Figure 5.1.

The initial model (Model 1) produced acceptable fit indices: χ²(390) = 867.86, p < .001, IFI = .90, CFI = .90, RMSEA = .08, and χ²/df = 2.23. As per the recommended criteria, RMSEA values should be below .08, and the Comparative Fit Index (CFI), Incremental Fit Index (IFI), and Goodness of Fit Index (GFI) should be .90 or higher (Arbuckle, 2014; Awang, 2020; Byrne, 2013; Kline, 2015). To enhance model fit, covariances were allowed between error terms of items within the same constructs, as suggested by the modification indices (Figure 5.2). The revised model (Model 2) demonstrated improved fit: χ²(381) = 780.51, p < .001, IFI = .91, CFI = .91, RMSEA = .07, and χ²/df = 2.05. The chi-square difference between the two models was significant (Δχ² = 87.35, Δdf = 9, p < .001), indicating that the revised model better represented the data. Although the chi-square remained significant (a known limitation due to its sensitivity to large sample sizes), the other fit indices suggested a better-fitting model.
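The model comparison above rests on simple arithmetic: the chi-square difference between nested models, the difference in degrees of freedom, and the normed chi-square (χ²/df) of each model. The sketch below recomputes these from the values in Table 5.1. The critical value of 27.877 for χ² with 9 degrees of freedom at p = .001 is taken from standard chi-square tables and is not part of the original analysis.

```python
# Fit statistics as reported in Table 5.1
model_1 = {"chi2": 867.86, "df": 390}
model_2 = {"chi2": 780.51, "df": 381}

# Chi-square difference test for nested models
delta_chi2 = round(model_1["chi2"] - model_2["chi2"], 2)
delta_df = model_1["df"] - model_2["df"]

# Normed chi-square (chi2 / df) for each model
ratio_1 = round(model_1["chi2"] / model_1["df"], 2)
ratio_2 = round(model_2["chi2"] / model_2["df"], 2)

# Critical chi-square for df = 9 at p = .001 (assumed, from standard tables)
CRITICAL_CHI2_DF9_P001 = 27.877
improved = delta_chi2 > CRITICAL_CHI2_DF9_P001

print(f"Delta chi2({delta_df}) = {delta_chi2}, significant at .001: {improved}")
print(f"chi2/df: Model 1 = {ratio_1}, Model 2 = {ratio_2}")
```

The recomputed values (Δχ² = 87.35, Δdf = 9, χ²/df = 2.23 and 2.05) agree with those reported in the table.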

Factor loadings were examined to determine the adequacy of each item's contribution to its respective construct. In line with the recommendations by Brown and Moore (2012) and Matsunaga (2010), items with loadings below .40 are considered for removal; however, in the current analysis, all items showed acceptable loadings, ranging from .56 to .91, as shown in Table 5.2.
Table 5.3: Correlation matrix showing the correlations between factors

Component                   1     2      3      4      5      6
1. Weight                   -     .46    .60    .33    .53    .53
2. Hirsutism                      -      .55    .58    .46    .23
3. Emotional Disturbance                 -      .47    .67    .50
4. Acne                                         -      .54    .13
5. Menstrual Problem                                   -      .42
6. Infertility                                                -

The above table shows the correlations among the factors of the HRQOL scale. The six components of HRQOL (weight, hirsutism, emotional disturbance, acne, menstrual problem, and infertility) are mostly moderately to strongly intercorrelated.


6. Exploratory Factor Analysis

Question:

What is the underlying factor structure of the 30-item HRQOL scale, and how many latent factors are necessary to account for the observed correlations among the items?

To examine the factor structure of the 28 items included in the scale, an Exploratory Factor Analysis (EFA) was conducted. The objective was to determine the underlying factor structure and identify the most relevant items to retain in the final version of the scale. A Principal Component Analysis (PCA) with oblique (Direct Oblimin) rotation was performed on the responses of 200 participants. Oblique rotation was chosen due to the expected intercorrelation between underlying factors.

Sampling adequacy was assessed using the Kaiser–Meyer–Olkin (KMO) measure and

Bartlett’s Test of Sphericity. According to Kaiser (1974), a KMO value above .90 is considered

“superb,” values between .80 and .89 are “great,” .70 to .79 are “good,” and values below .60

reflect inadequate sampling. The KMO measure for this dataset was .93, suggesting excellent

sampling adequacy. Bartlett’s Test of Sphericity was also statistically significant, χ² = 3562.59,

p < .001, indicating sufficient correlations among variables to proceed with factor analysis.

These results are summarized in Table 6.1.

Table 6.1: KMO and Bartlett's Test

Kaiser–Meyer–Olkin Test for Sampling Adequacy            .93
Bartlett's Test of Sphericity, Approx. Chi-Square        3562.59
p                                                        < .001
Using the Kaiser criterion (eigenvalues > 1), five components were initially extracted. Together, these five components accounted for 64.36% of the total variance, with Component 1 explaining 44.62%, followed by Component 2 (5.95%), Component 3 (5.37%), Component 4 (4.54%), and Component 5 (3.87%).
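In a PCA of a correlation matrix, each component's share of the total variance is its eigenvalue divided by the number of items, so eigenvalues and percentages can be converted into one another. The sketch below is a rough check under that assumption, using the 28 items, the two eigenvalues listed in Table 6.2, and the percentages reported above; the helper names are hypothetical, and small discrepancies are expected because the published figures are rounded.

```python
N_ITEMS = 28  # number of items entered into the PCA


def percent_variance(eigenvalue: float, n_items: int = N_ITEMS) -> float:
    """Percentage of total variance explained by one component."""
    return 100.0 * eigenvalue / n_items


def eigenvalue_from_percent(percent: float, n_items: int = N_ITEMS) -> float:
    """Eigenvalue implied by a reported percentage of variance."""
    return percent * n_items / 100.0


# Eigenvalues of the two retained components (Table 6.2)
pct_1 = percent_variance(12.49)
pct_2 = percent_variance(1.67)
cumulative = pct_1 + pct_2

# Eigenvalues implied by the reported percentages for components 3 to 5
implied = [eigenvalue_from_percent(p) for p in (5.37, 4.54, 3.87)]
all_pass_kaiser = all(value > 1.0 for value in implied)

print(f"Component 1: {pct_1:.2f}%, Component 2: {pct_2:.2f}%, cumulative: {cumulative:.2f}%")
print("Implied eigenvalues for components 3-5:", [round(v, 2) for v in implied])
```

The implied eigenvalues for components 3 to 5 all exceed 1, which is consistent with five components passing the Kaiser criterion even though each adds little variance beyond the first two.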

There are three commonly used criteria to determine the appropriate number of factors to retain in factor analysis: (1) the Kaiser criterion (eigenvalues > 1), (2) the scree plot, and (3) the cumulative percentage of variance explained, which should be at least 50–60% of the total variance. In the social sciences, 50% is considered acceptable, especially for psychological constructs.

However, the scree plot (Figure 6.1) did not support the five-component solution. Instead,

it showed a clear "elbow" after the second component, suggesting that only two components

should be retained. Additionally, the eigenvalues and explained variances beyond the second

component contributed minimal additional variance, indicating that they do not represent

meaningful underlying constructs.

Based on these three criteria, particularly the scree plot and the cumulative variance

distribution, a two-factor solution was selected as the most parsimonious and meaningful model.

The Eigenvalues and percentage of variance explained by these two components are presented in

Table 6.2.
Table 6.2: Eigenvalues and percentage of variance of the 28 scale items explained by two factors in the factor solution obtained through Principal Component Analysis

Factor    Eigenvalue    % of Variance    Cumulative %
1         12.49         44.62            44.62
2         1.67          5.95             50.57

The scree plot in Figure 6.1 shows the extraction of factors. The most obvious break occurs at the second eigenvalue, indicating that a two-factor solution is appropriate.

Figure 6.1: Scree Plot showing the extraction of factors of scale

The factor loadings of the 28 items on the two factors were obtained using Direct Oblimin rotation. Following the recommendation of Norman and Streiner (1994), items with factor loadings of .40 or greater were retained. Only one item (Item 27) was discarded due to insufficient loading. The rotated component matrix is provided in Table 6.3.


Table 6.3: Factor Loadings for Exploratory Factor Analysis with Direct Oblimin Rotation of Scales

Item no.    Component 1    Component 2
Q1                         .67
Q2                         .66
Q3                         .43
Q4                         .49
Q5                         .45
Q6          .55
Q7          .44
Q8          .46
Q9          .60
Q10                        .52
Q11                        .44
Q12                        .57
Q13         .46
Q14         .84
Q15         .69
Q16         .71
Q17         .46
Q18         .53
Q19         .88
Q20         .83
Q21         .95
Q22         .91
Q23         .67
Q24         .79
Q25                        .67
Q26                        .44
Q27
Q28                        .46

Factor 1

16 items (Q6, Q7, Q8, Q9, Q13, Q14, Q15, Q16, Q17, Q18, Q19, Q20, Q21, Q22, Q23, Q24) loaded on factor 1.

Factor 2

11 items (Q1, Q2, Q3, Q4, Q5, Q10, Q11, Q12, Q25, Q26, Q28) loaded on factor 2.

Inter-factor correlations were examined to assess the appropriateness of using an oblique rotation method. The correlation matrix (Table 6.4) indicated that the extracted factors were moderately correlated, justifying the use of Direct Oblimin rotation.

Table 6.4: Correlation matrix showing the correlations between factors

Component 1 2

1 - .57

2 -

The moderate correlation between the two components (r = .57) confirms that the underlying constructs are related yet distinct, supporting the multidimensional structure of the scale and the use of an oblique rotation.

Reliability of Factors

Table 6.5: Reliability Analysis of Factors

Variables k M SD α

Factor 1 16 101.83 10.09 .94

Factor 2 11 65.57 8.07 .82

Note. k= no. of items, M=Mean, SD=Standard Deviation, α=Cronbach’s alpha

The table shows descriptive statistics and internal consistency (Cronbach's alpha) for the two factors. Both factors demonstrated high internal reliability (α ≥ .82).
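Cronbach's alpha is computed as α = (k / (k − 1)) × (1 − Σ item variances / variance of total scores), where k is the number of items. Because the item-level responses are not included here, the sketch below uses a small set of hypothetical toy responses purely to illustrate the calculation; neither the data nor the resulting value relates to the HRQOL scale, and the function name is an assumption.

```python
from statistics import variance


def cronbach_alpha(responses: list[list[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items table of scores."""
    k = len(responses[0])                       # number of items
    items = list(zip(*responses))               # one tuple of scores per item
    item_variances = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1.0 - item_variances / total_variance)


# Hypothetical toy data: 4 respondents answering 3 items (not from the study)
toy_responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
]

alpha = cronbach_alpha(toy_responses)
print(f"Cronbach's alpha (toy data) = {alpha:.2f}")
```

Applied to the actual item responses for each factor, the same calculation would be expected to reproduce the α values in Table 6.5.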
