Results of Biostat Assignment For MSC Students
Please complete the following tasks in order and prepare a well-organized report of your work. If
possible, manage your references with EndNote, Mendeley or Zotero, whichever reference
management software you prefer.
I. Part one
Consider the lungworm data set given (sheet 2) and:
Use different appropriate data presentation methods (tabular and diagrammatic) for each
variable.
Frequency Table
Sex
sex Frequency
male 197
female 181
Total 378
age
age Frequency
young 180
adult 198
Total 378
body condition
body condition Frequency
good 87
medium 102
poor 189
Total 378
study site
study site Frequency
adele 96
haramaya 164
awoday 118
Total 378
results
result Frequency
negative 248
positive 130
Total 378
Bar Chart
Pie Chart
Compute different appropriate numerical measures (proportions) with interval estimates for each
variable.
sex * results
age * results
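The proportion and its interval estimate can be recomputed from the frequency tables above. The following is a minimal Python sketch, assuming a normal-approximation (Wald) interval; the function name `wald_ci` is illustrative and is not part of the SPSS output.

```python
from math import sqrt
from statistics import NormalDist


def wald_ci(count, n, level=0.95):
    """Sample proportion with a normal-approximation (Wald) interval."""
    p = count / n
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = z * sqrt(p * (1 - p) / n)
    return p, p - half_width, p + half_width


# Counts taken from the result frequency table: 130 positive out of 378.
p, lo, hi = wald_ci(130, 378)
print(f"positive: {p:.3f} (95% CI {lo:.3f} to {hi:.3f})")
```

The same call can be repeated for each category of sex, age, body condition and study site by passing the corresponding count.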
Develop a logistic regression model and check its fitness for the lungworm data set.
Marginal
N Percentage
Missing 0
Total 378
Subpopulation 36a
a. The dependent variable has only one value observed in 6 (16.7%)
subpopulations.
Model Fitting Information (Model Fitting Criteria; Likelihood Ratio Tests)
Goodness-of-Fit (Chi-Square, df, Sig.)
Pseudo R-Square
Nagelkerke .193
McFadden .117
Likelihood Ratio Tests (Effect; -2 Log Likelihood of Reduced Model; Chi-Square, df, Sig.)
a. This reduced model is equivalent to the final model because omitting the effect does not
increase the degrees of freedom.
Parameter Estimates
                                                                                 95% Confidence Interval for Exp(B)
results(a)  Parameter             B       Std. Error  Wald    df  Sig.  Exp(B)  Lower Bound  Upper Bound
negative    Intercept             .221    .287        .590    1   .443
            [sex=1]               .977    .237        17.033  1   .000  2.657   1.670        4.225
            [sex=2]               0b      .           .       0   .     .       .            .
            [age=1]               -.852   .237        12.886  1   .000  .427    .268         .679
            [age=2]               0b      .           .       0   .     .       .            .
            [body condition=1]    1.192   .321        13.753  1   .000  3.293   1.754        6.181
            [body condition=2]    .996    .285        12.201  1   .000  2.707   1.548        4.733
            [body condition=3]    0b      .           .       0   .     .       .            .
            [study site=1]        -.342   .312        1.202   1   .273  .710    .385         1.309
            [study site=2]        -.035   .281        .016    1   .901  .966    .557         1.673
            [study site=3]        0b      .           .       0   .     .       .            .
a. The reference category is: positive.
b. This parameter is set to zero because it is redundant.
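The Exp(B) column and its confidence bounds follow from B and Std. Error. The sketch below assumes the usual Wald construction, exp(B ± z·SE); the helper name `odds_ratio_ci` is hypothetical. Small differences from the printed values are expected because B is rounded to three decimals in the table.

```python
from math import exp
from statistics import NormalDist


def odds_ratio_ci(b, se, level=0.95):
    """Exp(B) with a Wald confidence interval computed on the log-odds scale."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return exp(b), exp(b - z * se), exp(b + z * se)


# B and Std. Error for [sex=1] from the Parameter Estimates table.
odds_ratio, lower, upper = odds_ratio_ci(0.977, 0.237)
print(f"Exp(B) = {odds_ratio:.3f}, 95% CI {lower:.3f} to {upper:.3f}")
```

Because the reference category is "positive", each Exp(B) here describes the odds of a negative result relative to the reference level of that predictor.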
Consider the production data set (sheet 1) collected from Maya city and
Frequency Table
sample code(Haramaya=1,awaday=2,adele=3)
haramaya 50
awaday 34
adele 36
Total 120
BCS(good=3,medium=2,poor=1)
BCS Frequency
poor 14
medium 61
good 45
Total 120
season of birth(autumn=1,summer=2,winter=3,spring=4)
autumn 54
summer 44
winter 9
spring 13
Total 120
number of parity
first parity 15
second parity 25
third parity 27
fourth parity 31
fifth parity 20
above 6 2
Total 120
age in month
30 2
31 1
32 2
33 1
34 4
35 1
36 7
38 1
42 3
44 7
45 3
46 1
47 2
48 8
49 2
50 2
51 1
52 3
54 2
55 2
56 1
58 2
60 14
62 1
64 4
66 2
68 3
70 3
72 9
74 1
76 2
78 2
80 1
84 6
88 1
90 2
92 1
94 2
96 6
120 2
Total 120
daily milk yield in litre
9.2 1
9.5 1
10.0 5
10.2 2
10.6 2
10.8 1
11.0 4
11.3 1
11.5 1
12.0 9
12.4 2
12.5 3
12.8 1
13.0 4
13.2 3
13.5 3
14.0 5
14.5 7
14.8 1
15.0 3
15.5 4
15.6 1
15.8 3
16.0 7
16.2 1
16.5 4
16.8 1
17.0 6
17.4 1
17.5 2
18.0 8
18.5 2
19.0 5
19.5 4
20.0 3
20.5 1
21.0 2
22.0 3
23.0 1
Total 120
calving interval in month
12 6
13 25
14 20
15 30
16 13
17 2
18 17
20 7
Total 120
Bar Chart
Pie Chart
Histogram
Different appropriate numerical measures (measure of central tendency and dispersions)
with interval estimates for each variable the data set.
Statistics
Columns: (1) sample code (Haramaya=1, awaday=2, adele=3); (2) BCS (good=3, medium=2, poor=1);
(3) season of birth (autumn=1, summer=2, winter=3, spring=4); (4) number of parity (first parity=1,
second parity=2, third parity=3, fourth parity=4, fifth parity=5, above 6=6); (5) age in month;
(6) daily milk yield in litre; (7) calving interval in month.
                     (1)     (2)     (3)     (4)     (5)       (6)       (7)
N Valid              120     120     120     120     120       120       120
N Missing            0       0       0       0       0         0         0
Mean                 1.88    2.26    1.84    3.18    60.16     15.069    15.13
Std. Error of Mean   .077    .060    .089    .121    1.810     .3043     .191
Median               2.00    2.00    2.00    3.00    60.00     15.000    15.00
Mode                 1       2       1       4       60        12.0      15
Std. Deviation       .842    .655    .970    1.328   19.832    3.3331    2.093
Variance             .709    .429    .941    1.764   393.294   11.110    4.379
Range                2       2       3       5       90        14.0      8
Minimum              1       1       1       1       30        9.0       12
Maximum              3       3       4       6       120       23.0      20
Sum                  226     271     221     382     7219      1808.3    1815
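The central tendency, dispersion and interval estimate for one variable can be reproduced from its frequency table. This sketch uses the last frequency table above (values 12 to 20, total 120), which matches the calving interval column of the Statistics output. The 95% interval uses a normal approximation, which is an assumption made here for simplicity; a t-based interval with 119 degrees of freedom would be slightly wider.

```python
from math import sqrt
from statistics import NormalDist

# Calving interval in months -> frequency, copied from the frequency table.
freq = {12: 6, 13: 25, 14: 20, 15: 30, 16: 13, 17: 2, 18: 17, 20: 7}

n = sum(freq.values())
mean = sum(x * f for x, f in freq.items()) / n
variance = sum(f * (x - mean) ** 2 for x, f in freq.items()) / (n - 1)
sd = sqrt(variance)
se = sd / sqrt(n)

z = NormalDist().inv_cdf(0.975)  # normal approximation, reasonable for n = 120
ci = (mean - z * se, mean + z * se)

print(f"n={n} mean={mean:.3f} sd={sd:.3f} se={se:.3f}")
print(f"95% CI for the mean: {ci[0]:.3f} to {ci[1]:.3f}")
```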
Compare the average daily milk yield of cattle between the Adele and Haramaya sub-cities.
T-Test
Group Statistics
                             sample code   N    Mean     Std. Deviation   Std. Error Mean
daily milk yield in litre    haramaya      50   14.538   2.7982           .3957
                             adele         36   12.914   2.4398           .4066
Summary Data (columns: Mean Difference, Std. Error Difference, t, df, Sig. (2-tailed))
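The test statistic can be recomputed from the Group Statistics rows alone. The sketch below assumes equal variances (a pooled t-test); the function name `pooled_t` is illustrative. The p-value line uses a normal approximation, which is an assumption of this sketch and not the exact t-distribution value SPSS would report.

```python
from math import sqrt
from statistics import NormalDist


def pooled_t(m1, s1, n1, m2, s2, n2):
    """Equal-variance two-sample t statistic from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    se_diff = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se_diff, df, se_diff


# Haramaya versus Adele, values from the Group Statistics table.
t_stat, df, se_diff = pooled_t(14.538, 2.7982, 50, 12.914, 2.4398, 36)

# Approximate two-sided p-value; with df = 84 the t and normal tails are close.
p_approx = 2 * (1 - NormalDist().cdf(abs(t_stat)))

print(f"mean difference = {14.538 - 12.914:.3f}, SE = {se_diff:.3f}")
print(f"t = {t_stat:.3f}, df = {df}, approximate p = {p_approx:.4f}")
```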
Compare the average daily milk yield between Adele, Awoday and Haramaya with an
appropriate post-hoc test.
ANOVA
Multiple Comparisons
Bonferroni
Is there any difference in daily milk yield between various study sites? Do also post hoc
tests (multiple comparisons).
Descriptives (columns: N, Mean, Std. Deviation, Std. Error, 95% Confidence Interval for Mean
[Lower Bound, Upper Bound], Minimum, Maximum, Between-Component Variance)
Multiple Comparisons (columns: (I) sample code(Haramaya=1,awaday=2,adele=3), (J) sample
code(Haramaya=1,awaday=2,adele=3), Mean Difference (I-J), Std. Error, Sig., 95% Confidence
Interval [Lower Bound, Upper Bound])
Homogeneous Subsets
haramaya 50 14.538
awaday 34 18.132
b. The group sizes are unequal. The harmonic mean of the group sizes is used. Type I error levels are not
guaranteed.
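The one-way ANOVA F statistic for daily milk yield can be sketched from summary figures that do appear in this report: the group sizes and means (Haramaya and Adele from Group Statistics, Awaday from Homogeneous Subsets) and the overall variance from the Statistics table. Treating these rounded values as inputs is an assumption, so the result is approximate and is not a copy of the SPSS ANOVA table.

```python
# (group size, mean daily milk yield) per study site, as printed in the report.
groups = {
    "haramaya": (50, 14.538),
    "awaday": (34, 18.132),
    "adele": (36, 12.914),
}
total_variance = 11.110  # Variance of daily milk yield, Statistics table

n = sum(size for size, _ in groups.values())
grand_mean = sum(size * mean for size, mean in groups.values()) / n

# Partition the total sum of squares into between-site and within-site parts.
ss_total = total_variance * (n - 1)
ss_between = sum(size * (mean - grand_mean) ** 2 for size, mean in groups.values())
ss_within = ss_total - ss_between

df_between = len(groups) - 1
df_within = n - len(groups)
f_stat = (ss_between / df_between) / (ss_within / df_within)

print(f"grand mean = {grand_mean:.3f}")
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

A p-value would need the F distribution, which the Python standard library does not provide, so the sketch stops at the statistic.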
Is there any difference in calving interval between various study sites? Do also post hoc
tests (multiple comparisons).
Descriptives
ANOVA
Multiple Comparisons
(columns: (I) sample code(Haramaya=1,awaday=2,adele=3), (J) sample
code(Haramaya=1,awaday=2,adele=3), Mean Difference (I-J), Std. Error, Sig., 95% Confidence
Interval [Lower Bound, Upper Bound])
Homogeneous Subsets
calving interval in month
haramaya 50 14.64
adele 36 16.67
b. The group sizes are unequal. The harmonic mean of the group sizes is used. Type I error levels
are not guaranteed.
Develop a full linear regression model for the production data set, with calving interval as the
outcome variable, and check all assumptions and model fitness.
Regression
Descriptive Statistics
                                                                        Mean     Std. Deviation   N
calving interval in month                                               15.13    2.093            120
sample code(Haramaya=1,awaday=2,adele=3)                                1.88     .842             120
BCS(good=3,medium=2,poor=1)                                             2.26     .655             120
season of birth(autumn=1,summer=2,winter=3,spring=4)                    1.84     .970             120
number of parity(first parity=1,second parity=2,third parity=3,
  fourth parity=4,fifth parity=5,above 6=6)                             3.18     1.328            120
age in month                                                            60.16    19.832           120
daily milk yield in litre                                               15.069   3.3331           120
Model Summary
                                                               Change Statistics
Model  R      R Square  Adjusted R Square  Std. Error of the Estimate  R Square Change  F Change  df1  df2  Sig. F Change
1      .390a  .152      .107               1.977                       .152             3.378     6    113  .004
a. Predictors: (Constant), daily milk yield in litre, number of parity, BCS(good=3,medium=2,poor=1),
sample code(Haramaya=1,awaday=2,adele=3), season of birth(autumn=1,summer=2,winter=3,spring=4),
age in month
ANOVA(a)
Model            Sum of Squares   df    Mean Square   F       Sig.
1  Regression    79.253           6     13.209        3.378   .004b
   Residual      441.872          113   3.910
   Total         521.125          119
a. Dependent Variable: calving interval in month
b. Predictors: (Constant), daily milk yield in litre, number of parity, BCS(good=3,medium=2,poor=1),
sample code(Haramaya=1,awaday=2,adele=3), season of birth(autumn=1,summer=2,winter=3,spring=4),
age in month
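The fit statistics in the Model Summary follow directly from the sums of squares in the ANOVA table. The sketch below recomputes them as a consistency check; the variable names are illustrative.

```python
from math import sqrt

# Sums of squares and degrees of freedom from the regression ANOVA table.
ss_regression, ss_residual = 79.253, 441.872
df_regression, df_residual = 6, 113

ss_total = ss_regression + ss_residual
r_squared = ss_regression / ss_total
# Adjusted R square penalises R square for the number of predictors.
adj_r_squared = 1 - (1 - r_squared) * (df_regression + df_residual) / df_residual
f_stat = (ss_regression / df_regression) / (ss_residual / df_residual)
std_error_estimate = sqrt(ss_residual / df_residual)

print(f"R square = {r_squared:.3f}, adjusted R square = {adj_r_squared:.3f}")
print(f"F = {f_stat:.3f}, Std. Error of the Estimate = {std_error_estimate:.3f}")
```

An R square of about .15 means the six predictors together account for roughly 15% of the variation in calving interval.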
Coefficients(a)
                             Unstandardized        Standardized                    95.0% Confidence Interval for B   Collinearity Statistics
Model 1                      B        Std. Error   Beta           t       Sig.    Lower Bound   Upper Bound         Tolerance   VIF
(Constant)                   13.821   1.720                       8.035   .000    10.413        17.228
sample code                  .893     .233         .359           3.825   .000    .430          1.355               .851        1.175
BCS                          .169     .316         .053           .535    .594    -.457         .795                .768        1.302
season of birth              .043     .208         .020           .205    .838    -.369         .455                .808        1.238
number of parity             -.001    .197         .000           -.004   .997    -.390         .389                .482        2.076
age in month                 -.005    .014         -.045          -.332   .741    -.033         .024                .399        2.506
daily milk yield in litre    -.036    .057         -.058          -.632   .528    -.150         .077                .900        1.112
a. Dependent Variable: calving interval in month
Collinearity Diagnostics(a)
Model 1. The last seven columns are Variance Proportions.
Dimension  Eigenvalue  Condition Index  (Constant)  sample code  BCS   season of birth  number of parity  age in month  daily milk yield in litre
1          6.348       1.000            .00         .00          .00   .00              .00               .00           .00
2          .261        4.930            .00         .14          .04   .31              .01               .01           .00
3          .164        6.225            .00         .09          .00   .37              .16               .04           .00
4          .131        6.966            .00         .52          .09   .00              .04               .00           .06
5          .051        11.174           .00         .02          .52   .10              .01               .00           .40
6          .037        13.061           .01         .01          .00   .11              .66               .54           .06
7          .008        27.618           .98         .22          .35   .10              .11               .40           .47
a. Dependent Variable: calving interval in month
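The collinearity columns are linked by two simple relations: VIF is the reciprocal of Tolerance, and each Condition Index is the square root of the largest eigenvalue divided by that dimension's eigenvalue. The sketch below recomputes both from the printed values, so small rounding differences from the SPSS figures are expected. The cut-offs mentioned in the comments (VIF above 10, condition index above 30) are common rules of thumb assumed here, not part of the output.

```python
from math import sqrt

# Tolerance values from the Coefficients table.
tolerance = {
    "sample code": 0.851,
    "BCS": 0.768,
    "season of birth": 0.808,
    "number of parity": 0.482,
    "age in month": 0.399,
    "daily milk yield": 0.900,
}
# Eigenvalues from the Collinearity Diagnostics table, dimensions 1 to 7.
eigenvalues = [6.348, 0.261, 0.164, 0.131, 0.051, 0.037, 0.008]

vif = {name: 1 / tol for name, tol in tolerance.items()}
condition_index = [sqrt(eigenvalues[0] / ev) for ev in eigenvalues]

for name, value in vif.items():
    flag = "check" if value > 10 else "ok"  # rule-of-thumb threshold (assumed)
    print(f"{name:18s} VIF = {value:.3f} ({flag})")
print("condition indices:", [round(ci, 2) for ci in condition_index])
```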
Develop a full Poisson regression model for the production data set, with number of parity as the
outcome variable, and check all assumptions and model fitness.
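One assumption of Poisson regression is that the mean and variance of the outcome are roughly equal. A quick check is possible from the Statistics table, which reports both for number of parity. The sketch below is only a crude, unconditional check, and it assumes the coded values 1 to 6 can be read as counts even though the top category is "above 6".

```python
# Mean and Variance of number of parity, from the Statistics table.
mean_parity = 3.18
variance_parity = 1.764

# A ratio near 1 is consistent with the Poisson equidispersion assumption;
# below 1 suggests underdispersion, above 1 overdispersion.
dispersion_ratio = variance_parity / mean_parity

print(f"variance / mean = {dispersion_ratio:.3f}")
```

A ratio well below 1, as here, points toward underdispersion, which is worth addressing when interpreting the fitted model.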
Model Fitting Information
Model Fitting
Criteria Likelihood Ratio Tests
Model -2 Log Likelihood Chi-Square df Sig.
Intercept Only 194.002
Final 152.995 41.007 35 .224
Goodness-of-Fit
Chi-Square df Sig.
Pearson 53.028 75 .974
Deviance 64.501 75 .801
Pseudo R-Square
Cox and Snell .289
Nagelkerke .301
McFadden .104
Likelihood Ratio Tests
                                                        Model Fitting Criteria                Likelihood Ratio Tests
Effect                                                  -2 Log Likelihood of Reduced Model    Chi-Square   df   Sig.
Intercept                                               152.995a                              .000         0    .
sample code(Haramaya=1,awaday=2,adele=3)                159.209                               6.214        10   .797
BCS(good=3,medium=2,poor=1)                             168.511                               15.517       10   .114
season of birth(autumn=1,summer=2,winter=3,spring=4)    172.611                               19.617       15   .187
The chi-square statistic is the difference in -2 log-likelihoods between the final model and a
reduced model. The reduced model is formed by omitting an effect from the final model.
The null hypothesis is that all parameters of that effect are 0.
a. This reduced model is equivalent to the final model because omitting the effect does not
increase the degrees of freedom.
Parameter Estimates
Parameter labels are abbreviated: sample code = sample code(Haramaya=1,awaday=2,adele=3);
BCS = BCS(good=3,medium=2,poor=1); season = season of birth(autumn=1,summer=2,winter=3,spring=4).
                                                                                                       95% Confidence Interval for Exp(B)
number of parity(a)  Parameter          B         Std. Error  Wald   df  Sig.   Exp(B)                 Lower Bound  Upper Bound
first parity         Intercept          12.354    1085.062    .000   1   .991
                     [sample code=1]    13.511    576.339     .001   1   .981   737785.760             .000         .b
                     [sample code=2]    34.339    1349.990    .001   1   .980   818927115117670.000    .000         .b
                     [sample code=3]    0c        .           .      0   .      .                      .            .
                     [BCS=1]            -11.274   .000        .      1   .      1.270E-5               1.270E-5     1.270E-5
                     [BCS=2]            33.510    1099.789    .001   1   .976   357431025047285.940    .000         .b
                     [BCS=3]            0c        .           .      0   .      .                      .            .
                     [season=1]         -12.447   1085.063    .000   1   .991   3.929E-6               .000         .b
                     [season=2]         -.102     1480.299    .000   1   1.000  .903                   .000         .b
                     [season=3]         -33.129   1340.620    .001   1   .980   4.096E-15              .000         .b
                     [season=4]         0c        .           .      0   .      .                      .            .
second parity        Intercept          12.857    1085.062    .000   1   .991
                     [sample code=1]    12.416    576.339     .000   1   .983   246833.981             .000         .b
                     [sample code=2]    34.100    1349.990    .001   1   .980   645088094564532.800    .000         .b
                     [sample code=3]    0c        .           .      0   .      .                      .            .
                     [BCS=1]            11.117    536.419     .000   1   .983   67314.698              .000         .b
                     [BCS=2]            34.448    1099.789    .001   1   .975   913629624783102.500    .000         .b
                     [BCS=3]            0c        .           .      0   .      .                      .            .
                     [season=1]         -12.752   1085.062    .000   1   .991   2.896E-6               .000         .b
                     [season=2]         .286      1480.298    .000   1   1.000  1.331                  .000         .b
                     [season=3]         -36.546   1340.619    .001   1   .978   1.344E-16              .000         .b
                     [season=4]         0c        .           .      0   .      .                      .            .
third parity         Intercept          12.652    1085.062    .000   1   .991
                     [sample code=1]    12.682    576.339     .000   1   .982   321778.851             .000         .b
                     [sample code=2]    33.931    1349.990    .001   1   .980   544529560911663.750    .000         .b
                     [sample code=3]    0c        .           .      0   .      .                      .            .
                     [BCS=1]            9.066     536.420     .000   1   .987   8651.838               .000         .b
                     [BCS=2]            33.599    1099.789    .001   1   .976   390740814511782.600    .000         .b
                     [BCS=3]            0c        .           .      0   .      .                      .            .
                     [season=1]         -11.974   1085.063    .000   1   .991   6.306E-6               .000         .b
                     [season=2]         1.310     1480.298    .000   1   .999   3.705                  .000         .b
                     [season=3]         -34.832   1340.620    .001   1   .979   7.460E-16              .000         .b
                     [season=4]         0c        .           .      0   .      .                      .            .
fourth parity        Intercept          12.766    1085.062    .000   1   .991
                     [sample code=1]    12.491    576.339     .000   1   .983   265954.774             .000         .b
                     [sample code=2]    33.762    1349.990    .001   1   .980   460006360265681.300    .000         .b
                     [sample code=3]    0c        .           .      0   .      .                      .            .
                     [BCS=1]            10.832    536.418     .000   1   .984   50625.222              .000         .b
                     [BCS=2]            33.779    1099.789    .001   1   .975   467747969167927.100    .000         .b
                     [BCS=3]            0c        .           .      0   .      .                      .            .
                     [season=1]         -11.713   1085.062    .000   1   .991   8.185E-6               .000         .b
                     [season=2]         1.027     1480.298    .000   1   .999   2.793                  .000         .b
                     [season=3]         -36.199   1340.619    .001   1   .978   1.900E-16              .000         .b
                     [season=4]         0c        .           .      0   .      .                      .            .
fifth parity         Intercept          13.889    1085.062    .000   1   .990
                     [sample code=1]    12.527    576.339     .000   1   .983   275815.701             .000         .b
                     [sample code=2]    34.655    1349.990    .001   1   .980   1123046591290448.800   .000         .b
                     [sample code=3]    0c        .           .      0   .      .                      .            .
                     [BCS=1]            10.718    536.419     .000   1   .984   45182.253              .000         .b
                     [BCS=2]            33.592    1099.789    .001   1   .976   388139526843194.300    .000         .b
                     [BCS=3]            0c        .           .      0   .      .                      .            .
                     [season=1]         -13.757   1085.062    .000   1   .990   1.060E-6               .000         .b
                     [season=2]         -1.755    1480.298    .000   1   .999   .173                   .000         .b
                     [season=3]         -36.180   1340.619    .001   1   .978   1.937E-16              .000         .b
                     [season=4]         0c        .           .      0   .      .                      .            .
a. The reference category is: above 6.
b. Floating point overflow occurred while computing this statistic. Its value is therefore set to system missing.
c. This parameter is set to zero because it is redundant.
Do the appropriate regression analysis for the above data, develop the model equation using
appropriate variable screening methods, and write a full document incorporating all components of
research report writing protocols: introduction, statement of the problem, rationale of the study,
objectives, literature review, conceptual framework, materials and methods (study area, study
design, study period, source and study population, sample size determination, sampling method and
techniques, operational definitions, dependent and independent variables, data quality assurance,
data management and analysis, participant information sheet and consent forms for animal owners),
results, discussion, conclusion, recommendations based on your findings, and references. Include
an appropriate interpretation of all the above statistical graphs and test results.
Note: There will be individual presentations, so be ready accordingly (the presentation accounts
for 15% and the document for 15%).
Part three:
1. Plasma urea and creatinine are routinely measured to evaluate renal function. In
healthy cats the mean urea value in a given pathology laboratory is 7.5 mmol/l. Plasma
urea values in a random sample of 60 healthy cats were measured in January to verify the
assay. The data were approximately normally distributed with a mean urea content of
9.7 mmol/l and an estimated standard error of 0.22 mmol/l. Is there any evidence at the 99%
confidence level to indicate that the assay performance in this laboratory changed in
January?
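One way to set up this question is as a two-sided one-sample test of the January mean against 7.5 mmol/l. The sketch below uses the normal critical value, an assumption made because the standard library has no t distribution; with 59 degrees of freedom the t critical value is only slightly larger, so the conclusion is unaffected here.

```python
from statistics import NormalDist

mu0 = 7.5      # laboratory mean for healthy cats (mmol/l)
x_bar = 9.7    # January sample mean (mmol/l)
se = 0.22      # estimated standard error (mmol/l)
alpha = 0.01   # 99% confidence level

test_stat = (x_bar - mu0) / se
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
reject_null = abs(test_stat) > z_crit
ci99 = (x_bar - z_crit * se, x_bar + z_crit * se)

print(f"test statistic = {test_stat:.2f}, critical value = {z_crit:.3f}")
print(f"99% CI: {ci99[0]:.2f} to {ci99[1]:.2f}, reject H0: {reject_null}")
```

The 99% interval excludes 7.5 mmol/l, which agrees with rejecting the null hypothesis of no change.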
2. In a given district, animals were reared in a grazing area where phytotoxic plants dominate.
Animals grazing on this land and animals fed in a stall barn were compared for the
occurrence of photosensitization. In data collected over ten days, 60 of the 400 cattle reared
on the grazing land developed photosensitization, whereas only 5 of the 100 cattle fed in the
stall barn developed the condition.
A. Draw a 2x2 contingency table.
B. Is there a difference in the occurrence of photosensitization between the two animal groups?
C. How would you declare the association between grazing on abundant phytotoxic plants and
the occurrence of photosensitization?
D. State your hypotheses and test the association at the 90% confidence level.
E. Interpret the result.
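Parts A to D can be sketched with a Pearson chi-square test on the 2x2 table. This is a minimal version without continuity correction, which is an assumption of the sketch; the p-value uses the closed form for one degree of freedom, so no external library is needed.

```python
from math import erfc, sqrt

# 2x2 table: rows are exposure groups, columns are (affected, not affected).
grazing = (60, 340)   # 60 of 400 cattle on grazing land
stall = (5, 95)       # 5 of 100 cattle in the stall barn

a, b = grazing
c, d = stall
n = a + b + c + d

# Pearson chi-square for a 2x2 table, no continuity correction.
chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
p_value = erfc(sqrt(chi2 / 2))  # chi-square survival function with 1 df

risk_ratio = (a / (a + b)) / (c / (c + d))
odds_ratio = (a * d) / (b * c)

print(f"chi-square = {chi2:.3f}, p = {p_value:.4f}")
print(f"risk ratio = {risk_ratio:.2f}, odds ratio = {odds_ratio:.2f}")
```

At the 90% confidence level the decision rule is to reject the null hypothesis of no association when the p-value is below 0.10.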
3. A veterinary laboratory technologist wants an interval estimate for the average age of
cattle visiting a certain veterinary clinic. He assumes that the age of cattle is
approximately normally distributed with a population standard deviation of 3 years. A sample of 20
cattle that participated in the study yielded a mean age of 4 years. Estimate and interpret the 90%,
95% and 99% CI for the population mean.
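Because the population standard deviation is given, a z-based interval applies. The sketch below computes the three requested intervals; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

sample_mean = 4.0   # years
sigma = 3.0         # known population standard deviation (years)
n = 20

se = sigma / sqrt(n)
intervals = {}
for level in (0.90, 0.95, 0.99):
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    intervals[level] = (sample_mean - z * se, sample_mean + z * se)

for level, (low, high) in intervals.items():
    print(f"{level:.0%} CI: {low:.2f} to {high:.2f} years")
```

The interval widens as the confidence level rises, since a larger critical value multiplies the same standard error.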
4. A veterinary research team wants to estimate the prevalence of Brucellosis at Harar intensive
dairy farms. A random sample of 400 cattle was obtained and it was found that 5% of them were
positive for Brucellosis upon CFT test. Compute and interpret the 90%, 95% and 99% CI of
Brucellosis in Harar intensive dairy farms.
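For a prevalence as low as 5%, the simple normal-approximation (Wald) interval can be compared with the Wilson score interval, which tends to behave better for proportions near 0. Showing Wilson alongside Wald is a choice made for this sketch and is not something the question asks for; both function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(p, n, z):
    """Normal-approximation interval for a proportion."""
    half = z * sqrt(p * (1 - p) / n)
    return p - half, p + half


def wilson_interval(p, n, z):
    """Wilson score interval for a proportion."""
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


p_hat, n = 0.05, 400  # 5% CFT-positive in a sample of 400 cattle
results = {}
for level in (0.90, 0.95, 0.99):
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    results[level] = (wald_interval(p_hat, n, z), wilson_interval(p_hat, n, z))

for level, (wald, wilson) in results.items():
    print(f"{level:.0%}: Wald {wald[0]:.4f} to {wald[1]:.4f}, "
          f"Wilson {wilson[0]:.4f} to {wilson[1]:.4f}")
```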
5. A laboratory technologist at Haramaya veterinary clinic wanted to compare the mean PCV
values of sheep and goats of the same age. He enrolled ten animals from each species, and their
measured PCV values are shown in the table below. Assume that the two populations are
normally distributed with equal variances. Do these data provide sufficient evidence of a
difference between the mean PCV levels of the two species at the 90%, 95% and 99% confidence
levels? Interpret the result.
6. A study was conducted to evaluate the efficacy of trypanocidal drugs. The efficacy was assessed
using a change in PCV value. Three types of trypanocidal drugs were used and 24 trypanosome-
infected animals were randomly assigned to one of the treatment groups. The PCV values of the
animals after the experiment were measured and are presented in the table below.
Treatment I   PCV I   Treatment II   PCV II   Treatment III   PCV III   Treatment IV   PCV IV
Control       24      DA             35       Trypamidium     31        Homidium.B     24
Control       25      DA             40       Trypamidium     21        Homidium.B     25
Control       24      DA             23       Trypamidium     27        Homidium.B     25
Control       23      DA             30       Trypamidium     30        Homidium.B     24
Control       18      DA             25       Trypamidium     27        Homidium.B     23
Control       20      DA             40       Trypamidium     42        Homidium.B     22
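With four groups of six animals, a one-way ANOVA is a natural starting point. The sketch below computes the F statistic from the table values by partitioning the sums of squares. It stops short of a p-value because the standard library has no F distribution, and the critical value in the final comment is quoted from standard tables as an approximation.

```python
# PCV values after the experiment, copied from the table above.
pcv = {
    "Control": [24, 25, 24, 23, 18, 20],
    "DA": [35, 40, 23, 30, 25, 40],
    "Trypamidium": [31, 21, 27, 30, 27, 42],
    "Homidium.B": [24, 25, 25, 24, 23, 22],
}

values = [x for group in pcv.values() for x in group]
grand_mean = sum(values) / len(values)
group_means = {name: sum(group) / len(group) for name, group in pcv.items()}

ss_between = sum(len(group) * (group_means[name] - grand_mean) ** 2
                 for name, group in pcv.items())
ss_within = sum((x - group_means[name]) ** 2
                for name, group in pcv.items() for x in group)

df_between = len(pcv) - 1
df_within = len(values) - len(pcv)
f_stat = (ss_between / df_between) / (ss_within / df_within)

print({name: round(mean, 2) for name, mean in group_means.items()})
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
# For comparison, the tabulated 5% critical value of F(3, 20) is roughly 3.10.
```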
9. A physician claims that joggers' maximal oxygen uptake is greater than the
average of all adults. A sample of 15 joggers has a mean of 40.6 milliliters per kilogram
(ml/kg) and a standard deviation of 6 ml/kg. If the average of all adults is 36.7 ml/kg, is
there enough evidence to support the physician's claim at α=0.05?
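The claim is directional, so a one-tailed one-sample t-test fits. In the sketch below the critical value is hard-coded from a standard t table (df = 14, one-tailed α = 0.05), because the Python standard library does not include the t distribution; this is an assumption to verify against your own table.

```python
from math import sqrt

n = 15
x_bar = 40.6   # sample mean (ml/kg)
s = 6.0        # sample standard deviation (ml/kg)
mu0 = 36.7     # average of all adults (ml/kg)

t_stat = (x_bar - mu0) / (s / sqrt(n))
df = n - 1
t_crit = 1.761  # one-tailed, alpha = 0.05, df = 14 (from a standard t table)
reject_null = t_stat > t_crit

print(f"t = {t_stat:.3f}, df = {df}, critical value = {t_crit}")
print("reject H0 (supports the claim)" if reject_null else "fail to reject H0")
```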
10. Compare and interpret the mean difference in aflatoxin level between feed stuffs, using a
sample of 25 groundnut and 20 soybean feed samples at α=0.05. The data are given in the
table below.