RxExam's Biostatistics Questions & Answers: 2019-2020 Edition
RxExam’s Biostatistics
Questions & Answers
2019-2020 Edition
MANAN SHROFF
www.pharmacyexam.com © All Rights Reserved
This book is not intended as a substitute for the advice of physicians. Students or readers must consult their
physician about any existing problem. Do not use information in this book for any kind of self-treatment. Do
not administer any dose of mentioned drugs in this book without consulting your physician.
The author is not responsible for any kind of misinterpreted, incorrect, or misleading information or any
typographical errors in this book. Any doubtful or questionable answers should be checked in other
available reference sources.
No part of this book may be reproduced or transmitted in any form or by any means, electronic,
photocopying, recording, or otherwise, without prior written permission of the publisher.
RXEXAM® is a registered trademark of Pharmacy Exam of Krishna Publications Inc. Any unauthorized use of
this trademark will be considered a violation of law.
1. In biostatistics, confounding is normally defined as:

a. One chosen from a carefully defined population with the aid of a formal method to avoid bias.
b. A formal method to assign subjects by chance to one or the other treatment.
c. The effect of two or more variables that do not allow a conclusion about either one separately.
d. The systematic tendency of any factors associated with the design, conduct, analysis, and evaluation of the results of a trial to make the estimate of a treatment effect deviate from its true value.

2. Data that categorize patients as males or females are known as:

a. Random data
b. Nominal data
c. Ordinal data
d. Interval data

3. Which of the following data represents interval continuous data?

a. A number of cigarettes smoked per day by a person.
b. A number of children in a household.
c. Height of children.
d. Number of languages a person speaks.

4. Data can be transformed by using the logarithm, square root, or reciprocal. Which of the following is the most common data transformation used in medical research?

5. The NAPLEX scores for 10 students are 75, 82, 90, 92, 67, 95, 110, 80, 82, 86. Find the mean for the above data.

a. 83.6
b. 85.9
c. 836
d. 43

6. The NAPLEX scores for 10 students are 75, 82, 90, 92, 67, 95, 110, 80, 82, 86. Find the median for the above data.

a. 110
b. 84
c. 82
d. 67

7. The NAPLEX scores for 10 students are 75, 82, 90, 92, 67, 95, 110, 80, 82, 86. Find the mode for the above data.

a. 75
b. 90
c. 86
d. 82

8. Classifying continuing educational experience into categories including “strongly agree,” “agree,” and “disagree,” is an example of which type of variable or data?

a. Nominal
b. Ordinal
c. Interval
d. Ratio

9. Which of the following statements about prevalence is/are TRUE?
c. It represents the absolute difference of the event rate between the treatment and control groups.
d. It is a representation of the number of patients who need to be treated to prevent one additional event compared to treating the same number of patients with the control therapy.

12. Calculate the value of RR if the risk of heart failure associated with an intervention drug is 5% versus 9% with placebo.

13. Which of the following statements best describes the interpretation of the RR value obtained in the previous question?

a. There was no association between the study drug and heart failure.
b. The risk of heart failure in patients taking the study drug was less than the risk of heart failure in patients taking placebo.

16. If the calculated absolute risk reduction (ARR) for a heart attack event was 5.1%, what would be the number needed to treat?

a. 11
b. 2.65
c. 3.2
d. 20

17. Which of the following statements is TRUE about the standard deviation (SD)?

c. It is used to display bimodal or skewed data.
d. A large SD shows that individual data points are clustered closer to the mean.

18. If the sample size in a study is 100 subjects and the SD for blood glucose is 10 mg/dl, what is the standard error of the mean?

a. 65.9
d. The regression model explained 30% of the total variance and is not a good fit.

146. In a study examining the relationship between drinking coffee in the late evening and the likelihood of insomnia, the r² value for the regression line is 1.0. What does this indicate?

a. For each additional cup of coffee, the chance of insomnia occurring increases by 1.0%.
b. Since r² equals 1.0, this indicates there is no relationship between drinking coffee and developing insomnia.
c. Since r² equals 1.0, the regression line perfectly fits the data.
d. For each additional cup of coffee, there is no effect on sleep.

147. Which of the following information is/are TRUE about an odds ratio (OR)?

I. An OR = 1 indicates there is no difference between the two arms of the study.
II. If the OR is > 1, the control is better than the intervention.
III. If the OR is < 1, the intervention is better than the control.

a. I only
b. I and II only
c. II and III only
d. All

148. Suppose a study examined the relationship between workers exposed to asbestos for more than 10 years and the risk of developing asthma. The study found an increased risk of asthma in the group exposed to asbestos (OR: 1.16, 95% CI: 0.9-1.5). What does this mean?

a. Employees who were exposed to asbestos for more than 10 years were 1.16 times less likely to develop asthma than those who were not.
b. Employees who were exposed to asbestos for more than 10 years had 16% decreased odds of developing asthma than those who were not.
c. The results are not significant because the confidence interval includes 1.0.
d. The results are not significant because the confidence interval is greater than 1.0.

149. What does “OR 0.4 95%CI 0.4-0.6 p <0.05” mean?

a. The odds of death in the intervention groups are 60% less than the odds of death in the control groups with the true population effect between 60% and 40%. This result was statistically significant.
b. The odds of death in the intervention groups are 60% more than the odds of death in the control groups with the true population effect between 60% and 40%. This result was statistically significant.
c. The odds of death in the intervention groups are 60% less than the odds of death in the control groups with the true population effect between 40% and 60%. This result was statistically NOT significant.
d. The odds of death in the intervention groups are 60% more than the odds of death in the control groups with the true population effect between 40% and 60%. This result was statistically NOT significant.

150. A drug company-funded double blind randomised controlled trial evaluated the efficacy of an adenosine diphosphate (ADP) receptor antagonist, Cangrelor, vs Clopidogrel in patients undergoing urgent or elective Percutaneous Coronary Intervention (PCI) who were followed up for specific complications for 48 hrs (Bhatt et al. 2009). The results section reported “OR 0.65 95% confidence interval [CI], 0.55 to 0.83; P=0.005”. What does this mean?
Biostatistics Answers
1. (c) The effect of two or more variables that do not allow a conclusion about either one separately is defined as confounding.

Random sample: one chosen from a carefully defined population with the aid of a formal method to avoid bias and confounding.

Randomization: in comparative trials, a formal method to assign subjects by chance to one or the other treatment.

Random errors: errors that follow the principle of indifference.

Bias: The systematic tendency of any factors associated with the design, conduct, analysis, and evaluation of the results of a trial to make the estimate of a treatment effect deviate from its true value.

Confounding: The effect of two or more variables that do not allow a conclusion about either one separately.

Validity: (a). In logic, an argument is valid if the conclusions follow from the premises.

(b). In pharmacological sciences, a method is valid if it measures what it should, is reproducible and responsive to change, e.g., by a treatment.

Statistical methods for analysis mainly depend on the type of data. Generally, data give a picture of variability and central tendency. Therefore, it is very important to understand the types of data.

1. Nominal data: This is synonymous with categorical data where data is simply assigned “names” or categories based on the presence or absence of certain attributes/characteristics without any ranking between the categories. For example, patients are categorized by gender as males or females; by religion as Hindu, Muslim, or Christian. It also includes binomial data, which refers to two possible outcomes. For example, the outcome of cancer may be death or survival; therapy with drug ‘X’ will show improvement or no improvement at all.

2. Ordinal data: It is also called ordered categorical or graded data. Generally, this type of data is expressed as scores or ranks. There is a natural order among categories, and they can be ranked or arranged in order. For example, pain associated with cancer may be classified as mild, moderate, and severe. Since there is an order between the three grades of pain, this type of data is called ordinal. To indicate the intensity of pain, it may also be expressed as scores (mild = 1, moderate = 2, severe = 3). Hence, data can be arranged in an order and rank.

3. Interval data: This type of data is characterized by an equal and definite interval between two measurements. For example, weight is expressed as 100, 110, 120, 130, 140 lbs. The interval between 100 and 110 is the same as that between 130 and 140.

3. (c) Height of children.

Interval type of data can be either continuous or discrete. A continuous variable can take any value within a given range.

For example: hemoglobin (Hb) level may be taken as 11.3, 12.6, 13.4 gm % while a discrete variable is usually assigned integer values i.e. does not have
Sometimes, certain data may be converted from one form to another to reduce skewness and make it follow the normal distribution. For example, drug doses are converted to their log values and plotted in a dose response curve to obtain a straight line so that analysis becomes easy. Data can be transformed by taking the logarithm, square root, or reciprocal. Logarithmic conversion is the most common data transformation used in medical research.

5. (b) 85.90

Mean is the common measure of central tendency, most widely used in calculations of averages. It is the average of a data set. So, in the above example, the mean of the data can be calculated by adding up the individual values and dividing the sum by the number of students (n).

The sample mean x̄ is calculated as the sum of the values divided by n:

x̄ = (sum of values) / n = 859 / 10 = 85.9

To find the median in the above data, we need to arrange the data in either ascending or descending order. Therefore:

67, 75, 80, 82, 82, 86, 90, 92, 95, 110

In case of an even number of values, the median should be the average of the two middle values. Therefore, in the above case, it should be:

(82 + 86) / 2 = 84

If we remove the last value, 110, from the calculation, the median in case of an odd number of values (9) should be the central value of the data set:

67, 75, 80, 82, 82, 86, 90, 92, 95

= 82
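The mean, median, and mode asked for in questions 5 to 7 can be checked with a short sketch. It uses Python's standard `statistics` module; the variable names are illustrative only and are not part of the book.

```python
from statistics import mean, median, multimode

# NAPLEX scores for the 10 students in questions 5-7
scores = [75, 82, 90, 92, 67, 95, 110, 80, 82, 86]

avg = mean(scores)         # sum of the values divided by n
mid = median(scores)       # average of the two middle values when n is even
modes = multimode(scores)  # every value tied for the highest frequency

print(avg, mid, modes)

# Dropping the largest score (110) leaves nine values, so the median
# becomes the single central value of the sorted list.
mid_odd = median(sorted(scores)[:-1])
print(mid_odd)
```

`multimode` returns a list, so it also covers data sets in which two or more values are tied for the highest frequency.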
75, 82, 75, 92, 67, 95, 75, 80, 82, 86.

In this data set, the mode will be 75 and not 82 since it appears three times.

A set of numbers can have more than one mode (a set with two modes is known as bimodal) if there are multiple numbers that occur with equal frequency, and more times than the others in the set.

75, 82, 75, 92, 67, 95, 75, 80, 82, 86, 82.

In this example, both the number 75 and the number 82 are modes, since they both appear three times.

If no number in a set of numbers occurs more than once, that set has no mode:

For example: 75, 82, 85, 90, 92

8. (b) Classifying continuing educational experience into categories is a categorical variable as the value functions as a label rather than a numeric value. The type of categorical variable is ordinal since categories can be ranked in a specific order.

9. (b) III only.

Incidence is a measure of the rate of occurrence of a condition. In other words, incidence is the number of new cases of a condition that develop during a specific period of time. For example, the incidence of hypertension may be documented as the number of new cases of hypertension per year within a given sample or population.

Incidence can be measured using methods such as randomized controlled trials (e.g., to determine the incidence of adverse effects of a medication) or cohort studies, which follow groups of patients over time.

Prevalence, on the other hand, is the proportion of a population found to have a condition at a single point in time. For example, the prevalence of hypertension may be documented as the percentage of the US population with hypertension.

Prevalence is measured using cross-sectional study designs, such as a survey or a census, or by using administrative data such as medical, hospital, or prescription drug claims.

10. (b) This statement is an example of prevalence since the statement is an expression of a proportion of a population found to have a condition at a single point in time.

11. (a) The relative risk (RR) is the risk of an event or outcome occurring in a group of interest in relation to a control group. The group of interest can be an intervention (e.g., a medication) or a pre-specified characteristic (e.g., those with hypertension).

An example would be the risk of heart attack in patients taking a study drug versus a control group of patients taking a placebo.

The relative risk (RR) can be calculated by dividing the event rate (i.e., heart attacks) in the intervention group by the event rate (i.e., heart attacks) in the control group.

12. (c) 0.55.

The relative risk (RR) can be calculated by using the following formula:

RR = (event rate in intervention group) / (event rate in control group)

RR = 5 / 9 ≈ 0.556, listed as 0.55 in the answer choices

13. (b) The relative risk (RR) is the risk of an event or outcome occurring in a group of interest in relation to a control group. The group of interest can be an intervention (e.g., a medication) or a pre-specified characteristic (e.g., those with hypertension).

An example would be the risk of heart failure in patients taking a study drug versus patients taking placebo. The RR can be calculated by dividing the event rate (i.e., heart failure) in the intervention group by the event rate in the control group.
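The relative risk calculation from question 12 (5% with the study drug versus 9% with placebo) can be sketched as follows. The function name `relative_risk` is a hypothetical helper chosen for illustration, and event rates are passed as proportions rather than percentages.

```python
def relative_risk(rate_intervention: float, rate_control: float) -> float:
    """Event rate in the intervention group divided by the event rate in the control group."""
    if rate_control <= 0:
        raise ValueError("control event rate must be greater than zero")
    return rate_intervention / rate_control


# Question 12: heart failure risk of 5% with the drug versus 9% with placebo.
rr = relative_risk(0.05, 0.09)

print(f"RR = {rr:.3f}")
```

The unrounded ratio is 5/9, which is about 0.556; the answer key lists it as 0.55.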
1. If the RR is 1, there is no difference between the two groups; or, in other words, there is no association between exposure to a factor and the outcome of interest (e.g., there was no association between the study drug and heart failure).

2. An RR less than 1 indicates a negative association (e.g., an RR of 0.55, as found in question 12, means the risk of heart failure in patients taking the study drug was less than the risk of heart failure in patients taking placebo).

The number needed to treat (NNT) is a representation of the number of patients who need to be treated to prevent one additional event compared to treating the same number of patients with the control therapy.

It can be easily calculated as it is the inverse of the ARR, with the ARR expressed as a proportion. The NNT should always be rounded up to the nearest integer, as you cannot have a fraction of a patient (e.g., an NNT of 1.6 would be rounded up to an NNT of 2).
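As a minimal sketch of the NNT rule described above (inverse of the ARR, always rounded up), the following applies it to the 5.1% ARR from question 16. The helper name `number_needed_to_treat` is an assumption made for this example, and the ARR is passed as a proportion (0.051 for 5.1%).

```python
import math


def number_needed_to_treat(arr: float) -> int:
    """NNT = 1 / ARR, rounded up to the next whole patient."""
    if arr <= 0:
        raise ValueError("ARR must be greater than zero")
    return math.ceil(1 / arr)


# Question 16: ARR of 5.1% gives 1 / 0.051, about 19.6, which rounds up to 20.
nnt = number_needed_to_treat(0.051)

# An ARR of 62.5% gives 1 / 0.625 = 1.6, which rounds up to 2.
nnt_small = number_needed_to_treat(0.625)

print(nnt, nnt_small)
```

Using `math.ceil` rather than ordinary rounding is what enforces the "always round up" convention.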
margin been utilized, the treatment would have been found to be “non-inferior” if the event rates were as high as 8.75%.

On the other hand, overly stringent NI margins can prevent an effective therapy from being deemed non-inferior.

113. (b) After a non-inferiority clinical trial, a new therapy may be accepted as effective, even if its treatment effect is slightly less than that of the current standard. It is therefore possible that, after a series of trials where the new therapy is slightly worse than the preceding drugs, an ineffective or harmful therapy might be incorrectly declared efficacious; this is known as 'bio-creep'.

To provide an example, suppose that a theoretical Drug A is deemed to be superior to placebo. Several years later, Drug B is found to be non-inferior to Drug A with a certain NI margin. Then, Drug C is compared to Drug B via an NI study at some point later. During each of these studies, the new agent is found to be “not unacceptably” worse than the previous agent. It may then be unknown if Drug C is more effective than placebo.

114. (d) The Chi-Squared Test is used to test for statistical significance of a dependent categorical variable between two or more independent groups. It is used to determine whether observed sample frequencies differ significantly from expected frequencies specified in the null hypothesis.

115. (c) The only assumption for the Chi-Squared Test is that all cells must have an expected value of at least five.

The chi-square goodness of fit test is appropriate when the following conditions are met:

1. The sampling method is simple random sampling.
2. The variable under study is categorical.
3. The expected value of the number of sample observations in each level of the variable is at least 5.

116. (a) A categorical variable (sometimes called a nominal variable) is one that has two or more categories, but there is no intrinsic ordering to the categories. For example, gender is a categorical variable having two categories (male and female) and there is no intrinsic ordering to the categories. Hair color is also a categorical variable having a number of categories (blonde, brown, brunette, red, etc.) and again, there is no agreed way to order these from highest to lowest. A purely categorical variable is one that simply allows you to assign categories, but you cannot clearly order the categories. If the variable has a clear ordering, then that variable would be an ordinal variable, as described below.

An ordinal variable is similar to a categorical variable. The difference between the two is that there is a clear ordering of the categories. For example, suppose you have a variable, economic status, with three categories (low, medium and high). In addition to being able to classify people into these three categories, you can order the categories as low, medium and high. Now consider a variable like educational experience (with values such as elementary school graduate, high school graduate, some college and college graduate). These also can be ordered as elementary school, high school, some college, and college graduate. Even though we can order these from lowest to highest, the spacing between the values may not be the same across the levels of the variable. If these categories were equally spaced, then the variable would be an interval variable.

An interval variable is similar to an ordinal variable, except that the intervals between the values of the interval variable are equally spaced. For example, suppose you have a variable such as annual income that is measured in dollars, and we have three people who make $50,000, $60,000 and $70,000. The second person makes $10,000 more than the first person and $10,000 less than the third person, and the size of these intervals is the same. If there were two other people who make $100,000 and $110,000, the size of that interval between these two people is also the same ($10,000).
The formula for calculating expected cell value in the Chi-Squared Test is:

Expected Cell = (Row Total x Column Total) / Total number of subjects

Expected Cell Value   ?     ?
Total                 5     115   120

As shown in the above table, we need to find values for each expected cell denoted with “?”. Using the formula, we can calculate the expected cell value in the intervention and control groups.

A. Intervention Group With Stroke

Expected Cell = (65 x 5) / 120 = 2.71

B. Intervention Group Without Stroke

Expected Cell = (65 x 115) / 120 = 62.29

C. Control Group With Stroke

Expected Cell = (55 x 5) / 120 = 2.29

So, placing all values in the above table, it should look like:

There were two cells (2.71 and 2.29) in our calculation with an expected cell value less than 5. Therefore, this data does not meet the chi-squared assumption of at least five expected in each cell. So, Fisher's Exact Test shall be used instead.

118. (c) The degrees of freedom for a Chi-squared Test can be calculated by using the following formula:

Degrees of Freedom = (Rows – 1) x (Columns – 1)

Therefore, the degrees of freedom in a 2 x 3 contingency chi-squared test should be:

DF = (R – 1) x (C – 1)
DF = (2 – 1) x (3 – 1)
DF = 1 x 2 = 2

119. (a) HbA1c is a continuous dependent variable. As we are testing two independent groups and it meets the general rule of thumb of having at least 30 subjects in each group, the t-test is the best option.

Continuous variable: A continuous variable has numeric values, and the relative magnitude of the values is significant. This means that data can be ranked low to high or high to low. An example of a continuous variable is the value of HbA1c in a diabetic patient.

Discrete variable: Suppose we flip a coin and count the number of tails. The number of tails could be any integer value between 0 and plus infinity. We could not, for example, get 4.6 tails.

120. (a) Variables are classified as either independent or dependent variables. In research studies, the independent variable is the variable that is varied or manipulated by the researcher.

Expected Value
          Biology   Physics   Chemistry   Total
Male      ?         ?         ?           ?
Female    ?         ?         ?           ?
Total     ?         ?         ?           ?

To calculate the expected value for each cell, we need to use the following formula:

Expected Cell = (Row Total x Column Total) / Total number of subjects

Expected Cell = (49 x 32) / 100 = 15.68
Expected Cell = (49 x 27) / 100 = 13.23
Expected Cell = (49 x 41) / 100 = 20.09
Expected Cell = (51 x 32) / 100 = 16.32
Expected Cell = (51 x 27) / 100 = 13.77
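The expected-cell formula (row total x column total / grand total), the "at least five per cell" check, and the degrees-of-freedom formula can be sketched together as below. The function names are hypothetical. The control-group total of 55 is derived here as 120 minus 65, and the pairing of the totals 49/51 with male/female and 32/27/41 with biology/physics/chemistry is assumed from the order in which the calculations appear.

```python
def expected_counts(row_totals, col_totals):
    """Expected cell = row total x column total / grand total, for every cell."""
    grand_total = sum(row_totals)
    if grand_total != sum(col_totals):
        raise ValueError("row and column totals must share the same grand total")
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def degrees_of_freedom(n_rows, n_cols):
    """Degrees of freedom = (rows - 1) x (columns - 1)."""
    return (n_rows - 1) * (n_cols - 1)


# 2 x 2 stroke example: rows = intervention (65), control (55, derived as 120 - 65);
# columns = with stroke (5), without stroke (115).
stroke = expected_counts([65, 55], [5, 115])
small_cells = [round(e, 2) for row in stroke for e in row if e < 5]

# 2 x 3 example: rows assumed male (49), female (51);
# columns assumed biology (32), physics (27), chemistry (41).
subjects = [[round(e, 2) for e in row]
            for row in expected_counts([49, 51], [32, 27, 41])]

print(small_cells)   # cells below 5 point toward Fisher's exact test
print(subjects)
print(degrees_of_freedom(2, 3))
```

Any non-empty `small_cells` list signals that the chi-squared assumption is not met, which is the situation in the stroke example.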
The next step is to find the Chi-square test value and compare it with our previously obtained critical value. The formula to find the Chi-square test value is:

χ² = Σ (observed value − expected value)² / expected value

122. (d) The pain scores are an ordinal variable, so a non-parametric test must be used. If the data is part of one single group, the Wilcoxon signed rank test shall be used. If the data is part of two independent groups, the Mann-Whitney U test shall be used. For three or more independent groups, the Kruskal-Wallis test should be used.
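The chi-square statistic, the sum over all cells of (observed − expected)² / expected, can be sketched as follows. The observed counts are hypothetical: they are not given in this section and were chosen only so that the row totals (49, 51) and column totals (32, 27, 41) match the expected-value example. The function name is also an assumption.

```python
def chi_square_statistic(observed):
    """Sum of (observed - expected)^2 / expected over every cell of the table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected cell = row total x column total / grand total
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - expected) ** 2 / expected
    return statistic


# Hypothetical observed counts (illustration only, not from the book).
observed = [
    [15, 12, 22],  # first row, total 49
    [17, 15, 19],  # second row, total 51
]

chi2 = chi_square_statistic(observed)
print(round(chi2, 3))
```

The resulting value would then be compared with the critical value for the table's degrees of freedom; for 2 degrees of freedom at a 0.05 significance level the tabulated critical value is 5.991, though the significance level used in the original question is not shown here.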