Chapter 3
Probability
The Basis of Statistical Inference
• Key words:
• Probability, objective probability, subjective probability,
  equally likely, mutually exclusive, multiplicative rule,
  conditional probability, independent events, Bayes' theorem
                         Dr.Elias                    2
3.1 Introduction
• The concept of probability is frequently encountered in
  everyday communication. For example, a physician may
  say that a patient has a 50-50 chance of surviving a certain
  operation.
  Another physician may say that she is 95 percent certain
  that a patient has a particular disease.
• Most people express probabilities as percentages.
• It is more convenient, however, to express probabilities as
  fractions. Thus, we measure the probability of the
  occurrence of some event by a number between 0 and 1.
• The more likely the event, the closer the number is to one.
  An event that can't occur has a probability of zero, and an
  event that is certain to occur has a probability of one.
3.2 Two Views of Probability: Objective and Subjective
• *** Objective Probability
•    ** Classical and Relative
• Some definitions:
1. Equally likely outcomes:
Outcomes that have the same chance of
  occurring.
2. Mutually exclusive events:
Two events A and B are said to be mutually exclusive if they
  cannot occur simultaneously, that is, A ∩ B = Φ.
• The universal set (S): the set of all possible outcomes.
• The empty set (Φ): the set that contains no elements.
• The event E: a set of outcomes in S that share a
  certain characteristic.
• Classical Probability: If an event can occur in N
  mutually exclusive and equally likely ways, and if m
  of these possess a trait E, the probability of the
  occurrence of event E is equal to m/N.
• For example: in the rolling of a die, each of the six
  sides is equally likely to be observed. So, the
  probability that a 4 will be observed is equal to 1/6.
• Relative Frequency Probability:
• Def: If some process is repeated a large number of
  times, n, and if some resulting event E occurs m
  times, the relative frequency of occurrence of E,
  m/n, will be approximately equal to the probability of E:
  P(E) ≈ m/n.
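The relative frequency definition can be tried out with a small simulation. The sketch below assumes a fair six-sided die simulated with Python's random module; the function name and the fixed seed are illustrative choices, not part of the lecture. It compares m/n for the event "a 4 is observed" with the classical value 1/6.

```python
import random

def relative_frequency_of_four(n, seed=0):
    """Roll a simulated fair die n times and return m/n for the event 'a 4 is observed'."""
    rng = random.Random(seed)  # fixed seed (arbitrary) so that runs are repeatable
    m = sum(1 for _ in range(n) if rng.randint(1, 6) == 4)
    return m / n

# Classical probability: N = 6 equally likely sides, m = 1 favourable side.
classical = 1 / 6

for n in (100, 10_000, 200_000):
    print(n, round(relative_frequency_of_four(n), 4), "classical:", round(classical, 4))
```

As n grows, the printed relative frequency should settle near the classical value, which is the sense of "approximately equal" in the definition.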
• *** Subjective Probability :
• Probability measures the confidence that a particular
  individual has in the truth of a particular proposition.
• For Example : the probability that a cure for cancer
  will be discovered within the next 10 years.
 3.3 Elementary Properties of Probability:
• Given some process (or experiment) with n
   mutually exclusive events E1, E2, E3, ..., En
   (which together cover all possible outcomes), then
• 1- P(Ei) ≥ 0, i = 1, 2, 3, ..., n
• 2- P(E1) + P(E2) + ... + P(En) = 1
• 3- P(Ei + Ej) = P(Ei) + P(Ej),
     where Ei and Ej are mutually exclusive.
              Rules of Probability
• 1-Addition Rule
•      P(A U B)= P(A) + P(B) – P (A∩B )
•   2- If A and B are mutually exclusive (disjoint) ,then
•         P (A∩B ) = 0
•   Then , addition rule is
•         P(A U B)= P(A) + P(B) .
•   3- Complementary Rule
•         P(A' )= 1 – P(A)
•   where A' is the complement of event A.
•   Consider the following example
                 Table in Example
Family history of            Early ≤ 18    Later > 18    Total
mood disorders                  (E)           (L)
Negative (A)                     28            35           63
Bipolar disorder (B)             19            38           57
Unipolar (C)                     41            44           85
Unipolar and bipolar (D)         53            60          113
Total                           141           177          318
   **Answer the following questions:
Suppose we pick a person at random from this sample.
1- What is the probability that this person will be 18 years old or younger?
2- What is the probability that this person has a family history of mood disorders
   Unipolar (C)?
3- What is the probability that this person has no family history of mood
   disorders Unipolar (C')?
4- What is the probability that this person is 18 years old or younger or has
   no family history of mood disorders Unipolar (C)?
5- What is the probability that this person is more than 18 years old and has a
   family history of mood disorders Unipolar and Bipolar (D)?
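The questions above only need row totals, column totals, and single cells of the table. A minimal sketch follows, assuming the counts are typed in by hand and exact fractions are wanted; the helper names p_row, p_col and p_joint are illustrative. The last line shows the addition rule on the events E and C as an example and is not tied to a particular question.

```python
from fractions import Fraction

# Counts from the table: rows are family history, columns are age at onset.
counts = {
    "A": {"E": 28, "L": 35},   # negative family history
    "B": {"E": 19, "L": 38},   # bipolar disorder
    "C": {"E": 41, "L": 44},   # unipolar
    "D": {"E": 53, "L": 60},   # unipolar and bipolar
}
total = sum(sum(row.values()) for row in counts.values())  # 318 subjects

def p_row(r):
    """Probability of a family-history category (row total / grand total)."""
    return Fraction(sum(counts[r].values()), total)

def p_col(c):
    """Probability of an age-at-onset category (column total / grand total)."""
    return Fraction(sum(row[c] for row in counts.values()), total)

def p_joint(r, c):
    """Probability of a row category and a column category occurring together."""
    return Fraction(counts[r][c], total)

p_E = p_col("E")                 # question 1
p_C = p_row("C")                 # question 2
p_not_C = 1 - p_C                # question 3, complementary rule
p_L_and_D = p_joint("D", "L")    # question 5
p_E_or_C = p_E + p_C - p_joint("C", "E")   # addition rule, illustrated on E and C

for name, value in [("P(E)", p_E), ("P(C)", p_C), ("P(C')", p_not_C),
                    ("P(L and D)", p_L_and_D), ("P(E or C)", p_E_or_C)]:
    print(name, value, round(float(value), 4))
```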
         Conditional Probability:
P(A|B) is the probability of A given that B has
  happened.
• P(A|B) = P(A∩B) / P(B),  P(B) ≠ 0
• P(B|A) = P(A∩B) / P(A),  P(A) ≠ 0
            Example Continued
From the previous example, answer the following:
• Suppose we pick a person at random and find that he is 18
  years or younger (E). What is the probability that this
  person will be one with a negative family history of
  mood disorders (A)?
• Suppose we pick a person at random and find that he has a
  family history of mood disorders (D). What is the probability that
  this person will be 18 years or younger (E)?
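Both questions are conditional probabilities, so each is a joint probability divided by the probability of the event that is known to have occurred. The sketch below assumes the counts are read from the mood-disorder table (318 subjects in total); the function name conditional is illustrative.

```python
from fractions import Fraction

n_total = 318
n_E = 141          # early onset (column total)
n_D = 113          # unipolar and bipolar family history (row total)
n_A_and_E = 28     # negative family history and early onset
n_E_and_D = 53     # early onset and family history D

def conditional(p_joint, p_given):
    """P(X|Y) = P(X and Y) / P(Y), defined only when P(Y) is not zero."""
    if p_given == 0:
        raise ValueError("the conditioning event has probability zero")
    return p_joint / p_given

p_A_given_E = conditional(Fraction(n_A_and_E, n_total), Fraction(n_E, n_total))
p_E_given_D = conditional(Fraction(n_E_and_D, n_total), Fraction(n_D, n_total))

print("P(A|E) =", p_A_given_E, "=", round(float(p_A_given_E), 4))
print("P(E|D) =", p_E_given_D, "=", round(float(p_E_given_D), 4))
```

The grand total cancels in each ratio, so P(A|E) reduces to 28/141 and P(E|D) to 53/113: conditioning simply restricts attention to the column or row that is known.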
   Calculating a Joint Probability:
• Example Continued
• Suppose we pick a person at random from the
  318 subjects. Find the probability that he will
  be early (E) and have no family history of mood
  disorders (A).
            Multiplicative Rule:
•   P(A∩B) = P(A|B) P(B)
•   P(A∩B) = P(B|A) P(A)
•   where
•   P(A): marginal probability of A.
•   P(B): marginal probability of B.
•   P(B|A): conditional probability of B given A.
                    Text Book: Basic Concepts and Methodology for the Health Sciences
           Example Continued
• From previous example, we wish to compute
  the joint probability of Early age at onset(E)
  and a negative family history of mood
  disorders(A) from a knowledge of an
  appropriate marginal probability and an
  appropriate conditional probability.
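One way to carry this out is sketched below. It assumes the marginal probability P(E) = 141/318 and the conditional probability P(A|E) = 28/141 are both read from the mood-disorder table, and it compares the product with the joint probability taken directly from the cell count.

```python
from fractions import Fraction

p_E = Fraction(141, 318)          # marginal probability of early age at onset
p_A_given_E = Fraction(28, 141)   # conditional probability of A given E

# Multiplicative rule: P(E and A) = P(E) * P(A|E)
p_E_and_A = p_E * p_A_given_E

# Joint probability read directly from the table cell (28 of 318 subjects)
direct = Fraction(28, 318)

print(p_E_and_A, round(float(p_E_and_A), 4), p_E_and_A == direct)
```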
          Independent Events:
• If the occurrence of A has no effect on the probability of B,
  we say that A and B are independent events.
• Then,
•       1- P(A∩B) = P(B) P(A)
•       2- P(A|B) = P(A)
•       3- P(B|A) = P(B)
                     Example
• In a certain high school class consisting of 60 girls
  and 40 boys, it is observed that 24 girls and 16 boys
  wear eyeglasses. If a student is picked at random
  from this class, the probability that the student
  wears eyeglasses, P(E), is 40/100 or 0.4.
• What is the probability that a student picked at
  random wears eyeglasses given that the student is a
  boy?
• What is the probability of the joint occurrence of the
  events of wearing eyeglasses and being a boy?
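A short check of this example is sketched below, assuming B denotes the event "the student is a boy" (the label B is introduced here for illustration; the slide names only E). It computes the conditional and joint probabilities and tests whether E and B satisfy the product condition for independence.

```python
from fractions import Fraction

girls, boys = 60, 40
girls_glasses, boys_glasses = 24, 16
n = girls + boys

p_E = Fraction(girls_glasses + boys_glasses, n)   # wears eyeglasses: 40/100
p_B = Fraction(boys, n)                           # is a boy: 40/100
p_E_and_B = Fraction(boys_glasses, n)             # wears eyeglasses and is a boy: 16/100
p_E_given_B = p_E_and_B / p_B                     # conditional probability: 16/40

# Independence holds when the joint probability equals the product of the marginals.
independent = (p_E_and_B == p_E * p_B)

print("P(E|B) =", float(p_E_given_B))
print("P(E and B) =", float(p_E_and_B))
print("independent:", independent)
```

Here P(E|B) equals P(E), so knowing that the student is a boy does not change the probability of wearing eyeglasses.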
          Example 3.4.8 Page 69
• Suppose that of 1200 admissions to a general
  hospital during a certain period of time, 750 are
  private admissions. If we designate these as a set A,
  then compute P(A) and P(A').
                  Marginal Probability:
• Definition:
• Given some variable that can be broken down into m
  categories designated by A1, A2, ..., Ai, ..., Am and
  another jointly occurring variable that is broken down
  into n categories designated by B1, B2, ..., Bj, ..., Bn,
  the marginal probability of Ai, P(Ai), is equal to the sum
  of the joint probabilities of Ai with all the categories of
  B. That is,
   P(Ai) = Σ P(Ai ∩ Bj), for all values of j
• Example
• Use the data of the Table and the rule of marginal
   probabilities to calculate P(E).
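The sum in the definition can be written out for P(E) as follows. This is a minimal sketch that assumes the "Early" column of the mood-disorder table, with the family-history categories A, B, C and D playing the role of the Bj.

```python
from fractions import Fraction

total = 318
# Joint counts of early onset (E) with each family-history category
early_counts = {"A": 28, "B": 19, "C": 41, "D": 53}

# P(E) = P(E and A) + P(E and B) + P(E and C) + P(E and D)
joint = {k: Fraction(v, total) for k, v in early_counts.items()}
p_E = sum(joint.values())

print(p_E, round(float(p_E), 4))
```

The result agrees with the column total divided by the grand total, 141/318.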
Q1: In a study of violent victimization of women and
 men, Porcelli et al. (A-2) collected information
 from 679 women and 345 men aged 18 to 64
 years at several family practice centers in the
 metropolitan Detroit area. Patients filled out a
 health history questionnaire that included a
 question about victimization. The following table
 shows the sample subjects cross-classified by sex
 and type of violent victimization reported. The
 victimization categories are defined as no
 victimization, partner victimization (and not by
 others), victimization by persons other than
partners (friends, family members, or strangers),
  and those who reported multiple victimization.
                No         Partners       Nonpartners     Multiple      Total
           Victimization                                Victimization
  Women        611           34                  16          18         679
   Men         308           10                  17          10         345
   Total       919           44                  33          28         1024
(a) Suppose we pick a subject at random from this
  group. What is the probability that this subject
  will be a woman?
(b) What do we call the probability calculated in part
   a?
(c) Show how to calculate the probability asked for
   in part a by two additional methods.
(d) If we pick a subject at random, what is the
   probability that the subject will be a woman and
   have experienced partner abuse?
(e) What do we call the probability calculated in part
   d?
(f) Suppose we picked a man at random. Knowing
   this information, what is the probability that he
experienced abuse from nonpartners?
(g) What do we call the probability calculated in
   part f?
(h) Suppose we pick a subject at random. What
   is the probability that it is a man or someone
   who experienced abuse from a partner?
(i) What do we call the method by which you
   obtained the probability in part h?
Q2: Fernando et al. (A-3) studied drug-sharing
 among injection drug users in the South Bronx in
 New York City. Drug users in New York City use the
 term “split a bag” or “get down on a bag” to refer
 to the practice of dividing a bag of heroin or other
 injectable substances. A common practice
 includes splitting drugs after they are dissolved in
 a common cooker, a procedure with considerable
 HIV risk. Although this practice is common, little is
 known about the prevalence of such practices.
 The researchers asked injection drug users in four
 neighborhoods in the South Bronx if they ever
“got down on” drugs in bags or shots. The results
  classified by gender and splitting practice are
  given below:
  Gender     Split Drugs   Never Split Drugs   Total
  Male           349              324            673
  Female         220              128            348
  Total          569              452           1021
State the following probabilities in words and calculate:
(a) P(Male ∩ Split Drugs)   Ans: 0.3418
(b) P(Male U Split Drugs)   Ans: 0.8746
(c) P(Male | Split Drugs)   Ans: 0.6134
(d) P(Male)                 Ans: 0.6592
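The four stated answers can be reproduced from the table counts. The sketch below assumes that parts (a), (b) and (c) are the joint, union and conditional probabilities respectively, which is the reading that matches the given answers.

```python
from fractions import Fraction

male_split, male_never = 349, 324
female_split, female_never = 220, 128
n = male_split + male_never + female_split + female_never  # 1021 subjects

p_male = Fraction(male_split + male_never, n)
p_split = Fraction(male_split + female_split, n)

p_male_and_split = Fraction(male_split, n)                # (a) joint probability
p_male_or_split = p_male + p_split - p_male_and_split     # (b) addition rule
p_male_given_split = p_male_and_split / p_split           # (c) conditional probability

for label, value in [("a", p_male_and_split), ("b", p_male_or_split),
                     ("c", p_male_given_split), ("d", p_male)]:
    print(label, round(float(value), 4))
```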
Q3: Laveist and Nuru-Jeter (A-4) conducted a
 study to determine if doctor-patient race
 concordance was associated with greater
 satisfaction with care. Toward that end, they
 collected a national sample of African-
 American, Caucasian, Hispanic, and Asian-
 American respondents. The following table
 classifies the race of the subjects as well as
 the race of their physician:
                                        Patient Race
Physician's Race          Caucasian   African-American   Hispanic   Asian-American   Total
White                        779            436             406          175          1796
African-American              14            162              15            5           196
Hispanic                      19             17             128            2           166
Asian/Pacific-Islander        68             75              71          203           417
Other                         30             55              56            4           145
Total                        910            745             676          389          2720
(a) What is the probability that a randomly
  selected subject will have an Asian/Pacific-
  Islander physician? Ans: 0.1533
(b) What is the probability that an African-American
  subject will have an African-American physician?
 Ans: 0.2174
(c) What is the probability that a randomly selected
  subject in the study will be Asian-American and
  have an Asian/Pacific-Islander physician? Ans:
  0.075
(d) What is the probability that a subject chosen at
  random will be Hispanic or have a Hispanic
  physician? Ans: 0.2625
(e) Use the concept of complementary events to find
  the probability that a subject chosen at
  random in the study does not have a white
  physician. Ans: 0.3397
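The answers to parts (a) to (e) can be checked against the table with a short script. This is a sketch that assumes the counts are entered by hand, with rows for the physician's race and columns for the patient's race; the helper names are illustrative.

```python
from fractions import Fraction

patients = ["Caucasian", "African-American", "Hispanic", "Asian-American"]
# Rows: physician's race. Columns follow the order in `patients`.
table = {
    "White": [779, 436, 406, 175],
    "African-American": [14, 162, 15, 5],
    "Hispanic": [19, 17, 128, 2],
    "Asian/Pacific-Islander": [68, 75, 71, 203],
    "Other": [30, 55, 56, 4],
}
n = sum(sum(row) for row in table.values())  # 2720 subjects

def physician_total(race):
    return sum(table[race])

def patient_total(race):
    j = patients.index(race)
    return sum(row[j] for row in table.values())

def cell(physician, patient):
    return table[physician][patients.index(patient)]

ans_a = Fraction(physician_total("Asian/Pacific-Islander"), n)            # marginal
ans_b = Fraction(cell("African-American", "African-American"),
                 patient_total("African-American"))                       # conditional
ans_c = Fraction(cell("Asian/Pacific-Islander", "Asian-American"), n)     # joint
ans_d = Fraction(patient_total("Hispanic") + physician_total("Hispanic")
                 - cell("Hispanic", "Hispanic"), n)                       # addition rule
ans_e = 1 - Fraction(physician_total("White"), n)                         # complement

for label, value in zip("abcde", (ans_a, ans_b, ans_c, ans_d, ans_e)):
    print(label, round(float(value), 4))
```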
Q4:
If the probability of left-handedness in a certain
   group of people is 0.5, what is the probability
   of right-handedness (assuming no
   ambidexterity)?
Q5:
The probability is 0.6 that a patient selected at
  random from the current residents of a certain
  hospital will be a male. The probability that
  the patient will be a male who is in for surgery
  is 0.2. A patient randomly selected from
  current residents is found to be a male; what
  is the probability that the patient is in the
  hospital for surgery?
 Ans: 0.3333
Q6:
In a certain population of hospital patients the
  probability is 0.35 that a randomly selected
  patient will have heart disease. The
  probability is 0.86 that a patient with heart
  disease is a smoker. What is the probability
  that a patient randomly selected from the
  population will be a smoker and have heart
  disease?
  Ans: 0.301
Bayes' Theorem
Suppose a patient takes a blood test in the laboratory.
Sometimes the result is positive (indicating that the patient has the disease)
and sometimes the result is negative
(indicating that the patient does not have the disease).
So, we have the following cases:
                    The patient has the     The patient does not
                    disease (D)             have the disease (D')
Lab result is       Wrong result            Specificity of a symptom
negative (T')       (false negative)        P(T'|D')
Lab result is       Sensitivity of a        Wrong result
positive (T)        symptom P(T|D)          (false positive)
Definition 1:
The sensitivity of the symptom
This is the probability of a positive result given that the
subject has the disease. It is denoted by P(T|D).
Definition 2:
The specificity of the symptom
This is the probability of a negative result given that the
subject does not have the disease. It is denoted by
P(T'|D'), where T' is a negative result and D' is the absence of the disease.
Definition 3:
The predictive value positive of the symptom
This is the probability that the subject has the
  disease given that the subject has a positive
  screening test result.
It is calculated using Bayes' theorem through the
  following formula:
P(D|T) = P(T|D) P(D) / [P(T|D) P(D) + P(T|D') P(D')]
where P(D) is the rate of the disease, and
P(D') = 1 - P(D)
P(T|D') = 1 - P(T'|D')
Note that the numerator is equal to the sensitivity
  times the rate of the disease, while the
  denominator is equal to the sensitivity times the rate
  of the disease plus one minus the specificity
  times one minus the rate of the disease.
Definition 4:
The predictive value negative of the symptom
This is the probability that a subject does not have the
disease given that the subject has a negative
screening test result. It is calculated using Bayes'
theorem through the following formula:
P(D'|T') = P(T'|D') P(D') / [P(T'|D') P(D') + P(T'|D) P(D)]
where
P(T'|D) = 1 - P(T|D)
Example
A medical research team wished to evaluate a proposed screening test for
Alzheimer’s disease. The test was given to a random sample of 450 patients
with Alzheimer’s disease and an independent random sample of 500 patients
without symptoms of the disease. The two samples were drawn from
populations of subjects who were 65 years or older. The results are as follows.
                          Alzheimer's disease?
       Test Result        Yes (D)      No (D')      Total
       Positive (T)         436            5         441
       Negative (T')         14          495         509
       Total                450          500         950
In the context of this example:
a) What is a false positive?
A false positive is when the test indicates a positive result (T) when
the person does not have the disease (D').
b) What is a false negative?
A false negative is when the test indicates a negative result (T')
when the person has the disease (D).
c) Compute the sensitivity of the symptom.
P(T|D) = 436/450 = 0.9689
d) Compute the specificity of the symptom.
P(T'|D') = 495/500 = 0.99
e) Suppose it is known that the rate of the disease in the general population
is 11.3%. What are the predictive value positive and the
predictive value negative of the symptom?
The predictive value positive of the symptom is calculated as
P(D|T) = P(T|D) P(D) / [P(T|D) P(D) + P(T|D') P(D')]
       = (0.9689)(0.113) / [(0.9689)(0.113) + (0.01)(1 - 0.113)]
       = 0.925
The predictive value negative of the symptom is calculated as
P(D'|T') = P(T'|D') P(D') / [P(T'|D') P(D') + P(T'|D) P(D)]
         = (0.99)(0.887) / [(0.99)(0.887) + (0.0311)(0.113)]
         = 0.996
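The two Bayes' theorem calculations can be sketched in code as follows. The function and parameter names are illustrative, and "prevalence" stands for the rate of the disease P(D); sensitivity and specificity are taken from the counts in the example table.

```python
def predictive_value_positive(sensitivity, specificity, prevalence):
    """P(D|T): probability of disease given a positive test result."""
    true_pos = sensitivity * prevalence                # P(T|D) P(D)
    false_pos = (1 - specificity) * (1 - prevalence)   # P(T|D') P(D')
    return true_pos / (true_pos + false_pos)

def predictive_value_negative(sensitivity, specificity, prevalence):
    """P(D'|T'): probability of no disease given a negative test result."""
    true_neg = specificity * (1 - prevalence)          # P(T'|D') P(D')
    false_neg = (1 - sensitivity) * prevalence         # P(T'|D) P(D)
    return true_neg / (true_neg + false_neg)

sensitivity = 436 / 450   # P(T|D) from the example table
specificity = 495 / 500   # P(T'|D') from the example table
prevalence = 0.113        # rate of the disease given in part (e)

ppv = predictive_value_positive(sensitivity, specificity, prevalence)
npv = predictive_value_negative(sensitivity, specificity, prevalence)
print(round(ppv, 3), round(npv, 3))
```

Calling the same two functions with other values of prevalence shows how strongly the predictive values depend on the rate of the disease, even when sensitivity and specificity stay fixed.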
Q1: A medical research team wishes to assess
  the usefulness of a certain symptom (call it S)
  in the diagnosis of a particular disease. In a
  random sample of 775 patients with the
  disease, 744 reported having the symptom. In
  an independent random sample of 1380
  subjects without the disease, 21 reported that
  they had the symptom.
(a) In the context of this exercise, what is a false
  positive?
(b) What is a false negative?
(c) Compute the sensitivity of the symptom.
(d) Compute the specificity of the symptom.
(e) Suppose it is known that the rate of the disease
   in the general population is 0.001. What is the
   predictive value positive of the symptom?
(f) What is the predictive value negative of the
   symptom?
(g) Find the predictive value positive and the
   predictive value negative for the symptom for the
   following hypothetical disease rates: 0.0001, 0.01,
   and 0.1.
(h) What do you conclude about the predictive
  value of the symptom on the basis of the results
  obtained in part g?
Q2:
Dorsay and Helms (A-6) performed a retrospective
 study of 71 knees scanned by MRI. One of the
 indicators they examined was the absence of the
 “bow-tie sign” in the MRI as evidence of a
 bucket-handle or “bucket-handle type” tear of
 the meniscus.
In the study, surgery confirmed that 43 of the 71
  cases were bucket-handle tears. The cases
  may be cross-classified by “bow-tie sign”
  status and surgical results as follows:
                               Tear Surgically     Tear Surgically Confirmed    Total
                               Confirmed (D)       As Not Present (D')
Positive Test                       38                     10                    48
(absent bow-tie sign) (T)
Negative Test                        5                     18                    23
(bow-tie present) (T')
Total                               43                     28                    71
(a) What is the sensitivity of testing to see if the
  absent bow-tie sign indicates a meniscal tear?
 Ans: 0.8837
(b) What is the specificity of testing to see if the
  absent bow-tie sign indicates a meniscal tear?
 Ans: 0.6429
(c) What additional information would you need to
  determine the predictive value of the test?
(d) Suppose it is known that the rate of the
  disease in the general population is 0.1. What
  is the predictive value positive of the
  symptom? Ans: 0.2156
(e) What is the predictive value negative of the
  symptom? Ans: 0.9803