THE UNIVERSITY OF NEW SOUTH WALES
DEPARTMENT OF STATISTICS
MATH5905 STATISTICAL INFERENCE
Part one: Decision theory. Bayes and minimax rules (SOLUTIONS)
Question 1: Please draw carefully the graph of the risk set before doing anything else.
a) d3, since the minimum of the four values {6, 5, 3, 5} is 3.
b) The rule d3 again. Its minimax risk is 3.
c) The rule d3 again. Its Bayes risk is equal to $\frac{1}{3}\times 2 + \frac{2}{3}\times 3 = \frac{8}{3} = 2\tfrac{2}{3}$.
d) The randomized rule that chooses d2 and d4 with probability 1/2 each.
e) All priors of the form $(p, 1-p)$ with $1 > p > 3/5$. Explanation: the slope $-\frac{p}{1-p}$ should be smaller than the slope $-\frac{3}{2}$ of the segment $d_1 d_3$.
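A quick numerical check of the arithmetic in parts (c) and (e) (a sketch only: the risk values 2 and 3 for d3 are the ones used in the Bayes-risk computation in (c), and the grid of priors is arbitrary):

```python
import numpy as np

# Part (c): Bayes risk of d3 under the prior (1/3, 2/3), using its two risk values 2 and 3.
print((1/3) * 2 + (2/3) * 3)                               # 2.666... = 8/3

# Part (e): the slope -p/(1-p) is smaller than -3/2 exactly when p > 3/5.
p = np.array([0.10, 0.30, 0.50, 0.59, 0.61, 0.80, 0.95])   # grid avoiding the boundary p = 3/5
assert np.array_equal(-p / (1 - p) < -3/2, p > 3/5)
```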
Question 2: Since X is uniformly distributed on $[0, \theta)$, the density is $f(x,\theta) = \frac{1}{\theta} I_{[0,\theta)}(x)$, with $E(X) = \frac{\theta}{2}$ and $E(X^2) = \frac{\theta^2}{3}$. The rule is unbiased when $\mu = 2$, since $E(2X) = 2\cdot\frac{\theta}{2} = \theta$ holds. Now for any fixed value of $\mu$ we have
\[ E(\theta - \mu X)^2 = \theta^2 - 2\mu\theta E(X) + \mu^2 E(X^2) = \theta^2\Big(1 - \mu + \frac{\mu^2}{3}\Big). \]
When $\mu = \frac{3}{2}$ the latter mean squared error is equal to $\frac{\theta^2}{4}$. Now, since
\[ E(\theta - \mu X)^2 - E\Big(\theta - \tfrac{3}{2}X\Big)^2 = \frac{\mu^2\theta^2}{3} - \mu\theta^2 + \frac{3\theta^2}{4} = \frac{\theta^2}{12}(2\mu - 3)^2 \ge 0, \]
the rule $\frac{3}{2}X$ will be uniformly better than any other rule of the form $\mu X$. That is, any rule of the form $\mu X$ would be inadmissible unless $\mu = 3/2$.
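A short numerical sketch of this comparison (the value of $\theta$ and the grid of $\mu$ are arbitrary choices made only for the check):

```python
import numpy as np

theta = 2.0
mus = np.linspace(0.0, 4.0, 9)

# MSE formula from above and its excess over the mu = 3/2 rule.
mse = theta**2 * (1 - mus + mus**2 / 3)
gap = mse - theta**2 / 4
assert np.allclose(gap, theta**2 * (2 * mus - 3) ** 2 / 12)   # the algebraic identity above
assert np.all(gap >= -1e-12)                                  # (3/2)X is never worse

# Monte Carlo confirmation of the MSE at mu = 3/2 (should be close to theta^2/4 = 1).
rng = np.random.default_rng(0)
X = rng.uniform(0, theta, size=200_000)
print(np.mean((theta - 1.5 * X) ** 2))
```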
Question 3: i) The likelihood times the prior gives
\[ f(x|\theta)\tau(\theta) = k\,\theta^{n} e^{-\theta\left(\sum_{i=1}^{n} x_i + k\right)} \]
and the marginal density of X is
\[ g(x) = \int_0^\infty f(x|\theta)\tau(\theta)\,d\theta = k\int_0^\infty \theta^{n} e^{-\theta\left(\sum_{i=1}^{n} x_i + k\right)}\,d\theta. \]
Now we change the variables: set
\[ \theta\Big(\sum_{i=1}^{n} x_i + k\Big) = y, \qquad d\theta = \frac{dy}{\sum_{i=1}^{n} x_i + k}, \]
and get:
\[ g(x) = \frac{k}{\left(\sum_{i=1}^{n} x_i + k\right)^{n+1}}\int_0^\infty y^{n} e^{-y}\,dy = \frac{k\,\Gamma(n+1)}{\left(\sum_{i=1}^{n} x_i + k\right)^{n+1}}. \]
Hence
\[ h(\theta|x) = \frac{\theta^{n} e^{-\theta\left(\sum_{i=1}^{n} x_i + k\right)}}{\Gamma(n+1)\left(\frac{1}{\sum_{i=1}^{n} x_i + k}\right)^{n+1}}, \qquad \theta > 0. \]
Then by recalling the general definition of a Gamma(α, β) density:
\[ f(x;\alpha,\beta) = \frac{e^{-x/\beta}\, x^{\alpha-1}}{\Gamma(\alpha)\beta^{\alpha}}, \qquad x > 0, \]
we see that
\[ h(\theta|x) \sim \mathrm{Gamma}\Big(n+1,\ \frac{1}{\sum_{i=1}^{n} x_i + k}\Big). \]
Note: we did not really have to determine the normalizing constant as we did above. There
is an easier approach based on looking at the joint density
\[ f(x|\theta)\tau(\theta) = k\,\theta^{n} e^{-\theta\left(\sum_{i=1}^{n} x_i + k\right)}, \]
we can identify that (up to a normalizing constant) this is a
\[ \mathrm{Gamma}\Big(n+1,\ \frac{1}{\sum_{i=1}^{n} x_i + k}\Big) \]
density and hence the posterior $h(\theta|x)$ has to be $\mathrm{Gamma}\big(n+1, \frac{1}{\sum_{i=1}^{n} x_i + k}\big)$.
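A numerical check of this conjugacy claim, using a small hypothetical data set and a hypothetical prior parameter k (both invented only for illustration):

```python
import numpy as np
from scipy import integrate, stats

x = np.array([0.7, 1.3, 0.4, 2.1, 0.9])    # hypothetical observations
k = 2.0                                     # hypothetical prior parameter
n, S = len(x), x.sum()

# Unnormalised posterior: likelihood times prior, up to factors free of theta.
unnorm = lambda t: t**n * np.exp(-t * (S + k))
const, _ = integrate.quad(unnorm, 0, np.inf)

# Claimed posterior: Gamma(n + 1, 1/(sum(x_i) + k)).
post = stats.gamma(a=n + 1, scale=1.0 / (S + k))
grid = np.linspace(0.05, 3.0, 60)
assert np.allclose(unnorm(grid) / const, post.pdf(grid))
```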
ii) For a Bayes estimator with respect to quadratic loss, we have θ̂ = E(θ|X), and for a
Gamma(α, β) density it is known that the expected value is equal to αβ hence we get
immediately
\[ \hat{\theta} = \frac{n+1}{\sum_{i=1}^{n} x_i + k}. \]
We could also calculate this directly:
\[ \hat{\theta} = \int_0^\infty \theta\, h(\theta|x)\,d\theta = \frac{\left(\sum_{i=1}^{n} x_i + k\right)^{n+1}}{\Gamma(n+1)}\int_0^\infty \theta^{n+1} e^{-\theta\left(\sum_{i=1}^{n} x_i + k\right)}\,d\theta \]
and after changing variables:
\[ \theta\Big(\sum_{i=1}^{n} x_i + k\Big) = y, \qquad d\theta = \frac{dy}{\sum_{i=1}^{n} x_i + k}, \]
we can continue the evaluation:
\begin{align*}
\hat{\theta} &= \frac{\int_0^\infty e^{-y} y^{n+1}\,dy}{\Gamma(n+1)\left(\sum_{i=1}^{n} x_i + k\right)} \\
&= \frac{\Gamma(n+2)}{\Gamma(n+1)\left(\sum_{i=1}^{n} x_i + k\right)} \\
&= \frac{n+1}{\sum_{i=1}^{n} x_i + k}.
\end{align*}
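Continuing the same hypothetical example from part i), the Gamma posterior mean from scipy agrees with the closed form:

```python
import numpy as np
from scipy import stats

x = np.array([0.7, 1.3, 0.4, 2.1, 0.9])    # same hypothetical data as above
k = 2.0
n, S = len(x), x.sum()

post = stats.gamma(a=n + 1, scale=1.0 / (S + k))
print(post.mean(), (n + 1) / (S + k))       # both approximately 0.8108
```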
Question 4: We have only a single observation X, with density
\[ f(x|\theta) = \frac{1}{\theta} I_{(x,\infty)}(\theta), \]
which implies that
\[ g(x) = \int_0^\infty f(x|\theta)\tau(\theta)\,d\theta = \int_x^\infty \frac{1}{\theta}\,\theta e^{-\theta}\,d\theta = e^{-x}, \qquad x > 0. \]
Hence
\[ h(\theta|x) = \frac{f(x|\theta)\tau(\theta)}{g(x)} = \begin{cases} e^{x-\theta} & \text{if } \theta > x \\ 0 & \text{if } 0 < \theta < x. \end{cases} \]
i) With respect to quadratic loss: The Bayesian estimator δτ (x) is given by:
\[ \delta_\tau(x) = \int_x^\infty \theta\, h(\theta|x)\,d\theta = \int_x^\infty \theta e^{x-\theta}\,d\theta = e^{x}\int_x^\infty \theta e^{-\theta}\,d\theta = e^{x}\left(x e^{-x} + e^{-x}\right) = x + 1. \]
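A one-line numerical check of this posterior mean, for an arbitrary observed value x:

```python
import numpy as np
from scipy import integrate

x = 1.5                                                   # arbitrary observed value
post_mean, _ = integrate.quad(lambda t: t * np.exp(x - t), x, np.inf)
print(post_mean, x + 1)                                   # both 2.5
```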
ii) With respect to absolute value loss: The Bayesian estimator m solves the equation:
\[ \int_m^\infty e^{x-\theta}\,d\theta = \frac{1}{2} \]
and we get: $e^{x-m} = \frac{1}{2} \implies m - x = \ln 2 \implies m = x + \ln 2$.
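Similarly, a quick check that $x + \ln 2$ is indeed the posterior median (same arbitrary x as above):

```python
import numpy as np
from scipy import integrate

x = 1.5                                                   # arbitrary observed value
m = x + np.log(2)                                         # claimed posterior median
tail, _ = integrate.quad(lambda t: np.exp(x - t), m, np.inf)
print(tail)                                               # 0.5
```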
iii) To find the Bayes rule for the loss function $L_\eta(\theta, a) = (\theta - a)(\eta - I(\theta - a < 0))$ we need to find the action $a$ that attains
\[ \inf_{a \in \mathcal{A}} Q(X, a), \]
which is the same as minimizing the Bayesian risk. Now:
\begin{align*}
Q(X, a) &= \int_\Theta L(\theta, a)\, h(\theta|X)\,d\theta \\
&= \int_x^\infty (\theta - a)\big(\eta - I(\theta < a)\big)\, h(\theta|X)\,d\theta \\
&= \int_x^\infty \eta\theta\, h(\theta|X)\,d\theta - \int_x^\infty \theta I(\theta < a)\, h(\theta|X)\,d\theta \\
&\qquad - \int_x^\infty a\eta\, h(\theta|X)\,d\theta + \int_x^\infty a I(\theta < a)\, h(\theta|X)\,d\theta \\
&= \eta\int_x^\infty \theta\, h(\theta|X)\,d\theta - \int_x^a \theta\, h(\theta|X)\,d\theta - a\eta\int_x^\infty h(\theta|X)\,d\theta + a\int_0^a h(\theta|X)\,d\theta.
\end{align*}
Therefore
\[ \frac{\partial Q(X, a)}{\partial a} = -a\,h(a|X) - \eta + \int_x^a h(\theta|X)\,d\theta + a\,h(a|X) = 0, \]
which leads to the solution:
\[ \int_x^a h(\theta|X)\,d\theta = \eta. \]
More specifically:
\[ \int_x^a e^{x-\theta}\,d\theta = \Big[-e^{x-\theta}\Big]_x^a = -e^{x-a} - (-e^{0}) = 1 - e^{x-a} = \eta, \]
and the solution is:
\[ a = x - \ln(1-\eta). \]
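A numerical sketch for this part (x and $\eta$ are arbitrary choices): the closed form $x - \ln(1-\eta)$ is the $\eta$-quantile of the posterior and it minimises the posterior expected $L_\eta$ loss.

```python
import numpy as np
from scipy import integrate, optimize

x, eta = 1.0, 0.75                                  # arbitrary observed value and level
a_closed = x - np.log(1 - eta)                      # claimed Bayes rule, here 1 + ln 4 ≈ 2.386

# It carries posterior mass eta to its left ...
mass, _ = integrate.quad(lambda t: np.exp(x - t), x, a_closed)
print(mass, eta)                                    # both 0.75

# ... and it minimises Q(x, a) = E[(theta - a)(eta - I(theta < a)) | x].
def Q(a):
    loss = lambda t: (t - a) * (eta - (t < a)) * np.exp(x - t)
    v1, _ = integrate.quad(loss, x, a)
    v2, _ = integrate.quad(loss, a, np.inf)
    return v1 + v2

a_num = optimize.minimize_scalar(Q, bounds=(x, x + 20), method="bounded").x
print(a_num, a_closed)                              # numerically identical
```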
Question 5: Let $X = (X_1, \ldots, X_n)$ be the random variables. Setting $\mu_0 = x_0$ for convenience of notation, we can write:
\[ h(\mu|X=x) \propto e^{-\frac{1}{2}\sum_{i=0}^{n}(x_i - \mu)^2} \propto e^{-\frac{n+1}{2}\big[\mu^2 - 2\mu\frac{\sum_{i=0}^{n} x_i}{n+1}\big]}. \]
Then, by completing the square and dropping the expression that does not depend on $\mu$:
\[ h(\mu|X=x) \propto e^{-\frac{n+1}{2}\big[\mu - \frac{\sum_{i=0}^{n} x_i}{n+1}\big]^2}, \]
which implies that $h(\mu|X=x)$, being a density, must be the density of
\[ N\Big(\frac{\sum_{i=0}^{n} x_i}{n+1},\ \frac{1}{n+1}\Big). \]
Hence, the Bayes estimator (being the posterior mean) would be
\[ \Big(\sum_{i=0}^{n} x_i\Big)\Big/(n+1) = \Big(\mu_0 + \sum_{i=1}^{n} x_i\Big)\Big/(n+1) = \frac{1}{n+1}\mu_0 + \frac{n}{n+1}\bar{X}, \]
that is, the Bayes estimator is a convex combination of the mean of the prior and of X̄. In
this combination, the weight of the prior information diminishes quickly when the sample size
increases. The same estimator is obtained with respect to absolute value loss.
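A numerical check of the posterior and of the convex-combination form of the estimator, with an arbitrary prior mean $\mu_0$ and a small hypothetical sample:

```python
import numpy as np
from scipy import integrate, stats

mu0 = 0.5                                    # prior mean, playing the role of x_0
x = np.array([1.2, 0.8, 1.5, 0.9])           # hypothetical sample x_1, ..., x_n
xs = np.concatenate(([mu0], x))              # x_0, x_1, ..., x_n
n = len(x)

# Unnormalised posterior exp(-1/2 * sum_{i=0}^n (x_i - mu)^2) and its normalising constant.
unnorm = lambda mu: np.exp(-0.5 * np.sum((xs - mu) ** 2))
const, _ = integrate.quad(unnorm, -np.inf, np.inf)

# Claimed posterior N(sum_{i=0}^n x_i / (n+1), 1/(n+1)); its mean is the Bayes estimator.
post = stats.norm(loc=xs.sum() / (n + 1), scale=np.sqrt(1.0 / (n + 1)))
grid = np.linspace(-1.0, 3.0, 41)
assert np.allclose(np.array([unnorm(m) for m in grid]) / const, post.pdf(grid))
print(post.mean(), mu0 / (n + 1) + (n / (n + 1)) * x.mean())   # both 0.98
```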
Question 6: i) Since X ∼ Bin(5, θ) we have:
\[ P(X=0|\theta) = (1-\theta)^5, \]
which means that the posterior of θ given the sample is
\[ h(\theta|X=0) \propto (1-\theta)^5\,\theta(1-\theta)^4 = \theta(1-\theta)^9. \]
Hence
\[ h(\theta|X=0) = 110\,\theta(1-\theta)^9, \]
where $\frac{\Gamma(12)}{\Gamma(10)\Gamma(2)} = \frac{11!}{9!\,1!} = 110$. Then we get for the posterior probability given the sample:
\[ P(\theta \in \Theta_0 | X=0) = \int_0^{0.2} 110\,\theta(1-\theta)^9\,d\theta = .6779 \]
and we accept H0 since the above posterior probability is greater than 0.5.
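Since the posterior is the Beta(2, 10) density $110\theta(1-\theta)^9$, the same probability can also be read off a Beta CDF as a quick check:

```python
from scipy import stats

# P(theta <= 0.2 | X = 0) for the Beta(2, 10) posterior.
print(stats.beta.cdf(0.2, a=2, b=10))   # 0.6779 > 0.5, so accept H0
```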
ii) Now
\[ P(X=1|\theta) = 5(1-\theta)^4\theta, \]
which implies that the posterior of θ given the sample is
\[ h(\theta|X=1) \propto (1-\theta)^4\theta\,(1-\theta)^4\theta = (1-\theta)^8\theta^2. \]
Hence
\[ h(\theta|X=1) = \frac{\Gamma(12)}{\Gamma(9)\Gamma(3)}(1-\theta)^8\theta^2 = 495\,\theta^2(1-\theta)^8. \]
Then we get for the posterior probability given the sample:
\[ P(\theta \in \Theta_0 | X=1) = \int_0^{0.2} 495\,\theta^2(1-\theta)^8\,d\theta = .3826 < \frac{1}{2}, \]
and we reject H0 since the above posterior probability is smaller than 0.5.
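Likewise, the posterior here is the Beta(3, 9) density $495\theta^2(1-\theta)^8$, and the Beta CDF reproduces the value:

```python
from scipy import stats

# P(theta <= 0.2 | X = 1) for the Beta(3, 9) posterior.
print(stats.beta.cdf(0.2, a=3, b=9))    # 0.3826 < 0.5, so reject H0
```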