DD2434/FDD3434 Machine Learning, Advanced Course
Module 1 Exercise
November 2024
1 Bayesian statistics – Theory
When conducting probabilistic modeling, we usually specify a model for how the data was
generated. We denote the parameters of this model as Θ. In Bayesian statistics, we assume
a prior distribution p(Θ) and infer the posterior p(Θ∣X) through Bayes’ theorem:
p(Θ∣X) = p(X∣Θ)p(Θ) / p(X)    (1)
where p(X∣Θ) is referred to as the likelihood function, p(Θ) the prior and p(X) the evidence
or marginal likelihood.
In some cases, we don’t need to compute p(X) in order to derive the posterior; instead, we
can apply the method of “identifying the distribution” that Jens describes in Video Lecture
1.3. Use this method in the exercises below.
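As a brief reminder of how the method works (a sketch, not part of the exercises): since p(X) does not depend on Θ, Bayes’ theorem can be applied up to proportionality,

```latex
p(\Theta \mid X) \;\propto\; p(X \mid \Theta)\, p(\Theta).
```

If the right-hand side, viewed as a function of Θ, matches the kernel of a known distribution family, the posterior is that distribution and the normalizing constant follows for free.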
Conjugate priors – Exercises
1.1 Beta-Binomial
Let X = (X1, ..., XN) be i.i.d. where Xn ∣ θ, m ∼ Binomial(m, θ) and θ ∼ Beta(α, β). Show
that the posterior p(θ∣X, m) follows a Beta distribution, i.e., that the Beta is a conjugate
prior to the Binomial with known m. What are the parameters of the posterior? Compare with
the Wikipedia Conjugate prior table.
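Once you have derived the posterior parameters, a numerical sanity check can be useful. The sketch below (assuming NumPy/SciPy; the hyperparameter values and simulated data are illustrative choices, not given in the exercise) evaluates prior × likelihood at a few θ values and compares it against the Beta posterior that the conjugacy result predicts: if the result is correct, the ratio of the two is a constant, namely the evidence p(X). Note that the candidate parameters in the code state the answer, so derive them yourself first.

```python
import numpy as np
from scipy.stats import beta, binom

# Illustrative hyperparameters and simulated data (my choice, not the exercise's).
rng = np.random.default_rng(0)
a0, b0, m, N = 2.0, 3.0, 10, 50
x = rng.binomial(m, 0.4, size=N)  # simulate X_1, ..., X_N

# Unnormalized log posterior at a few theta values: log prior + log likelihood.
thetas = np.array([0.2, 0.35, 0.5, 0.65])
log_post = (beta.logpdf(thetas, a0, b0)
            + binom.logpmf(x[:, None], m, thetas).sum(axis=0))

# Candidate posterior from conjugacy -- compare with your own derivation:
#   theta | X, m  ~  Beta(a0 + sum x_n, b0 + N*m - sum x_n)
log_cand = beta.logpdf(thetas, a0 + x.sum(), b0 + N * m - x.sum())

# If the conjugacy result holds, post/cand is constant (the evidence p(X)).
diff = log_post - log_cand
assert np.allclose(diff, diff[0])
```

The constant-ratio check avoids numerical integration entirely, which makes it robust to grid and tolerance choices.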
1.2 Poisson-Gamma
Let D = (d1, ..., dN) be i.i.d. with dn ∣ Λ ∼ Poisson(Λ) and Λ ∼ Gamma(α, β). Show that
the posterior p(Λ∣D) follows a Gamma distribution, i.e., that the Gamma is a conjugate prior
to the Poisson distribution. What are the parameters of the posterior? Compare with the
Wikipedia Conjugate prior table.
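The same constant-ratio sanity check works here. The sketch below (assuming NumPy/SciPy; the numbers are illustrative, and β is treated as a rate, matching the Wikipedia table, so SciPy’s scale is 1/β) again states the posterior parameters, so derive them first.

```python
import numpy as np
from scipy.stats import gamma, poisson

# Illustrative hyperparameters and simulated data (my choice, not the exercise's).
rng = np.random.default_rng(1)
a0, b0, N = 2.0, 1.0, 40  # b0 is a rate parameter
d = rng.poisson(3.0, size=N)  # simulate d_1, ..., d_N

# Unnormalized log posterior at a few Lambda values: log prior + log likelihood.
lams = np.array([1.5, 2.5, 3.5, 4.5])
log_post = (gamma.logpdf(lams, a0, scale=1.0 / b0)
            + poisson.logpmf(d[:, None], lams).sum(axis=0))

# Candidate posterior from conjugacy -- compare with your own derivation:
#   Lambda | D  ~  Gamma(a0 + sum d_n, rate b0 + N)
log_cand = gamma.logpdf(lams, a0 + d.sum(), scale=1.0 / (b0 + N))

# If the conjugacy result holds, post/cand is constant (the evidence p(D)).
diff = log_post - log_cand
assert np.allclose(diff, diff[0])
```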
1.3 Normal-NormalGamma (Hard)
Let X = (X1, ..., XN) be i.i.d. with Xn ∣ µ, τ ∼ Normal(µ, 1/τ), where τ is the precision, and
(µ, τ) ∼ NormalGamma(µ0, λ, α, β). Show that the posterior p(µ, τ∣X) follows a NormalGamma
distribution, i.e., that the NormalGamma is a conjugate prior to the Normal distribution
with unknown mean and precision.
What are the parameters of the posterior? Compare with the Wikipedia Conjugate prior
table.
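The constant-ratio check extends to this joint posterior as well. The sketch below (assuming NumPy/SciPy; the hyperparameters, data, and (µ, τ) test points are illustrative choices) writes the NormalGamma density as a Normal in µ with precision λτ times a Gamma in τ with rate β, and checks that prior × likelihood divided by the candidate posterior is constant. The candidate parameters spell out the answer, so attempt the derivation before reading them.

```python
import numpy as np
from scipy.stats import norm, gamma

# Illustrative hyperparameters and simulated data (my choice, not the exercise's).
rng = np.random.default_rng(2)
mu0, lam, a0, b0 = 0.5, 2.0, 3.0, 2.0
N = 30
x = rng.normal(1.0, 0.8, size=N)  # simulate X_1, ..., X_N

def log_normalgamma(mu, tau, m, l, a, b):
    """log NormalGamma(mu, tau | m, l, a, b): Normal in mu times Gamma in tau (rate b)."""
    return (norm.logpdf(mu, m, 1.0 / np.sqrt(l * tau))
            + gamma.logpdf(tau, a, scale=1.0 / b))

# Unnormalized log posterior at a few (mu, tau) test points.
mus = np.array([0.3, 0.8, 1.1, 1.5])
taus = np.array([0.5, 1.0, 1.6, 2.2])
log_lik = np.array([norm.logpdf(x, m, 1.0 / np.sqrt(t)).sum()
                    for m, t in zip(mus, taus)])
log_post = log_normalgamma(mus, taus, mu0, lam, a0, b0) + log_lik

# Candidate posterior parameters -- compare with your own derivation:
xbar = x.mean()
lam_n = lam + N
mu_n = (lam * mu0 + N * xbar) / lam_n
a_n = a0 + N / 2.0
b_n = b0 + 0.5 * ((x - xbar) ** 2).sum() + lam * N * (xbar - mu0) ** 2 / (2 * lam_n)

# If the conjugacy result holds, post/cand is constant (the evidence p(X)).
diff = log_post - log_normalgamma(mus, taus, mu_n, lam_n, a_n, b_n)
assert np.allclose(diff, diff[0])
```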