Bayesian Multi-object Tracking:
Probability Hypothesis Density Filter & Beyond
Ba-Ngu Vo http://ba-ngu.vo-au.com
Dept. Electrical & Computer Engineering
Curtin University
Perth, Australia
ISPL
Intelligent Sensing & Perception Lab
https://ispl-research.com/ IEEE Signal Processing Society, 2022
Introduction
N. Wiener (1894-1964) A. Kolmogorov (1903-1987) R. Kalman (1930-2016)
1940’s: Wiener filter - Pioneering work by Wiener, Kolmogorov
1950’s-1960’s: Kalman filter - Bode & Shannon, Zadeh & Ragazzini, Levinson,
Swerling, Stratonovich, Kalman ... Schmidt’s Implementation – Apollo program
1990’s--: Particle Filter for non-linear filtering - Gordon, Salmond & Smith
1990’s--: Multi-Object Filter - Pioneering work by Mahler
Probability Hypothesis Density (PHD), Cardinalized PHD filters …
Labeled Random Finite Sets: Generalized Labeled Multi-Bernoulli filters
Outline
State Estimation
Multi-object State Space Model
PHD filter
Beyond PHD filtering
Labelled Random Finite Set
Generalized Labelled Multi-Bernoulli Filter
Some Demonstrations
State Estimation
State Space Model of a Dynamical System
observation space ℤ
zk
zk-1
observation vector
state space 𝕏
State Estimation
state transition
xk System ID
xk-1 state vector Control
State equation xk = F(xk-1, uk-1, nk) fk|k-1(xk| xk-1, uk-1)
Markov transition density
control noise
Observation likelihood
Observation equation zk = Y(xk, uk, vk) gk(zk| xk, uk-1)
State Estimation
𝑥0 𝑥1 𝑥2 𝑥3 𝑥4 𝑥𝑘
State space
state trajectory = history of states
x0:k = [x0 ,…, xk]
0 1 2 3 4 … k time
State Estimation: estimate state trajectory
Online Operation: need fixed complexity per time step to be useful in practice
Filtering: x0,…, xk, suitable for online – Kalman & particle filters
Smoothing: x0:k , not suitable for online – Kalman & particle smoothers
S. Sarkka, Bayesian filtering and smoothing, Cambridge University Press, 2013
State Estimation
gk(zk| xk) fk|k-1(xk| xk-1) p0:k-1(x0:k-1 |z1:k-1)
smoothing-while-filtering gk(zk| xk) fk|k-1(xk| xk-1)p0:k-1(x0:k-1|z1:k-1)dx0:k
Bayes smoother
p0:k-1(x0:k-1 |z1:k-1) p0:k(x0:k| z1:k)
Complexity per time step increases with time - not suitable for online
Smooth over a fix-length moving window - fixed complexity per time step
pk-1(xk-1| z1:k-1) fk|k-1(xk| xk-1) dxk-1 gk(zk| xk) pk|k-1(xk| z1:k-1)
gk(zk| xk)pk-1(xk-1| z1:k-1)dxk
Bayes filter
prediction data-update
pk-1(xk-1 |z1:k-1) pk|k-1(xk| z1:k-1) pk(xk| z1:k)
Fixed complexity per time step - suitable for online, widely used
Multi-Object System
What if the state & observation are not vectors but finite sets?
Particles Molecules Cells
People Orbital Debris Astronomy
Multi-Object System
Multi-object state: Finite set of vectors (aka point pattern)
observation space ℤ
observations of the
states
state space 𝕏
state transition
Xk
3 vectors
5 vectors
Xk-1
Fundamental difference from classical dynamical system:
▪ Random time-varying number of states and measurements
▪ False negatives, false positives, association uncertainty
▪ Much more challenging!
Multi-Object System
Multi-object system: Finite-set-valued dynamical system
observations multi-object observation
ℤ F(ℤ)
Z
states multi-object state
𝕏 F(𝕏)
XX
Multi-object Tracking: State Estimation on the space F(𝕏) of finite sets of 𝕏
Theoretical considerations
▪ Can’t treat a (random) finite set as if it were a (random) vector
▪ PDF of random finite set is not the same as PDF of random vector
▪ Mahler’s Finite Set Statistics (FISST) provides a suitable notion of density & integration
R. Mahler, Statistical Multisource-Multitarget Information Fusion, Artech House, 2007.
R. Mahler, Advances in Statistical Multisource-Multitarget Information Fusion, Artech House, 2014.
Multi-Object System
Simple Multi-Object State Transition Model
motion State transition
equation
x x’
death Xk = Sk|k-1(Xk-1)Bk|k-1(Xk-1)k
x
X’
spawn
fk|k-1(Xk |Xk-1 )
x
X’ State transition
creation kernel
Transition for each element x of a given multi-object state Xk-1
R. Mahler, Statistical Multisource-Multitarget Information Fusion, ArtechHouse, 2007.
R. Mahler, Advances in Statistical Multisource-Multitarget Information Fusion, ArtechHouse, 2014.
Multi-Object State Space Model
Simple Multi-Object Observation Model
Observation equation
likelihood
x z
Zk = Qk(Xk) Kk
misdetection
x
gk(Zk|Xk)
clutter
Observation likelihood
state space observation space
Observation for each element x of a given multi-object state Xk
R. Mahler, Statistical Multisource-Multitarget Information Fusion, ArtechHouse, 2007.
R. Mahler, Advances in Statistical Multisource-Multitarget Information Fusion, ArtechHouse, 2014.
Probability Hypothesis Density Filter
Multi-Object Estimation: Multi-object Bayes filter
prediction
pk-1(Xk-1|Z1:k-1) pk|k-1(Xk|Z1:k-1) data-update pk(Xk|Z1:k)
fast growing sum
f k |k −1 ( X k | X ) pk −1|k −1 ( X | Z1:k −1 ) X g k ( Z k | X k ) pk |k −1 ( X k | Z1:k −1 )
g k ( Z k | X ) pk |k −1 ( X | Z1:k −1 ) X
Computationally expensive (exponential complexity) … approximation needed!
state of system: single-object first-moment filter
Single-object random vector Bayes filter (e.g. a-b-g filter)
state of system: multi-object first-moment filter
Multi-object random set Bayes filter (“PHD” filter)
Probability Hypothesis Density Filter
PHD = instantaneous expected no. objects per unit (hyper) volume
PHD (intensity function) of an RFS
PHD peaks: highest expected No. points per unit volume,
most likely locations of objects
v(x0) = density/concentration of
expected No. objects at x0
PHD estimator: find “best” cardinality estimate n*,
locate n* highest peaks.
Sv(x)dx = expected No. objects in S
x0 state space 𝕏
S
Probability Hypothesis Density Filter
PHD filter: Propagates PHD forward in time
state space
vk Estimated no. objects = PHD mass
vk-1
Estimated states = location of PHD peaks
PHD PHD
vk-1(xk-1|Z1:k-1) prediction
vk|k-1(xk|Z1:k-1) update
vk(xk|Z1:k)
R. Mahler, “Multitarget Bayes filtering via first-order multitarget moments,” IEEE Trans. AES, 39(4):1152–1178, 2003
▪ Approximate by a Poisson RFS density that minimizes KL-div from multi-object filtering density
▪ SMC & Gaussian mixture implementations (linear in no. detections)
Vo et. al., "SMC methods for Bayesian Multi-target filtering with RFSs," IEEE Trans. AES, 41(4): 1224-1245, 2005
Sidenbladh, "Multi-target particle filtering for the probability hypothesis density," Int. Conf. Information Fusion, 2003.
Zajic & Mahler, “Particle-systems implementation of the PHD multitarget tracking filter,” SPIE 5096:291-299, 2003
Vo & Ma, "The Gaussian mixture PHD Filter," IEEE Trans. SP, 54(11):4091-4104, 2006
Probability Hypothesis Density Filter
Cardinalized PHD filter: Propagates PHD & cardinality distribution
k-1(n|Z1:k-1) cardinality
prediction
k|k-1(n|Z1:k-1) cardinality
update
k(n|Z1:k)
PHD PHD
vk-1(xk-1|Z1:k-1) prediction vk|k-1(xk|Z1:k-1) update
vk(xk|Z1:k)
R. Mahler, “PHD filters of higher order in target number,” IEEE Trans. AES,43(4):1523–1543, 2007
k-1 k Estimated no. objects = argmax k
Estimated states = location of PHD peaks
Better cardinality estimates
vk-1 vk
▪ Approximate by an IID cluster density that minimizes KL-div from multi-object filtering density
▪ Gaussian mixture implementation (cubic in no. detections)
Vo et. al. “Analytic implementations of the Cardinalized PHD Filter," IEEE Trans. SP, 55(7-II): 3553-3567, 2007
Beyond PHD Filtering
Recall: Multi-object Bayes filter
prediction
pk-1(Xk-1|Z1:k-1) pk|k-1(Xk|Z1:k-1) data-update pk(Xk|Z1:k)
fast growing sum
f k |k −1 ( X k | X ) pk −1|k −1 ( X | Z1:k −1 ) X g k ( Z k | X k ) pk |k −1 ( X k | Z1:k −1 )
g k ( Z k | X ) pk |k −1 ( X | Z1:k −1 ) X
Computationally expensive! (exponential complexity)
2 Key Considerations for Multi-object Tracking (MOT)
Implementations Trajectory Estimation
PHD/CPHD: parametric approximations PHD/CPHD: multi-object localization filters
Approximation by truncating filtering density? How to estimate trajectories?
Beyond PHD Filtering
Truncation of Multi-object Filtering Density
▪ No guarantee that the truncated sum is a function of sets
e.g. measurement ={z1}, prior = p({x1, x2}), detection probability: PD(x1) = 0.9, PD(x2) = 0.1.
𝑝({𝑥1 , 𝑥2 }|{𝑧1 }) ∝ 92𝑔(𝑧1 |𝑥1 )𝑝({𝑥1 , 𝑥2 }) + 𝑔(𝑧1 |𝑥2 )𝑝({𝑥1 , 𝑥2 }) + 9𝜅(𝑧1 )𝑝({𝑥1 , 𝑥2 })
filtering density x1 detected & x2 missed x1 missed & x2 detected x1 & x2 missed
Suppose we truncate the last 2 terms: 𝑓(𝑥1 , 𝑥2 |{𝑧1 }) ≜ 𝑔(𝑧1 |𝑥1 )𝑝({𝑥1 , 𝑥2 })
𝑓(𝑎, 𝑏|{𝑧1 }) = 𝑔(𝑧1 |𝑎)𝑝({𝑎, 𝑏}) ≠ 𝑓(𝑏, 𝑎|{𝑧1 }) = 𝑔(𝑧1 |𝑏)𝑝({𝑎, 𝑏})
f is not symmetric in its arguments, i.e. not a function of the set {a,b}
▪ Even if the truncated sum is a function of sets, what’s the approximation error?
Beyond PHD Filtering
𝑋0 𝑋1 𝑋2 𝑋3 𝑋4 𝑋𝑘
State space
multi-object state history
𝑋0:𝑘
?
0 1 2 3 4 … k Unlike single-object system
Trajectory ≠ State History
State space
multi-object trajectory
trajectories
0 1 2 3 4 … k
Labeled Random Finite Set
multi-object trajectory = history of labeled multi-object states
X1 X4 Xk
State space
X 0:k
labeled
trajectories
0 1 2 3 4 … k time
multi-object state X = {x1 xj xn} Label
distinct from other objects
labeled state vector (xj , lj) fixed for the life of the object
labeled multi-object state X = {x1 xj xn} multi-object state augmented with distinct labels
Goodman et. al. Mathematics of Data Fusion. probability density of point processes, 1997
Mahler, Statistical Multisource-Multitarget Information Fusion, Artech House, 2007
Labeled Random Finite Set
multi-object trajectory = history of labeled multi-object states
X1 X4 Xk
State space
X 0:k
labeled
trajectories
0 1 2 3 4 … k time
Provides multi-object trajectory estimate (even from a single scan)
▪ filtering with labeled multi-object states: X0,…, Xk
▪ smoothing with labeled multi-object states: X0:k
Provides lineage information for modelling/estimation with spawning, division/splitting
Bryant, et. al. "A GLMB Filter with Object Spawning," IEEE Trans. SP, 66(23):6177-6189, 2018
Nguyen, et. al. "Tracking Cells and their Lineages via Labeled RFSs," IEEE Trans. SP, 69:6177-6189, 2021
Labeled Random Finite Set
Closure under truncation?
Standard multi-object likelihood
𝐗
(𝜃(ℒ(⋅))
𝑔𝑘 (𝑍|𝐗) ∝ Ψ𝑍,𝑘 (⋅) single-object measurement likelihood
𝜃
PD ,k (x) g k ( z j | x)
set of labels of X , j0
Y{ z1 ,..., zm },k (x) =
( j)
k ( z j )
1 − P (x), j=0
association map 𝜃: ℒ(𝐗)) → {0,1, . . . , |𝑍|} D ,k
1-1 when positive: (i ) = (i' ) 0 i = i'
at most 1 measurement index per label detection probability clutter intensity
▪ Each term of the multi-object likelihood is symmetric
▪ Truncated labeled posterior/filtering density is a function of sets
Labeled RFS - 2 birds with one stone: provides trajectories & closure under truncation
Vo & Vo "Labeled RFSs and Multi-Object Conjugate Priors," IEEE Trans. SP, 61(13): 3460-3475, 2013
Labeled Random Finite Set
What about set of unlabeled trajectories?
𝑘
Trajectory space 𝕋 = ⨄𝑡=1 ℕ × 𝕏𝑡 , unlabeled multi-object trajectory = finite subset of 𝕋
e.g. a
3 trajectory
distinct elements from the trajectory space
1 1 1 k = 1, 3 objects, multi-object state = 1.0 Physically nonsensical:
,
1.0 1.0 , 1.0 counts different segments of 1 trajectory, as separate trajectories
2.0 2.0 k = 2, 2 objects, multi-object state = 2.0
inconsistent cardinalities with multi-object state history
3.0 k = 3, 1 object, multi-object state = 3.0
Theoretical Flaw: multi-object state contains repeated points ⇒ no multi-object density
D. J. Daley and D. Vere-Jones, An introduction to the theory of point processes, Springer, 1988.
Goodman et. al., Mathematics of Data Fusion, Kluwer Academic Publisher, 1997, Prop. 19, pp. 159.
Numerically Infeasible:
▪ Need to estimate trajectories over entire duration, complexity per time step explodes with time
▪ Can’t estimate trajectories over shorter window – no means to connect trajectories from different windows
▪ Necessary to truncate posterior, but no guarantee truncated “density” is a function of sets
▪ PHD approximation? complexity per time step still grows with time – won’t work in real applications.
Labeled Random Finite Set
Recall: PHD = instantaneous expected no. objects per unit (hyper) volume
PHD (intensity function) of an RFS
PHD peaks: highest expected No. points per unit volume,
most likely locations of objects
PHD estimator: find “best” cardinality estimate n*;
locate n* highest peaks.
state space 𝕏
𝑘
▪ “PHD” on trajectory space 𝕋 = ⨄𝑡=1 ℕ × 𝕏𝑡 has different units of measurements for different 𝑘
▪ How can we locate the n* highest peaks if they have different units?
Generalized Labeled Multi-Bernoulli (GLMB) Filter
GLMB filter: Multi-object Analogue of Kalman Smoother/Filter
Closure under Bayes recursion - Analytic solutions to:
▪ Multi-object Bayes Posterior Recursion (Smoothing-while-filtering)
estimated labeled multi-object state history X0:k
Vo & Vo “A Multi-Scan Labeled RFS Model for Multi-object State Estimation,” IEEE Trans. SP, 67(19):4948-4963, 2019
▪ Multi-object Bayes Filtering Recursion
estimated labeled multi-object states X0,…,Xk forms a set of tracks
Vo & Vo "Labeled RFSs and Multi-Object Conjugate Priors," IEEE Trans. SP, 61(13): 3460-3475, 2013
Generalized Labeled Multi-Bernoulli (GLMB) Filter
estimated trajectories on a 64km by 32km area
M. Beard et. al. "A Solution for Large-scale Multi-object Tracking," IEEE Trans. SP, 68:2754–2769, 2020.
Generalized Labeled Multi-Bernoulli (GLMB) Filter
Total no. objects tracked is not indicative of problem size
▪ easy to track arbitrarily large no. in scenarios with 1-2 objects per frame
Problem size is better reflected in the no. objects/observations per frame
Scenario
▪ surveillance region: 64km by 36km
▪ objects can appear anywhere, appearing rate ~ 200-3000/frame (unknown to filter)
▪ mean clutter density 200 km-2 (460800 per frame) uniform,
▪ detection probability 0.88
▪ linear Gaussian state & observation noise sigma of 0.2ms-2 & 5m
M. Beard et. al. "A Solution for Large-scale Multi-object Tracking," IEEE Trans. SP, 68:2754–2769, 2020.
Generalized Labeled Multi-Bernoulli (GLMB) Filter
▪ Over 1 million objects per frame
▪ Peak cardinality: 1,217,531 objects per frame (at time 700)
▪ Peak object density: 520 km-2
▪ Duration 1000 instances: ~ 1 billion data points
▪ OSPA(2): metric for sets of tracks,
▪ Generalization of OSPA that accounts for:
• Localization & Cardinality error;
• Track fragmentation;
• ID switches
Beard et. al. "A Solution for Large-scale MOT," IEEE Trans. SP, 68:2754–2769, 2020.
Generalized Labeled Multi-Bernoulli (GLMB) Filter
GLMB density - multi-object analogue of exponential mixture:
labels of X p ( x)
xX
( )
distinct label indicator associations history multi-object exponential
▪ Weight Normalization
▪ Cardinality Distribution & PHD
B.-T. Vo, et. al. "Labeled Random Finite Sets and Multi-Object Conjugate Priors," IEEE Trans. SP, 61(13): 3460-3475, 2013.
Generalized Labeled Multi-Bernoulli (GLMB) Filter
Labeled Multi-Object Density Approximations
Truncation of GLMB
▪ Any truncation of a GLMB is a GLMB – closure under truncation
▪ L1-norm of truncation error can be computed analytically
▪ Minimum L1-norm truncation error: truncate smallest weights
Vo et. al. "Labeled RFS and the Bayes Multi-Target Tracking Filter," IEEE Trans. SP, 62(24):6554-6567, 2014
Approximation of GLMB by a 1-term GLMB (or LMB) with
▪ Same PHD & same Cardinality distribution
Reuter et. al. "The labelled multi-Bernoulli filter," IEEE Trans. SP, 62(12):3246-3260, 2014
Approximation of labeled multi-object density by a GLMB with:
▪ Same PHD & same Cardinality distribution
▪ Minimal Kullback-Leibler divergence
Papi et. al., “GLMB approximation of Multi-object densities,” IEEE Trans. SP, 63(20):5487-5497, 2015
Generalized Labeled Multi-Bernoulli (GLMB) Filter
GLMB Filtering/Posterior sum grows in no. terms: requires reduction of terms
Truncate terms with smallest weights ⇒ minimum L1-norm truncation error
Implementation: How to truncate without exhaustive enumeration of the terms?
Filter implementation:
▪ K-shortest path prediction & ranked assignment update (cubic in no. detections)
▪ Gibbs sampling + Joint Prediction & Update (linear in no. detections)
Vo et. al. "Labeled RFS and the Bayes Multi-Target Tracking Filter," IEEE Trans. SP, 62(24):6554-6567, 2014.
Vo et. al. “An Efficient Implementation of the GLMB Filter,” IEEE Trans. SP, 65(8):1975-1987, 2017.
Multi-sensor filter implementation:
▪ Multi-dimensional assignment (NP-Hard): Gibbs sampling (linear in total no. of detections)
Vo et. al. “Multi-sensor multi-object tracking with the GLMB filter,” IEEE Trans. SP, 67 (23):5952-5967, 2019.
Smoother implementation:
▪ Multi-dimensional assignment (NP-Hard): Gibbs sampling (linear in total no. of time steps)
▪ On-line: smooth over fixed-length window and link trajectories with the same labels
Vo & Vo “A Multi-Scan Labeled RFS Model for Multi-object State Estimation,” IEEE Trans. SP, 67(19):4948-4963, 2019
Demonstration: Multi-object Smoothing
Cell-microscopy – posterior multi-scan statistics
▪ Duration 1000mins, sampling period D=10mins
▪ Clutter 0.3/scan, Scenario:
▪ Detection probability 0.33 (very low) ▪ min 001: 4 cells appear, live for 100mins
▪ Observation noise sigma 0.3mm ▪ min 200: 4 cells appear, live for 200mins
▪ Dynamic noise sigma 0.01mm/D2 ▪ min 500: 4 cells appear, live for 400mins
Statistics on births/deaths, cell-life, migration pattern
Vo & Vo “A Multi-Scan Labeled RFS Model for Multi-object State Estimation,” IEEE Trans. SP, 67(19):4948-4963, 2019
Demonstration: Multi-Sensor Fusion
3D (state + extent) tracking from multiple camera views
▪ 4 Kinect sensors placed high up and facing inwards in each corner
▪ YOLO detector produces 2D detections of bounding boxes in pixel space
▪ Detections are noisy and subject to false positives/negatives
Ong et. al. “A Bayesian Filter for Multi-view 3D Multi-object Tracking with Occlusion Handling,” IEEE Trans. PAMI, 2020.
Demonstration: Multi-Sensor Fusion
Tracking output from fused and filtered detections (colours denote estimated identities)
Ong et. al. “A Bayesian Filter for Multi-view 3D Multi-object Tracking with Occlusion Handling,” IEEE Trans. PAMI, 2020.
Demonstration: Multi-Sensor Fusion
Tracking multiple jumping/falling people in 3-D
Ong et. al. “A Bayesian Filter for Multi-view 3D Multi-object Tracking with Occlusion Handling,” IEEE Trans. PAMI, 2020.
Demonstration: Multi-Sensor Fusion
Provides tracks in 3D instead of ground-plane tracks as in existing methods
No retraining when reconfiguring cameras
CLEAR MOT scores and OSPA(2) 3D errors with 3D GIoU base-distance
Detectors (monocular): Faster-RCNN(VGG16) and YOLOv3
GLMB Filters: Standard Multi-Sensor (MS), Multi-View with Occlusion Model (MV)
Asterisk e.g. MV-GLMB-OC* indicates tracking while reconfiguring cameras
Ong et. al. “A Bayesian Filter for Multi-view 3D Multi-object Tracking with Occlusion Handling,” IEEE Trans. PAMI, 2020.
Demonstration: Autonomous Driving
Autonomous Driving: SLAM + multi-object filtering
Prototype system with E-Class Mercedes-Benz
H. Deusch et. al. “The Labeled Multi-Bernoulli SLAM filter,” IEEE Signal Proc. Letters, 22(10), 2015.
Demonstration: Autonomous Driving
Autonomous Driving: SLAM + multi-object filtering
4 patents (Bosch, BMW, Mitsubishi): autonomous driving /driver assistance system
Conclusions
Multi-Object Systems Theory
Filtering/Estimation Control System ID
Many emerging results Largely unexplored
Many interesting problems in
Artificial intelligence, machine learning, data mining
Communications, Astro-dynamics (Space Debris)
Bio-medical research: cell microscopy, brain imaging …
Preprints/latest works: http://ba-ngu.vo-au.com/publications.html
Matlab code: http://ba-tuong.vo-au.com/codes.html
Thank You!