Abstract
Multi-object tracking (MOT) has been dominated by tracking-by-detection approaches, owing to the success of convolutional neural networks (CNNs) in detection over the last decade. As datasets and benchmarking sites have been published, the research direction has shifted towards achieving the best accuracy on generic scenarios, including re-identification (reID) of objects while tracking. In this study, we narrow the scope of MOT to surveillance by providing a dedicated dataset of pedestrians, and we focus on in-depth analyses of well-performing multi-object trackers to observe the strengths and weaknesses of state-of-the-art (SOTA) techniques in real-world applications. For this purpose, we introduce the SOMPT22 dataset, a new set for multi-person tracking with annotated short videos captured from static cameras mounted on poles 6-8 m high and positioned for city surveillance. This provides a more focused and specific benchmark of MOT for outdoor surveillance than public MOT datasets. On this new dataset, we analyze MOT trackers classified as one-shot or two-stage according to how they use detection and reID networks. The experimental results on our new dataset indicate that SOTA is still far from high efficiency, and that single-shot trackers are good candidates for unifying fast execution and accuracy with competitive performance. The dataset will be available at: sompt22.github.io.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Simsek, F.E., Cigla, C., Kayabol, K. (2023). SOMPT22: A Surveillance Oriented Multi-pedestrian Tracking Dataset. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds) Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol 13805. Springer, Cham. https://doi.org/10.1007/978-3-031-25072-9_44
DOI: https://doi.org/10.1007/978-3-031-25072-9_44
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25071-2
Online ISBN: 978-3-031-25072-9
eBook Packages: Computer Science (R0)