DOI: 10.1145/3503161.3548231
research-article
Open access

Tracking Game: Self-adaptative Agent based Multi-object Tracking

Published: 10 October 2022

Abstract

Multi-object tracking (MOT) has become an active task in multimedia analysis. It must not only locate the objects but also maintain their unique identities. However, previous methods suffer tracking failures in complex scenes because they lose most of the unique attributes of each target. In this paper, we formulate the MOT problem as a Tracking Game and propose a Self-adaptative Agent Tracker (SAT) framework to solve it. The roles in the Tracking Game fall into two classes: the agent player and the game organizer. The organizer controls the game and optimizes the agents' actions from a global perspective, while each agent encodes the attributes of its target and selects its action dynamically. To this end, we design the State Transition Net to update the agent state and the Action Decision Net to implement a flexible tracking strategy for each agent. Finally, we present the organizer-agent coordination tracking algorithm, which leverages both global and individual information. Experiments show that the proposed SAT achieves state-of-the-art performance on both the MOT17 and MOT20 benchmarks.
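
The division of roles described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the State Transition Net and the Action Decision Net are learned networks in the paper, whereas here they are replaced by a hand-written blending rule and a fixed threshold, and the organizer is a brute-force assignment over 1-D positions. All names (`Agent`, `organize`, `step`, `gate`, `miss_cost`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass
class Agent:
    """One agent per target; `state` is a 1-D stand-in for the encoded attributes."""
    ident: int
    state: float
    missed: int = 0

    def transition(self, observation: float, rate: float = 0.5) -> None:
        # Toy stand-in for a state update: blend the old state with the observation.
        self.state = (1 - rate) * self.state + rate * observation
        self.missed = 0

    def decide(self, cost: float, gate: float = 2.0) -> str:
        # Toy stand-in for an action decision: accept only sufficiently cheap matches.
        return "accept" if cost <= gate else "wait"


def organize(agents, detections, miss_cost=2.0):
    """Organizer: pick the globally cheapest one-to-one proposal by brute force."""
    # Pad with None so that every agent may remain unmatched.
    slots = list(detections) + [None] * len(agents)
    best, best_cost = (), float("inf")
    for perm in permutations(range(len(slots)), len(agents)):
        cost = sum(
            miss_cost if slots[j] is None else abs(a.state - slots[j])
            for a, j in zip(agents, perm)
        )
        if cost < best_cost:
            best, best_cost = perm, cost
    return [slots[j] for j in best]


def step(agents, detections):
    """One frame: the organizer proposes globally, each agent decides individually."""
    actions = {}
    for agent, det in zip(agents, organize(agents, detections)):
        if det is not None and agent.decide(abs(agent.state - det)) == "accept":
            agent.transition(det)
            actions[agent.ident] = "accept"
        else:
            agent.missed += 1
            actions[agent.ident] = "wait"
    return actions


agents = [Agent(1, 0.0), Agent(2, 10.0)]
first = step(agents, [9.0, 1.0])   # both targets are detected
second = step(agents, [0.4])       # only one detection arrives
```

In this sketch the organizer sees all agents and detections at once, while each agent keeps the final say over its own action, which mirrors the global and individual split the abstract describes only at the level of control flow.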

Supplementary Material

MP4 File (mmfp1988.mp4)
Presentation video introducing the main contribution, theory, and experiments of the paper. The paper casts MOT as a game in which the game organizer and the agent players cooperate to track targets. The final results are competitive on both MOT17 and MOT20.


    Published In
    MM '22: Proceedings of the 30th ACM International Conference on Multimedia
    October 2022
    7537 pages
    ISBN:9781450392037
    DOI:10.1145/3503161
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. multi-object tracking
    2. self-adaptive agent
    3. tracking game
    4. video analysis

    Acceptance Rates

    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

    Article Metrics

    • Downloads (last 12 months): 224
    • Downloads (last 6 weeks): 21
    Reflects downloads up to 18 Feb 2025

    Cited By

    • (2025) Localization-Guided Track: A Deep Association Multiobject Tracking Framework Based on Localization Confidence of Camera Detections. IEEE Sensors Journal 25:3 (5282-5293). DOI: 10.1109/JSEN.2024.3522021. Online publication date: 1-Feb-2025
    • (2024) GLATrack: Global and Local Awareness for Open-Vocabulary Multiple Object Tracking. Proceedings of the 32nd ACM International Conference on Multimedia (2457-2466). DOI: 10.1145/3664647.3681530. Online publication date: 28-Oct-2024
    • (2024) Object-Level Pseudo-3D Lifting for Distance-Aware Tracking. Proceedings of the 32nd ACM International Conference on Multimedia (8015-8023). DOI: 10.1145/3664647.3680783. Online publication date: 28-Oct-2024
    • (2024) ConfTrack: Kalman Filter-based Multi-Person Tracking by Utilizing Confidence Score of Detection Box. 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (6569-6578). DOI: 10.1109/WACV57701.2024.00645. Online publication date: 3-Jan-2024
    • (2024) DeconfuseTrack: Dealing with Confusion for Multi-Object Tracking. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (19290-19299). DOI: 10.1109/CVPR52733.2024.01825. Online publication date: 16-Jun-2024
    • (2024) Spatial-angular-epipolar transformer for light field spatial and angular super-resolution. Displays (102816). DOI: 10.1016/j.displa.2024.102816. Online publication date: Aug-2024
    • (2024) Online 3D behavioral tracking of aquatic model organism with a dual-camera system. Advanced Engineering Informatics 61 (102481). DOI: 10.1016/j.aei.2024.102481. Online publication date: Aug-2024
    • (2023) Fusion of Multi-Modal Features to Enhance Dense Video Caption. Sensors 23:12 (5565). DOI: 10.3390/s23125565. Online publication date: 14-Jun-2023
    • (2023) Parallel Dense Video Caption Generation with Multi-Modal Features. Mathematics 11:17 (3685). DOI: 10.3390/math11173685. Online publication date: 26-Aug-2023
    • (2023) Cross-View Recurrence-Based Self-Supervised Super-Resolution of Light Field. IEEE Transactions on Circuits and Systems for Video Technology 33:12 (7252-7266). DOI: 10.1109/TCSVT.2023.3278462. Online publication date: 22-May-2023
