Twin attentive deep reinforcement learning for multi-agent defensive convoy

Dongyu Fan¹, Haikuo Shen¹,² & Lijing Dong¹,²,³

Abstract

Multi-agent defensive convoy provides critical safety for a leader agent: escort agents coordinate their actions to protect the leader in the convoy. This paper investigates the multi-agent defensive convoy problem using deep reinforcement learning and an attention mechanism. To address joint overestimation and suboptimal policies in multi-agent environments, a novel multi-agent twin attentive reinforcement learning method is proposed, featuring a twin attentive critic and a delay attenuation policy. In addition, a variable temperature coefficient for maximum entropy is added to the learning process. The proposed method is evaluated on the designed defensive convoy environment and two public experimental environments, where it performs competitively with prior work. The contribution of each novel component is also studied and analyzed. Further evaluations show that the method is robust to several variations of the defensive convoy environment, including changes in the number of escort agents and in the number of dangers.
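The abstract names a twin critic as the remedy for overestimation and a temperature coefficient for the maximum-entropy term, but this preview does not show the paper's update rule. As a rough illustration only, the sketch below uses the widely known clipped double-Q soft target (the smaller of two critic estimates, minus a temperature-weighted log-probability). The function name `twin_soft_targets`, its parameters, and the sample numbers are hypothetical, and the attention mechanism and delay attenuation policy are not modeled here.

```python
from typing import List, Sequence


def twin_soft_targets(
    rewards: Sequence[float],
    dones: Sequence[bool],
    q1_next: Sequence[float],
    q2_next: Sequence[float],
    log_probs_next: Sequence[float],
    alpha: float,
    gamma: float = 0.99,
) -> List[float]:
    """Bellman targets from two critics (assumed form, not the paper's exact rule)."""
    targets = []
    for r, d, q1, q2, logp in zip(rewards, dones, q1_next, q2_next, log_probs_next):
        # Taking the smaller of the two critic estimates curbs overestimation.
        q_min = min(q1, q2)
        # Maximum-entropy bonus: subtract alpha * log pi(a'|s').
        soft_value = q_min - alpha * logp
        # Terminal transitions do not bootstrap from the next state.
        not_done = 0.0 if d else 1.0
        targets.append(r + gamma * not_done * soft_value)
    return targets


# Hypothetical batch of two transitions; the second one is terminal.
targets = twin_soft_targets(
    rewards=[1.0, 0.0],
    dones=[False, True],
    q1_next=[2.0, 5.0],
    q2_next=[3.0, 4.0],
    log_probs_next=[-1.0, -2.0],
    alpha=0.5,
    gamma=0.9,
)
```

A variable temperature, as the abstract describes, would amount to passing a different `alpha` at different points in training; how the paper actually varies it is not stated in this preview.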

Funding

This work is supported by the Fundamental Research Funds for the Central Universities under Grant 2022JBMC018 and the National Natural Science Foundation of China under Grant 61903022.

Author information

Authors and Affiliations

School of Mechanical, Electronic and Control Engineering, Beijing Jiaotong University, Beijing, 100044, China
Dongyu Fan, Haikuo Shen & Lijing Dong
Key Laboratory of Vehicle Advanced Manufacturing, Measuring and Control Technology (Beijing Jiaotong University), Ministry of Education, Beijing, 100044, China
Haikuo Shen & Lijing Dong
Beijing Advanced Innovation Center for Intelligent Robots and Systems, Beijing Institute of Technology, Beijing, 100081, China
Lijing Dong

Corresponding author

Correspondence to Haikuo Shen.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Fan, D., Shen, H. & Dong, L. Twin attentive deep reinforcement learning for multi-agent defensive convoy. Int. J. Mach. Learn. & Cyber. 14, 2239–2250 (2023). https://doi.org/10.1007/s13042-022-01759-5

Received: 19 June 2022
Accepted: 17 December 2022
Published: 27 December 2022
Issue Date: June 2023
DOI: https://doi.org/10.1007/s13042-022-01759-5
