Deep Reinforcement Learning for Computation Offloading and Resource Allocation in Unmanned-Aerial-Vehicle Assisted Edge Computing
<p>The architecture of UAV-assisted mobile edge computing system.</p> "> Figure 2
<p>Comparison of cumulative rewards of SACDCO algorithm under different learning rates.</p> "> Figure 3
<p>Comparison of cumulative rewards of SACDCO algorithm under different relative weight.</p> "> Figure 4
<p>Comparison of delay under different schemes.</p> "> Figure 5
<p>Comparison of energy consumption under different schemes.</p> "> Figure 6
<p>Comparison of the size of discarded tasks under different schemes.</p> "> Figure 7
<p>Comparison of cumulative rewards for different UAV computing capabilities under different schemes.</p> "> Figure 8
<p>Comparison of cumulative rewards for different UAV bandwidth under different schemes.</p> ">
Abstract
:1. Introduction
1.1. Motivation and Related Work
1.2. Main Contributions
2. System Model and Problem Formulation
2.1. System Model
2.1.1. Communication Model
2.1.2. Computation Model
- The flight delay from the previous location to the end-user directly above;
- The transmission delay from the end-user to the UAV;
- The calculation delay of the UAV;
- The calculation delay of the end-user;
- The transmission delay from the UAV to the MEC server;
- The calculation delay of the MEC server.
2.1.3. Energy Model
- The flight energy consumption of the UAV;
- The transmission energy consumption when UAV receives tasks from end-users;
- The calculation energy consumption of the UAV;
- The transmission energy consumption from the UAV to the MEC server.
2.2. Problem Formulation
3. Soft Actor–Critic Based Dynamic Computation Offloading Algorithm
3.1. Markov Decision Process
- State space: We consider the current location of the UAV , the UAV battery capacity , and the size of computing tasks as the current state. Therefore, the state space can be denoted as
- Action space: We consider the offloading rate , whether to further offload to the MEC servers as the current action of the agent. Therefore, action space can be denoted as
- Reward: We define cumulative rewards to minimize the weighted sum of service delay, energy consumption, and the size of discarded task. Thus, rewards can be denoted as
3.2. Soft Actor–Critic DRL Algorithm
Algorithm 1 Soft actor–critic algorithm |
Input:
|
Algorithm 2 SAC-based dynamic computation offloading algorithm (SACDCO) |
Input: The initial location of the UAV , the initial battery capacity of the UAV , and task size
|
4. Performance Evaluation
4.1. Simulation Settings
- (Local-Only Scheme) Only execute computing tasks by the end-user;
- (UAV-Only Scheme) Only execute computing tasks on the UAV without further offloading to any MEC servers;
- (Fixed Local-UAV Scheme) Half of the computing tasks is executed locally while the other half is executed on the UAV.
4.2. Simulation Result
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Lin, H.; Zeadally, S.; Chen, Z.; Labiod, H.; Wang, L. A survey on computation offloading modeling for edge computing. J. Netw. Comput. Appl. 2020, 169, 102781. [Google Scholar] [CrossRef]
- Alghamdi, I.; Anagnostopoulos, C.; Pezaros, D.P. Delay-tolerant sequential decision making for task offloading in mobile edge computing environments. Information 2019, 10, 312. [Google Scholar] [CrossRef] [Green Version]
- Zhang, K.; Mao, Y.; Leng, S.; Maharjan, S.; Zhang, Y. Optimal delay constrained offloading for vehicular edge computing networks. In Proceedings of the 2017 IEEE International Conference on Communications (ICC), Paris, France, 21–25 May 2017; pp. 1–6. [Google Scholar]
- Wu, Y.; Qian, L.P.; Ni, K.; Zhang, C.; Shen, X. Delay-minimization nonorthogonal multiple access enabled multi-user mobile edge computation offloading. IEEE J. Sel. Top. Signal Process. 2019, 13, 392–407. [Google Scholar] [CrossRef]
- You, C.; Zeng, Y.; Zhang, R.; Huang, K. Asynchronous mobile-edge computation offloading: Energy-efficient resource management. IEEE Trans. Wirel. Commun. 2018, 17, 7590–7605. [Google Scholar] [CrossRef] [Green Version]
- Pan, Y.; Chen, M.; Yang, Z.; Huang, N.; Shikh-Bahaei, M. Energy-efficient NOMA-based mobile edge computing offloading. IEEE Commun. Lett. 2018, 23, 310–313. [Google Scholar] [CrossRef]
- Xu, X.; Li, Y.; Huang, T.; Xue, Y.; Peng, K.; Qi, L.; Dou, W. An energy-aware computation offloading method for smart edge computing in wireless metropolitan area networks. J. Netw. Comput. Appl. 2019, 133, 75–85. [Google Scholar] [CrossRef]
- Zhang, G.; Zhang, W.; Cao, Y.; Li, D.; Wang, L. Energy-delay tradeoff for dynamic offloading in mobile-edge computing system with energy harvesting devices. IEEE Trans. Ind. Inform. 2018, 14, 4642–4655. [Google Scholar] [CrossRef]
- Zhang, W.; Zhang, Z.; Zeadally, S.; Chao, H.C.; Leung, V.C. MASM: A multiple-algorithm service model for energy-delay optimization in edge artificial intelligence. IEEE Trans. Ind. Inform. 2019, 15, 4216–4224. [Google Scholar] [CrossRef]
- Vu, T.T.; Van Huynh, N.; Hoang, D.T.; Nguyen, D.N.; Dutkiewicz, E. Offloading energy efficiency with delay constraint for cooperative mobile edge computing networks. In Proceedings of the 2018 IEEE Global Communications Conference (GLOBECOM), Abu Dhabi, United Arab Emirates, 9–13 December 2018; pp. 1–6. [Google Scholar]
- Neely, M.J. Intelligent packet dropping for optimal energy-delay tradeoffs in wireless downlinks. IEEE Trans. Autom. Control 2009, 54, 565–579. [Google Scholar] [CrossRef] [Green Version]
- Yu, H.; Neely, M.J. A new backpressure algorithm for joint rate control and routing with vanishing utility optimality gaps and finite queue lengths. IEEE/ACM Trans. Netw. 2018, 26, 1605–1618. [Google Scholar] [CrossRef]
- Sharma, G.; Mazumdar, R.; Shroff, N. Delay and capacity trade-offs in mobile ad hoc networks: A global perspective. IEEE/ACM Trans. Netw. 2007, 15, 981–992. [Google Scholar] [CrossRef] [Green Version]
- Mao, Z.; Koksal, C.E.; Shroff, N.B. Near optimal power and rate control of multi-hop sensor networks with energy replenishment: Basic limitations with finite energy and data storage. IEEE Trans. Autom. Control 2011, 57, 815–829. [Google Scholar]
- Zeng, Y.; Zhang, R.; Lim, T.J. Throughput maximization for UAV-enabled mobile relaying systems. IEEE Trans. Commun. 2016, 64, 4983–4996. [Google Scholar] [CrossRef]
- Wu, Q.; Zeng, Y.; Zhang, R. Joint trajectory and communication design for multi-UAV enabled wireless networks. IEEE Trans. Wirel. Commun. 2018, 17, 2109–2121. [Google Scholar] [CrossRef] [Green Version]
- Hu, Q.; Cai, Y.; Yu, G.; Qin, Z.; Zhao, M.; Li, G.Y. Joint offloading and trajectory design for UAV-enabled mobile edge computing systems. IEEE Internet Things J. 2018, 6, 1879–1892. [Google Scholar] [CrossRef]
- Jeong, S.; Simeone, O.; Kang, J. Mobile edge computing via a UAV-mounted cloudlet: Optimization of bit allocation and path planning. IEEE Trans. Veh. Technol. 2017, 67, 2049–2063. [Google Scholar] [CrossRef] [Green Version]
- Zhou, F.; Wu, Y.; Hu, R.Q.; Qian, Y. Computation rate maximization in UAV-enabled wireless-powered mobile-edge computing systems. IEEE J. Sel. Areas Commun. 2018, 36, 1927–1941. [Google Scholar] [CrossRef] [Green Version]
- Xiong, J.; Guo, H.; Liu, J. Task offloading in UAV-aided edge computing: Bit allocation and trajectory optimization. IEEE Commun. Lett. 2019, 23, 538–541. [Google Scholar] [CrossRef]
- Shakarami, A.; Ghobaei-Arani, M.; Shahidinejad, A. A survey on the computation offloading approaches in mobile edge computing: A machine learning-based perspective. Comput. Netw. 2020, 182, 107496. [Google Scholar] [CrossRef]
- Huang, L.; Feng, X.; Zhang, C.; Qian, L.; Wu, Y. Deep reinforcement learning-based joint task offloading and bandwidth allocation for multi-user mobile edge computing. Digit. Commun. Netw. 2019, 5, 10–17. [Google Scholar] [CrossRef]
- Yang, L.; Yao, H.; Wang, J.; Jiang, C.; Benslimane, A.; Liu, Y. Multi-UAV-enabled load-balance mobile-edge computing for IoT networks. IEEE Internet Things J. 2020, 7, 6898–6908. [Google Scholar] [CrossRef]
- Liu, Y.; Xie, S.; Zhang, Y. Cooperative offloading and resource management for UAV-enabled mobile edge computing in power IoT system. IEEE Trans. Veh. Technol. 2020, 69, 12229–12239. [Google Scholar] [CrossRef]
- Zhu, S.; Gui, L.; Zhao, D.; Cheng, N.; Zhang, Q.; Lang, X. Learning-based computation offloading approaches in UAVs-assisted edge computing. IEEE Trans. Veh. Technol. 2021, 70, 928–944. [Google Scholar] [CrossRef]
- Asheralieva, A.; Niyato, D. Hierarchical game-theoretic and reinforcement learning framework for computational offloading in UAV-enabled mobile edge computing networks with multiple service providers. IEEE Internet Things J. 2019, 6, 8753–8769. [Google Scholar] [CrossRef]
- Taleb, T.; Ksentini, A. Follow me cloud: Interworking federated clouds and distributed mobile networks. IEEE Netw. 2013, 27, 12–19. [Google Scholar] [CrossRef]
- Wang, Y.; Sheng, M.; Wang, X.; Wang, L.; Li, J. Mobile-edge computing: Partial computation offloading using dynamic voltage scaling. IEEE Trans. Commun. 2016, 64, 4268–4282. [Google Scholar] [CrossRef]
- Sutton, R.S.; Barto, A.G. Introduction to Reinforcement Learning; MIT Press: Cambridge, UK, 1998; Volume 135. [Google Scholar]
- Mohammed, A.; Nahom, H.; Tewodros, A.; Habtamu, Y.; Hayelom, G. Deep reinforcement learning for computation offloading and resource allocation in blockchain-based multi-UAV-enabled mobile edge computing. In Proceedings of the 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), Chengdu, China, 18–20 December 2020; pp. 295–299. [Google Scholar]
- Haarnoja, T.; Zhou, A.; Hartikainen, K.; Tucker, G.; Ha, S.; Tan, J.; Kumar, V.; Zhu, H.; Gupta, A.; Abbeel, P.; et al. Soft actor-critic algorithms and applications. arXiv 2018, arXiv:1812.05905. [Google Scholar]
- Shenzhen DJI Innovation Technology Co., Ltd. Technical parameters of DJI Air 2S. Available online: https://www.dji.com/air-2s/specs (accessed on 27 August 2021).
- Savkin, A.V.; Huang, H. Deployment of unmanned aerial vehicle base stations for optimal quality of coverage. IEEE Wirel. Commun. Lett. 2018, 8, 321–324. [Google Scholar] [CrossRef]
- Chen, Q. Joint position and resource optimization for multi-UAV-aided relaying systems. IEEE Access 2020, 8, 10403–10415. [Google Scholar] [CrossRef]
Reference | Communication-Only | Communication and Computation | EU-UAV | EU-UAV-MEC | Partial Offloading | RL Algorithm | Optimization Objective |
---|---|---|---|---|---|---|---|
[15] | ✓ | Throughput | |||||
[16] | ✓ | Throughput | |||||
[17] | ✓ | ✓ | ✓ | Delay | |||
[18] | ✓ | ✓ | Energy consumption | ||||
[19] | ✓ | ✓ | Computation rate | ||||
[20] | ✓ | ✓ | Energy consumption | ||||
[22] | ✓ | ✓ | DQN | The cost of energy, computation, and delay | |||
[23] | ✓ | ✓ | DQN | Load balance | |||
[24] | ✓ | ✓ | ✓ | DQN | System utility | ||
[25] | ✓ | ✓ | ✓ | Actor–Critic | Average response time | ||
[26] | ✓ | ✓ | DQN | System reward | |||
Our work | ✓ | ✓ | ✓ | Soft Actor–Critic | The cost of delay, energy, and discarded tasks |
Notations | Definitions |
---|---|
The set of end-user n | |
The set of MEC server s | |
The set of unmanned aerial vehicle u | |
The set of time slot t | |
The maximum number of end users or MEC servers | |
The location of the end-user | |
The location of the MEC server | |
The location of the UAV | |
The channel gain between the end-user and the UAV u | |
The transmission rate between the end-user and the UAV u | |
The channel gain between the UAV u and the MEC server | |
The transmission rate between the UAV u and the MEC server | |
The flight delay of UAV u | |
The transmission delay between the end-user and the UAV | |
The channel gain between the MEC server and the UAV u | |
The calculation delay of the end-user | |
The transmission delay between the UAV and the MEC server | |
The computing tasks that end-user needs to complete | |
The offloading ratio of UAV | |
Whether to further offload to the MEC server | |
The total size of the discarded tasks in time slot t |
Parameters | Values | Parameters | Values |
---|---|---|---|
−50 dBm | L | 500 m | |
10 MHz | W | 500 m | |
30 MHz | H | 100 m | |
0.1 w | |||
1 w | |||
−100 dBm | |||
−80 dBm | |||
15 m/s | |||
80 Mbit | |||
s | 1000 | ||
0.6 kg | |||
J | 0.4 GHz | ||
3.0 GHz × 2 | 0.6 GHz | ||
3.0 GHz × 4 | 0.8 GHz | ||
g | 10 m/s | 1.0 GHz | |
K |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations. |
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Li, S.; Hu, X.; Du, Y. Deep Reinforcement Learning for Computation Offloading and Resource Allocation in Unmanned-Aerial-Vehicle Assisted Edge Computing. Sensors 2021, 21, 6499. https://doi.org/10.3390/s21196499
Li S, Hu X, Du Y. Deep Reinforcement Learning for Computation Offloading and Resource Allocation in Unmanned-Aerial-Vehicle Assisted Edge Computing. Sensors. 2021; 21(19):6499. https://doi.org/10.3390/s21196499
Chicago/Turabian StyleLi, Shuyang, Xiaohui Hu, and Yongwen Du. 2021. "Deep Reinforcement Learning for Computation Offloading and Resource Allocation in Unmanned-Aerial-Vehicle Assisted Edge Computing" Sensors 21, no. 19: 6499. https://doi.org/10.3390/s21196499