CN113114585B

CN113114585B - Method, device and storage medium for joint optimization of task migration and network transmission

Info

Publication number: CN113114585B
Application number: CN202110391253.6A
Authority: CN
Inventors: 孙远; 李振宇; 黄韬
Original assignee: Network Communication and Security Zijinshan Laboratory
Current assignee: Zijinshan Laboratory
Priority date: 2021-04-13
Filing date: 2021-04-13
Publication date: 2022-10-18
Anticipated expiration: 2041-04-13
Also published as: CN113114585A

Abstract

The invention relates to the field of computer communication, and in particular to a method, device and storage medium for joint optimization of task migration and network transmission. The method comprises the following steps: determining whether the number of times the method has been called is 0; performing an initialization operation; inferring a computing task migration and TCP initial congestion window setting scheme; updating an experience replay buffer; determining whether the Seq2Seq model parameters need to be updated; and updating the Seq2Seq model parameters. The method provides a scheme for setting the TCP initial congestion window. Compared with a physical-layer resource allocation scheme for the communication network, this scheme is easier to implement and can be carried out by the mobile device independently. In addition, the method uses deep reinforcement learning to infer the computing task migration and TCP initial congestion window setting scheme, so decisions that minimize computing task completion time can be made quickly.

Description

Method, device and storage medium for joint optimization of task migration and network transmission

Technical Field

The invention relates to the field of computer communication, and in particular to a method, device and storage medium for joint optimization of task migration and network transmission.

Background

With the spread of mobile devices such as smartphones and tablet computers, mobile applications (that is, applications running on mobile devices) have grown explosively in recent years. However, because mobile devices have limited computing power and network transmission delay is high, some complex mobile applications (such as face recognition, augmented reality and interactive games) struggle to deliver a satisfactory user experience on mobile devices.

To address this, the edge computing paradigm has been proposed. It allows mobile devices to migrate complex computing tasks to the edge cloud for execution, which speeds up their execution. However, migrating a computing task necessarily requires transmitting the task's input data over the network, which increases the task's completion time.

Existing technical solutions mainly consider the allocation of physical-layer resources of the communication network. They solve an optimization problem with numerical optimization methods to obtain the optimal computing task migration strategy and physical-layer resource allocation scheme, with the aim of minimizing the computing task completion time. The prior art (Energy-efficient dynamic offloading and resource scheduling in mobile cloud computing. In IEEE INFOCOM 2016 - The 35th Annual IEEE International Conference on Computer Communications) proposes a joint decision method for computing task migration and physical-layer resource allocation based on the Lagrange multiplier method. The method comprises the following steps: first, initialize the Lagrange multipliers; second, solve the computing task migration decision subproblem; third, solve the mobile device CPU frequency control subproblem; fourth, solve the mobile device transmission power allocation subproblem; fifth, update the Lagrange multipliers; sixth, if the maximum number of iterations has been reached, terminate the algorithm, otherwise return to the second step. However, the prior art has the following disadvantages:

1. The existing solution needs many iterations to converge to the optimal solution, so the algorithm runs for a long time and cannot make optimal decisions quickly;

2. The physical-layer resource allocation scheme produced by the existing solution is difficult to implement: it requires the support of the communication network operator and cannot be carried out by the mobile device independently.

Therefore, how to use limited network resources to minimize the completion time of computing tasks has become an urgent problem.

Summary of the Invention

To overcome the deficiencies described in the background above, the purpose of the present invention is to provide a method, device and storage medium for joint optimization of task migration and network transmission, which solve the problems of long algorithm running time and allocation schemes that are difficult to implement.

The purpose of the present invention can be achieved through the following technical solutions:

A method for joint optimization of task migration and network transmission, comprising the following steps:

Step 1: Determine whether the number of times the method has been called is 0; if the number of calls t is 0, go to Step 2, otherwise go to Step 3;

Step 2: Perform the initialization operation: initialize a basic Seq2Seq model with random numbers between 0 and 1. The input sequence length of the model is 6M, where M is the number of mobile devices in the scenario, the hidden layer length is k, and the output sequence length is M;

Here k may be any positive integer, each element of the output sequence may be any integer, and M is the number of mobile devices;

Step 3: Infer the computing task migration and TCP initial congestion window setting scheme;

Step 4: Update the experience replay buffer;

Step 5: Determine whether the Seq2Seq model parameters need to be updated: compute the remainder of the number of calls t divided by d, where d may be any positive integer; if the remainder is 0, go to Step 6, otherwise the method ends;

Step 6: Update the Seq2Seq model parameters.

Further, the initial value of the number of calls t in Step 1 is 0, and each time the method is called, the value of t increases by 1, that is, it becomes t+1.

Further, k is preferably a positive integer between 64 and 512.
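The control flow of Steps 1 to 6 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class name `JointOptimizer` and the string labels are hypothetical, initialization is assumed to fall through to Step 3, and t is assumed to be incremented at the end of each call so that the first call sees t = 0.

```python
class JointOptimizer:
    """Control-flow skeleton for Steps 1-6; the step labels stand in for real work."""

    def __init__(self, d):
        if d <= 0:
            raise ValueError("d must be a positive integer")
        self.d = d  # update period used in Step 5
        self.t = 0  # number of times the method has been called so far

    def call(self):
        steps = []
        if self.t == 0:            # Step 1: first call?
            steps.append("init")   # Step 2: initialize the Seq2Seq model
        steps.append("infer")      # Step 3: infer migration + initial cwnd scheme
        steps.append("store")      # Step 4: update the experience replay buffer
        if self.t % self.d == 0:   # Step 5: is an update due?
            steps.append("update")  # Step 6: update the model parameters
        self.t += 1                # assumption: t grows by 1 after each call
        return steps
```

With d = 3, the first call runs all four actions, the next two calls only infer and store, and the fourth call triggers another update.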

Further, the specific operations of the scheme in Step 3 are as follows:

(1) Combine the state vectors of all mobile devices into a system state vector S, that is

$$S = [S_1, S_2, \ldots, S_i, \ldots, S_M],$$

where $M$ is the number of mobile devices and $S_i$ is the state vector of the $i$-th mobile device; $S_i$ contains 6 mobile device states, namely

$$S_i = [IAT_i, IDT_i, IDS_i, LET_i, RTT_i, TP_i],$$

where $IAT_i$ is the time interval between the arrivals of the previous two computing task migration requests on the $i$-th mobile device, $IDT_i$ is the time interval between the completions of the previous two computing tasks on the $i$-th mobile device, $IDS_i$ is the input data size of the computing task that the $i$-th mobile device requests to migrate, $LET_i$ is the local running time of the computing task that the $i$-th mobile device requests to migrate, $RTT_i$ is the round-trip delay between the $i$-th mobile device and the edge server, and $TP_i$ is the throughput of the $i$-th mobile device;

(2) Input the system state vector S into the Seq2Seq model and obtain the inference result T, that is, the computing task migration and TCP initial congestion window setting scheme; T is a vector containing M elements, namely

$$T = [T_1, T_2, \ldots, T_i, \ldots, T_M]$$

The meaning of each element is as follows: if $T_i$ equals 0, the $i$-th mobile device is not allowed to migrate its computing task to the edge server; if $T_i$ is greater than 0, the $i$-th mobile device is allowed to migrate its computing task to the edge server, and the initial congestion window of the TCP connection used to transmit the task input data is set to $T_i$.
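The assembly of S and the reading of T can be illustrated with a short sketch. The names `DeviceState`, `build_system_state` and `interpret_decision` are hypothetical, and rejecting negative values of $T_i$ is an assumption, since the text only defines the cases $T_i = 0$ and $T_i > 0$.

```python
from typing import List, NamedTuple, Optional


class DeviceState(NamedTuple):
    """The six per-device states, in the order used for S_i."""
    iat: float  # interval between the previous two migration requests
    idt: float  # interval between the previous two task completions
    ids: float  # input data size of the task to migrate
    let: float  # local running time of the task
    rtt: float  # round-trip delay to the edge server
    tp: float   # throughput


def build_system_state(devices: List[DeviceState]) -> List[float]:
    """Concatenate the M device vectors into one sequence of length 6M."""
    state: List[float] = []
    for dev in devices:
        state.extend(dev)
    return state


def interpret_decision(t: List[int]) -> List[Optional[int]]:
    """Map each T_i to None (no migration) or an initial congestion window."""
    plan: List[Optional[int]] = []
    for t_i in t:
        if t_i < 0:
            # Assumption: negative outputs are treated as invalid.
            raise ValueError("T_i must be non-negative")
        plan.append(None if t_i == 0 else t_i)
    return plan
```

For two devices, `build_system_state` yields a 12-element sequence, and `interpret_decision([0, 10])` means the first device runs its task locally while the second migrates it with an initial congestion window of 10.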

Further, the specific operations of Step 4 are as follows:

(1) Determine whether the experience replay buffer has free space; if it does not, find the experience entry that was added to the experience replay buffer earliest and delete it from the buffer;

(2) Add the system state S used in the current call, the inference result T and the current time to the experience replay buffer as one experience entry.

Further, the experience replay buffer is used to store multiple experience entries. Its maximum length is L, where L may be any positive integer. Each experience entry can be represented as a triple <S,T,E>, where S is the system state, T is the inference result, and E is the time at which the entry was added to the experience replay buffer.
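A first-in, first-out buffer of <S,T,E> triples with maximum length L can be sketched as below. The class name `ExperienceReplayBuffer` and the optional `now` argument (used to inject a timestamp) are illustrative assumptions.

```python
import time
from collections import deque


class ExperienceReplayBuffer:
    """Bounded FIFO store of (S, T, E) experience entries."""

    def __init__(self, max_len):
        if max_len <= 0:
            raise ValueError("L must be a positive integer")
        self.max_len = max_len
        self._entries = deque()

    def add(self, state, decision, now=None):
        # Sub-step (1): if the buffer is full, drop the oldest entry.
        if len(self._entries) >= self.max_len:
            self._entries.popleft()
        # Sub-step (2): append <S, T, E> with E = time of insertion.
        stamp = time.time() if now is None else now
        self._entries.append((state, decision, stamp))

    def __len__(self):
        return len(self._entries)

    def entries(self):
        return list(self._entries)
```

The explicit `popleft` mirrors sub-step (1); constructing the deque with `maxlen=L` would give the same eviction behavior implicitly.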

Further, the specific operations of Step 6 are as follows:

(1) Determine whether the number of experience entries in the experience replay buffer is greater than or equal to N, where N may be any positive integer less than or equal to the maximum length L of the experience replay buffer; if so, go to sub-step (2), otherwise the method ends;

(2) Randomly select N experience entries from the experience replay buffer;

(3) Extract the system state and the inference result from each of these N experience entries to form N pairs <S,T>, where S is the system state and T is the inference result;

(4) Use these N pairs as the training dataset to train the Seq2Seq model and thereby update the Seq2Seq model parameters.
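Sub-steps (1) to (3) amount to a guarded random sample followed by a projection onto <S,T>. A minimal sketch, assuming entries are stored as (S, T, E) tuples; the function name `sample_training_pairs` and the optional `rng` argument are hypothetical, and the training call of sub-step (4) is left out.

```python
import random


def sample_training_pairs(entries, n, rng=None):
    """Return n randomly chosen (S, T) pairs, or None if fewer than n entries exist."""
    if len(entries) < n:
        return None  # sub-step (1): not enough experience, the method ends
    rng = rng or random
    batch = rng.sample(list(entries), n)      # sub-step (2): sample without replacement
    return [(s, t) for (s, t, _e) in batch]   # sub-step (3): drop the timestamp E
```

Passing a seeded `random.Random` instance as `rng` makes the selection reproducible, which is convenient when debugging training runs.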

A computer-readable storage medium storing instructions which, when executed, implement the above method for joint optimization of task migration and network transmission.

A device for joint optimization of task migration and network transmission, comprising a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the processor implements the above method for joint optimization of task migration and network transmission when executing the program.

Beneficial effects of the present invention:

1. Compared with solving an optimization problem by numerical methods, the present invention uses deep reinforcement learning to infer the computing task migration and TCP initial congestion window setting scheme, which is faster and meets the need for rapid decision-making;

2. Compared with a physical-layer resource allocation scheme for the communication network, the TCP initial congestion window setting scheme provided by the present invention is easier to implement and can be carried out by the mobile device independently.

Description of Drawings

To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. A person of ordinary skill in the art can obtain other drawings from these drawings without creative effort.

Figure 1 is a schematic structural diagram of the Seq2Seq model of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention are described clearly and completely below. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.

The present invention provides a method for joint optimization of task migration and network transmission, comprising the following steps:

It should be noted that the initial value of the number of calls t is 0, and each time the method is called, the value of t increases by 1 (that is, it becomes t+1).

Step 2: Perform the initialization operation. That is, initialize a basic Seq2Seq model with random numbers between 0 and 1 (the model structure is shown in Figure 1). The input sequence length of the model is 6M, where M is the number of mobile devices in the scenario, the hidden layer length is k, and the output sequence length is M;

k may be any positive integer, preferably a positive integer between 64 and 512; each element of the output sequence may be any integer; M is the number of mobile devices.

Seq2Seq is a recurrent neural network model that converts an input sequence into an output sequence and is commonly used in machine translation, dialogue systems, automatic summarization and similar applications. In the present invention, the Seq2Seq model is used to establish the optimal mapping between the system state vector and the computing task migration and TCP initial congestion window setting scheme: the system state vector is treated as the input sequence, and the computing task migration and TCP initial congestion window setting scheme is treated as the output sequence.

Step 3: Obtain the state vectors of all mobile devices;

For each mobile device, perform the following operations:

(1) Calculate the time interval between the arrivals of the previous two computing task migration requests (denoted IAT); if the current number of computing task migration requests is less than 3, let IAT;

(2) Calculate the time interval between the completions of the previous two computing tasks (denoted IDT); if the current number of computing task migration requests is less than 3, let IDT;

(3) Obtain the input data size of the computing task requested for migration (denoted IDS);

(4) Calculate the local running time of the computing task requested for migration (denoted LET), using the following formula:

$$LET = \frac{\alpha}{\varphi}$$

where $\alpha$ is the number of CPU cycles required by the computing task requested for migration and $\varphi$ is the CPU frequency of the mobile device;

(5) Measure the round-trip delay (denoted RTT) and the throughput (denoted TP) between the mobile device and the edge server;

(6) Combine the above mobile device states to form the state vector of the mobile device; that is, the state vector of the mobile device can be represented as a 6-tuple <IAT, IDT, IDS, LET, RTT, TP>.
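The per-device quantities of sub-steps (1) and (4) can be sketched as follows. This assumes LET is the required CPU cycles α divided by the CPU frequency φ, as the variable definitions suggest. The function names are hypothetical, and the `default` parameter is an explicit placeholder: the value assigned to IAT and IDT when fewer than 3 requests have been seen is not recoverable from the text.

```python
def local_execution_time(cpu_cycles, cpu_frequency_hz):
    """LET: required CPU cycles (alpha) divided by CPU frequency (phi)."""
    if cpu_frequency_hz <= 0:
        raise ValueError("CPU frequency must be positive")
    return cpu_cycles / cpu_frequency_hz


def previous_interval(timestamps, default):
    """Interval between the two events that precede the current one.

    `timestamps` lists event times in ascending order, including the current
    event as the last item. With fewer than 3 events there are no two
    previous events, so the caller-supplied `default` is returned
    (hypothetical fallback; the source does not state the value).
    """
    if len(timestamps) < 3:
        return default
    return timestamps[-2] - timestamps[-3]
```

For example, a task needing 2e9 cycles on a 1 GHz CPU has a local running time of 2 seconds; the same `previous_interval` helper serves for both IAT (request arrival times) and IDT (task completion times).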

Step 4: Infer the computing task migration and TCP initial congestion window setting scheme;

The specific operations are as follows:

(1) Combine the state vectors of all mobile devices into a system state vector S, that is

$$S = [S_1, S_2, \ldots, S_i, \ldots, S_M],$$

where $M$ is the number of mobile devices and $S_i$ is the state vector of the $i$-th mobile device, namely

$$S_i = [IAT_i, IDT_i, IDS_i, LET_i, RTT_i, TP_i],$$

where $IAT_i$ is the time interval between the arrivals of the previous two computing task migration requests on the $i$-th mobile device, $IDT_i$ is the time interval between the completions of the previous two computing tasks on the $i$-th mobile device, $IDS_i$ is the input data size of the computing task that the $i$-th mobile device requests to migrate, $LET_i$ is the local running time of the computing task that the $i$-th mobile device requests to migrate, $RTT_i$ is the round-trip delay between the $i$-th mobile device and the edge server, and $TP_i$ is the throughput of the $i$-th mobile device.

(2) Input the system state vector S into the Seq2Seq model and obtain the inference result T (that is, the computing task migration and TCP initial congestion window setting scheme); T is a vector containing M elements, namely

$$T = [T_1, T_2, \ldots, T_i, \ldots, T_M]$$

If $T_i$ equals 0, the $i$-th mobile device is not allowed to migrate its computing task to the edge server; if $T_i$ is greater than 0, the $i$-th mobile device is allowed to migrate its computing task, and the initial congestion window of the TCP connection used to transmit the task input data is set to $T_i$.

Step 5: Update the experience replay buffer;

The specific operations are as follows:

The experience replay buffer is used to store multiple experience entries. Each experience entry can be represented as a triple <S,T,E>, where S is the system state, T is the inference result, and E is the time at which the entry was added to the experience replay buffer. S and T are defined as in Step 4.

The maximum length of the experience replay buffer is L. L may be any positive integer, preferably a positive integer between 1000 and 5000.

Step 6: Determine whether the Seq2Seq model parameters need to be updated. Compute the remainder of the number of calls t divided by d (d may be any positive integer); if the remainder is 0, go to Step 7, otherwise the method ends;

Step 7: Update the Seq2Seq model parameters;

The specific operations are as follows:

(1) Determine whether the number of experience entries in the experience replay buffer is greater than or equal to N; if so, go to sub-step (2), otherwise the method ends;

N may be any positive integer less than or equal to the maximum length L of the experience replay buffer, preferably a positive integer between 64 and 256.

(2) Randomly select N experience entries from the experience replay buffer;

(3) Extract the system state and the inference result from each of these N experience entries to form N pairs <S,T>, where S is the system state and T is the inference result; S and T are defined as in Step 4.

(4) Use these N pairs as the training dataset and train the Seq2Seq model with stochastic gradient descent, Adam, or another deep neural network training algorithm to update the Seq2Seq model parameters.

The training process uses the following loss function:

where $N$ is the size of the training dataset, $T_{n,i}$ is the $i$-th component of the $n$-th inference result, and the loss is written in terms of a conditional probability operator; $c_n$ is the context vector, which is the average of the hidden states at all time steps;

$h_{n,i}$ is the hidden state at time step $i$, which can be further expressed as

$$h_{n,i} = f(S_{n,i}, h_{n,i-1})$$

where $f$ is a recurrent neural network unit (any one of simple RNN, LSTM, GRU or another recurrent unit may be chosen), $S_{n,i}$ is the $i$-th component of the $n$-th system state vector, and $h_{n,0}$ is the hidden state at time step 0 (after training starts, all hidden states are initialized with random numbers between 0 and 1).
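The recurrence for the hidden state and the averaged context vector can be illustrated with a small pure-Python encoder. This is a sketch under several assumptions: f is taken to be a simple tanh RNN unit (the text also allows LSTM or GRU), the weights `w_x` and `w_h` are illustrative, and the average is taken over time steps 1 to the sequence length, excluding the initial state. The names `rnn_cell` and `encode` are hypothetical.

```python
import math
import random


def rnn_cell(x, h_prev, w_x, w_h):
    """Simple RNN unit: h[j] = tanh(w_x[j] * x + sum_k w_h[j][k] * h_prev[k])."""
    k = len(h_prev)
    return [
        math.tanh(w_x[j] * x + sum(w_h[j][m] * h_prev[m] for m in range(k)))
        for j in range(k)
    ]


def encode(sequence, k, seed=0):
    """Run h_i = f(S_i, h_{i-1}) over the input and average the hidden states."""
    if not sequence:
        raise ValueError("input sequence must not be empty")
    rng = random.Random(seed)
    # Parameters and the initial hidden state h_0 are drawn from [0, 1).
    w_x = [rng.random() for _ in range(k)]
    w_h = [[rng.random() for _ in range(k)] for _ in range(k)]
    h = [rng.random() for _ in range(k)]
    hidden = []
    for x in sequence:
        h = rnn_cell(x, h, w_x, w_h)
        hidden.append(h)
    # Context vector: element-wise mean of the hidden states over all time steps.
    context = [sum(hs[j] for hs in hidden) / len(hidden) for j in range(k)]
    return hidden, context
```

In the method described here the input sequence would be the 6M-element system state vector and k the hidden layer length; a decoder would then consume the context vector to emit the M outputs.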

It should be noted that Steps 1 and 3 are executed in a first computing unit, and Steps 2, 4, 5 and 6 are executed in a second computing unit.

The method for joint optimization of task migration and network transmission proposed by the present invention provides a TCP initial congestion window setting scheme. Compared with a physical-layer resource allocation scheme for the communication network, this scheme is easier to implement and can be carried out by the mobile device independently. In addition, the method uses deep reinforcement learning to infer the computing task migration and TCP initial congestion window setting scheme, so decisions that minimize computing task completion time can be made quickly.

The basic principles, main features and advantages of the present invention have been shown and described above. Those skilled in the art should understand that the present invention is not limited by the above embodiments; the embodiments and the description only illustrate the principles of the invention. Various changes and improvements may be made without departing from the spirit and scope of the present invention, and all such changes and improvements fall within the scope of the claimed invention.

Claims

1. A method for joint optimization of task migration and network transmission, characterized in that it comprises the following steps:

Step 1: Determine whether the number of times the method has been called is 0; if the number of calls t is 0, go to Step 2, otherwise go to Step 3;

Step 2: Perform an initialization operation: initialize a basic Seq2Seq model with random numbers between 0 and 1, wherein the input sequence length of the model is 6M, M is the number of mobile devices in the scenario, the hidden layer length is k, and the output sequence length is M;

wherein k may be any positive integer, each element of the output sequence may be any integer, and M is the number of mobile devices;

Step 3: Infer a computing task migration and TCP initial congestion window setting scheme;

The specific operations of the scheme are as follows:

3.1: Combine the state vectors of all mobile devices into a system state vector S, that is

$$S = [S_1, S_2, \ldots, S_i, \ldots, S_M],$$

where $M$ is the number of mobile devices and $S_i$ is the state vector of the $i$-th mobile device; $S_i$ contains 6 mobile device states, namely

$$S_i = [IAT_i, IDT_i, IDS_i, LET_i, RTT_i, TP_i],$$

where $IAT_i$ is the time interval between the arrivals of the previous two computing task migration requests on the $i$-th mobile device, $IDT_i$ is the time interval between the completions of the previous two computing tasks on the $i$-th mobile device, $IDS_i$ is the input data size of the computing task that the $i$-th mobile device requests to migrate, $LET_i$ is the local running time of the computing task that the $i$-th mobile device requests to migrate, $RTT_i$ is the round-trip delay between the $i$-th mobile device and the edge server, and $TP_i$ is the throughput of the $i$-th mobile device;

3.2: Input the system state vector S into the Seq2Seq model and obtain the inference result T, that is, the computing task migration and TCP initial congestion window setting scheme; T is a vector containing M elements, namely

$$T = [T_1, T_2, \ldots, T_i, \ldots, T_M]$$

The meaning of each element is as follows: if $T_i$ equals 0, the $i$-th mobile device is not allowed to migrate its computing task to the edge server; if $T_i$ is greater than 0, the $i$-th mobile device is allowed to migrate its computing task to the edge server, and the initial congestion window of the TCP connection used to transmit the task input data is set to $T_i$;

Step 4: Update the experience replay buffer;

The experience replay buffer is used to store multiple experience entries; its maximum length is L, where L may be any positive integer; each experience entry can be represented as a triple <S,T,E>, where S is the system state, T is the inference result, and E is the time at which the entry was added to the experience replay buffer;

Step 5: Determine whether the Seq2Seq model parameters need to be updated: compute the remainder of the number of calls t divided by d, where d may be any positive integer; if the remainder is 0, go to Step 6, otherwise the method ends;

Step 6: Update the Seq2Seq model parameters.

2. The method for joint optimization of task migration and network transmission according to claim 1, wherein the initial value of the number of calls t in Step 1 is 0, and each time the method is called, the value of t increases by 1, that is, it becomes t+1.

3. The method for joint optimization of task migration and network transmission according to claim 1, wherein the value of k in Step 2 is a positive integer between 64 and 512.

4. The method for joint optimization of task migration and network transmission according to claim 1, characterized in that the specific operations of Step 4 are as follows:

(1) Determine whether the experience replay buffer has free space; if it does not, find the experience entry that was added to the experience replay buffer earliest and delete it from the buffer;

(2) Add the system state S used in the current call, the inference result T and the current time to the experience replay buffer as one experience entry.

5. The method for joint optimization of task migration and network transmission according to claim 1, characterized in that the specific operations of Step 6 are as follows:

(1) Determine whether the number of experience entries in the experience replay buffer is greater than or equal to N, where N may be any positive integer less than or equal to the maximum length L of the experience replay buffer; if so, go to sub-step (2), otherwise the method ends;

(2) Randomly select N experience entries from the experience replay buffer;

(3) Extract the system state and the inference result from each of these N experience entries to form N pairs <S,T>, where S is the system state and T is the inference result;

(4) Use these N pairs as the training dataset to train the Seq2Seq model and thereby update the Seq2Seq model parameters.

6. A computer-readable storage medium storing instructions, wherein when the instructions are executed by a processor, the method for joint optimization of task migration and network transmission according to any one of claims 1 to 5 is implemented.

7. A device for joint optimization of task migration and network transmission, comprising a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the method for joint optimization of task migration and network transmission according to any one of claims 1 to 5.