
CN114021697A - Reinforcement learning-based neural network generation method and system for end-cloud framework - Google Patents

Reinforcement learning-based neural network generation method and system for end-cloud framework

Info

Publication number
CN114021697A
Authority
CN
China
Prior art keywords
neural network
privacy
cloud
mobile phone
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111273767.8A
Other languages
Chinese (zh)
Inventor
张爽
向立瑶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiao Tong University
Original Assignee
Shanghai Jiao Tong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiao Tong University filed Critical Shanghai Jiao Tong University
Priority to CN202111273767.8A priority Critical patent/CN114021697A/en
Publication of CN114021697A publication Critical patent/CN114021697A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The present invention provides a reinforcement learning-based method and system for generating an end-cloud framework neural network, comprising: Step 1: represent the initial neural network structure as a vector and use it as the state space for reinforcement learning; Step 2: input the vector into a bidirectional long short-term memory (LSTM) network to obtain a cutting action and a compression action; Step 3: update the structure of the initial neural network according to the cutting action and the compression action, train the new neural network, and compute its accuracy, privacy, and resource consumption; Step 4: compute a reward from the accuracy, privacy, and resource consumption and update the LSTM until convergence; Step 5: deploy the converged neural network to the mobile phone and the cloud for users to perform neural network inference. The present invention can automatically generate a neural network structure customized for the end-cloud framework while achieving high task accuracy, high privacy of the intermediate-layer features, and low energy consumption of the mobile-side neural network.

Figure 202111273767

Description

End-cloud framework neural network generation method and system based on reinforcement learning
Technical Field
The invention relates to the technical field of neural networks, and in particular to a reinforcement-learning-based method and system for generating an end-cloud framework neural network.
Background
Deep Neural Networks (DNNs) have demonstrated powerful capabilities in many areas such as computer vision, speech recognition, and natural language processing. Today, there are attempts to run neural networks on mobile devices. However, the success of DNNs depends heavily on complex model structures with high computational requirements, whereas mobile devices are characterized by limited computing power and are therefore ill-suited to energy-intensive DNN workloads. There are two possible solutions. The first is to reduce the complexity and runtime of the DNN using compression techniques and then deploy the compressed DNN directly on the mobile device. However, each compression technique is typically applicable only to specific neural network layers and is selected only for high inference accuracy, without considering the diverse resource constraints of different platforms. The second is to upload the raw data (images, video, etc.) to a cloud server and perform DNN inference in the cloud. However, collecting users' private data violates user privacy.
From the perspective of a mobile device's resource limitations and user privacy protection, the mainstream approach is to cut the neural network into two parts: one part runs on the mobile device and processes the private data to obtain intermediate-layer features, which are then uploaded to the cloud for the subsequent operations. This is called the end-cloud framework. However, prior work has demonstrated that intermediate representations can still leak sensitive information. For example, an attacker can infer private attributes of the user's private data and, worse still, can even recover the user's private data by training a neural network to invert the intermediate-layer features.
To address privacy leakage from the intermediate-layer features, there are three main solutions: differential privacy, homomorphic encryption, and adversarial training. Differential privacy mechanisms guarantee privacy by adding noise to model parameters, intermediate-layer features, model predictions, or the objective function. Although differential privacy has strong theoretical privacy guarantees, its effectiveness against practical attacks is unclear, and it brings serious accuracy loss. Homomorphic encryption protects users' private data through encryption algorithms; however, it is poorly suited to non-linear operations, and current work either approximates the Sigmoid activation function with a Taylor expansion or divides each neuron into linear and non-linear parts implemented separately on non-colluding parties. For DNNs with many non-linear computations, homomorphic encryption has excessive computational complexity and resource consumption, and is therefore unsuitable for resource-constrained scenarios such as the end-cloud framework. Adversarial training can generally be formulated as a game between the deep neural network and an attacking neural network; these studies usually simulate the attacker by solving a min-max problem to strike a balance between the privacy of the intermediate-layer features and task accuracy.
Patent document CN111445005A (application number: CN202010115498.1) discloses a reinforcement-learning-based neural network control method and reinforcement learning system. In that invention, an action network determines the state control quantity according to the order and delay of the controlled object or its mechanism model, and the controlled object receives the action value output by the action network for that state control quantity; an evaluation network compares the current control effect against a preset target based on the output action value, random disturbances and model changes are added during exploration of the controlled object or its mechanism model, and the action network and the evaluation network are updated simultaneously to obtain a control law.
In addition to the above drawbacks, most existing privacy protection methods consider only two metrics, task accuracy and privacy, and therefore cannot be directly applied to an end-cloud framework with limited mobile-side resources.
Disclosure of Invention
In view of the defects in the prior art, the present invention aims to provide a reinforcement-learning-based end-cloud framework neural network generation method and system.
The method for generating the end cloud framework neural network based on reinforcement learning provided by the invention comprises the following steps:
step 1: using a vector to represent the initial neural network structure, which serves as the state space for reinforcement learning;
step 2: inputting the vector into a bidirectional long short-term memory network (LSTM) to obtain a cutting action and a compression action;
step 3: updating the structure of the initial neural network according to the cutting action and the compression action, training the new neural network and calculating its precision, privacy and resource consumption;
step 4: calculating a reward according to the precision, privacy and resource consumption and updating the LSTM until convergence;
step 5: deploying the neural network obtained after convergence to the mobile phone and the cloud for users to perform neural network inference.
Preferably, the step 1 comprises:
the initial neural network structure is represented using vectors, each layer of the neural network is a five-dimensional vector < l, k, s, p, n >, where l represents the layer's class, k represents the size of the convolution kernel, s represents the size of the stride in the convolutional layer, p represents the size of the padding, and n represents the output dimension of the layer.
Preferably, the step 2 comprises:
step 2.1: inputting the vector into a first bidirectional LSTM neural network to obtain the cutting action, i.e. the layer at which the neural network is cut;
step 2.2: inputting the vector into a second bidirectional LSTM neural network to obtain the compression action, i.e. the compression method to be adopted for each layer.
Preferably, the step 3 comprises:
step 3.1: dividing the neural network into two parts according to the cutting action, wherein the first half runs on the mobile phone, processes the raw data to obtain the intermediate-layer features and uploads them to the cloud, and the second half runs in the cloud and receives the intermediate-layer features for the subsequent operations;
step 3.2: compressing the part of the neural network that, according to the cutting action, runs on the mobile phone, to obtain a new neural network structure;
step 3.3: training the newly generated neural network structure and calculating its accuracy on the original task;
step 3.4: measuring the privacy of the intermediate-layer features by using a neural network as the attacker, whose input is the intermediate-layer features and whose labels are the original data or the privacy attributes in the data; for attribute inference attacks, privacy is measured by the accuracy of inferring the privacy attribute, and higher accuracy means lower privacy; for data reconstruction attacks, privacy is measured by the similarity between the reconstructed data and the original data, and higher similarity means lower privacy;
step 3.5: calculating the energy consumption of the neural network part running on the mobile phone, including the parameter count of the mobile-phone-side neural network, the multiply-accumulate operations (MACs) and the time delay, wherein the time delay comprises the time consumed running on the mobile phone, the time required to upload the intermediate-layer features to the cloud, and the time consumed running in the cloud.
Preferably, the step 4 comprises:
step 4.1: calculating a reward R according to the precision A, the privacy P and the resource consumption S, wherein the formula is as follows:
Figure BDA0003328700820000031
wherein A_base is the accuracy of the initial neural network;
for attribute inference attacks, P is the accuracy of inferring the privacy attribute; for data reconstruction attacks, P is the similarity between the reconstructed data and the original data;
for the parameter count, the parameter count of the initial model is S_base and the parameter count of the new neural network, generated according to the cutting action and the compression action and running on the mobile phone, is S_1; then
Figure BDA0003328700820000032
for the multiply-accumulate operations (MACs), the MAC count of the initial model is M_base and the MAC count of the neural network running on the mobile phone is M_1; then
Figure BDA0003328700820000033
for the time delay, the time for which the initial model runs entirely on the mobile phone is T_base, the time consumed by the new neural network (generated according to the cutting action and the compression action) running on the mobile phone is T_e, the time required to upload the intermediate-layer features to the cloud is T_t, and the time consumed running in the cloud is T_c; then
Figure BDA0003328700820000034
Step 4.2: the LSTM parameters are updated using a policy gradient algorithm until LSTM converges.
The invention provides a reinforcement learning-based end cloud framework neural network generation system, which comprises:
module M1: using the vector to represent an initial neural network structure and serving as a state space for reinforcement learning;
module M2: inputting the vector into a bidirectional long-short term memory network (LSTM) to obtain a cutting action and a compression action;
module M3: updating the structure of the initial neural network according to the cutting action and the compression action, training a new neural network and calculating the precision, privacy and resource consumption of the new neural network;
module M4: calculating rewards according to precision, privacy and resource consumption and updating the LSTM until convergence;
module M5: and deploying the neural network obtained after convergence to a mobile phone end and a cloud end for a user to carry out neural network reasoning.
Preferably, the module M1 includes:
the initial neural network structure is represented using vectors, each layer of the neural network is a five-dimensional vector < l, k, s, p, n >, where l represents the layer's class, k represents the size of the convolution kernel, s represents the size of the stride in the convolutional layer, p represents the size of the padding, and n represents the output dimension of the layer.
Preferably, the module M2 includes:
module M2.1: inputting the vector into a first bidirectional LSTM neural network to obtain cutting actions, including the number of layers for cutting the neural network;
module M2.2: the vectors are input into a second bi-directional LSTM neural network, and compression actions are obtained, including the compression method to be taken for each layer.
Preferably, the module M3 includes:
module M3.1: dividing the neural network into two parts according to the cutting action, wherein the first half runs on the mobile phone, processes the raw data to obtain the intermediate-layer features and uploads them to the cloud, and the second half runs in the cloud and receives the intermediate-layer features for the subsequent operations;
module M3.2: compressing the part of the neural network that, according to the cutting action, runs on the mobile phone, to obtain a new neural network structure;
module M3.3: training the newly generated neural network structure and calculating its accuracy on the original task;
module M3.4: measuring the privacy of the intermediate-layer features by using a neural network as the attacker, whose input is the intermediate-layer features and whose labels are the original data or the privacy attributes in the data; for attribute inference attacks, privacy is measured by the accuracy of inferring the privacy attribute, and higher accuracy means lower privacy; for data reconstruction attacks, privacy is measured by the similarity between the reconstructed data and the original data, and higher similarity means lower privacy;
module M3.5: calculating the energy consumption of the neural network part running on the mobile phone, including the parameter count of the mobile-phone-side neural network, the multiply-accumulate operations (MACs) and the time delay, wherein the time delay comprises the time consumed running on the mobile phone, the time required to upload the intermediate-layer features to the cloud, and the time consumed running in the cloud.
Preferably, the module M4 includes:
module M4.1: calculating a reward R according to the precision A, the privacy P and the resource consumption S, wherein the formula is as follows:
Figure BDA0003328700820000051
wherein A_base is the accuracy of the initial neural network;
for attribute inference attacks, P is the accuracy of inferring the privacy attribute; for data reconstruction attacks, P is the similarity between the reconstructed data and the original data;
for the parameter count, the parameter count of the initial model is S_base and the parameter count of the new neural network, generated according to the cutting action and the compression action and running on the mobile phone, is S_1; then
Figure BDA0003328700820000052
for the multiply-accumulate operations (MACs), the MAC count of the initial model is M_base and the MAC count of the neural network running on the mobile phone is M_1; then
Figure BDA0003328700820000053
for the time delay, the time for which the initial model runs entirely on the mobile phone is T_base, the time consumed by the new neural network (generated according to the cutting action and the compression action) running on the mobile phone is T_e, the time required to upload the intermediate-layer features to the cloud is T_t, and the time consumed running in the cloud is T_c; then
Figure BDA0003328700820000054
Module M4.2: the LSTM parameters are updated using a policy gradient algorithm until LSTM converges.
Compared with the prior art, the invention has the following beneficial effects:
(1) the neural network structure generated by the method can simultaneously satisfy high task accuracy, high privacy of the intermediate-layer features and low consumption of mobile-side resources;
(2) the algorithm of the invention only needs the original neural network structure as input to obtain the optimal neural network structure for deployment under the end-cloud framework, without manually designing the network structure;
(3) the algorithm of the invention has good transferability: it can be transferred between different datasets and different initial neural networks without training from scratch, and therefore consumes less energy.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:
FIG. 1 is a schematic flow chart of a neural network deep learning method in an end cloud framework according to the present invention;
FIG. 2 is a schematic diagram of a network structure model of a neural network VGG11 in an end cloud framework according to the present invention;
FIG. 3 is a schematic diagram of the network structure of the cutting LSTM in the end-cloud framework according to the present invention;
FIG. 4 is a schematic diagram of the network structure of the compression LSTM in the end-cloud framework according to the present invention;
fig. 5 is a schematic structural diagram of a neural network end cloud framework in the end cloud framework of the present invention.
Detailed Description
The present invention will be described in detail below with reference to specific embodiments. The following embodiments will help those skilled in the art to further understand the invention, but do not limit the invention in any way. It should be noted that several variations and improvements can be made by those of ordinary skill in the art without departing from the concept of the invention, all of which fall within the scope of the present invention.
Embodiment:
taking a neural network convolutional layer VGG11 as an example, the invention provides a terminal cloud framework neural network generation method based on reinforcement learning, which relates to the steps of acquiring state representation in reinforcement learning according to an original neural network structure, acquiring actions in a corresponding state, calculating rewards according to the actions, and then updating a controller in reinforcement learning until convergence;
specifically, as shown in fig. 1, the method comprises the following steps:
step S1: using the vector to represent an initial neural network structure as a state space for reinforcement learning;
step S2: inputting the vector into a bidirectional LSTM to obtain a cutting action and a compression action;
step S3: changing the structure of the initial neural network according to the cutting action and the compression action, and training a new neural network to calculate the precision, the privacy and the resource consumption;
step S4: calculating rewards according to precision, privacy and resource consumption and updating the LSTM until convergence;
step S5: and deploying the neural network obtained after convergence to a mobile phone end and a cloud end for a user to execute a neural network reasoning stage.
The step S1 includes:
step S101: the initial neural network structure is represented using vectors, each layer of the neural network being a five-dimensional vector < l, k, s, p, n >. Where l represents the layer class, k represents the size of the convolution kernel, s represents the size of stride in the convolutional layer, p represents the size of padding, and n represents the output dimension of the layer. For the VGG11 structure shown in fig. 2, the active layer and the pooling layer are removed, and the types of the different layers are respectively set as follows: [ convolutional layer: 1, a pooling layer: 2, full connection layer: 8], then the vector corresponding to VGG11 is represented as:
[[1,3,1,64,1],
[2,2,2,0,0],
[1,3,1,128,1],
[2,2,2,0,0],
[1,3,1,256,1],
[1,3,1,256,1],
[2,2,2,0,0],
[1,3,1,512,1],
[1,3,1,512,1],
[2,2,2,0,0],
[1,3,1,512,1],
[1,3,1,512,1],
[2,2,2,0,0],
[8,1,1,0,0]]
the step S2 includes:
step S201: inputting the vector into a first bidirectional LSTM neural network to obtain a cutting action, namely cutting the neural network at which layer;
step S202: inputting the vector into a second bidirectional LSTM neural network to obtain a compression action, namely a compression method to be adopted by each layer;
the cut LSTM configuration is shown in fig. 3 and the compressed LSTM configuration is shown in fig. 4. HiIs a hidden state corresponding to the ith layer, apRepresents a cutting layer, ac;iRepresents the compression algorithm corresponding to the ith layer, and the compression algorithm has 6 possible choices including MobileNet, MobileNet V2, SqueezeNet,Prung, FilterPrung and uncompressed.
The step S3 includes:
step S301: the neural network is divided into two parts according to the cutting action. The front half part runs at a mobile phone end, original data are processed to obtain middle layer characteristics, and then the middle layer characteristics are uploaded to a cloud. The latter half runs in the cloud, and the characteristics of the middle layer are obtained for subsequent operation. As shown in fig. 5.
Step S302: According to the cutting action, the compression algorithm corresponding to the compression action is executed on the model running on the mobile phone to obtain a new neural network structure.
Step S303: The newly generated neural network is trained, and its accuracy on the original task is calculated.
Step S304: A neural network is designed as the attacker to measure the privacy of the intermediate-layer features; its input is the intermediate-layer features and its labels are the original data or the privacy attributes in the data. For attribute inference attacks, privacy is measured by the accuracy of inferring the privacy attribute: the higher the accuracy, the lower the privacy. For data reconstruction attacks, privacy is measured by the similarity between the reconstructed data and the original data: the more similar the reconstruction, the lower the privacy.
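For the attribute inference attack, one minimal sketch of this measurement is given below. The attacker architecture, optimizer, and training budget are illustrative assumptions; in practice the attacker would be evaluated on held-out features, and for the data reconstruction attack P would instead be a similarity score (e.g. SSIM or negative MSE) between the reconstruction and the original data.

import torch
import torch.nn as nn

def attribute_inference_privacy(features, private_labels, num_attrs, epochs=50):
    """Train a small attacker classifier on intermediate-layer features to predict
    the private attribute and return its accuracy P; higher attack accuracy means
    lower privacy of the features."""
    feats = features.detach().flatten(1)                        # (N, D)
    attacker = nn.Sequential(nn.Linear(feats.shape[1], 256), nn.ReLU(),
                             nn.Linear(256, num_attrs))
    opt = torch.optim.Adam(attacker.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(attacker(feats), private_labels).backward()
        opt.step()
    with torch.no_grad():
        preds = attacker(feats).argmax(dim=1)
    return (preds == private_labels).float().mean().item()      # this is P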
Step S305: The energy consumption of the model running on the mobile phone is calculated. It has three choices: the parameter count of the mobile-side neural network, the multiply-accumulate operation count (MAC), and the time delay, where the time delay comprises the time consumed running on the mobile phone, the time required to upload the intermediate-layer features to the cloud, and the time consumed running in the cloud.
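A minimal sketch of how the mobile-side resource signals could be collected (the helper name is illustrative; wall-clock timing on a workstation only stands in for on-device measurement, and MACs would typically come from a separate profiling tool):

import time
import torch

def resource_profile(device_part, sample_input, upload_time_s, cloud_time_s):
    """Return the phone-side parameter count (a candidate S_1) and the end-to-end
    latency T_e + T_t + T_c for one inference through the split network."""
    n_params = sum(p.numel() for p in device_part.parameters())

    start = time.perf_counter()
    with torch.no_grad():
        features = device_part(sample_input)           # phone-side run -> T_e
    t_e = time.perf_counter() - start

    latency = t_e + upload_time_s + cloud_time_s       # T_e + T_t + T_c
    return n_params, latency, features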
The step S4 includes:
step S401: calculating a reward R according to the precision A, the privacy P and the resource consumption S, wherein the formula is as follows:
Figure BDA0003328700820000081
wherein A_base is the accuracy of the initial neural network. For attribute inference attacks, P is the accuracy of inferring the privacy attribute; for data reconstruction attacks, P is the similarity between the reconstructed data and the original data. S has three choices. For the parameter count, assume the parameter count of the initial model is S_base and the parameter count of the new neural network, generated according to the cutting action and the compression action and running on the mobile phone, is S_1; then
Figure BDA0003328700820000082
For the multiply-accumulate count (MAC), assume the MAC of the initial model is M_base and the MAC of the neural network running on the mobile phone is M_1; then
Figure BDA0003328700820000083
For the time delay, assume the time for which the initial model runs entirely on the mobile phone is T_base, the time consumed by the new neural network (generated according to the cutting action and the compression action) running on the mobile phone is T_e, the time required to upload the intermediate-layer features to the cloud is T_t, and the time consumed running in the cloud is T_c; then
Figure BDA0003328700820000084
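The reward equation itself appears only as an image in the original filing; the sketch below therefore assumes one plausible form, in which the accuracy gap relative to A_base is penalised by the attack accuracy or similarity P and by the resource ratio S (S_1/S_base, M_1/M_base, or (T_e + T_t + T_c)/T_base, following the variable definitions above). The weighted combination and its coefficients are assumptions, not the patent's actual formula.

def resource_ratio(new_value, base_value):
    """Resource term S as a ratio of the phone-side model to the initial model:
    S_1/S_base for parameters, M_1/M_base for MACs,
    or (T_e + T_t + T_c)/T_base for latency."""
    return new_value / base_value

def reward(acc, acc_base, p_attack, s_ratio, w_privacy=1.0, w_resource=1.0):
    """One plausible reward combining accuracy A, privacy P and resource S:
    reward accuracy above the baseline, penalise high attack accuracy/similarity
    and high resource usage. The exact form in the patent is defined by its
    equation images and may differ from this sketch."""
    return (acc - acc_base) - w_privacy * p_attack - w_resource * s_ratio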
Step S402: The LSTM parameters are updated using the policy gradient algorithm, and steps S1 to S401 are repeated until the LSTM converges.
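A minimal REINFORCE-style sketch of that update (the optimizer, the optional baseline, and the function name are assumptions; the log-probabilities come from the categorical distributions produced by the two bidirectional LSTM controllers):

import torch

def policy_gradient_step(optimizer, log_probs, reward_value, baseline=0.0):
    """One REINFORCE update: increase the log-probability of the sampled cutting
    and compression actions in proportion to (reward - baseline).
    log_probs: a scalar tensor, e.g. cut_dist.log_prob(cut_layer).sum()
               + compress_dist.log_prob(compress_choice).sum()."""
    loss = -(reward_value - baseline) * log_probs
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Outer loop corresponding to steps S1-S402:
#   sample (cut_layer, compress_choice) from the controllers,
#   build, train and evaluate the split/compressed network to get A, P, S,
#   compute the reward R and call policy_gradient_step(...),
#   repeat until the LSTM controllers converge.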
The invention provides a reinforcement learning-based end cloud framework neural network generation system, which comprises: module M1: using the vector to represent an initial neural network structure and serving as a state space for reinforcement learning; module M2: inputting the vector into a bidirectional long-short term memory network (LSTM) to obtain a cutting action and a compression action; module M3: updating the structure of the initial neural network according to the cutting action and the compression action, training a new neural network and calculating the precision, privacy and resource consumption of the new neural network; module M4: calculating rewards according to precision, privacy and resource consumption and updating the LSTM until convergence; module M5: and deploying the neural network obtained after convergence to a mobile phone end and a cloud end for a user to carry out neural network reasoning.
The module M1 includes: the initial neural network structure is represented using vectors, each layer of the neural network is a five-dimensional vector < l, k, s, p, n >, where l represents the layer's class, k represents the size of the convolution kernel, s represents the size of the stride in the convolutional layer, p represents the size of the padding, and n represents the output dimension of the layer.
The module M2 includes: module M2.1: inputting the vector into a first bidirectional LSTM neural network to obtain cutting actions, including the number of layers for cutting the neural network; module M2.2: the vectors are input into a second bi-directional LSTM neural network, and compression actions are obtained, including the compression method to be taken for each layer.
The module M3 includes: module M3.1: dividing the neural network into two parts according to the cutting action, wherein the first half part runs at a mobile phone end, original data is processed to obtain the characteristics of the middle layer, the characteristics of the middle layer are uploaded to a cloud end, and the second half part runs at the cloud end to obtain the characteristics of the middle layer for subsequent operation; module M3.2: compressing the neural network part operated at the mobile phone end according to the cutting action to obtain a new neural network structure; module M3.3: training the newly generated neural network structure, and calculating the precision on the original task; module M3.4: the privacy of the middle layer characteristics is measured by using a neural network as an attacker, the input data are the middle layer characteristics, the label is the original data or the privacy attributes in the data, for attribute reasoning attack, the privacy is measured by using the accuracy of privacy attribute reasoning, and the higher the accuracy is, the lower the privacy is represented; for data reconstruction attack, the similarity between reconstructed data and original data is used for measuring privacy, and the more similar the data, the lower the privacy; module M3.5: and calculating the energy consumption of the part running on the neural network of the mobile phone end, wherein the energy consumption comprises the parameters of the neural network of the mobile phone end, the product accumulation operation, the MAC and the time delay, and the time delay comprises the time consumed by running on the mobile phone end, the time required by uploading the characteristics of the middle layer to the cloud end and the time consumed by running on the cloud end.
The module M4 includes: module M4.1: calculating a reward R according to the precision A, the privacy P and the resource consumption S, wherein the formula is as follows:
Figure BDA0003328700820000091
wherein A_base is the accuracy of the initial neural network; for attribute inference attacks, P is the accuracy of inferring the privacy attribute; for data reconstruction attacks, P is the similarity between the reconstructed data and the original data; for the parameter count, the parameter count of the initial model is S_base and the parameter count of the new neural network, generated according to the cutting action and the compression action and running on the mobile phone, is S_1; then
Figure BDA0003328700820000092
For the multiply-accumulate operations (MACs), the MAC count of the initial model is M_base and the MAC count of the neural network running on the mobile phone is M_1; then
Figure BDA0003328700820000093
For the time delay, the time for which the initial model runs entirely on the mobile phone is T_base, the time consumed by the new neural network (generated according to the cutting action and the compression action) running on the mobile phone is T_e, the time required to upload the intermediate-layer features to the cloud is T_t, and the time consumed running in the cloud is T_c; then
Figure BDA0003328700820000094
Module M4.2: the LSTM parameters are updated using a policy gradient algorithm until LSTM converges.
Those skilled in the art will appreciate that, in addition to implementing the systems, apparatus, and various modules thereof provided by the present invention in purely computer readable program code, the same procedures can be implemented entirely by logically programming method steps such that the systems, apparatus, and various modules thereof are provided in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Therefore, the system, the device and the modules thereof provided by the present invention can be considered as a hardware component, and the modules included in the system, the device and the modules thereof for implementing various programs can also be considered as structures in the hardware component; modules for performing various functions may also be considered to be both software programs for performing the methods and structures within hardware components.
The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes or modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention. The embodiments and features of the embodiments of the present application may be combined with each other arbitrarily without conflict.

Claims (10)

1. A method for generating a terminal cloud framework neural network based on reinforcement learning is characterized by comprising the following steps:
step 1: using the vector to represent an initial neural network structure and serving as a state space for reinforcement learning;
step 2: inputting the vector into a bidirectional long-short term memory network (LSTM) to obtain a cutting action and a compression action;
step 3: updating the structure of the initial neural network according to the cutting action and the compression action, training a new neural network and calculating the precision, privacy and resource consumption of the new neural network;
step 4: calculating rewards according to precision, privacy and resource consumption and updating the LSTM until convergence;
step 5: deploying the neural network obtained after convergence to a mobile phone end and a cloud end for a user to carry out neural network reasoning.
2. The reinforcement learning-based end cloud framework neural network generation method according to claim 1, wherein the step 1 comprises:
the initial neural network structure is represented using vectors, each layer of the neural network is a five-dimensional vector < l, k, s, p, n >, where l represents the layer's class, k represents the size of the convolution kernel, s represents the size of the stride in the convolutional layer, p represents the size of the padding, and n represents the output dimension of the layer.
3. The reinforcement learning-based end cloud framework neural network generation method according to claim 1, wherein the step 2 comprises:
step 2.1: inputting the vector into a first bidirectional LSTM neural network to obtain cutting actions, including the number of layers for cutting the neural network;
step 2.2: the vectors are input into a second bi-directional LSTM neural network, and compression actions are obtained, including the compression method to be taken for each layer.
4. The reinforcement learning-based end cloud framework neural network generation method according to claim 1, wherein the step 3 comprises:
step 3.1: dividing the neural network into two parts according to the cutting action, wherein the first half part runs at a mobile phone end, original data is processed to obtain the characteristics of the middle layer, the characteristics of the middle layer are uploaded to a cloud end, and the second half part runs at the cloud end to obtain the characteristics of the middle layer for subsequent operation;
step 3.2: compressing the neural network part operated at the mobile phone end according to the cutting action to obtain a new neural network structure;
step 3.3: training the newly generated neural network structure, and calculating the precision on the original task;
step 3.4: the privacy of the middle layer characteristics is measured by using a neural network as an attacker, the input data are the middle layer characteristics, the label is the original data or the privacy attributes in the data, for attribute reasoning attack, the privacy is measured by using the accuracy of privacy attribute reasoning, and the higher the accuracy is, the lower the privacy is represented; for data reconstruction attack, the similarity between reconstructed data and original data is used for measuring privacy, and the more similar the data, the lower the privacy;
step 3.5: calculating the energy consumption of the neural network part running on the mobile phone, including the parameter count of the mobile-phone-side neural network, the multiply-accumulate operations (MACs) and the time delay, wherein the time delay comprises the time consumed running on the mobile phone, the time required to upload the intermediate-layer features to the cloud, and the time consumed running in the cloud.
5. The reinforcement learning-based end cloud framework neural network generation method according to claim 1, wherein the step 4 comprises:
step 4.1: calculating a reward R according to the precision A, the privacy P and the resource consumption S, wherein the formula is as follows:
Figure FDA0003328700810000021
wherein A_base is the accuracy of the initial neural network;
for attribute inference attacks, P is the accuracy of inferring the privacy attribute; for data reconstruction attacks, P is the similarity between the reconstructed data and the original data;
for the parameter count, the parameter count of the initial model is S_base and the parameter count of the new neural network, generated according to the cutting action and the compression action and running on the mobile phone, is S_1; then
Figure FDA0003328700810000022
for the multiply-accumulate operations (MACs), the MAC count of the initial model is M_base and the MAC count of the neural network running on the mobile phone is M_1; then
Figure FDA0003328700810000023
for the time delay, the time for which the initial model runs entirely on the mobile phone is T_base, the time consumed by the new neural network (generated according to the cutting action and the compression action) running on the mobile phone is T_e, the time required to upload the intermediate-layer features to the cloud is T_t, and the time consumed running in the cloud is T_c; then
Figure FDA0003328700810000024
Step 4.2: the LSTM parameters are updated using a policy gradient algorithm until LSTM converges.
6. A reinforcement learning-based end cloud framework neural network generation system is characterized by comprising:
module M1: using the vector to represent an initial neural network structure and serving as a state space for reinforcement learning;
module M2: inputting the vector into a bidirectional long-short term memory network (LSTM) to obtain a cutting action and a compression action;
module M3: updating the structure of the initial neural network according to the cutting action and the compression action, training a new neural network and calculating the precision, privacy and resource consumption of the new neural network;
module M4: calculating rewards according to precision, privacy and resource consumption and updating the LSTM until convergence;
module M5: and deploying the neural network obtained after convergence to a mobile phone end and a cloud end for a user to carry out neural network reasoning.
7. The reinforcement learning-based end cloud framework neural network generation system of claim 6, wherein the module M1 comprises:
the initial neural network structure is represented using vectors, each layer of the neural network is a five-dimensional vector < l, k, s, p, n >, where l represents the layer's class, k represents the size of the convolution kernel, s represents the size of the stride in the convolutional layer, p represents the size of the padding, and n represents the output dimension of the layer.
8. The reinforcement learning-based end cloud framework neural network generation system of claim 6, wherein the module M2 comprises:
module M2.1: inputting the vector into a first bidirectional LSTM neural network to obtain cutting actions, including the number of layers for cutting the neural network;
module M2.2: the vectors are input into a second bi-directional LSTM neural network, and compression actions are obtained, including the compression method to be taken for each layer.
9. The reinforcement learning-based end cloud framework neural network generation system of claim 6, wherein the module M3 comprises:
module M3.1: dividing the neural network into two parts according to the cutting action, wherein the first half part runs at a mobile phone end, original data is processed to obtain the characteristics of the middle layer, the characteristics of the middle layer are uploaded to a cloud end, and the second half part runs at the cloud end to obtain the characteristics of the middle layer for subsequent operation;
module M3.2: compressing the neural network part operated at the mobile phone end according to the cutting action to obtain a new neural network structure;
module M3.3: training the newly generated neural network structure, and calculating the precision on the original task;
module M3.4: the privacy of the middle layer characteristics is measured by using a neural network as an attacker, the input data are the middle layer characteristics, the label is the original data or the privacy attributes in the data, for attribute reasoning attack, the privacy is measured by using the accuracy of privacy attribute reasoning, and the higher the accuracy is, the lower the privacy is represented; for data reconstruction attack, the similarity between reconstructed data and original data is used for measuring privacy, and the more similar the data, the lower the privacy;
module M3.5: calculating the energy consumption of the neural network part running on the mobile phone, including the parameter count of the mobile-phone-side neural network, the multiply-accumulate operations (MACs) and the time delay, wherein the time delay comprises the time consumed running on the mobile phone, the time required to upload the intermediate-layer features to the cloud, and the time consumed running in the cloud.
10. The reinforcement learning-based end cloud framework neural network generation system of claim 6, wherein the module M4 comprises:
module M4.1: calculating a reward R according to the precision A, the privacy P and the resource consumption S, wherein the formula is as follows:
Figure FDA0003328700810000031
wherein A_base is the accuracy of the initial neural network;
for attribute inference attacks, P is the accuracy of inferring the privacy attribute; for data reconstruction attacks, P is the similarity between the reconstructed data and the original data;
for the parameter count, the parameter count of the initial model is S_base and the parameter count of the new neural network, generated according to the cutting action and the compression action and running on the mobile phone, is S_1; then
Figure FDA0003328700810000041
for the multiply-accumulate operations (MACs), the MAC count of the initial model is M_base and the MAC count of the neural network running on the mobile phone is M_1; then
Figure FDA0003328700810000042
for the time delay, the time for which the initial model runs entirely on the mobile phone is T_base, the time consumed by the new neural network (generated according to the cutting action and the compression action) running on the mobile phone is T_e, the time required to upload the intermediate-layer features to the cloud is T_t, and the time consumed running in the cloud is T_c; then
Figure FDA0003328700810000043
Module M4.2: the LSTM parameters are updated using a policy gradient algorithm until LSTM converges.
CN202111273767.8A 2021-10-29 2021-10-29 Reinforcement learning-based neural network generation method and system for end-cloud framework Pending CN114021697A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111273767.8A CN114021697A (en) 2021-10-29 2021-10-29 Reinforcement learning-based neural network generation method and system for end-cloud framework

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111273767.8A CN114021697A (en) 2021-10-29 2021-10-29 Reinforcement learning-based neural network generation method and system for end-cloud framework

Publications (1)

Publication Number Publication Date
CN114021697A true CN114021697A (en) 2022-02-08

Family

ID=80059031

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111273767.8A Pending CN114021697A (en) 2021-10-29 2021-10-29 Reinforcement learning-based neural network generation method and system for end-cloud framework

Country Status (1)

Country Link
CN (1) CN114021697A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114757350A (en) * 2022-04-22 2022-07-15 合肥综合性国家科学中心人工智能研究院(安徽省人工智能实验室) Convolutional network channel cutting method and system based on reinforcement learning
KR20240072606A (en) 2022-11-17 2024-05-24 숭실대학교산학협력단 Bi-LSTM-based multivariate autoscaling method and apparatus


Similar Documents

Publication Publication Date Title
CN110880036B (en) Neural network compression method, device, computer equipment and storage medium
JP7017640B2 (en) Learning data expansion measures
WO2019155064A1 (en) Data compression using jointly trained encoder, decoder, and prior neural networks
CN113408743A (en) Federal model generation method and device, electronic equipment and storage medium
US20210312261A1 (en) Neural network search method and related apparatus
CN110622178A (en) Learning neural network structure
CN112101172A (en) Weight grafting-based model fusion face recognition method and related equipment
CN106709565A (en) Neural network optimization method and device
CN112561028B (en) Method for training neural network model, method and device for data processing
CN111178520A (en) Data processing method and device of low-computing-capacity processing equipment
CN113632094A (en) Memory-directed video object detection
WO2022246986A1 (en) Data processing method, apparatus and device, and computer-readable storage medium
CN113505883A (en) Neural network training method and device
WO2019006541A1 (en) System and method for automatic building of learning machines using learning machines
CN114021697A (en) Reinforcement learning-based neural network generation method and system for end-cloud framework
CN117351299B (en) Image generation and model training method, device, equipment and storage medium
KR20220097329A (en) Method and algorithm of deep learning network quantization for variable precision
CN113965313B (en) Model training method, device, equipment and storage medium based on homomorphic encryption
CN114072809A (en) Small and fast video processing network via neural architectural search
CN111652349A (en) Neural network processing method and related equipment
KR20190127261A (en) Method and Apparatus for Learning with Class Score of Two Stream Network for Behavior Recognition
CN115577787A (en) Quantum amplitude estimation method, device, equipment and storage medium
CN116383639A (en) Knowledge distillation method, device, equipment and storage medium for generating countermeasure network
CN117541683A (en) Image generation method, device, equipment and computer readable storage medium
CN113065997A (en) Image processing method, neural network training method and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination