WO2022104799A1 - Training method, training apparatus, and storage medium - Google Patents
- Publication number: WO2022104799A1 (application PCT/CN2020/130896)
- Authority: WO (WIPO, PCT)
- Prior art keywords: model, training, compression, node, mode
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
Definitions
- the present disclosure relates to the field of wireless communication technologies, and in particular, to a training method, a training device and a storage medium.
- the communication network has the characteristics of ultra-high speed, ultra-low latency, ultra-high reliability, and ultra-multiple connections.
- artificial intelligence is introduced to improve the resource utilization of communication networks, the terminal service experience, and the automation and intelligent control and management of communication networks, and models obtained through artificial-intelligence deep learning can have better performance.
- however, the high storage space and computing resource consumption of such models makes it difficult to apply them effectively on various hardware platforms; moreover, the communication overhead is large, the precision is limited, and the security is low.
- the present disclosure provides a training method, a training device and a storage medium.
- a training method applied to a first node, the method includes:
- in response to receiving a model training request, a first training model is trained, wherein the model training request includes model compression parameters; based on the first training model and the model compression parameters, a first compression model of the first training model is obtained.
- the model compression parameters include a plurality of model compression options
- the obtaining the first compression model of the first training model based on the first training model and the model compression parameters includes:
- the determining the first loss function according to the output of the first training model, the output of the second compression model, and the sample parameter set used to train the first training model includes:
- a first cross entropy between the output of the second compression model and the sample parameter set is determined, and a first relative entropy divergence between the output of the second compression model and the output of the first training model is determined; the first loss function is determined using the first cross entropy and the first relative entropy divergence.
- the method further includes:
- a second loss function for updating parameters of the first training model is determined according to the output of the first training model, the output of the second compression model, and the sample parameter set used to train the first training model.
- the determining, according to the output of the first training model, the output of the second compression model, and the sample parameter set used for training the first training model, of the second loss function for updating the parameters of the first training model includes:
- a second cross entropy between the output of the first training model and the sample parameter set is determined, and a second relative entropy divergence between the output of the first training model and the output of the second compression model is determined; the second loss function is determined using the second cross entropy and the second relative entropy divergence.
- the model compression parameters include a model training mode, and the model training mode includes a single training node mode for training a single first training model and a multi-training node mode for training a plurality of first training models;
- the number of the first training models is determined based on the model training mode.
- the method further includes:
- a second indication message is sent, where the second indication message includes the number of first compression models corresponding to the model training mode.
- the method further includes:
- a third indication message is received, where the third indication message includes an indication of determining the training model.
- the model training mode includes a multi-training node mode
- the method further includes:
- a fourth indication message is received, where the fourth indication message is used to indicate a third compression model, and the third compression model is a compression model obtained by performing federated averaging on the first compression models based on the number of the first compression models; based on the third compression model, the model compression parameters are re-determined, and the first compression model is updated based on the re-determined model compression parameters.
- the method further includes:
- a fifth indication message is received, where the fifth indication message is used to indicate the end of training the first compression model.
- a training method applied to a second node comprising:
- the model training request includes model compression parameters, and the model compression parameters are used to compress a first training model to obtain a first compression model, and the first training model is obtained by training based on the model training request.
- the model compression parameters include a model training mode, and the model training mode includes a single training node mode for training a single first training model and a multi-training node mode for training a plurality of first training models;
- the number of the first training models is determined based on the model training mode.
- the method further includes:
- a second indication message is received, where the second indication message includes the number of first compression models corresponding to the model training mode.
- the method further includes:
- a third indication message is sent, where the third indication message includes an indication of determining the training model.
- the model training mode includes a multi-training node mode
- the method further includes:
- a fourth indication message is sent; the fourth indication message is used to indicate a third compression model, where the third compression model is a compression model obtained by performing federated averaging on the first compression models based on the number of the first compression models.
- the method further includes:
- a fifth indication message is sent, where the fifth indication message is used to indicate the end of training the first compression model.
- the method further includes:
- a subscription requirement is received, and a model training request is sent based on the subscription requirement.
- a training apparatus applied to a first node comprising:
- a model training and compression module configured to train a first training model in response to receiving a model training request, wherein the model training request includes model compression parameters; and to obtain a first compression model of the first training model based on the first training model and the model compression parameters.
- the model compression parameters include a plurality of model compression options
- the model training and compression module is configured to determine a first model compression option from among the multiple model compression options, and compress the first training model based on the first model compression option to obtain a second compression model; determine a first loss function according to the output of the first training model, the output of the second compression model, and the sample parameter set used to train the first training model; and update the parameters of the second compression model based on the first loss function to obtain the first compression model.
- the apparatus further includes a data processing and storage module
- the data processing and storage module is used to determine the first cross entropy between the output of the second compression model and the sample parameter set, to determine the first relative entropy divergence between the output of the second compression model and the output of the first training model, and to determine the first loss function based on the first cross entropy and the first relative entropy divergence.
- the data processing and storage module is further configured to determine, according to the output of the first training model, the output of the second compression model, and the sample parameter set used for training the first training model, a second loss function for updating the parameters of the first training model.
- the data processing and storage module is further configured to determine a second cross entropy between the output of the first training model and the sample parameter set, and to determine a second relative entropy divergence between the output of the first training model and the output of the second compression model; the second loss function is determined based on the second cross entropy and the second relative entropy divergence.
- the model compression parameters include a model training mode, the model training mode includes a single training node mode for training a single first training model and a multi-training node mode for training a plurality of first training models, and the number of the first training models is determined based on the model training mode.
- the apparatus further includes a first network communication module
- the first network communication module is configured to send a second indication message, where the second indication message includes the number of first compression models corresponding to the model training mode.
- the first network communication module is further configured to receive a third indication message, where the third indication message includes an indication of determining a training model.
- the first network communication module is further configured to receive a fourth indication message; the fourth indication message is used to indicate a third compression model, and the third compression model is a compression model obtained by performing federated averaging on the first compression models based on the number of the first compression models; based on the third compression model, the model compression parameters are re-determined, and the first compression model is updated based on the re-determined model compression parameters.
- the first network communication module is further configured to receive a fifth indication message, where the fifth indication message is used to indicate the end of training the first compression model.
- a training apparatus applied to a second node comprising:
- the second network communication module is used to send a model training request, wherein the model training request includes model compression parameters, the model compression parameters are used to compress a first training model to obtain a first compression model, and the first training model is obtained by training based on the model training request.
- the model compression parameters include a model training mode, and the model training mode includes a single training node mode for training a single first training model and a multi-training node mode for training a plurality of first training models;
- the number of the first training models is determined based on the model training mode.
- the second network communication module is further configured to receive a second indication message, where the second indication message includes the number of first compression models corresponding to the model training mode.
- the second network communication module is further configured to send a third indication message, where the third indication message includes an indication of determining the training model.
- the model training mode includes a multi-training node mode
- the second network communication module is further configured to send a fourth indication message; the fourth indication message is used to indicate the third compression model, and the third compression model is a compression model obtained by performing federated averaging on the first compression models based on the number of the first compression models.
- the second network communication module is further configured to send a fifth indication message, where the fifth indication message is used to indicate the end of training the first compression model.
- the apparatus further includes a service management module
- the service management module is configured to receive subscription requirements and send a model training request based on the subscription requirements.
- a training device comprising:
- a processor configured to execute the training method described in the first aspect or any implementation manner of the first aspect, or to execute the training method described in the second aspect or any implementation manner of the second aspect.
- a non-transitory computer-readable storage medium; when instructions in the storage medium are executed by a processor of a mobile terminal, the mobile terminal is enabled to perform the training method described in the first aspect or any implementation manner of the first aspect, or the training method described in the second aspect or any implementation manner of the second aspect.
- the technical solutions provided by the embodiments of the present disclosure may include the following beneficial effects: in the present disclosure, the trained model is compressed, and the parameters of the compressed model are updated, so that the compressed model can achieve the same effect as the training model, thereby reducing the signaling overhead of transmitting the model, ensuring the accuracy and reliability of the model, and further ensuring the security of user information.
- FIG. 1 is a schematic diagram of a system architecture of a training method provided by the present disclosure.
- Fig. 2 is a flowchart of a training method according to an exemplary embodiment.
- Fig. 3 is a flowchart of another training method according to an exemplary embodiment.
- Fig. 4 is a flowchart of yet another training method according to an exemplary embodiment.
- Fig. 5 is a flowchart of yet another training method according to an exemplary embodiment.
- FIG. 6 is a flowchart of an implementation manner of determining a first compression model in a single training node mode in a training method provided by the present disclosure.
- FIG. 7 is a flowchart of an implementation manner of determining a first compression model in a multi-training node mode in a training method provided by the present disclosure.
- FIG. 8 is a schematic diagram of the protocol and interface of the model training and compression decision part in a training method provided by the present disclosure.
- FIG. 9 is a schematic diagram of the protocol and interface of the model training and compression part in a single training node mode in a training method provided by the present disclosure.
- FIG. 10 is a schematic diagram of a protocol and an interface of a model training and compression part in a multi-training node mode in a training method provided by the present disclosure.
- FIG. 11 is a schematic diagram of a protocol and interface of a wireless data transmission part in a training method provided by the present disclosure.
- Fig. 12 is a block diagram of a training apparatus according to an exemplary embodiment.
- Fig. 13 is a block diagram of another training apparatus according to an exemplary embodiment.
- Fig. 14 is a block diagram of an apparatus for training according to an exemplary embodiment.
- Fig. 15 is a block diagram of another apparatus for training according to an exemplary embodiment.
- the communication network has the characteristics of ultra-high speed, ultra-low latency, ultra-high reliability, and ultra-multiple connections.
- the implementation process of using a deep learning algorithm when training the model includes: the model request node determines the model structure and the model training mode according to the model/analysis subscription requirements, wherein the model training mode includes a single training node mode and a multi-training node mode.
- the model request node sends the model structure and model training mode to the model training node, and the model training node independently conducts model training according to the model training mode or participates in the collaborative model training of multiple training nodes.
- the model training node sends the model to the model request node, and in the multi-training node mode the model request node performs federated averaging of the models sent by the model training nodes to obtain a global model.
- the model request node checks whether the obtained model meets the model/analysis subscription requirements, and if so, the model request node sends the obtained model to the model/analysis party. If not, repeat the above model training process until the model obtained by the model request node meets the model/analysis subscription requirements.
- the data volume of the model is relatively large, especially in the multi-training node mode, the model needs to perform multiple transmissions between the model training node and the model requesting node, which greatly increases the communication overhead.
- the present disclosure provides a training method to solve the problems of high communication overhead, insufficient model accuracy, and the security of terminal private data.
- the training method provided by the present disclosure determines the model structure and model training mode according to network service requirements (such as model subscription requirements), fully considers factors such as the locally available computing power, communication conditions, and training sample characteristics of the model training node, and formulates multiple model compression options to reduce unnecessary communication overhead, improve wireless network resource utilization, and apply deep learning to network intelligence in a more efficient and secure way.
- FIG. 1 is a schematic diagram of a system architecture of a training method provided by the present disclosure.
- the system includes a core network part and a radio access network part.
- the terminal (user) accesses the base station through a wireless channel; base stations are connected through the Xn interface; the base station accesses the User Plane Function (UPF) network element of the core network through the N3 interface; and the UPF network element accesses the Session Management Function (SMF) network element through the N4 interface.
- the SMF network element is connected to the bus structure of the core network and communicates with other Network Functions (NFs) of the core network.
- the communication system between the network device and the terminal shown in FIG. 1 is only a schematic illustration; the wireless communication system may also include other network devices, for example, a wireless relay device and a wireless backhaul device, which are not shown in FIG. 1.
- the embodiments of the present disclosure do not limit the number of network devices and the number of terminals included in the wireless communication system.
- the wireless communication system is a network that provides a wireless communication function.
- Wireless communication systems can use different communication technologies, such as code division multiple access (CDMA), wideband code division multiple access (WCDMA), time division multiple access (TDMA), frequency division multiple access (FDMA), orthogonal frequency-division multiple access (OFDMA), single-carrier frequency division multiple access (SC-FDMA), and carrier sense multiple access with collision avoidance (CSMA/CA).
- Networks can be divided into 2G (generation) networks, 3G networks, 4G networks, and future evolved networks such as 5G networks; a 5G network may also be called a New Radio (NR) network.
- the present disclosure will sometimes refer to a wireless communication network simply as a network.
- the wireless access network equipment may be: a base station, an evolved NodeB (eNB), a home base station, an access point (AP) in a wireless fidelity (WiFi) system, a wireless relay node, a wireless backhaul node, a transmission point (TP), or a transmission and reception point (TRP), etc.; it may also be a gNB in an NR system, or a component or part of a device that constitutes a base station.
- the network device may also be an in-vehicle device. It should be understood that, in the embodiments of the present disclosure, the specific technology and specific device form adopted by the network device are not limited.
- the terminal involved in the present disclosure may also be referred to as terminal equipment, user equipment (User Equipment, UE), mobile station (Mobile Station, MS), mobile terminal (Mobile Terminal, MT), etc.
- a device that provides voice and/or data connectivity; for example, a terminal may be a handheld device with wireless connectivity, a vehicle-mounted device, or the like.
- some examples of terminals are: a smartphone (mobile phone), a pocket personal computer (PPC), a personal digital assistant (PDA), a notebook computer, a tablet computer, a wearable device, or a vehicle-mounted device, etc.
- the terminal device may also be an in-vehicle device. It should be understood that the embodiments of the present disclosure do not limit the specific technology and specific device form adopted by the terminal.
- Fig. 2 is a flow chart of a training method according to an exemplary embodiment. As shown in Figure 2, the training method is used in the first node and includes the following steps.
- in step S11, in response to receiving a model training request, a first training model is trained.
- the first node is a model training node; for convenience of description, the model training node is referred to as the first node in this disclosure. The second node is a model request node; likewise, the model request node is referred to as the second node.
- the model training request includes model compression parameters.
- the model compression parameters include at least one of the following:
- a model training structure, multiple model compression options, and a model training mode.
- the model compression option is determined based on the model subscription requirement received by the second node (eg, the model requesting node).
- the second node determines to send the model training request according to the received model subscription requirement.
- the first node (for example, the model training node) sends third indication information in response to the model training request, indicating that it determines to train the model.
- the first training model is trained based on the local sample parameter set and the model training structure, and the relevant parameters required for model compression are determined.
- the response information to the model training request sent by the first node further includes one or more of the local computing capability of the first node, communication conditions, and characteristics of the training sample parameter set.
- in step S12, a first compression model of the first training model is obtained based on the first training model and the model compression parameters.
- the first node compresses the first training model based on the model compression option in the model compression parameters and the relevant parameters required for model compression.
- the relevant parameters required for model compression are determined by the first node based on the model compression parameters sent by the second node and parameters such as the first node's local computing capability; the model compression options include the model accuracy and the model parameter data volume.
- the model compression parameters include multiple model compression options, and the multiple model compression options are determined by the second node based on one or more of local computing capabilities, communication conditions, and training sample parameter set characteristics reported by multiple first nodes.
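- as a rough illustration only, the following hypothetical Python structure sketches what a model training request carrying such compression parameters might look like; all field names are assumptions, since the present disclosure does not define a concrete encoding:

```python
# Hypothetical sketch of a model training request; field names are assumptions.
from dataclasses import dataclass, field
from enum import Enum
from typing import List

class TrainingMode(Enum):
    SINGLE_NODE = "single"  # one first node trains a single first training model
    MULTI_NODE = "multi"    # several first nodes each train a first training model

@dataclass
class CompressionOption:
    model_accuracy: float       # accuracy to be retained after compression
    parameter_data_volume: int  # target size of the compressed model parameters

@dataclass
class ModelTrainingRequest:
    model_structure: str        # description of the model training structure
    training_mode: TrainingMode
    compression_options: List[CompressionOption] = field(default_factory=list)
```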
- Fig. 3 is a flowchart of a training method according to an exemplary embodiment. As shown in FIG. 3 , based on the first training model and the model compression parameters, a compression model of the first training model is obtained, including the following steps.
- in step S21, a first model compression option is determined among the multiple model compression options, and the first training model is compressed based on the first model compression option to obtain a second compression model.
- the first node determines, from among the multiple model compression options, a first model compression option for model compression according to one or more of its local computing capability, communication conditions, and training sample characteristics.
- the first training model is compressed according to the requirement on the model parameter data volume in the model compression option to obtain the second compression model, and the symbol θ_S is used to identify the second compression model.
- the following implementations may be used to compress the first training model by using the matrix g and the first model compression option:
- the first node takes the amount of model parameter data as a constraint, and designs a pruning matrix X to retain the channel that contributes more to the accuracy in the model.
- the first node takes the sum of the elements of each column of the pruning matrix X as the unknown item, and according to the size of the elements in each column of the matrix g, retains the channel corresponding to the item with the largest column element in the matrix, and prunes other channels.
- after the pruning matrix X is obtained, X is used to prune θ to obtain the second compression model θ_S.
- the first node selects an appropriate model compression option, compresses the training model according to the model compression option, and then transmits it to the second node; while retaining most of the accuracy of the deep learning model, the data volume of the training model is compressed as much as possible.
- This method realizes model compression according to the communication rate requirements of the model training node, which greatly reduces the communication overhead of the model uplink transmission.
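- as a minimal numerical sketch of this pruning step (assuming g is a per-channel importance matrix aligned with the columns of the weight matrix θ, and that the data-volume requirement has been translated into a channel budget `keep`; these assumptions go beyond what the disclosure specifies):

```python
import numpy as np

def prune_channels(theta: np.ndarray, g: np.ndarray, keep: int) -> np.ndarray:
    """Channel pruning sketch: retain the channels whose columns of g
    contribute most, and drop the rest via a 0/1 pruning matrix X."""
    # Rank channels by the sum of the elements in each column of g.
    importance = np.abs(g).sum(axis=0)
    kept = np.sort(np.argsort(importance)[-keep:])

    # Build the pruning matrix X as a column selector.
    X = np.zeros((theta.shape[1], keep))
    for j, channel in enumerate(kept):
        X[channel, j] = 1.0

    # theta @ X keeps only the selected channels of the first training model.
    return theta @ X
```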
- in step S22, a first loss function is determined according to the output of the first training model, the output of the second compression model, and the sample parameter set used for training the first training model.
- the sample training parameter set further includes a sample verification parameter set, and at least one data pair of input and output of the sample verification parameter set is determined.
- the first node inputs the input-output data pairs of the sample verification parameter set into the first training model and the second compression model, and determines the output of the first training model, the output of the second compression model, and the corresponding output in the sample verification parameter set, where this output is the true value corresponding to the model input.
- the first cross entropy between the output of the second compression model and the sample parameter set is determined, and the first relative entropy divergence between the output of the second compression model and the output of the first training model is determined; the sum of the first cross entropy and the first relative entropy divergence is determined as the loss function of the second compression model.
- the present disclosure determines the loss function of the second compression model as the first loss function for the convenience of distinction.
- a plurality of first loss functions are determined based on a plurality of input-output data pairs in the sample parameter set, the average value of the plurality of first loss functions is determined, and the parameters of the second compression model are updated by the gradient descent method according to the average value of the plurality of first loss functions, to obtain the first compression model.
- the first loss function (that is, the loss function of the second compression model) is expressed by the following formula:
- L_θ_S = L_C(p_S, y) + D_KL(p_S ∥ p_1)
- where L_θ_S is the loss function of the second compression model; L_C(p_S, y) is the first cross entropy between the output value p_S of the second compression model and the true value y of the input-output data pairs of the sample validation parameter set; and D_KL(p_S ∥ p_1) is the first relative entropy divergence between the output value of the second compression model and the output value p_1 of the first training model.
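- for a single input-output data pair, a plain-Python sketch of this first loss function (assuming p_S and p_1 are probability vectors and y is a one-hot true value; the averaging over multiple pairs described above is omitted here):

```python
import numpy as np

def first_loss(p_s: np.ndarray, p_1: np.ndarray, y: np.ndarray, eps: float = 1e-12) -> float:
    """L_theta_S = L_C(p_S, y) + D_KL(p_S || p_1) for one sample."""
    cross_entropy = -float(np.sum(y * np.log(p_s + eps)))                   # first cross entropy
    kl_divergence = float(np.sum(p_s * np.log((p_s + eps) / (p_1 + eps))))  # first relative entropy
    return cross_entropy + kl_divergence
```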
- in an implementation manner, the loss function of the first training model is determined first; after the parameters of the first training model are updated based on the loss function of the first training model, the loss function of the second compression model (i.e., the first loss function) is determined.
- the present disclosure determines the loss function of the first training model as the second loss function for the convenience of distinction.
- the sample training parameter set further includes a sample verification parameter set, and the first node determines at least one data pair of input and output of the sample verification parameter set.
- the first node inputs the input-output data pairs of the sample verification parameter set into the first training model and the second compression model, and determines the output of the first training model, the output of the second compression model, and the corresponding output in the sample verification parameter set, where this output is the true value corresponding to the model input.
- the second cross entropy between the output of the first training model and the sample parameter set is determined, and the second relative entropy divergence between the output of the first training model and the output of the second compression model is determined; the sum of the second cross entropy and the second relative entropy divergence is determined as the second loss function.
- a plurality of second loss functions are determined based on a plurality of input-output data pairs in the sample parameter set, the average value of the plurality of second loss functions is determined, and the parameters of the first training model are updated by the gradient descent method according to the average value of the plurality of second loss functions, to obtain an updated first training model.
- the second loss function (that is, the loss function of the first training model) is expressed by the following formula:
- L_θ = L_C(p_1, y) + D_KL(p_1 ∥ p_S)
- where L_θ is the loss function of the first training model; L_C(p_1, y) is the second cross entropy between the output value p_1 of the first training model and the true value y of the input-output data pairs of the sample verification parameter set; D_KL(p_1 ∥ p_S) is the second relative entropy divergence between the output value of the first training model and the output value of the second compression model; p_S is the output value of the second compression model; and p_1 is the output value of the first training model.
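- a hedged PyTorch-style sketch of how the two losses might drive the alternating updates described above (the teacher stands for the first training model and the student for the second compression model; the models, optimizers, and data are placeholders, and batch means stand in for averaging the per-sample losses before gradient descent):

```python
import torch
import torch.nn.functional as F

def kl(p: torch.Tensor, q: torch.Tensor, eps: float = 1e-12) -> torch.Tensor:
    """Batch-mean D_KL(p || q) over probability vectors on the last dimension."""
    return (p * ((p + eps).log() - (q + eps).log())).sum(dim=-1).mean()

def alternating_step(teacher, student, opt_t, opt_s, x, y):
    # Second loss: L_theta = L_C(p_1, y) + D_KL(p_1 || p_S); update the teacher first.
    logits_t = teacher(x)
    p1 = F.softmax(logits_t, dim=-1)
    with torch.no_grad():
        ps = F.softmax(student(x), dim=-1)
    loss_t = F.cross_entropy(logits_t, y) + kl(p1, ps)
    opt_t.zero_grad(); loss_t.backward(); opt_t.step()

    # First loss: L_theta_S = L_C(p_S, y) + D_KL(p_S || p_1); then update the student.
    with torch.no_grad():
        p1 = F.softmax(teacher(x), dim=-1)
    logits_s = student(x)
    ps = F.softmax(logits_s, dim=-1)
    loss_s = F.cross_entropy(logits_s, y) + kl(ps, p1)
    opt_s.zero_grad(); loss_s.backward(); opt_s.step()
    return float(loss_t), float(loss_s)
```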
- the model training modes in the model compression parameters include a single training node mode for training a single first training model and a multi-training node mode for training multiple first training models.
- the first node determines the number of first training models to train according to the model training mode included in the model training parameters. If the model training mode is the single training node mode, it is determined to train one first training model based on a single first node; the training method is as described above. If the model training mode is the multi-training node mode, it is determined to train a plurality of first training models based on a plurality of first nodes, and different sequence marks are set for the plurality of first nodes that train the plurality of first training models.
- the following takes the mth model training node (i.e., the mth first node) as an example to describe the multi-training node mode.
- the first node determines, from among the multiple model compression options, a first model compression option for model compression according to one or more of its local computing capability, communication conditions, and training sample characteristics.
- the first training model is compressed according to the requirement on the model parameter data volume in the model compression option to obtain the second compression model, and the symbol θ_S is used to identify the second compression model.
- the following implementations may be used to compress the first training model by using the matrix g and the first model compression option:
- the first node takes the amount of model parameter data as a constraint, and designs a pruning matrix X to retain the channel that contributes more to the accuracy in the model.
- the first node takes the sum of the elements of each column of the pruning matrix X as the unknown item, and according to the size of the elements in each column of the matrix g, retains the channel corresponding to the item with the largest column element in the matrix g, and prunes other channels.
- after the pruning matrix X is obtained, X is used to prune θ_m to obtain the mth second compression model.
- the sample training parameter set further includes a sample verification parameter set, and at least one data pair of input and output of the sample verification parameter set is determined.
- the first node inputs the input-output data pairs of the sample verification parameter set into the mth first training model and the mth second compression model, and determines the output of the mth first training model, the output of the mth second compression model, and the corresponding output in the sample validation parameter set, where this output is the real value corresponding to the model input.
- the present disclosure determines the loss function of the mth second compression model as the mth first loss function for the convenience of distinction.
- a plurality of mth first loss functions are determined based on a plurality of input-output data pairs in the sample parameter set, the average value of the plurality of mth first loss functions is determined, and the parameters of the mth second compression model are updated by the gradient descent method according to the average value of the plurality of mth first loss functions, to obtain the mth first compression model.
- the mth first loss function (that is, the loss function of the mth second compression model) is expressed analogously to the formula above, with the outputs of the mth second compression model and the mth first training model in place of p_S and p_1.
- in an implementation manner, the loss function of the mth first training model is determined first; after the parameters of the mth first training model are updated based on the loss function of the mth first training model, the loss function of the mth second compression model (i.e., the mth first loss function) is determined.
- the loss function of the mth first training model is determined as the mth second loss function.
- the sample training parameter set further includes a sample verification parameter set
- the first node determines at least one data pair of input and output of the sample verification parameter set.
- the first node inputs the input-output data pairs of the sample verification parameter set into the mth first training model and the mth second compression model, and determines the output of the mth first training model, the output of the mth second compression model, and the corresponding output in the sample validation parameter set, where this output is the real value corresponding to the model input.
- the mth second cross entropy between the output of the mth first training model and the sample parameter set (that is, the true value of the input-output data pairs of the sample validation parameter set) is determined, and the mth second relative entropy divergence between the output of the mth first training model and the output of the mth second compression model is determined; the sum of the mth second cross entropy and the mth second relative entropy divergence is determined as the mth second loss function.
- a plurality of mth second loss functions are determined based on a plurality of input-output data pairs in the sample parameter set, the average value of the plurality of mth second loss functions is determined, and the parameters of the mth first training model are updated by the gradient descent method according to the average value of the plurality of mth second loss functions, to obtain the updated mth first training model.
- the mth second loss function (that is, the loss function of the mth first training model) is expressed analogously to the formula above, where L_C is the second cross entropy between the output value p_m of the mth first training model and the true value of the input-output data pairs of the sample verification parameter set, and p_m is the output value of the mth first training model.
- model compression methods such as model sparsification and parameter quantization may also be selected to determine the first compression model, which is not specifically limited in the present disclosure.
- Fig. 4 is a flowchart showing a training method according to an exemplary embodiment. As shown in FIG. 4 , based on the first training model and the model compression parameters, a compression model of the first training model is obtained, including the following steps.
- in step S31, a fourth indication message is received.
- the second node determines the first compression model according to the received second indication message. If there is one first compression model, it is determined whether the first compression model meets the model subscription requirement or the analysis subscription requirement. If there are multiple first compression models, federated averaging is performed on the multiple first compression models to obtain a third compression model (also called a global model), and it is determined whether the third compression model meets the model subscription requirement or the analysis subscription requirement. In an implementation manner, if a first compression model, or the third compression model obtained by federated averaging of multiple first compression models, does not meet the subscription requirement, a fourth indication message is sent, where the fourth indication message is used to indicate the third compression model. The first node receives the fourth indication message.
- in step S32, based on the third compression model, model compression parameters are re-determined, and the first compression model is updated based on the re-determined model compression parameters.
- the first node re-determines the model compression parameters according to the third compression model indicated by the fourth indication message, updates the first compression model based on the re-determined model compression parameters, determines the loss function of the first compression model, and re-updates the parameters of the first compression model, until the second node determines a compression model that satisfies the model subscription requirement.
- in another implementation manner, when the second node determines that the first compression model meets the model subscription requirements, it determines to send a fifth indication message, where the fifth indication message is used to indicate the end of training the first compression model. After receiving the fifth indication message, the first node determines that the training of the first compression model is over, and the second node sends the determined compression model to the model subscriber.
- after obtaining one or more first compression models corresponding to the model training mode, the first node sends a second indication message to the second node through a wireless channel.
- the second indication message includes the number of first compression models corresponding to the model training mode.
- the embodiment of the present disclosure solves the problem that the data volume of the deep learning model is too large, effectively alleviates the shortage of wireless resources, reduces data transmission errors in the case of network congestion, improves the reliability of model transmission in the wireless network, and ensures the accuracy of the model.
- the model obtained by the first node through training with the local training parameter set is compressed and then uploaded to the second node. This method not only keeps the user's private data local, but also greatly increases the difficulty of reverse inference on the model by the network, which further ensures the security of user information.
- the embodiments of the present disclosure also provide a training method.
- Fig. 5 is a flowchart of a training method according to an exemplary embodiment. As shown in Figure 5, the training method is used in the second node and includes the following steps.
- in step S41, a model training request is sent.
- the model training request includes model compression parameters, and the model compression parameters are used to compress the first training model to obtain the first compression model, and the first training model is obtained by training based on the model training request.
- the first node is a model training node; for convenience of description, the model training node is referred to as the first node in this disclosure. The second node is a model request node; likewise, the model request node is referred to as the second node.
- the model training request includes model compression parameters.
- the model compression parameters include at least one of the following:
- a model training structure, multiple model compression options, and a model training mode.
- the model compression option is determined based on the model subscription requirement received by the second node (eg, the model requesting node).
- the second node determines to send the model training request according to the received model subscription requirement.
- the first node (for example, the model training node) sends third indication information in response to the model training request, indicating that it determines to train the model.
- the first training model is trained based on the local sample parameter set and the model training structure, and the relevant parameters required for model compression are determined.
- the information sent by the first node to respond to the model training request further includes one or more of the local computing capability of the first node, communication conditions, and characteristics of the training sample parameter set.
- the first node compresses the first training model based on the model compression option in the model compression parameters and the relevant parameters required for model compression.
- the model compression options include model accuracy and model parameter data volume.
- the model compression parameters include multiple model compression options, and the multiple model compression options are determined by the second node based on one or more of local computing capabilities reported by multiple first nodes, communication conditions, and characteristics of the training sample parameter set.
- the model training modes in the model compression parameters include a single training node mode for training a single first training model and a multi-training node mode for training multiple first training models.
- the first node determines the number of first training models to train according to the model training mode included in the model training parameters. If the model training mode is the single training node mode, it is determined to train one first training model based on a single first node; the training method is as described above. If the model training mode is the multi-training node mode, it is determined to train a plurality of first training models based on a plurality of first nodes, and different sequence marks are set for the plurality of first nodes that train the plurality of first training models.
- after obtaining one or more first compression models corresponding to the model training mode, the first node sends a second indication message to the second node through a wireless channel.
- the second indication message includes the number of first compression models corresponding to the model training mode.
- the second node receives the second indication message to determine the number of first compression models corresponding to the model training mode, performs federated averaging on the received one or more first compression models to obtain a third compression model, and determines whether the third compression model meets the model subscription requirement or the analysis subscription requirement.
- the subscription requirement may be issued by Operation Administration and Maintenance (OAM), or issued by the core network.
- Subscription requirements include: an analysis ID, used to identify the analysis type of the model training request; a notification target (model training node address), used to associate notifications received by the requested party with this subscription; analysis report information, including parameters such as the preferred analysis accuracy level and the analysis time interval; and analysis filter information (optional), indicating the conditions to be met by the reported analysis information. An illustrative structure is sketched below.
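- purely as an illustration, such a subscription might be represented as follows; the field names are hypothetical and not defined by the disclosure:

```python
# Hypothetical representation of a model/analysis subscription requirement.
from dataclasses import dataclass
from typing import Optional

@dataclass
class SubscriptionRequirement:
    analysis_id: str                       # identifies the analysis type of the request
    notification_target: str               # model training node address for notifications
    preferred_accuracy_level: float        # part of the analysis report information
    analysis_time_interval_s: int          # reporting interval, assumed in seconds
    analysis_filter: Optional[str] = None  # optional conditions the report must satisfy
```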
- a fourth indication message is sent, where the fourth indication message is used to indicate the third compression model.
- the first node receives the fourth indication message.
- in another implementation manner, when the second node determines that the first compression model meets the model subscription requirements, it determines to send a fifth indication message, where the fifth indication message is used to indicate the end of training the first compression model. After receiving the fifth indication message, the first node determines that the training of the first compression model is finished, and the second node sends the determined compression model to the model subscriber.
- the second node receives the subscription requirement sent by the OAM or the core network, and determines to send the model training request based on the received subscription requirement.
- the first node and the second node may be applied between a base station and a terminal, between a base station and a base station, or between a base station and the core network.
- for example, in one application environment the first node is a terminal and the second node is a base station; in another, the first node is a base station and the second node is also a base station; and in yet another, the first node is a base station and the second node is a core network node.
- after obtaining one or more first compression models corresponding to the model training mode, the first node sends a second indication message to the second node through a wireless channel.
- the second indication message includes the number of first compression models corresponding to the model training mode.
- the embodiment of the present disclosure solves the problem that the data volume of the deep learning model is too large, effectively alleviates the shortage of wireless resources, reduces data transmission errors in the case of network congestion, improves the reliability of model transmission in the wireless network, and ensures the accuracy of the model.
- the model obtained by the first node through training with the local training parameter set is compressed and then uploaded to the second node. This method not only keeps the user's private data local, but also greatly increases the difficulty of reverse inference on the model by the network, which further ensures the security of user information.
- in the following description, the first node is referred to as a model training node and the second node is referred to as a model request node, and the present disclosure is further described in terms of the interaction between model training nodes and model request nodes.
- FIG. 6 is a flowchart of an implementation manner of determining a first compression model in a single training node mode in a training method provided by the present disclosure.
- the model request node initiates a model training request to the model training node.
- the model training node sends the local computing power, communication conditions and training sample parameter set characteristics to the model requesting node.
- the model request node determines the model structure and model training mode according to the model/analysis subscription requirements, and proposes a variety of model compression options based on the information reported by the model training node, including model accuracy and model parameter data volume.
- the model request node sends the model structure, model training mode, and model compression options to the model training node, and the model training node selects an appropriate model compression option.
- the model training node uses the local sample parameter set for model training to obtain the first training model and related parameters required for model compression.
- the model training node compresses the first training model according to the selected model compression option and relevant parameters required for model compression to obtain a first compressed model, and transmits the first compressed model to the model requesting node through a wireless channel.
- if the first compression model meets the model/analysis subscription requirements, the model training process ends, and the model request node reports the model to the model/analysis subscriber; the overall flow is sketched below.
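- as a rough end-to-end sketch of this single training node flow, under the assumption that each signaling exchange above maps to one call (all function names are hypothetical stand-ins, not defined by the disclosure):

```python
def single_node_flow(request_node, training_node):
    # Hypothetical orchestration of the FIG. 6 interaction; names are illustrative.
    request = request_node.build_training_request()           # model training request
    caps = training_node.report_capabilities()                # computing power, channel, samples
    structure, mode, options = request_node.decide(request, caps)
    option = training_node.select_option(options)             # appropriate compression option
    model, aux = training_node.train(structure)               # first training model + parameters
    compressed = training_node.compress(model, option, aux)   # first compression model
    if request_node.meets_subscription(compressed):
        request_node.report_to_subscriber(compressed)
    return compressed
```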
- FIG. 7 is a flowchart of an implementation manner of determining a first compression model in a multi-training node mode in a training method provided by the present disclosure.
- the model request node initiates a model training request to the model training node.
- the model training node sends the local computing power, communication conditions and training sample parameter set characteristics to the model requesting node.
- the model request node determines the model structure and model training mode according to the model/analysis subscription requirements, and proposes a variety of model compression options based on the information reported by the model training node, including model accuracy and model parameter data volume.
- the model request node sends the model structure, model training mode, and model compression options to the model training node, and the model training node selects an appropriate model compression option.
- the model training node uses the local sample parameter set for model training to obtain the first training model and related parameters required for model compression.
- the model training node selects an appropriate model compression option, and compresses the first training model according to the selected model compression option and relevant parameters required for model compression to obtain a first compression model, and transmits the first compression model through the wireless channel Passed to the model request node.
- the model request node performs federated averaging on the first compression models sent by the model training nodes to obtain a global model, as sketched below.
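- a minimal sketch of this federated averaging step, assuming each first compression model is delivered as a PyTorch-style state dict with an identical structure:

```python
import torch

def federated_average(state_dicts):
    """Element-wise mean of the first compression models' parameters,
    yielding the global (third) compression model."""
    avg = {}
    for name in state_dicts[0]:
        avg[name] = torch.stack([sd[name].float() for sd in state_dicts]).mean(dim=0)
    return avg
```

- the model request node would then load the averaged parameters into a model with the agreed structure before checking it against the model/analysis subscription requirements.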
- if the global model meets the model/analysis subscription requirements, the model training process ends, and the model request node reports the global model to the model/analysis subscriber. If the global model does not meet the model/analysis subscription requirements, the model training nodes reselect appropriate model compression options and update the first compression models according to the re-determined model compression options.
- FIG. 8 is a schematic diagram of the protocol and interface of the model training and compression decision part in a training method provided by the present disclosure. As shown in FIG. 8 , it includes a service management module and a network communication module in the model request node, and a network communication module, model training and compression module, data processing and storage module in the model training node device.
- the service management module in the model request node, the network communication module and the network communication module, model training and compression module, data processing and storage module in the model training node device perform the following steps for information exchange.
- step 1 includes steps 1a-1c, wherein in step 1a, the model request node service management module sends model training request signaling to the model request node network communication module; the signaling indication content is to initiate a model training request to the model training node.
- step 1b the model requesting node network communication module sends the model training request signaling to the model training node network communication module.
- step 1c the model training node network communication module sends the model training request response signaling to the model request node network communication module, and the content of the signaling instruction is to notify the acceptance of the model training request.
- Step 2 includes steps 2a-2c, wherein in step 2a, the model training node model training and compression module sends computing capability information reporting signaling to the model training node network communication module; the signaling indication content is to report the computing capability information of the model training node device to the receiver.
- in step 2b, the model training node data processing and storage module sends training sample feature information reporting signaling to the model training node network communication module; the signaling indication content is to report the local data training sample feature information of the model training node to the receiver.
- in step 2c, the model training node network communication module sends computing capability and training sample feature information reporting signaling to the model request node network communication module; the signaling indication content is to report the computing capability and local data training sample feature information of the model training node to the receiver.
- step 3: if the model training node is a terminal and the model requesting node is a base station, the network communication module of the model training node needs to measure the Channel Quality Indication (CQI) and send CQI reporting signaling to the model requesting node network communication module; the signaling indication content is to perform the CQI measurement and report the CQI information to the receiver.
- step 4 the model request node network communication module sends the model training node computing capability, training sample characteristics, and CQI information (optional) signaling to the model request node service management module; the signaling indication content is to aggregate the received computing capability of the model training node, training sample characteristics, and CQI information (optional) and send them to the receiver.
- step 5 the model request node service management module determines the model structure and model training mode according to the model/analysis subscription requirements.
- step 6 the model request node service management module proposes various model compression options according to the information reported by the model training node.
- Step 7 includes steps 7a-7b, wherein in step 7a, the model request node service management module sends model structure and model training mode signaling to the model request node network communication module; the signaling indication content is to send the model structure and model training mode to the receiver.
- in step 7b, the model requesting node service management module sends model compression option signaling to the model requesting node network communication module; the signaling indication content is to send multiple model compression options to the receiver.
- Step 8 includes steps 8a-8b, wherein in step 8a, the model request node network communication module sends the model structure and model training mode signaling to the model training node network communication module. In step 8b, the model requesting node network communication module sends the model compression option signaling to the model training node network communication module.
- Step 9 includes steps 9a-9b, wherein in step 9a, the model training node network communication module sends the model structure and model training mode signaling to the model training node model training and compression module.
- step 9b the model training node network communication module sends the model compression option signaling to the model training node model training and compression module.
- step 10 the model training node selects an appropriate model compression option according to the locally available computing power, real-time communication conditions, and characteristics of the training samples; one possible selection rule is sketched below.
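- Step 10 does not prescribe a selection rule. A hedged sketch of one plausible rule follows: among the advertised options, pick the most accurate one whose compute cost fits the local budget and whose parameter volume fits what the current channel can carry. The `CompressionOption` fields, the CQI-to-link-budget mapping, and all numeric thresholds are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class CompressionOption:
    accuracy: float     # advertised model accuracy for this option
    param_bytes: int    # volume of model parameter data after compression
    flops: int          # compute needed to train/compress under this option

def select_option(options: List[CompressionOption],
                  flops_budget: int,
                  cqi: int) -> Optional[CompressionOption]:
    # Crude, illustrative mapping from CQI to the parameter volume the radio
    # link can carry within the reporting deadline.
    link_budget_bytes = 50_000 * max(cqi, 1)
    feasible = [o for o in options
                if o.flops <= flops_budget and o.param_bytes <= link_budget_bytes]
    if not feasible:
        return None  # e.g. fall back to requesting new options from the model request node
    return max(feasible, key=lambda o: o.accuracy)
```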
- FIG. 9 is a schematic diagram of the protocol and interface of the model training and compression part in a single training node mode in a training method provided by the present disclosure. As shown in FIG. 9 , it includes a data processing and storage module, a model training and compression module, and a network communication module in the model training node, as well as a network communication module and a service management module in the model requesting node device.
- the data processing and storage module, model training and compression module, network communication module in the model training node, and the network communication module and service management module in the model request node device perform the following steps for message interaction.
- step 1 includes steps 1a-1b, wherein in step 1a, the model training node model training and compression module sends request-local-training-data-set signaling to the model training node data processing and storage module; the signaling indication content is to request that a training data set be collected from local data.
- in step 1b, the data processing and storage module of the model training node sends local-training-data-set signaling to the model training and compression module of the model training node; the signaling indication content is to collect data from local data to generate a training data set and send it to the receiver.
- step 2 the model training and compression module of the model training node uses the local training data set for model training, and obtains the training model and relevant parameters required for model compression.
- step 3 the model training node compresses the original training model according to the selected model compression option and relevant parameters required for model compression to obtain a compressed model.
- Step 4 includes steps 4a-4c, wherein in step 4a, the model training node model training and compression module sends the compressed model to the model training node network communication module.
- in step 4b, the model training node network communication module sends the compressed model to the model request node network communication module.
- in step 4c, the model request node network communication module sends the compressed model to the model request node service management module.
- step 5 the model request node service management module judges whether the obtained model satisfies the model/analysis subscription requirement. If satisfied, go to step 6; otherwise, go to steps 6a-6b.
- step 6 the model request node service management module sends a signaling of notifying the model training end to the model training node network communication module via the model request node network communication module.
- This process and the corresponding signaling are newly added in the present disclosure; the signaling indication content is to notify the model training node to end the model training process.
- step 6a the model request node service management module sends notify-model-training-continuation signaling to the model training node network communication module via the model request node network communication module.
- This process and the corresponding signaling are newly added in the present disclosure; the signaling indication content is to notify the model training node to continue the model training process.
- step 6b the network communication module of the model training node forwards the notify-model-training-continuation signaling to the model training node model training and compression module.
- step 7 the model training and compression module of the model training node uses the local training data set to train the compressed model, and repeats steps 4a-7 until the model obtained by the model request node meets the model/analysis subscription requirements.
- FIG. 10 is a schematic diagram of a protocol and an interface of a model training and compression part in a multi-training node mode in a training method provided by the present disclosure. As shown in FIG. 10, it includes a data processing and storage module, a model training and compression module, and a network communication module in the model training node, and a network communication module, a model calculation and update module, and a service management module in the model request node device. The following steps are performed for the information exchange among these modules.
- Step 1 includes steps 1a-1b, wherein in step 1a, the model training and compression module of the model training node sends a request for local training data set signaling to the model training node data processing and storage module. In step 1b, the data processing and storage module of the model training node sends the signaling of sending the local training data set to the model training and compression module of the model training node.
- step 2 the model training and compression module of the model training node uses the local training data set to perform model training to obtain a first training model and relevant parameters required for model compression.
- step 3 the model training node compresses the first training model according to the selected model compression option and the relevant parameters required for model compression to obtain the first compression model.
- Step 4 includes steps 4a-4c, wherein in step 4a, the model training node model training and compression module sends the first compressed model to the model training node network communication module. In step 4b, the model training node network communication module sends the first compressed model to the model request node network communication module. In step 4c, the model request node network communication module sends the first compressed model to the model request node model calculation and update module.
- step 5 the model request node model calculation and update module aggregates the first compressed models sent from each model training node and performs federated averaging to obtain a global model.
- step 6 the model request node model calculation and update module sends the global model to the model request node service management module.
- step 7 the model request node service management module judges whether the obtained model meets the model/analysis subscription requirements. If so, execute step 8; otherwise, execute steps 8a-8b.
- step 8 the model request node service management module sends a signaling of notifying the model training end to the model training node network communication module via the model request node network communication module.
- step 8a the model request node service management module sends notify-model-training-continuation signaling to the model training node network communication module via the model request node network communication module, and distributes the global model to the model training node network communication module via the model request node network communication module.
- step 8b the model training node network communication module forwards the notify-model-training-continuation signaling to the model training node model training and compression module, and sends the global model to the model training node model training and compression module.
- step 9 the model training and compression module of the model training node uses the local training data set to perform model training and compression on the global model sent by the model requesting node, and repeats steps 4a-9 until the model obtained by the model requesting node satisfies the model/analysis subscription requirements.
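- Read together, steps 1-9 amount to the following orchestration loop. This is a sketch under stated assumptions, not the disclosed implementation: `node.local_train_and_compress` stands in for the node-side steps 1-4, `federated_average` for step 5 (as sketched earlier), and `meets_subscription` for the service management check of step 7.

```python
def run_multi_node_training(nodes, initial_model, federated_average,
                            meets_subscription, max_rounds=50):
    """Iterate the multi-training-node round until the global model meets the
    model/analysis subscription requirements (or a round cap is reached)."""
    global_model = initial_model
    for _ in range(max_rounds):
        reports = []
        for node in nodes:
            # Steps 1-4: local training and compression at each node, then
            # upload of the first compressed model over the radio link.
            compressed, n_samples = node.local_train_and_compress(global_model)
            reports.append((compressed, n_samples))
        # Step 5: federated averaging at the model request node.
        global_model = federated_average([w for w, _ in reports],
                                         [n for _, n in reports])
        # Steps 7-8: service management check; end or continue and redistribute.
        if meets_subscription(global_model):
            break
    return global_model
```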
- FIG. 11 is a schematic diagram of a protocol and interface of a wireless data transmission part in a training method provided by the present disclosure. As shown in FIG. 11, it includes a model training and compression module, a transmission control module, and a network communication module in the model training node, and a network communication module, a transmission control module, and a service management module in the model request node. It can be applied to the application scenario where the model requesting node is the base station and the model training node is the terminal. The following steps are performed for the information interaction between the modules.
- step 1 the model training node model training and compression module sends the compressed model to the model training node transmission control module.
- step 2 the model training node network communication module sends the measured CQI reporting signaling to the model training node transmission control module.
- step 3 the model training node transmission control module formulates a data transmission scheme according to compression characteristics and wireless communication conditions.
- step 4 the model training node transmission control module sends data transmission scheme information signaling to the model training node network communication module. This process and the corresponding signaling are newly added in the present disclosure; the signaling indication content is to send the data transmission scheme information, including modulation mode, code rate, and other information, to the receiver.
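- The disclosure does not specify how the transmission control module maps channel quality to a modulation mode and code rate. As a hedged illustration, the sketch below picks them from a CQI-indexed table loosely inspired by LTE/NR CQI tables; the table entries and the airtime estimate are simplified stand-ins, not the 3GPP-specified values.

```python
# Illustrative CQI -> (modulation, bits per symbol, code rate) table.
MCS_TABLE = {
    range(1, 7):   ("QPSK",  2, 0.30),
    range(7, 10):  ("16QAM", 4, 0.50),
    range(10, 16): ("64QAM", 6, 0.75),
}

def make_transmission_scheme(cqi: int, payload_bytes: int, symbol_rate: float):
    """Pick a modulation/code rate from the CQI and estimate the airtime for
    the compressed-model payload (cf. steps 3-4 of the FIG. 11 flow)."""
    for cqi_range, (modulation, bits_per_symbol, code_rate) in MCS_TABLE.items():
        if cqi in cqi_range:
            break
    else:
        raise ValueError("CQI out of range")
    goodput_bps = symbol_rate * bits_per_symbol * code_rate
    return {
        "modulation": modulation,
        "code_rate": code_rate,
        "estimated_airtime_s": payload_bytes * 8 / goodput_bps,
    }

# Example: a 200 kB compressed model on a 1 Msym/s link reported at CQI 8.
scheme = make_transmission_scheme(cqi=8, payload_bytes=200_000, symbol_rate=1e6)
```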
- step 5 the model training and compression module of the model training node sends the compressed model to the network communication module of the model training node.
- step 6 the model training node network communication module encapsulates the compressed model according to the data transmission scheme.
- Step 7 includes steps 7a-7d, wherein in step 7a, the model training node network communication module transmits the compressed model data packet to the model request node network communication module.
- in step 7b, the network communication module of the model request node sends the compressed model to the transmission control module of the model request node; at this point the decapsulated data is transmitted.
- in step 7c, the transmission control module of the model requesting node sends acknowledge-receipt-of-correct-data signaling to the network communication module of the model requesting node; the signaling indication content is to notify that correct data has been received.
- in step 7d, the model requesting node network communication module sends the acknowledge-receipt-of-correct-data signaling to the model training node network communication module.
- step 8 the model request node transmission control module sends the compressed model to the model request node service management module. In the single training node mode, the compressed model can be sent directly to the model request node service management module; in the multi-training node mode, the global model first needs to be obtained through the model request node model calculation and update module and is then sent to the model request node service management module.
- step 9 the model request node service management module judges whether the model meets the model/analysis subscription requirement. If so, go to steps 10a1-10b1.
- step 10a1 the model requesting node service management module sends a signaling informing the model training end to the model requesting node transmission control module.
- step 10b1 the model requesting node network communication module sends a signaling informing the model training end to the model training node network communication module.
- if the model does not meet the model/analysis subscription requirement, steps 10a2-10b2 are performed.
- step 10a2 the model request node service management module sends notify-model-training-continuation signaling to the model request node transmission control module; the signaling indication content is to notify the model training node to continue the model training process.
- step 10b2 the model requesting node network communication module sends the model training continuation signaling to the model training node network communication module.
- in the single training node mode, only the signaling notifying the model training to continue needs to be sent; in the multi-training node mode, the global model also needs to be distributed to the model training nodes.
- the protocol and interface principle of global model distribution are similar to the above steps 1-7, except that the sending module is in the model request node, the receiving module is in the model training node, and the compressed model is replaced by the global model.
- in addition, the model requesting node should initiate a CQI measurement request to the model training node, and the model training node performs the CQI measurement and then feeds the result back to the model requesting node.
- an embodiment of the present disclosure also provides a training device.
- the training apparatus includes corresponding hardware structures and/or software modules for executing each function.
- the embodiments of the present disclosure can be implemented in hardware or a combination of hardware and computer software. Whether a function is performed by hardware or computer software driving hardware depends on the specific application and design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of the technical solutions of the embodiments of the present disclosure.
- FIG. 12 is a block diagram of a training apparatus 100 according to an exemplary embodiment. Referring to FIG. 12 , the apparatus is applied to a first node, including a model training and compression module 110 , a first network communication module 120 , a first transmission control module 130 and a data processing and storage module 140 .
- the model training and compression module 110 is configured to train a first training model in response to receiving a model training request, wherein the model training request includes model compression parameters. Based on the first training model and the model compression parameters, a first compression model of the first training model is obtained.
- the model compression parameter includes a plurality of model compression options.
- the model training and compression module 110 is configured to determine a first model compression option among the multiple model compression options, and compress the first training model based on the first model compression option to obtain a second compression model.
- the first loss function is determined according to the output of the first training model, the output of the second compression model, and the sample parameter set used to train the first training model.
- the parameters of the second compression model are updated based on the first loss function to obtain the first compression model.
- the apparatus further includes a data processing and storage module 140 .
- the data processing and storage module 140 is configured to determine the first cross entropy between the output of the second compression model and the sample parameter set, and determine the first relative entropy divergence between the output of the second compression model and the output of the first training model. Based on the first cross entropy and the first relative entropy divergence, a first loss function is determined.
- the data processing and storage module 140 is further configured to determine, according to the output of the first training model, the output of the second compression model, and the sample parameter set used for training the first training model, a second loss function for updating the parameters of the first training model.
- the data processing and storage module 140 is further configured to determine the second cross entropy between the output of the first training model and the sample parameter set, and determine the second relative entropy divergence between the output of the first training model and the output of the second compression model. Based on the second cross entropy and the second relative entropy divergence, a second loss function is determined. A sketch of both loss functions is given below.
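- In knowledge-distillation terms, the first loss function updates the compressed (student) model and the second updates the original (teacher) training model. Below is a minimal sketch of both, assuming softmax probability outputs, cross entropy against the sample labels, and KL divergence as the relative entropy term; the weighting factor `alpha` and the KL direction are assumptions, since the disclosure does not fix them.

```python
import numpy as np

def cross_entropy(probs, labels_onehot, eps=1e-12):
    return -np.mean(np.sum(labels_onehot * np.log(probs + eps), axis=1))

def kl_divergence(p, q, eps=1e-12):
    """Relative entropy divergence D(p || q), averaged over the batch."""
    return np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=1))

def first_loss(student_probs, teacher_probs, labels_onehot, alpha=0.5):
    """Updates the second (compressed) model: its cross entropy with the
    sample parameter set plus the relative entropy divergence between its
    output and the first training model's output."""
    return (cross_entropy(student_probs, labels_onehot)
            + alpha * kl_divergence(teacher_probs, student_probs))

def second_loss(student_probs, teacher_probs, labels_onehot, alpha=0.5):
    """Updates the first training model: its cross entropy with the sample
    parameter set plus the relative entropy divergence between its output
    and the compressed model's output."""
    return (cross_entropy(teacher_probs, labels_onehot)
            + alpha * kl_divergence(teacher_probs, student_probs))
```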
- the model compression parameters include a model training mode, which includes a single training node mode for training a single first training model and a multi-training node mode for training multiple first training models.
- the number of first training models is determined based on the model training mode.
- the apparatus further includes a first network communication module 120 .
- the first network communication module 120 is configured to send a second indication message, where the second indication message includes a number of first compressed models corresponding to the model training mode.
- the first network communication module 120 is further configured to receive a third instruction message, where the third instruction message includes an instruction to determine the training model.
- the first network communication module 120 is further configured to receive a fourth indication message.
- the fourth indication message is used to indicate a third compression model, where the third compression model is a compression model obtained by federally averaging the first training model based on the number of the first compression models.
- based on the third compression model, the model compression parameters are re-determined, and the first compression model is updated based on the re-determined model compression parameters.
- the first network communication module 120 is further configured to receive a fifth instruction message, where the fifth instruction message is used to instruct to end training the first compression model.
- the first network communication module 120 is used for data transmission and control signaling interaction between the model requesting node and the model training node.
- the first transmission control module 130 is used to formulate a data transmission scheme according to the characteristics of the data to be transmitted and the wireless communication conditions, and to package the data to be transmitted according to the data transmission scheme. The transmission control module is required only in the embodiment in which the model requesting node is the base station and the model training node is the terminal.
- the data processing and storage module is used to manage local data, generate training sample characteristic information, collect data to generate a local training data set, and store the data set.
- the model training and compression module is used for model training using the local data set, and compressing the model according to the information required for model compression obtained in the training process.
- FIG. 13 is a block diagram of a training apparatus 200 according to an exemplary embodiment.
- the apparatus is applied to a second node, and includes a second network communication module 210 , a second transmission control module 220 , a service management module 230 and a model calculation and update module 240 .
- the second network communication module 210 is configured to send a model training request.
- the model training request includes model compression parameters, and the model compression parameters are used to compress the first training model to obtain the first compression model, and the first training model is obtained by training based on the model training request.
- the model compression parameters include a model training mode, which includes a single training node mode for training a single first training model and a multi-training node mode for training multiple first training models.
- the number of first training models is determined based on the model training mode.
- the second network communication module 210 is further configured to receive a second indication message, where the second indication message includes a number of first compressed models corresponding to the model training mode.
- the second network communication module 210 is further configured to send a third instruction message, where the third instruction message includes an instruction to determine the training model.
- in the case that the model training mode includes a multi-training node mode, the second network communication module 210 is further configured to send a fourth indication message.
- the fourth indication message is used to indicate a third compression model, where the third compression model is a compression model obtained by performing a federated average of the first compression models based on the number of the first training models.
- the second network communication module 210 is further configured to send a fifth instruction message, where the fifth instruction message is used to instruct the end of training the first compression model.
- the apparatus further includes a service management module 230 .
- the service management module 230 is configured to receive subscription requirements and send a model training request based on the subscription requirements.
- the second network communication module 210 is used for data transmission and control signaling interaction between the model requesting node and the model training node.
- the second transmission control module 220 is configured to formulate a data transmission scheme according to the characteristics of the data to be transmitted and the wireless communication conditions, and to package the data to be transmitted according to the data transmission scheme. The transmission control module is required only in the embodiment in which the model requesting node is the base station and the model training node is the terminal.
- the service management module 230 is used to process model/analysis subscription requests, initiate model training requests to model training nodes, formulate model structures, model training modes and model compression options, and check whether the obtained models meet model/analysis subscription requirements.
- the model calculation and update module 240 is used for performing federated averaging on the compressed models sent from multiple model training nodes in a multi-training node mode to obtain a global model, and distributing the global model to the model training nodes.
- a model training node device for deep learning model training and compression oriented to a wireless network is responsible for: responding to a model training request from a model requesting node and reporting local resource information; selecting an appropriate model compression option; and performing model training and compression according to the model training mode and the selected model compression option.
- FIG. 14 is a block diagram of an apparatus 300 for training according to an exemplary embodiment.
- apparatus 300 may be a mobile phone, computer, digital broadcast terminal, messaging device, game console, tablet device, medical device, fitness device, personal digital assistant, and the like.
- apparatus 300 may include one or more of the following components: processing component 302, memory 304, power component 306, multimedia component 308, audio component 310, input/output (I/O) interface 312, sensor component 314, and communication component 316.
- the processing component 302 generally controls the overall operation of the device 300, such as operations associated with display, phone calls, data communications, camera operations, and recording operations.
- the processing component 302 may include one or more processors 320 to execute instructions to perform all or some of the steps of the methods described above. Additionally, processing component 302 may include one or more modules that facilitate interaction between processing component 302 and other components. For example, processing component 302 may include a multimedia module to facilitate interaction between multimedia component 308 and processing component 302 .
- Memory 304 is configured to store various types of data to support operations at device 300. Examples of such data include instructions for any application or method operating on device 300, contact data, phonebook data, messages, pictures, videos, and the like. Memory 304 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk.
- Power component 306 provides power to various components of device 300 .
- Power components 306 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power to device 300 .
- Multimedia component 308 includes screens that provide an output interface between the device 300 and the user.
- the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from a user.
- the touch panel includes one or more touch sensors to sense touch, swipe, and gestures on the touch panel. The touch sensor may not only sense the boundaries of a touch or swipe action, but also detect the duration and pressure associated with the touch or swipe action.
- the multimedia component 308 includes a front-facing camera and/or a rear-facing camera. When the apparatus 300 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera may receive external multimedia data. Each of the front and rear cameras can be a fixed optical lens system or have focal length and optical zoom capability.
- Audio component 310 is configured to output and/or input audio signals.
- audio component 310 includes a microphone (MIC) that is configured to receive external audio signals when device 300 is in operating modes, such as call mode, recording mode, and voice recognition mode. The received audio signal may be further stored in memory 304 or transmitted via communication component 316 .
- audio component 310 also includes a speaker for outputting audio signals.
- the I/O interface 312 provides an interface between the processing component 302 and a peripheral interface module, which may be a keyboard, a click wheel, a button, or the like. These buttons may include, but are not limited to: home button, volume buttons, start button, and lock button.
- Sensor assembly 314 includes one or more sensors for providing status assessment of various aspects of device 300 .
- the sensor assembly 314 can detect the open/closed state of the device 300 and the relative positioning of components, such as the display and keypad of the device 300; the sensor assembly 314 can also detect a change in the position of the device 300 or a component of the device 300, the presence or absence of user contact with the device 300, the orientation or acceleration/deceleration of the device 300, and a change in the temperature of the device 300.
- Sensor assembly 314 may include a proximity sensor configured to detect the presence of nearby objects in the absence of any physical contact.
- Sensor assembly 314 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications.
- the sensor assembly 314 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
- Communication component 316 is configured to facilitate wired or wireless communication between apparatus 300 and other devices.
- Device 300 may access wireless networks based on communication standards, such as WiFi, 2G or 3G, or a combination thereof.
- the communication component 316 receives broadcast signals or broadcast related information from an external broadcast management system via a broadcast channel.
- the communication component 316 also includes a near field communication (NFC) module to facilitate short-range communication.
- the NFC module may be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology and other technologies.
- apparatus 300 may be implemented by one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the above method.
- In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, such as the memory 304 including instructions executable by the processor 320 of the apparatus 300 to perform the method described above.
- the non-transitory computer-readable storage medium may be ROM, random access memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, and the like.
- FIG. 15 is a block diagram of an apparatus 400 for training according to an exemplary embodiment.
- the apparatus 400 may be provided as a server.
- apparatus 400 includes processing component 422, which further includes one or more processors, and a memory resource represented by memory 432 for storing instructions executable by processing component 422, such as an application program.
- An application program stored in memory 432 may include one or more modules, each corresponding to a set of instructions.
- the processing component 422 is configured to execute instructions to perform the training method described above.
- Device 400 may also include a power supply assembly 426 configured to perform power management of device 400, a wired or wireless network interface 450 configured to connect device 400 to a network, and an input/output (I/O) interface 458.
- Device 400 may operate based on an operating system stored in memory 432, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
- The terms first, second, and the like are used to describe various types of information, but the information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another and do not imply a particular order or level of importance; in fact, the expressions "first", "second", and so on are used entirely interchangeably.
- the first information may also be referred to as the second information, and similarly, the second information may also be referred to as the first information, without departing from the scope of the present disclosure.
Abstract
The present disclosure relates to a training method, a training apparatus, and a storage medium. The training method comprises: in response to receiving a model training request, training a first training model, wherein the model training request comprises model compression parameters; and on the basis of the first training model and the model compression parameters, obtaining a first compression model of the first training model. Thus, a compression model can have the same effect as a training model, so that signaling overheads during model transmission are reduced, the accuracy and reliability of the model can be guaranteed, and the security of user information is further ensured.
Description
The present disclosure relates to the field of wireless communication technologies, and in particular, to a training method, a training apparatus, and a storage medium.
To meet the needs of multi-service scenarios, communication networks must provide ultra-high data rates, ultra-low latency, ultra-high reliability, and a very large number of connections. These service scenarios, together with the corresponding requirements and the characteristics of the communication network, bring unprecedented challenges to the deployment, operation, and maintenance of communication networks.
In the related art, artificial intelligence is introduced to improve the resource utilization of communication networks, the terminal service experience, and the automated and intelligent control and management of communication networks, and models obtained through deep learning can achieve better performance. However, their high storage and computing resource consumption makes them difficult to apply effectively on various hardware platforms, and the communication overhead is large, the precision is low, and the security is poor.
SUMMARY OF THE INVENTION
In order to overcome the problems existing in the related art, the present disclosure provides a training method, a training apparatus, and a storage medium.
According to a first aspect of the embodiments of the present disclosure, a training method is provided, applied to a first node. The method includes:
in response to receiving a model training request, training a first training model, wherein the model training request includes model compression parameters; and obtaining a first compression model of the first training model based on the first training model and the model compression parameters.
In one embodiment, the model compression parameters include a plurality of model compression options.
Obtaining the first compression model of the first training model based on the first training model and the model compression parameters includes:
determining a first model compression option among the plurality of model compression options, and compressing the first training model based on the first model compression option to obtain a second compression model; determining a first loss function according to the output of the first training model, the output of the second compression model, and the sample parameter set used to train the first training model; and updating the parameters of the second compression model based on the first loss function to obtain the first compression model.
In one embodiment, determining the first loss function according to the output of the first training model, the output of the second compression model, and the sample parameter set used to train the first training model includes:
determining a first cross entropy between the output of the second compression model and the sample parameter set, and determining a first relative entropy divergence between the output of the second compression model and the output of the first training model; and determining the first loss function based on the first cross entropy and the first relative entropy divergence.
In one embodiment, the method further includes:
determining a second loss function for updating the parameters of the first training model according to the output of the first training model, the output of the second compression model, and the sample parameter set used to train the first training model.
In one embodiment, determining the second loss function for updating the parameters of the first training model according to the output of the first training model, the output of the second compression model, and the sample parameter set used for training the first training model includes:
determining a second cross entropy between the output of the first training model and the sample parameter set, and determining a second relative entropy divergence between the output of the first training model and the output of the second compression model; and determining the second loss function based on the second cross entropy and the second relative entropy divergence.
In one embodiment, the model compression parameters include a model training mode, and the model training mode includes a single training node mode for training a single first training model and a multi-training node mode for training a plurality of first training models.
The number of the first training models is determined based on the model training mode.
In one embodiment, the method further includes:
sending a second indication message, where the second indication message includes a number of first compressed models corresponding to the model training mode.
In one embodiment, the method further includes:
receiving a third indication message, where the third indication message includes an indication of determining the training model.
In one embodiment, the model training mode includes a multi-training node mode, and the method further includes:
receiving a fourth indication message, where the fourth indication message is used to indicate a third compression model, and the third compression model is a compression model obtained by performing federated averaging on the first training model based on the number of the first compression models; and based on the third compression model, re-determining the model compression parameters, and updating the first compression model based on the re-determined model compression parameters.
In one embodiment, the method further includes:
receiving a fifth indication message, where the fifth indication message is used to indicate ending the training of the first compression model.
According to a second aspect of the embodiments of the present disclosure, a training method is provided, applied to a second node. The method includes:
sending a model training request, where the model training request includes model compression parameters, the model compression parameters are used to compress a first training model to obtain a first compression model, and the first training model is obtained by training based on the model training request.
In one embodiment, the model compression parameters include a model training mode, and the model training mode includes a single training node mode for training a single first training model and a multi-training node mode for training a plurality of first training models.
The number of the first training models is determined based on the model training mode.
In one embodiment, the method further includes:
receiving a second indication message, where the second indication message includes a number of first compressed models corresponding to the model training mode.
In one embodiment, the method further includes:
sending a third indication message, where the third indication message includes an indication of determining the training model.
In one embodiment, the model training mode includes a multi-training node mode, and the method further includes:
sending a fourth indication message, where the fourth indication message is used to indicate a third compression model, and the third compression model is a compression model obtained by performing federated averaging on the first compression models based on the number of the first training models.
In one embodiment, the method further includes:
sending a fifth indication message, where the fifth indication message is used to indicate ending the training of the first compression model.
In one embodiment, the method further includes:
receiving a subscription requirement, and sending the model training request based on the subscription requirement.
According to a third aspect of the embodiments of the present disclosure, a training apparatus is provided, applied to a first node. The apparatus includes:
a model training and compression module, configured to train a first training model in response to receiving a model training request, where the model training request includes model compression parameters, and to obtain a first compression model of the first training model based on the first training model and the model compression parameters.
In one embodiment, the model compression parameters include a plurality of model compression options.
The model training and compression module is configured to determine a first model compression option among the plurality of model compression options, and compress the first training model based on the first model compression option to obtain a second compression model; determine a first loss function according to the output of the first training model, the output of the second compression model, and the sample parameter set used to train the first training model; and update the parameters of the second compression model based on the first loss function to obtain the first compression model.
In one embodiment, the apparatus further includes a data processing and storage module.
The data processing and storage module is configured to determine a first cross entropy between the output of the second compression model and the sample parameter set, and determine a first relative entropy divergence between the output of the second compression model and the output of the first training model; and determine the first loss function based on the first cross entropy and the first relative entropy divergence.
In one embodiment, the data processing and storage module is further configured to determine, according to the output of the first training model, the output of the second compression model, and the sample parameter set used for training the first training model, a second loss function for updating the parameters of the first training model.
In one embodiment, the data processing and storage module is further configured to determine a second cross entropy between the output of the first training model and the sample parameter set, and determine a second relative entropy divergence between the output of the first training model and the output of the second compression model; and determine the second loss function based on the second cross entropy and the second relative entropy divergence.
In one embodiment, the model compression parameters include a model training mode, and the model training mode includes a single training node mode for training a single first training model and a multi-training node mode for training a plurality of first training models; the number of the first training models is determined based on the model training mode.
In one embodiment, the apparatus further includes a first network communication module.
The first network communication module is configured to send a second indication message, where the second indication message includes a number of first compressed models corresponding to the model training mode.
In one embodiment, the first network communication module is further configured to receive a third indication message, where the third indication message includes an indication of determining the training model.
In one embodiment, the first network communication module is further configured to receive a fourth indication message, where the fourth indication message is used to indicate a third compression model, and the third compression model is a compression model obtained by performing federated averaging on the first training model based on the number of the first compression models; and based on the third compression model, re-determine the model compression parameters and update the first compression model based on the re-determined model compression parameters.
In one embodiment, the first network communication module is further configured to receive a fifth indication message, where the fifth indication message is used to indicate ending the training of the first compression model.
According to a fourth aspect of the embodiments of the present disclosure, a training apparatus is provided, applied to a second node. The apparatus includes:
a second network communication module, configured to send a model training request, where the model training request includes model compression parameters, the model compression parameters are used to compress a first training model to obtain a first compression model, and the first training model is obtained by training based on the model training request.
In one embodiment, the model compression parameters include a model training mode, and the model training mode includes a single training node mode for training a single first training model and a multi-training node mode for training a plurality of first training models.
The number of the first training models is determined based on the model training mode.
In one embodiment, the second network communication module is further configured to receive a second indication message, where the second indication message includes a number of first compressed models corresponding to the model training mode.
In one embodiment, the second network communication module is further configured to send a third indication message, where the third indication message includes an indication of determining the training model.
In one embodiment, the model training mode includes a multi-training node mode, and the second network communication module is further configured to send a fourth indication message, where the fourth indication message is used to indicate a third compression model, and the third compression model is a compression model obtained by performing federated averaging on the first compression models based on the number of the first training models.
In one embodiment, the second network communication module is further configured to send a fifth indication message, where the fifth indication message is used to indicate ending the training of the first compression model.
In one embodiment, the apparatus further includes a service management module.
The service management module is configured to receive a subscription requirement and send the model training request based on the subscription requirement.
根据本公开实施例的第五方面,提供一种训练装置,包括:According to a fifth aspect of the embodiments of the present disclosure, there is provided a training device, comprising:
处理器;用于存储处理器可执行指令的存储器;其中,所述处理器被配置为:执行第一方面或第一方面任意一种实施方式中所述的训练方法,或执行第二方面或第二方面任意一种实施方式中所述的训练方法。a processor; a memory for storing processor-executable instructions; wherein the processor is configured to: execute the first aspect or the training method described in any implementation manner of the first aspect, or execute the second aspect or The training method described in any one of the implementation manners of the second aspect.
根据本公开实施例的第六方面,提供一种非临时性计算机可读存储介质,当所述存储介质中的指令由移动终端的处理器执行时,使得移动终端能够执行第一方面或第一方面任意一种实施方式中所述的训练方法,或执行第二方面或第二方面任意一种实施方式中所述的训练方法。According to a sixth aspect of the embodiments of the present disclosure, there is provided a non-transitory computer-readable storage medium, which enables the mobile terminal to execute the first aspect or the first aspect when instructions in the storage medium are executed by a processor of a mobile terminal. The training method described in any one of the embodiments of the aspect, or the training method described in the second aspect or any one of the embodiments of the second aspect is performed.
The technical solutions provided by the embodiments of the present disclosure may have the following beneficial effects: in the present disclosure, the trained model is compressed and the parameters of the compressed model are updated, so that the compressed model can achieve the same effect as the training model. This reduces the signaling overhead when transmitting the model, ensures the accuracy and reliability of the model, and further ensures the security of user information.
应当理解的是,以上的一般描述和后文的细节描述仅是示例性和解释性的,并不能限制本公开。It is to be understood that the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the present disclosure.
此处的附图被并入说明书中并构成本说明书的一部分,示出了符合本公开的实施例,并与说明书一起用于解释本公开的原理。The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description serve to explain the principles of the disclosure.
FIG. 1 is a schematic diagram of the system architecture of a training method provided by the present disclosure.
图2是根据一示例性实施例示出的一种训练方法的流程图。Fig. 2 is a flowchart of a training method according to an exemplary embodiment.
图3是根据一示例性实施例示出的另一种训练方法的流程图。Fig. 3 is a flowchart of another training method according to an exemplary embodiment.
图4是根据一示例性实施例示出的又一种训练方法的流程图。Fig. 4 is a flowchart of yet another training method according to an exemplary embodiment.
图5是根据一示例性实施例示出的又一种训练方法的流程图。Fig. 5 is a flowchart of yet another training method according to an exemplary embodiment.
图6为本公开提供的一种训练方法中单训练节点模式确定第一压缩模型的实施方式流程图。FIG. 6 is a flowchart of an implementation manner of determining a first compression model in a single training node mode in a training method provided by the present disclosure.
图7为本公开提供的一种训练方法中多训练节点模式确定第一压缩模型的实施方式流程图。FIG. 7 is a flowchart of an implementation manner of determining a first compression model in a multi-training node mode in a training method provided by the present disclosure.
图8为本公开提供的一种训练方法中模型训练和压缩决策部分的协议和接口原理图。FIG. 8 is a schematic diagram of the protocol and interface of the model training and compression decision part in a training method provided by the present disclosure.
图9为本公开提供的一种训练方法中单训练节点模式下模型训练和压缩部分的协议和接口原理图。FIG. 9 is a schematic diagram of the protocol and interface of the model training and compression part in a single training node mode in a training method provided by the present disclosure.
图10为本公开提供的一种训练方法中多训练节点模式下模型训练和压缩部分的协议和接口原理图。FIG. 10 is a schematic diagram of a protocol and an interface of a model training and compression part in a multi-training node mode in a training method provided by the present disclosure.
图11为本公开提供的一种训练方法中无线数据传输部分的协议和接口原理图。FIG. 11 is a schematic diagram of a protocol and interface of a wireless data transmission part in a training method provided by the present disclosure.
图12是根据一示例性实施例示出的一种训练装置的框图。Fig. 12 is a block diagram of a training apparatus according to an exemplary embodiment.
图13是根据一示例性实施例示出的另一种训练装置的框图。Fig. 13 is a block diagram of another training apparatus according to an exemplary embodiment.
图14是根据一示例性实施例示出的一种用于训练的装置的框图。Fig. 14 is a block diagram of an apparatus for training according to an exemplary embodiment.
图15是根据一示例性实施例示出的另一种用于训练的装置的框图。Fig. 15 is a block diagram of another apparatus for training according to an exemplary embodiment.
这里将详细地对示例性实施例进行说明,其示例表示在附图中。下面的描述涉及附图时,除非另有表示,不同附图中的相同数字表示相同或相似的要素。以下示例性实施例中所描述的实施方式并不代表与本公开相一致的所有实施方式。相反,它们仅是与如所附权利要求书中所详述的、本公开的一些方面相一致的装置和方法的例子。Exemplary embodiments will be described in detail herein, examples of which are illustrated in the accompanying drawings. Where the following description refers to the drawings, the same numerals in different drawings refer to the same or similar elements unless otherwise indicated. The implementations described in the illustrative examples below are not intended to represent all implementations consistent with this disclosure. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present disclosure as recited in the appended claims.
To meet the needs of multi-service scenarios, communication networks feature ultra-high data rates, ultra-low latency, ultra-high reliability, and massive connectivity. These service scenarios, together with the corresponding requirements and network characteristics, pose unprecedented challenges to the deployment, operation, and maintenance of communication networks.
Breakthroughs in artificial intelligence technology, especially the enrichment of deep learning algorithms, the improvement of hardware computing capabilities, and the introduction of massive data into new-generation communication networks, provide strong support for new-generation network intelligence. Artificial intelligence can further improve the resource utilization of the communication network, improve the terminal service experience, and realize automated and intelligent control and management of the communication network.
In the related art, the process of training a model with a deep learning algorithm is as follows: the model requesting node determines the model structure and the model training mode according to the model/analysis subscription requirements, where the model training mode includes a single training node mode and a multi-training node mode. The model requesting node sends the model structure and the model training mode to the model training node, and the model training node either conducts model training independently or participates in collaborative training across multiple training nodes, depending on the training mode. After training is completed, each model training node sends its model to the model requesting node; in the multi-training node mode, the model requesting node performs federated averaging on the models sent by the model training nodes to obtain a global model. The model requesting node then checks whether the resulting model meets the model/analysis subscription requirements; if so, it sends the model to the model/analysis subscriber; if not, the above training process is repeated until the requirements are met. The related art therefore has the following shortcomings:
(1) The data volume of a model is relatively large; especially in the multi-training node mode, the model must be transmitted multiple times between the model training nodes and the model requesting node, which greatly increases the communication overhead.
(2) The large amount of model data transmitted between the model training node and the model requesting node aggravates the shortage of wireless resources, which increases the probability of data transmission errors and reduces the reliability of the model received by the model requesting node, so the model accuracy cannot be guaranteed.
(3) Sending the model trained on local data by the model training node to the model requesting node without any processing increases the risk that, if the model is maliciously intercepted in the network, information about the terminal and network data can be inferred from it in reverse, so the security of terminal private data cannot be guaranteed.
Based on the above shortcomings of the related art, the present disclosure provides a training method to address the problems of high communication overhead, insufficient model accuracy, and the security of terminal private data. The training method provided by the present disclosure determines the model structure and the model training mode according to network service requirements (for example, model subscription requirements), and, fully considering factors such as the computing power available locally at the model training node, its communication conditions, and the characteristics of its training samples, formulates multiple model compression options, so as to reduce unnecessary communication overhead, improve the utilization of wireless network resources, and apply deep learning to network intelligence in a more efficient and secure manner.
FIG. 1 is a schematic diagram of the system architecture of a training method provided by the present disclosure. As shown in FIG. 1, the system includes a core network part and a radio access network part. Terminals (users) access base stations through wireless channels; base stations are connected to each other through the Xn interface; a base station accesses the User Plane Function (UPF) network element of the core network through the N3 interface; the UPF network element accesses the Session Management Function (SMF) network element through the N4 interface; and the SMF network element accesses the bus structure of the core network and is connected with other Network Functions (NFs) of the core network.
It can be understood that the communication system of network devices and terminals shown in FIG. 1 is only a schematic illustration; the wireless communication system may also include other network devices, such as wireless relay devices and wireless backhaul devices, which are not shown in FIG. 1. The embodiments of the present disclosure do not limit the number of network devices and the number of terminals included in the wireless communication system.
It can be further understood that the wireless communication system of the embodiments of the present disclosure is a network that provides wireless communication functions. The wireless communication system may adopt different communication technologies, such as code division multiple access (CDMA), wideband code division multiple access (WCDMA), time division multiple access (TDMA), frequency division multiple access (FDMA), orthogonal frequency-division multiple access (OFDMA), single-carrier FDMA (SC-FDMA), and carrier sense multiple access with collision avoidance. According to factors such as capacity, rate, and latency, networks can be classified into 2G, 3G, 4G, or future evolved networks such as 5G networks; a 5G network may also be called a new radio (NR) network. For convenience of description, the present disclosure sometimes refers to the wireless communication network simply as the network.
Further, the network device involved in the present disclosure may also be referred to as a radio access network device. The radio access network device may be a base station, an evolved NodeB (eNB), a home base station, an access point (AP) in a wireless fidelity (WiFi) system, a wireless relay node, a wireless backhaul node, a transmission point (TP), or a transmission and reception point (TRP); it may also be a gNB in an NR system, or a component or part of the equipment constituting a base station, and so on. In a vehicle-to-everything (V2X) communication system, the network device may also be an in-vehicle device. It should be understood that the embodiments of the present disclosure do not limit the specific technology and specific device form adopted by the network device.
Further, the terminal involved in the present disclosure may also be referred to as a terminal device, user equipment (UE), a mobile station (MS), a mobile terminal (MT), and so on, and is a device that provides voice and/or data connectivity to a user; for example, the terminal may be a handheld device or an in-vehicle device with a wireless connection function. At present, examples of terminals include smartphones, pocket personal computers (PPCs), palmtop computers, personal digital assistants (PDAs), notebook computers, tablet computers, wearable devices, and in-vehicle devices. In addition, in a vehicle-to-everything (V2X) communication system, the terminal device may also be an in-vehicle device. It should be understood that the embodiments of the present disclosure do not limit the specific technology and specific device form adopted by the terminal.
图2是根据一示例性实施例示出的一种训练方法的流程图。如图2所示,训练方法用于第一节点中,包括以下步骤。Fig. 2 is a flow chart of a training method according to an exemplary embodiment. As shown in Figure 2, the training method is used in the first node and includes the following steps.
在步骤S11中,响应于接收到模型训练请求,训练第一训练模型。In step S11, in response to receiving a model training request, a first training model is trained.
In the embodiments of the present disclosure, the first node is a model training node, which for ease of description is referred to as the first node; similarly, the second node is a model requesting node, which for ease of description is referred to as the second node. The model training request includes model compression parameters.
该模型压缩参数中至少包括以下一种:The model compression parameters include at least one of the following:
模型训练结构,多个模型压缩选项,模型训练模式。Model training structure, multiple model compression options, model training modes.
In this embodiment of the present disclosure, the model compression options are determined based on the model subscription requirements received by the second node (for example, the model requesting node). The second node determines to send the model training request according to the received model subscription requirements. After receiving the model training request sent by the second node, the first node (for example, the model training node) sends third indication information in response to the model training request, determines to train the first training model based on the local sample parameter set and the model training structure, and determines the relevant parameters required for model compression. The response information sent by the first node further includes one or more of the first node's local computing capability, communication conditions, and characteristics of its training sample parameter set.
在步骤S12中,基于第一训练模型和模型压缩参数,得到第一训练模型的第一压缩模型。In step S12, a first compression model of the first training model is obtained based on the first training model and the model compression parameters.
在本公开实施例中,第一节点基于模型压缩参数中的模型压缩选项以及模型压缩所需的相关参数对第一训练模型进行压缩。其中,模型压缩所需的相关参数为第一节点基于第二节点发送的模型压缩参数以及本地计算能力等参数确定的,以及模型压缩选项中包括模型精度、模型参数数据量。模型压缩参数包括多个模型压缩选项,多个模型压缩选项是第二节点基于多个第一节点上报的本地计算能力、通信条件、训练样本参数集特性等其中的一种或几种确定的。In the embodiment of the present disclosure, the first node compresses the first training model based on the model compression option in the model compression parameters and the relevant parameters required for model compression. The relevant parameters required for model compression are determined by the first node based on parameters such as model compression parameters and local computing capabilities sent by the second node, and the model compression options include model accuracy and model parameter data volume. The model compression parameters include multiple model compression options, and the multiple model compression options are determined by the second node based on one or more of local computing capabilities, communication conditions, and training sample parameter set characteristics reported by multiple first nodes.
图3是根据一示例性实施例示出的一种训练方法的流程图。如图3所示,基于第一训练模型和模型压缩参数,得到第一训练模型的压缩模型,包括以下步骤。Fig. 3 is a flowchart of a training method according to an exemplary embodiment. As shown in FIG. 3 , based on the first training model and the model compression parameters, a compression model of the first training model is obtained, including the following steps.
在步骤S21中,在多个模型压缩选项中确定第一模型压缩选项,并基于第一模型压缩选项对第一训练模型进行压缩,得到第二压缩模型。In step S21, a first model compression option is determined among the multiple model compression options, and the first training model is compressed based on the first model compression option to obtain a second compression model.
In this embodiment of the present disclosure, the first node selects, from the multiple model compression options, a first model compression option according to one or more of its local computing capability, communication conditions, and training samples. Based on the model accuracy included in the first model compression option and the relevant parameters required for model compression determined during training, a matrix representing the contribution of each channel in the network to the model accuracy is determined and denoted by the symbol $g$. The first training model is compressed according to the model parameter data volume required by the compression option to obtain the second compression model, denoted by the symbol $\theta_S$. Compressing the first training model using the matrix $g$ and the first model compression option may be implemented as follows:
Taking the model parameter data volume as a constraint, the first node designs a pruning matrix $X$ to retain the channels that contribute most to the model accuracy. Taking the element sum of each column of the pruning matrix $X$ as the unknown, the first node retains, according to the magnitudes of the elements in each column of $g$, the channels corresponding to the largest column elements, and prunes the other channels. After the pruning matrix $X$ is obtained, $X$ is used to prune $\theta$ to obtain the second compression model $\theta_S$.
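To make the channel-selection step above concrete, here is a minimal sketch, assuming the column sums of $g$ have been collapsed into a per-channel contribution score and that a simple greedy selection satisfies the parameter-budget constraint; all function and variable names are illustrative, not defined by the present disclosure.

```python
import numpy as np

def build_pruning_mask(g: np.ndarray, param_budget: int,
                       params_per_channel: np.ndarray) -> np.ndarray:
    """Select channels with the largest contribution scores in g until the
    model parameter data volume constraint (param_budget) is reached.
    Returns a 0/1 mask X over channels: 1 = keep, 0 = prune."""
    order = np.argsort(g)[::-1]              # channels by descending contribution
    mask = np.zeros_like(g, dtype=np.int64)
    used = 0
    for ch in order:
        if used + params_per_channel[ch] <= param_budget:
            mask[ch] = 1                     # keep this channel
            used += params_per_channel[ch]
    return mask

def prune_weights(theta: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Apply the mask X to a weight matrix theta whose columns correspond
    to channels, yielding the compressed parameters theta_S."""
    return theta[:, mask.astype(bool)]

# Example: 6 channels, keep the highest-contribution channels within 3000 params.
g = np.array([0.9, 0.1, 0.5, 0.7, 0.2, 0.4])
params = np.array([1000, 1000, 1000, 1000, 1000, 1000])
X = build_pruning_mask(g, param_budget=3000, params_per_channel=params)
theta = np.random.randn(8, 6)
theta_S = prune_weights(theta, X)            # shape (8, 3)
```

In practice the constraint could be solved more exactly (for example, as a knapsack problem), but greedy selection by contribution score is a common approximation.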
In the embodiments of the present disclosure, the first node selects an appropriate model compression option, compresses the training model according to that option, and then transmits it to the second node. This reduces the data volume of the model as much as possible while retaining most of the accuracy of the deep learning model. Because model compression is performed according to the communication rate requirements of the model training node, this method greatly reduces the communication overhead of uplink model transmission.
在步骤S22中,根据第一训练模型的输出、第二压缩模型的输出、以及用于训练第一训练模型的样本参数集,确定第一损失函数。In step S22, a first loss function is determined according to the output of the first training model, the output of the second compression model, and the sample parameter set used for training the first training model.
In this embodiment of the present disclosure, the sample training parameter set further includes a sample verification parameter set, and at least one input-output data pair of the sample verification parameter set is determined. The first node feeds these input-output data pairs into the first training model and the second compression model, and obtains the output of the first training model, the output of the second compression model, and the corresponding output from the sample verification parameter set, where the latter is the true value corresponding to the model input.
Further, a first cross entropy between the output of the second compression model and the sample parameter set (that is, the true values of the input-output data pairs of the sample verification parameter set) is determined, a first relative entropy divergence between the output of the second compression model and the output of the first training model is determined, and the sum of the first cross entropy and the first relative entropy divergence is taken as the loss function of the second compression model. For ease of distinction, the present disclosure refers to the loss function of the second compression model as the first loss function. As in the above implementation, multiple first loss functions are determined based on multiple input-output data pairs in the sample parameter set, the average of the multiple first loss functions is computed, and the parameters of the second compression model are updated by gradient descent according to this average, yielding the first compression model.
The first loss function (that is, the loss function of the second compression model) is expressed by the following formula:

$$L_{\theta_S} = L_{C_S} + D_{KL}(p_S \,\|\, p_2)$$

where $L_{\theta_S}$ is the loss function of the second compression model; $L_{C_S}$ is the first cross entropy between the output value of the second compression model and the true values of the input-output data pairs of the sample verification parameter set; $D_{KL}(p_S \,\|\, p_2)$ is the first relative entropy divergence between the output value of the first training model and the output value of the second compression model; $p_S$ is the output value of the second compression model; and $p_2$ is the output value of the first training model.
In the embodiments of the present disclosure, it should be noted that the first training model here is the first training model whose parameters have been updated based on its own loss function. In other words, in the embodiments of the present disclosure, the loss function of the first training model is determined first, the parameters of the first training model are updated based on that loss function, and then the loss function of the second compression model (that is, the first loss function) is determined. For ease of distinction, the present disclosure refers to the loss function of the first training model as the second loss function. As described above, the sample training parameter set further includes a sample verification parameter set, and the first node determines at least one input-output data pair of the sample verification parameter set. The first node feeds the input of each data pair into the first training model and the second compression model, and obtains the output of the first training model, the output of the second compression model, and the corresponding output from the sample verification parameter set, where the latter is the true value corresponding to the model input.
Further, a second cross entropy between the output of the first training model and the sample parameter set (that is, the true values of the input-output data pairs of the sample verification parameter set) is determined, a second relative entropy divergence between the output of the first training model and the output of the second compression model is determined, and the sum of the second cross entropy and the second relative entropy divergence is taken as the second loss function. As in the above implementation, multiple second loss functions are determined based on multiple input-output data pairs in the sample parameter set, their average is computed, and the parameters of the first training model are updated by gradient descent according to this average, yielding the updated first training model.
其中,第二损失函数(即第一训练模型的损失函数)采用如下公式表示:Wherein, the second loss function (that is, the loss function of the first training model) is expressed by the following formula:
$$L_{\theta} = L_C + D_{KL}(p_1 \,\|\, p_S)$$

where $L_{\theta}$ is the loss function of the first training model; $L_C$ is the second cross entropy between the output value of the first training model and the true values of the input-output data pairs of the sample verification parameter set; $D_{KL}(p_1 \,\|\, p_S)$ is the relative entropy divergence between the output value of the first training model and the output value of the second compression model; $p_S$ is the output value of the second compression model; and $p_1$ is the output value of the first training model.
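The alternating update described above (first the training model with $L_{\theta}$, then the compressed model with $L_{\theta_S}$) resembles mutual distillation. The following PyTorch-style sketch shows one such step for a classification task; the module names, optimizers, and the use of softmax outputs are assumptions for illustration, not the normative procedure of the present disclosure.

```python
import torch
import torch.nn.functional as F

def mutual_update(model, compressed, batch_x, batch_y, opt_t, opt_s):
    """One alternating update: first the training model (teacher) with
    L_theta = CE + D_KL(p1 || pS), then the compressed model (student)
    with L_theta_S = CE + D_KL(pS || p2), as described in the text."""
    # --- update the first training model ---
    logits_t = model(batch_x)
    with torch.no_grad():
        p_s = F.softmax(compressed(batch_x), dim=-1)   # peer output held fixed
    p_t = F.softmax(logits_t, dim=-1)
    # F.kl_div(log_q, p) computes D_KL(p || q), hence the argument order below.
    loss_t = F.cross_entropy(logits_t, batch_y) \
           + F.kl_div(p_s.log(), p_t, reduction="batchmean")   # D_KL(p1 || pS)
    opt_t.zero_grad()
    loss_t.backward()
    opt_t.step()

    # --- update the second compression model with the refreshed teacher ---
    logits_s = compressed(batch_x)
    with torch.no_grad():
        p_t = F.softmax(model(batch_x), dim=-1)        # peer output held fixed
    p_s = F.softmax(logits_s, dim=-1)
    loss_s = F.cross_entropy(logits_s, batch_y) \
           + F.kl_div(p_t.log(), p_s, reduction="batchmean")   # D_KL(pS || p2)
    opt_s.zero_grad()
    loss_s.backward()
    opt_s.step()
    return loss_t.item(), loss_s.item()
```

Detaching the peer model's output when computing each KL term ensures that each step only updates the model whose loss is being minimized.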
在本公开实施例中,模型压缩参数中的模型训练模式包括用于训练单个第一训练模型的单训练节点模式和用于训练多个第一训练模型的多训练节点模式。In an embodiment of the present disclosure, the model training modes in the model compression parameters include a single training node mode for training a single first training model and a multi-training node mode for training multiple first training models.
The first node determines the number of first training models to train according to the model training mode included in the model training parameters. If the model training mode is the single training node mode, one first training model is trained based on a single first node, in the manner described above. If the model training mode is the multi-training node mode, multiple first training models are trained based on multiple first nodes, and different sequence labels are set for the multiple first nodes to train the multiple first training models. The multi-training node mode is described below, taking the m-th model training node (that is, the m-th first node) as an example.
In this embodiment of the present disclosure, the first node selects, from the multiple model compression options, a first model compression option according to one or more of its local computing capability, communication conditions, and training samples. Based on the model accuracy included in the first model compression option and the relevant parameters required for model compression determined during training, a matrix representing the contribution of each channel in the network to the model accuracy is determined and denoted by the symbol $g$. The first training model is compressed according to the model parameter data volume required by the compression option to obtain the second compression model, denoted by the symbol $\theta_S$. Compressing the first training model using the matrix $g$ and the first model compression option may be implemented as follows:
Taking the model parameter data volume as a constraint, the first node designs a pruning matrix $X$ to retain the channels that contribute most to the model accuracy. Taking the element sum of each column of the pruning matrix $X$ as the unknown, the first node retains, according to the magnitudes of the elements in each column of $g$, the channels corresponding to the largest column elements in $g$, and prunes the other channels. After the pruning matrix $X$ is obtained, $X$ is used to prune $\theta_m$ to obtain the m-th second compression model $\theta_S^m$.
In this embodiment of the present disclosure, the sample training parameter set further includes a sample verification parameter set, and at least one input-output data pair of the sample verification parameter set is determined. The first node feeds these input-output data pairs into the m-th first training model and the m-th second compression model, and obtains the output of the m-th first training model, the output of the m-th second compression model, and the corresponding output from the sample verification parameter set, where the latter is the true value corresponding to the model input.
Further, the m-th first cross entropy between the output of the m-th second compression model and the sample parameter set (that is, the true values of the input-output data pairs of the sample verification parameter set) is determined, the m-th first relative entropy divergence between the output of the m-th second compression model and the output of the m-th first training model is determined, and the sum of the m-th first cross entropy and the m-th first relative entropy divergence is taken as the loss function of the m-th second compression model. For ease of distinction, the present disclosure refers to the loss function of the m-th second compression model as the m-th first loss function. As in the above implementation, multiple m-th first loss functions are determined based on multiple input-output data pairs in the sample parameter set, their average is computed, and the parameters of the m-th second compression model are updated by gradient descent according to this average, yielding the m-th first compression model.
The m-th first loss function (that is, the loss function of the m-th second compression model) is expressed by the following formula:

$$L_{\theta_S^m} = L_{C_S^m} + D_{KL}(p_S^m \,\|\, p_m)$$

where $L_{\theta_S^m}$ is the loss function of the m-th second compression model; $L_{C_S^m}$ is the first cross entropy between the output value of the m-th second compression model and the true values of the input-output data pairs of the sample verification parameter set; $D_{KL}(p_S^m \,\|\, p_m)$ is the first relative entropy divergence between the output value of the m-th first training model and the output value of the m-th second compression model; $p_S^m$ is the output value of the m-th second compression model; and $p_m$ is the output value of the m-th first training model.
In the embodiments of the present disclosure, it should be noted that the m-th first training model here is the m-th first training model whose parameters have been updated based on its own loss function. In other words, in the embodiments of the present disclosure, the loss function of the m-th first training model is determined first, the parameters of the m-th first training model are updated based on that loss function, and then the loss function of the m-th second compression model (that is, the m-th first loss function) is determined. For ease of distinction, the present disclosure refers to the loss function of the m-th first training model as the m-th second loss function. As described above, the sample training parameter set further includes a sample verification parameter set, and the first node determines at least one input-output data pair of the sample verification parameter set. The first node feeds the input of each data pair into the m-th first training model and the m-th second compression model, and obtains the output of the m-th first training model, the output of the m-th second compression model, and the corresponding output from the sample verification parameter set, where the latter is the true value corresponding to the model input.
Further, the m-th second cross entropy between the output of the m-th first training model and the sample parameter set (that is, the true values of the input-output data pairs of the sample verification parameter set) is determined, the m-th second relative entropy divergence between the output of the m-th first training model and the output of the m-th second compression model is determined, and the sum of the m-th second cross entropy and the m-th second relative entropy divergence is taken as the m-th second loss function. As in the above implementation, multiple m-th second loss functions are determined based on multiple input-output data pairs in the sample parameter set, their average is computed, and the parameters of the m-th first training model are updated by gradient descent according to this average, yielding the updated m-th first training model.
The m-th second loss function (that is, the loss function of the m-th first training model) is expressed by the following formula:

$$L_{\theta_m} = L_C + D_{KL}(p_m \,\|\, p_S^m)$$

where $L_{\theta_m}$ is the loss function of the m-th first training model; $L_C$ is the second cross entropy between the output value of the m-th first training model and the true values of the input-output data pairs of the sample verification parameter set; $D_{KL}(p_m \,\|\, p_S^m)$ is the relative entropy divergence between the output value of the m-th first training model and the output value of the m-th second compression model; $p_S^m$ is the output value of the m-th second compression model; and $p_m$ is the output value of the m-th first training model.
Of course, in the embodiments of the present disclosure, other model compression methods, such as model sparsification and parameter quantization, may also be used to determine the first compression model, which is not specifically limited in the present disclosure.
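As a minimal sketch of one such alternative, the following illustrates uniform parameter quantization; the bit width and helper names are assumptions for illustration, not part of the present disclosure.

```python
import numpy as np

def quantize_uniform(theta: np.ndarray, num_bits: int = 8):
    """Uniformly quantize parameters to num_bits integers plus a scale,
    a simple alternative compression method to channel pruning."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = max(np.max(np.abs(theta)) / qmax, 1e-12)   # guard against all-zero
    q = np.clip(np.round(theta / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale                      # transmit q (int8) plus one float

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

theta = np.random.randn(4, 4).astype(np.float32)
q, s = quantize_uniform(theta)           # roughly 4x smaller than float32
theta_hat = dequantize(q, s)             # approximate reconstruction
```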
图4是根据一示例性实施例示出的一种训练方法的流程图。如图4所示,基于第一训练模型和模型压缩参数,得到第一训练模型的压缩模型,包括以下步骤。Fig. 4 is a flowchart showing a training method according to an exemplary embodiment. As shown in FIG. 4 , based on the first training model and the model compression parameters, a compression model of the first training model is obtained, including the following steps.
在步骤S31中,接收第四指示消息。In step S31, a fourth indication message is received.
In this embodiment of the present disclosure, the second node determines the first compression model according to the received second indication message. If there is a single first compression model, the second node determines whether this first compression model meets the model subscription requirements or the analysis subscription requirements. If there are multiple first compression models, the second node performs federated averaging on them to obtain a third compression model, also called the global model, and determines whether the third compression model meets the model subscription requirements or the analysis subscription requirements. In one implementation, if the single first compression model, or the third compression model obtained by federated averaging of multiple first compression models, does not meet the subscription requirements, a fourth indication message is sent, where the fourth indication message is used to indicate the third compression model. The first node receives the fourth indication message.
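To illustrate the federated averaging step, here is a minimal sketch, assuming the first compression models share an identical parameter layout (the same parameter names and shapes); the function name and the optional sample-count weighting are illustrative assumptions.

```python
import numpy as np

def federated_average(models: list[dict], weights: list[float] | None = None) -> dict:
    """Average a list of model state dicts (parameter name -> ndarray)
    to obtain the global (third compression) model."""
    n = len(models)
    w = np.full(n, 1.0 / n) if weights is None else np.asarray(weights) / np.sum(weights)
    return {
        name: sum(w[i] * models[i][name] for i in range(n))
        for name in models[0]
    }

# Example: three first compression models with identical structure.
m1 = {"layer.w": np.ones((2, 2)), "layer.b": np.zeros(2)}
m2 = {"layer.w": 3 * np.ones((2, 2)), "layer.b": np.ones(2)}
m3 = {"layer.w": 2 * np.ones((2, 2)), "layer.b": np.ones(2)}
global_model = federated_average([m1, m2, m3])   # third compression model
```

With uniform weights this reduces to a plain element-wise average; weighting by local sample counts is the common FedAvg variant.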
在步骤S32中,基于第三压缩模型,重新确定模型压缩参数,并基于重新确定的模型压缩参数更新第一压缩模型。In step S32, based on the third compression model, model compression parameters are re-determined, and the first compression model is updated based on the re-determined model compression parameters.
在本公开实施例中,第一节点根据第四指示消息指示的第三压缩模型,重新确定模型压缩参数,并基于重新确定的模型压缩参数更新第一压缩模型,确定第一压缩模型的损失函数,重新更新第一压缩模型的参数,直到第二节点确定满足模型订阅需求的压缩模型。In this embodiment of the present disclosure, the first node re-determines model compression parameters according to the third compression model indicated by the fourth indication message, updates the first compression model based on the re-determined model compression parameters, and determines the loss function of the first compression model , and re-update the parameters of the first compression model until the second node determines a compression model that satisfies the model subscription requirement.
In an exemplary embodiment of the present disclosure, in another implementation, when the second node determines that the first compression model meets the model subscription requirements, it determines to send a fifth indication message, where the fifth indication message is used to indicate the end of training of the first compression model. After receiving the fifth indication message, the first node determines that the training of the first compression model has ended, and the second node sends the determined compression model to the model subscriber.
在本公开实施例中,第一节点得到与模型训练模式对应的一个或多个第一压缩模型后,通过无线信道发送第二指示消息至第二节点。其中,第二指示消息包括与所述模型训练模式对应数量的第一压缩模型。In this embodiment of the present disclosure, after obtaining one or more first compressed models corresponding to the model training modes, the first node sends a second indication message to the second node through a wireless channel. Wherein, the second indication message includes the number of first compressed models corresponding to the model training mode.
The embodiments of the present disclosure solve the problem of the excessively large data volume of deep learning models, effectively alleviate the shortage of wireless resources, and reduce data transmission errors under network congestion, thereby improving the reliability of model transmission in the wireless network and guaranteeing the accuracy of the model. In the embodiments of the present disclosure, the model trained by the first node on its local training parameter set is compressed before being uploaded to the second node. This method not only keeps user private data local, but also greatly increases the difficulty of reverse inference on the model within the network, further ensuring the security of user information.
基于相同的/相似的构思,本公开实施例还提供一种训练方法。Based on the same/similar concept, the embodiments of the present disclosure also provide a training method.
图5是根据一示例性实施例示出的一种训练方法的流程图。如图5所示,训练方法用于第二节点中,包括以下步骤。Fig. 5 is a flowchart of a training method according to an exemplary embodiment. As shown in Figure 5, the training method is used in the second node and includes the following steps.
在步骤S41中,发送模型训练请求。In step S41, a model training request is sent.
在本公开实施例中,模型训练请求中包括模型压缩参数,模型压缩参数用于压缩第一训练模型得到第一压缩模型,第一训练模型基于模型训练请求训练得到。In the embodiment of the present disclosure, the model training request includes model compression parameters, and the model compression parameters are used to compress the first training model to obtain the first compression model, and the first training model is obtained by training based on the model training request.
In the embodiments of the present disclosure, the first node is a model training node, which for ease of description is referred to as the first node; similarly, the second node is a model requesting node, which for ease of description is referred to as the second node. The model training request includes model compression parameters.
该模型压缩参数中至少包括以下一种:The model compression parameters include at least one of the following:
模型训练结构,多个模型压缩选项,模型训练模式。Model training structure, multiple model compression options, model training modes.
In this embodiment of the present disclosure, the model compression options are determined based on the model subscription requirements received by the second node (for example, the model requesting node). The second node determines to send the model training request according to the received model subscription requirements. After receiving the model training request sent by the second node, the first node (for example, the model training node) sends third indication information in response to the model training request, determines to train the first training model based on the local sample parameter set and the model training structure, and determines the relevant parameters required for model compression. The response information sent by the first node further includes one or more of the first node's local computing capability, communication conditions, and characteristics of its training sample parameter set.
在本公开实施例中,第一节点基于模型压缩参数中的模型压缩选项以及模型压缩所需的相关参数对第一训练模型进行压缩。其中,模型压缩选项中包括模型精度、模型参数数据量。模型压缩参数包括多个模型压缩选项,多个模型压缩选项是第二节点基于多个第一节点上报的本地计算能力,通信条件、训练样本参数集特性等其中的一种或几种确定的。In the embodiment of the present disclosure, the first node compresses the first training model based on the model compression option in the model compression parameters and the relevant parameters required for model compression. Among them, the model compression options include model accuracy and model parameter data volume. The model compression parameters include multiple model compression options, and the multiple model compression options are determined by the second node based on one or more of local computing capabilities reported by multiple first nodes, communication conditions, and characteristics of the training sample parameter set.
在本公开实施例中,模型压缩参数中的模型训练模式包括用于训练单个第一训练模型的单训练节点模式和用于训练多个第一训练模型的多训练节点模式。In an embodiment of the present disclosure, the model training modes in the model compression parameters include a single training node mode for training a single first training model and a multi-training node mode for training multiple first training models.
The first node determines the number of first training models to train according to the model training mode included in the model training parameters. If the model training mode is the single training node mode, one first training model is trained based on a single first node, in the manner described above. If the model training mode is the multi-training node mode, multiple first training models are trained based on multiple first nodes, and different sequence labels are set for the multiple first nodes to train the multiple first training models.
In this embodiment of the present disclosure, after obtaining the one or more first compression models corresponding to the model training mode, the first node sends a second indication message to the second node through a wireless channel, where the second indication message includes the number of first compression models corresponding to the model training mode. The second node receives the second indication message, determines the number of first compression models corresponding to the model training mode, performs federated averaging on the received one or more first compression models to obtain the third compression model, and determines whether the third compression model meets the model subscription requirements or the analysis subscription requirements.
In the embodiments of the present disclosure, the subscription requirements may be issued by Operation, Administration and Maintenance (OAM) or by the core network. The subscription requirements include: an analysis ID, used to identify the analysis type of the model training request; a notification target address, used to associate the notifications received by the requested party with this subscription; analysis reporting information, including parameters such as the preferred analysis accuracy level and the analysis time interval; and, optionally, analysis filter information, indicating the conditions that the reported analysis information must satisfy.
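Purely for illustration, the subscription fields listed above might be carried in a structure like the following; every field name here is hypothetical rather than a normative signaling definition.

```python
from dataclasses import dataclass

@dataclass
class AnalyticsSubscription:
    """Illustrative container for the subscription fields described above."""
    analysis_id: str                      # identifies the analysis type of the request
    notification_target: str              # address used to correlate notifications
    preferred_accuracy_level: str         # e.g. "high" / "medium" (assumed values)
    analysis_interval_s: int              # analysis time interval in seconds
    analysis_filter: dict | None = None   # optional reporting conditions

sub = AnalyticsSubscription(
    analysis_id="load-prediction",
    notification_target="oam://node-42",
    preferred_accuracy_level="high",
    analysis_interval_s=600,
    analysis_filter={"cell_id": [1001, 1002]},
)
```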
In one implementation of the embodiments of the present disclosure, if a single first compression model, or the third compression model obtained by federated averaging of multiple first compression models, does not meet the subscription requirements, a fourth indication message is sent, where the fourth indication message is used to indicate the third compression model. The first node receives the fourth indication message.
In an exemplary embodiment of the present disclosure, in another implementation, when the second node determines that the first compression model meets the model subscription requirements, it determines to send a fifth indication message, where the fifth indication message is used to indicate the end of training of the first compression model. After receiving the fifth indication message, the first node determines that the training of the first compression model has ended, and the second node sends the determined compression model to the model subscriber.
在本公开实施例中,第二节点接收OAM或核心网发送的订阅需求,基于接收的订阅需求确定发送模型训练请求。In the embodiment of the present disclosure, the second node receives the subscription requirement sent by the OAM or the core network, and determines to send the model training request based on the received subscription requirement.
In the embodiments of the present disclosure, the first node and the second node may be deployed between base stations, between a base station and a terminal, or between a base station and the core network. For example, in one application environment the first node is a terminal and the second node is a base station; in another, the first node is a base station and the second node is also a base station; in yet another, the first node is a base station and the second node is a core network node. Of course, these are merely examples of application environments of the first node and the second node of the present disclosure, and the application environment of the specific implementation is not specifically limited in the present disclosure.
在本公开实施例中,第一节点得到与模型训练模式对应的一个或多个第一压缩模型后,通过无线信道发送第二指示消息至第二节点。其中,第二指示消息包括与所述模型训练模式对应数量的第一压缩模型。In this embodiment of the present disclosure, after obtaining one or more first compressed models corresponding to the model training modes, the first node sends a second indication message to the second node through a wireless channel. Wherein, the second indication message includes the number of first compressed models corresponding to the model training mode.
The embodiments of the present disclosure solve the problem of the excessively large data volume of deep learning models, effectively alleviate the shortage of wireless resources, and reduce data transmission errors under network congestion, thereby improving the reliability of model transmission in the wireless network and guaranteeing the accuracy of the model. In the embodiments of the present disclosure, the model trained by the first node on its local training parameter set is compressed before being uploaded to the second node. This method not only keeps user private data local, but also greatly increases the difficulty of reverse inference on the model within the network, further ensuring the security of user information.
在本公开一些实施例中,将第一节点称为模型训练节点,第二节点称为模型请求节点。以模型训练节点和模型请求节点交互的方式对本公开进行进一步说明。In some embodiments of the present disclosure, the first node is referred to as a model training node, and the second node is referred to as a model request node. The present disclosure is further described in terms of interaction between model training nodes and model request nodes.
图6为本公开提供的一种训练方法中单训练节点模式确定第一压缩模型的实施方式流程图。如图6所示,模型请求节点向模型训练节点发起模型训练请求。FIG. 6 is a flowchart of an implementation manner of determining a first compression model in a single training node mode in a training method provided by the present disclosure. As shown in Figure 6, the model request node initiates a model training request to the model training node.
模型训练节点将本地计算能力,通信条件和训练样本参数集特性发送给模型请求节点。The model training node sends the local computing power, communication conditions and training sample parameter set characteristics to the model requesting node.
模型请求节点依据模型/分析订阅需求确定模型结构和模型训练模式,依据模型训练节 点上报的信息提出多种模型压缩选项,包含模型精度、模型参数数据量。The model request node determines the model structure and model training mode according to the model/analysis subscription requirements, and proposes a variety of model compression options based on the information reported by the model training node, including model accuracy and model parameter data volume.
模型请求节点将模型结构、模型训练模式以及模型压缩选项发送给模型训练节点,模型训练节点选择合适的模型压缩选项。The model request node sends the model structure, model training mode, and model compression options to the model training node, and the model training node selects an appropriate model compression option.
模型训练节点采用本地样本参数集进行模型训练,得到第一训练模型,以及模型压缩所需的相关参数。The model training node uses the local sample parameter set for model training to obtain the first training model and related parameters required for model compression.
模型训练节点根据所选模型压缩选项以及模型压缩所需的相关参数,对第一训练模型进行压缩,得到第一压缩模型,并将第一压缩模型通过无线信道传输给模型请求节点。The model training node compresses the first training model according to the selected model compression option and relevant parameters required for model compression to obtain a first compressed model, and transmits the first compressed model to the model requesting node through a wireless channel.
当模型请求节点所得第一压缩模型满足模型/分析订阅需求时,模型训练过程结束,模型请求节点将模型上报给模型/分析订阅方。When the first compressed model obtained by the model request node meets the model/analysis subscription requirements, the model training process ends, and the model request node reports the model to the model/analysis subscriber.
FIG. 7 is a flowchart of an implementation of determining the first compression model in the multi-training-node mode in a training method provided by the present disclosure. As shown in FIG. 7, the model requesting node initiates a model training request to the model training nodes.
Each model training node sends its local computing capability, communication conditions, and training sample parameter set characteristics to the model requesting node.
The model requesting node determines the model structure and the model training mode according to the model/analytics subscription requirements, and proposes multiple model compression options, including model accuracy and model parameter data volume, based on the information reported by the model training nodes.
The model requesting node sends the model structure, the model training mode, and the model compression options to the model training nodes, and each model training node selects a suitable model compression option.
Each model training node performs model training with its local sample parameter set to obtain a first training model and the relevant parameters required for model compression.
Each model training node compresses its first training model according to the selected model compression option and the relevant parameters required for model compression to obtain a first compression model, and transmits the first compression model to the model requesting node over a wireless channel.
The model requesting node performs federated averaging on the first compression models sent by the model training nodes to obtain a global model.
The model requesting node determines whether the global model meets the model/analytics subscription requirements.
If the global model meets the model/analytics subscription requirements, the model training process ends and the model requesting node reports the global model to the model/analytics subscriber. If the global model does not meet the model/analytics subscription requirements, the model training nodes reselect a suitable model compression option and update the first compression model according to the re-determined model compression option.
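To make the federated averaging step concrete, the following is a minimal sketch in Python. The per-node sample counts used as averaging weights, and all function and variable names, are illustrative assumptions; the disclosure only states that the first compression models are federated-averaged into a global model.

```python
# Minimal sketch of federated averaging (FIG. 7, multi-training-node mode).
# Weighting by local sample count is an assumption, not stated in the disclosure.
from typing import Dict, List
import numpy as np

def federated_average(models: List[Dict[str, np.ndarray]],
                      sample_counts: List[int]) -> Dict[str, np.ndarray]:
    """Combine the first compression models uploaded by the training nodes."""
    total = float(sum(sample_counts))
    return {
        name: sum((n / total) * m[name] for m, n in zip(models, sample_counts))
        for name in models[0]
    }

# Example: two training nodes upload compressed model parameters.
node_a = {"w": np.array([1.0, 2.0]), "b": np.array([0.5])}
node_b = {"w": np.array([3.0, 4.0]), "b": np.array([1.5])}
global_model = federated_average([node_a, node_b], sample_counts=[100, 300])
```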
FIG. 8 is a schematic diagram of the protocol and interfaces of the model training and compression decision part in a training method provided by the present disclosure. As shown in FIG. 8, it involves the service management module and the network communication module in the model requesting node, as well as the network communication module, the model training and compression module, and the data processing and storage module in the model training node apparatus. These modules exchange information through the following steps.
In the embodiments of the present disclosure, step 1 includes steps 1a-1c. In step 1a, the service management module of the model requesting node sends model training request signaling to the network communication module of the model requesting node; the signaling indicates that a model training request is to be initiated to the model training node. In step 1b, the network communication module of the model requesting node sends the model training request signaling to the network communication module of the model training node. In step 1c, the network communication module of the model training node sends model training request response signaling to the network communication module of the model requesting node; the signaling indicates acceptance of the model training request.
Step 2 includes steps 2a-2c. In step 2a, the model training and compression module of the model training node sends computing capability information reporting signaling to the network communication module of the model training node; the signaling indicates that the computing capability information of the model training node device is to be reported to the receiver. In step 2b, the data processing and storage module of the model training node sends training sample characteristic information reporting signaling to the network communication module of the model training node; the signaling indicates that the characteristic information of the local training samples of the model training node is to be reported to the receiver. In step 2c, the network communication module of the model training node sends computing capability and training sample characteristic information reporting signaling to the network communication module of the model requesting node; the signaling indicates that the computing capability of the model training node and the characteristic information of its local training samples are to be reported to the receiver.
In step 3, if the model training node is a terminal and the model requesting node is a base station, the network communication module of the model training node measures the channel quality indication (Channel Quality Indication, CQI) and sends CQI reporting signaling to the network communication module of the model requesting node; the signaling indicates that CQI measurement is performed and the CQI information is reported to the receiver.
In step 4, the network communication module of the model requesting node sends the model training node's computing capability, training sample characteristics, and (optionally) CQI information to the service management module of the model requesting node; the signaling indicates that the received computing capability, training sample characteristics, and optional CQI information are aggregated and sent to the receiver.
In step 5, the service management module of the model requesting node determines the model structure and the model training mode according to the model/analytics subscription requirements.
In step 6, the service management module of the model requesting node proposes multiple model compression options based on the information reported by the model training node.
Step 7 includes steps 7a-7b. In step 7a, the service management module of the model requesting node sends model structure and model training mode signaling to the network communication module of the model requesting node; the signaling indicates that the model structure and the model training mode are to be sent to the receiver. In step 7b, the service management module of the model requesting node sends model compression option signaling to the network communication module of the model requesting node; the signaling indicates that the multiple model compression options are to be sent to the receiver.
Step 8 includes steps 8a-8b. In step 8a, the network communication module of the model requesting node sends the model structure and model training mode signaling to the network communication module of the model training node. In step 8b, the network communication module of the model requesting node sends the model compression option signaling to the network communication module of the model training node.
Step 9 includes steps 9a-9b. In step 9a, the network communication module of the model training node sends the model structure and model training mode signaling to the model training and compression module of the model training node. In step 9b, the network communication module of the model training node sends the model compression option signaling to the model training and compression module of the model training node.
In step 10, the model training node selects a suitable model compression option according to its locally available computing power, real-time communication conditions, and training sample characteristics.
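A possible realization of this selection step is sketched below. The option fields, the CQI-based payload limit, and the cost heuristic are all assumptions introduced for illustration; the disclosure only specifies that the choice depends on locally available computing power, real-time communication conditions, and training sample characteristics.

```python
# Minimal sketch of step 10: selecting a model compression option.
# Fields and heuristics are illustrative assumptions.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class CompressionOption:
    accuracy: float      # expected model accuracy after compression
    param_bytes: int     # model parameter data volume

def payload_limit(cqi: int) -> int:
    # Assumed mapping: a better channel supports a larger model upload.
    return 50_000 * max(cqi, 1)

def select_option(options: List[CompressionOption], flops_budget: float,
                  cqi: int, n_samples: int) -> Optional[CompressionOption]:
    feasible = [o for o in options
                if o.param_bytes <= payload_limit(cqi)
                and o.param_bytes * n_samples <= flops_budget]
    if feasible:
        # Prefer the most accurate option that fits compute and channel budgets.
        return max(feasible, key=lambda o: o.accuracy)
    # Otherwise fall back to the smallest model, if any option exists at all.
    return min(options, key=lambda o: o.param_bytes) if options else None
```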
FIG. 9 is a schematic diagram of the protocol and interfaces of the model training and compression part in the single-training-node mode in a training method provided by the present disclosure. As shown in FIG. 9, it involves the data processing and storage module, the model training and compression module, and the network communication module in the model training node, as well as the network communication module and the service management module in the model requesting node apparatus. These modules exchange messages through the following steps.
Step 1 includes steps 1a-1b. In step 1a, the model training and compression module of the model training node sends local training data set request signaling to the data processing and storage module of the model training node; the signaling indicates a request to collect a training data set from local data. In step 1b, the data processing and storage module of the model training node sends local training data set signaling to the model training and compression module of the model training node; the signaling indicates that data is collected from local data to generate a training data set, which is sent to the receiver.
In step 2, the model training and compression module of the model training node performs model training with the local training data set to obtain the training model and the relevant parameters required for model compression.
In step 3, the model training node compresses the original training model according to the selected model compression option and the relevant parameters required for model compression to obtain a compressed model.
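As one concrete example of this compression step, the sketch below applies magnitude pruning, one compression technique contemplated for deep models, to a weight matrix. Using a keep ratio to stand in for the parameter data volume target of the selected compression option is an assumption for illustration:

```python
# Minimal sketch of step 3: compressing a trained model by magnitude pruning.
# The keep ratio as the compression target is an illustrative assumption.
import numpy as np

def prune_by_magnitude(weights: np.ndarray, keep_ratio: float) -> np.ndarray:
    """Zero out the smallest-magnitude weights, keeping `keep_ratio` of them."""
    k = max(1, int(weights.size * keep_ratio))
    threshold = np.sort(np.abs(weights).ravel())[-k]  # k-th largest magnitude
    return np.where(np.abs(weights) >= threshold, weights, 0.0)

w = np.random.randn(8, 8)
w_pruned = prune_by_magnitude(w, keep_ratio=0.25)  # keep the largest 25% of weights
```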
Step 4 includes steps 4a-4c. In step 4a, the model training and compression module of the model training node sends the compressed model to the network communication module of the model training node. In step 4b, the network communication module of the model training node sends the compressed model to the network communication module of the model requesting node. In step 4c, the network communication module of the model requesting node sends the compressed model to the service management module of the model requesting node.
In step 5, the service management module of the model requesting node determines whether the obtained model meets the model/analytics subscription requirements. If so, step 6 is performed.
In step 6, the service management module of the model requesting node sends model training end notification signaling to the network communication module of the model training node via the network communication module of the model requesting node. This procedure and the corresponding signaling are newly added by the present invention; the signaling indicates that the model training node is notified to end the model training process.
Otherwise, step 6a is performed.
In step 6a, the service management module of the model requesting node sends model training continuation notification signaling to the network communication module of the model training node via the network communication module of the model requesting node. This procedure and the corresponding signaling are newly added by the present invention; the signaling indicates that the model training process is to continue. In step 6b, the network communication module of the model training node sends the model training continuation notification signaling to the model training and compression module of the model training node.
In step 7, the model training and compression module of the model training node trains the compressed model with the local training data set, and steps 4a-7 are repeated until the model obtained by the model requesting node meets the model/analytics subscription requirements.
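The iteration of steps 4a-7 can be summarized as the loop below. The helper callables and the round cap are illustrative stubs, not interfaces defined by the disclosure:

```python
# Minimal sketch of the FIG. 9 iteration: train, compress, upload, and repeat
# until the requesting node's subscription check passes.
from typing import Any, Callable

def run_single_node(train_step: Callable[[Any], Any],
                    compress: Callable[[Any], Any],
                    upload: Callable[[Any], None],
                    subscription_satisfied: Callable[[Any], bool],
                    max_rounds: int = 100) -> Any:
    model = train_step(None)                    # step 2: initial local training
    compressed = compress(model)                # step 3
    for _ in range(max_rounds):
        upload(compressed)                      # steps 4a-4c
        if subscription_satisfied(compressed):  # step 5
            return compressed                   # step 6: training ends
        model = train_step(compressed)          # steps 6a-7: continue training
        compressed = compress(model)
    return compressed
```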
FIG. 10 is a schematic diagram of the protocol and interfaces of the model training and compression part in the multi-training-node mode in a training method provided by the present disclosure. As shown in FIG. 10, it involves the data processing and storage module, the model training and compression module, and the network communication module in each model training node, as well as the network communication module, the model calculation and update module, and the service management module in the model requesting node apparatus. These modules exchange information through the following steps.
Step 1 includes steps 1a-1b. In step 1a, the model training and compression module of the model training node sends local training data set request signaling to the data processing and storage module of the model training node. In step 1b, the data processing and storage module of the model training node sends local training data set signaling to the model training and compression module of the model training node.
In step 2, the model training and compression module of the model training node performs model training with the local training data set to obtain the first training model and the relevant parameters required for model compression.
In step 3, the model training node compresses the first training model according to the selected model compression option and the relevant parameters required for model compression to obtain the first compression model.
Step 4 includes steps 4a-4c. In step 4a, the model training and compression module of the model training node sends the first compression model to the network communication module of the model training node. In step 4b, the network communication module of the model training node sends the first compression model to the network communication module of the model requesting node. In step 4c, the network communication module of the model requesting node sends the first compression model to the model calculation and update module of the model requesting node.
In step 5, the model calculation and update module of the model requesting node aggregates the first compression models sent by the model training nodes and performs federated averaging to obtain a global model.
In step 6, the model calculation and update module of the model requesting node sends the global model to the service management module of the model requesting node.
In step 7, the service management module of the model requesting node determines whether the obtained model meets the model/analytics subscription requirements. If so, the following is performed:
In step 8, the service management module of the model requesting node sends model training end notification signaling to the network communication module of the model training node via the network communication module of the model requesting node.
Otherwise, steps 8a-8b are performed.
In step 8a, the service management module of the model requesting node sends model training continuation notification signaling to the network communication module of the model training node via the network communication module of the model requesting node, and distributes the global model to the network communication module of the model training node via the network communication module of the model requesting node. In step 8b, the network communication module of the model training node sends the model training continuation notification signaling to the model training and compression module of the model training node, and sends the global model to the model training and compression module of the model training node.
In step 9, the model training and compression module of the model training node performs model training and compression on the global model sent by the model requesting node with the local training data set, and steps 4a-9 are repeated until the model obtained by the model requesting node meets the model/analytics subscription requirements.
FIG. 11 is a schematic diagram of the protocol and interfaces of the wireless data transmission part in a training method provided by the present disclosure. As shown in FIG. 11, it involves the model training and compression module, the transmission control module, and the network communication module in the model training node, as well as the network communication module, the transmission control module, and the service management module in the model requesting node. This part applies to the scenario in which the model requesting node is a base station and the model training node is a terminal. The modules exchange information through the following steps.
In step 1, the model training and compression module of the model training node sends the compressed model to the transmission control module of the model training node.
In step 2, the network communication module of the model training node measures the CQI and sends CQI reporting signaling to the transmission control module of the model training node.
In step 3, the transmission control module of the model training node formulates a data transmission scheme according to the compression characteristics and the wireless communication conditions.
In step 4, the transmission control module of the model training node sends data transmission scheme information signaling to the network communication module of the model training node. This procedure and the corresponding signaling are newly added by the present invention; the signaling indicates that the data transmission scheme information, including the modulation scheme, code rate, and other information, is sent to the receiver.
In step 5, the model training and compression module of the model training node sends the compressed model to the network communication module of the model training node.
In step 6, the network communication module of the model training node encapsulates and packs the compressed model according to the data transmission scheme.
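Steps 3-6 can be pictured with the toy mapping below from a reported CQI to a modulation scheme and code rate, followed by packetization of the compressed model. The CQI thresholds, the modulation/code-rate table, and the packet size are simplified illustrations and are not taken from the disclosure:

```python
# Minimal sketch of steps 3-6: derive a transmission scheme from the CQI and
# pack the compressed model accordingly. The table below is illustrative only.
from typing import Dict, List

def make_scheme(cqi: int) -> Dict[str, object]:
    # Higher CQI -> denser modulation and higher code rate (assumed thresholds).
    if cqi >= 10:
        return {"modulation": "64QAM", "code_rate": 0.75}
    if cqi >= 7:
        return {"modulation": "16QAM", "code_rate": 0.50}
    return {"modulation": "QPSK", "code_rate": 0.33}

def pack(model_bytes: bytes, packet_size: int = 1500) -> List[bytes]:
    """Split the compressed model into packets sized for the radio link."""
    return [model_bytes[i:i + packet_size]
            for i in range(0, len(model_bytes), packet_size)]

scheme = make_scheme(cqi=9)        # -> 16QAM at code rate 0.50
packets = pack(b"\x00" * 4000)     # -> three packets of up to 1500 bytes
```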
Step 7 includes steps 7a-7d. In step 7a, the network communication module of the model training node transmits the compressed model data packets to the network communication module of the model requesting node. In step 7b, the network communication module of the model requesting node sends the compressed model to the transmission control module of the model requesting node; at this point the decapsulated data is transmitted. In step 7c, the transmission control module of the model requesting node sends correct-data acknowledgment signaling to the network communication module of the model requesting node; the signaling indicates that the receiver is notified that the correct data has been received. In step 7d, the network communication module of the model requesting node sends the correct-data acknowledgment signaling to the network communication module of the model training node.
In step 8, the transmission control module of the model requesting node sends the compressed model to the service management module of the model requesting node. In the single-training-node mode, the compressed model is sent directly to the service management module of the model requesting node; in the multi-training-node mode, the global model is first obtained via the model calculation and update module of the model requesting node and then sent to the service management module of the model requesting node.
In step 9, the service management module of the model requesting node determines whether the obtained model meets the model/analytics subscription requirements. If so, steps 10a1-10b1 are performed.
In step 10a1, the service management module of the model requesting node sends model training end notification signaling to the transmission control module of the model requesting node. In step 10b1, the network communication module of the model requesting node sends the model training end notification signaling to the network communication module of the model training node.
Otherwise, steps 10a2-10b2 are performed.
In step 10a2, the service management module of the model requesting node sends model training continuation notification signaling to the transmission control module of the model requesting node; the signaling indicates that the model training node is notified to continue the model training process. In step 10b2, the network communication module of the model requesting node sends the model training continuation notification signaling to the network communication module of the model training node.
In the single-training-node mode, only the model training continuation notification signaling needs to be sent; in the multi-training-node mode, the global model also needs to be distributed to the model training nodes.
The protocol and interface principles for distributing the global model are similar to steps 1-7 above, with the sending side replaced by the model requesting node, the receiving side replaced by the model training node, and the compressed model replaced by the global model. In addition, for the CQI measurement and reporting process of step 2, the model requesting node should initiate a CQI measurement request to the model training node, and the model training node performs the CQI measurement and feeds the result back to the model requesting node.
Based on the same concept, an embodiment of the present disclosure further provides a training apparatus.
It can be understood that, in order to realize the above functions, the training apparatus provided by the embodiments of the present disclosure includes corresponding hardware structures and/or software modules for performing the respective functions. In combination with the units and algorithm steps of the examples disclosed in the embodiments of the present disclosure, the embodiments of the present disclosure can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the specific application and design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementation should not be considered to go beyond the scope of the technical solutions of the embodiments of the present disclosure.
FIG. 12 is a block diagram of a training apparatus 100 according to an exemplary embodiment. Referring to FIG. 12, the apparatus is applied to a first node and includes a model training and compression module 110, a first network communication module 120, a first transmission control module 130, and a data processing and storage module 140.
The model training and compression module 110 is configured to train a first training model in response to receiving a model training request, where the model training request includes model compression parameters, and to obtain a first compression model of the first training model based on the first training model and the model compression parameters.
In the embodiments of the present disclosure, the model compression parameters include multiple model compression options.
The model training and compression module 110 is configured to determine a first model compression option among the multiple model compression options, compress the first training model based on the first model compression option to obtain a second compression model, determine a first loss function according to the output of the first training model, the output of the second compression model, and the sample parameter set used to train the first training model, and update the parameters of the second compression model based on the first loss function to obtain the first compression model.
In the embodiments of the present disclosure, the apparatus further includes a data processing and storage module 140.
The data processing and storage module 140 is configured to determine a first cross-entropy between the output of the second compression model and the sample parameter set, determine a first relative entropy divergence between the output of the second compression model and the output of the first training model, and determine the first loss function based on the first cross-entropy and the first relative entropy divergence.
In the embodiments of the present disclosure, the data processing and storage module 140 is further configured to determine, according to the output of the first training model, the output of the second compression model, and the sample parameter set used to train the first training model, a second loss function for updating the parameters of the first training model.
In the embodiments of the present disclosure, the data processing and storage module 140 is further configured to determine a second cross-entropy between the output of the first training model and the sample parameter set, determine a second relative entropy divergence between the output of the first training model and the output of the second compression model, and determine the second loss function based on the second cross-entropy and the second relative entropy divergence.
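Written out, the two loss functions described above take the following form, where $y$ denotes the labels in the sample parameter set, $p_t$ and $p_s$ denote the outputs of the first training model and the second compression model respectively, $H$ is the cross-entropy, and $D_{\mathrm{KL}}$ is the relative entropy (KL divergence). The weighting coefficients $\alpha$ and $\beta$ are assumptions introduced here; the disclosure does not specify how the two terms are combined:

```latex
\mathcal{L}_1 = H\left(y,\, p_s\right) + \alpha \, D_{\mathrm{KL}}\left(p_s \,\|\, p_t\right),
\qquad
\mathcal{L}_2 = H\left(y,\, p_t\right) + \beta \, D_{\mathrm{KL}}\left(p_t \,\|\, p_s\right).
```

Here $\mathcal{L}_1$ updates the second compression model (yielding the first compression model) and $\mathcal{L}_2$ updates the first training model, so that the compressed model and the original training model regularize each other during training.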
In the embodiments of the present disclosure, the model compression parameters include a model training mode, and the model training mode includes a single-training-node mode for training a single first training model and a multi-training-node mode for training multiple first training models. The number of first training models is determined based on the model training mode.
In the embodiments of the present disclosure, the apparatus further includes a first network communication module 120.
The first network communication module 120 is configured to send a second indication message, where the second indication message includes a number of first compression models corresponding to the model training mode.
In the embodiments of the present disclosure, the first network communication module 120 is further configured to receive a third indication message, where the third indication message includes a training model determination indication.
In the embodiments of the present disclosure, the first network communication module 120 is further configured to receive a fourth indication message. The fourth indication message is used to indicate a third compression model, where the third compression model is a compression model obtained by performing federated averaging on the first training models based on the number of first compression models. Based on the third compression model, the model compression parameters are re-determined, and the first compression model is updated based on the re-determined model compression parameters.
In the embodiments of the present disclosure, the first network communication module 120 is further configured to receive a fifth indication message, where the fifth indication message is used to indicate the end of training the first compression model.
The first network communication module 120 is configured to perform data transmission and control signaling interaction between the model requesting node and the model training node.
The first transmission control module 130 is configured to formulate a data transmission scheme according to the characteristics of the data to be transmitted and the wireless communication conditions, and to pack the data to be transmitted according to the data transmission scheme. The transmission control module is needed only in embodiments in which the model requesting node is a base station and the model training node is a user terminal.
The data processing and storage module 140 is configured to manage local data, generate training sample characteristic information, collect data to generate a local training data set, and store the data set.
The model training and compression module 110 is configured to perform model training with the local data set, and to compress the model according to the information required for model compression obtained during the training process.
FIG. 13 is a block diagram of a training apparatus 200 according to an exemplary embodiment. Referring to FIG. 13, the apparatus is applied to a second node and includes a second network communication module 210, a second transmission control module 220, a service management module 230, and a model calculation and update module 240.
The second network communication module 210 is configured to send a model training request, where the model training request includes model compression parameters, the model compression parameters are used to compress a first training model to obtain a first compression model, and the first training model is obtained by training based on the model training request.
In the embodiments of the present disclosure, the model compression parameters include a model training mode, and the model training mode includes a single-training-node mode for training a single first training model and a multi-training-node mode for training multiple first training models.
The number of first training models is determined based on the model training mode.
In the embodiments of the present disclosure, the second network communication module 210 is further configured to receive a second indication message, where the second indication message includes a number of first compression models corresponding to the model training mode.
In the embodiments of the present disclosure, the second network communication module 210 is further configured to send a third indication message, where the third indication message includes a training model determination indication.
In the embodiments of the present disclosure, the model training mode includes the multi-training-node mode, and the second network communication module 210 is further configured to send a fourth indication message. The fourth indication message is used to indicate a third compression model, where the third compression model is a compression model obtained by performing federated averaging on the first compression models based on the number of first training models.
In the embodiments of the present disclosure, the second network communication module 210 is further configured to send a fifth indication message, where the fifth indication message is used to indicate the end of training the first compression model.
In the embodiments of the present disclosure, the apparatus further includes a service management module 230.
The service management module 230 is configured to receive subscription requirements and send the model training request based on the subscription requirements.
The second network communication module 210 is configured to perform data transmission and control signaling interaction between the model requesting node and the model training node.
The second transmission control module 220 is configured to formulate a data transmission scheme according to the characteristics of the data to be transmitted and the wireless communication conditions, and to pack the data to be transmitted according to the data transmission scheme. The transmission control module is needed only in embodiments in which the model requesting node is a base station and the model training node is a user terminal.
The service management module 230 is configured to process model/analytics subscription requests, initiate model training requests to the model training nodes, formulate the model structure, model training mode, and model compression options, and check whether the obtained model meets the model/analytics subscription requirements.
The model calculation and update module 240 is configured to, in the multi-training-node mode, perform federated averaging on the compressed models sent by the multiple model training nodes to obtain a global model, and to distribute the global model to the model training nodes.
In an embodiment of the present invention, a model training node apparatus for wireless-network-oriented deep learning model training and compression is responsible for responding to the model training request of the model requesting node, reporting local resource information, selecting a suitable model compression option, and performing model training and compression according to the model training mode and the selected model compression option.
With regard to the apparatus in the above embodiments, the specific manner in which each module performs its operations has been described in detail in the embodiments of the method, and is not elaborated here.
FIG. 14 is a block diagram of an apparatus 300 for training according to an exemplary embodiment. For example, the apparatus 300 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, or the like.
Referring to FIG. 14, the apparatus 300 may include one or more of the following components: a processing component 302, a memory 304, a power component 306, a multimedia component 308, an audio component 310, an input/output (I/O) interface 312, a sensor component 314, and a communication component 316.
The processing component 302 generally controls the overall operation of the apparatus 300, such as operations associated with display, telephone calls, data communication, camera operation, and recording operation. The processing component 302 may include one or more processors 320 to execute instructions to complete all or some of the steps of the above method. In addition, the processing component 302 may include one or more modules to facilitate interaction between the processing component 302 and the other components. For example, the processing component 302 may include a multimedia module to facilitate interaction between the multimedia component 308 and the processing component 302.
The memory 304 is configured to store various types of data to support operation at the apparatus 300. Examples of such data include instructions for any application or method operated on the apparatus 300, contact data, phonebook data, messages, pictures, videos, and the like. The memory 304 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as a static random access memory (SRAM), an electrically erasable programmable read-only memory (EEPROM), an erasable programmable read-only memory (EPROM), a programmable read-only memory (PROM), a read-only memory (ROM), a magnetic memory, a flash memory, a magnetic disk, or an optical disc.
The power component 306 provides power to the various components of the apparatus 300. The power component 306 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the apparatus 300.
The multimedia component 308 includes a screen providing an output interface between the apparatus 300 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or swipe action, but also detect the duration and pressure associated with the touch or swipe operation. In some embodiments, the multimedia component 308 includes a front camera and/or a rear camera. When the apparatus 300 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera may receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capability.
The audio component 310 is configured to output and/or input audio signals. For example, the audio component 310 includes a microphone (MIC) configured to receive external audio signals when the apparatus 300 is in an operation mode, such as a call mode, a recording mode, or a voice recognition mode. The received audio signal may be further stored in the memory 304 or sent via the communication component 316. In some embodiments, the audio component 310 further includes a speaker for outputting audio signals.
The I/O interface 312 provides an interface between the processing component 302 and peripheral interface modules, which may be a keyboard, a click wheel, buttons, or the like. These buttons may include, but are not limited to, a home button, volume buttons, a start button, and a lock button.
The sensor component 314 includes one or more sensors for providing state assessments of various aspects of the apparatus 300. For example, the sensor component 314 may detect the on/off state of the apparatus 300 and the relative positioning of components, such as the display and keypad of the apparatus 300. The sensor component 314 may also detect a change in position of the apparatus 300 or a component of the apparatus 300, the presence or absence of user contact with the apparatus 300, the orientation or acceleration/deceleration of the apparatus 300, and a temperature change of the apparatus 300. The sensor component 314 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 314 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 314 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 316 is configured to facilitate wired or wireless communication between the apparatus 300 and other devices. The apparatus 300 may access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In an exemplary embodiment, the communication component 316 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 316 further includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 300 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic elements for performing the above method.
In an exemplary embodiment, there is also provided a non-transitory computer-readable storage medium including instructions, such as the memory 304 including instructions, executable by the processor 320 of the apparatus 300 to complete the above method. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
FIG. 15 is a block diagram of an apparatus 400 for training according to an exemplary embodiment. For example, the apparatus 400 may be provided as a server. Referring to FIG. 15, the apparatus 400 includes a processing component 422, which further includes one or more processors, and memory resources represented by a memory 432 for storing instructions, such as an application program, executable by the processing component 422. The application program stored in the memory 432 may include one or more modules, each corresponding to a set of instructions. In addition, the processing component 422 is configured to execute the instructions to perform the above training method.
The apparatus 400 may also include a power component 426 configured to perform power management of the apparatus 400, a wired or wireless network interface 450 configured to connect the apparatus 400 to a network, and an input/output (I/O) interface 458. The apparatus 400 may operate based on an operating system stored in the memory 432, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
It can be further understood that in the present disclosure, "multiple" refers to two or more, and other quantifiers are similar. "And/or" describes an association relationship of associated objects, indicating that three relationships may exist; for example, "A and/or B" may indicate the three cases of A existing alone, A and B existing simultaneously, and B existing alone. The character "/" generally indicates an "or" relationship between the preceding and following associated objects. The singular forms "a", "said", and "the" are also intended to include the plural forms, unless the context clearly indicates otherwise.
It can be further understood that the terms "first", "second", and the like are used to describe various kinds of information, but such information should not be limited to these terms. These terms are only used to distinguish information of the same type from one another and do not indicate a particular order or degree of importance. In fact, expressions such as "first" and "second" are fully interchangeable. For example, without departing from the scope of the present disclosure, the first information may also be referred to as the second information, and similarly, the second information may also be referred to as the first information.
It can be further understood that although the operations in the embodiments of the present disclosure are described in a particular order in the drawings, this should not be understood as requiring that the operations be performed in the particular order shown or in serial order, or that all the operations shown be performed, to obtain the desired result. In certain circumstances, multitasking and parallel processing may be advantageous.
Other embodiments of the present disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. The present application is intended to cover any variations, uses, or adaptations of the present disclosure that follow the general principles of the present disclosure and include common general knowledge or customary technical means in the technical field not disclosed by the present disclosure. The specification and the examples are to be regarded as exemplary only, with the true scope and spirit of the present disclosure being indicated by the following claims.
It should be understood that the present disclosure is not limited to the precise structures described above and illustrated in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.
Claims (21)
- A training method, applied to a first node, the method comprising:
training a first training model in response to receiving a model training request, wherein the model training request includes model compression parameters; and
obtaining a first compression model of the first training model based on the first training model and the model compression parameters.
- The training method according to claim 1, wherein the model compression parameters include multiple model compression options; and
the obtaining a first compression model of the first training model based on the first training model and the model compression parameters includes:
determining a first model compression option among the multiple model compression options, and compressing the first training model based on the first model compression option to obtain a second compression model;
determining a first loss function according to the output of the first training model, the output of the second compression model, and the sample parameter set used to train the first training model; and
updating the parameters of the second compression model based on the first loss function to obtain the first compression model.
- The training method according to claim 2, wherein the determining a first loss function according to the output of the first training model, the output of the second compression model, and the sample parameter set used to train the first training model includes:
determining a first cross-entropy between the output of the second compression model and the sample parameter set, and determining a first relative entropy divergence between the output of the second compression model and the output of the first training model; and
determining the first loss function based on the first cross-entropy and the first relative entropy divergence.
- The training method according to claim 2 or 3, wherein the method further includes:
determining, according to the output of the first training model, the output of the second compression model, and the sample parameter set used to train the first training model, a second loss function for updating the parameters of the first training model.
- The training method according to claim 4, wherein the determining, according to the output of the first training model, the output of the second compression model, and the sample parameter set used to train the first training model, a second loss function for updating the parameters of the first training model includes:
determining a second cross-entropy between the output of the first training model and the sample parameter set, and determining a second relative entropy divergence between the output of the first training model and the output of the second compression model; and
determining the second loss function based on the second cross-entropy and the second relative entropy divergence.
- The training method according to claim 1, wherein the model compression parameters include a model training mode, and the model training mode includes a single-training-node mode for training a single first training model and a multi-training-node mode for training multiple first training models; and
the number of first training models is determined based on the model training mode.
- The training method according to claim 6, wherein the method further includes:
sending a second indication message, where the second indication message includes a number of first compression models corresponding to the model training mode.
- The training method according to claim 1, wherein the method further includes:
receiving a third indication message, where the third indication message includes a training model determination indication.
- The training method according to claim 6, wherein the model training mode includes the multi-training-node mode, and the method further includes:
receiving a fourth indication message, where the fourth indication message is used to indicate a third compression model, and the third compression model is a compression model obtained by performing federated averaging on the first training models based on the number of first compression models; and
re-determining the model compression parameters based on the third compression model, and updating the first compression model based on the re-determined model compression parameters.
- The training method according to claim 1, wherein the method further includes:
receiving a fifth indication message, where the fifth indication message is used to indicate the end of training the first compression model.
- A training method, applied to a second node, the method comprising:
sending a model training request,
wherein the model training request includes model compression parameters, the model compression parameters are used to compress a first training model to obtain a first compression model, and the first training model is obtained by training based on the model training request.
- The training method according to claim 11, wherein the model compression parameters include a model training mode, and the model training mode includes a single-training-node mode for training a single first training model and a multi-training-node mode for training multiple first training models; and
the number of first training models is determined based on the model training mode.
- The training method according to claim 12, wherein the method further includes:
receiving a second indication message, where the second indication message includes a number of first compression models corresponding to the model training mode.
- The training method according to claim 11, wherein the method further includes:
sending a third indication message, where the third indication message includes a training model determination indication.
- The training method according to claim 12, wherein the model training mode includes the multi-training-node mode, and the method further includes:
sending a fourth indication message, where the fourth indication message is used to indicate a third compression model, and the third compression model is a compression model obtained by performing federated averaging on the first compression models based on the number of first training models.
- The training method according to claim 11, wherein the method further includes:
sending a fifth indication message, where the fifth indication message is used to indicate the end of training the first compression model.
- 根据权利要求11所述的训练方法,其特征在于,所述方法还包括:The training method according to claim 11, wherein the method further comprises:接收订阅需求,并基于所述订阅需求发送模型训练请求。A subscription requirement is received, and a model training request is sent based on the subscription requirement.
- 一种训练装置,其特征在于,应用于第一节点,所述装置包括:A training device, characterized in that, applied to a first node, the device comprises:模型训练模块,用于响应于接收到模型训练请求,训练第一训练模型,其中,所述模型训练请求中包括模型压缩参数;a model training module, configured to train a first training model in response to receiving a model training request, wherein the model training request includes model compression parameters;模型压缩模块,用于基于所述第一训练模型和所述模型压缩参数,得到所述第一训练模型的第一压缩模型。A model compression module, configured to obtain a first compression model of the first training model based on the first training model and the model compression parameters.
- 一种训练装置,其特征在于,应用于第二节点,所述装置包括:A training device, characterized in that, applied to a second node, the device comprising:网络通信模块,用于发送模型训练请求;Network communication module, used to send model training request;其中,所述模型训练请求中包括模型压缩参数,所述模型压缩参数用于压缩第一训练模型得到第一压缩模型,所述第一训练模型基于所述模型训练请求训练得到。Wherein, the model training request includes model compression parameters, and the model compression parameters are used to compress a first training model to obtain a first compression model, and the first training model is obtained by training based on the model training request.
- 一种训练装置,其特征在于,包括:A training device, comprising:处理器;processor;用于存储处理器可执行指令的存储器;memory for storing processor-executable instructions;其中,所述处理器被配置为:执行权利要求1-10中任意一项所述的训练方法,或执行权利要求11-17中任意一项所述的训练方法。Wherein, the processor is configured to: execute the training method described in any one of claims 1-10, or execute the training method described in any one of claims 11-17.
- 一种非临时性计算机可读存储介质,当所述存储介质中的指令由移动终端的处理器执行时,使得移动终端能够执行权利要求1-10中任意一项所述的训练方法,或执行权利要求11-17中任意一项所述的训练方法。A non-transitory computer-readable storage medium, when the instructions in the storage medium are executed by the processor of the mobile terminal, the mobile terminal can execute the training method described in any one of claims 1-10, or execute The training method of any one of claims 11-17.
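The first and second loss functions in the claims above mirror each other, in the style of mutual distillation between the first training model (the teacher) and its compressed copy (the student). One plausible formulation is sketched below; the weighting coefficient α and the divergence directions are assumptions, since the claims only require that each loss combine a cross-entropy with a relative entropy:

```latex
\mathcal{L}_1 = \alpha\,\mathrm{CE}(p_s, y) + (1-\alpha)\,D_{\mathrm{KL}}(p_t \,\|\, p_s), \qquad
\mathcal{L}_2 = \alpha\,\mathrm{CE}(p_t, y) + (1-\alpha)\,D_{\mathrm{KL}}(p_s \,\|\, p_t)
```

Here p_s is the output of the second compression model, p_t is the output of the first training model, and y denotes the labels from the sample parameter set.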
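A minimal PyTorch sketch of both losses under that assumed formulation follows; the function and argument names are illustrative and do not come from the patent:

```python
import torch.nn.functional as F

def distillation_losses(student_logits, teacher_logits, labels, alpha=0.5):
    """Student = second compression model, teacher = first training model."""
    # First loss: first cross-entropy against the sample labels, plus the
    # first relative entropy against the teacher's (detached) output.
    ce_student = F.cross_entropy(student_logits, labels)
    kl_student = F.kl_div(F.log_softmax(student_logits, dim=-1),
                          F.softmax(teacher_logits, dim=-1).detach(),
                          reduction="batchmean")
    first_loss = alpha * ce_student + (1.0 - alpha) * kl_student

    # Second loss: second cross-entropy plus the second relative entropy,
    # with the roles of the two models swapped.
    ce_teacher = F.cross_entropy(teacher_logits, labels)
    kl_teacher = F.kl_div(F.log_softmax(teacher_logits, dim=-1),
                          F.softmax(student_logits, dim=-1).detach(),
                          reduction="batchmean")
    second_loss = alpha * ce_teacher + (1.0 - alpha) * kl_teacher
    return first_loss, second_loss
```

Detaching the opposite model's probabilities keeps each loss updating only its own model, matching the separation between the first and second loss in the claims.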
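For the multi-training-node mode, the third compression model is obtained by federated averaging over the models uploaded by the training nodes. A minimal sketch, assuming each uploaded model is a dict of NumPy weight arrays and that plain unweighted averaging is used (the claims do not specify the representation or any weighting by sample count):

```python
import numpy as np

def federated_average(models):
    """Element-wise average across uploaded models; len(models) plays the
    role of the number of models referred to in the claims."""
    n = len(models)
    return {name: sum(m[name] for m in models) / n for name in models[0]}

# The averaged result would be carried back in the fourth indication message,
# after which each node re-determines its compression parameters.
node_models = [
    {"w": np.array([1.0, 2.0]), "b": np.array([0.5])},
    {"w": np.array([3.0, 4.0]), "b": np.array([1.5])},
]
third_compression_model = federated_average(node_models)
```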
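Lastly, the claims require only that the model training request carry model compression parameters. The structure below is a hypothetical illustration of what a second node might send, not the patent's wire format; every field name is invented:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ModelCompressionParams:
    # The "plurality of model compression options"; option names are invented.
    compression_options: List[str] = field(
        default_factory=lambda: ["prune_50pct", "quantize_int8"])
    # Single-training-node or multi-training-node mode.
    training_mode: str = "multi_node"

@dataclass
class ModelTrainingRequest:
    model_id: str
    compression: ModelCompressionParams
    subscriber: Optional[str] = None  # set when driven by a subscription requirement

request = ModelTrainingRequest(model_id="example-model",
                               compression=ModelCompressionParams())
```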
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202080003605.XA CN114793453A (en) | 2020-11-23 | 2020-11-23 | Training method, training device and storage medium |
PCT/CN2020/130896 WO2022104799A1 (en) | 2020-11-23 | 2020-11-23 | Training method, training apparatus, and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2020/130896 WO2022104799A1 (en) | 2020-11-23 | 2020-11-23 | Training method, training apparatus, and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022104799A1 (en) | 2022-05-27 |
Family
ID=81708237
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/130896 WO2022104799A1 (en) | 2020-11-23 | 2020-11-23 | Training method, training apparatus, and storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN114793453A (en) |
WO (1) | WO2022104799A1 (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2024152290A1 (en) * | 2023-01-19 | 2024-07-25 | 华为技术有限公司 | Network quantization method and apparatus, and related device |
WO2024207182A1 (en) * | 2023-04-04 | 2024-10-10 | Qualcomm Incorporated | Training dataset mixture for user equipment-based model training in predictive beam management |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116233857A (en) * | 2021-12-02 | 2023-06-06 | 华为技术有限公司 | Communication method and communication device |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109784474A (en) * | 2018-12-24 | 2019-05-21 | 宜通世纪物联网研究院(广州)有限公司 | A kind of deep learning model compression method, apparatus, storage medium and terminal device |
CN109978144A (en) * | 2019-03-29 | 2019-07-05 | 联想(北京)有限公司 | A kind of model compression method and system |
WO2020131968A1 (en) * | 2018-12-18 | 2020-06-25 | Movidius Ltd. | Neural network compression |
CN111898484A (en) * | 2020-07-14 | 2020-11-06 | 华中科技大学 | Method, apparatus, readable storage medium, and electronic device for generating a model |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11244242B2 (en) * | 2018-09-07 | 2022-02-08 | Intel Corporation | Technologies for distributing gradient descent computation in a heterogeneous multi-access edge computing (MEC) networks |
CN111488985B (en) * | 2020-04-08 | 2023-11-14 | 华南理工大学 | Deep neural network model compression training methods, devices, equipment and media |
2020
- 2020-11-23: CN application CN202080003605.XA filed; published as CN114793453A (status: active, pending)
- 2020-11-23: PCT application PCT/CN2020/130896 filed; published as WO2022104799A1 (status: active, application filing)
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020131968A1 (en) * | 2018-12-18 | 2020-06-25 | Movidius Ltd. | Neural network compression |
CN109784474A (en) * | 2018-12-24 | 2019-05-21 | 宜通世纪物联网研究院(广州)有限公司 | A kind of deep learning model compression method, apparatus, storage medium and terminal device |
CN109978144A (en) * | 2019-03-29 | 2019-07-05 | 联想(北京)有限公司 | A kind of model compression method and system |
CN111898484A (en) * | 2020-07-14 | 2020-11-06 | 华中科技大学 | Method, apparatus, readable storage medium, and electronic device for generating a model |
Non-Patent Citations (1)
Title |
---|
WEI YUE; CHEN SHICHAO; ZHU FENGHUA; XIONG GANG: "Pruning Method for Convolutional Neural Network Models Based on Sparse Regularization", Computer Engineering, vol. 47, no. 10, 14 November 2021 (2021-11-14), CN, pages 61-66, XP055931887, ISSN: 1000-3428, DOI: 10.19678/j.issn.1000-3428.0059375 *
Also Published As
Publication number | Publication date |
---|---|
CN114793453A (en) | 2022-07-26 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
WO2021258370A1 (en) | Communication processing method, communication processing apparatus and storage medium | |
WO2022104799A1 (en) | Training method, training apparatus, and storage medium | |
WO2022099512A1 (en) | Data processing method and apparatus, communication device, and storage medium | |
CN107926000A (en) | Information transceiving method, apparatus and system | |
US12309718B2 (en) | Methods and apparatuses for processing transmission power level information, and computer storage media | |
CN111466127A (en) | Processing method and device for enhancing uplink coverage and storage medium | |
CN113632571B (en) | Message configuration method, message configuration device and storage medium | |
WO2023000341A1 (en) | Information configuration method, information configuration apparatus, and storage medium | |
US11387923B2 (en) | Information configuration method and apparatus, method and apparatus for determining received power, and base station | |
WO2021007827A1 (en) | Information indication and determination methods and apparatuses, communication device and storage medium | |
WO2021142796A1 (en) | Communication processing methods and apparatuses, and computer storage medium | |
WO2022151490A1 (en) | Channel state information determination method and apparatus, and storage medium | |
CN111566985B (en) | Transmission processing method, device, user equipment, base station and storage medium | |
CN110945827B (en) | Method, device, communication equipment and storage medium for configuring downlink control information | |
WO2022082742A1 (en) | Model training method and device, server, terminal, and storage medium | |
WO2022141290A1 (en) | Parameter determination method, parameter determination apparatus, and storage medium | |
WO2022133689A1 (en) | Model transmission method, model transmission device, and storage medium | |
CN113169825B (en) | Data packet transmission method and device | |
WO2022126555A1 (en) | Transmission method, transmission apparatus, and storage medium | |
CN114080852A (en) | Method, device, communication device and storage medium for reporting capability information | |
WO2022151052A1 (en) | Method and apparatus for configuring random access parameter, and storage medium | |
WO2022204973A1 (en) | Policy determining method, policy determining apparatus, and storage medium | |
US20250048137A1 (en) | Information processing method and apparatus, and communication device and storage medium | |
WO2023039722A1 (en) | Information reporting method, information reporting apparatus and storage medium | |
WO2024130590A1 (en) | Transmission configuration indicator state determination method and apparatus, and storage medium |
Legal Events
Code | Title | Description |
---|---|---|
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 20962090; Country of ref document: EP; Kind code of ref document: A1 |
NENP | Non-entry into the national phase | Ref country code: DE |
122 | EP: PCT application non-entry in the European phase | Ref document number: 20962090; Country of ref document: EP; Kind code of ref document: A1 |