
CN118586475B - A federated category incremental learning modeling method based on stable feature prototypes - Google Patents

A federated category incremental learning modeling method based on stable feature prototypes Download PDF

Info

Publication number
CN118586475B
CN118586475B (application CN202411062873.5A)
Authority
CN
China
Prior art keywords
prototype
class
feature
category
edge
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202411062873.5A
Other languages
Chinese (zh)
Other versions
CN118586475A (en)
Inventor
沈晓兵
孙飞
张铁汉
朱君
赵春晖
姚邹静
贾晓燕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Zheneng Digital Technology Co ltd
Xiaoshan Power Plant Of Zhejiang Zhengneng Electric Power Co ltd
Zhejiang University ZJU
Original Assignee
Zhejiang Zheneng Digital Technology Co ltd
Xiaoshan Power Plant Of Zhejiang Zhengneng Electric Power Co ltd
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Zheneng Digital Technology Co ltd, Xiaoshan Power Plant Of Zhejiang Zhengneng Electric Power Co ltd, Zhejiang University ZJU filed Critical Zhejiang Zheneng Digital Technology Co ltd
Priority to CN202411062873.5A priority Critical patent/CN118586475B/en
Publication of CN118586475A publication Critical patent/CN118586475A/en
Application granted granted Critical
Publication of CN118586475B publication Critical patent/CN118586475B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/098 Distributed learning, e.g. federated learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a federated class-incremental learning modeling method based on stable feature prototypes, implemented in a cloud-edge collaborative scenario. To address catastrophic forgetting when the data classes of an edge device grow incrementally, each edge establishes a class memory bank that stores representative samples of each class and updates its local model with a prototype-network update strategy based on the replay paradigm. To address the spread of catastrophic forgetting when multiple edges and the cloud model collaboratively, the cloud adopts a weighted aggregation strategy that takes feature prototypes in a unified feature space as the reference, stably optimizing the feature space under the federated framework and realizing federated updating of class knowledge. While protecting data privacy, the invention solves the collaborative modeling problem under edge class increments, effectively alleviates catastrophic forgetting, and offers advantages in model accuracy and training stability.

Description

Federated class-incremental learning modeling method based on stable feature prototypes
Technical Field
The invention relates to the technical field of big data computation, in particular to cloud-edge collaborative federated modeling for incrementally growing edge data classes, and more particularly to a federated class-incremental learning modeling method based on stable feature prototypes.
Background
In the era of the Internet of Things and big data, deep learning has been widely applied to common tasks such as fault diagnosis, component or structure prediction, and image classification and segmentation. Existing methods usually build a model offline from a large amount of data accumulated in a historical database and then apply it online. In the rapidly changing open world, however, data arrive as streams and new classes appear from time to time. Moreover, these methods often consider only the modeling needs of a single device or factory. With the rapid development of the industrial Internet of Things and edge computing, modeling under the cloud-edge collaborative framework breaks down data silos between factories. Federated learning can integrate knowledge from multiple devices or factories while protecting data privacy, and has become an attractive modeling paradigm. How to perform incremental learning from non-stationary data streams under the federated learning framework is therefore an important problem in deep learning.
For a single device or factory, hardware limitations mean that only the data of the most recent period can be stored; as a machine operates, old data are continuously discarded and newly generated data are stored. In practical industries such as manufacturing, raw materials, components, operating conditions and the corresponding data tend to grow and change over long time spans. For fault diagnosis, new types of fault data or fault images keep emerging; for two-dimensional code recognition, practical applications face diverse and growing recognition requirements. The deep learning model therefore has to be trained on continuously generated new data streams (i.e., tasks). When deep neural networks (DNNs) are trained on samples of a new task or data distribution, they tend to quickly forget previously acquired knowledge and their performance degrades dramatically, a phenomenon known as catastrophic forgetting. Although the model could be retrained from scratch on the entire historical dataset, this would lead to ever-increasing training time and computational burden and is not a viable long-term solution.
Because collaborative training is involved, the federated class-incremental modeling problem with multiple edges is more complex than the single-edge case, and the data exhibit both spatial and temporal heterogeneity. In a federated modeling scenario where the data classes of multiple edges grow incrementally, catastrophic forgetting is compounded by model drift, spreads across edges, and keeps propagating with the iterative cloud-edge optimization. Because of the combinatorial complexity of class combinations, the number of tasks keeps growing, and forgetting cannot be avoided if the knowledge of historical tasks is stored only in the model. Storing samples or expanding the network per task is not feasible in the long run when the number of tasks is large and storage space is limited. The essence of the federated class-incremental problem should therefore be pursued with the class as the core of the analysis. The data distribution of the same class at different edges is consistent; if knowledge is stored per class, combined with self-updating of class knowledge (supplementing new class knowledge while maintaining old class knowledge), and edge collaborative optimization is carried out on the basis of this class knowledge, the federated class-incremental modeling problem with spatio-temporal heterogeneity can be expected to be solved.
Disclosure of Invention
The invention aims to provide a federated class-incremental learning modeling method based on stable feature prototypes, addressing the fact that prior federated modeling methods cannot handle incremental growth of edge data classes, which causes local catastrophic forgetting of the model and its spread across edges. In the invention, a class memory bank is established at each edge to store representative samples of each class, a prototype-network update strategy based on the replay paradigm is designed for the local update, and a weighted aggregation strategy that takes feature prototypes in a unified feature space as the reference is designed at the cloud. The feature space is thus stably optimized under the federated framework, federated updating of class knowledge is realized, and the collaborative modeling problem under edge class increments is solved.
The aim of the invention is achieved by the following technical scheme. The federated class-incremental learning modeling method based on stable feature prototypes is implemented in a cloud-edge collaborative scenario comprising C edges and one cloud. Each edge maintains a continuously growing set of visible classes, a dataset that varies with the task t, a class memory bank, and a prototype network f_{φ,i}(·), where i denotes the i-th edge and φ the parameters of the prototype network; each sample in the dataset and the class memory bank is denoted (x, y), where y is the class label of sample x. After each task, the cloud aggregates the prototype networks updated by the edges. The method comprises the following steps:
(1) Randomly initialize the weight parameters of the prototype network preset at each edge;
(2) Dynamically update the local class memory bank of each edge to obtain an updated class memory bank; based on the updated class memory bank, update the local prototype network of each edge with the replay-paradigm prototype-network update strategy to obtain the updated prototype network; feed the dataset of the current task t and the class memory bank into the prototype networks before and after the update to obtain the old class feature prototype set and the new class feature prototype set;
(3) Based on the old and new class feature prototype sets, compute at the cloud the aggregation weight of each edge from the global prototype intervals and the local prototype intervals, aggregate the prototype networks of the edges weighted by these aggregation weights to obtain the calibrated global model, and send the global model down to every edge;
(4) Alternate step (2) and step (3) until a preset number of training rounds is reached;
(5) After the collaborative optimization is completed, take the global model finally received by the edge as the final prototype network; the edge feeds the data of its local class memory bank into the final prototype network to obtain the features of each class, and the mean of each class's features serves as the class feature prototype used for classification.
Further, dynamically updating the local class memory bank of each edge to obtain the updated class memory bank specifically comprises:
for the class memory bank of each edge, according to the newly added current-task data of each class, computing the class center of each class under the current prototype network, selecting the P instances of each class closest to its class center as the current samples of that class, and adding them to the class memory bank to obtain the updated class memory bank.
Further, computing the class center of each class under the current prototype network specifically comprises:
feeding the current-task data of each class into the current prototype network to obtain the features of all task data of that class, and taking the mean of these features as the class center of that class under the current prototype network.
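A minimal NumPy sketch of this memory-bank refresh is given below. The function and variable names (update_class_memory, extract_features, etc.) are illustrative and not taken from the patent, and the feature extractor is assumed to expose a simple batch interface.

```python
import numpy as np

def update_class_memory(memory, task_data, extract_features, P):
    """Keep, for each newly arrived class, the P examples whose features lie
    closest to that class's center under the current prototype network."""
    for cls, samples in task_data.items():           # task_data: {class_id: array of samples}
        feats = extract_features(samples)            # features under the current prototype network
        center = feats.mean(axis=0)                  # class center = mean feature of the class
        dists = np.linalg.norm(feats - center, axis=1)
        keep = np.argsort(dists)[:P]                 # P instances nearest to the class center
        memory[cls] = samples[keep]                  # this class's current samples in the memory bank
    return memory
```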
Further, updating the local prototype network of each edge with the replay-paradigm prototype-network update strategy based on the updated class memory bank to obtain the updated prototype network specifically comprises:
recording the prototype network of each edge before the update and, under the replay paradigm, optimizing it a fixed number of times based on the dataset of the current task t and the updated class memory bank to obtain the updated prototype network.
Further, under the replay paradigm, optimizing the prototype network a fixed number of times based on the dataset of the current task t and the updated class memory bank to obtain the updated prototype network specifically comprises:
each time the prototype network of edge i is optimized locally, randomly selecting K_c classes from the visible class set of edge i (whose cardinality is the total number of classes currently visible to edge i); taking the data of the K_c selected classes from the class memory bank and from the dataset of the current task t to construct the support set of the current task t, while the remaining data of the class memory bank form the query set of the current task t;
feeding the support set into the prototype network to obtain the features of each class, and taking the mean of each class's features as its class feature prototype p[j];
optimizing, by stochastic gradient descent, the loss function of the query set on the class feature prototypes p[j]; the loss is a negative log-probability function representing the class distribution generated by the prototype network from the distance between a sample (x, y) and the prototype of the corresponding class in feature space. Specifically, before optimizing the query set the negative log-probability function is initialized to 0, and while optimizing the query set, for each sample of each class j in the query set, its negative log-probability function is updated according to the distance d(f_φ(x), p[j]) to its own class prototype and the distances d(f_φ(x), p[j']) to the prototypes of the other classes j';
where P is the number of samples retained per class in the class memory bank, f_φ(x) denotes the feature obtained after feeding query-set sample x into the prototype network f_φ, j' denotes a class other than j, and d(f_φ(x), p[j]) denotes the Euclidean distance between f_φ(x) and p[j].
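This replay-paradigm update follows the episodic training of prototypical networks. The sketch below assumes the standard prototypical-network negative log-probability (softmax over negative Euclidean distances to the class prototypes); the patent's exact normalization constants are not reproduced here, and all identifiers are illustrative.

```python
import torch

def episodic_prototype_loss(net, support_x, support_y, query_x, query_y, classes):
    """One replay-paradigm episode: prototypes from the support set, negative
    log-probability of the query set under distances to those prototypes.

    net:       prototype network f_phi (a torch.nn.Module mapping samples to features)
    support_*: samples/labels of the K_c classes drawn from the class memory bank
               and the current task t
    query_*:   remaining class-memory data used as the query set
    classes:   list of the K_c class ids selected for this episode
    """
    s_feat = net(support_x)                                      # support features
    protos = torch.stack([s_feat[support_y == c].mean(dim=0)     # class feature prototype p[j]
                          for c in classes])
    q_feat = net(query_x)                                        # query features f_phi(x)
    d = torch.cdist(q_feat, protos)                              # Euclidean distances d(f_phi(x), p[j'])
    log_p = torch.log_softmax(-d, dim=1)                         # class distribution from prototype distances
    target = torch.tensor([classes.index(int(y)) for y in query_y])
    return -log_p[torch.arange(len(query_y)), target].mean()     # negative log-probability

# One SGD step backpropagates this loss through f_phi; the local update repeats
# such episodes a fixed number of times:
#   loss = episodic_prototype_loss(net, xs, ys, xq, yq, episode_classes)
#   loss.backward(); optimizer.step(); optimizer.zero_grad()
```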
Further, feeding the dataset of the current task t and the class memory bank into the prototype networks before and after the update to obtain the old class feature prototype set and the new class feature prototype set specifically comprises:
for edge i, selecting the samples of class j from the dataset of the current task t and the class memory bank, feeding them into the prototype network before the update to obtain a group of features of class j, taking the mean of this group of features as the old feature prototype of class j, and constructing the old class feature prototype set from the old feature prototypes of all classes;
for edge i, selecting the samples of class j from the dataset of the current task t and the class memory bank, feeding them into the updated prototype network to obtain a group of features of class j, taking the mean of this group of features as the new feature prototype of class j, and constructing the new class feature prototype set from the new feature prototypes of all classes.
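Both prototype groups are plain per-class feature means; a small sketch follows. The helper name and the batch-style feature extractor are assumptions rather than identifiers from the patent.

```python
import numpy as np

def prototype_group(samples_by_class, extract_features):
    """Class feature prototype group under a given prototype network: the mean
    feature of each class's samples (current task t data plus class memory)."""
    return {cls: extract_features(np.asarray(s)).mean(axis=0)
            for cls, s in samples_by_class.items()}

# Old and new prototype groups of edge i would then be obtained as, e.g.:
#   old_protos = prototype_group(samples_by_class, features_of_net_before_update)
#   new_protos = prototype_group(samples_by_class, features_of_net_after_update)
```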
Further, computing the global prototype intervals and the local prototype intervals at the cloud based on the old and new class feature prototype sets specifically comprises:
collecting the old and new class feature prototype sets of all edges at the cloud; computing, from the old class feature prototype sets of the edges, the mean feature prototype of each class to obtain the global class feature prototype set GP^t under task t, whose entry gp^t[j] is the global feature prototype of class j under task t, with M the number of classes the edges should ultimately distinguish; and computing the corresponding prototype intervals from pairs of class feature prototype sets: the local prototype interval of class j at edge i is computed from that edge's old and new class feature prototype sets, and the global prototype interval of class j at edge i is computed from that edge's old class feature prototype set and the global class feature prototype set GP^t.
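The global class feature prototype set is simply the per-class mean of the edges' old prototypes. A minimal sketch follows, under the assumption that each prototype set is a dict keyed by class id (names illustrative):

```python
import numpy as np

def global_prototype_set(old_prototype_sets):
    """Average the edges' old class feature prototypes per class to obtain the
    global class feature prototype set GP^t = {gp^t[j]}.

    old_prototype_sets: list over edges of dicts {class_id: prototype vector};
    classes unseen by an edge are simply absent from that edge's dict.
    """
    gp = {}
    for cls in set().union(*(p.keys() for p in old_prototype_sets)):
        vecs = [p[cls] for p in old_prototype_sets if cls in p]
        gp[cls] = np.mean(vecs, axis=0)   # gp^t[j]: mean over edges that report class j
    return gp
```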
Further, computing the corresponding prototype interval from two class feature prototype sets specifically comprises:
denoting the two class feature prototype sets p_a and p_b; for class j, computing the Euclidean distance d_+[j](p_a, p_b) between the class-j prototypes of p_a and p_b, and computing the mean of the Euclidean distances between the class-j prototype of p_a and the prototypes of the classes other than j in p_b;
when the two class feature prototype sets are chosen as the old and the new class feature prototype sets of edge i, the local prototype interval of class j at edge i is computed from these two quantities;
when the two class feature prototype sets are chosen as the old class feature prototype set of edge i and the global class feature prototype set, the global prototype interval of class j at edge i is computed in the same way;
where μ[j] denotes the prototype interval of class j, covering both the local prototype interval and the global prototype interval of class j.
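The text above defines the two ingredients of the prototype interval, the same-class distance d_+[j] and the mean distance to the other classes' prototypes, but the combining formula itself appears in the original equations and is not reproduced in this text. The sketch below therefore uses a margin-style difference purely as an illustrative assumption: a larger value then means the class-j prototype stayed close to its reference while remaining far from the other classes.

```python
import numpy as np

def prototype_interval(pa, pb, cls):
    """Illustrative prototype interval mu[j] for class `cls` between two
    prototype sets pa and pb (dicts {class_id: vector}).

    d_plus:  Euclidean distance between the class-j prototypes of pa and pb.
    d_other: mean Euclidean distance between pa's class-j prototype and pb's
             prototypes of the other classes.
    The combination (d_other - d_plus) is an assumption, not the patent's formula.
    """
    d_plus = np.linalg.norm(pa[cls] - pb[cls])
    d_other = np.mean([np.linalg.norm(pa[cls] - pb[j]) for j in pb if j != cls])
    return d_other - d_plus

# Local interval:  prototype_interval(old_protos_i, new_protos_i, j)
# Global interval: prototype_interval(old_protos_i, global_protos, j)
```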
Further, computing the aggregation weight of each edge at the cloud from the global prototype intervals and the local prototype intervals, and aggregating the prototype networks of the edges weighted by these aggregation weights to obtain the calibrated global model, specifically comprises:
summing the local prototype intervals over the classes of each edge and feeding the sum into a Sigmoid activation function to obtain the local credibility of that edge;
summing the global prototype intervals over the classes of each edge and feeding the sum into a Sigmoid activation function to obtain the global credibility of that edge;
taking the mean of each edge's local credibility and global credibility as that edge's aggregation weight at the cloud;
aggregating the prototype networks of the edges weighted by their aggregation weights to obtain the calibrated global model.
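The calibration steps are fully spelled out above (sum the intervals per edge, squash with a Sigmoid, average the two credibilities, then weighted-average the networks). A sketch is given below, assuming models are exchanged as dicts of NumPy parameter arrays; the final weight normalization is an added assumption, since the text only states that the credibility mean is used as the aggregation weight.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def calibrate_and_aggregate(edge_params, local_intervals, global_intervals):
    """Model calibration based on feature prototype intervals (step (3)).

    edge_params:      list over edges of dicts {param_name: np.ndarray}
    local_intervals:  list over edges of dicts {class_id: local prototype interval}
    global_intervals: list over edges of dicts {class_id: global prototype interval}
    """
    weights = []
    for loc, glo in zip(local_intervals, global_intervals):
        r_local = sigmoid(sum(loc.values()))     # local credibility of the edge
        r_global = sigmoid(sum(glo.values()))    # global credibility of the edge
        weights.append(0.5 * (r_local + r_global))
    weights = np.array(weights) / np.sum(weights)   # normalized so weights sum to 1 (assumption)

    # Weighted aggregation of the edge prototype networks into the calibrated global model
    global_model = {}
    for name in edge_params[0]:
        global_model[name] = sum(w * p[name] for w, p in zip(weights, edge_params))
    return global_model
```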
Further, step (5) specifically comprises:
after the collaborative optimization is completed, taking the global model finally received by the edge as the final prototype network; the edge feeds the data of its local class memory bank into the final prototype network to obtain the features of each class, and takes the mean of each class's features as the class feature prototype;
for any sample to be classified at the edge, feeding it into the final prototype network to obtain its feature;
computing the Euclidean distances between this feature and all class feature prototypes, and taking the class whose feature prototype yields the smallest Euclidean distance as the classification result.
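Classification at inference time thus reduces to a nearest-prototype rule; a minimal sketch (illustrative names, batch-style feature extractor assumed):

```python
import numpy as np

def classify(sample, prototypes, extract_features):
    """Nearest-prototype classification: assign the class whose feature prototype
    has the smallest Euclidean distance to the sample's feature.
    `prototypes` is a dict {class_id: vector}, built by averaging the features of
    the local class memory bank under the final prototype network."""
    feat = extract_features(np.asarray(sample)[None])[0]   # feature of the sample to classify
    return min(prototypes, key=lambda c: np.linalg.norm(feat - prototypes[c]))
```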
Compared with the prior art, the invention has the following beneficial effects:
(1) The invention provides a federated class-incremental learning framework that builds stable feature prototypes from class knowledge, achieving optimization of the local feature space and updating of the feature prototypes under class increments, unifying the feature spaces of the edges, and enabling stable aggregation of the model parameters at the cloud;
(2) The invention designs a prototype-network update strategy supported by a class memory bank; by dynamically maintaining the class memory bank it overcomes catastrophic forgetting under class increments, and by training a class-related feature space it improves the stability of model optimization;
(3) The invention designs a model calibration strategy based on feature prototype intervals, deriving the aggregation weight of each edge model from its prototype intervals and thereby indirectly correcting the model aggregation stage with class-related knowledge.
Drawings
FIG. 1 is a flow chart of the federated class-incremental learning modeling method based on stable feature prototypes of the present invention;
FIG. 2 is a schematic diagram of the C edges with class increments according to one embodiment of the present invention;
FIG. 3 is a schematic diagram of the replay-paradigm prototype-network update strategy at the edge in the present invention;
FIG. 4 is a schematic diagram of the model calibration strategy based on feature prototype intervals at the cloud in the present invention;
Fig. 5 shows visualization results of the class feature prototypes at the edges under different methods: (a) the class feature prototypes of the proposed method at edge 1, edge 2 and edge 3; (b) those of PN-FedAvg-R at edge 1, edge 2 and edge 3; and (c) those of PN-FedProx-R at edge 1, edge 2 and edge 3.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the invention. Rather, they are merely examples of apparatus and methods consistent with aspects of the invention as detailed in the accompanying claims.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used herein to describe various information, this information should not be limited by these terms. These terms are only used to distinguish information of the same type from each other. For example, first information may also be referred to as second information and, similarly, second information may be referred to as first information without departing from the scope of the invention. Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", or "in response to a determination".
The present invention will be described in detail with reference to the accompanying drawings. The features of the examples and embodiments described below may be combined with each other without conflict.
The federated class-incremental learning modeling method based on stable feature prototypes is implemented in a cloud-edge collaborative scenario comprising C edges and one cloud, as shown in fig. 2, which is a schematic diagram of the C edges. Each edge maintains a continuously growing set of visible classes, a dataset that varies with the task t, a class memory bank, and a prototype network f_{φ,i}(·), where i denotes the i-th edge and φ the parameters of the prototype network; each sample in the dataset and the class memory bank is denoted (x, y), where y is the class label of sample x. After each task, the cloud aggregates the prototype networks updated by the edges.
Referring to fig. 1, the method specifically includes the steps of:
(1) Randomly initialize the weight parameters of the prototype network preset at each edge.
It should be understood that in the cloud-edge collaboration scenario, each edge is preset with a prototype network whose weight parameters need to be randomly initialized; any neural-network initialization scheme may be used, for example Xavier initialization.
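As an example of such an initializer, a minimal Xavier (Glorot) uniform sketch for a single weight matrix is shown below; the layer sizes in the usage comment are illustrative.

```python
import numpy as np

def xavier_uniform(fan_in, fan_out, rng=None):
    """Xavier (Glorot) uniform initialization: U(-limit, limit) with
    limit = sqrt(6 / (fan_in + fan_out))."""
    rng = rng or np.random.default_rng()
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

# e.g. a dense layer of the prototype network mapping 512-dim inputs to 128-dim features:
# W = xavier_uniform(512, 128)
```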
It should be noted that, because of storage capacity limitations, the data classes at the different edges grow over time while the number of stored samples remains limited, and the data of old tasks are continuously replaced by the data of new tasks. To achieve collaborative model training under class increments, the invention designs a prototype-network update strategy based on the replay paradigm and a model calibration strategy based on feature prototype intervals. The prototype network of each edge is first randomly initialized; after the initialization, edge local updates and cloud aggregation are performed alternately, i.e., step (2) and step (3) alternate.
(2) Dynamically update the local class memory bank of each edge to obtain an updated class memory bank; based on the updated class memory bank, update the local prototype network of each edge with the replay-paradigm prototype-network update strategy to obtain the updated prototype network; feed the dataset of the current task t and the class memory bank into the prototype networks before and after the update to obtain the old class feature prototype set and the new class feature prototype set.
In this embodiment, dynamically updating the local class memory bank of each edge specifically comprises: for the class memory bank of each edge, according to the newly added current-task data of each class, computing the class center of each class under the current prototype network, selecting the P instances of each class closest to its class center as the current samples of that class, and adding them to the class memory bank to obtain the updated class memory bank.
The class center is obtained by feeding the current-task data of each class into the current prototype network to obtain the features of all task data of that class and taking their mean as the class center of that class under the current prototype network.
In this embodiment, updating the local prototype network of each edge with the replay-paradigm prototype-network update strategy based on the updated class memory bank comprises recording the prototype network of each edge before the update and, under the replay paradigm, optimizing it a fixed number of times based on the dataset of the current task t and the updated class memory bank to obtain the updated prototype network.
Further, as shown in fig. 3, this optimization specifically comprises the following substeps:
(2.1a) Each time the prototype network of edge i is optimized locally, randomly select K_c classes from the visible class set of edge i (whose cardinality is the total number of classes currently visible to edge i); take the data of the K_c selected classes from the class memory bank and from the dataset of the current task t to construct the support set of the current task t, while the remaining data of the class memory bank form the query set of the current task t.
(2.2a) Feed the support set into the prototype network to obtain the features of each class, and take the mean of each class's features as its class feature prototype p[j].
(2.3a) Optimize, by stochastic gradient descent (SGD), the loss function of the query set on the class feature prototypes p[j]; the loss is a negative log-probability function representing the class distribution generated by the prototype network from the distance between a sample (x, y) and the prototype of the corresponding class in feature space. Specifically, before optimizing the query set the negative log-probability function is initialized to 0, and while optimizing the query set, for each sample of each class j in the query set, its negative log-probability function is updated according to the distance d(f_φ(x), p[j]) to its own class prototype and the distances d(f_φ(x), p[j']) to the prototypes of the other classes j';
where P is the number of samples retained per class in the class memory bank, f_φ(x) denotes the feature obtained after feeding query-set sample x into the prototype network f_φ, j' denotes a class other than j, and d(f_φ(x), p[j]) denotes the Euclidean distance between f_φ(x) and p[j].
As stated above, the local update of an edge requires dynamically updating the edge's local class memory bank. In one embodiment, the class memory bank of each edge is refreshed with the newly added current-task data per class: the class center of each class under the current prototype network is computed, and the P instances of each class closest to its class center are selected as the current samples of the memory bank, yielding the updated class memory bank. The prototype network and the feature prototypes are then updated under the replay paradigm, as shown in fig. 3; accordingly, the replay-paradigm prototype-network update strategy of the invention is implemented as in substeps (2.1a)-(2.3a).
In this embodiment, feeding the dataset of the current task t and the class memory bank into the prototype networks before and after the update to obtain the old class feature prototype set and the new class feature prototype set specifically comprises the following substeps:
(2.1b) For edge i, select the samples of class j from the dataset of the current task t and the class memory bank, feed them into the prototype network before the update to obtain a group of features of class j, take the mean of this group of features as the old feature prototype of class j, and construct the old class feature prototype set from the old feature prototypes of all classes.
(2.2b) For edge i, select the samples of class j from the dataset of the current task t and the class memory bank, feed them into the updated prototype network to obtain a group of features of class j, take the mean of this group of features as the new feature prototype of class j, and construct the new class feature prototype set from the new feature prototypes of all classes.
(3) Based on the old and new class feature prototype sets, compute at the cloud the aggregation weight of each edge from the global prototype intervals and the local prototype intervals, aggregate the prototype networks of the edges weighted by these aggregation weights to obtain the calibrated global model, and send the global model down to every edge.
In this embodiment, computing the global and local prototype intervals at the cloud comprises: collecting the old and new class feature prototype sets of all edges at the cloud; computing, from the old class feature prototype sets of the edges, the mean feature prototype of each class to obtain the global class feature prototype set GP^t under task t, whose entry gp^t[j] is the global feature prototype of class j under task t, with M the number of classes the edges should ultimately distinguish; the local prototype interval of class j at edge i is computed from that edge's old and new class feature prototype sets, and the global prototype interval of class j at edge i is computed from that edge's old class feature prototype set and the global class feature prototype set GP^t.
Further, the prototype interval between two class feature prototype sets p_a and p_b is computed by taking, for class j, the Euclidean distance d_+[j](p_a, p_b) between their class-j prototypes together with the mean Euclidean distance between the class-j prototype of p_a and the prototypes of the classes other than j in p_b; choosing the old and the new class feature prototype sets of edge i gives the local prototype interval of class j at edge i, and choosing the old class feature prototype set of edge i and the global class feature prototype set gives the global prototype interval of class j at edge i.
μ[j] denotes the prototype interval of class j, covering both the local prototype interval and the global prototype interval of class j.
In this embodiment, the aggregation weight of each edge at the cloud is computed from the global and local prototype intervals, and the prototype networks of the edges are aggregated with these weights to obtain the calibrated global model, as shown in fig. 4. This specifically comprises the following substeps:
(3.1) Sum the local prototype intervals over the classes of each edge and feed the sum into a Sigmoid activation function to obtain the local credibility of that edge.
The Sigmoid activation function maps an input value z to an output value between 0 and 1 and is expressed as:
σ(z) = 1 / (1 + exp(−z)),
where σ(·) denotes the Sigmoid activation function.
(3.2) Sum the global prototype intervals over the classes of each edge and feed the sum into the Sigmoid activation function to obtain the global credibility of that edge.
It should be understood that the credibility of an edge's prototype-network update is computed from its prototype intervals over the different classes; the prototype intervals are of two kinds, local and global, and the corresponding credibilities, the local and the global credibility, are computed in substeps (3.1) and (3.2) respectively.
(3.3) Take the mean of each edge's local credibility and global credibility as that edge's aggregation weight at the cloud.
(3.4) Aggregate the prototype networks of the edges weighted by their aggregation weights to obtain the calibrated global model.
It should be understood that, after the global and local prototype intervals have been computed, the model calibration strategy based on feature prototype intervals is executed at the cloud; the specific flow is shown in fig. 4, i.e., substeps (3.1)-(3.4) above.
(4) Steps (2) and (3) are performed alternately until the preset number of training rounds is reached.
(5) After the collaborative optimization is completed, the global model finally received by the edge serves as the final prototype network; the edge feeds the data of its local class memory bank into the final prototype network to obtain the features of each class, and the mean of each class's features serves as the class feature prototype used for classification.
(5.1) After the collaborative optimization is completed, the global model finally received by the edge serves as the final prototype network; the edge feeds the data of its local class memory bank into the final prototype network to obtain the features of each class, and takes the mean of each class's features as the class feature prototype.
(5.2) Any sample (x, y) to be classified at the edge is fed into the final prototype network to obtain its feature.
(5.3) The Euclidean distances between this feature and all class feature prototypes are computed, and the class whose feature prototype yields the smallest Euclidean distance is taken as the classification result.
The technical scheme of the invention is further described below with reference to the accompanying drawings and specific embodiments.
In this embodiment, classification on the color image dataset CIFAR-10 is taken as an example; its 10 classes are airplane, automobile, bird, cat, deer, dog, frog, horse, ship and truck. Each class contains 6000 pictures of size 32×32. 50000 samples of this dataset are used as the training set and 10000 samples as the test set. In the cloud-edge collaborative scenario, three edges are arranged as shown in fig. 2, i.e., C = 3. A single task at each edge contains 3 randomly selected classes with 100 samples per class. The task changes of the edges are assumed to be synchronous, and the models are updated synchronously after the cloud calibration. It should be noted that model training at the edges is parallel; the cloud model is sent down to the edges after being updated, and the updates of the edge models may also proceed asynchronously.
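A sketch of how such per-edge task streams could be assembled from the CIFAR-10 training split is given below; the number of tasks per edge and all function names are illustrative assumptions, and loading the raw data (e.g. via torchvision) is left outside the snippet.

```python
import numpy as np

def build_task_streams(data, labels, num_edges=3, classes_per_task=3,
                       samples_per_class=100, num_tasks=4, seed=0):
    """Assign each edge a sequence of incremental tasks: every task contains
    `classes_per_task` randomly chosen classes with `samples_per_class` samples each.
    `data`/`labels` would come from e.g. the CIFAR-10 training split
    (50000 samples, 10 classes)."""
    rng = np.random.default_rng(seed)
    all_classes = np.unique(labels)
    streams = []
    for _ in range(num_edges):
        tasks = []
        for _ in range(num_tasks):
            cls = rng.choice(all_classes, size=classes_per_task, replace=False)
            idx = np.concatenate([
                rng.choice(np.where(labels == c)[0], size=samples_per_class, replace=False)
                for c in cls])
            tasks.append((data[idx], labels[idx]))   # one task = (images, labels) of 3 classes
        streams.append(tasks)
    return streams
```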
In this cloud-edge collaborative scenario, the federated class-incremental learning modeling method based on stable feature prototypes specifically comprises the following steps:
(1) Randomly initialize the weight parameters of the prototype network of each edge.
(2) Dynamically update the local class memory bank of each edge: for the class memory bank of each edge, compute the class center of each class under the current prototype network from the newly added current-task data, and select the P instances of each class closest to its class center as the current samples of the class memory bank to obtain the updated class memory bank.
(3) Locally update the prototype network under the replay paradigm: record the prototype network model of each edge before the update and update it with the replay-paradigm prototype-network update strategy as described above. Feed the dataset of the current task t and the class memory bank into the prototype network models before and after the update to obtain the old class feature prototype set and the new class feature prototype set.
(4) Compute the global prototypes and the prototype intervals at the cloud: collect the class feature prototypes of all edges at the cloud, then compute the mean feature prototype of each class from the old class feature prototype sets of the edges to obtain the global class feature prototype set under task t. The corresponding prototype intervals are then computed from pairs of prototype sets: the local prototype interval of class j at each edge is computed from that edge's old and new class feature prototype sets, and the aggregate (global) prototype interval of class j at each edge is computed from that edge's old class feature prototype set and the global class feature prototype set.
(5) Perform prototype-interval-based model calibration at the cloud: compute the credibility of each edge's prototype-network update from its prototype intervals over the different classes; specifically, the local credibility is computed from the local prototype intervals and the global credibility from the aggregate prototype intervals. The mean of each edge's local and global credibility serves as that edge's weight during aggregation; the prototype networks of the edges are aggregated weighted by these aggregation weights to obtain the global model, which is sent down to every edge.
(6) Steps (2)-(3) at the edges and steps (4)-(5) at the cloud are performed alternately for 500 iterations to complete the collaborative optimization.
(7) After the collaborative optimization is completed, each edge feeds the data of its local class memory bank into the final prototype network model distributed by the cloud to obtain the features of each class, and the mean of each class's features is computed as the class feature prototype used for classification.
From the color image dataset CIFAR-10, an image classification model constructed by the above method is obtained; it can be used online for image classification tasks, and its classification performance is verified on the test set.
In the case of multi-edge data class increments, the classification accuracies of the models are shown in Table 1. FedAvg and FedProx are used as comparison methods for the federated aggregation. "PN-FedAvg-R" denotes the FedAvg algorithm combined with the designed edge replay paradigm, and "PN-FedProx-R" denotes the FedProx algorithm combined with the designed edge replay paradigm; compared with the former, the latter corrects the model parameters for the client-drift problem during cloud aggregation. "R w/o" denotes ablating the edge replay paradigm in the designed method, i.e., keeping only the designed prototype-interval-based weighted aggregation strategy. The classification performance of the models on the current new classes and the old classes at the different edges is shown in Table 1. The method of the invention is found to outperform both the comparison methods and the ablated variants.
Table 1. Classification accuracy (%) of the invention and other federated modeling methods under multi-edge class increments
Fig. 5 shows the visualization results, after Principal Component Analysis (PCA) dimensionality reduction, of the class feature prototypes at each edge for the proposed method and for PN-FedAvg-R and PN-FedProx-R with the replay paradigm. The dots of different shades in the figure represent prototypes of different classes. As can be seen from fig. 5, the relative positions of the class feature prototypes across edges obtained by the proposed method are more consistent than those of the other methods, indicating that the prototype-interval-based weighted aggregation strategy keeps the feature spaces of different edges consistent during federated optimization. Meanwhile, the visualizations of PN-FedAvg-R at edges 1, 2 and 3 in fig. 5(b) and of PN-FedProx-R at edges 1, 2 and 3 in fig. 5(c) show class feature prototypes that lie too close to each other at edge 2 and edge 3 respectively, whereas the class feature prototypes of the proposed method in fig. 5(a) are better separated, which explains the higher classification accuracy in Table 1.
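The kind of visualization shown in Fig. 5 can be reproduced by projecting each edge's class feature prototypes to two dimensions; a sketch using scikit-learn's PCA and matplotlib follows (this tooling is assumed, not specified by the patent).

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

def plot_prototypes(prototype_sets, edge_names):
    """Project each edge's class feature prototypes to 2-D with PCA and
    scatter-plot them, one panel per edge (in the spirit of Fig. 5)."""
    all_vecs = np.vstack([np.vstack(list(p.values())) for p in prototype_sets])
    pca = PCA(n_components=2).fit(all_vecs)          # shared 2-D projection for comparability
    fig, axes = plt.subplots(1, len(prototype_sets), figsize=(4 * len(prototype_sets), 4))
    for ax, protos, name in zip(np.atleast_1d(axes), prototype_sets, edge_names):
        pts = pca.transform(np.vstack(list(protos.values())))
        ax.scatter(pts[:, 0], pts[:, 1])
        for (cls, _), (x, y) in zip(protos.items(), pts):
            ax.annotate(str(cls), (x, y))            # label each dot with its class id
        ax.set_title(name)
    plt.show()
```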
The foregoing embodiments are merely intended to illustrate the technical solution of the invention, not to limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those skilled in the art will understand that the technical solutions described therein may still be modified, or some of their technical features may be replaced by equivalents, without departing in essence from the spirit and scope of the technical solutions of the embodiments of the invention.

Claims (10)

1. A federated class-incremental learning modeling method based on stable feature prototypes, implemented in a cloud-edge collaborative scenario, characterized in that the cloud-edge collaborative scenario comprises C edges and one cloud, each edge maintaining a continuously growing visible class set, a dataset that varies with the task t, a class memory bank, and a prototype network f_{φ,i}(·), wherein i denotes the i-th edge, φ denotes the parameters of the prototype network, each sample in the dataset and the class memory bank is denoted (x, y), y denotes the class label corresponding to sample x, and the dataset is a color image dataset; the cloud aggregates the prototype networks updated by the edges after each task; the method comprises the following steps:
(1) randomly initializing the weight parameters of the prototype network preset at each edge;
(2) dynamically updating the local class memory bank of each edge to obtain an updated class memory bank; based on the updated class memory bank, updating the local prototype network of each edge with a replay-paradigm prototype-network update strategy to obtain an updated prototype network; feeding the dataset of the current task t and the class memory bank into the prototype networks before and after the update to obtain an old class feature prototype set and a new class feature prototype set;
(3) based on the old and new class feature prototype sets, computing at the cloud the aggregation weight of each edge from global prototype intervals and local prototype intervals, aggregating the prototype networks of the edges weighted by these aggregation weights to obtain a calibrated global model, and sending the global model down to every edge;
(4) performing step (2) and step (3) alternately until a preset number of training rounds is reached;
(5) after the collaborative optimization is completed, taking the global model finally received by the edge as the final prototype network; the edge feeds the data of its local class memory bank into the final prototype network to obtain the features of each class, and the mean of each class's features serves as the class feature prototype used for classification.
2. The federated class-incremental learning modeling method based on stable feature prototypes according to claim 1, wherein dynamically updating the local class memory bank of each edge to obtain the updated class memory bank specifically comprises:
for the class memory bank of each edge, according to the newly added current-task data of each class, computing the class center of each class under the current prototype network, selecting the P instances of each class closest to its class center as the current samples of that class, and adding them to the class memory bank to obtain the updated class memory bank.
3. The federated class-incremental learning modeling method based on stable feature prototypes according to claim 2, wherein computing the class center of each class under the current prototype network specifically comprises:
feeding the current-task data of each class into the current prototype network to obtain the features of all task data of that class, and taking the mean of these features as the class center of that class under the current prototype network.
4. The federated class-incremental learning modeling method based on stable feature prototypes according to claim 1, wherein updating the local prototype network of each edge with the replay-paradigm prototype-network update strategy based on the updated class memory bank to obtain the updated prototype network specifically comprises:
recording the prototype network of each edge before the update and, under the replay paradigm, optimizing it a fixed number of times based on the dataset of the current task t and the updated class memory bank to obtain the updated prototype network.
5. The federated class-incremental learning modeling method based on stable feature prototypes according to claim 4, wherein, under the replay paradigm, optimizing the prototype network a fixed number of times based on the dataset of the current task t and the updated class memory bank to obtain the updated prototype network specifically comprises:
each time the prototype network of edge i is optimized locally, randomly selecting K_c classes from the visible class set of edge i, whose cardinality is the total number of classes visible to edge i; taking the data of the K_c selected classes from the class memory bank and from the dataset of the current task t to construct the support set of the current task t, the remaining data of the class memory bank forming the query set of the current task t;
feeding the support set into the prototype network to obtain the features of each class, and taking the mean of each class's features as the class feature prototype p[j];
optimizing, by stochastic gradient descent, the loss function of the query set on the class feature prototypes p[j], the loss being a negative log-probability function representing the class distribution generated by the prototype network f_φ from the distance between a sample (x, y) and the prototype of the corresponding class in feature space; specifically, before optimizing the query set the negative log-probability function is initialized to 0, and while optimizing the query set, for each sample of each class j in the query set, its negative log-probability function is updated according to the distance d(f_φ(x), p[j]) to its own class prototype and the distances d(f_φ(x), p[j']) to the prototypes of the other classes j';
wherein P is the number of samples retained per class in the class memory bank, f_φ(x) denotes the feature obtained after feeding query-set sample x into the prototype network f_φ, j' denotes a class other than j, and d(f_φ(x), p[j]) denotes the Euclidean distance between f_φ(x) and p[j].
6. The federated class-incremental learning modeling method based on stable feature prototypes according to claim 1, wherein feeding the dataset of the current task t and the class memory bank into the prototype networks before and after the update to obtain the old class feature prototype set and the new class feature prototype set specifically comprises:
for edge i, selecting the samples of class j from the dataset of the current task t and the class memory bank, feeding them into the prototype network before the update to obtain a group of features of class j, taking the mean of this group of features as the old feature prototype of class j, and constructing the old class feature prototype set from the old feature prototypes of all classes;
for edge i, selecting the samples of class j from the dataset of the current task t and the class memory bank, feeding them into the updated prototype network to obtain a group of features of class j, taking the mean of this group of features as the new feature prototype of class j, and constructing the new class feature prototype set from the new feature prototypes of all classes.
7. The federated class-incremental learning modeling method based on stable feature prototypes according to claim 1, wherein computing the global prototype intervals and the local prototype intervals at the cloud based on the old and new class feature prototype sets specifically comprises:
collecting the old and new class feature prototype sets of all edges at the cloud; computing, from the old class feature prototype sets of the edges, the mean feature prototype of each class to obtain the global class feature prototype set GP^t under task t, wherein gp^t[j] denotes the global feature prototype of class j under task t and M denotes the number of classes the edges should ultimately distinguish; computing the corresponding prototype intervals from pairs of class feature prototype sets, specifically computing the local prototype interval of class j at each edge from that edge's old and new class feature prototype sets, and computing the global prototype interval of class j at each edge from that edge's old class feature prototype set and the global class feature prototype set GP^t.
8. The federated class-incremental learning modeling method based on stable feature prototypes according to claim 7, wherein computing the corresponding prototype interval from two class feature prototype sets specifically comprises:
denoting the two class feature prototype sets p_a and p_b; for class j, computing the Euclidean distance d_+[j](p_a, p_b) between the class-j prototypes of p_a and p_b, and computing the mean of the Euclidean distances between the class-j prototype of p_a and the prototypes of the classes other than j in p_b;
when the two class feature prototype sets p_a and p_b are chosen as the old class feature prototype set FP_i^t and the new class feature prototype set UP_i^t of edge i, computing the local prototype interval of class j at edge i from these quantities;
when the two class feature prototype sets p_a and p_b are chosen as the old class feature prototype set FP_i^t of edge i and the global class feature prototype set GP^t, computing the global prototype interval μ_i^{agg,t}[j] of class j at edge i in the same way;
wherein μ[j] denotes the prototype interval of class j, covering both the local prototype interval and the global prototype interval of class j.
9. The federated category incremental learning modeling method based on stable feature prototypes according to claim 1, wherein calculating the aggregation weight of each edge at the cloud according to the global prototype interval and the local prototype interval, and weighting and aggregating the prototype networks of the edges according to their aggregation weights to obtain the global model after model calibration, specifically comprises:
summing the local prototype intervals of each edge over its classes and feeding the sum into a Sigmoid activation function to obtain the local credibility of that edge;
summing the global prototype intervals of each edge over its classes and feeding the sum into a Sigmoid activation function to obtain the global credibility of that edge;
taking the mean of each edge's local credibility and global credibility as that edge's aggregation weight at the cloud, and weighting and aggregating the prototype networks of the edges according to their aggregation weights to obtain the global model after model calibration.
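A sketch of the cloud-side calibration in claim 9: per-edge credibilities from Sigmoid-activated interval sums, their mean as the aggregation weight, and a weighted aggregation of the edge prototype networks. The state_dict-level parameter averaging and the weight normalization are illustrative assumptions; the claim only specifies the two credibilities and the weighted aggregation itself.

```python
import torch

def aggregate_models(edge_state_dicts, local_intervals, global_intervals):
    """Weighted aggregation of edge prototype networks into a calibrated global model (sketch).

    edge_state_dicts: list of prototype-network state_dicts, one per edge.
    local_intervals / global_intervals: per-edge dicts mapping class id -> interval tensor.
    """
    weights = []
    for loc, glo in zip(local_intervals, global_intervals):
        local_cred = torch.sigmoid(sum(loc.values()))   # local credibility of the edge
        global_cred = torch.sigmoid(sum(glo.values()))  # global credibility of the edge
        weights.append((local_cred + global_cred) / 2)  # aggregation weight = mean of the two
    weights = torch.stack(weights)
    weights = weights / weights.sum()                   # normalization is an added assumption

    calibrated = {}                                     # parameter-wise weighted average
    for key in edge_state_dicts[0]:
        calibrated[key] = sum(w * sd[key] for w, sd in zip(weights, edge_state_dicts))
    return calibrated
```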
10. The federated category incremental learning modeling method based on stable feature prototypes according to claim 1, wherein step (5) specifically comprises:
after the collaborative optimization is completed, the global model finally received by the edge is used as the final prototype network; the edge inputs the data in its local class memory bank into the final prototype network to obtain the features of each class, and the mean of each class's features is calculated as that class's feature prototype;
for any sample (x, y) to be classified at the edge, the sample is input into the final prototype network to obtain its corresponding feature;
the Euclidean distance between the sample's feature and each class feature prototype is calculated, and the class whose feature prototype is at the minimum Euclidean distance is taken as the classification result.
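Finally, a sketch of the nearest-prototype inference in claim 10: class prototypes are rebuilt with the final prototype network over the class memory bank, and a sample is assigned to the class whose prototype is closest in Euclidean distance. The helper names and the single-sample batching convention are assumptions.

```python
import torch

def classify(final_net, class_prototypes, x):
    """Nearest-prototype classification with the final prototype network (sketch).

    class_prototypes: dict class id -> prototype computed from the class memory bank.
    x: a single sample with a leading batch dimension (assumed convention).
    """
    final_net.eval()
    with torch.no_grad():
        feat = final_net(x).squeeze(0)  # feature of the sample to classify
    # Euclidean distance to every class prototype; predict the closest class.
    distances = {j: torch.norm(feat - p) for j, p in class_prototypes.items()}
    return min(distances, key=distances.get)
```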
CN202411062873.5A 2024-08-05 2024-08-05 A federated category incremental learning modeling method based on stable feature prototypes Active CN118586475B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202411062873.5A CN118586475B (en) 2024-08-05 2024-08-05 A federated category incremental learning modeling method based on stable feature prototypes

Publications (2)

Publication Number Publication Date
CN118586475A CN118586475A (en) 2024-09-03
CN118586475B true CN118586475B (en) 2024-11-29

Family

ID=92526158

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202411062873.5A Active CN118586475B (en) 2024-08-05 2024-08-05 A federated category incremental learning modeling method based on stable feature prototypes

Country Status (1)

Country Link
CN (1) CN118586475B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114861936A (en) * 2022-05-10 2022-08-05 天津大学 A Federated Incremental Learning Method Based on Feature Prototypes
CN116089883A (en) * 2023-01-30 2023-05-09 北京邮电大学 A training method for improving the discrimination between old and new categories in incremental learning of existing categories

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20200144398A (en) * 2019-06-18 2020-12-29 삼성전자주식회사 Apparatus for performing class incremental learning and operation method thereof
US20230281438A1 (en) * 2022-03-03 2023-09-07 NavInfo Europe B.V. Consistency-Regularization Based Approach for Mitigating Catastrophic Forgetting in Continual Learning
EP4303754A1 (en) * 2022-07-07 2024-01-10 Tata Consultancy Services Limited Prompt augmented generative replay via supervised contrastive training for lifelong intent detection
WO2024119422A1 (en) * 2022-12-08 2024-06-13 上海成电福智科技有限公司 Deep-neural-network-based class-incremental learning method for mobile phone radiation source spectrogram
CN116630718A (en) * 2023-06-08 2023-08-22 天津大学 A Prototype-Based Low Perturbation Image-like Incremental Learning Algorithm
CN117275098B (en) * 2023-11-13 2024-02-27 南京栢拓视觉科技有限公司 Federal increment method oriented to action recognition and based on topology data analysis
CN118072099A (en) * 2024-03-11 2024-05-24 桂林电子科技大学 Class increment learning method based on joint distillation playback strategy
CN118313445A (en) * 2024-04-29 2024-07-09 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) A federated incremental learning method and system based on constrained gradient updating

Also Published As

Publication number Publication date
CN118586475A (en) 2024-09-03

Similar Documents

Publication Publication Date Title
US20220343172A1 (en) Dynamic, automated fulfillment of computer-based resource request provisioning using deep reinforcement learning
US10360500B2 (en) Two-phase distributed neural network training system
US10360517B2 (en) Distributed hyperparameter tuning system for machine learning
US20200401939A1 (en) Systems and methods for preparing data for use by machine learning algorithms
Joy et al. Batch Bayesian optimization using multi-scale search
US10963802B1 (en) Distributed decision variable tuning system for machine learning
US11625614B2 (en) Small-world nets for fast neural network training and execution
WO2022166125A1 (en) Recommendation system with adaptive weighted baysian personalized ranking loss
Liu et al. Generalising random forest parameter optimisation to include stability and cost
US20200372295A1 (en) Minimum-Example/Maximum-Batch Entropy-Based Clustering with Neural Networks
CN110390393A (en) Aspect of model screening technique and device, readable storage medium storing program for executing
US20210334704A1 (en) Method and System for Operating a Technical Installation with an Optimal Model
US11741101B2 (en) Estimating execution time for batch queries
CN112215655A (en) Client portrait label management method and system
US11468271B2 (en) Method of data prediction and system thereof
CN113656707A (en) Financing product recommendation method, system, storage medium and equipment
CN118586475B (en) A federated category incremental learning modeling method based on stable feature prototypes
CN108829846B (en) Service recommendation platform data clustering optimization system and method based on user characteristics
Bouchra Pilet et al. Simple, efficient and convenient decentralized multi-task learning for neural networks
US20230032822A1 (en) Systems and methods for adapting machine learning models
Truong et al. A flexible cluster-oriented alternative clustering algorithm for choosing from the Pareto front of solutions
CN115310709A (en) Power engineering project information optimization method based on particle swarm optimization
Meng et al. Adaptive resonance theory (ART) for social media analytics
CN114330118A (en) Data processing method and device and electronic equipment
EP4007173A1 (en) Data storage method, and data acquisition method and apparatus therefor

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant