
CN118014010B - Multi-objective evolutionary neural architecture search method based on a multi-population mechanism and a proxy model - Google Patents


Info

Publication number
CN118014010B
Authority
CN
China
Prior art keywords
neural network
population
architecture
network architecture
search
Prior art date
Legal status
Active
Application number
CN202410418128.3A
Other languages
Chinese (zh)
Other versions
CN118014010A (en)
Inventor
朱陈陈
薛羽
Current Assignee
Nanjing University of Information Science and Technology
Original Assignee
Nanjing University of Information Science and Technology
Priority date
Filing date
Publication date
Application filed by Nanjing University of Information Science and Technology filed Critical Nanjing University of Information Science and Technology
Priority to CN202410418128.3A priority Critical patent/CN118014010B/en
Publication of CN118014010A publication Critical patent/CN118014010A/en
Application granted granted Critical
Publication of CN118014010B publication Critical patent/CN118014010B/en
Status: Active

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G06N 3/0464 - Convolutional networks [CNN, ConvNet]
    • G06N 3/08 - Learning methods
    • G06N 3/086 - Learning methods using evolutionary algorithms, e.g. genetic algorithms or genetic programming

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Physiology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a multi-objective evolutionary neural architecture search method based on a multi-population mechanism and a proxy model, in the technical field of automated machine learning. It aims to solve the problems of high labor cost, low efficiency, and difficulty in accommodating multi-objective scenarios when convolutional neural networks for various visual tasks are designed in the prior art. The method initializes an evolutionary search process with an initial population and performs a multi-objective evolutionary search in the search space, combining a pre-trained proxy model with the multi-population mechanism, to obtain candidate neural network architectures; neural network architectures that balance the two optimization objectives are then screened out from the candidates. The proxy model accelerates the search process, and the multi-population mechanism expands the diversity of solutions, enabling an efficient neural architecture search that yields a set of network architectures balancing multiple optimization objectives on the target data set.

Description

Multi-objective evolutionary neural architecture search method based on a multi-population mechanism and a proxy model
Technical Field
The invention relates to a multi-objective evolutionary neural architecture search method based on a multi-population mechanism and a proxy model, and belongs to the technical field of automated machine learning.
Background
Convolutional neural networks (CNNs) have enjoyed tremendous success in a variety of computer vision tasks. However, conventional CNNs are typically designed manually by experts with a great deal of domain knowledge and experience. Not every interested user has such expertise, and even for an expert, designing a CNN is a time-consuming, trial-and-error process. Neural architecture search (NAS) can simplify and automate the design of deep convolutional neural networks and can produce architectures that are more competitive than manually designed ones. In recent years, researchers have developed many NAS approaches, which have attracted increasing attention in industry and academia across learning tasks such as object detection, semantic segmentation, and natural language processing.
An evolutionary algorithm is a search method based on the principles of evolution: the optimal solution is sought through the evolution and selection of a population. NAS methods based on evolutionary algorithms are widely studied for their global search capability and flexibility. However, a significant bottleneck in this area is the need to evaluate a large number of network architectures during the search, which consumes substantial computing resources. Researchers have proposed many approaches to improve the efficiency of NAS algorithms, but existing acceleration methods still have to train a large number of network architectures, or focus only on absolute classification-accuracy values.
Beyond accuracy, practical applications also require NAS to find computationally efficient network architectures, e.g., low power consumption for mobile applications and low latency for autonomous driving. Maximizing accuracy and minimizing network complexity are inherently competing objectives, so NAS must be treated as a multi-objective optimization problem. In multi-objective evolutionary NAS, population diversity may be gradually lost as the number of iterations increases, causing the algorithm to converge to, and become trapped in, a local optimum. Maintaining population diversity without sacrificing convergence is a challenging problem.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a multi-objective evolutionary neural architecture search method based on a multi-population mechanism and a proxy model. The proxy model accelerates the search process, and the multi-population mechanism expands the diversity of solutions for an efficient neural architecture search; a set of network architectures balancing multiple optimization objectives can be obtained on the target data set, improving search efficiency and reducing the consumption of computing resources.
To achieve the above purpose, the invention adopts the following technical scheme.
In one aspect, the invention provides a multi-objective evolutionary neural architecture search method based on a multi-population mechanism and a proxy model, comprising:
constructing a search space, and encoding the neural network architectures in the search space;
determining two optimization objectives, namely network complexity and classification accuracy;
initializing a population of neural network architectures in the search space according to network complexity, to obtain an initial population for the evolutionary search;
initializing the evolutionary search process with the initial population, and performing a multi-objective evolutionary search in the search space in combination with a pre-trained proxy model and the multi-population mechanism, to obtain candidate neural network architectures;
and screening out the neural network architectures that balance network complexity and classification accuracy.
Further, the search space is constructed as follows:
setting search parameters;
searching the backbone of the convolutional neural network according to the search parameters to obtain qualifying convolutional neural networks, all of which together constitute the search space.
The backbone of the convolutional neural network comprises five sequentially connected MBConvBlock modules; each MBConvBlock module consists of a plurality of layers, and each layer adopts an inverted bottleneck structure.
The search parameters comprise the number of layers of each MBConvBlock module, the convolution kernel size and expansion rate of each layer, and the input resolution.
Further, encoding the neural network architectures in the search space includes:
encoding each architecture in the search space as a fixed-length integer string, the encoded content comprising the resolution, the number of layers, the convolution kernel sizes, and the expansion rates;
when the encoding of an architecture is shorter than the fixed length, padding it with zeros to reach the fixed length.
Further, initializing a population of neural network architectures in the search space according to network complexity to obtain the initial population includes:
randomly sampling N neural network architectures from the search space;
calculating the complexity of each neural network architecture;
dividing all sampled architectures into K populations according to their complexity, each population containing N/K architectures;
randomly sampling n architectures from each population and merging them to obtain the initial population of P architectures, where P < N.
Further, after the population initialization of the neural network architectures in the search space according to network complexity, the method further includes:
constructing a proxy model;
training the proxy model on the initial population, taking the neural network architectures in the initial population as input and the pairwise comparison of their quality as output, to obtain the pre-trained proxy model.
Further, the training process of the proxy model includes:
training the neural network architectures in the initial population with stochastic gradient descent to obtain their classification accuracy; each architecture together with its classification accuracy forms one sample, and the samples form the original training sample set;
pairing the original training samples in order, i.e., pairing the i-th sample with each of the remaining samples after it; if the classification accuracy of the i-th sample is better than that of its paired sample, the pair is labeled 1, otherwise 0; a sample with more 1-labels is a superior sample, and a sample with more 0-labels is an inferior sample.
Further, initializing the evolutionary search process with the initial population and performing the multi-objective evolutionary search in the search space in combination with the pre-trained proxy model and the multi-population mechanism, to obtain the candidate neural network architectures, comprises:
S1, performing non-dominated sorting and crowding-degree calculation on the neural network architectures in the initial population to obtain a main population and a sub-population;
S2, calculating a threshold for the current round of evolution from a randomly generated number between 0 and 1, the current evolution count t (1 ≤ t ≤ T), and the total number of evolutions T;
S3, comparing the threshold with a preset hyperparameter: if the threshold is greater than the hyperparameter, the main population and the sub-population together serve as the parent individuals; if the threshold is less than the hyperparameter, the main population alone serves as the parent individuals; applying crossover and mutation to the parent individuals to generate a series of offspring individuals;
S4, obtaining the classification-accuracy ranking of the offspring individuals through the pre-trained proxy model and calculating their complexity; performing non-dominated sorting and crowding-degree calculation on the offspring according to the ranking and complexity, and retaining M offspring individuals;
S5, merging the M retained offspring with the initial population, and repeating S1-S5 until the total number of evolutions is reached, to obtain the candidate neural network architectures.
Further, performing non-dominated sorting and crowding-degree calculation on the neural network architectures in the initial population to obtain the main population and the sub-population includes:
performing non-dominated sorting on the architectures in the initial population and taking the first level of the sorting as the main population;
excluding the main population, calculating the crowding degree of the remaining architectures, and selecting the top-ranked architectures by crowding degree as the sub-population.
Further, performing non-dominated sorting and crowding-degree calculation on the offspring individuals according to their classification-accuracy ranking and complexity, and retaining M offspring, includes:
performing non-dominated sorting with the fast non-dominated sorting algorithm according to the classification-accuracy ranking and complexity of the offspring, to obtain the non-dominated sorting result;
selecting the offspring in the first level of the sorting result to calculate the crowding degree, and retaining the top M offspring of the crowding-degree ranking.
Further, screening out, from the candidate neural network architectures, the architectures that balance the two optimization objectives of network complexity and classification accuracy includes:
performing non-dominated sorting on the candidate neural network architectures to obtain a sorting result;
selecting several architectures at the front of the sorting result and training them with stochastic gradient descent, to obtain the neural network architectures that balance the two optimization objectives.
In another aspect, the invention further provides a computer program product comprising a computer program which, when executed by a processor, implements the steps of any of the multi-objective evolutionary neural architecture search methods based on a multi-population mechanism and a proxy model described above.
Compared with the prior art, the invention has the following beneficial effects:
the proxy model is built on pairwise comparison relations and predicts the accuracy ranking of architectures instead of their absolute accuracy values, making the search more efficient; a multi-population mechanism is provided in which the main population and the sub-population cooperate during the search, the main population driving the evolution and the sub-population expanding diversity, which effectively prevents the algorithm from falling into local optima and accelerates its convergence;
using the proxy model during the search greatly reduces time consumption and improves search efficiency, while the multi-population mechanism expands the diversity of solutions and keeps the algorithm out of local optima.
Drawings
FIG. 1 is a flow chart of the multi-objective evolutionary neural architecture search method based on a multi-population mechanism and a proxy model in an embodiment of the invention;
FIG. 2 is a schematic diagram of the search-space construction flow of the method in an embodiment of the invention;
FIG. 3 is a schematic diagram of the architecture-encoding flow of the method in an embodiment of the invention;
FIG. 4 is a schematic diagram of the proxy-model processing flow of the method in an embodiment of the invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings. The following examples are only for more clearly illustrating the technical aspects of the present invention, and are not intended to limit the scope of the present invention.
Example 1
As shown in FIG. 1, the multi-objective evolutionary neural architecture search method based on a multi-population mechanism and a proxy model provided by this embodiment of the invention comprises the following steps.
In the first step, a suitable search space is constructed, and the neural network architectures in the search space are encoded.
The convolutional neural network to be searched comprises three parts: a stem, a backbone, and an output head. The stem extracts features and the output head produces class predictions; neither needs to be searched. The backbone, which is searched, comprises five sequentially connected MBConvBlock modules (inverted-linear-bottleneck modules with depthwise separable convolution). Each MBConvBlock module consists of a series of layers, and each layer adopts an inverted bottleneck structure comprising three parts: a 1×1 convolution, a depthwise separable convolution, and a 1×1 convolution.
Search parameters are set, including the number of layers (depth) of each block, the convolution kernel size of each layer, the expansion rate, and the resolution (input image size). In this embodiment, the candidate number of layers is {2, 3, 4}, the expansion rate is selected from {3, 4, 6}, and the kernel size is selected from {3, 5, 7}. The candidate input image sizes range from 192 to 256 with a step size of 4. As shown in FIG. 2, the backbone of the convolutional neural network is searched according to these parameters to obtain the qualifying convolutional neural networks, all of which constitute the search space.
As shown in FIG. 3, the neural network architectures in the search space are encoded as fixed-length integer strings; in this embodiment the fixed length is 46. The encoded content comprises the resolution, the number of layers, the convolution kernel sizes, and the expansion rates.
If an architecture has fewer layers and its encoding is shorter than 46 digits, zeros are padded to reach the fixed length.
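The encoding above can be sketched as follows. The exact field layout of the 46-digit string is not given in the text, so the ordering below (resolution first, then per-block depth followed by per-layer kernel size and expansion rate, zero-padded to the maximum depth) is an assumption for illustration; with five blocks and a maximum of four layers per block it happens to yield exactly 46 digits.

```python
def encode_architecture(resolution, layers_per_block, kernels, expands,
                        total_len=46, max_layers=4):
    """Encode one architecture as a fixed-length integer list.

    layers_per_block: depth of each of the five MBConv blocks, e.g. [2,3,4,2,3]
    kernels/expands:  per-block lists of kernel sizes and expansion rates.
    Layers beyond a block's depth are zero-padded, as described in the text.
    """
    code = [resolution]                      # input image size, e.g. 192..256
    for b, n_layers in enumerate(layers_per_block):
        code.append(n_layers)                # depth of this block
        for l in range(max_layers):
            if l < n_layers:
                code.append(kernels[b][l])   # kernel size: 3, 5 or 7
                code.append(expands[b][l])   # expansion rate: 3, 4 or 6
            else:
                code.extend([0, 0])          # pad missing layers with zeros
    return code + [0] * (total_len - len(code))
```

For example, a five-block architecture with depths [2, 3, 4, 2, 3] encodes to a 46-digit string regardless of how many layers are actually used.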
In the second step, the optimization objectives are determined.
In this embodiment, two optimization objectives are determined: network complexity (model size, computational cost, etc.) and classification accuracy.
In the third step, a population of neural network architectures in the search space is initialized according to network complexity to obtain the initial population for the evolutionary search.
N neural network architectures are randomly sampled from the search space, and the complexity of each is calculated. Network complexity can generally be measured by the parameter count, computational cost, or latency of the network; here the computational cost, i.e., FLOPs (the number of floating-point operations), is used as the second objective. The calculation covers convolutional layers and fully connected layers. For a convolutional layer:
FLOPs_conv = C_in × k² × C_out × H × W
where C_in is the number of input channels, C_out is the number of output channels (equal to the number of convolution kernels of that layer), k is the convolution kernel size, and H and W are the height and width of the feature map, respectively.
For a fully connected layer:
FLOPs_fc = N_in × N_out
where N_in is the number of input neurons and N_out is the number of output neurons.
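The two FLOPs formulas translate directly into code. This is a minimal sketch counting multiply-accumulate operations, matching the variable definitions above; whether to double the count to include additions is a convention choice not specified in the text.

```python
def conv_flops(c_in, c_out, k, h, w):
    """FLOPs of a standard convolution layer:
    C_in * k^2 multiply-accumulates per output element,
    with C_out * H * W output elements."""
    return c_in * k * k * c_out * h * w

def fc_flops(n_in, n_out):
    """FLOPs of a fully connected layer: one multiply-accumulate per weight."""
    return n_in * n_out
```

For instance, a 3→16-channel 3×3 convolution on a 32×32 feature map costs 3·9·16·32·32 FLOPs, and a 512→10 classifier head costs 5120.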
All sampled architectures are divided into K populations according to their complexity, each population containing N/K architectures.
From each population, n architectures are randomly sampled and merged to obtain the initial population of P architectures, where P < N.
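The complexity-stratified initialization can be sketched as follows. The symbols N, K, and n stand in for the values elided in the text (the equation images were lost), so the function parameters are assumptions; the point of the scheme is that the initial population spans the whole complexity range rather than clustering at one end.

```python
import random

def stratified_init(architectures, complexity_fn, k_groups, n_per_group):
    """Sample an initial population stratified by complexity.

    Sorts the N sampled architectures by complexity, splits them into
    k_groups equal-sized groups, and draws n_per_group from each group,
    giving a population of P = k_groups * n_per_group individuals (P < N).
    """
    ranked = sorted(architectures, key=complexity_fn)
    group_size = len(ranked) // k_groups
    population = []
    for g in range(k_groups):
        group = ranked[g * group_size:(g + 1) * group_size]
        population.extend(random.sample(group, n_per_group))
    return population
```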
Fourth, constructing a proxy model and training the proxy model:
A proxy model is constructed; in this embodiment, a support vector machine (SVM) is selected as the proxy model.
The proxy model is trained on the initial population, taking the neural network architectures as input and the pairwise comparison of their quality as output, to obtain the pre-trained proxy model. With reference to FIG. 4, the training process is as follows:
The neural network architectures in the initial population are trained with stochastic gradient descent to obtain their classification accuracy. Each architecture together with its classification accuracy forms one sample, and the samples form the original training sample set.
The original training samples are paired in order: the i-th sample is paired with each of the remaining samples after it; if the classification accuracy of the i-th sample is better than that of its paired sample, the pair is labeled 1, otherwise 0. A sample with more 1-labels is a superior sample, and a sample with more 0-labels is an inferior sample.
Taking the first and second architectures as an example: the first architecture is paired with each of the remaining n−1 architectures to obtain n−1 pairs, each labeled 1 if the classification accuracy of the first architecture is better than that of the other architecture, and 0 otherwise. The first architecture is then set aside, and the second architecture is paired with the remaining n−2 architectures and labeled in the same way.
In total, n(n−1)/2 paired samples are obtained, and these paired samples form the training data set.
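The pairwise data-set construction can be sketched as follows. The feature representation of a pair (here, simple concatenation of the two encodings) is an assumption; the text only specifies the pairing order and the 0/1 labeling rule.

```python
def build_pairwise_dataset(encodings, accuracies):
    """Build the n*(n-1)/2 ranking pairs for the proxy model.

    Each pair (i, j) with i < j becomes one training sample labeled 1
    when architecture i has higher classification accuracy than j, else 0.
    """
    pairs, labels = [], []
    n = len(encodings)
    for i in range(n):
        for j in range(i + 1, n):
            pairs.append(encodings[i] + encodings[j])  # concatenated codes
            labels.append(1 if accuracies[i] > accuracies[j] else 0)
    return pairs, labels
```

An off-the-shelf classifier such as an SVM can then be fitted on these pairs to predict, for any two unseen architectures, which one ranks higher.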
In the fifth step, the evolutionary search process is initialized with the initial population, and a multi-objective evolutionary search is performed in the search space in combination with the pre-trained proxy model and the multi-population mechanism, to obtain the candidate neural network architectures.
S1. Non-dominated sorting is performed on the neural network architectures in the initial population, and the first level of the sorting is taken as the main population. Excluding the main population, the crowding degree of the remaining architectures is calculated, and the top-ranked architectures by crowding degree are selected as the sub-population.
S2. The threshold for the current round of evolution is calculated from a randomly generated number between 0 and 1, the current evolution count t (1 ≤ t ≤ T), and the total number of evolutions T.
S3. The threshold is compared with a preset hyperparameter: if the threshold is greater than the hyperparameter, the main population and the sub-population together serve as the parent individuals; if it is less than the hyperparameter, the main population alone serves as the parent individuals. A series of offspring individuals is generated from the parents through crossover and mutation; in this embodiment, the crossover operator uses two-point crossover and the mutation operator uses polynomial mutation.
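Steps S2-S3 can be sketched as follows. The exact threshold expression is garbled in the source; the decaying form rand() × (1 − t/T), and the hyperparameter value 0.3, are assumptions chosen so that the sub-population joins the mating pool often early in the search (favoring diversity) and rarely late (favoring exploitation), consistent with the roles the text assigns to the two populations.

```python
import random

def select_parents(main_pop, sub_pop, t, total_gens, hyper=0.3):
    """Threshold-based parent selection for generation t of total_gens.

    The threshold shrinks over time, so the diversity-preserving
    sub-population participates less and less as evolution proceeds.
    """
    theta = random.random() * (1 - t / total_gens)
    if theta > hyper:
        return main_pop + sub_pop   # early rounds: explore with both
    return list(main_pop)           # late rounds: exploit the main population
```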
S4. The classification-accuracy ranking of the offspring individuals is obtained through the pre-trained proxy model, and their complexity is calculated.
Non-dominated sorting is performed with the fast non-dominated sorting algorithm according to the classification-accuracy ranking and complexity of the offspring, yielding the non-dominated sorting result.
The offspring in the first level of the sorting result are selected for crowding-degree calculation, and the top M offspring of the crowding-degree ranking are retained.
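The crowding-degree calculation used in S4 can be sketched as the standard crowding distance over the two objectives. The objective tuple (accuracy rank, complexity) is taken from the text; the implementation itself is the usual scheme in which boundary individuals receive infinite distance so that extreme trade-offs are always retained.

```python
def crowding_distance(front):
    """Crowding distance of one non-dominated front.

    front: list of (accuracy_rank, complexity) objective tuples.
    For each objective, the front is sorted and each individual accumulates
    the normalized gap between its two neighbours; boundary points get inf.
    """
    n = len(front)
    dist = [0.0] * n
    for m in range(2):                      # two objectives
        order = sorted(range(n), key=lambda i: front[i][m])
        dist[order[0]] = dist[order[-1]] = float("inf")
        span = front[order[-1]][m] - front[order[0]][m] or 1.0
        for k in range(1, n - 1):
            dist[order[k]] += (front[order[k + 1]][m]
                               - front[order[k - 1]][m]) / span
    return dist
```

Sorting the first front by this distance in descending order and keeping the top M individuals implements the retention rule described above.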
S5. The M retained offspring are merged with the initial population, and S1-S5 are repeated until the total number of evolutions is reached, yielding the candidate neural network architectures.
The M offspring retained at the end of each round may be decoded and trained, and then added to the training data set used to train the proxy model.
In the sixth step, the neural network architectures that balance the two optimization objectives of network complexity and classification accuracy are screened out from the candidates.
Non-dominated sorting is performed on the candidate neural network architectures to obtain the sorting result. Several architectures at the front of the sorting result are selected and trained with stochastic gradient descent, yielding the neural network architectures that balance the two optimization objectives.
Example 2:
On the basis of Embodiment 1, this embodiment provides a computer program product comprising a computer program which, when executed by a processor, implements the steps of the multi-objective evolutionary neural architecture search method based on a multi-population mechanism and a proxy model of Embodiment 1.
The foregoing is merely a preferred embodiment of the present invention, and it should be noted that modifications and variations could be made by those skilled in the art without departing from the technical principles of the present invention, and such modifications and variations should also be regarded as being within the scope of the invention.

Claims (6)

1.基于多种群机制及代理模型的多目标演化神经架构搜索方法,其特征在于,包括:1. A multi-objective evolutionary neural architecture search method based on a multi-population mechanism and an agent model, characterized by comprising: 构建搜索空间,包括:设置搜索参数,根据搜索参数对卷积神经网络的主干部分进行搜索,得到符合条件的卷积神经网络,所有符合条件的卷积神经网络构成搜索空间;Constructing a search space, including: setting search parameters, searching the backbone of the convolutional neural network according to the search parameters, obtaining a convolutional neural network that meets the conditions, and all convolutional neural networks that meet the conditions constitute a search space; 所述卷积神经网络的主干部分包括五个依次连接的MBConvBlocks模块,所述MBConvBlocks模块由多个层组成,每一层均采用倒置瓶颈结构;所述搜索参数包括每个MBConvBlocks模块的层数、每一层的卷积核大小、扩展率和分辨率,所述分辨率即为输入图像大小;The backbone of the convolutional neural network includes five MBConvBlocks modules connected in sequence, each of which is composed of multiple layers, and each layer adopts an inverted bottleneck structure; the search parameters include the number of layers of each MBConvBlocks module, the convolution kernel size of each layer, the expansion rate and the resolution, and the resolution is the input image size; 对搜索空间中的神经网络架构进行编码;Encode the neural network architecture in the search space; 确定两个优化目标,所述优化目标分别为网络复杂度和分类精度;Determine two optimization goals, the optimization goals are network complexity and classification accuracy; 根据网络复杂度对搜索空间中的神经网络架构进行种群初始化,得到用于演化搜索的初始种群;Initialize the population of neural network architectures in the search space according to the network complexity to obtain the initial population for evolutionary search; 通过初始种群来初始化演化搜索过程,并结合预训练的代理模型和多种群机制在搜索空间内进行多目标演化搜索,得到候选神经网络架构,包括:The evolutionary search process is initialized through the initial population, and a multi-objective evolutionary search is performed in the search space in combination with the pre-trained proxy model and the multi-population mechanism to obtain candidate neural network architectures, including: S1、对初始种群中的神经网络架构进行非支配排序和拥挤度计算,得到主种群和副种群,包括:S1. 
Perform non-dominated sorting and crowding calculation on the neural network architectures in the initial population to obtain the main population and the secondary population, including: 对初始种群中的神经网络架构进行非支配排序,将非支配排序中的第一层级作为主种群;Perform non-dominated sorting on the neural network architectures in the initial population, and use the first level in the non-dominated sorting as the main population; 除去主种群,对剩余神经网络架构进行拥挤度计算,选择拥挤度排名最高的个神经网络架构作为副种群;Excluding the main population, the crowding of the remaining neural network architectures is calculated, and the one with the highest crowding ranking is selected. A neural network architecture as a sub-population; S2、计算本轮演化的阈值,其表达式如下:S2. Calculate the threshold of this round of evolution. The expression is as follows: ; 其中,表示阈值,/>为随机产生的/>之间的数,/>表示演化次数,/>为总演化次数;in, Indicates the threshold value, /> Randomly generated /> The number between, /> Indicates the number of evolutions, /> is the total number of evolutions; S3、将阈值与预设的超参数进行比较,若阈值大于超参数,则将主种群、副种群共同作为父代个体,若阈值小于超参数,则将主种群作为父代个体;对父代个体进行交叉、变异操作生成一系列子代个体;S3. Compare the threshold with the preset hyperparameter. If the threshold is greater than the hyperparameter, the main population and the secondary population are taken as parent individuals. If the threshold is less than the hyperparameter, the main population is taken as the parent individual. Perform crossover and mutation operations on the parent individuals to generate a series of offspring individuals. S4、通过预训练的代理模型处理得到子代个体的分类精度排名,并计算子代个体的复杂度;根据子代个体的分类精度排名和复杂度对其进行非支配排序和拥挤度计算,保留个子代个体,包括:S4. 
performing non-dominated sorting with the fast non-dominated sorting algorithm according to the classification-accuracy ranking and complexity of the offspring individuals to obtain their non-dominated sorting result;

calculating the crowding distance of the offspring individuals in the first level of the non-dominated sorting result, and retaining the offspring individuals ranked highest by crowding distance;

S5. merging the retained offspring individuals with the initial population, and repeating S1 to S5 until the total number of evolutions is reached, obtaining the candidate neural network architectures;

selecting, from the candidate neural network architectures, a neural network architecture that balances the two optimization objectives of network complexity and classification accuracy.
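The non-dominated sorting and crowding-distance computations in S1 and S4 follow the standard fast non-dominated sorting procedure (as in NSGA-II). A minimal sketch for the two objectives used here — network complexity to be minimized and classification accuracy to be maximized — might look as follows; all function and variable names are illustrative, not taken from the patent:

```python
def dominates(a, b):
    # a, b are (complexity, accuracy) pairs: minimize complexity, maximize accuracy.
    return a[0] <= b[0] and a[1] >= b[1] and (a[0] < b[0] or a[1] > b[1])

def non_dominated_sort(pop):
    # Fast non-dominated sorting: returns a list of fronts (lists of indices).
    fronts = [[]]
    dominated = [[] for _ in pop]   # indices each individual dominates
    n_dominators = [0] * len(pop)   # how many individuals dominate each one
    for i, p in enumerate(pop):
        for j, q in enumerate(pop):
            if dominates(p, q):
                dominated[i].append(j)
            elif dominates(q, p):
                n_dominators[i] += 1
        if n_dominators[i] == 0:
            fronts[0].append(i)
    k = 0
    while fronts[k]:
        nxt = []
        for i in fronts[k]:
            for j in dominated[i]:
                n_dominators[j] -= 1
                if n_dominators[j] == 0:
                    nxt.append(j)
        k += 1
        fronts.append(nxt)
    return fronts[:-1]

def crowding_distance(pop, front):
    # Boundary individuals get infinite distance; interior ones accumulate
    # normalized neighbour gaps over both objectives.
    dist = {i: 0.0 for i in front}
    for m in range(2):
        order = sorted(front, key=lambda i: pop[i][m])
        dist[order[0]] = dist[order[-1]] = float("inf")
        span = pop[order[-1]][m] - pop[order[0]][m] or 1.0
        for a, b, c in zip(order, order[1:], order[2:]):
            dist[b] += (pop[c][m] - pop[a][m]) / span
    return dist
```

The first front returned by `non_dominated_sort` corresponds to the claim's "main population"; `crowding_distance` supplies the ranking used to pick the secondary population and to truncate offspring.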
2. The multi-objective evolutionary neural architecture search method based on a multi-population mechanism and an agent model according to claim 1, characterized in that encoding the neural network architectures in the search space comprises:

encoding the neural network architectures in the search space with a fixed-length integer string, the encoded content including the resolution, the number of module layers, the convolution kernel size, and the expansion rate;

when the encoding length of a neural network architecture is less than the fixed length, padding it with zeros to reach the fixed length.

3. The multi-objective evolutionary neural architecture search method based on a multi-population mechanism and an agent model according to claim 1, characterized in that initializing a population of neural network architectures in the search space according to network complexity to obtain the initial population comprises:

randomly sampling a number of neural network architectures from the search space;

calculating the complexity of each neural network architecture;

dividing all the sampled neural network architectures into several populations of equal size according to their complexity;

randomly sampling architectures from each of these populations and merging them to obtain the initial population.
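Claims 2 and 3 describe a fixed-length integer encoding with zero padding. The sketch below illustrates one way such an encoding could work; the specific kernel sizes, expansion rates, resolutions, and per-block layer bound are assumed values for illustration, not the patent's actual search-space settings:

```python
import random

# Illustrative search-space bounds (assumed, not from the patent):
KERNEL_SIZES = [3, 5, 7]
EXPAND_RATES = [3, 4, 6]
RESOLUTIONS = [192, 224, 256]
MAX_LAYERS_PER_BLOCK = 4   # each of the 5 MBConv blocks has 1..4 layers
NUM_BLOCKS = 5

def sample_architecture(rng):
    # Randomly sample one architecture from the illustrative search space.
    arch = {"resolution": rng.choice(RESOLUTIONS), "blocks": []}
    for _ in range(NUM_BLOCKS):
        layers = [(rng.choice(KERNEL_SIZES), rng.choice(EXPAND_RATES))
                  for _ in range(rng.randint(1, MAX_LAYERS_PER_BLOCK))]
        arch["blocks"].append(layers)
    return arch

def encode(arch):
    # Fixed-length integer string: resolution, then per block its layer count
    # followed by (kernel, expand) pairs, zero-padded up to the maximum length.
    code = [arch["resolution"]]
    for layers in arch["blocks"]:
        code.append(len(layers))
        for kernel, expand in layers:
            code.extend((kernel, expand))
        code.extend([0, 0] * (MAX_LAYERS_PER_BLOCK - len(layers)))  # zero padding
    return code

# Every encoding has the same length regardless of how many layers exist:
FIXED_LEN = 1 + NUM_BLOCKS * (1 + 2 * MAX_LAYERS_PER_BLOCK)
```

Because every block segment is padded to the same width, crossover and mutation can operate position-wise on the integer string without re-aligning variable-length architectures.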
4. The multi-objective evolutionary neural architecture search method based on a multi-population mechanism and an agent model according to claim 1, characterized in that, after initializing a population of neural network architectures in the search space according to network complexity to obtain the initial population for the evolutionary search, the method further comprises:

constructing an agent model;

training the agent model with the initial population, taking the neural network architectures in the initial population as input and the pairwise comparison results of the architectures as output, to obtain the pre-trained agent model;

wherein the training process of the agent model comprises:

training the neural network architectures in the initial population with the stochastic gradient descent algorithm to obtain their classification accuracy, each neural network architecture and its corresponding classification accuracy forming one sample, and all samples forming the original training sample set;

pairing the samples of the original training sample set in order, that is, pairing each sample with each of the remaining samples after it; if the classification accuracy of a sample is better than that of its paired sample, the pair is labeled 1, otherwise 0; a sample that collects more 1 labels is a good sample, and a sample that collects more 0 labels is a poor sample.
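Claim 4 trains the agent (surrogate) model on pairwise comparisons rather than raw accuracies. A sketch of how the pairwise training set and the good/poor labelling could be built, with illustrative names and toy data:

```python
def pairwise_dataset(samples):
    """samples: list of (encoding, accuracy) tuples.

    Returns ((enc_i, enc_j), label) rows for each ordered pair i < j, where
    label 1 means the first architecture's accuracy beats the second's.
    """
    rows = []
    for i in range(len(samples)):
        for j in range(i + 1, len(samples)):
            (enc_i, acc_i), (enc_j, acc_j) = samples[i], samples[j]
            rows.append(((enc_i, enc_j), 1 if acc_i > acc_j else 0))
    return rows

def rank_by_wins(samples):
    # An architecture that collects more '1' labels (pairwise wins) is the
    # better ("good") sample; sort indices by win count, best first.
    wins = [0] * len(samples)
    for i in range(len(samples)):
        for j in range(len(samples)):
            if i != j and samples[i][1] > samples[j][1]:
                wins[i] += 1
    return sorted(range(len(samples)), key=lambda i: -wins[i])
```

At search time the trained comparator replaces the accuracies here, so offspring can be ranked without the cost of fully training each candidate network.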
5. The multi-objective evolutionary neural architecture search method based on a multi-population mechanism and an agent model according to claim 1, characterized in that selecting, from the candidate neural network architectures, a neural network architecture that balances the two optimization objectives of network complexity and classification accuracy comprises:

performing non-dominated sorting on the candidate neural network architectures to obtain a non-dominated sorting result;

selecting multiple neural network architectures at the front of the non-dominated sorting result and training them with the stochastic gradient descent algorithm, to obtain a neural network architecture that balances the two optimization objectives.

6. A computer program product comprising a computer program, characterized in that, when the computer program is executed by a processor, the steps of the multi-objective evolutionary neural architecture search method based on a multi-population mechanism and an agent model according to any one of claims 1 to 5 are implemented.
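Steps S2/S3 gate which populations contribute parents via a threshold, and claim 5 extracts the final Pareto-optimal architectures. The sketch below is hypothetical: the patent's exact threshold expression is not reproduced in this extracted text, so a simple annealed random value stands in for it, and all names are illustrative:

```python
import random

def pareto_front(pop):
    # pop: list of (complexity, accuracy) pairs; minimize the first objective,
    # maximize the second. Keep members no other member dominates.
    def dominates(a, b):
        return a[0] <= b[0] and a[1] >= b[1] and (a[0] < b[0] or a[1] > b[1])
    return [cand for cand in pop
            if not any(dominates(other, cand)
                       for other in pop if other is not cand)]

def select_parents(main_pop, sub_pop, t, total_gens, hyper, rng):
    # Hypothetical threshold: the patent defines it via a random number, the
    # current evolution count t, and the total count; an annealed random value
    # stands in here. Early on the threshold tends to be larger, so the
    # secondary population joins the parents (exploration); later only the
    # main population is used (exploitation).
    threshold = rng.random() * (1 - t / total_gens)
    return main_pop + sub_pop if threshold > hyper else list(main_pop)
```

After the final generation, `pareto_front` over the candidate architectures gives the set that claim 5 retrains with stochastic gradient descent before picking the delivered network.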
CN202410418128.3A 2024-04-09 2024-04-09 Multi-objective evolutionary nerve architecture searching method based on multiple group mechanisms and agent models Active CN118014010B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410418128.3A CN118014010B (en) 2024-04-09 2024-04-09 Multi-objective evolutionary nerve architecture searching method based on multiple group mechanisms and agent models

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410418128.3A CN118014010B (en) 2024-04-09 2024-04-09 Multi-objective evolutionary nerve architecture searching method based on multiple group mechanisms and agent models

Publications (2)

Publication Number Publication Date
CN118014010A CN118014010A (en) 2024-05-10
CN118014010B true CN118014010B (en) 2024-06-18

Family

ID=90958028

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410418128.3A Active CN118014010B (en) 2024-04-09 2024-04-09 Multi-objective evolutionary nerve architecture searching method based on multiple group mechanisms and agent models

Country Status (1)

Country Link
CN (1) CN118014010B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118821905B (en) * 2024-09-18 2024-11-29 南京信息工程大学 Agent model-assisted evolutionary generative adversarial network architecture search method and system
CN118917389B (en) * 2024-10-11 2024-12-20 南京信息工程大学 A diffusion model architecture search method and system based on attention mechanism

Family Cites Families (14)

Publication number Priority date Publication date Assignee Title
US11544536B2 (en) * 2018-09-27 2023-01-03 Google Llc Hybrid neural architecture search
CN112561027B (en) * 2019-09-25 2025-02-07 华为技术有限公司 Neural network architecture search method, image processing method, device and storage medium
US11620487B2 (en) * 2019-12-31 2023-04-04 X Development Llc Neural architecture search based on synaptic connectivity graphs
CN111275172B (en) * 2020-01-21 2023-09-01 复旦大学 Feedforward neural network structure searching method based on search space optimization
GB2599137A (en) * 2020-09-25 2022-03-30 Samsung Electronics Co Ltd Method and apparatus for neural architecture search
CN112784949B (en) * 2021-01-28 2023-08-11 华东计算技术研究所(中国电子科技集团公司第三十二研究所) Neural network architecture searching method and system based on evolutionary computation
US20220035878A1 (en) * 2021-10-19 2022-02-03 Intel Corporation Framework for optimization of machine learning architectures
CN116151319A (en) * 2021-11-22 2023-05-23 华为技术有限公司 Method and device for searching neural network integration model and electronic equipment
CN114662593B (en) * 2022-03-24 2025-01-28 江南大学 An image classification method based on genetic algorithm and partitioned data set
CN115222046A (en) * 2022-07-22 2022-10-21 南京信息工程大学 Neural network structure search method, device, electronic device and storage medium
CN116108384A (en) * 2022-12-26 2023-05-12 南京信息工程大学 A neural network architecture search method, device, electronic equipment and storage medium
CN116258165A (en) * 2023-02-14 2023-06-13 河北工业大学 A multi-objective neural architecture search method integrating convolution and self-attention
CN116611504A (en) * 2023-04-25 2023-08-18 东北大学 Neural architecture searching method based on evolution
CN117611974B (en) * 2024-01-24 2024-04-16 湘潭大学 Image recognition method and system based on multi-population alternating evolution neural structure search

Non-Patent Citations (2)

Title
A Novel Approach to Detecting Muscle Fatigue Based on sEMG by Using Neural Architecture Search Framework; Wang, SR et al.; IEEE Transactions on Neural Networks and Learning Systems; 2021-12-25; Vol. 34, No. 8; 4932-494 *
Design of Smart Home Implementation Within IoT Natural Language Interface; Kin, TY et al.; IEEE ACCESS; 2020-07-30; No. 8; 84929-84949 *

Also Published As

Publication number Publication date
CN118014010A (en) 2024-05-10

Similar Documents

Publication Publication Date Title
Baymurzina et al. A review of neural architecture search
CN118014010B (en) Multi-objective evolutionary nerve architecture searching method based on multiple group mechanisms and agent models
CN108334949B (en) An image classifier construction method based on fast evolution of optimized deep convolutional neural network structure
CN112465120A (en) Fast attention neural network architecture searching method based on evolution method
CN110427965A (en) Convolutional neural networks structural reduction and image classification method based on evolution strategy
CN111476285B (en) A training method for an image classification model, an image classification method, and a storage medium
CN114373101A (en) Image classification method for neural network architecture search based on evolution strategy
CN114118369B (en) Image classification convolutional neural network design method based on group intelligent optimization
CN112084877B (en) Remote Sensing Image Recognition Method Based on NSGA-NET
CN113128432B (en) Machine vision multitask neural network architecture searching method based on evolution calculation
CN105956093A (en) Individual recommending method based on multi-view anchor graph Hash technology
WO2022126448A1 (en) Neural architecture search method and system based on evolutionary learning
CN117150026B (en) Text content multi-label classification method and device
CN113282747A (en) Text classification method based on automatic machine learning algorithm selection
CN114241267A (en) Structural entropy sampling-based multi-target architecture search osteoporosis image identification method
CN117253037A (en) Semantic segmentation model structure searching method, automatic semantic segmentation method and system
CN118821905B (en) Agent model-assisted evolutionary generative adversarial network architecture search method and system
CN113920514A (en) Target detection-oriented high-efficiency evolutionary neural network architecture searching method
CN116611504A (en) Neural architecture searching method based on evolution
CN116796797A (en) Network architecture search method, image classification method, device and electronic device
CN115795035A (en) Science and technology service resource classification method and system based on evolutionary neural network and computer readable storage medium thereof
CN113111308B (en) Symbolic regression method and system based on data-driven genetic programming algorithm
CN109299725A (en) A prediction system and device for parallel realization of high-order principal eigenvalue decomposition based on tensor chain
CN109918659B (en) Method for optimizing word vector based on unreserved optimal individual genetic algorithm
CN113704570A (en) Large-scale complex network community detection method based on self-supervision learning type evolution

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant