
CN114841287B - Training method of classification model, image classification method and device

Info

Publication number: CN114841287B (application CN202210579153.0A)
Authority: CN (China)
Prior art keywords: image, model, classification, image set, classification model
Legal status: Active (granted)
Other languages: Chinese (zh)
Other versions: CN114841287A (publication of the application)
Inventors: 刘彦宏 (Liu Yanhong), 蒋宁 (Jiang Ning), 吴海英 (Wu Haiying), 王洪斌 (Wang Hongbin)
Current assignee: Mashang Consumer Finance Co Ltd
Original assignee: Mashang Consumer Finance Co Ltd
Application filed by Mashang Consumer Finance Co Ltd
Priority: CN202210579153.0A

Classifications

    • G06F18/214 - Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F18/22 - Pattern recognition; matching criteria, e.g. proximity measures
    • G06F18/2415 - Pattern recognition; classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06N3/045 - Neural networks; combinations of networks
    • G06N3/08 - Neural networks; learning methods
    • Y02T10/40 - Engine management systems (climate change mitigation technologies related to transportation; road transport; internal combustion engine [ICE] based vehicles)

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Molecular Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a training method for a classification model, an image classification method and an image classification device, belonging to the technical field of deep learning. The training method comprises the following steps: performing gradient iteration processing on a first image set to obtain a first countermeasure (i.e., adversarial) image set; inputting the first image set into a first target classification model to obtain a first prediction classification result; inputting the first image set and the first countermeasure image set into a second target classification model and performing iterative model training, in which the model parameters of the second target classification model are adjusted based on the first prediction classification result, the classification labels of the first image set, a second prediction classification result, a third prediction classification result and a first preset loss function; and obtaining the trained second target classification model once its convergence condition is met. This technical scheme addresses the problem in the prior art that using countermeasure samples to improve robustness reduces the accuracy with which the model identifies original samples.

Description

Training method of classification model, image classification method and device
Technical Field
The application belongs to the technical field of deep learning, and particularly relates to a training method of a classification model, an image classification method and an image classification device.
Background
In recent years, deep neural network technology has achieved great success in vision, speech, natural language processing and other fields. However, neural network models are vulnerable to attack by countermeasure samples (adversarial examples): slightly perturbed versions of an initial input image that are not visually distinguishable from it, yet can significantly alter the behavior of the neural network model and render the original recognition result erroneous, indicating that the model is insufficiently robust.
In some examples, countermeasure samples are used as training samples to increase the robustness of the neural network, but this reduces the accuracy with which the model identifies the original samples.
Disclosure of Invention
The embodiment of the application provides a training method for a classification model, an image classification method and an image classification device, which can solve the problem in the prior art that, when countermeasure samples are used to improve robustness, the accuracy with which the model identifies original samples is reduced.
In a first aspect, a training method of a classification model is provided, including:
performing gradient iteration processing on the first image set to obtain a first countermeasure image set;
Inputting the first image set into a first target classification model to obtain a first prediction classification result;
Inputting the first image set and the first countermeasure image set into a second target classification model, performing model iterative training, and adjusting model parameters of the second target classification model based on the first prediction classification result, the classification label of the first image set, a second prediction classification result obtained by the second target classification model for the input first countermeasure image set, a third prediction classification result obtained by the second target classification model for the input first image set and a first preset loss function; and obtaining the second target classification model until the convergence condition of the second target classification model is met.
In a second aspect, there is provided an image classification method, comprising:
Acquiring an image to be classified;
Inputting the image to be classified into a classification model to obtain the image category of the image to be classified, wherein the classification model is a model obtained by training the training method of the classification model in the first aspect.
In a third aspect, a training device for a classification model is provided, including:
the first processing module is used for carrying out gradient iteration processing on the first image set to obtain a first countermeasure image set;
The second processing module is used for inputting the first image set into a first target classification model to obtain a first prediction classification result;
the training module is used for inputting the first image set and the first countermeasure image set into a second target classification model, performing model iterative training, and adjusting model parameters of the second target classification model based on the first prediction classification result, the classification label of the first image set, a second prediction classification result obtained by the second target classification model for the input first countermeasure image set, a third prediction classification result obtained by the second target classification model for the input first image set and a first preset loss function; and obtaining the second target classification model until the convergence condition of the second target classification model is met.
In a fourth aspect, there is provided an image classification apparatus comprising:
The acquisition module is used for acquiring the images to be classified;
The classification module is used for inputting the image to be classified into a classification model to obtain the image category of the image to be classified, wherein the classification model is a model obtained by training the training method of the classification model in the first aspect.
In a fifth aspect, there is provided an electronic device comprising a processor, a memory and a program or instruction stored on the memory and executable on the processor, the program or instruction when executed by the processor implementing the steps of the method as described in the first aspect or implementing the steps of the method as described in the second aspect.
In a sixth aspect, there is provided a readable storage medium having stored thereon a program or instructions which when executed by a processor, performs the steps of the method according to the first aspect or performs the steps of the method according to the second aspect.
In a seventh aspect, an embodiment of the present application provides a chip, the chip including a processor and a communication interface, the communication interface being coupled to the processor, the processor being configured to execute a program or instructions to implement the steps of the method according to the first aspect, or to implement the steps of the method according to the second aspect.
In the embodiment of the application, gradient iterative processing is first performed on a first image set to obtain a first countermeasure image set. The first image set is then input into a first target classification model to obtain a first prediction classification result. Next, the first image set and the first countermeasure image set are input into a second target classification model for iterative model training, and the model parameters of the second target classification model are adjusted, based on the first prediction classification result, the classification labels of the first image set, a second prediction classification result obtained by the second target classification model for the input first countermeasure image set, a third prediction classification result obtained by the second target classification model for the input first image set, and a first preset loss function, until the convergence condition of the second target classification model is met, yielding the trained second target classification model. That is, the first target classification model takes the first image set as input to obtain the first prediction classification result; the second target classification model takes the first countermeasure image set and the first image set as input to obtain the second and third prediction classification results; and the model parameters of the second target classification model are then adjusted according to the first prediction classification result, the classification labels of the first image set, the second prediction classification result, the third prediction classification result and the first preset loss function. Because the first target classification model and the second target classification model are trained cooperatively, a reliably robust model can be obtained. At the same time, because the first image set is also used as input to the second target classification model during training, the probability output distributions of the second target classification model for the first image set and the first countermeasure image set become similar, which effectively improves the generalization of the second target classification model and its recognition accuracy on the first image set.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute a limitation on the application. In the drawings:
FIG. 1 is a schematic view of an application environment of the training method of a classification model, the image classification method and the devices provided by an embodiment of the present application;
FIG. 2 is a flow chart of a training method for a classification model according to an embodiment of the application;
FIG. 3 is a flow chart of a method of image classification provided by one embodiment of the application;
FIG. 4 is a schematic diagram of a training apparatus for classification models according to an embodiment of the present application;
FIG. 5 is a schematic diagram of an image classification apparatus according to an embodiment of the present application;
Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions of the embodiments of the present application will be clearly described below with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which are obtained by a person skilled in the art based on the embodiments of the present application, fall within the scope of protection of the present application.
The terms "first", "second" and the like in the description and in the claims are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged, as appropriate, such that embodiments of the present application may be implemented in sequences other than those illustrated or described herein. The objects identified by "first", "second", etc. are generally of one type, and the number of objects is not limited; for example, the first object may be one or more. Furthermore, in the description and claims, "and/or" denotes at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the associated objects.
The training method, the image classification method and the device for the classification model provided by the embodiment of the application are described in detail through specific embodiments and application scenes thereof by combining the attached drawings.
The current training process of an image classification model is generally as follows: the original samples are used as the input of a conventional target image classification model, while countermeasure samples are used as the input of a classification model intended to recognize images robustly. This training process improves the robustness of the model, but because countermeasure samples are used as input, the recognition accuracy of the model on the original samples is reduced. To solve this problem, the application provides a training method for a classification model, an image classification method and a device, in which the first image set is used as the input of a first target classification model and both the first image set and the first countermeasure image set are used as the input of a second target classification model, so that the generalization capability of the second target classification model can be effectively optimized while the recognition accuracy of the second target classification model on original samples is improved.
The training method, the image classification method and the device for the classification model can be applied to any scene needing image classification. For example, classifying traffic identification, park or farm classifying insects, etc. Each traffic identification category or each insect category may be defined with a category label.
In order to better understand the training method, the image classification method and the device of the classification model provided by the embodiment of the application, an application environment suitable for the embodiment of the application is described below.
Referring to fig. 1, fig. 1 is a schematic diagram illustrating an application environment of the training method of a classification model, the image classification method and the devices provided by an embodiment of the present application. These methods and devices can be applied to an image recognition system. The image recognition system 100 may be composed of, for example, the terminal device 110 and the server 120 in fig. 1, where a network serves as the medium providing a communication link between the terminal device 110 and the server 120. The network may include various connection types, such as wired communication links, wireless communication links, and the like; embodiments of the application are not limited in this regard.
It should be understood that the terminal device 110, server 120, and network in fig. 1 are merely illustrative. There may be any number of terminal devices, servers, and networks, as desired for implementation.
In some embodiments, the terminal device 110 may send images to be identified to the server 120 through the network, and after the server 120 receives the images, it may perform image identification on them according to the image classification method of the embodiments of the present application. Alternatively, the terminal device 110 may receive images sent by other devices and forward them to the server 120 as images to be identified, with image identification then performed on the server 120.
The server 120 may be a physical server, a server cluster formed by a plurality of servers, or the like, and the terminal device 110 may be a mobile phone, a tablet computer, a desktop computer, a notebook computer, a wearable device, a smart speaker, or the like. It will be appreciated that embodiments of the present application may also allow multiple terminal devices 110 to access the server 120 simultaneously.
In some embodiments, the server 120 may pre-process an image before it is identified, including binarization, cropping, and so on. Before image identification is performed, classification model training needs to be carried out on the server 120 according to the classification model training method provided by this scheme, so as to obtain the corresponding image classification model; the image classification model is then used to perform image recognition (image classification) on the images to be identified sent by the terminal. It should be noted that training of any model may be performed on the server 120; in this embodiment, the second target classification model is obtained by training on the server 120.
The above application environments are merely examples for facilitating understanding, and it is to be understood that embodiments of the present application are not limited to the above application environments.
The training method, the image classification method and the device for the classification model provided by the embodiment of the application are described in detail below through specific embodiments.
As shown in fig. 2, an embodiment of the present application provides a training method of a classification model, which may include the contents shown in S201 to S203.
In S201, a gradient iteration process is performed on the first image set, resulting in a first countermeasure image set.
The first image set X is an initial sample for training a classification model, and may include a plurality of first images, where any one of the first images in the first image set is labeled with a real class, and the types of the first images may be various, for example, traffic sign images, insect images, flower images, face images, animal images, and the like.
The gradient iteration processing may obtain a first countermeasure image set X' from the first image set through a projected gradient descent (PGD) attack. The first countermeasure image set comprises a plurality of first countermeasure images, where a first countermeasure image is a first image to which a countermeasure (adversarial) perturbation has been added.
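As a concrete illustration of such a PGD attack, the following is a minimal PyTorch-style sketch; the step size, perturbation budget and iteration count are illustrative assumptions, not values fixed by the application:

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, step=2/255, iters=10):
    # Iteratively ascend the classification loss, then project the
    # perturbation back into an L-infinity ball of radius eps.
    x_adv = x.clone().detach()
    for _ in range(iters):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + step * grad.sign()
            x_adv = x + (x_adv - x).clamp(-eps, eps)  # projection step
            x_adv = x_adv.clamp(0.0, 1.0)             # valid pixel range
    return x_adv.detach()
```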
In S202, a first image set is input into a first target classification model, resulting in a first prediction classification result.
In this embodiment, the first object classification model M may be a conventional object classification model, which is used for performing image recognition on an original image where no disturbance occurs. The first object classification model may be a traffic sign classification model for identifying traffic signs, or may be a flower classification model for identifying types of flowers, etc. The first image set is input into the first target classification model, and a first prediction classification result M (X) can be obtained, wherein M (X) is the probability that the first target classification model classifies X into a predefined class in the corresponding model classification.
In S203, inputting the first image set and the first countermeasure image set into a second target classification model, performing model iterative training, and adjusting model parameters of the second target classification model based on the first prediction classification result, the classification label of the first image set, the second prediction classification result obtained by the second target classification model for the input first countermeasure image set, the third prediction classification result obtained by the second target classification model for the input first image set, and the first preset loss function; and obtaining the second target classification model until the convergence condition of the second target classification model is met.
The second target classification model R may be a classification model that performs robust recognition on countermeasure images, that is, a model that can better classify images in which a disturbance has occurred. Inputting the first countermeasure image set into the second target classification model yields a second prediction classification result R(X'), and inputting the first image set into the second target classification model yields a third prediction classification result R(X). Here, R(X') is the probability that the second target classification model classifies the first countermeasure image set X' into the predefined categories of the corresponding model classification, and R(X) is the probability that it classifies the first image set X into those categories. The second target classification model adjusts its model parameters during model training; at the initial stage of training, the model parameters are preset values, which may be set according to actual conditions or the experience of technicians, or in other ways.
Further, in this embodiment, the model parameters of the second target classification model may be adjusted by calculating the loss between the obtained first prediction classification result and the classification label of the first image set, the loss between the second prediction classification result and the first prediction classification result, and the loss between the third prediction classification result and the second prediction classification result by using the first preset loss function until the convergence condition of the second target classification model is satisfied, and finally the second target classification model is obtained.
The convergence condition of the second target classification model may be that the number of model parameter adjustments of the second target classification model reaches a preset number, at which point the iterative training of the model may end; or that the number of training rounds of the second target classification model reaches a preset number of training rounds; or that the variation of the model parameters of the second target classification model is smaller than a preset variation; or another convergence condition may be used.
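A minimal sketch of such a convergence check over PyTorch parameter tensors is given below; the names `max_updates` and `tol` are illustrative assumptions, not values fixed by the application:

```python
def converged(theta_prev, theta, num_updates, max_updates=10_000, tol=1e-6):
    # Stop when a preset number of updates has been reached, or when the
    # largest parameter change falls below a preset variation threshold.
    delta = max((p - q).abs().max().item()
                for p, q in zip(theta_prev, theta))
    return num_updates >= max_updates or delta < tol
```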
It is worth noting that the loss between the first prediction classification result and the classification labels of the first image set may be a cross entropy loss. The loss between the second prediction classification result, obtained by the second target classification model for the input first countermeasure image set, and the first prediction classification result is used to constrain the output of the second target classification model on the first countermeasure image set to follow the classification boundary of the first target classification model on the first image set; it can be represented by the infinity-norm distance or another norm distance, for example the Euclidean distance between vectors, as determined by the practical application. The loss between the third prediction classification result, obtained by the second target classification model for the input first image set, and the second prediction classification result is used to constrain the probability output distributions of the second target classification model on the first image set and the first countermeasure image set to be similar; it can be represented by the KL (Kullback-Leibler) divergence or the Wasserstein distance, as determined by the practical application.
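For concreteness, the two distances mentioned above can be written in their standard forms (standard definitions, not notation fixed by the application), for probability output vectors u, v and distributions p, q:

$$\|u - v\|_{\infty} = \max_{j} |u_j - v_j|, \qquad D_{\mathrm{KL}}(p \,\|\, q) = \sum_{j} p_j \log \frac{p_j}{q_j}$$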
In the embodiment of the application, gradient iterative processing is first performed on a first image set to obtain a first countermeasure image set. The first image set is then input into a first target classification model to obtain a first prediction classification result. Next, the first image set and the first countermeasure image set are input into a second target classification model for iterative model training, and the model parameters of the second target classification model are adjusted, based on the first prediction classification result, the classification labels of the first image set, a second prediction classification result obtained by the second target classification model for the input first countermeasure image set, a third prediction classification result obtained by the second target classification model for the input first image set, and a first preset loss function, until the convergence condition of the second target classification model is met, yielding the trained second target classification model. That is, the first target classification model takes the first image set as input to obtain the first prediction classification result; the second target classification model takes the first countermeasure image set and the first image set as input to obtain the second and third prediction classification results; and the model parameters of the second target classification model are then adjusted according to the first prediction classification result, the classification labels of the first image set, the second prediction classification result, the third prediction classification result and the first preset loss function. Because the first target classification model and the second target classification model are trained cooperatively, a reliably robust model can be obtained. At the same time, because the first image set is also used as input to the second target classification model during training, the probability output distributions of the second target classification model for the first image set and the first countermeasure image set become similar, which effectively improves the generalization of the second target classification model and its recognition accuracy on the first image set.
In one possible embodiment of the present application, performing gradient iterative processing on the first image set to obtain a first contrast image set may include: and carrying out gradient iteration processing on the first image set based on a second preset loss function to obtain a first countermeasure image set. The second preset loss function comprises a classification loss function and an image semantic loss function.
Further, gradient iteration processing is carried out on the first image set by utilizing the second target classification model and a second preset loss function, so that a first countermeasure image set is obtained.
In this embodiment, the classification loss function makes the difference between the obtained first countermeasure image set and the first image set as large as possible from the model's perspective, while the added semantic constraint keeps the first countermeasure images unrecognizable as perturbed to the human eye: by minimizing the image semantic loss function, the first countermeasure image set is constrained to remain semantically similar to the first image set, so that to the human eye the two look almost identical and the disturbance is not easily perceived.
In one possible embodiment of the present application, performing gradient iterative processing on the first image set based on the second preset loss function, to obtain a first contrast image set includes:
The method for generating the first intermediate countermeasure image comprises the following specific steps:
Inputting the first image into the second target classification model for gradient processing to obtain a first intermediate countermeasure image;
Calculating the minimum similarity loss between the first image and the first intermediate countermeasure image based on the image semantic loss function;
Calculating the maximum cross entropy loss between the classification label of the first image and a fourth prediction classification result, obtained by inputting the first intermediate countermeasure image into the second target classification model, based on the classification loss function;
Adjusting the first intermediate countermeasure image according to the minimum similarity loss and the maximum cross entropy loss, and taking the adjusted first intermediate countermeasure image as the first image;
Repeatedly executing the step of generating the first intermediate countermeasure image until the gradient iteration processing condition is met, and taking the adjusted first intermediate countermeasure image as the first countermeasure image.
In the embodiment of the application, the second target classification model is used to perform iterative gradient processing on the first image, and the resulting first intermediate countermeasure image is adjusted through the classification loss function and the image semantic loss function. Once the gradient iteration processing condition is met, the last obtained first intermediate countermeasure image is taken as the first countermeasure image. The gradient iteration processing condition may be that the number of gradient processing passes reaches a second preset number, that the second preset loss function falls within a second preset threshold range, or the like.
According to the embodiment of the application, minimizing the similarity loss minimizes the semantic difference between the first countermeasure image set and the first image set, so that to the human eye the two look nearly the same and the disturbance is not easily perceived; maximizing the cross entropy loss enlarges the difference between the first countermeasure image set and the first image set as recognized by the model, which facilitates subsequent model training.
Optionally, performing gradient iterative processing on the first image set based on the second preset loss function to obtain the first countermeasure image set may include: computing the first countermeasure image set X' by iterating the following gradient-ascent update:

$$X' \leftarrow X' + \gamma \cdot \nabla_{X'} L'$$

$$L' = \text{CE-Loss}(R(X'), Y) - \text{SSIM-Loss}(X, X')$$

where $X'$ is the first countermeasure image set, $X' = \{x'_1, x'_2, \ldots, x'_m\}$, $m$ is the number of first images in the first image set, and initially $X' = X$; $X$ is the first image set, $X = \{x_1, x_2, \ldots, x_m\}$; $\gamma$ is a first scalar parameter in the range $[0, 1]$; $L'$ is the second preset loss function; $\nabla_{X'} L'$ is the gradient of the second preset loss function with respect to the first countermeasure image set; $\text{CE-Loss}(R(X'), Y)$ is the classification loss function; $\text{SSIM-Loss}(X, X')$ is the image semantic loss function; $R$ is the second target classification model; $Y$ is the set of classification labels of the first image set, $Y = \{y_1, y_2, \ldots, y_m\}$.
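A minimal PyTorch-style sketch of one such update step is given below. It assumes an `ssim` similarity function is available (for example, from the pytorch-msssim package) and that SSIM-Loss denotes the dissimilarity 1 − SSIM; the step size `gamma` is an illustrative assumption, and PGD implementations often use the signed gradient in place of the raw gradient used here:

```python
import torch
import torch.nn.functional as F
from pytorch_msssim import ssim  # assumed SSIM implementation

def semantic_attack_step(R, x, x_adv, y, gamma=0.01):
    # One gradient-ascent step on L' = CE-Loss(R(x'), y) - SSIM-Loss(x, x'):
    # maximize the classification loss while keeping x' semantically
    # similar to x, so the perturbation stays hard to perceive.
    x_adv = x_adv.clone().detach().requires_grad_(True)
    ce = F.cross_entropy(R(x_adv), y)                  # CE-Loss term
    ssim_loss = 1.0 - ssim(x, x_adv, data_range=1.0)   # SSIM-Loss term
    loss = ce - ssim_loss                              # L' from the formula above
    grad = torch.autograd.grad(loss, x_adv)[0]
    return (x_adv + gamma * grad).clamp(0.0, 1.0).detach()
```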
In one possible embodiment of the present application, adjusting the model parameters of the second target classification model based on the first prediction classification result, the classification labels of the first image set, the second prediction classification result obtained by the second target classification model for the input first countermeasure image set, the third prediction classification result obtained by the second target classification model for the input first image set, and the first preset loss function may include: using the first preset loss function to calculate the loss between the first prediction classification result and the classification labels of the first image set, the loss between the second prediction classification result (obtained by the second target classification model for the input first countermeasure image set) and the first prediction classification result, and the loss between the third prediction classification result (obtained by the second target classification model for the input first image set) and the second prediction classification result, and adjusting the model parameters of the second target classification model accordingly.
In this embodiment, the model parameters of the second target classification model may be adjusted based on the obtained loss between the first prediction classification result and the classification label of the first image set, the obtained loss between the second prediction classification result and the first prediction classification result, the obtained loss between the third prediction classification result and the second prediction classification result, and the first preset loss function until the convergence condition of the second target classification model is satisfied, and finally the second target classification model is obtained.
The convergence condition of the second target classification model may be that the number of times of model parameter adjustment of the second target classification model reaches a preset number of times, and at this time, model iterative training may be ended, or the variation of model parameters of the second classification model is smaller than a preset variation, or other convergence conditions may be used.
It should be noted that the loss between the first prediction classification result and the classification labels of the first image set may be a cross entropy loss. The loss between the second prediction classification result, obtained by the second target classification model for the input first countermeasure image set, and the first prediction classification result is used to constrain the output of the second target classification model on the first countermeasure image set to follow the classification boundary of the first target classification model on the first image set; it can be represented by the infinity-norm distance or another norm distance, for example the Euclidean distance between vectors, as determined by the practical application. The loss between the third prediction classification result, obtained by the second target classification model for the input first image set, and the second prediction classification result is used to constrain the probability output distributions of the second target classification model on the first image set and the first countermeasure image set to be similar; it can be represented by the KL (Kullback-Leibler) divergence or the Wasserstein distance, as determined by the practical application.
In the embodiment of the application, the first preset loss function comprises the loss between the first prediction classification result and the classification labels of the first image set, the loss between the second prediction classification result (obtained by the second target classification model for the input first countermeasure image set) and the first prediction classification result, and the loss between the third prediction classification result (obtained by the second target classification model for the input first image set) and the second prediction classification result. The model parameters of the second target classification model are adjusted according to the loss value of the first preset loss function, so that the second target classification model is progressively optimized and the loss value converges. By cooperatively training the first target classification model and the second target classification model in this way, a reliably robust model (the second target classification model) can be obtained; at the same time, because the first image set is used as input to the second target classification model during training, the generalization of the second target classification model can be effectively optimized and its recognition accuracy on images improved.
Optionally, adjusting the model parameters of the second target classification model may include: adjusting the model parameters θ of the second target classification model by the following update:

$$\theta \leftarrow \theta - \alpha \cdot \frac{1}{m} \sum_{i=1}^{m} \nabla_{\theta} L(x'_i, x_i, y_i)$$

$$L(x'_i, x_i, y_i) = \text{CE-Loss}(M(x_i), y_i) + L_{\infty}(M(x_i), R(x'_i)) + D_{\mathrm{KL}}(R(x'_i), R(x_i))$$

where $\theta$ is the model parameter of the second target classification model; $\alpha$ is a second scalar parameter in the range $[0, 1]$; $m$ is the number of first images in the first image set; $\nabla_{\theta} L$ is the gradient of the first preset loss function with respect to the model parameters of the second target classification model; $L(x'_i, x_i, y_i)$ is the first preset loss function, $x'_i$ is the $i$-th first countermeasure image, $x_i$ is the $i$-th first image, and $y_i$ is the classification label of the $i$-th first image; $\text{CE-Loss}(M(x_i), y_i)$ is the loss between the first prediction classification result and the classification label of the first image set; $L_{\infty}(M(x_i), R(x'_i))$ is the loss between the second prediction classification result, obtained by the second target classification model for the input first countermeasure image set, and the first prediction classification result; $D_{\mathrm{KL}}(R(x'_i), R(x_i))$ is the loss between the third prediction classification result, obtained by the second target classification model for the input first image set, and the second prediction classification result; $M$ is the first target classification model; $R$ is the second target classification model.
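The following PyTorch-style sketch shows one such parameter update for R under the loss above; the optimizer, the softmax probability outputs and the batch-mean reductions are assumptions not fixed by the application, and the CE-Loss(M(x), y) term contributes no gradient to θ when M is frozen:

```python
import torch
import torch.nn.functional as F

def cotrain_step(M, R, opt_R, x, x_adv, y):
    # One update of the robust model R under
    # L = CE-Loss(M(x), y) + Linf(M(x), R(x')) + KL(R(x') || R(x)).
    p_m = F.softmax(M(x), dim=1)          # first prediction result
    p_adv = F.softmax(R(x_adv), dim=1)    # second prediction result
    p_clean = F.softmax(R(x), dim=1)      # third prediction result

    ce = F.cross_entropy(M(x), y)                            # CE-Loss(M(x), y)
    linf = (p_m.detach() - p_adv).abs().amax(dim=1).mean()   # Linf distance
    # F.kl_div(log q, p) computes D_KL(p || q); here D_KL(R(x') || R(x)).
    kl = F.kl_div(p_clean.log(), p_adv, reduction="batchmean")

    loss = ce + linf + kl
    opt_R.zero_grad()
    loss.backward()
    opt_R.step()
    return float(loss)
```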
As shown in fig. 3, the embodiment of the present application further provides an image classification method, in which classification is performed by a model obtained through the training method of the classification model provided in the above embodiment. The image classification method may include the contents shown in S301 to S302.
In S301, an image to be classified is acquired. The image to be classified may be, for example, a traffic sign or an insect in a park or farm; the present application is not limited in this respect.
In S302, an image to be classified is input into the classification model, and an image class of the image to be classified is obtained. The classification model is a model obtained by training according to the training method of the classification model provided by the embodiment.
The classification model is a model obtained by training according to the training method of the classification model provided in fig. 2, and the specific training process is referred to above and will not be described herein.
In the embodiment of the application, an image to be classified is first acquired and then input into the classification model to obtain its image category. Because the classification model is obtained by the training method provided in the above embodiment, it has strong generalization capability and high recognition accuracy on the image to be classified, so the final classification of the image category is more accurate.
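A minimal sketch of this inference step is given below (PyTorch-style; the preprocessing and the `class_names` list are illustrative assumptions):

```python
import torch

def classify(model, image_tensor, class_names):
    # Classify a single preprocessed image with the trained model.
    model.eval()
    with torch.no_grad():
        probs = torch.softmax(model(image_tensor.unsqueeze(0)), dim=1)
    idx = int(probs.argmax(dim=1))
    return class_names[idx], float(probs[0, idx])
```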
As shown in fig. 4, the embodiment of the present application further provides a training device for a classification model, where the training device for a classification model may include: a first processing module 401, a second processing module 402, and a training module 403.
The first processing module 401 is configured to perform gradient iterative processing on the first image set to obtain a first countermeasure image set; a second processing module 402, configured to input a first image set into a first target classification model to obtain a first prediction classification result; the training module 403 is configured to input the first image set and the first countermeasure image set into a second target classification model, perform model iterative training, and adjust model parameters of the second target classification model based on the first prediction classification result, the classification label of the first image set, a second prediction classification result obtained by the second target classification model for the input first countermeasure image set, a third prediction classification result obtained by the second target classification model for the input first image set, and a first preset loss function; and obtaining the second target classification model until the convergence condition of the second target classification model is met.
In the embodiment of the present application, the first processing module 401 first performs gradient iterative processing on the first image set to obtain a first countermeasure image set; the second processing module 402 then inputs the first image set into the first target classification model to obtain a first prediction classification result; finally, the training module 403 inputs the first image set and the first countermeasure image set into the second target classification model for iterative model training, and adjusts the model parameters of the second target classification model, based on the first prediction classification result, the classification labels of the first image set, the second prediction classification result obtained by the second target classification model for the input first countermeasure image set, the third prediction classification result obtained by the second target classification model for the input first image set, and the first preset loss function, until the convergence condition of the second target classification model is met, yielding the trained second target classification model. That is, the first target classification model takes the first image set as input to obtain the first prediction classification result; the second target classification model takes the first countermeasure image set and the first image set as input to obtain the second and third prediction classification results; and the model parameters of the second target classification model are then adjusted according to the first prediction classification result, the classification labels of the first image set, the second prediction classification result, the third prediction classification result and the first preset loss function. Because the first target classification model and the second target classification model are trained cooperatively, a reliably robust model can be obtained; and because the first image set is also used as input to the second target classification model during training, the probability output distributions of the second target classification model for the first image set and the first countermeasure image set become similar, which effectively improves the generalization of the second target classification model and its recognition accuracy on the first image set.
Optionally, the first processing module 401 is configured to: and carrying out gradient iteration processing on the first image set based on a second preset loss function to obtain a first countermeasure image set, wherein the second preset loss function comprises a classification loss function and an image semantic loss function.
Optionally, the first processing module 401 is specifically configured to generate the first intermediate countermeasure image through the following specific steps: inputting the first image into the second target classification model for gradient processing to obtain a first intermediate countermeasure image; calculating the minimum similarity loss between the first image and the first intermediate countermeasure image based on the image semantic loss function; calculating the maximum cross entropy loss between the classification label of the first image and a fourth prediction classification result, obtained by inputting the first intermediate countermeasure image into the second target classification model, based on the classification loss function; adjusting the first intermediate countermeasure image according to the minimum similarity loss and the maximum cross entropy loss, and taking the adjusted first intermediate countermeasure image as the first image; and repeatedly executing the step of generating the first intermediate countermeasure image until the gradient iteration processing condition is met, taking the adjusted first intermediate countermeasure image as the first countermeasure image.
Optionally, the first processing module 401 is specifically configured to compute the first countermeasure image set X' by iterating the following gradient-ascent update:

$$X' \leftarrow X' + \gamma \cdot \nabla_{X'} L'$$

$$L' = \text{CE-Loss}(R(X'), Y) - \text{SSIM-Loss}(X, X')$$

where $X'$ is the first countermeasure image set, $X' = \{x'_1, x'_2, \ldots, x'_m\}$, $m$ is the number of first images in the first image set, and initially $X' = X$; $X$ is the first image set, $X = \{x_1, x_2, \ldots, x_m\}$; $\gamma$ is a first scalar parameter; $L'$ is the second preset loss function; $\nabla_{X'} L'$ is the gradient of the second preset loss function with respect to the first countermeasure image set; $\text{CE-Loss}(R(X'), Y)$ is the classification loss function; $\text{SSIM-Loss}(X, X')$ is the image semantic loss function; $R$ is the second target classification model; $Y$ is the set of classification labels of the first image set, $Y = \{y_1, y_2, \ldots, y_m\}$.
Optionally, the training module 403 is configured to: calculating the loss between the first prediction classification result and the classification label of the first image set by using a first preset loss function, the loss between the second prediction classification result and the first prediction classification result obtained by the second target classification model for the input first countermeasure image set, and the loss between the third prediction classification result and the second prediction classification result obtained by the second target classification model for the input first image set, and adjusting the model parameters of the second target classification model.
Optionally, the training module 403 is specifically configured to adjust the model parameters θ of the second target classification model by the following update:

$$\theta \leftarrow \theta - \alpha \cdot \frac{1}{m} \sum_{i=1}^{m} \nabla_{\theta} L(x'_i, x_i, y_i)$$

$$L(x'_i, x_i, y_i) = \text{CE-Loss}(M(x_i), y_i) + L_{\infty}(M(x_i), R(x'_i)) + D_{\mathrm{KL}}(R(x'_i), R(x_i))$$

where $\theta$ is the model parameter of the second target classification model; $\alpha$ is a second scalar parameter; $m$ is the number of first images in the first image set; $\nabla_{\theta} L$ is the gradient of the first preset loss function with respect to the model parameters of the second target classification model; $L(x'_i, x_i, y_i)$ is the first preset loss function, $x'_i$ is the $i$-th first countermeasure image, $x_i$ is the $i$-th first image, and $y_i$ is the classification label of the $i$-th first image; $\text{CE-Loss}(M(x_i), y_i)$ is the loss between the first prediction classification result and the classification label of the first image set; $L_{\infty}(M(x_i), R(x'_i))$ is the loss between the second prediction classification result, obtained by the second target classification model for the input first countermeasure image set, and the first prediction classification result; $D_{\mathrm{KL}}(R(x'_i), R(x_i))$ is the loss between the third prediction classification result, obtained by the second target classification model for the input first image set, and the second prediction classification result; $M$ is the first target classification model; $R$ is the second target classification model.
The training device of the classification model in the embodiment of the application may be a device, or may be a component, an integrated circuit or a chip in a server.
The training device of the classification model provided by the embodiment of the present application can implement each process implemented by the method embodiment of fig. 2; to avoid repetition, the description is not repeated here.
As shown in fig. 5, an embodiment of the present application further provides an image classification apparatus, which may include: an acquisition module 501 and a classification module 502.
The acquiring module 501 is configured to acquire an image to be classified; the classification module 502 is configured to input an image to be classified into a classification model to obtain an image class of the image to be classified, where the classification model is a model obtained by training the training method of the classification model provided in the above embodiment.
The classification model is a model obtained by training the training method of the classification model provided in fig. 2, and the specific training process is referred to above and will not be described herein.
In the embodiment of the present application, the obtaining module 501 first obtains an image to be classified, and the classifying module 502 then inputs the image to be classified into the classification model to obtain its image category. Because the classification model is obtained by the training method provided in the above embodiment, it has strong generalization capability and high recognition accuracy on the image to be classified, so the final classification of the image category is more accurate.
The image classification device in the embodiment of the application can be a device, and can also be a component, an integrated circuit or a chip in electronic equipment.
The image classification device provided by the embodiment of the present application can implement each process implemented by the method embodiment of fig. 3, and in order to avoid repetition, a description thereof will not be repeated here.
Optionally, as shown in fig. 6, an embodiment of the present application further provides an electronic device 600, including a processor 601, a memory 602, and a program or an instruction stored in the memory 602 and capable of running on the processor 601, where the program or the instruction implements each process of the embodiment of the training method of the classification model or implements each process of the embodiment of the image classification method when executed by the processor 601, and the process can achieve the same technical effect, and is not repeated herein.
The embodiment of the application also provides a readable storage medium, on which a program or an instruction is stored, which when executed by a processor, implements the respective procedures of the embodiment of the training method of the classification model provided in any of the above embodiments, or implements the respective procedures of the embodiment of the image classification method. And the same technical effects can be achieved, and in order to avoid repetition, the description is omitted here.
Wherein the processor is a processor in the electronic device described in the above embodiment. The readable storage medium includes a computer readable storage medium such as a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a magnetic disk or an optical disk, and the like.
The embodiment of the application further provides a chip, the chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is used for running a program or instructions, implementing each process of the embodiment of the training method of the classification model provided in the above embodiment, or implementing each process of the embodiment of the image classification method, and achieving the same technical effect, so that repetition is avoided and no redundant description is provided herein.
It should be understood that the chips referred to in the embodiments of the present application may also be referred to as system-on-chip chips, chip systems, or system-on-chip chips, etc.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element. Furthermore, it should be noted that the scope of the methods and apparatus in the embodiments of the present application is not limited to performing the functions in the order shown or discussed, but may also include performing the functions in a substantially simultaneous manner or in an opposite order depending on the functions involved, e.g., the described methods may be performed in an order different from that described, and various steps may be added, omitted, or combined. Additionally, features described with reference to certain examples may be combined in other examples.
From the above description of the embodiments, it will be clear to those skilled in the art that the methods of the above embodiments may be implemented by means of software plus a necessary general-purpose hardware platform, or by means of hardware, though in many cases the former is preferred. Based on such an understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium (e.g., ROM/RAM, magnetic disk, or optical disk) and comprising instructions for causing a terminal (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods according to the embodiments of the present application.
The embodiments of the present application have been described above with reference to the accompanying drawings, but the present application is not limited to the above-described embodiments, which are merely illustrative rather than restrictive. Enlightened by the present application, those of ordinary skill in the art may devise many other forms without departing from the spirit of the present application and the scope of the claims, all of which fall within the protection of the present application.

Claims (9)

1. A method of training a classification model, comprising:
performing gradient iteration processing on a first image set to obtain a first adversarial image set;
inputting the first image set into a first target classification model to obtain a first prediction classification result;
inputting the first image set and the first adversarial image set into a second target classification model and performing model iterative training: calculating, with a first preset loss function, the loss between the first prediction classification result and the classification labels of the first image set, the loss between the second prediction classification result, obtained by the second target classification model for the input first adversarial image set, and the first prediction classification result, and the loss between the third prediction classification result, obtained by the second target classification model for the input first image set, and the second prediction classification result, and adjusting the model parameters of the second target classification model accordingly;
repeating the training until the convergence condition of the second target classification model is met, thereby obtaining the trained second target classification model.
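By way of illustration only, one training iteration described in claim 1 can be sketched as follows. This is a minimal sketch assuming PyTorch; using KL divergence for the two prediction-consistency terms, freezing the first target classification model M, and all identifiers are assumptions of the sketch, not limitations of the claim.

```python
# A minimal sketch of one training iteration of claim 1, assuming PyTorch.
# Using KL divergence for the two prediction-consistency terms and freezing
# the first target classification model M are assumptions of this sketch.
import torch
import torch.nn.functional as F

def kl_div(p_logits, q_logits):
    """D_KL(P || Q) between the softmax distributions of two logit tensors."""
    p = F.softmax(p_logits, dim=1)
    return (p * (F.log_softmax(p_logits, dim=1)
                 - F.log_softmax(q_logits, dim=1))).sum(dim=1).mean()

def train_step(M, R, optimizer, x, x_adv, y):
    """x: first image batch; x_adv: its adversarial batch; y: class labels."""
    with torch.no_grad():
        first_pred = M(x)       # first prediction classification result
    second_pred = R(x_adv)      # second prediction classification result
    third_pred = R(x)           # third prediction classification result

    # Three-term loss of claim 1; the CE term involves only the frozen M,
    # so it contributes no gradient to R but is kept as written in claim 5.
    loss = (F.cross_entropy(first_pred, y)
            + kl_div(first_pred, second_pred)   # first vs. second prediction
            + kl_div(second_pred, third_pred))  # second vs. third prediction

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()            # adjusts only R's parameters
    return loss.item()
```

Here the optimizer is assumed to be constructed over R's parameters only, e.g. `torch.optim.SGD(R.parameters(), lr=alpha)`, so that only the second target classification model is adjusted.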
2. The training method of claim 1, wherein performing gradient iteration processing on the first image set to obtain the first adversarial image set comprises:
performing gradient iteration processing on the first image set based on a second preset loss function to obtain the first adversarial image set, wherein the second preset loss function comprises a classification loss function and an image semantic loss function.
3. The training method of claim 2, wherein performing gradient iteration processing on the first image set based on the second preset loss function to obtain the first adversarial image set comprises:
generating a first intermediate adversarial image through the following specific steps:
inputting a first image into the second target classification model and performing gradient processing to obtain a first intermediate adversarial image;
calculating the minimum similarity loss between the first image and the first intermediate adversarial image based on the image semantic loss function;
calculating the maximum cross-entropy loss between a fourth prediction classification result, obtained by inputting the first intermediate adversarial image into the second target classification model, and the classification label of the first image based on the classification loss function;
adjusting the first intermediate adversarial image according to the minimum similarity loss and the maximum cross-entropy loss, and taking the adjusted first intermediate adversarial image as the first image; and
repeating the step of generating the first intermediate adversarial image until the gradient iteration processing condition is met, and taking the finally adjusted first intermediate adversarial image as the first adversarial image.
4. The training method according to claim 3, wherein performing gradient iteration processing on the first image set based on the second preset loss function to obtain the first adversarial image set comprises:
calculating the first adversarial image set X' by iterating the following formulas:

$$X' \leftarrow X' + \gamma \cdot \nabla_{X'} L'$$

$$L' = \mathrm{CE\text{-}Loss}(R(X'), Y) - \mathrm{SSIM\text{-}Loss}(X, X')$$

wherein X' is the first adversarial image set, $X' = \{x_1', x_2', \ldots, x_m'\}$, m is the number of first images in the first image set, and initially X' = X; X is the first image set, $X = \{x_1, x_2, \ldots, x_m\}$; γ is a first scalar parameter (the iteration step size); L' is the second preset loss function; $\nabla_{X'} L'$ is the gradient of the second preset loss function with respect to the first adversarial image set; CE-Loss(R(X'), Y) is the classification loss function; SSIM-Loss(X, X') is the image semantic loss function; R is the second target classification model; Y is the set of classification labels of the first image set, $Y = \{y_1, y_2, \ldots, y_m\}$.
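For illustration only, the gradient iteration of claims 3 and 4 can be sketched as follows, assuming PyTorch and the `pytorch_msssim` package. Treating SSIM-Loss as 1 − SSIM, clamping pixels to [0, 1], and the default values of `gamma` and `steps` are assumptions of the sketch, not specified by the claims.

```python
# A minimal sketch of the adversarial-image generation of claims 3-4, assuming
# PyTorch and the pytorch-msssim package. Treating SSIM-Loss as 1 - SSIM,
# clamping pixels to [0, 1], and the default gamma/steps are assumptions.
import torch
import torch.nn.functional as F
from pytorch_msssim import ssim

def generate_adversarial(R, x, y, gamma=0.01, steps=10):
    """Gradient ascent on L' = CE-Loss(R(X'), Y) - SSIM-Loss(X, X')."""
    x_adv = x.clone().detach()                   # initially X' = X
    for _ in range(steps):                       # gradient iteration condition
        x_adv.requires_grad_(True)
        ce = F.cross_entropy(R(x_adv), y)        # maximize the cross-entropy loss
        ssim_loss = 1.0 - ssim(x, x_adv, data_range=1.0)  # minimize similarity loss
        loss = ce - ssim_loss                    # L' of claim 4
        grad = torch.autograd.grad(loss, x_adv)[0]
        # X' <- X' + gamma * grad(L'), then keep pixels in a valid range
        x_adv = (x_adv + gamma * grad).clamp(0.0, 1.0).detach()
    return x_adv
```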
5. The training method according to claim 1, wherein adjusting the model parameters of the second target classification model comprises adjusting them by the following formulas:

$$\theta \leftarrow \theta - \alpha \cdot \frac{1}{m} \sum_{i=1}^{m} \nabla_{\theta} L(x_i', x_i, y_i)$$

$$L(x_i', x_i, y_i) = \mathrm{CE\text{-}Loss}(M(x_i), y_i) + L\big(M(x_i), R(x_i')\big) + D_{\mathrm{KL}}\big(R(x_i'), R(x_i)\big)$$

wherein θ denotes the model parameters of the second target classification model; α is a second scalar parameter (the update step size); m is the number of first images in the first image set; $\nabla_{\theta} L$ is the gradient of the first preset loss function with respect to the model parameters of the second target classification model; $L(x_i', x_i, y_i)$ is the first preset loss function, $x_i'$ is the i-th first adversarial image, $x_i$ is the i-th first image, and $y_i$ is the classification label of the i-th first image; $\mathrm{CE\text{-}Loss}(M(x_i), y_i)$ is the loss between the first prediction classification result and the classification labels of the first image set; $L(M(x_i), R(x_i'))$ is the loss between the second prediction classification result, obtained by the second target classification model for the input first adversarial image set, and the first prediction classification result; $D_{\mathrm{KL}}(R(x_i'), R(x_i))$ is the loss between the second prediction classification result and the third prediction classification result, the latter obtained by the second target classification model for the input first image set; M is the first target classification model; R is the second target classification model.
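As a hedged illustration of the update formula above, a plain-SGD step might be written as follows; the helper name is a placeholder, and it assumes `loss.backward()` has already stored the batch-mean gradients (the 1/m sum over the m first images) in each `param.grad`.

```python
# A minimal sketch of the plain-SGD update of claim 5; the helper name is a
# placeholder. It assumes loss.backward() has already populated param.grad
# with the batch-mean gradient (the 1/m sum over the m first images).
import torch

def sgd_update(model, alpha):
    """theta <- theta - alpha * grad; mirrors torch.optim.SGD(lr=alpha)."""
    with torch.no_grad():
        for param in model.parameters():
            if param.grad is not None:
                param -= alpha * param.grad
```

In the sketch after claim 1, this update is what `optimizer.step()` performs when the optimizer is `torch.optim.SGD(R.parameters(), lr=alpha)`.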
6. An image classification method, comprising:
acquiring an image to be classified;
inputting the image to be classified into a classification model to obtain the image category of the image to be classified, wherein the classification model is a model trained by the training method of the classification model according to any one of claims 1-5.
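A hedged usage sketch for claim 6 follows, assuming PyTorch and torchvision; the input size and preprocessing are assumptions of the sketch, and R denotes the classification model trained as above.

```python
# A hedged usage sketch for claim 6: classifying an image with the trained
# classification model R. Input size and preprocessing are assumptions.
import torch
from PIL import Image
from torchvision import transforms

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),   # assumed model input size
    transforms.ToTensor(),           # pixel values scaled to [0, 1]
])

def classify(R, image_path):
    """Return the predicted image-category index for one image file."""
    image = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
    R.eval()
    with torch.no_grad():
        logits = R(image)
    return logits.argmax(dim=1).item()
```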
7. An image classification apparatus, comprising:
an acquisition module, configured to acquire an image to be classified; and
a classification module, configured to input the image to be classified into a classification model to obtain the image category of the image to be classified, wherein the classification model is a model trained by the training method of the classification model according to any one of claims 1-5.
8. An electronic device comprising a processor, a memory and a program or instruction stored on the memory and executable on the processor, which when executed by the processor, implements the steps of the method of any of claims 1-6.
9. A readable storage medium, characterized in that it stores thereon a program or instructions, which when executed by a processor, implement the steps of the method according to any of claims 1-6.
CN202210579153.0A 2022-05-26 2022-05-26 Training method of classification model, image classification method and device Active CN114841287B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210579153.0A CN114841287B (en) 2022-05-26 2022-05-26 Training method of classification model, image classification method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210579153.0A CN114841287B (en) 2022-05-26 2022-05-26 Training method of classification model, image classification method and device

Publications (2)

Publication Number Publication Date
CN114841287A (en) 2022-08-02
CN114841287B (en) 2024-07-16

Family

ID=82572830

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210579153.0A Active CN114841287B (en) 2022-05-26 2022-05-26 Training method of classification model, image classification method and device

Country Status (1)

Country Link
CN (1) CN114841287B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115375942A (en) * 2022-08-25 2022-11-22 中国银行股份有限公司 Image classification model training method and system, electronic equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111897964A (en) * 2020-08-12 2020-11-06 腾讯科技(深圳)有限公司 Text classification model training method, device, equipment and storage medium
CN114187483A (en) * 2021-10-25 2022-03-15 北京邮电大学 Methods for generating adversarial samples, training methods for detectors, and related equipment

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11636332B2 (en) * 2019-07-09 2023-04-25 Baidu Usa Llc Systems and methods for defense against adversarial attacks using feature scattering-based adversarial training

Also Published As

Publication number Publication date
CN114841287A (en) 2022-08-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant