
CN113240013A - Model training method, device and equipment based on sample screening and storage medium - Google Patents


Info

Publication number: CN113240013A
Authority: CN (China)
Prior art keywords: sample, data set, training data, classification result, classifier
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Application number: CN202110536342.5A
Other languages: Chinese (zh)
Inventors: 陈筱, 钱江, 庄伯金
Current Assignee: Ping An Technology Shenzhen Co Ltd (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Original Assignee: Ping An Technology Shenzhen Co Ltd
Application filed by Ping An Technology Shenzhen Co Ltd

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F 18/24: Classification techniques
    • G06F 18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to the technical field of artificial intelligence and provides a model training method, device, equipment and storage medium based on sample screening. The method comprises the following steps: inputting a training data set into a first classifier to obtain an output first classification result; taking the sample data belonging to a target class in the first classification result as a first abandoned sample, and removing the first abandoned sample from the training data set to obtain a first updated training data set; inputting the first updated training data set into a second classifier to obtain an output second classification result; taking the sample data belonging to the target class in the second classification result as a second abandoned sample, and removing the second abandoned sample from the first updated training data set to obtain a second updated training data set; repeating these steps until the classification result output by a classifier meets the model convergence requirement; and cascading the classifiers to obtain a target classification discrimination model, thereby improving the classification precision of the model.

Description

Model training method, device and equipment based on sample screening and storage medium
Technical Field
The application belongs to the technical field of artificial intelligence, and particularly relates to a model training method, a model training device, model training equipment and a storage medium based on sample screening.
Background
Data imbalance is one of the problems in classification tasks that seriously degrades classification performance, so resolving sample imbalance has long been an industry focus.
In existing data classification processes, the data in each sample class is unevenly distributed. During model training, samples that are easy to classify under the current network parameters may become hard to classify after the network parameters are modified, and the hard samples likewise change. Because the sets of hard and easy samples are unstable, the model training process oscillates and cannot converge, which in turn affects the training process and the classification performance of the trained model.
Disclosure of Invention
The embodiments of the present application provide a model training method, device, equipment and storage medium based on sample screening, aiming to solve the problems in the prior art that, because the data distribution across sample classes is unbalanced, the instability of hard and easy samples causes the model training process to oscillate and fail to converge, which affects both the training process and the classification performance of the trained model.
A first aspect of an embodiment of the present application provides a model training method based on sample screening, including:
acquiring a training data set;
inputting the training data set into a first classifier to obtain an output first classification result, wherein the first classification result comprises classes to which different sample data in the training data set belong;
taking the sample data belonging to the target category in the first classification result as a first abandoned sample, and removing the first abandoned sample from the training data set to obtain a first updated training data set;
inputting the first updated training data set into a second classifier to obtain an output second classification result, wherein the second classification result comprises classes to which different sample data in the first updated training data set belong;
taking the sample data belonging to the target class in the second classification result as a second abandoned sample, removing the second abandoned sample from the first updated training data set to obtain a second updated training data set, and repeating the steps until the classification result output by the classifier meets the model convergence requirement;
and cascading all the classifiers to obtain a target classification discrimination model.
A second aspect of the embodiments of the present application provides a model training apparatus based on sample screening, including:
the acquisition module is used for acquiring a training data set;
the classification module is used for inputting the training data set into a first classifier to obtain an output first classification result, and the first classification result comprises classes to which different sample data in the training data set belong;
the screening module is used for taking the sample data belonging to the target category in the first classification result as a first abandoned sample, and removing the first abandoned sample from the training data set to obtain a first updated training data set;
the cyclic execution module is used for inputting the first updated training data set into a second classifier to obtain an output second classification result, and the second classification result comprises classes to which different sample data in the first updated training data set belong; taking the sample data belonging to the target class in the second classification result as a second abandoned sample, removing the second abandoned sample from the first updated training data set to obtain a second updated training data set, and repeating the steps until the classification result output by the classifier meets the model convergence requirement;
and the model determining module is used for cascading all the classifiers to obtain a target classification discrimination model.
A third aspect of embodiments of the present application provides a terminal, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the method according to the first aspect when executing the computer program.
A fourth aspect of embodiments of the present application provides a computer-readable storage medium, in which a computer program is stored, which, when executed by a processor, performs the steps of the method according to the first aspect.
A fifth aspect of the present application provides a computer program product, which, when run on a terminal, causes the terminal to perform the steps of the method of the first aspect described above.
As can be seen from the above, in the embodiments of the present application, the obtained training data set is input into a first classifier to obtain an output classification result containing the classes to which the different sample data in the training data set belong. The sample data belonging to the target class in that result is taken as abandoned samples, which are removed from the training data set to obtain an updated training data set. The updated training data set is input into a new classifier to obtain an output classification result containing the classes to which the different sample data in the updated set belong; the sample data belonging to the target class is again taken as abandoned samples and removed to obtain a further updated training data set; and so on, until the classification result output by the last classifier meets the model convergence requirement, at which point all the classifiers are cascaded to obtain the target classification discrimination model. The process introduces the idea of cascade screening: easily separable samples are screened out of the massive majority classes and removed, and the sample data remaining after their removal is used as the training data of subsequent classifiers. The easily separable samples therefore no longer participate in subsequent model training, which avoids the problem of easy samples turning into hard samples in later training and harming classification performance, and improves the model's classification effect on imbalanced data.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a first flowchart of a model training method based on sample screening according to an embodiment of the present application;
FIG. 2 is a schematic diagram of sample distribution and sample screening according to an embodiment of the present application;
FIG. 3 is a schematic diagram of reclassification of retained samples according to an embodiment of the present application;
FIG. 4 is a second flowchart of a model training method based on sample screening according to an embodiment of the present application;
FIG. 5 is a structural diagram of a model training apparatus based on sample screening according to an embodiment of the present application;
FIG. 6 is a structural diagram of a terminal according to an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the present application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in the specification of the present application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when", "upon" or "in response to a determination" or "in response to a detection". Similarly, the phrase "if it is determined" or "if a [ described condition or event ] is detected" may be interpreted contextually to mean "upon determining" or "in response to determining" or "upon detecting [ described condition or event ]" or "in response to detecting [ described condition or event ]".
In particular implementations, the terminals described in the embodiments of the present application include, but are not limited to, portable devices such as mobile phones, laptop computers, or tablet computers having touch-sensitive surfaces (e.g., touch-screen displays and/or touch pads). It should also be understood that in some embodiments the device is not a portable communication device but a desktop computer having a touch-sensitive surface (e.g., a touch-screen display and/or touchpad).
In the discussion that follows, a terminal that includes a display and a touch-sensitive surface is described. However, it should be understood that the terminal may include one or more other physical user interface devices such as a physical keyboard, mouse, and/or joystick.
The terminal supports various applications, such as one or more of the following: a drawing application, a presentation application, a word processing application, a website creation application, a disc burning application, a spreadsheet application, a gaming application, a telephone application, a video conferencing application, an email application, an instant messaging application, an exercise support application, a photo management application, a digital camera application, a web browsing application, a digital music player application, and/or a digital video player application.
Various applications that may be executed on the terminal may use at least one common physical user interface device, such as a touch-sensitive surface. One or more functions of the touch-sensitive surface and corresponding information displayed on the terminal can be adjusted and/or changed between applications and/or within respective applications. In this way, a common physical architecture (e.g., touch-sensitive surface) of the terminal can support various applications with user interfaces that are intuitive and transparent to the user.
It should be understood that the sequence numbers of the steps in this embodiment do not imply an execution order; the execution order of each process should be determined by its function and internal logic and should not constitute any limitation on the implementation of the embodiments of the present application.
In order to explain the technical solution described in the present application, the following description will be given by way of specific examples.
Referring to fig. 1, fig. 1 is a first flowchart of a model training method based on sample screening according to an embodiment of the present application. As shown in fig. 1, a model training method based on sample screening includes the following steps:
step 101, a training data set is obtained.
The training data set may be a pre-configured data set or may be obtained by capturing information from a network. The manner in which the training data set is obtained is not particularly limited.
In obtaining the training data set, it may be configured to contain training samples of set categories. Specifically, first training samples of a first category may be selected according to a first quantity, second training samples of a second category may be selected according to a second quantity, and the data set containing the first and second training samples may be used as the training data set. The difference between the numbers of training samples of different categories may be set to be greater than a threshold; specifically, the difference between the first quantity and the second quantity may be set to be greater than the threshold. The training data set thus contains training data with an extremely uneven quantity distribution, and this construction of the sample data helps improve the training effect in the subsequent model training process.
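The construction described above can be illustrated with a short sketch. The class sizes, feature dimension, and Gaussian sampling below are illustrative assumptions rather than values from this application; only the large gap between the first and second quantities reflects the described setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def build_training_dataset(first_quantity=10000, second_quantity=100, dim=8):
    """Select first_quantity samples of the first (majority) category and
    second_quantity samples of the second category; the difference between
    the two quantities exceeds the threshold by construction."""
    x_first = rng.normal(loc=0.0, size=(first_quantity, dim))    # class M samples
    x_second = rng.normal(loc=2.0, size=(second_quantity, dim))  # class L samples
    x = np.concatenate([x_first, x_second])
    y = np.concatenate([np.zeros(first_quantity, dtype=int),     # label 0: class M
                        np.ones(second_quantity, dtype=int)])    # label 1: class L
    return x, y

x_train, y_train = build_training_dataset()
```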
Step 102, inputting the training data set into a first classifier to obtain an output first classification result.
Wherein, the first classification result comprises the classes of different sample data in the training data set.
Here, a classifier is provided and the training data set is used as its input, so that the classifier performs class prediction on the sample data in the training data set.
Step 103, taking the sample data belonging to the target category in the first classification result as a first abandoned sample, and removing the first abandoned sample from the training data set to obtain a first updated training data set.
Wherein the target class may be a class in which the number of samples in the training dataset exceeds a threshold. That is, when constructing the training data set, the sample class having a large number of sample distributions in the training data set is set as the target class. The target category may be specified in advance by the relevant operator.
Specifically, in the foregoing step, when the data set including the first training sample and the second training sample is taken as the training data set, the first class is determined as the target class.
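A small sketch of this target-class rule follows; the counting helper and the threshold value are illustrative assumptions.

```python
from collections import Counter

def determine_target_class(labels, count_threshold):
    """Return a class whose sample count in the training data set exceeds
    the threshold (the majority class), or None if there is no such class."""
    for cls, count in Counter(labels).items():
        if count > count_threshold:
            return cls
    return None

# e.g. with 10000 class-0 samples, 100 class-1 samples, and threshold 5000,
# class 0 is determined as the target class.
```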
While the sample data belonging to the target class is taken as the abandoned samples, the sample data belonging to all classes other than the target class is kept as retained samples, so that the content of the training data set is updated by removing the first abandoned samples from the training data set and keeping the retained samples.

In this step, because the data distribution in the training data set may be uneven, data categories can be misjudged. The sample data therefore needs to be screened based on the classification result output by the classifier.

Specifically, based on the class prediction result of the classifier, the easily separable samples (i.e., the training samples the classifier judges to belong to the target class) are screened out and removed from the original training data set, and the other sample data (the retained samples, i.e., the hard samples) is kept. The data in the training data set is thereby updated, which reduces the interference that conversion between easy and hard samples would otherwise cause in subsequent model training and avoids the oscillation that would prevent the training process from converging.
Consider a classification scenario with severely imbalanced data, taking a binary classification network as an example, as shown in fig. 2. The circular patterns in the figure represent class-M samples, the star patterns represent class-L samples, and the number of class-M samples is far greater than the number of class-L samples. When the classifier classifies the two kinds of patterns, the line in the figure serves as the boundary between class M and class L: the pattern content on the left of the line is recognized as class M, and the content on the right as class L. Because the circular class-M samples far outnumber the star class-L samples, there is too little class-L sample data, so the model can learn only a few class-L data features. The classifier therefore easily misjudges class-L samples, which tend to become hard samples; at this point, the sample data on the right of the line that the model judges as class L actually contains sample data that should belong to class M, i.e., misrecognition occurs when identifying class-L samples. In the subsequent step, the sample data judged to be easily separable is discarded based on the output of the classifier, i.e., the sample data on the left of the line in fig. 2 that the classifier judges as class M is discarded.
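A minimal sketch of this screening step, assuming a scikit-learn-style classifier with a predict method; the helper name and the encoding of the target class as label 0 are assumptions for illustration.

```python
import numpy as np

def screen_abandoned_samples(classifier, x, y, target_class=0):
    """Treat samples the classifier predicts as the target (majority) class
    as abandoned samples; return only the retained samples as the updated
    training data set."""
    predictions = classifier.predict(x)
    keep = predictions != target_class  # retained (hard) samples
    return x[keep], y[keep]
```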
Step 104, inputting the first updated training data set into a second classifier to obtain an output second classification result.
And the second classification result comprises the classes to which different sample data in the first updated training data set belong.
Here, the second classifier is a classifier different from the first classifier. After the training data set has been screened based on the classification result of the first classifier, the updated training data continues to be classified by this other classifier.
Step 105, taking the sample data belonging to the target class in the second classification result as a second abandoned sample, removing the second abandoned sample from the first updated training data set to obtain a second updated training data set, and repeating the steps until the classification result output by the classifier meets the model convergence requirement.
In the above process, the processing in steps 104 and 105, in which the sample data belonging to the target class in the second classification result is taken as the second abandoned sample and removed from the first updated training data set to obtain a second updated training data set, continues the processing of the updated training data set begun in steps 102 and 103; the specific operations on the training samples are the same as those in steps 102 and 103.
Referring to fig. 3, the training data set from which the easily separable data was removed in step 103 (i.e., the first updated training data set) is used as the input data of the classifier in the next round of processing (i.e., the second classifier). In fig. 3, the samples the binary classifier recognizes as class M lie below the line, and the samples it recognizes as class L lie above the line. In this process, the easily separable data is removed layer by layer through cyclic elimination, which reduces the classification interference of easy samples on hard samples as much as possible and allows the hard data to be distinguished as accurately as possible during model training.
Steps 104 and 105 further screen the training samples in the updated training data set, achieving a further update of that data set.

Similarly, while the sample data belonging to the target class is taken as the abandoned samples of this round of screening (i.e., the second abandoned samples), the sample data belonging to the other classes is kept as retained samples, so that the updated training data set is further updated by removing the second abandoned samples and keeping the retained samples.

Here, "in this way" specifically means the following. Starting from the initially constructed training data set, a cyclic process is followed, classifier by classifier. In each round, the abandoned samples determined from the classification result of the previous classifier are discarded, the training data set is updated accordingly, and the updated training data set is used as the input of the current classifier. The current classifier processes its input, abandoned samples are determined from its classification result and discarded to update the training data set again, the result is fed to the next classifier, and this process of determining abandoned samples from the classification result and updating the training data set is repeated until the classification result output by a classifier meets the model convergence requirement.

Each iteration of this loop sets a new classifier, which performs category prediction on the updated training data set; this reduces data interference between models during the cyclic processing, as the sketch below illustrates.
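A minimal sketch of the loop, assuming scikit-learn classifiers; the classifier type, the error-based convergence test, and the round cap are placeholder assumptions (the cap is discussed as the quantity threshold in the second embodiment below).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def converged(predictions, y, tol=0.01):
    """Placeholder convergence test: classification error below a small value."""
    return np.mean(predictions != y) < tol

def train_cascade(x, y, target_class=0, max_classifiers=5):
    classifiers = []
    for _ in range(max_classifiers):
        clf = LogisticRegression(max_iter=1000).fit(x, y)  # a new classifier each round
        classifiers.append(clf)
        predictions = clf.predict(x)
        if converged(predictions, y):        # classification result meets the requirement
            break
        keep = predictions != target_class   # discard the easily separable samples
        x, y = x[keep], y[keep]
        if len(np.unique(y)) < 2:            # nothing left to separate
            break
    return classifiers
```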
Specifically, as an optional implementation manner, after taking the sample data belonging to the target category in the second classification result as a second discarded sample, and removing the second discarded sample from the updated training data set to obtain a second updated training data set, the method further includes:
judging whether the second classification result meets the model convergence requirement or not;
if the second classification result does not meet the model convergence requirement, inputting the second updated training data set into a third classifier to obtain an output third classification result, wherein the third classification result comprises the classes to which different sample data in the second updated training data set belong; and taking the sample data belonging to the target class in the third classification result as a third abandoned sample, and removing the third abandoned sample from the second updated training data set to obtain a third updated training data set.
The judgment that a classification result meets the model convergence requirement may be made in several ways. The error of the classifier's classification result may be evaluated, and when the error is smaller than a preset small value, the classification result is judged to meet the model convergence requirement. Alternatively, during the classifier's data iteration, a threshold may be set on the weight change between two adjacent iterations; when the weight change is very small, i.e., below that threshold, the classification result is judged to meet the model convergence requirement and training stops. A maximum number of iterations may also be set for the classifier; when the iterations exceed that maximum, the classification result is judged to meet the model convergence requirement and training stops. Finally, when the number of classifiers is such that the computational complexity of the model formed by the classifiers reaches a threshold, the classification result may be judged to meet the model convergence requirement and training stops.
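These alternative tests can be sketched as follows; the specific tolerances and the way the network weights are obtained are illustrative assumptions.

```python
import numpy as np

def meets_convergence(predictions, labels, weights=None, prev_weights=None,
                      iteration=0, max_iterations=100,
                      error_tol=0.01, weight_tol=1e-4):
    # Criterion 1: classification error smaller than a preset small value.
    if np.mean(predictions != labels) < error_tol:
        return True
    # Criterion 2: weight change between two adjacent iterations below a threshold.
    if weights is not None and prev_weights is not None:
        if np.max(np.abs(weights - prev_weights)) < weight_tol:
            return True
    # Criterion 3: maximum number of iterations reached or exceeded.
    return iteration >= max_iterations
```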
In this way, the training data set is repeatedly updated and a classifier is trained on each updated data set, and so on, until the classification result output by a classifier meets the model convergence requirement.
Step 106, cascading all the classifiers to obtain a target classification discrimination model.
Here, all the classifiers refers to every classifier used in the process of reaching a classification result that meets the model convergence requirement, including the first classifier and the second classifier in the preceding steps.

During the continuous updating of the training data set, all of these classifiers are trained. Each time the training data set is updated according to a classifier's classification result, the easily separable data is eliminated, so the easy samples no longer participate in subsequent model training, which avoids the problem of easy samples turning into hard samples in later training and harming classification performance.

After all the classifiers are cascaded, the resulting target classification discrimination model realizes cascade screening: large numbers of easily separable samples are screened out, the degree of data imbalance is reduced, and the classifiers focus more on the hard samples, which finally improves classification precision and realizes a model training process based on the cascade idea.

When cascading all the set classifiers, the cascade order is determined by the order in which the classifiers were set, and the classifiers are cascaded in that order to obtain the target classification discrimination model.

In this process, in combination with the foregoing embodiment, if the second classification result meets the model convergence requirement, the first classifier and the second classifier are cascaded to obtain the target classification discrimination model.
In the specific implementation process, all classifiers are cascaded to obtain a target classification discrimination model, which includes:
according to the using sequence of all the classifiers, sequentially connecting input and output channels of all the classifiers; and taking all the classifiers after input and output connection as a whole to obtain a target classification discrimination model.
The using sequence of the classifiers refers to the using sequence of the classifiers for sequentially carrying out data classification processing on the training data set in the process that the classification result of the classifiers meets the model convergence requirement.
When all the classifiers are connected in sequence, the output channels and the input channels of two adjacent classifiers are connected according to the using sequence of the classifiers, and a target classification discrimination model is obtained.
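One plausible way the connected input and output channels can behave at inference time is sketched below: each stage either screens a sample out as the target class or hands it to the next stage, and the last stage's output is final. This reading is an assumption; the application itself specifies only that the output and input channels of adjacent classifiers are connected in order of use.

```python
import numpy as np

def cascade_predict(classifiers, sample, target_class=0):
    """Pass a single sample through the cascaded discrimination model."""
    sample = np.asarray(sample).reshape(1, -1)
    for clf in classifiers[:-1]:
        if clf.predict(sample)[0] == target_class:
            return target_class                 # screened out at this stage
    return classifiers[-1].predict(sample)[0]   # the final stage decides the rest
```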
In the embodiment of the present application, the obtained training data set is input into a first classifier to obtain an output classification result containing the classes to which the different sample data in the training data set belong. The sample data belonging to the target class in that result is taken as abandoned samples, which are removed from the training data set to obtain an updated training data set. The updated training data set is input into a new classifier to obtain an output classification result containing the classes to which the different sample data in the updated set belong; the sample data belonging to the target class is again taken as abandoned samples and removed to update the training data set once more; and so on, until the classification result output by the last classifier meets the model convergence requirement, at which point all the classifiers are cascaded to obtain the target classification discrimination model. The process introduces the idea of cascade screening: easily separable samples are screened out of the massive majority classes and removed, and the remaining sample data is used as the training data of subsequent classifiers, so that easy samples no longer participate in subsequent model training. This avoids the problem of easy samples turning into hard samples in later training and harming classification performance, and improves the model's classification effect on imbalanced data.
The embodiment of the application also provides different implementation modes of the model training method based on sample screening.
Referring to fig. 4, fig. 4 is a second flowchart of a model training method based on sample screening according to an embodiment of the present application. As shown in fig. 4, the model training method based on sample screening includes the following steps:
step 401, a training data set is obtained.
The implementation process of this step is the same as that of step 101 in the foregoing embodiment, and is not described here again.
Step 402, inputting the training data set into a first classifier to obtain an output first classification result.
The first classification result comprises the classes of different sample data in the training data set.
The implementation process of this step is the same as that of step 102 in the foregoing embodiment, and is not described here again.
Step 403, taking the sample data belonging to the target category in the first classification result as a first discarded sample, and removing the first discarded sample from the training data set to obtain a first updated training data set.
The implementation process of this step is the same as the implementation process of step 103 in the foregoing embodiment, and is not described here again.
Step 404, inputting the first updated training data set into the second classifier to obtain an output second classification result.
The second classification result includes the class to which different sample data in the first updated training data set belongs.
The implementation process of this step is the same as that of step 104 in the foregoing embodiment, and is not described here again.
Step 405, taking the sample data belonging to the target category in the second classification result as a second discarded sample, and removing the second discarded sample from the first updated training data set to obtain a second updated training data set.
The implementation process of this step is the same as that of step 105 in the foregoing embodiment, and is not described here again.
Step 406, accumulating the number of classifiers used to obtain the classifier quantity value.
In this step, the classifier quantity value counts the number of classifier cascade layers, so that the number of layers can be controlled in the subsequent cascading process and the computational complexity of the final target classification discrimination model kept within a reasonable range.
Step 407, determine whether the classifier quantity value reaches a threshold value.
The quantity threshold may be set based on a theoretical value; a specific value may be, for example, 1 or 2.

It should be noted that the computational complexity of the target classification discrimination model is the sum of the computational complexities of all the classifiers, so the algorithm may increase the amount of computation. To alleviate this, the maximum classifier number N_max (i.e., the quantity threshold) may be set small; for example, when N_max is 1, only one round of screening is performed.

This step reduces the degree of sample imbalance as much as possible while keeping the computational complexity reasonable, thereby improving classification precision.

Before judging whether the classifier quantity value reaches the threshold, the method further includes: judging whether the classification result output by the current classifier meets the model convergence requirement; and if the classification result does not meet the model convergence requirement, executing the step of judging whether the classifier quantity value reaches the threshold.
The process of determining whether the classification result output by the classifier meets the model convergence requirement is the same as the specific implementation process mentioned in step 105 of the foregoing embodiment, and is not repeated here.
Correspondingly, after determining whether the quantity value reaches the quantity threshold, the method further includes: if the quantity value has reached the quantity threshold, the model training operation is terminated.
In this process, before judging whether the number of set classifiers has reached the maximum, it is judged whether the current classifier meets the set performance requirement, i.e., whether the model converges. If it does, model training is considered to meet the requirement, and all the classifiers are directly cascaded to obtain the target classification discrimination model. If it does not, it is judged whether the number of set classifiers has reached the threshold. If the threshold has been reached and the precision requirement is still not met, model training has failed, and the training operation is terminated; if the threshold has not been reached, the cascaded cyclic training process continues within the limit of the threshold.

Step 408, if the classifier quantity value has not reached the quantity threshold, continue in the same manner until the classification result output by a classifier meets the model convergence requirement.

If the classifier quantity value has not reached the quantity threshold, the updated training data set is input into the next classifier for classification; the processing is the same as the classification processing of the training data set described in the preceding steps.
The specific processing procedure is the same as the implementation procedure mentioned in step 105 in the foregoing embodiment, and is not described here again.
Step 409, cascading all the classifiers to obtain a target classification discrimination model.
The implementation process of this step is the same as that of step 106 in the foregoing embodiment, and is not described here again.
The implementation steps of this embodiment are described below in conjunction with a specific model training process:

First, a maximum number of classifiers N_max, i.e., the quantity threshold, may be set based on the original data set (i.e., the severely class-imbalanced data set).

Second, a classifier is set, through which the sample data can be screened; the classifier acts as a filter for screening out easily separable samples. Its optimization goal during training is that, with the FN count held constant, the TN count should be as large as possible. That is, with the number of positive samples wrongly classified as negative held constant, as many negative samples as possible are identified; in other words, with the number of wrongly screened positive samples held constant, as many negative samples as possible are screened out of the data set. Here FN (false negative) means the classifier's recognition is wrong: it considers the sample negative when the sample is actually positive; TN (true negative) means the classifier's recognition is correct: it considers the sample negative and the sample is negative.

Next, the samples are screened with the classifier. As shown in fig. 2, the samples the classifier predicts as class M are thrown away, and the samples it predicts as class L are kept for the next round of training. Meanwhile, the count of set classifiers is accumulated. If the classifier count N_filter satisfies N_filter > N_max, the upper limit on screening rounds has been reached without meeting the task performance target, so the screening is reported as failed and the model training process exits. If N_filter ≤ N_max, the retained samples predicted as class L are used as the training samples for the next round, and the newly set classifier is trained, and so on until the classification result output by a classifier meets the set performance requirement.
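The stated filter objective (hold the FN count constant while making TN as large as possible) can be sketched as a threshold search over classifier scores; the score convention (higher score = more likely class L) and the FN budget are assumptions for illustration.

```python
import numpy as np

def pick_screening_threshold(scores, labels, minority_class=1, fn_budget=5):
    """Screen out samples with score <= threshold as class M, choosing the
    largest threshold that wrongly screens out at most fn_budget class-L
    (positive) samples, so that as many class-M (negative) samples as
    possible are screened."""
    order = np.argsort(scores)   # ascending: most class-M-like samples first
    false_negatives = 0
    threshold = -np.inf          # start by screening out nothing
    for idx in order:
        if labels[idx] == minority_class:
            false_negatives += 1
            if false_negatives > fn_budget:
                break            # FN budget exhausted; stop extending the screen
        threshold = scores[idx]
    return threshold
```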
Further, after all the classifiers are cascaded to obtain the target classification discrimination model, the method further includes: uploading the target classification discrimination model to a blockchain.

In all embodiments of the present application, uploading the target classification discrimination model to the blockchain ensures its security and its fairness and transparency to the user. User equipment may download the target classification discrimination model from the blockchain to verify whether it has been tampered with. The blockchain referred to in this example is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks associated by cryptographic methods, each data block containing the information of a batch of network transactions, used to verify the validity (anti-counterfeiting) of its information and to generate the next block. A blockchain may comprise an underlying blockchain platform, a platform product service layer, an application service layer, and the like.
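A minimal sketch of the tamper check, assuming the on-chain record stores a digest of the serialized model; the serialization format and the chain interface are not specified in this application.

```python
import hashlib
import pickle

def model_digest(model) -> str:
    """Digest of the serialized model; the on-chain record would store this value."""
    return hashlib.sha256(pickle.dumps(model)).hexdigest()

def verify_downloaded_model(model, onchain_digest: str) -> bool:
    """Recompute the digest of the downloaded model and compare it with the
    value read from the blockchain."""
    return model_digest(model) == onchain_digest
```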
The embodiment of the present application introduces the idea of cascade screening: easily separable samples are screened out of the massive majority classes and removed, the sample data remaining after their removal is used as the training data of subsequent classifiers, and the easy samples no longer participate in subsequent model training. Meanwhile, by setting the quantity threshold on the number of classifiers, the degree of sample imbalance is reduced as much as possible while the model's computational complexity is kept reasonable. This avoids the problem of easy samples turning into hard samples in later training and harming classification performance, improves the model's classification effect on imbalanced data, and improves classification precision.
Referring to fig. 5, fig. 5 is a structural diagram of a model training apparatus based on sample screening according to an embodiment of the present application, and for convenience of description, only a part related to the embodiment of the present application is shown.
The model training apparatus 500 based on sample screening includes:
an obtaining module 501, configured to obtain a training data set;
a classification module 502, configured to input the training data set into a first classifier to obtain an output first classification result, where the first classification result includes classes to which different sample data in the training data set belong;
a screening module 503, configured to use the sample data belonging to the target category in the first classification result as a first discarded sample, and remove the first discarded sample from the training data set to obtain a first updated training data set;
a loop execution module 504, configured to input the first updated training data set into a second classifier, so as to obtain an output second classification result, where the second classification result includes classes to which different sample data in the first updated training data set belong; taking the sample data belonging to the target class in the second classification result as a second abandoned sample, removing the second abandoned sample from the first updated training data set to obtain a second updated training data set, and repeating the steps until the classification result output by the classifier meets the model convergence requirement;
and the model determining module 505 is configured to cascade all the classifiers to obtain a target classification and judgment model.
Wherein, the model training device further comprises:
the judging module is used for accumulating the number of the used classifiers to obtain the classifier number value; judging whether the quantity value of the classifier reaches a threshold value; if the quantity value of the classifier does not reach the quantity threshold value, the analogy is carried out until the classification result output by the classifier meets the model convergence requirement.
The device also includes:
the performance judgment module is used for judging whether the classification result output by the current classifier meets the model convergence requirement or not; and if the classification result does not meet the model convergence requirement, executing the step of judging whether the classifier quantity value reaches a threshold value.
Further, the loop execution module is further configured to:
judging whether the second classification result meets the model convergence requirement or not;
if the second classification result does not meet the model convergence requirement, inputting the second updated training data set into a third classifier to obtain an output third classification result, wherein the third classification result comprises the classes of different sample data in the second updated training data set;
and taking the sample data belonging to the target category in the third classification result as a third abandoned sample, and removing the third abandoned sample from the second updated training data set to obtain a third updated training data set.
Further, the model determination module is specifically configured to:
according to the using sequence of all the classifiers, sequentially connecting input and output channels of all the classifiers;
and taking all the classifiers after input and output connection as a whole to obtain the target classification discrimination model.
Wherein, the acquisition module is specifically configured to:
selecting a first training sample of a first category according to the first number;
selecting a second training sample of a second category according to the second quantity; wherein a difference between the first number and the second number is greater than a threshold, the first class being determined to be the target class;
using a data set comprising the first training sample and the second training sample as the training data set.
The device also includes:
and the storage module is used for uploading the target classification discrimination model to a block chain.
The model training device based on sample screening provided by the embodiment of the application can realize each process of the embodiment of the model training method based on sample screening, and can achieve the same technical effect, and in order to avoid repetition, the repeated description is omitted here.
Fig. 6 is a block diagram of a terminal according to an embodiment of the present application. As shown in the figure, the terminal 6 of this embodiment includes: at least one processor 60 (only one shown in fig. 6), a memory 61, and a computer program 62 stored in the memory 61 and executable on the at least one processor 60, the steps of any of the various method embodiments described above being implemented when the computer program 62 is executed by the processor 60.
The terminal 6 may be a desktop computer, a notebook, a palm computer, a cloud server, or other computing devices. The terminal 6 may include, but is not limited to, a processor 60, a memory 61. It will be appreciated by those skilled in the art that fig. 6 is only an example of a terminal 6 and does not constitute a limitation of the terminal 6, and that it may comprise more or less components than those shown, or some components may be combined, or different components, for example the terminal may further comprise input output devices, network access devices, buses, etc.
The Processor 60 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic device, discrete hardware component, etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 61 may be an internal storage unit of the terminal 6, such as a hard disk or a memory of the terminal 6. The memory 61 may also be an external storage device of the terminal 6, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) and the like provided on the terminal 6. Further, the memory 61 may also include both an internal storage unit and an external storage device of the terminal 6. The memory 61 is used for storing the computer program and other programs and data required by the terminal. The memory 61 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/terminal and method may be implemented in other ways. For example, the above-described apparatus/terminal embodiments are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated modules/units, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer readable storage medium. Based on such understanding, all or part of the flow in the method of the embodiments described above can be realized by a computer program, which can be stored in a computer-readable storage medium and can realize the steps of the embodiments of the methods described above when the computer program is executed by a processor. Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying the computer program code, recording medium, usb disk, removable hard disk, magnetic disk, optical disk, computer Memory, Read-Only Memory (ROM), Random Access Memory (RAM), electrical carrier wave signals, telecommunications signals, software distribution medium, and the like. It should be noted that the computer readable medium may contain content that is subject to appropriate increase or decrease as required by legislation and patent practice in jurisdictions, for example, in some jurisdictions, computer readable media does not include electrical carrier signals and telecommunications signals as is required by legislation and patent practice.
All or part of the processes in the methods of the above embodiments may also be implemented by a computer program product: when the computer program product runs on a terminal, the terminal executes it and thereby carries out the steps of the above method embodiments.
The above-mentioned embodiments are only used to illustrate the technical solutions of the present application, not to limit them; although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (10)

1. A model training method based on sample screening is characterized by comprising the following steps:
acquiring a training data set;
inputting the training data set into a first classifier to obtain an output first classification result, wherein the first classification result comprises classes to which different sample data in the training data set belong;
taking the sample data belonging to the target category in the first classification result as a first discarded sample, and removing the first discarded sample from the training data set to obtain a first updated training data set;
inputting the first updated training data set into a second classifier to obtain an output second classification result, wherein the second classification result comprises classes to which different sample data in the first updated training data set belong;
taking the sample data belonging to the target category in the second classification result as a second discarded sample, removing the second discarded sample from the first updated training data set to obtain a second updated training data set, and so on until the classification result output by the classifier meets the model convergence requirement;
and cascading all the classifiers to obtain a target classification discrimination model.
2. The model training method according to claim 1, wherein after the removing the second discarded sample from the first updated training data set to obtain a second updated training data set, the method further comprises:
accumulating the number of classifiers used to obtain a classifier count value;
judging whether the classifier count value reaches a threshold value;
and if the classifier count value does not reach the threshold value, continuing in the same manner until the classification result output by the classifier meets the model convergence requirement.
3. The model training method of claim 2, wherein before judging whether the classifier count value reaches the threshold value, the method further comprises:
judging whether the classification result output by the current classifier meets the model convergence requirement;
and if the classification result does not meet the model convergence requirement, executing the step of judging whether the classifier count value reaches the threshold value.
4. The model training method according to any one of claims 1 to 3, wherein after the taking the sample data belonging to the target category in the second classification result as a second discarded sample and removing the second discarded sample from the first updated training data set to obtain a second updated training data set, the method further comprises:
judging whether the second classification result meets the model convergence requirement;
if the second classification result does not meet the model convergence requirement, inputting the second updated training data set into a third classifier to obtain an output third classification result, the third classification result comprising classes to which different sample data in the second updated training data set belong;
and taking the sample data belonging to the target category in the third classification result as a third discarded sample, and removing the third discarded sample from the second updated training data set to obtain a third updated training data set.
5. The model training method of claim 1, wherein the cascading all the classifiers to obtain a target classification discrimination model comprises:
connecting the input and output channels of all the classifiers in sequence, according to the order in which the classifiers were used;
and taking all the input-and-output-connected classifiers as a whole to obtain the target classification discrimination model.
6. The model training method of claim 1, wherein the acquiring a training data set comprises:
selecting first training samples of a first category according to a first number;
selecting second training samples of a second category according to a second number, wherein the difference between the first number and the second number is greater than a threshold value, and the first category is determined to be the target category;
and using a data set comprising the first training samples and the second training samples as the training data set.
7. The model training method of claim 1, wherein after cascading all the classifiers to obtain the target classification discrimination model, the method further comprises:
uploading the target classification discrimination model to a blockchain.
8. A model training device based on sample screening is characterized by comprising:
the acquisition module is used for acquiring a training data set;
the classification module is used for inputting the training data set into a first classifier to obtain an output first classification result, and the first classification result comprises classes to which different sample data in the training data set belong;
the screening module is used for taking the sample data belonging to the target category in the first classification result as a first discarded sample, and removing the first discarded sample from the training data set to obtain a first updated training data set;
the cyclic execution module is used for inputting the first updated training data set into a second classifier to obtain an output second classification result, the second classification result comprising classes to which different sample data in the first updated training data set belong; taking the sample data belonging to the target category in the second classification result as a second discarded sample, removing the second discarded sample from the first updated training data set to obtain a second updated training data set, and so on until the classification result output by the classifier meets the model convergence requirement;
and the model determining module is used for cascading all the classifiers to obtain a target classification discrimination model.
9. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
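Illustrative sketch (not part of the patent disclosure): the claims above describe the cascade procedure only in prose, and they leave the classifier family, the model convergence requirement, and the inference-time behaviour unspecified. The following Python sketch fills those gaps with assumptions chosen purely for illustration: scikit-learn decision trees stand in for the unspecified classifiers, a training-accuracy test stands in for the unspecified convergence requirement, and the names train_cascade and cascade_predict are hypothetical.

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def train_cascade(X, y, target_category, max_classifiers=5, converge_acc=0.99):
        # Claims 1-4: train classifiers in sequence; after each stage, samples
        # predicted as the target category become discarded samples and are
        # removed before the next stage is trained. max_classifiers plays the
        # role of the classifier-count threshold of claim 2; the assumed
        # training-accuracy test stands in for the convergence requirement.
        classifiers = []
        X_cur, y_cur = X, y
        for _ in range(max_classifiers):
            clf = DecisionTreeClassifier(max_depth=3)
            clf.fit(X_cur, y_cur)
            classifiers.append(clf)
            pred = clf.predict(X_cur)
            if (pred == y_cur).mean() >= converge_acc:
                break  # claim 3: convergence reached, no further stage needed
            keep = pred != target_category
            if keep.all() or not keep.any():
                break  # nothing was discarded, or nothing would remain
            X_cur, y_cur = X_cur[keep], y_cur[keep]
        return classifiers

    def cascade_predict(classifiers, X, target_category):
        # Claim 5: connect the stages in the order they were used; a sample
        # that any stage assigns to the target category keeps that label,
        # every other sample falls through to the next stage.
        labels = np.empty(len(X), dtype=object)
        remaining = np.arange(len(X))
        for clf in classifiers:
            pred = clf.predict(X[remaining])
            hit = pred == target_category
            labels[remaining[hit]] = target_category
            remaining = remaining[~hit]
            if remaining.size == 0:
                return labels
        labels[remaining] = classifiers[-1].predict(X[remaining])
        return labels

Under the same assumptions, for the imbalanced training data set of claim 6 the majority (first) category would be passed as target_category, so that each stage screens out majority-category predictions and later stages train on a progressively more balanced remainder, for example:

    cascade = train_cascade(X_train, y_train, target_category=0)
    y_pred = cascade_predict(cascade, X_test, target_category=0)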
CN202110536342.5A 2021-05-17 2021-05-17 Model training method, device and equipment based on sample screening and storage medium Pending CN113240013A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110536342.5A CN113240013A (en) 2021-05-17 2021-05-17 Model training method, device and equipment based on sample screening and storage medium

Publications (1)

Publication Number Publication Date
CN113240013A 2021-08-10

Family

ID=77134750

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110536342.5A Pending CN113240013A (en) 2021-05-17 2021-05-17 Model training method, device and equipment based on sample screening and storage medium

Country Status (1)

Country Link
CN (1) CN113240013A (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080147577A1 (en) * 2006-11-30 2008-06-19 Siemens Medical Solutions Usa, Inc. System and Method for Joint Optimization of Cascaded Classifiers for Computer Aided Detection
CN105404901A (en) * 2015-12-24 2016-03-16 上海玮舟微电子科技有限公司 Training method of classifier, image detection method and respective system

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113569986A (en) * 2021-08-18 2021-10-29 网易(杭州)网络有限公司 Computer vision data classification method and device, electronic equipment and storage medium
CN113569986B (en) * 2021-08-18 2023-06-30 网易(杭州)网络有限公司 Computer vision data classification method, device, electronic equipment and storage medium
WO2023028997A1 (en) * 2021-09-03 2023-03-09 Paypal, Inc. Exhaustive learning techniques for machine learning algorithms
CN114387230A (en) * 2021-12-28 2022-04-22 北京科技大学 A PCB board defect detection method based on re-verification detection
CN118171158A (en) * 2024-03-07 2024-06-11 北京中卓时代消防装备科技有限公司 Fire disaster mode identification method and system based on big data

Similar Documents

Publication Publication Date Title
CN113240013A (en) Model training method, device and equipment based on sample screening and storage medium
US10289819B2 (en) Active authentication of users
CN110909817B (en) Distributed clustering method and system, processor, electronic device and storage medium
CN107784596A (en) Insurance kind state information statistics method, terminal device and the storage medium of declaration form
JP2024536241A (en) Techniques for input classification and response using generative neural networks.
CN112949172A (en) Data processing method and device, machine readable medium and equipment
CN112508116A (en) Classifier generation method and device, storage medium and electronic equipment
CN104751350A (en) Information display method and terminal
CN111476349A (en) A model testing method and server
CN111159481A (en) Edge prediction method, device and terminal equipment for graph data
CN112801130B (en) Image clustering quality evaluation method, system, medium, and apparatus
CN112182520A (en) Illegal account identification method and device, readable medium and electronic equipment
CN109191185A (en) A kind of visitor's heap sort method and system
CN112836747A (en) Outlier processing method and device for eye movement data, computer equipment, and storage medium
CN112966756A (en) Visual access rule generation method and device, machine readable medium and equipment
CN110245016B (en) Data processing method, system, device and terminal equipment
CN115296930B (en) Periodic behavior detection method, system and terminal
CN112749003A (en) Method, apparatus and computer-readable storage medium for system optimization
CN112417197B (en) Sorting method, sorting device, machine readable medium and equipment
CN106295671B (en) Application list clustering method and device and computing equipment
AU2022202270A1 (en) Securely designing and executing an automation workflow based on validating the automation workflow
CN109922444A (en) A kind of refuse messages recognition methods and device
JP2022152911A (en) Program, information processing apparatus, and learning model generating method
CN112329715A (en) Face recognition method, device, equipment and storage medium
TWI755176B (en) Method and device for calculating cell distribution density, electronic device, and storage unit

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination