
CN111931698B - Image deep learning network construction method and device based on small training set - Google Patents


Info

Publication number
CN111931698B
CN111931698B
Authority
CN
China
Prior art keywords
picture
batch
current
value
pictures
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010937042.3A
Other languages
Chinese (zh)
Other versions
CN111931698A (en)
Inventor
张玉琪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An International Smart City Technology Co Ltd
Original Assignee
Ping An International Smart City Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An International Smart City Technology Co Ltd filed Critical Ping An International Smart City Technology Co Ltd
Priority to CN202010937042.3A priority Critical patent/CN111931698B/en
Publication of CN111931698A publication Critical patent/CN111931698A/en
Application granted granted Critical
Publication of CN111931698B publication Critical patent/CN111931698B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a method, an apparatus, computer equipment and a storage medium for constructing an image deep learning network based on a small training set, relating to artificial intelligence technology. A current small picture training set comprising actual data pictures and additional data pictures is input into an initial deep convolutional neural network in batches for training to obtain a target deep convolutional neural network, and the actual data and the additional data are processed separately in the batch normalization layers of the initial deep convolutional neural network. This solves the model-forgetting problem caused by the pre-trained-model method, reduces the amount of data required for model training, and avoids the poor model performance caused by differing data distributions. The invention also relates to blockchain technology: the model parameter set corresponding to the target deep convolutional neural network can be stored in a blockchain.

Description

Image deep learning network construction method and device based on small training set
Technical Field
The invention relates to the technical field of artificial intelligence, and in particular to a method, an apparatus, computer equipment and a storage medium for constructing an image deep learning network based on a small training set.
Background
At present, deep convolutional neural networks achieve good results on the public data sets of many tasks such as image classification, image detection and image segmentation; one important reason is that a large amount of training data is used to guarantee the training effect. Given abundant data, a deep convolutional neural network can improve model accuracy by deepening the network and increasing the number of model parameters, and can even exceed human-eye-level performance. When applied to a practical scenario, however, the cost of collecting and annotating actual data is so high that training a practically applicable network is often difficult.
The existing approach to reducing the amount of collected actual data and actual annotation is mainly model pre-training. Model pre-training refers to training a model on a relatively large public data set and then fine-tuning it on a specific small data set task. However, model pre-training suffers from network-parameter forgetting, because the data of the large data set is invisible to the network during the fine-tuning on the specific small data set.
Disclosure of Invention
The embodiments of the invention provide a method, an apparatus, computer equipment and a storage medium for constructing an image deep learning network based on a small training set, aiming to solve the problem that the prior-art approach of reducing the collected actual data and annotation amount, namely model pre-training, yields low model accuracy because network parameters are forgotten.
In a first aspect, an embodiment of the present invention provides a method for constructing an image deep learning network based on a small training set, including:
receiving a current small picture training set, calling a video memory size value, calculating to obtain a batch input picture number according to a single picture memory value and the video memory size value in the current small picture training set, and calculating to obtain a picture total batch value according to the total picture number of the training set in the current small picture training set and the batch input picture number; the current small picture training set comprises an actual data picture and an additional data picture;
acquiring current batch input pictures in the current small picture training set; the initial value of the picture batch value corresponding to the current batch input pictures is 1, and the current batch input pictures comprise actual data pictures and additional data pictures;
respectively inputting actual data pictures and additional data pictures included in the current batch of input pictures into a convolution layer of an initial deep convolution neural network for convolution to obtain a first output matrix corresponding to each actual data picture to form a first output matrix set and a second output matrix corresponding to each additional data picture to form a second output matrix set;
inputting each first output matrix in the first output matrix set to a batch normalization layer of an initial deep convolutional neural network for batch normalization processing to obtain first batch normalization processing results corresponding to the first output matrix set;
inputting each second output matrix in the second output matrix set to an additional batch normalization layer of an initial deep convolutional neural network to obtain a second batch of normalization processing results corresponding to the second output matrix set;
inputting the first batch of normalization processing results and the second batch of normalization processing results into a RELU activation function layer of an initial deep convolutional neural network for activation, and obtaining output results corresponding to the current batch of input pictures;
obtaining a predicted value corresponding to the output result according to the initial deep convolutional neural network, and obtaining a corresponding current loss function value according to the actual value and the predicted value corresponding to the current batch of input pictures;
obtaining a last loss function value corresponding to a last batch of stored input pictures, and judging whether the current loss function value is smaller than the last loss function value;
if the current loss function value is smaller than the last loss function value, acquiring a first actual number corresponding to an actual data picture and a second actual number corresponding to an additional data picture in the current batch of input pictures to be used as picture number selection parameters of the next batch of input pictures;
adding one to the picture batch value to update the picture batch value, deleting the current batch input pictures in the current small picture training set to update the current small picture training set, and judging whether the picture batch value exceeds the picture total batch value; if the picture batch value does not exceed the picture total batch value, returning to execute the step of obtaining the current batch input pictures in the current small picture training set; and
if the picture batch value exceeds the total picture batch value, acquiring a current deep convolutional neural network corresponding to the initial deep convolutional neural network to serve as a target deep convolutional neural network.
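The per-batch forward pass enumerated in the steps above (separate batch normalization for the actual-data and additional-data branches, followed by a shared ReLU layer) can be sketched as follows. This is a minimal illustration under assumed names (`dual_bn_block`, `bn_params`), not the patent's implementation: a real network would keep running statistics and learn the stretch/offset parameters during training.

```python
import numpy as np

def dual_bn_block(actual_out, extra_out, bn_params, extra_bn_params, eps=1e-5):
    """One block of the described network: the first output matrix set goes
    through the ordinary batch normalization layer, the second through a
    SEPARATE 'additional' batch normalization layer, and both results then
    pass through the ReLU activation layer. Parameter names are illustrative."""
    def bn(x, gamma, beta):
        # normalize with this branch's own mean/variance, then stretch and offset
        mean, var = x.mean(), x.var()
        return gamma * (x - mean) / np.sqrt(var + eps) + beta

    first = bn(actual_out, *bn_params)        # actual-data statistics only
    second = bn(extra_out, *extra_bn_params)  # additional-data statistics only
    relu = lambda t: np.maximum(t, 0.0)       # shared RELU activation layer
    return relu(first), relu(second)
```

Keeping two independent batch-norm layers lets each data source be normalized with its own statistics, which is the mechanism the document credits for tolerating differing data distributions between the actual and additional pictures.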
In a second aspect, an embodiment of the present invention provides an image deep learning network building apparatus based on a small training set, including:
a training set receiving unit, configured to receive a current small picture training set, call a video memory size value, calculate a number of batch input pictures according to a single picture memory value in the current small picture training set and the video memory size value, and calculate a total picture batch value according to a total number of training set pictures in the current small picture training set and the number of batch input pictures; the current small picture training set comprises an actual data picture and an additional data picture;
the current batch input picture acquisition unit is used for acquiring current batch input pictures in the current small picture training set; the initial value of the picture batch value corresponding to the current batch input pictures is 1, and the current batch input pictures comprise actual data pictures and additional data pictures;
the convolution unit is used for respectively inputting actual data pictures and additional data pictures included in the current batch of input pictures into a convolution layer of the initial deep convolution neural network for convolution to obtain a first output matrix corresponding to each actual data picture to form a first output matrix set and a second output matrix corresponding to each additional data picture to form a second output matrix set;
the batch normalization unit is used for inputting each first output matrix in the first output matrix set to a batch normalization layer of an initial deep convolutional neural network for batch normalization processing to obtain first batch normalization processing results corresponding to the first output matrix set;
the additional batch normalization unit is used for inputting each second output matrix in the second output matrix set to an additional batch normalization layer of the initial deep convolutional neural network to obtain a second batch of normalization processing results corresponding to the second output matrix set;
the activation unit is used for inputting the first batch of normalization processing results and the second batch of normalization processing results into a RELU activation function layer of an initial deep convolutional neural network for activation to obtain output results corresponding to the current batch of input pictures;
a loss function value obtaining unit, configured to obtain a predicted value corresponding to the output result according to an initial deep convolutional neural network, and obtain a corresponding current loss function value according to an actual value and a predicted value corresponding to the current batch of input pictures;
the loss function value comparison unit is used for acquiring a last loss function value corresponding to a last batch of stored input pictures and judging whether the current loss function value is smaller than the last loss function value or not;
a first picture number setting unit, configured to, if the current loss function value is smaller than the previous loss function value, obtain a first actual number corresponding to an actual data picture and a second actual number corresponding to an extra data picture in a current batch of input pictures, so as to serve as picture number selection parameters of a next batch of input pictures;
the updating unit is used for adding one to the picture batch value to update the picture batch value, deleting the current batch input pictures in the current small picture training set to update the current small picture training set, and judging whether the picture batch value exceeds the total picture batch value; if the picture batch value does not exceed the picture total batch value, returning to execute the step of obtaining the current batch input pictures in the current small picture training set; and
and the target neural network acquisition unit is used for acquiring the current deep convolutional neural network corresponding to the initial deep convolutional neural network as a target deep convolutional neural network if the picture batch value exceeds the total picture batch value.
In a third aspect, an embodiment of the present invention further provides a computer device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and when the processor executes the computer program, the processor implements the method for constructing the image deep learning network based on the mini training set according to the first aspect.
In a fourth aspect, the embodiment of the present invention further provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program, and the computer program, when executed by a processor, causes the processor to execute the method for constructing an image deep learning network based on a small training set according to the first aspect.
The embodiment of the invention provides a method, a device, computer equipment and a storage medium for constructing an image deep learning network based on a small training set, wherein the current small picture training set comprising actual data pictures and additional data pictures is input into an initial deep convolution neural network in batches for training, and the actual data and the additional data are processed separately in a batch normalization layer in the initial deep convolution neural network, so that the problem of model forgetting caused by a pre-training model method is solved, the data quantity required by model training is reduced, and the problem of poor model effect caused by different data distribution is solved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a schematic view of an application scenario of a small training set-based image deep learning network construction method according to an embodiment of the present invention;
fig. 2 is a schematic flow chart of a method for constructing an image deep learning network based on a small training set according to an embodiment of the present invention;
FIG. 3 is a schematic block diagram of an apparatus for constructing an image deep learning network based on a small training set according to an embodiment of the present invention;
FIG. 4 is a schematic block diagram of a computer device provided by an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the specification of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
Referring to fig. 1 and fig. 2, fig. 1 is a schematic view of an application scenario of a method for constructing an image deep learning network based on a small training set according to an embodiment of the present invention; fig. 2 is a schematic flowchart of a method for constructing an image deep learning network based on a small training set according to an embodiment of the present invention, where the method is applied to a server and is executed by application software installed in the server.
As shown in FIG. 2, the method includes steps S101 to S111.
S101, receiving a current small picture training set, calling a video memory size value, calculating to obtain a batch input picture number according to a single picture memory value and the video memory size value in the current small picture training set, and calculating to obtain a picture total batch value according to the total picture number of the training set in the current small picture training set and the batch input picture number; wherein the current small picture training set comprises an actual data picture and an additional data picture.
In this embodiment, in the process of training the deep convolutional neural network, it is often difficult to train a network that can be practically applied because the cost of collecting actual data and labeled data is too high. At present, the method for reducing the collection of actual data and the actual mark amount is mainly through a model pre-training mode.
Model pre-training refers to training a model on a relatively large public data set (e.g., ImageNet data set) followed by fine-tuning the training on the task of a particular small data set. However, the model pre-training has the problem of network parameter forgetting, because the data in the large data set is invisible to the network at the moment in the fine tuning training process on the specific small data set.
In the present application, after the initial deep convolutional neural network has been trained on a relatively large public data set in the server, a current small picture training set is selected to fine-tune the initial deep convolutional neural network; the current small picture training set can be uploaded by a user terminal in communication connection with the server. The current small picture training set includes a plurality of actual data pictures and a plurality of additional data pictures. An actual data picture is picture data that does not belong to the public data set and is used to fine-tune the initial deep convolutional neural network, and each actual data picture has a corresponding actual annotation. An additional data picture is one of a very small portion of picture data selected from the relatively large public data set, and likewise has a corresponding annotation; because the additional data pictures are selected from the public data set, their acquisition cost is low.
When the initial deep convolutional neural network is trained locally at the server, the current small picture training set sent by another terminal or server can be received first. After the current small picture training set is received and stored in the server, it is not input into the initial deep convolutional neural network all at once for training, but is divided into multiple batches of input pictures that are trained on in turn.
In one embodiment, step S101 includes:
obtaining the number of the batch input pictures according to the quotient of the video memory size value and the memory value of a single picture in the current small picture training set;
and obtaining a total picture batch value according to the quotient of the total number of the training set pictures corresponding to the current small picture training set and the number of the batch input pictures.
In this embodiment, in order to divide the current small picture training set into multiple batches of input pictures for training, a locally pre-stored video memory size value is called first, then the single-picture memory value corresponding to a single picture in the current small picture training set is obtained, the number of batch input pictures is obtained from the quotient of the video memory size value and the single-picture memory value, and the total picture batch value is then obtained from the quotient of the total number of pictures in the current small picture training set and the number of batch input pictures. The current small picture training set is thus divided into a plurality of batches according to the number of batch input pictures (the number of batches equals the total picture batch value), and each batch of pictures includes actual data pictures and additional data pictures.
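The two quotients described in this step can be sketched as follows. The function and parameter names (`plan_batches`, memory values in megabytes) are illustrative assumptions, and the final partial batch is rounded up so every picture belongs to some batch.

```python
import math

def plan_batches(video_memory_mb, single_picture_mb, total_pictures):
    # Number of pictures per batch = video memory size / memory of one picture
    batch_input_pictures = video_memory_mb // single_picture_mb
    # Total picture batch value = total pictures / pictures per batch,
    # rounded up so a final partial batch is still counted
    total_batch_value = math.ceil(total_pictures / batch_input_pictures)
    return batch_input_pictures, total_batch_value
```

For instance, an 8000 MB video memory and 25 MB pictures give 320 pictures per batch, and a 1000-picture training set then needs 4 batches.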
S102, obtaining current batch input pictures in the current small picture training set; the initial value of the picture batch value corresponding to the current batch input pictures is 1, and the current batch input pictures comprise actual data pictures and extra data pictures.
In this embodiment, assume that the initial current small picture training set includes N pictures, consisting of N1 actual data pictures and N2 additional data pictures, where N, N1 and N2 are all positive integers and N = N1 + N2. After the first batch of current input pictures is selected from the initial current small picture training set, steps S103-S109 are executed in the first iteration, and the next iteration begins after the judgment of step S110. The current batch input pictures are acquired in batches and input to the initial deep convolutional neural network for training, and the numbers of actual data pictures and additional data pictures included in each batch can be dynamically adjusted, thereby reducing the data cost of model training.
In one embodiment, step S102 includes:
acquiring a first total picture number corresponding to the actual data picture in the current small picture training set and a second total picture number corresponding to the additional data picture;
calculating to obtain a picture ratio according to the quotient of the first total picture number and the second total picture number;
judging whether the picture ratio is less than 0.01 or not;
if the picture ratio is smaller than 0.01, automatically setting the number of sheets corresponding to the actual data picture as p1, and setting the number of sheets corresponding to the additional data picture as q1; wherein p1 + q1 = the number of batch input pictures, and p1 is smaller than q1;
if the value range of the picture ratio is [0.01, 100], automatically setting the number of sheets corresponding to the actual data picture as p2, and setting the number of sheets corresponding to the additional data picture as q2; wherein p2 + q2 = the number of batch input pictures, and p2 is equal to q2;
if the picture ratio is larger than 100, automatically setting the number of sheets corresponding to the actual data picture as p3, and setting the number of sheets corresponding to the additional data picture as q3; wherein p3 + q3 = the number of batch input pictures, and p3 is greater than q3;
and acquiring actual data pictures and extra data pictures with corresponding numbers in the current small picture training set according to the number corresponding to the actual data pictures and the number corresponding to the extra data pictures to form a current batch input picture.
In this embodiment, when determining the number p of actual data pictures and the number q of additional data pictures included in each batch, the ratio between the first total picture number and the second total picture number is examined: if the first total picture number / the second total picture number is less than 0.01, p + q = the number of batch input pictures and p is less than q; if the first total picture number / the second total picture number is greater than 100, p + q = the number of batch input pictures and p is greater than q; and if the ratio lies in the range [0.01, 100], p + q = the number of batch input pictures and p = q. Here p and q are both positive integers.
More specifically, if the picture ratio is less than 0.01, the number of sheets corresponding to the actual data picture is automatically set as p1 and the number of sheets corresponding to the additional data picture as q1, where p1 + q1 = the number of batch input pictures and p1 is smaller than q1; if the value range of the picture ratio is [0.01, 100], the number of sheets corresponding to the actual data picture is automatically set as p2 and the number of sheets corresponding to the additional data picture as q2, where p2 + q2 = the number of batch input pictures and p2 is equal to q2; if the picture ratio is larger than 100, the number of sheets corresponding to the actual data picture is automatically set as p3 and the number of sheets corresponding to the additional data picture as q3, where p3 + q3 = the number of batch input pictures and p3 is greater than q3. Here p1, q1, p2, q2, p3 and q3 are all positive integers.
It should be noted that the current small picture training set is not divided into batches all at once according to the number of batch input pictures; instead, the corresponding number of pictures is taken out of the current small picture training set for each batch, and the taken-out batch, comprising actual data pictures and additional data pictures, satisfies p + q = the number of batch input pictures together with p < q, p > q or p = q as determined above. After each batch is taken out of the current small picture training set, that batch is removed from the current small picture training set so that its pictures cannot be repeatedly drawn into subsequent batches.
For example, for a batch including p1 actual data pictures and q1 additional data pictures and another batch including p2 actual data pictures and q2 additional data pictures, p1 is not necessarily equal to p2 and q1 is not necessarily equal to q2; that is, the actual data pictures and additional data pictures are not divided completely evenly across the batches.
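The ratio-threshold selection described above can be sketched as follows. The thresholds 0.01 and 100 come from the text; the exact unequal split (one quarter versus three quarters) is an assumption for illustration, since the text only requires p < q or p > q.

```python
def split_batch(n_actual_total, n_extra_total, batch_size):
    """Choose how many actual (p) and additional (q) pictures enter one batch,
    following the picture-ratio thresholds described in the text."""
    ratio = n_actual_total / n_extra_total
    if ratio < 0.01:
        # far fewer actual pictures available: take p < q (assumed 1/4 vs 3/4)
        p = batch_size // 4
    elif ratio > 100:
        # far more actual pictures available: take p > q
        p = batch_size - batch_size // 4
    else:
        # comparable totals: take equal shares, p == q
        p = batch_size // 2
    q = batch_size - p
    return p, q
```

With a batch size of 32, a training set of 5 actual and 1000 additional pictures (ratio 0.005) would yield p = 8 and q = 24, while the reversed totals would yield p = 24 and q = 8.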
S103, inputting actual data pictures and additional data pictures included in the current batch of input pictures into a convolution layer of the initial deep convolution neural network for convolution respectively to obtain a first output matrix corresponding to each actual data picture to form a first output matrix set, and a second output matrix corresponding to each additional data picture to form a second output matrix set.
In this embodiment, the current small picture training set may be a set of face images applied to the training of a face recognition model, in which case the corresponding initial deep convolutional neural network is the face recognition model. Of course, the current small picture training set is not limited to data sets for the face recognition scenario; it may also be a data set for another specific application scenario (e.g. OCR recognition), with the corresponding initial deep convolutional neural network being a deep convolutional neural network for the same application scenario as the current small picture training set.
At this time, in order to facilitate understanding of the complete technical scheme of the present application, the picture batch value corresponding to the current batch input pictures is illustrated as 2 (when the picture batch value is 1 there is no previous batch of input pictures).
For example, if the acquired current batch input pictures are the second batch, the first batch is recorded as the previous batch input pictures, and by analogy the third batch, which follows the current batch input pictures, is recorded as the next batch input pictures.
At this time, because the current batch of input pictures includes p actual data pictures and q additional data pictures, and each of the p actual data pictures and each of the q additional data pictures corresponds to one pixel matrix, the pixel matrices corresponding to the p actual data pictures are input to the convolution layer of the initial deep convolutional neural network for convolution, and a first output matrix corresponding one to one to each of the p actual data pictures is obtained to form a first output matrix set.
Similarly, the pixel matrices corresponding to the q additional data pictures are input to the convolution layer of the initial deep convolutional neural network for convolution, and a second output matrix corresponding to each additional data picture in the q additional data pictures one to one is obtained to form a second output matrix set.
And S104, inputting each first output matrix in the first output matrix set to a batch normalization layer of an initial deep convolutional neural network for batch normalization to obtain a first batch of normalization processing results corresponding to the first output matrix set.
In this embodiment, in order to accelerate the convergence rate of the initial deep convolutional neural network and to alleviate the problem of "gradient dispersion" in the deep network to a certain extent, a batch of first data matrices composed of each first output matrix in the first output matrix set may be normalized. When the first output matrix set is subjected to batch normalization processing, the specific process is as follows:
batch normalization occurs after the convolution calculation and before the activation function is applied. If the convolution calculation outputs a plurality of channels, the output of each channel needs to be batch-normalized separately, and each channel has its own independent scale and shift parameters, each of which is a scalar. Suppose there are m samples in the mini-batch and, on a single channel, the height and width of the convolution output are h and w, respectively. Then the m × h × w elements in the channel need to be batch-normalized together, and the same mean and variance, namely the mean and variance of the m × h × w elements in the channel, are used in the normalization calculation for all of these elements.
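By way of illustration only (and not as a limitation of the claimed method), the per-channel statistics described above can be sketched in Python as follows; the list-of-lists data layout is an assumption made for the example:

```python
def channel_batch_norm(channel, eps=1e-5):
    """Batch-normalize one channel of a convolution output.

    `channel` is a list of m samples, each an h x w grid (list of lists),
    so the mean and variance are taken over all m * h * w elements,
    as described above.
    """
    elems = [v for sample in channel for row in sample for v in row]
    n = len(elems)                      # n = m * h * w
    mean = sum(elems) / n
    var = sum((v - mean) ** 2 for v in elems) / n
    # Normalize every element with the shared per-channel statistics.
    return [[[(v - mean) / (var + eps) ** 0.5 for v in row]
             for row in sample] for sample in channel]

# Two samples (m=2) of a 2x2 channel (h=w=2): all 2*2*2 = 8 elements
# share a single mean and variance.
out = channel_batch_norm([[[1.0, 2.0], [3.0, 4.0]],
                          [[5.0, 6.0], [7.0, 8.0]]])
```

Note that the mean and variance are shared by all m × h × w elements of the channel, matching the description above.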
And S105, inputting each second output matrix in the second output matrix set to an additional batch normalization layer of the initial deep convolutional neural network to obtain a second batch of normalization processing results corresponding to the second output matrix set.
In this embodiment, step S105 obtains, with reference to the operation method in step S104, a second batch of normalization processing results corresponding to the second output matrix set. When the p actual data pictures are batch-normalized through the batch normalization layer to obtain the corresponding first batch of normalization processing results, and the q additional data pictures are batch-normalized through the additional batch normalization layer to obtain the second batch of normalization processing results, the batch normalization layer and the additional batch normalization layer are considered separately, so that the confusion caused by the inconsistent distributions of the actual picture data and the additional picture data can be well isolated. Because the batch normalization layer is the layer most sensitive to data distribution, processing the actual data and the additional data separately at this layer solves the problem of poor model performance caused by different data distributions.
In one embodiment, step S105 includes:
acquiring a second output matrix mean value corresponding to the second output matrix set;
acquiring a second output matrix variance corresponding to the second output matrix set;
normalizing the second output matrix set according to the second output matrix mean and the second output matrix variance to obtain a corresponding normalized second output matrix set;
and correspondingly carrying out scale change according to the normal distribution data included in the normalized second output matrix set and the pre-stored scale adjustment values and deviation values to obtain a second batch of normalization processing results corresponding to the second output matrix set.
In this embodiment, when the batch normalization processing is performed on the second output matrix set, the mean and the variance of the second output matrix set are calculated, the mean of the second output matrix set is subtracted from each element of the second output matrix set, and the result is divided by the standard deviation (the square root of the variance of the second output matrix set), so that the second output matrix set is changed into a standard normal distribution (i.e., the second output matrix set is normalized). Then, a scale change operation (namely a scale operation) and a shift operation are performed on each piece of normally distributed data included in the normalized second output matrix set, that is, y = scale × x + shift; in other words, a mean shift and a variance transformation are applied to each piece of normally distributed data so that the data moves a certain range from the linear region toward the nonlinear region. In this way, a balance point is found between a larger gradient and the nonlinear transformation: the larger gradient is maintained and the training speed is increased, while the characterization capability of the nonlinear transformation is not lost. The two parameters in the formula, the scale value and the shift value, need to be learned by the neural network itself during training.
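The four sub-steps above (mean, variance, normalization, and scale change) can be sketched as follows; the fixed scale and shift values are placeholders for illustration, since in training they are parameters learned by the network itself:

```python
def batch_norm_with_scale_shift(matrices, scale, shift, eps=1e-5):
    """Sketch of the four sub-steps above: mean, variance, normalization,
    then the scale change y = scale * x + shift.  In training, `scale`
    and `shift` are learned by the network; fixed values are used here
    only for illustration."""
    elems = [v for m in matrices for row in m for v in row]
    mean = sum(elems) / len(elems)                           # second output matrix mean
    var = sum((v - mean) ** 2 for v in elems) / len(elems)   # second output matrix variance
    std = (var + eps) ** 0.5
    return [[[scale * ((v - mean) / std) + shift for v in row]
             for row in m] for m in matrices]

result = batch_norm_with_scale_shift([[[0.0, 2.0]], [[4.0, 6.0]]],
                                     scale=2.0, shift=1.0)
```

After the scale change, the data is centered near the shift value with its spread stretched by the scale value, rather than forced to a standard normal distribution.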
And S106, inputting the first batch of normalization processing results and the second batch of normalization processing results into a RELU activation function layer of an initial deep convolutional neural network for activation, and obtaining output results corresponding to the current batch of input pictures.
In this embodiment, the first batch of normalization processing results and the second batch of normalization processing results are input to a RELU activation function layer of an initial deep convolutional neural network for activation, so as to obtain an output result for inputting to a pooling layer.
The expression of the RELU activation function (i.e., linear rectification activation function) is:
f(x)=max(0,x);
namely, when x is larger than 0, the value of x is kept as the output of the ReLU activation function, and when x is smaller than or equal to 0, the output of the ReLU activation function is 0. The advantages of the ReLU activation function are that: when a Gradient Descent (GD) method is used, the convergence rate is higher; and only one threshold value is needed to obtain the activation value, so the calculation speed is higher.
In one embodiment, step S106 includes:
and after the first batch of normalization processing results and the second batch of normalization processing results are merged to obtain a merged set, inputting the merged set into a RELU activation function layer of an initial deep convolutional neural network for activation, and obtaining an output result corresponding to the current batch of input pictures.
In this embodiment, when the first batch of normalization processing results and the second batch of normalization processing results are input to the RELU activation function layer of the initial deep convolutional neural network for activation, specifically, a union operation is first performed on the two sets to obtain a merged set, and the merged set is then input to the RELU activation function layer of the initial deep convolutional neural network for activation, so as to obtain the output result corresponding to the current batch of input pictures.
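A minimal sketch of this merging and activation, assuming each batch normalization result is represented as a list of matrices:

```python
def relu(x):
    # Linear rectification: f(x) = max(0, x)
    return max(0.0, x)

def merge_and_activate(first_results, second_results):
    """Sketch of step S106: form the union (here, a concatenation) of
    the first and second batch normalization results, then apply the
    ReLU activation element-wise to the merged set."""
    merged = first_results + second_results
    return [[[relu(v) for v in row] for row in m] for m in merged]

out = merge_and_activate([[[-1.0, 0.5]]], [[[2.0, -3.0]]])
```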
And S107, obtaining a predicted value corresponding to the output result according to the initial depth convolution neural network, and obtaining a corresponding current loss function value according to the actual value and the predicted value corresponding to the current batch input picture.
In this embodiment, after the current batch of input pictures sequentially passes through the convolutional layer, the batch normalization layers and the activation function to obtain an output result, the output result continues to be processed in the initial deep convolutional neural network, specifically by sequentially passing through a pooling layer, a fully connected layer and a softmax layer, and a predicted value corresponding to the output result is obtained. Because the actual data pictures and the additional data pictures included in the current batch of input pictures all carry labeled values (namely actual values), the cross entropy loss function can at this time be calculated jointly over the p actual data pictures and the q additional data pictures of each batch to obtain the current loss function value. Calculating the cross entropy loss function is prior art and is not discussed here.
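A minimal sketch of the joint cross entropy calculation over the p actual data pictures and q additional data pictures of a batch, assuming each picture's softmax output and labeled class index are available:

```python
import math

def cross_entropy_loss(predicted_probs, actual_labels):
    """Sketch of the current loss function value: the average cross
    entropy over all p actual data pictures and q additional data
    pictures of the batch, given each picture's predicted class
    probabilities and its labeled (actual) class index."""
    losses = [-math.log(probs[label])
              for probs, label in zip(predicted_probs, actual_labels)]
    return sum(losses) / len(losses)

# p + q = 3 pictures in the batch, 2 classes.
loss = cross_entropy_loss([[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]], [0, 1, 0])
```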
And S108, acquiring a last loss function value corresponding to the stored last batch of input pictures, and judging whether the current loss function value is smaller than the last loss function value.
In this embodiment, after the current loss function value is obtained, the previous loss function value corresponding to the previous batch of input pictures needs to be obtained to judge whether the current loss function value is smaller than it. If the current loss function value is smaller than the previous loss function value, this indicates that the p value and q value corresponding to the current batch can be used to take out the next batch of batch-processed pictures; and if the current loss function value is larger than or equal to the previous loss function value, this indicates that the p value and q value corresponding to the previous batch are to be used to take out the next batch of batch-processed pictures.
And S109, if the current loss function value is smaller than the last loss function value, acquiring a first actual number corresponding to the actual data picture and a second actual number corresponding to the extra data picture in the current batch of input pictures to be used as picture number selection parameters of the next batch of input pictures.
In this embodiment, when the p value and the q value corresponding to the current batch can be used to take out a next batch of batch-processed pictures, a first actual number corresponding to an actual data picture and a second actual number corresponding to an additional data picture in the current batch of input pictures are directly obtained to be used as picture number selection parameters of the next batch of input pictures.
In an embodiment, step S108 is followed by:
and if the current loss function value is larger than or equal to the last loss function value, acquiring a third actual number corresponding to the actual data picture in the last batch of input pictures and a fourth actual number corresponding to the extra data picture to be used as picture number selection parameters of the next batch of input pictures.
In this embodiment, when the p value and the q value corresponding to the previous batch are used to take out the batch processed pictures of the next batch, the third actual number corresponding to the actual data picture in the previous batch of input pictures and the fourth actual number corresponding to the extra data picture are obtained to be used as the picture number selection parameter of the next batch of input pictures.
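The selection of the picture number parameters in steps S108-S109 and the alternative branch above can be sketched as follows (the function name and argument layout are illustrative assumptions):

```python
def next_batch_counts(current_loss, last_loss,
                      current_p, current_q, last_p, last_q):
    """Sketch of steps S108-S109: if the current loss function value is
    smaller than the last one, keep the current batch's actual/extra
    picture counts for the next batch; otherwise fall back to the last
    batch's counts."""
    if current_loss < last_loss:
        return current_p, current_q   # first and second actual numbers
    return last_p, last_q             # third and fourth actual numbers

# Loss decreased: keep the current batch's split for the next batch.
kept = next_batch_counts(0.4, 0.6, 30, 34, 28, 36)
# Loss did not decrease: revert to the previous batch's split.
reverted = next_batch_counts(0.7, 0.6, 30, 34, 28, 36)
```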
S110, adding one to the picture batch value to update the picture batch value, deleting the current batch input pictures in the current small picture training set to update the current small picture training set, and judging whether the picture batch value exceeds the total picture batch value; if the picture batch value does not exceed the picture total batch value, returning to execute the step S102; if the picture batch value exceeds the total picture batch value, step S111 is executed.
In this embodiment, after this round of iteration is completed, the picture batch value needs to be updated by adding one to it, so as to obtain the next batch of current input pictures. When the current round of iteration is finished, the current batch of input pictures selected during this round is also deleted from the current small picture training set, and the remaining pictures form the latest current small picture training set.
When one is added to the picture batch value to update it and the current batch of input pictures is deleted from the current small picture training set to update the training set, whether the picture batch value exceeds the total picture batch value serves as the basis for judging whether the iteration process continues: if the picture batch value does not exceed the total picture batch value, the iteration process continues; if the picture batch value exceeds the total picture batch value, the iteration process stops. In the process of iteratively inputting each current batch of input pictures to the initial deep convolutional neural network, each model parameter of the initial deep convolutional neural network is continuously adjusted, so that a finally trained deep convolutional neural network that can be practically applied is obtained.
And S111, acquiring a current deep convolutional neural network corresponding to the initial deep convolutional neural network to serve as a target deep convolutional neural network.
In this embodiment, after a plurality of batches of small pictures in the current small picture training set are all input to the initial deep convolutional neural network for training, the target convolutional neural network can be finally obtained. Through the current small-sized picture training set, the requirement that a large amount of data needs to be collected and marked in the process of actually applying the deep convolutional network can be reduced, and therefore the development cost of image-related artificial intelligence products is reduced.
In an embodiment, step S111 is followed by:
and obtaining a model parameter set corresponding to the target deep convolutional neural network and uploading the model parameter set to a block chain network.
In this embodiment, the server may serve as a blockchain node device to upload the model parameter set corresponding to the target deep convolutional neural network to the blockchain network, making full use of the characteristic that blockchain data cannot be tampered with to implement solidified storage of the data. Moreover, the server can download the model parameter set corresponding to the target deep convolutional neural network from the blockchain to locally generate the deep convolutional neural network.
The corresponding digest information is obtained based on the model parameter set; specifically, the digest information is obtained by hashing the model parameter set, for example by using the sha256 algorithm. Uploading the digest information to the blockchain can ensure its security, fairness and transparency for the user. The server may download the digest information from the blockchain to verify whether the model parameter set has been tampered with. The blockchain referred to in this example is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks associated by cryptographic methods, each data block containing information on a batch of network transactions, used to verify the validity (anti-counterfeiting) of the information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
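A minimal sketch of computing the digest information with sha256, assuming (for illustration only) that the model parameter set can be serialized as JSON:

```python
import hashlib
import json

def model_digest(model_parameters):
    """Sketch of computing the digest information described above:
    serialize the model parameter set deterministically, then hash it
    with SHA-256.  The JSON serialization is an assumption made for the
    example; the description only specifies that a hash such as sha256
    is applied to the model parameter set."""
    payload = json.dumps(model_parameters, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

# Hypothetical parameter names, for illustration only.
digest = model_digest({"conv1.weight": [0.12, -0.07], "bn1.scale": [1.0]})
```

Because the serialization sorts keys, the same parameter set always yields the same digest, which is what makes the later tamper check against the blockchain copy possible.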
According to the method, an additional batch normalization layer is added to the deep convolutional network, so that the network can put actual data and additional data into training simultaneously, which solves the problem of model forgetting caused by the pre-training-model method. Meanwhile, because the batch normalization layer is the layer most sensitive to data distribution, processing the actual data and the additional data separately at this layer solves the problem of poor model performance caused by different data distributions.
The embodiment of the invention also provides an image deep learning network construction device based on the small training set, which is used for executing any embodiment of the image deep learning network construction method based on the small training set. Specifically, referring to fig. 3, fig. 3 is a schematic block diagram of an image deep learning network constructing apparatus based on a small training set according to an embodiment of the present invention. The image deep learning network construction apparatus 100 based on the mini training set may be configured in a server.
As shown in fig. 3, the apparatus 100 for constructing an image deep learning network based on a small training set includes: a training set receiving unit 101, a current batch input picture acquiring unit 102, a convolution unit 103, a batch normalization unit 104, an additional batch normalization unit 105, an activation unit 106, a loss function value acquiring unit 107, a loss function value comparing unit 108, a first picture number setting unit 109, an updating unit 110, and a target neural network acquiring unit 111.
A training set receiving unit 101, configured to receive a current small picture training set, call a video memory size value, calculate a number of batch input pictures according to a single picture memory value and the video memory size value in the current small picture training set, and calculate a total picture batch value according to a total number of training set pictures in the current small picture training set and the number of batch input pictures; wherein the current thumbnail picture training set comprises an actual data picture and an additional data picture.
In the present application, after the initial deep convolutional neural network has been trained in the server on a relatively large public data set, a current small picture training set is selected to perform fine-tuning training on the initial deep convolutional neural network; the current small picture training set may be uploaded by a user side in communication connection with the server. At this time, the current small picture training set includes a plurality of actual data pictures and a plurality of additional data pictures. An actual data picture is picture data that does not belong to the public data set and is used for fine-tuning training of the initial deep convolutional neural network, and each actual data picture corresponds to an actual labeled value. An additional data picture is one of a very small portion of picture data selected from the relatively large public data set; each additional data picture likewise corresponds to a labeled value, and because the additional data pictures are selected from the relatively large public data set, their acquisition cost is low.
When the initial deep convolutional neural network is trained locally at the server, the current mini-picture training set sent by other terminals or the server can be received first. After the current small picture training set is received and stored in the server, the current small picture training set is not input into the initial deep convolutional neural network for training all at one time, but is divided into a plurality of batches of input pictures for training.
In one embodiment, the training set receiving unit 101 includes:
the batch input picture number acquiring unit is used for acquiring a batch input picture number according to the quotient of the video memory size value and the memory value of a single picture in the current small picture training set;
and the picture total batch value acquisition unit is used for acquiring a picture total batch value according to the quotient of the training set picture total number corresponding to the current small picture training set and the batch input picture number.
In this embodiment, in order to divide the current small picture training set into multiple batches of batch input pictures for separate training, a locally pre-stored video memory size value may first be called, a single-picture memory value corresponding to a single picture in the current small picture training set is then obtained, the number of batch input pictures is obtained from the quotient of the video memory size value and the single-picture memory value, and the total picture batch value is then obtained from the quotient of the total number of pictures in the training set corresponding to the current small picture training set and the number of batch input pictures. After the current small picture training set is divided into a plurality of batches according to the number of batch input pictures (the sum of the batches is equal to the total picture batch value), each batch of pictures includes actual data pictures and additional data pictures.
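The two quotients described above can be sketched as follows; whether the quotients are rounded down or up is not specified in the description, so the rounding used here is an assumption:

```python
def batch_sizes(video_memory_mb, single_picture_mb, training_set_total):
    """Sketch of the two quotients described above: the number of batch
    input pictures from video memory / per-picture memory, and the
    total picture batch value from training-set size / batch size.
    Floor division for the first quotient and ceiling for the second
    (so no picture is dropped) are assumptions for illustration."""
    batch_input_pictures = video_memory_mb // single_picture_mb
    total_batches = -(-training_set_total // batch_input_pictures)  # ceiling division
    return batch_input_pictures, total_batches

n_batch, n_total = batch_sizes(video_memory_mb=8192,
                               single_picture_mb=64,
                               training_set_total=1000)
```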
A current batch input picture acquiring unit 102, configured to acquire a current batch input picture in the current small picture training set; the initial value of the picture batch value corresponding to the current batch input pictures is 1, and the current batch input pictures comprise actual data pictures and extra data pictures.
In this embodiment, the initial current small picture training set is assumed to include N pictures, comprising N1 actual data pictures and N2 additional data pictures, where N, N1 and N2 are all positive integers and N = N1 + N2. After the first batch of current input pictures is selected from the initial current small picture training set, the processing corresponding to steps S103-S109 is executed in the first iteration, and the next iteration is then continued according to the judgment of step S110. The current batch input pictures are acquired in batches and input to the initial deep convolutional neural network for training, and the numbers of actual data pictures and additional data pictures respectively included in each batch of input pictures can be dynamically adjusted, thereby reducing the data cost of model training.
In an embodiment, the current batch input picture acquiring unit 102 includes:
a picture number acquiring unit, configured to acquire a first total picture number corresponding to the actual data picture in the current small-size picture training set and a second total picture number corresponding to the additional data picture;
the picture ratio calculating unit is used for calculating to obtain a picture ratio according to the quotient of the first total picture number and the second total picture number;
the picture ratio judging unit is used for judging whether the picture ratio is less than 0.01 or not;
the first setting unit is used for automatically setting the number of sheets corresponding to the actual data picture as p1 and setting the number of sheets corresponding to the extra data picture as q1 if the picture ratio is less than 0.01; wherein p1 + q1 = the number of batch input pictures, and p1 is smaller than q1;
the second setting unit is used for automatically setting the number of sheets corresponding to the actual data picture as p2 and the number of sheets corresponding to the extra data picture as q2 if the value range of the picture ratio is [0.01, 100]; wherein p2 + q2 = the number of batch input pictures, and p2 is equal to q2;
a third setting unit, configured to automatically set the number of sheets corresponding to the actual data picture as p3 and set the number of sheets corresponding to the additional data picture as q3 if the picture ratio is greater than 100; wherein p3 + q3 = the number of batch input pictures, and p3 is greater than q3;
and the current batch picture screening and acquiring unit is used for acquiring actual data pictures and additional data pictures with corresponding numbers in the current small picture training set according to the number corresponding to the actual data pictures and the number corresponding to the additional data pictures to form a current batch input picture.
In this embodiment, when the number p of actual data pictures and the number q of additional data pictures included in each batch of pictures are determined, the ratio between the first total picture number and the second total picture number is judged: if the first total picture number / the second total picture number is less than 0.01, p + q = the number of batch input pictures and p less than q are set; if the first total picture number / the second total picture number is greater than 100, p + q = the number of batch input pictures and p greater than q are set; and if the first total picture number / the second total picture number is not less than 0.01 and not greater than 100, p + q = the number of batch input pictures and p = q may be set.
More specifically, if the picture ratio is less than 0.01, the number of sheets corresponding to the actual data picture is automatically set as p1 and the number of sheets corresponding to the additional data picture as q1, wherein p1 + q1 = the number of batch input pictures, and p1 is smaller than q1; if the value range of the picture ratio is [0.01, 100], the number of sheets corresponding to the actual data picture is automatically set as p2 and the number of sheets corresponding to the additional data picture as q2, wherein p2 + q2 = the number of batch input pictures, and p2 is equal to q2; and if the picture ratio is greater than 100, the number of sheets corresponding to the actual data picture is automatically set as p3 and the number of sheets corresponding to the additional data picture as q3, wherein p3 + q3 = the number of batch input pictures, and p3 is greater than q3.
It should be noted that, when the current small picture training set is divided into a plurality of batches according to the number of batch input pictures, the division is not completed all at once; instead, pictures of the corresponding numbers are taken out of the current small picture training set batch by batch, and each taken-out batch of pictures, including actual data pictures and additional data pictures, satisfies p + q = the number of batch input pictures with p < q, or p + q = the number of batch input pictures with p > q, or p + q = the number of batch input pictures with p = q. After each batch of pictures is taken out of the current small picture training set, that batch is removed from the current small picture training set, so that the same pictures are not repeatedly extracted into subsequent batches.
For example, comparing a batch of pictures including p1 actual data pictures and q1 additional data pictures with a batch of pictures including p2 actual data pictures and q2 additional data pictures, p1 is not necessarily equal to p2, and q1 is not necessarily equal to q2; that is, the actual data pictures and the additional data pictures are not necessarily divided in equal proportions within the number of batch input pictures.
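The picture-ratio rule described above can be sketched as follows; the concrete p values chosen inside each branch are illustrative assumptions, since the description only constrains whether p is smaller than, equal to, or greater than q:

```python
def split_batch(first_total, second_total, batch_input_pictures):
    """Sketch of the picture-ratio rule above.  The exact p/q values
    within each branch are not fixed by the description (only p < q,
    p = q, or p > q is required), so an illustrative split is used."""
    ratio = first_total / second_total
    if ratio < 0.01:          # far fewer actual pictures: p < q
        p = batch_input_pictures // 4
    elif ratio > 100:         # far more actual pictures: p > q
        p = batch_input_pictures - batch_input_pictures // 4
    else:                     # ratio in [0.01, 100]: p = q
        p = batch_input_pictures // 2
    return p, batch_input_pictures - p

# Ratio 5/1000 = 0.005 < 0.01, so p is set smaller than q.
p, q = split_batch(5, 1000, 64)
```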
And the convolution unit 103 is configured to input actual data pictures and additional data pictures included in the current batch of input pictures to the convolution layer of the initial deep convolution neural network for convolution, so as to obtain a first output matrix corresponding to each actual data picture to form a first output matrix set, and obtain a second output matrix corresponding to each additional data picture to form a second output matrix set.
In this embodiment, the current small picture training set may be a face head image set applied to face recognition model training, and then the corresponding initial deep convolutional neural network is the face recognition model. Of course, the current thumbnail training set is not limited to the data set of the scene of face recognition, but may also be a data set of other specific application scenes (e.g. OCR recognition), and the corresponding initial deep convolutional neural network is a deep convolutional neural network of the same application scene as the current thumbnail training set.
At this time, in order to facilitate understanding of the complete technical scheme of the present application, the picture batch value corresponding to the current batch input picture is illustrated as 2 (because there is no previous batch input picture when the picture batch value is 1).
For example, if the obtained current batch of input pictures is the second batch of input pictures, the first batch of input pictures preceding it is recorded as the previous batch of input pictures, and, by analogy, the third batch of input pictures following the current batch of input pictures is recorded as the next batch of input pictures.
At this time, because the current batch of input pictures includes p actual data pictures and q additional data pictures, and each of the p actual data pictures and each of the q additional data pictures corresponds to one pixel matrix, the pixel matrices corresponding to the p actual data pictures are input to the convolution layer of the initial deep convolutional neural network for convolution, and a first output matrix corresponding one to one to each of the p actual data pictures is obtained to form a first output matrix set.
Similarly, the pixel matrices corresponding to the q additional data pictures are input to the convolution layer of the initial deep convolutional neural network for convolution, and a second output matrix corresponding to each additional data picture in the q additional data pictures one to one is obtained to form a second output matrix set.
And the batch normalization unit 104 is configured to input each first output matrix in the first output matrix set to a batch normalization layer of the initial deep convolutional neural network for batch normalization processing, so as to obtain a first batch of normalization processing results corresponding to the first output matrix set.
In this embodiment, in order to accelerate the convergence rate of the initial deep convolutional neural network and to alleviate the problem of "gradient dispersion" in the deep network to a certain extent, a batch of first data matrices composed of each first output matrix in the first output matrix set may be normalized. When the first output matrix set is subjected to batch normalization processing, the specific process is as follows:
batch normalization occurs after the convolution calculation and before the activation function is applied. If the convolution calculation outputs a plurality of channels, the output of each channel needs to be batch-normalized separately, and each channel has its own independent scale and shift parameters, each of which is a scalar. Suppose there are m samples in the mini-batch and, on a single channel, the height and width of the convolution output are h and w, respectively. Then the m × h × w elements in the channel need to be batch-normalized together, and the same mean and variance, namely the mean and variance of the m × h × w elements in the channel, are used in the normalization calculation for all of these elements.
And an additional batch normalization unit 105, configured to input each second output matrix in the second output matrix set to an additional batch normalization layer of the initial deep convolutional neural network, so as to obtain a second batch of normalization processing results corresponding to the second output matrix set.
In this embodiment, the additional batch normalization unit 105 follows the operation of the batch normalization unit 104 to obtain the second batch normalization processing result corresponding to the second output matrix set. Because the p actual data pictures are normalized by the batch normalization layer to obtain the first batch of normalization processing results, while the q additional data pictures are normalized by the separate additional batch normalization layer to obtain the second batch of normalization processing results, the two layers are kept apart; this isolates the confusion that would otherwise be caused by the inconsistent distributions of the actual picture data and the additional picture data. Since the batch normalization layer is the layer most sensitive to data distribution, processing the actual data and the additional data separately at this layer avoids the degraded model performance that different data distributions would cause.
In one embodiment, the additional batch normalization unit 105 includes:
the mean value acquiring unit is used for acquiring a second output matrix mean value corresponding to the second output matrix set;
the variance acquiring unit is used for acquiring a second output matrix variance corresponding to the second output matrix set;
the normalization output unit is used for normalizing the second output matrix set according to the second output matrix mean value and the second output matrix variance to obtain a corresponding normalized second output matrix set;
and the scale adjusting and offsetting unit is used for correspondingly carrying out scale change according to the normal distribution data included in the normalized second output matrix set and the pre-stored scale adjusting numerical values and offsetting values to obtain a second batch of normalized processing results corresponding to the second output matrix set.
In this embodiment, when batch normalization is performed on the second output matrix set, the mean and variance of the set are first calculated; the mean is subtracted from each element and the result is divided by the standard deviation (the square root of the variance of the second output matrix set), turning the set into a standard normal distribution (i.e., normalizing the second output matrix set). A scale operation and a shift operation, i.e., y = scale * x + shift, are then applied to each normally distributed value in the normalized second output matrix set: a mean shift and a variance transformation that move the data a certain distance from the linear region toward the nonlinear region. This finds a balance point between keeping a large gradient and retaining the nonlinear transformation, so training speed is preserved by the large gradient while representational capacity is not reduced to a purely linear transformation. The scale and shift values in the formula are parameters the neural network learns itself during training.
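Under the assumption of a single channel and scalar scale and shift values, the four sub-units above (mean, variance, normalization, scale adjusting and offsetting) can be sketched as:

```python
import numpy as np

def additional_batch_norm(mats, scale, shift, eps=1e-5):
    # mats: the second output matrix set, shape (q, h, w), single channel.
    # scale, shift: the adjustment values (learned during training).
    mean = mats.mean()                               # mean acquiring unit
    var = mats.var()                                 # variance acquiring unit
    normalized = (mats - mean) / np.sqrt(var + eps)  # normalization output unit
    return scale * normalized + shift                # y = scale * x + shift
```

With scale = 2 and shift = 1, for example, the output distribution has mean 1 and standard deviation approximately 2, illustrating how the affine step moves the normalized data away from the standard normal.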
And an activating unit 106, configured to input the first batch of normalization processing results and the second batch of normalization processing results to a RELU activation function layer of an initial deep convolutional neural network for activation, so as to obtain an output result corresponding to the current batch of input pictures.
In this embodiment, the first batch of normalization processing results and the second batch of normalization processing results are input to the RELU activation function layer of the initial deep convolutional neural network for activation, so as to obtain an output result that is then input to the pooling layer.
The expression of the RELU activation function (i.e., linear rectification activation function) is:
f(x)=max(0,x);
namely, the ReLU activation function keeps the value of x when x is greater than 0 and outputs 0 when x is less than or equal to 0. Its advantages are: when gradient descent (GD) is used, convergence is faster; and only a single threshold comparison is needed to obtain the activation value, so computation is faster.
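The expression above amounts to an elementwise maximum, sketched in one line:

```python
import numpy as np

def relu(x):
    # Keep x when x > 0; output 0 when x <= 0. A single threshold
    # comparison is all that is needed to obtain the activation value.
    return np.maximum(0.0, x)
```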
In an embodiment, the activation unit 106 is further configured to:
and after the first batch of normalization processing results and the second batch of normalization processing results are merged to obtain a merged set, inputting the merged set into a RELU activation function layer of an initial deep convolutional neural network for activation, and obtaining an output result corresponding to the current batch of input pictures.
In this embodiment, when the first batch of normalization processing results and the second batch of normalization processing results are input to the RELU activation function layer of the initial deep convolutional neural network for activation, the two sets are first merged by a union operation; the merged set is then input to the RELU activation function layer for activation, obtaining the output result corresponding to the current batch of input pictures.
And a loss function value obtaining unit 107, configured to obtain a predicted value corresponding to the output result according to the initial deep convolutional neural network, and obtain a corresponding current loss function value according to the actual value and the predicted value corresponding to the current batch of input pictures.
In this embodiment, after the current batch of input pictures passes through the convolutional layer, the batch normalization layers and the activation function to obtain an output result, that result continues through the initial deep convolutional neural network, specifically through the pooling layer, the fully connected layer and the softmax layer, to obtain the predicted value corresponding to the output result. Because both the actual picture data and the additional picture data in the current batch of input pictures carry labeled values (i.e., actual values), the cross-entropy loss function can be computed jointly over the p actual data pictures and the q additional data pictures of each batch, yielding the current loss function value. Computing the cross-entropy loss function is prior art and is not discussed here.
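As an illustrative sketch (the function name and batch layout are assumptions; the document only specifies that the cross-entropy loss is computed jointly over the p + q labeled pictures), the joint loss could look like:

```python
import numpy as np

def cross_entropy_loss(probs, labels):
    # probs: softmax outputs, shape (p + q, n_classes), covering both
    # the p actual data pictures and the q additional data pictures.
    # labels: the annotated class indices (the actual values).
    eps = 1e-12  # guards against log(0)
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked + eps)))
```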
A loss function value comparing unit 108, configured to obtain a last loss function value corresponding to a last batch of stored input pictures, and determine whether the current loss function value is smaller than the last loss function value.
In this embodiment, after the current loss function value is obtained, the previous loss function value corresponding to the previous batch of input pictures is retrieved to determine whether the current loss function value is smaller. If it is, the p value and q value corresponding to the current batch can be used to select the next batch of pictures; if the current loss function value is greater than or equal to the previous loss function value, the p value and q value corresponding to the previous batch are used to select the next batch of pictures.
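The comparison rule above can be sketched as follows (the function and argument names are illustrative):

```python
def next_batch_counts(current_loss, last_loss, current_pq, last_pq):
    # Loss dropped: keep the current batch's (p, q) split for the
    # next batch. Otherwise: fall back to the previous batch's split.
    if current_loss < last_loss:
        return current_pq
    return last_pq
```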
The first picture number setting unit 109 is configured to, if the current loss function value is smaller than the last loss function value, obtain a first actual number corresponding to an actual data picture in the current batch of input pictures and a second actual number corresponding to an additional data picture, so as to serve as a picture number selection parameter of a next batch of input pictures.
In this embodiment, when the p value and q value corresponding to the current batch are to be used for selecting the next batch of pictures, the first actual number of actual data pictures and the second actual number of additional data pictures in the current batch of input pictures are obtained directly and used as the picture number selection parameters for the next batch of input pictures.
In an embodiment, the apparatus 100 for constructing an image deep learning network based on a small training set further includes:
and the second picture number setting unit is used for acquiring a third actual number corresponding to the actual data picture in the previous batch of input pictures and a fourth actual number corresponding to the additional data picture if the current loss function value is greater than or equal to the previous loss function value, and taking the third actual number and the fourth actual number as picture number selection parameters of the next batch of input pictures.
In this embodiment, when the p value and q value corresponding to the previous batch are to be used for selecting the next batch of pictures, the third actual number of actual data pictures and the fourth actual number of additional data pictures in the previous batch of input pictures are obtained and used as the picture number selection parameters for the next batch of input pictures.
An updating unit 110, configured to add one to the picture batch value to update the picture batch value, delete the current batch input pictures in the current mini-picture training set to update the current mini-picture training set, and determine whether the picture batch value exceeds the total picture batch value; and if the picture batch value does not exceed the picture total batch value, returning to execute the step of obtaining the current batch input pictures in the current small picture training set.
In this embodiment, after this round of iteration is completed, the picture batch value is updated by adding one, so that the next current batch of input pictures can be obtained. When the round ends, the current batch of input pictures selected during this round is deleted from the current small picture training set, and the remaining pictures form the latest current small picture training set.
After the picture batch value has been incremented and the current batch of input pictures has been deleted from the current small picture training set, whether the iteration continues is decided by judging whether the picture batch value exceeds the picture total batch value: if it does not, the iteration continues; if it does, the iteration stops. During the iterative input of the current batches of input pictures into the initial deep convolutional neural network, the model parameters of the network are continuously adjusted, finally yielding a trained deep convolutional neural network ready for practical application.
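The counter update, batch deletion, and termination test described above can be sketched as follows (the helper names and list-based training set are assumptions for illustration):

```python
def run_batches(training_set, total_batch_value, batch_size, train_step):
    # Run one training step per batch, delete the consumed pictures
    # from the training set, add one to the picture batch value, and
    # stop once it exceeds the picture total batch value (or the
    # training set is exhausted).
    picture_batch_value = 1
    while picture_batch_value <= total_batch_value and training_set:
        current_batch = training_set[:batch_size]
        train_step(current_batch)
        del training_set[:batch_size]   # update the training set
        picture_batch_value += 1        # update the picture batch value
    return picture_batch_value
```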
And the target neural network obtaining unit 111 is configured to obtain, if the picture batch value exceeds the picture total batch value, a current deep convolutional neural network corresponding to the initial deep convolutional neural network as a target deep convolutional neural network.
In this embodiment, after all batches of small pictures in the current small picture training set have been input to the initial deep convolutional neural network for training, the target deep convolutional neural network is finally obtained. Using the current small picture training set reduces the amount of data that must be collected and labeled when the deep convolutional network is actually deployed, thereby reducing the development cost of image-related artificial intelligence products.
In an embodiment, the apparatus 100 for constructing an image deep learning network based on a small training set further includes:
and the uplink unit is used for acquiring the model parameter set corresponding to the target deep convolutional neural network and uploading the model parameter set to the block chain network.
In this embodiment, the server may act as a blockchain node device and upload the model parameter set corresponding to the target deep convolutional neural network to the blockchain network, making full use of the tamper-proof property of blockchain data to achieve solidified data storage. Moreover, the server can download the model parameter set corresponding to the target deep convolutional neural network from the blockchain to generate the deep convolutional neural network locally.
The corresponding digest information is obtained from the model parameter set; specifically, the digest information is obtained by hashing the model parameter set, for example with the sha256 algorithm. Uploading the digest information to the blockchain ensures its security and its fairness and transparency to the user. The server may download the digest information from the blockchain to verify whether the model parameter set has been tampered with. The blockchain referred to in this example is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks associated by cryptographic methods, each containing a batch of network transaction information used to verify the validity (anti-counterfeiting) of the information and to generate the next block. A blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
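A sketch of the digest computation (the JSON serialization step is an assumption; the document only specifies sha256 hashing of the model parameter set):

```python
import hashlib
import json

def digest_of(parameter_set):
    # Serialize the model parameter set to canonical JSON bytes
    # (serialization format is an assumption), then hash with sha256
    # to obtain the digest information uploaded to the blockchain.
    payload = json.dumps(parameter_set, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()
```

Any change to the parameter set changes the digest, which is what allows the server to detect tampering by comparing against the digest stored on-chain.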
According to the device, an additional batch normalization layer is added to the deep convolutional network so that the network can train on actual data and additional data simultaneously, solving the model-forgetting problem caused by the pre-training approach. Meanwhile, since the batch normalization layer is the layer most sensitive to data distribution, processing the actual data and the additional data separately at this layer avoids the degraded model performance that different data distributions would cause.
The image deep learning network construction device based on the small training set can be implemented in the form of a computer program, and the computer program can be run on a computer device as shown in fig. 4.
Referring to fig. 4, fig. 4 is a schematic block diagram of a computer device according to an embodiment of the present invention. The computer device 500 is a server, and the server may be an independent server or a server cluster composed of a plurality of servers.
Referring to fig. 4, the computer device 500 includes a processor 502, memory, and a network interface 505 connected by a system bus 501, where the memory may include a non-volatile storage medium 503 and an internal memory 504.
The non-volatile storage medium 503 may store an operating system 5031 and a computer program 5032. The computer program 5032, when executed, causes the processor 502 to perform a mini-training set based image deep learning network construction method.
The processor 502 is used to provide computing and control capabilities that support the operation of the overall computer device 500.
The internal memory 504 provides an environment for running the computer program 5032 in the non-volatile storage medium 503, and when the computer program 5032 is executed by the processor 502, the processor 502 can be caused to execute the image deep learning network construction method based on the mini training set.
The network interface 505 is used for network communication, such as providing transmission of data information. Those skilled in the art will appreciate that the configuration shown in fig. 4 is a block diagram of only a portion of the configuration associated with aspects of the present invention and does not limit the computer device 500 to which aspects of the present invention may be applied; a particular computer device 500 may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.
The processor 502 is configured to run the computer program 5032 stored in the memory to implement the method for constructing the image deep learning network based on the small training set disclosed in the embodiment of the present invention.
Those skilled in the art will appreciate that the embodiment of a computer device illustrated in fig. 4 does not constitute a limitation on the specific construction of the computer device, and that in other embodiments a computer device may include more or fewer components than those illustrated, or some components may be combined, or a different arrangement of components. For example, in some embodiments, the computer device may only include a memory and a processor, and in such embodiments, the structures and functions of the memory and the processor are consistent with those of the embodiment shown in fig. 4, and are not described herein again.
It should be understood that, in the embodiment of the present invention, the Processor 502 may be a Central Processing Unit (CPU), and the Processor 502 may also be other general-purpose processors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other Programmable logic devices, discrete Gate or transistor logic devices, discrete hardware components, and the like. Wherein a general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
In another embodiment of the invention, a computer-readable storage medium is provided. The computer readable storage medium may be a non-volatile computer readable storage medium. The computer readable storage medium stores a computer program, wherein the computer program, when executed by a processor, implements the method for constructing the image deep learning network based on the small training set disclosed by the embodiment of the invention.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses, devices and units may refer to the corresponding processes in the foregoing method embodiments and are not repeated here. Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented in electronic hardware, computer software, or a combination of both; to clearly illustrate the interchangeability of hardware and software, the components and steps of the examples have been described above in general terms of their functionality. Whether such functionality is implemented as hardware or software depends on the particular application and the design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided by the present invention, it should be understood that the disclosed apparatus, device and method can be implemented in other ways. For example, the above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only a logical division, and there may be other divisions when the actual implementation is performed, or units having the same function may be grouped into one unit, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may also be an electric, mechanical or other form of connection.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment of the present invention.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a storage medium. Based on such understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, can be embodied in whole or in part in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB disk, a removable hard disk, a Read-Only Memory (ROM), a magnetic disk, or an optical disk.
While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and various equivalent modifications and substitutions can be easily made by those skilled in the art within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A method for constructing an image deep learning network based on a small training set is characterized by comprising the following steps:
receiving a current small picture training set, calling a video memory size value, calculating to obtain a batch input picture number according to a single picture memory value and the video memory size value in the current small picture training set, and calculating to obtain a picture total batch value according to the total picture number of the training set in the current small picture training set and the batch input picture number; the current small picture training set comprises an actual data picture and an additional data picture;
acquiring current batch input pictures in the current small picture training set; the initial value of the picture batch value corresponding to the current batch input pictures is 1, and the current batch input pictures comprise actual data pictures and additional data pictures;
respectively inputting actual data pictures and additional data pictures included in the current batch of input pictures into a convolution layer of an initial deep convolution neural network for convolution to obtain a first output matrix corresponding to each actual data picture to form a first output matrix set and a second output matrix corresponding to each additional data picture to form a second output matrix set;
inputting each first output matrix in the first output matrix set to a batch normalization layer of an initial deep convolutional neural network for batch normalization processing to obtain first batch normalization processing results corresponding to the first output matrix set;
inputting each second output matrix in the second output matrix set to an additional batch normalization layer of an initial deep convolutional neural network to obtain a second batch of normalization processing results corresponding to the second output matrix set;
inputting the first batch of normalization processing results and the second batch of normalization processing results into a RELU activation function layer of an initial deep convolutional neural network for activation, and obtaining output results corresponding to the current batch of input pictures;
obtaining a predicted value corresponding to the output result according to the initial depth convolution neural network, and obtaining a corresponding current loss function value according to an actual value and a predicted value corresponding to the current batch input pictures;
obtaining a last loss function value corresponding to a last batch of stored input pictures, and judging whether the current loss function value is smaller than the last loss function value;
if the current loss function value is smaller than the last loss function value, acquiring a first actual number corresponding to an actual data picture and a second actual number corresponding to an additional data picture in the current batch of input pictures to be used as picture number selection parameters of the next batch of input pictures;
adding one to the picture batch value to update the picture batch value, deleting the current batch input pictures in the current small picture training set to update the current small picture training set, and judging whether the picture batch value exceeds the picture total batch value; if the picture batch value does not exceed the picture total batch value, returning to execute the step of obtaining the current batch input pictures in the current small picture training set; and
if the batch value of the pictures exceeds the total batch value of the pictures, acquiring a current depth convolution neural network corresponding to the initial depth convolution neural network to be used as a target depth convolution neural network;
the actual data pictures refer to picture data which do not belong to the public data set and are used for fine-tuning training of the initial deep convolutional neural network, each corresponding to an actual picture annotation; the additional data pictures are a portion of picture data selected from the public data set, and each additional data picture also corresponds to an additional picture annotation.
2. The method as claimed in claim 1, wherein the calculating a batch number of input pictures according to the memory value of a single picture and the memory size value in the current mini picture training set comprises: obtaining the number of the batch input pictures according to the quotient of the video memory size value and the memory value of a single picture in the current small picture training set;
the calculating according to the total number of pictures in the training set in the current small picture training set and the number of the pictures input in batches to obtain a total picture batch value comprises the following steps: and obtaining a total picture batch value according to the quotient of the total number of the training set pictures corresponding to the current small picture training set and the number of the batch input pictures.
3. The method for constructing an image deep learning network based on a small training set according to claim 1, wherein the obtaining of the current batch of input pictures in the current small picture training set comprises:
acquiring a first total picture number corresponding to the actual data picture in the current small picture training set and a second total picture number corresponding to the additional data picture;
calculating to obtain a picture ratio according to the quotient of the first total picture number and the second total picture number;
judging whether the picture ratio is less than 0.01 or not;
if the picture ratio is smaller than 0.01, automatically setting the number of sheets corresponding to the actual data picture as p1, and setting the number of sheets corresponding to the extra data picture as q1; wherein p1 + q1 = the number of batch input pictures, and p1 is smaller than q1;
if the value range of the picture ratio is [0.01,100], automatically setting the number of sheets corresponding to the actual data picture as p2 and setting the number of sheets corresponding to the additional data picture as q2; wherein p2 + q2 = the number of batch input pictures, and p2 is equal to q2;
if the picture ratio is larger than 100, automatically setting the number of sheets corresponding to the actual data picture as p3, and setting the number of sheets corresponding to the additional data picture as q3; wherein p3 + q3 = the number of batch input pictures, and p3 is greater than q3;
and acquiring actual data pictures and extra data pictures with corresponding numbers in the current small picture training set according to the number corresponding to the actual data pictures and the number corresponding to the extra data pictures to form a current batch input picture.
4. The method for constructing an image deep learning network based on a small training set according to claim 1, wherein the step of inputting each second output matrix in the second output matrix set to an additional batch normalization layer of an initial deep convolutional neural network to obtain a second batch of normalization processing results corresponding to the second output matrix set comprises:
acquiring a second output matrix mean value corresponding to the second output matrix set;
acquiring a second output matrix variance corresponding to the second output matrix set;
normalizing the second output matrix set according to the second output matrix mean and the second output matrix variance to obtain a corresponding normalized second output matrix set;
and correspondingly carrying out scale change according to the normal distribution data included in the normalized second output matrix set and the pre-stored scale adjustment values and deviation values to obtain a second batch of normalization processing results corresponding to the second output matrix set.
5. The method for constructing an image deep learning network based on a small training set according to claim 1, wherein the inputting the first batch of normalization processing results and the second batch of normalization processing results into a RELU activation function layer of an initial deep convolutional neural network for activation to obtain output results corresponding to the current batch of input pictures comprises:
and after the first batch of normalization processing results and the second batch of normalization processing results are merged to obtain a merged set, inputting the merged set into a RELU activation function layer of an initial deep convolutional neural network for activation, and obtaining an output result corresponding to the current batch of input pictures.
6. The method for constructing an image deep learning network based on a small training set according to claim 1, wherein after obtaining a last loss function value corresponding to a last batch of stored input pictures and determining whether the current loss function value is smaller than the last loss function value, the method further comprises:
and if the current loss function value is larger than or equal to the last loss function value, acquiring a third actual number corresponding to the actual data picture in the last batch of input pictures and a fourth actual number corresponding to the extra data picture to be used as picture number selection parameters of the next batch of input pictures.
7. The method for constructing an image deep learning network based on a small training set according to claim 1, wherein if the batch value of the pictures exceeds the total batch value of the pictures, the method for acquiring the current deep convolutional neural network corresponding to the initial deep convolutional neural network as a target deep convolutional neural network further comprises:
and obtaining a model parameter set corresponding to the target deep convolutional neural network and uploading the model parameter set to a blockchain network.
8. An image deep learning network construction device based on a small training set, characterized by comprising:
a training set receiving unit, configured to receive a current small picture training set, retrieve a video memory size value, calculate a number of batch input pictures according to a single picture memory value in the current small picture training set and the video memory size value, and calculate a total picture batch value according to a total number of training set pictures in the current small picture training set and the number of batch input pictures; the current small picture training set comprises an actual data picture and an additional data picture;
the current batch input picture acquisition unit is used for acquiring current batch input pictures in the current small picture training set; the initial value of the picture batch value corresponding to the current batch input pictures is 1, and the current batch input pictures comprise actual data pictures and additional data pictures;
the convolution unit is used for respectively inputting actual data pictures and additional data pictures included in the current batch of input pictures into a convolution layer of the initial deep convolution neural network for convolution to obtain a first output matrix corresponding to each actual data picture to form a first output matrix set and a second output matrix corresponding to each additional data picture to form a second output matrix set;
the batch normalization unit is used for inputting each first output matrix in the first output matrix set to a batch normalization layer of an initial deep convolutional neural network for batch normalization processing to obtain first batch normalization processing results corresponding to the first output matrix set;
the additional batch normalization unit is used for inputting each second output matrix in the second output matrix set to an additional batch normalization layer of the initial deep convolutional neural network for batch normalization processing, so as to obtain a second batch of normalization processing results corresponding to the second output matrix set;
the activation unit is used for inputting the first batch of normalization processing results and the second batch of normalization processing results into a RELU activation function layer of an initial deep convolutional neural network for activation to obtain output results corresponding to the current batch of input pictures;
a loss function value obtaining unit, configured to obtain a predicted value corresponding to the output result according to an initial deep convolutional neural network, and obtain a corresponding current loss function value according to an actual value and a predicted value corresponding to the current batch of input pictures;
the loss function value comparison unit is used for acquiring a last loss function value corresponding to a last batch of stored input pictures and judging whether the current loss function value is smaller than the last loss function value or not;
a first picture number setting unit, configured to, if the current loss function value is smaller than the previous loss function value, obtain a first actual number corresponding to an actual data picture and a second actual number corresponding to an extra data picture in a current batch of input pictures, so as to serve as picture number selection parameters of a next batch of input pictures;
the updating unit is used for adding one to the picture batch value to update the picture batch value, deleting the current batch input pictures in the current small picture training set to update the current small picture training set, and judging whether the picture batch value exceeds the total picture batch value; if the picture batch value does not exceed the picture total batch value, returning to execute the step of obtaining the current batch input pictures in the current small picture training set; and
the target neural network acquisition unit is used for acquiring a current deep convolutional neural network corresponding to the initial deep convolutional neural network as a target deep convolutional neural network if the picture batch value exceeds the total picture batch value;
the actual data picture refers to picture data which does not belong to a public data set and is used for fine-tuning training of the initial deep convolutional neural network, each actual data picture corresponding to an actual picture annotation; the additional data picture is a part of the picture data selected from the public data set, and each additional data picture likewise corresponds to an additional picture annotation.
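Taken together, the training set receiving unit, updating unit, and target network acquisition unit of claim 8 describe an outer loop over picture batches. A rough sketch under stated assumptions: integer division for the per-batch count, rounding the final partial batch up (the claim only says the total batch value is "calculated"), and a hypothetical `train_batch` callback standing in for the convolution, normalization, and activation units:

```python
import math

def batch_parameters(video_memory, single_picture_memory, total_pictures):
    """Number of batch input pictures from the available video memory,
    plus the picture total batch value."""
    per_batch = video_memory // single_picture_memory
    total_batches = math.ceil(total_pictures / per_batch)
    return per_batch, total_batches

def training_loop(pictures, video_memory, single_picture_memory, train_batch):
    """Outer loop: take one batch at a time, delete it from the remaining
    training set, and stop once the picture batch value exceeds the
    picture total batch value."""
    per_batch, total_batches = batch_parameters(
        video_memory, single_picture_memory, len(pictures))
    remaining = list(pictures)
    batch_value = 1                        # initial picture batch value
    while batch_value <= total_batches:
        current = remaining[:per_batch]    # current batch of input pictures
        remaining = remaining[per_batch:]  # delete batch -> updated training set
        train_batch(current)
        batch_value += 1                   # add one to the picture batch value
    return batch_value - 1                 # batches actually trained

processed = []
n = training_loop(list(range(100)), 8192, 256, processed.append)
```

With an 8192 MB video memory and 256 MB per picture this yields 32 pictures per batch and four batches, the last one partial.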
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method for constructing an image deep learning network based on a small training set according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a processor, causes the processor to execute the method for constructing an image deep learning network based on a small training set according to any one of claims 1 to 7.
CN202010937042.3A 2020-09-08 2020-09-08 Image deep learning network construction method and device based on small training set Active CN111931698B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010937042.3A CN111931698B (en) 2020-09-08 2020-09-08 Image deep learning network construction method and device based on small training set

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010937042.3A CN111931698B (en) 2020-09-08 2020-09-08 Image deep learning network construction method and device based on small training set

Publications (2)

Publication Number Publication Date
CN111931698A CN111931698A (en) 2020-11-13
CN111931698B true CN111931698B (en) 2021-01-26

Family

ID=73310116

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010937042.3A Active CN111931698B (en) 2020-09-08 2020-09-08 Image deep learning network construction method and device based on small training set

Country Status (1)

Country Link
CN (1) CN111931698B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112446428B (en) * 2020-11-27 2024-03-05 杭州海康威视数字技术股份有限公司 An image data processing method and device
CN113762506B (en) * 2021-08-13 2023-11-24 中国电子科技集团公司第三十八研究所 A computer vision deep learning model pruning method and system

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106355248A (en) * 2016-08-26 2017-01-25 深圳先进技术研究院 Deep convolution neural network training method and device
US11106974B2 (en) * 2017-07-05 2021-08-31 International Business Machines Corporation Pre-training of neural network by parameter decomposition
CN111368989B (en) * 2018-12-25 2023-06-16 同方威视技术股份有限公司 Training method, device and equipment for neural network model and readable storage medium
CN109815826B (en) * 2018-12-28 2022-11-08 新大陆数字技术股份有限公司 Method and device for generating face attribute model
CN111028134A (en) * 2019-11-29 2020-04-17 杭州依图医疗技术有限公司 Image processing method, apparatus, system and medium
CN111126481A (en) * 2019-12-20 2020-05-08 湖南千视通信息科技有限公司 Training method and device of neural network model
CN111191732B (en) * 2020-01-03 2021-05-14 天津大学 Target detection method based on full-automatic learning
CN111259939B (en) * 2020-01-10 2022-06-07 苏州浪潮智能科技有限公司 Tuning management method, device, equipment and medium for deep learning model

Also Published As

Publication number Publication date
CN111931698A (en) 2020-11-13

Similar Documents

Publication Publication Date Title
US11983850B2 (en) Image processing method and apparatus, device, and storage medium
CN115115552B (en) Image correction model training method, image correction device and computer equipment
CN111325851B (en) Image processing method and device, electronic device, and computer-readable storage medium
CN111383232B (en) Cutout method, device, terminal equipment and computer-readable storage medium
CN111950723B (en) Neural network model training method, image processing method, device and terminal equipment
CN111859023A (en) Video classification method, apparatus, device, and computer-readable storage medium
WO2022022154A1 (en) Facial image processing method and apparatus, and device and storage medium
WO2020248841A1 (en) Au detection method and apparatus for image, and electronic device and storage medium
CN113239875B (en) Facial feature acquisition method, system, device and computer-readable storage medium
CN112163637A (en) Image classification model training method and device based on unbalanced data
CN113706439B (en) Image detection method, device, storage medium and computer equipment
CN111814620A (en) Face image quality evaluation model establishing method, optimization method, medium and device
CN114187463B (en) Electronic archive generation method, device, terminal equipment and storage medium
CN107871103B (en) A face authentication method and device
CN111931698B (en) Image deep learning network construction method and device based on small training set
CN111126347B (en) Human eye state identification method, device, terminal and readable storage medium
CN113808267A (en) GIS map-based three-dimensional community display method and system
CN115223013A (en) Model training method, device, equipment and medium based on small data generation network
CN113762042B (en) Video identification method, device, equipment and storage medium
CN110489584B (en) Image classification method and system based on densely connected MobileNets model
CN113989219B (en) Method, device, electronic device and computer-readable storage medium for controlling positioning of steel billet before entering furnace
CN110647898B (en) Image processing method, image processing device, electronic equipment and computer storage medium
CN118397223A (en) Scene simulation method, device, equipment and medium based on virtual reality
CN117710621A (en) A three-dimensional model wearing method, device, equipment and storage medium
CN116579409A (en) Pruning Acceleration Method and Acceleration System for Intelligent Camera Model Based on Reparameterization

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant