
CN104102919B - Image classification method capable of effectively preventing convolutional neural network from being overfit - Google Patents


Info

Publication number
CN104102919B
CN104102919B (granted from application CN201410333924.3A)
Authority
CN
China
Prior art keywords
image
training
convolutional neural
layer
neural network
Prior art date
Legal status
Active
Application number
CN201410333924.3A
Other languages
Chinese (zh)
Other versions
CN104102919A (en)
Inventor
王瀚漓
俞定君
Current Assignee
Deep Blue Technology Shanghai Co Ltd
Original Assignee
Tongji University
Priority date
Filing date
Publication date
Application filed by Tongji University filed Critical Tongji University
Priority to CN201410333924.3A priority Critical patent/CN104102919B/en
Publication of CN104102919A publication Critical patent/CN104102919A/en
Application granted granted Critical
Publication of CN104102919B publication Critical patent/CN104102919B/en


Landscapes

  • Image Analysis (AREA)

Abstract

The invention relates to an image classification method that effectively prevents over-fitting of a convolutional neural network, comprising: obtaining an image training set and an image test set; training a convolutional neural network model; and classifying the images of the test set with the trained model. Training the model comprises: preprocessing the image data of the training set and performing sample augmentation to form training samples; forward-propagating the training samples to extract image features; computing the classification probability of each sample in a Softmax classifier; computing the training error from the probabilities y_i; and back-propagating the training error layer by layer from the last layer of the network while modifying the network weight matrix W with stochastic gradient descent (SGD). Compared with the prior art, the invention offers high classification accuracy, fast convergence and high computational efficiency.

Description

An Image Classification Method for Effectively Preventing Overfitting of Convolutional Neural Networks

Technical field

The invention relates to the field of image processing, and in particular to an image classification method that effectively prevents over-fitting of a convolutional neural network.

Background art

With the wide application of multimedia technology and computer networks, a large amount of image data has appeared on the network. In order to manage these image files effectively and provide users with a better experience, automatically identifying the content of these images has become increasingly important.

With the continuous improvement and development of machine learning methods, deep learning algorithms have received more and more attention. The convolutional neural network is an important deep learning algorithm and has become a research hotspot in speech analysis and image recognition. Convolutional neural networks break with the fully connected neuron pattern between layers of traditional neural networks; their weight-sharing structure makes them more similar to biological neural networks, reduces the complexity of the network model and reduces the number of weights. This advantage is most apparent when the network input is an image: the image can be fed directly into the network, avoiding the complicated feature extraction and data reconstruction of traditional recognition algorithms. A convolutional network is a multi-layer perceptron specially designed to recognize two-dimensional shapes, and this network structure is highly invariant to translation, scaling, tilting and other forms of deformation.

Image classification based on convolutional neural networks can automatically and effectively extract feature information from images, and the extracted features have very good image representation ability, so the technique has achieved satisfactory experimental results on some image classification problems. Nevertheless, the technique currently has the following drawbacks:

First, because the labelled data in image databases is limited, the number of weights that must be trained keeps growing as the convolutional neural network grows in scale, which inevitably causes the network to over-fit, i.e. the classification accuracy during training is far better than the classification accuracy during testing.

Second, to obtain better feature representation and therefore better classification accuracy, some researchers increase the network depth and enlarge the network scale. However, this greatly increases the computational complexity, and traditional CPU computation can no longer keep up with it.

Summary of the invention

The object of the present invention is to overcome the above defects of the prior art and provide an image classification method that effectively prevents over-fitting of a convolutional neural network, with high classification accuracy, fast convergence and high computational efficiency.

The object of the present invention can be achieved by the following technical solution:

An image classification method that effectively prevents over-fitting of a convolutional neural network, the method running on a GPU and comprising:

Step 1: obtain an image training set and an image test set;

Step 2: train the convolutional neural network model, which specifically includes the following steps:

a) Set the structure of the convolutional neural network and the upper limit N of the number of training iterations, and initialize the network weight matrix W; the structure includes the number of layers of the convolutional neural network and the number of feature maps in each layer;

b) Obtain image data from the image training set, preprocess it, and perform sample augmentation to form training samples;

c) Forward-propagate the training samples to extract image features; the forward propagation includes the computations of the convolutional layer, the nonlinear normalization layer and the mixed pooling layer;

d) Compute the classification probability of each sample in the Softmax classifier:

y_i = exp(s_i) / Σ_{k=1}^{n} exp(s_k)

where s_i denotes the output value of the i-th neuron of the Softmax classifier, s_i = F·η, F is the image feature vector of a training sample, η is the corresponding weight, and n is the number of categories to be classified;

e) Compute the training error from the probabilities y_i:

δ_i = Σ_{k=1}^{n} (y_k*/y_k) · y_k · (θ_ik − y_i)

where θ_ik = 1 when i = k, i denotes the i-th category, and when the original input belongs to category i the i-th element of the expected output vector y* equals 1 (all other elements are 0);

f) Back-propagate the training error layer by layer from the last layer of the convolutional neural network towards the front, while modifying the network weight matrix W with stochastic gradient descent (SGD);

g) Judge whether the model training is complete; if so, save the convolutional neural network model and the Softmax classifier and execute Step 3; if not, return to step b);

Step 3: use the trained convolutional neural network model to classify the images of the image test set.

In a) of Step 2, the elements of the initial weight matrix W take values in [-0.01, 0.01].

Step b) of Step 2 specifically comprises:

b1) For images whose height and width are equal, scale them with the cvResize function of OPENCV; the scaled image size is N×N;

b2) For images whose height and width are unequal, keep the short side S fixed, crop the S consecutive pixels in the middle of the long side to form an S×S image, and then repeat step b1) to finally form an N×N image;

b3) Compute the sum of the pixel values of all images and divide by the number of images to obtain a mean image; subtract the mean image from every image to obtain the input samples;

b4) Perform data augmentation on the input samples to form the final training samples.

In c) of Step 2, the computation of the convolutional layer is specifically:

y_k = max{w_k * x, 0}

where x denotes the output of the previous layer, i.e. the input of the current layer; y_k denotes the output of the k-th feature map; w_k denotes the k-th weight matrix connected to the output of the previous layer; and "*" denotes the two-dimensional inner-product operation;

the computation of the nonlinear normalization layer is specifically:

y_kij = x_kij / (1 + (α/N) · Σ_{l=k−N/2}^{k+N/2} (x_lij)²)^β

where x_kij is the output of the k-th feature map of the previous layer; the accumulation runs over the same position (i, j) of the N feature maps adjacent to the k-th feature map; α and β are preset normalization parameters; and y_kij is the newly generated feature map;

the computation of the mixed pooling layer is specifically:

y_kij = λ · max_{(p,q)∈R_ij} x_kpq + (1 − λ) · (1/|R_ij|) · Σ_{(p,q)∈R_ij} x_kpq

where λ is a random parameter taking the value 0 or 1, x_kpq is the output of the k-th feature map of the previous layer, and R_ij is the region to be down-sampled.

In step g), the criterion for judging whether the model training is complete is: the upper limit of the number of training iterations has been reached.

Compared with the prior art, the present invention has the following advantages:

First, the present invention is the first to use mixed down-sampling (Mixed Pooling) when down-sampling in a convolutional neural network; it effectively prevents over-fitting of the neural network, ultimately improves the classification accuracy, and is robust.

Second, the present invention uses Graphics Processing Units (GPUs) to accelerate the computation, so that the convolutional neural network can compute and converge quickly when processing large amounts of data.

Third, the recognition accuracy of the present invention is better than that of mainstream algorithms on the CIFAR-10, CIFAR-100 and SVHN data sets, with higher computational efficiency.

Brief description of the drawings

Fig. 1 is a schematic diagram of the model training process of the present invention;

Fig. 2 is a schematic diagram of the image classification process of the present invention.

Detailed description of the embodiments

The present invention is described in detail below with reference to the accompanying drawings and a specific embodiment. The embodiment is implemented on the premise of the technical solution of the present invention, and a detailed implementation and a specific operation process are given, but the protection scope of the present invention is not limited to the following embodiment.

As shown in Fig. 1 and Fig. 2, in this image classification method that effectively prevents over-fitting of a convolutional neural network, an image set M is first obtained from an image database and divided into a training set M_t and a test set M_y; a convolutional neural network model is then built from the training set M_t; finally, the trained convolutional neural network model is used to classify the images of the test set M_y.

As shown in Fig. 1, the training of the convolutional neural network model specifically includes the following steps:

In step S101: set the structure of the convolutional neural network and the upper limit N of the number of training iterations, and initialize the network weight matrix W, specifically:

1a) The number of layers of the convolutional neural network and the number of feature maps in each layer are preset according to the scale of the problem. The experiment uses the structure Input1-Conv64-LRN64-Pooling64-Conv64-LRN64-Pooling64-Softmax (Input denotes the input layer; the number n following a layer name denotes the number of feature maps of that layer; Conv denotes a convolutional layer; LRN denotes a nonlinear normalization layer; Pooling denotes a down-sampling layer; Softmax denotes the final classifier layer);

1b) Connect the layers and initialize the network weight matrix W; the initialization method is, for each element of W, to randomly generate a floating-point number in [-0.01, 0.01] and assign it.
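As an illustration of step 1b), the following Python sketch (not part of the patent; NumPy-based, and the layer shapes shown are hypothetical, since the patent does not give kernel sizes) draws every element of W uniformly from [-0.01, 0.01]:

```python
import numpy as np

def init_weights(layer_shapes, low=-0.01, high=0.01, rng=np.random):
    """Create one weight matrix per connection and fill every element with a
    random floating-point number in [low, high], as described in step 1b)."""
    return [rng.uniform(low, high, size=shape).astype(np.float32)
            for shape in layer_shapes]

# Hypothetical shapes for the two Conv64 stages of the Input1-Conv64-...-Softmax structure.
W = init_weights([(64, 3, 5, 5), (64, 64, 5, 5)])
```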

In step S102: obtain image data from the training set M_t, and preprocess and augment the obtained image data:

2a) For images whose height and width are equal, scale them directly with the cvResize function of OPENCV so that the scaled image is N×N pixels; in this embodiment, N = 32;

2b) For images whose height and width are unequal, keep the short side S fixed and crop the S pixels in the middle of the long side, forming an S×S image; then repeat step 2a) to finally form a 32×32 image;

2c) Use all images to compute the sum of the pixel values at each position and divide by the number of images to obtain a mean image; finally subtract the mean image from every image to obtain the input samples;

2d) Perform data augmentation on every 32×32 input sample: regard the sample as a 32×32 matrix and crop the upper-left 24×24 elements, the lower-left 24×24 elements, the upper-right 24×24 elements, the lower-right 24×24 elements and the central 24×24 elements as new input samples; then flip these 5 new input samples horizontally to form another 5 new input samples. In this way each original 32×32 sample yields 10 new 24×24 samples after data augmentation. A sketch of this preprocessing and augmentation is given below.
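The following Python sketch of steps 2a) to 2d) is not part of the patent: it uses NumPy arrays, and the modern OpenCV binding cv2.resize stands in for the cvResize call mentioned in the text.

```python
import numpy as np
import cv2  # cv2.resize stands in for the OPENCV cvResize function in the text

def preprocess(images, n=32):
    """2a)-2c): centre-crop each image to a square, resize it to n x n,
    then subtract the mean image computed over the whole set."""
    squares = []
    for img in images:
        h, w = img.shape[:2]
        s = min(h, w)                                # keep the short side S fixed
        top, left = (h - s) // 2, (w - s) // 2
        square = img[top:top + s, left:left + s]     # middle S pixels of the long side
        squares.append(cv2.resize(square, (n, n)).astype(np.float32))
    stack = np.stack(squares)
    mean_image = stack.mean(axis=0)                  # per-position mean over all images
    return stack - mean_image

def augment(sample, crop=24):
    """2d): cut the four 24 x 24 corners and the 24 x 24 centre out of a
    32 x 32 sample, then add the horizontal flip of each, giving 10 samples."""
    n = sample.shape[0]
    off = n - crop                                   # 8 when n = 32 and crop = 24
    origins = [(0, 0), (0, off), (off, 0), (off, off), (off // 2, off // 2)]
    crops = [sample[y:y + crop, x:x + crop] for y, x in origins]
    return crops + [np.fliplr(c) for c in crops]
```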

In step S103: use the new samples obtained in step S102 as the input layer Input1 of the convolutional neural network and perform forward propagation to extract image features. The forward propagation goes through Conv64-LRN64-Pooling64-Conv64-LRN64-Pooling64, i.e. the features are extracted by two stages of convolution. The specific steps are as follows:

3a) Convolutional layer (Conv) computation:

y_k = max{w_k * x, 0}

where x denotes the output of the previous layer (i.e. the input of the current layer): in the first Conv64 layer it is the output of the Input1 layer, and in the second Conv64 layer it is the output of the first Pooling64 layer; y_k denotes the output of the k-th feature map of the Conv64 layer; w_k denotes the k-th weight matrix connected to the output of the previous layer; and "*" denotes the two-dimensional inner-product operation.
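A sketch of the computation in 3a) is shown below. It is not taken from the patent; it uses SciPy's correlate2d for the two-dimensional inner product, and summing the per-channel responses before the max{·, 0} truncation is an assumption.

```python
import numpy as np
from scipy.signal import correlate2d

def conv_layer(x, weights):
    """x: (in_maps, H, W) output of the previous layer;
    weights: (out_maps, in_maps, kh, kw).
    Implements y_k = max{w_k * x, 0} with a valid 2-D correlation."""
    out = []
    for w_k in weights:
        acc = sum(correlate2d(x[c], w_k[c], mode="valid")
                  for c in range(x.shape[0]))
        out.append(np.maximum(acc, 0.0))   # the max{., 0} non-linearity
    return np.stack(out)                   # one output map per w_k
```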

3b) Nonlinear normalization (LRN) computation:

y_kij = x_kij / (1 + (α/N) · Σ_{l=k−N/2}^{k+N/2} (x_lij)²)^β

where x_kij is the output of the k-th feature map of the Conv64 layer (i.e. the k-th output component of the previous layer); the accumulation runs over the same position (i, j) of the N = 5 feature maps adjacent to the k-th feature map; α = 0.001 and β = 0.75 are the preset normalization parameters; the newly generated feature map of the LRN layer is denoted y_kij.
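The LRN computation in 3b) could be sketched in Python as follows (not from the patent; maps at the edges simply use fewer neighbours, an assumption the text does not settle):

```python
import numpy as np

def lrn(x, n=5, alpha=0.001, beta=0.75):
    """Local response normalisation across feature maps.
    x: (num_maps, H, W); every position (i, j) of map k is divided by
    (1 + alpha/n * sum of squares over the n maps centred on k) ** beta."""
    num_maps = x.shape[0]
    half = n // 2
    y = np.empty_like(x)
    for k in range(num_maps):
        lo, hi = max(0, k - half), min(num_maps, k + half + 1)
        denom = (1.0 + (alpha / n) * np.sum(x[lo:hi] ** 2, axis=0)) ** beta
        y[k] = x[k] / denom
    return y
```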

3c) Mixed pooling layer computation:

y_kij = λ · max_{(p,q)∈R_ij} x_kpq + (1 − λ) · (1/|R_ij|) · Σ_{(p,q)∈R_ij} x_kpq

where λ is a random parameter taking the value 0 or 1, x_kpq is the output of the k-th feature map of the preceding LRN64 layer (i.e. the k-th output component of the previous layer), and R_ij is the region to be down-sampled; the selected down-sampling region is 3×3.
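Below is a sketch of the mixed pooling rule in 3c). It is not from the patent; it assumes non-overlapping 3×3 regions (the text fixes only the region size, not the stride) and draws a fresh λ for every region.

```python
import numpy as np

def mixed_pooling(x, size=3, rng=np.random):
    """Mixed pooling of one feature map x (H x W): for every region R_ij a
    Bernoulli draw lambda in {0, 1} picks either the max or the mean of R_ij."""
    h, w = x.shape
    out_h, out_w = h // size, w // size
    y = np.empty((out_h, out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            region = x[i * size:(i + 1) * size, j * size:(j + 1) * size]
            lam = rng.randint(0, 2)        # lambda is 0 or 1
            y[i, j] = lam * region.max() + (1 - lam) * region.mean()
    return y
```

When λ = 1 the rule degenerates to traditional max down-sampling and when λ = 0 to traditional mean down-sampling, which is why the comparison tables below use those two methods as the baselines.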

The above three computations are executed in turn until all convolution stages are completed.

In step S104: use the image feature vector F obtained in step S103 to compute, in the Softmax classifier, the probability y_i that the sample belongs to each class:

y_i = exp(s_i) / Σ_{k=1}^{n} exp(s_k)

where s_i is the output value of the i-th neuron of the Softmax classifier, obtained as the dot product of the image feature vector F with the corresponding weights, i.e. s_i = F·η, where η is the corresponding weight and n is the number of categories to be classified. Assuming the images to be classified fall into n classes, an actual output vector Y = {y_1, y_2, …, y_n} is finally formed. If an image belongs to the i-th class, its expected output vector has its i-th element equal to 1 and all remaining elements equal to 0. From the actual and expected output vectors the error vector δ of the sample can be computed; the i-th training error δ_i is computed as:

δ_i = Σ_{k=1}^{n} (y_k*/y_k) · y_k · (θ_ik − y_i)

where θ_ik = 1 when i = k, i denotes the i-th category, and when the original input belongs to category i the i-th element of the expected output vector y* equals 1.
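As a minimal sketch of step S104 (not from the patent; NumPy-based, with η stored as an n × |F| weight matrix), the probabilities and the per-class error could be computed as:

```python
import numpy as np

def softmax_probs(f, eta):
    """s_i = F . eta_i, then y_i = exp(s_i) / sum_k exp(s_k)."""
    s = eta @ f
    e = np.exp(s - s.max())                # shifted for numerical stability
    return e / e.sum()

def error_vector(y, true_class):
    """Expected output Y* is one-hot at the true class; the sketch uses the
    cross-entropy form delta = Y* - Y, which is what the delta_i formula
    above reduces to when Y* is one-hot."""
    y_star = np.zeros_like(y)
    y_star[true_class] = 1.0
    return y_star - y
```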

In step S105: back-propagate the training error from the last layer of the convolutional neural network layer by layer towards the front, i.e. starting from the Softmax layer the error is propagated in turn to the pooling layers, the nonlinear normalization layers and the convolutional layers, while the network weight matrix W is modified with stochastic gradient descent (SGD).
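Step S105 ends with the SGD weight update. A minimal sketch follows (not from the patent; the learning rate and the absence of momentum or weight decay are assumptions):

```python
def sgd_step(weights, gradients, lr=0.01):
    """One stochastic-gradient-descent update of every weight matrix W,
    using the gradients produced by the back-propagation pass."""
    for w, g in zip(weights, gradients):
        w -= lr * g                        # in-place update of W
    return weights
```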

In step S106: after the back-propagation is completed, judge whether the number of training iterations has reached the upper limit N set in step S101; if it has, stop training and save the model; if not, return to step S102 and continue training.

In step S107: save the trained model and the Softmax classifier.

As shown in Fig. 2, the specific steps of classifying the images of the test set M_y with the trained convolutional neural network model are:

In step S201: extract test samples from the test set M_y, preprocess and augment them, and use the obtained data as the network input;

In step S202: perform the computations of the convolutional layers, the nonlinear normalization layers and the mixed pooling layers in turn, as in step S103, until all convolution stages are completed;

In step S203: obtain the feature vector of the test sample;

In step S204: use the feature vector and the Softmax classifier to compute the probability y_i that the sample belongs to each class, find the largest element of {y_1, y_2, …, y_n}, and, assuming that element is y_j, assign the sample to class j. A sketch of this final classification step is given below.
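This sketch of step S204 is not from the patent; it reuses the Softmax form above and assumes a single feature vector per test sample.

```python
import numpy as np

def classify(feature_vector, eta):
    """Compute the class probabilities with the saved Softmax classifier
    and return the index j of the largest probability y_j."""
    s = eta @ feature_vector
    y = np.exp(s - s.max())
    y /= y.sum()
    return int(np.argmax(y))               # predicted class j
```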

To verify the performance of the present invention, experiments were carried out in this embodiment on three public data sets (CIFAR-10, CIFAR-100 and SVHN), comparing a convolutional neural network that uses mixed down-sampling with convolutional neural networks that use only ordinary down-sampling. All experiments were trained and tested according to the experimental protocol of the corresponding data set. The comparisons in Tables 1, 2 and 3 show that mixed down-sampling still obtains a better test error than the traditional down-sampling methods even when its training error is relatively large, which fully demonstrates that mixed down-sampling plays an important role in preventing over-fitting of the convolutional neural network. On the above three data sets, the test errors of the present invention are 10.80%, 38.07% and 3.01% respectively. The experimental results outperform the currently published mainstream algorithms and show a high recognition rate.

The above are only preferred embodiments of the present invention and are not intended to limit the present invention. The present invention also includes technical solutions composed of any combination of the above technical features.

Table 1 Experimental data on the CIFAR-10 data set

Down-sampling method | Training error | Test error
Traditional max down-sampling | 3.01% | 11.36%
Traditional mean down-sampling | 4.52% | 13.75%
Mixed down-sampling (Mix Pooling) | 6.25% | 10.80%

Table 2 Experimental data on the CIFAR-100 data set

Down-sampling method | Training error | Test error
Traditional max down-sampling | 5.42% | 40.09%
Traditional mean down-sampling | 14.61% | 44.01%
Mixed down-sampling (Mix Pooling) | 25.71% | 38.07%

Table 3 Experimental data on the SVHN data set

Claims (5)

1. An image classification method that effectively prevents over-fitting of a convolutional neural network, characterised in that the method runs in a GPU and comprises:
Step 1: obtaining an image training set and an image test set;
Step 2: training a convolutional neural network model, specifically comprising the following steps:
a) setting the structure of the convolutional neural network and an upper limit N of the number of training iterations, and initializing the network weight matrix W, the structure including the number of layers of the convolutional neural network and the number of feature maps in each layer;
b) obtaining image data from the image training set, preprocessing it, and performing sample augmentation to form training samples;
c) forward-propagating the training samples to extract image features, the forward propagation including computations of a convolutional layer, a nonlinear normalization layer and a mixed pooling layer;
d) computing the classification probability of each sample in a Softmax classifier:
y_i = exp(s_i) / Σ_{k=1}^{n} exp(s_k)
where s_i denotes the output value of the i-th neuron of the Softmax classifier, s_i = F·η, F is the image feature vector of a training sample, η is the corresponding weight, and n is the number of categories to be classified;
e) computing the training error from the probabilities y_i:
δ_i = Σ_{k=1}^{n} (y_k*/y_k) · y_k · (θ_ik − y_i)
where θ_ik = 1 when i = k, i denotes the i-th category, and when the original input belongs to category i the i-th element of the expected output vector y* equals 1;
f) back-propagating the training error layer by layer from the last layer of the convolutional neural network towards the front, while modifying the network weight matrix W with stochastic gradient descent (SGD);
g) judging whether the model training is complete; if so, saving the convolutional neural network model and the Softmax classifier and executing Step 3; if not, returning to step b);
Step 3: performing image classification on the image test set with the trained convolutional neural network model.
2. The image classification method that effectively prevents over-fitting of a convolutional neural network according to claim 1, characterised in that in a) of Step 2 the elements of the initial weight matrix W take values in [-0.01, 0.01].
3. The image classification method that effectively prevents over-fitting of a convolutional neural network according to claim 1, characterised in that b) of Step 2 specifically comprises:
b1) for images whose height and width are equal, scaling them with the cvResize function of OPENCV, the scaled image size being N×N;
b2) for images whose height and width are unequal, keeping the short side S fixed, cropping the S consecutive pixels in the middle of the long side to form an S×S image, and repeating step b1) to finally form an N×N image;
b3) computing the sum of the pixel values of all images and dividing by the number of images to obtain a mean image, and subtracting the mean image from every image to obtain input samples;
b4) performing data augmentation on the input samples to form the final training samples.
4. The image classification method that effectively prevents over-fitting of a convolutional neural network according to claim 1, characterised in that in c) of Step 2 the computation of the convolutional layer is specifically:
y_k = max{w_k * x, 0}
where x denotes the output of the previous layer, i.e. the input of the current layer, y_k denotes the output of the k-th feature map, w_k denotes the k-th weight matrix connected to the output of the previous layer, and "*" denotes the two-dimensional inner-product operation;
the computation of the nonlinear normalization layer is specifically:
y_kij = x_kij / (1 + (α/N) · Σ_{l=k−N/2}^{k+N/2} (x_lij)²)^β
where x_kij is the output of the k-th feature map of the previous layer, the accumulation runs over the same position (i, j) of the N feature maps adjacent to the k-th feature map, α and β are preset normalization parameters, and y_kij is the newly generated feature map;
the computation of the mixed pooling layer is specifically:
y_kij = λ · max_{(p,q)∈R_ij} x_kpq + (1 − λ) · (1/|R_ij|) · Σ_{(p,q)∈R_ij} x_kpq
where λ is a random parameter taking the value 0 or 1, x_kpq is the output of the k-th feature map of the previous layer, and R_ij is the region to be down-sampled.
5. The image classification method that effectively prevents over-fitting of a convolutional neural network according to claim 1, characterised in that in step g) the criterion for judging whether the model training is complete is: the upper limit of the number of training iterations has been reached.
CN201410333924.3A 2014-07-14 2014-07-14 Image classification method capable of effectively preventing convolutional neural network from being overfit Active CN104102919B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410333924.3A CN104102919B (en) 2014-07-14 2014-07-14 Image classification method capable of effectively preventing convolutional neural network from being overfit

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410333924.3A CN104102919B (en) 2014-07-14 2014-07-14 Image classification method capable of effectively preventing convolutional neural network from being overfit

Publications (2)

Publication Number Publication Date
CN104102919A CN104102919A (en) 2014-10-15
CN104102919B true CN104102919B (en) 2017-05-24

Family

ID=51671059

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410333924.3A Active CN104102919B (en) 2014-07-14 2014-07-14 Image classification method capable of effectively preventing convolutional neural network from being overfit

Country Status (1)

Country Link
CN (1) CN104102919B (en)

Families Citing this family (46)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10650508B2 (en) * 2014-12-03 2020-05-12 Kla-Tencor Corporation Automatic defect classification without sampling and feature selection
CN106156807B (en) * 2015-04-02 2020-06-02 华中科技大学 Convolutional Neural Network Model Training Method and Device
CN106056529B (en) * 2015-04-03 2020-06-02 阿里巴巴集团控股有限公司 Method and equipment for training convolutional neural network for picture recognition
CN104850836B (en) * 2015-05-15 2018-04-10 浙江大学 Insect automatic distinguishing method for image based on depth convolutional neural networks
US10614339B2 (en) 2015-07-29 2020-04-07 Nokia Technologies Oy Object detection with neural network
CN105117739A (en) * 2015-07-29 2015-12-02 南京信息工程大学 Clothes classifying method based on convolutional neural network
CN105117330B (en) * 2015-08-07 2018-04-03 百度在线网络技术(北京)有限公司 CNN code test methods and device
CN105184313B (en) * 2015-08-24 2019-04-19 小米科技有限责任公司 Disaggregated model construction method and device
CN106485259B (en) * 2015-08-26 2019-11-15 华东师范大学 An Image Classification Method Based on Highly Constrained and Highly Dispersed Principal Component Analysis Network
CN105426930B (en) * 2015-11-09 2018-11-02 国网冀北电力有限公司信息通信分公司 A kind of substation's attribute dividing method based on convolutional neural networks
CN105426908B (en) * 2015-11-09 2018-11-02 国网冀北电力有限公司信息通信分公司 A kind of substation's attributive classification method based on convolutional neural networks
CN105512681A (en) * 2015-12-07 2016-04-20 北京信息科技大学 Method and system for acquiring target category picture
CN106874296B (en) * 2015-12-14 2021-06-04 阿里巴巴集团控股有限公司 Method and device for identifying style of commodity
CN106875203A (en) * 2015-12-14 2017-06-20 阿里巴巴集团控股有限公司 A kind of method and device of the style information for determining commodity picture
CN106874924B (en) * 2015-12-14 2021-01-29 阿里巴巴集团控股有限公司 Picture style identification method and device
CN107220641B (en) * 2016-03-22 2020-06-26 华南理工大学 Multi-language text classification method based on deep learning
CN109074472B (en) * 2016-04-06 2020-12-18 北京市商汤科技开发有限公司 Method and system for person identification
CN107341547B (en) * 2016-04-29 2021-04-20 中科寒武纪科技股份有限公司 Apparatus and method for performing convolutional neural network training
CN107346448B (en) 2016-05-06 2021-12-21 富士通株式会社 Deep neural network-based recognition device, training device and method
CN105957086B (en) * 2016-05-09 2019-03-26 西北工业大学 A kind of method for detecting change of remote sensing image based on optimization neural network model
CN106023154B (en) * 2016-05-09 2019-03-29 西北工业大学 Multidate SAR image change detection based on binary channels convolutional neural networks
CN107622272A (en) * 2016-07-13 2018-01-23 华为技术有限公司 A kind of image classification method and device
CN106250931A (en) * 2016-08-03 2016-12-21 武汉大学 A kind of high-definition picture scene classification method based on random convolutional neural networks
CN106297297B (en) * 2016-11-03 2018-11-20 成都通甲优博科技有限责任公司 Traffic jam judging method based on deep learning
CN106709421B (en) * 2016-11-16 2020-03-31 广西师范大学 Cell image identification and classification method based on transform domain features and CNN
CN106682697B (en) * 2016-12-29 2020-04-14 华中科技大学 An end-to-end object detection method based on convolutional neural network
CN106686472B (en) * 2016-12-29 2019-04-26 华中科技大学 A method and system for generating high frame rate video based on deep learning
CN106778910B (en) * 2017-01-12 2020-06-16 张亮 Deep learning system and method based on local training
US10546242B2 (en) 2017-03-03 2020-01-28 General Electric Company Image analysis neural network systems
CN107229968B (en) * 2017-05-24 2021-06-29 北京小米移动软件有限公司 Gradient parameter determination method, gradient parameter determination device and computer-readable storage medium
CN107067043B (en) * 2017-05-25 2020-07-24 哈尔滨工业大学 Crop disease and insect pest detection method
CN107358176A (en) * 2017-06-26 2017-11-17 武汉大学 Sorting technique based on high score remote sensing image area information and convolutional neural networks
CN107316066B (en) * 2017-07-28 2021-01-01 北京工商大学 Image classification method and system based on multi-channel convolutional neural network
TWI647658B (en) * 2017-09-29 2019-01-11 樂達創意科技有限公司 Device, system and method for automatically identifying image features
CN109685756A (en) * 2017-10-16 2019-04-26 乐达创意科技有限公司 Image feature automatic identifier, system and method
CN110399929B (en) * 2017-11-01 2023-04-28 腾讯科技(深圳)有限公司 Fundus image classification method, fundus image classification apparatus, and computer-readable storage medium
CN108009638A (en) * 2017-11-23 2018-05-08 深圳市深网视界科技有限公司 A kind of training method of neural network model, electronic equipment and storage medium
CN108596206A (en) * 2018-03-21 2018-09-28 杭州电子科技大学 Texture image classification method based on multiple dimensioned multi-direction spatial coherence modeling
CN110147873B (en) * 2018-05-18 2020-02-18 中科寒武纪科技股份有限公司 Convolutional neural network processor and training method
CN109325514A (en) * 2018-08-02 2019-02-12 成都信息工程大学 Image Classification Method Based on Simple Learning Framework of Improved CNN
CN111274422A (en) * 2018-12-04 2020-06-12 北京嘀嘀无限科技发展有限公司 Model training method, image feature extraction method and device and electronic equipment
CN110033035A (en) * 2019-04-04 2019-07-19 武汉精立电子技术有限公司 A kind of AOI defect classification method and device based on intensified learning
CN110222733B (en) * 2019-05-17 2021-05-11 嘉迈科技(海南)有限公司 High-precision multi-order neural network classification method and system
CN110490842B (en) * 2019-07-22 2023-07-04 同济大学 A detection method for strip surface defects based on deep learning
CN110599496A (en) * 2019-07-30 2019-12-20 浙江工业大学 Sun shadow displacement positioning method based on deep learning
CN112182214B (en) * 2020-09-27 2024-03-19 中国建设银行股份有限公司 Data classification method, device, equipment and medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102622585A (en) * 2012-03-06 2012-08-01 同济大学 Back propagation (BP) neural network face recognition method based on local feature Gabor wavelets
CN103914711A (en) * 2014-03-26 2014-07-09 中国科学院计算技术研究所 Improved top speed learning model and method for classifying modes of improved top speed learning model

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7146050B2 (en) * 2002-07-19 2006-12-05 Intel Corporation Facial classification of static images using support vector machines

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102622585A (en) * 2012-03-06 2012-08-01 同济大学 Back propagation (BP) neural network face recognition method based on local feature Gabor wavelets
CN103914711A (en) * 2014-03-26 2014-07-09 中国科学院计算技术研究所 Improved top speed learning model and method for classifying modes of improved top speed learning model

Also Published As

Publication number Publication date
CN104102919A (en) 2014-10-15

Similar Documents

Publication Publication Date Title
CN104102919B (en) Image classification method capable of effectively preventing convolutional neural network from being overfit
CN110263705B (en) Two phases of high-resolution remote sensing image change detection system for the field of remote sensing technology
CN110929603B (en) A Weather Image Recognition Method Based on Lightweight Convolutional Neural Network
CN106485251B (en) Classification of egg embryos based on deep learning
CN106600577B (en) A Cell Counting Method Based on Deep Deconvolutional Neural Networks
WO2019233166A1 (en) Surface defect detection method and apparatus, and electronic device
CN110827213A (en) Super-resolution image restoration method based on generation type countermeasure network
CN111898432B (en) Pedestrian detection system and method based on improved YOLOv3 algorithm
CN111507990A (en) Tunnel surface defect segmentation method based on deep learning
CN106408562A (en) Fundus image retinal vessel segmentation method and system based on deep learning
CN113034483B (en) Tobacco defect detection method based on deep transfer learning
CN106529447A (en) Small-sample face recognition method
CN110533683B (en) A radiomics analysis method integrating traditional features and deep features
CN106228124A (en) SAR image object detection method based on convolutional neural networks
CN108510485A (en) It is a kind of based on convolutional neural networks without reference image method for evaluating quality
CN110287777B (en) Golden monkey body segmentation algorithm in natural scene
CN110543906B (en) Automatic skin recognition method based on Mask R-CNN model
CN108875696A (en) The Off-line Handwritten Chinese Recognition method of convolutional neural networks is separated based on depth
CN105117739A (en) Clothes classifying method based on convolutional neural network
CN106250931A (en) A kind of high-definition picture scene classification method based on random convolutional neural networks
CN110197205A (en) A kind of image-recognizing method of multiple features source residual error network
CN111161278B (en) Deep network aggregation-based fundus image focus segmentation method
CN111861906A (en) A virtual augmentation model for pavement crack images and a method for image augmentation
CN111489364A (en) Medical image segmentation method based on lightweight full convolution neural network
CN112233129B (en) Deep learning-based parallel multi-scale attention mechanism semantic segmentation method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20230517

Address after: Unit 1001, 369 Weining Road, Changning District, Shanghai, 200336 (9th floor of actual floor)

Patentee after: DEEPBLUE TECHNOLOGY (SHANGHAI) Co.,Ltd.

Address before: 200092 Siping Road 1239, Shanghai, Yangpu District

Patentee before: TONGJI University

TR01 Transfer of patent right