
CN104102919B - Image classification method capable of effectively preventing convolutional neural network from being overfit - Google Patents


Info

Publication number
CN104102919B
CN104102919B (granted from application CN201410333924.3A)
Authority
CN
China
Prior art keywords
image
training
convolutional neural
layer
neural network
Prior art date
Legal status
Active
Application number
CN201410333924.3A
Other languages
Chinese (zh)
Other versions
CN104102919A (en)
Inventor
王瀚漓
俞定君
Current Assignee
Deep Blue Technology Shanghai Co Ltd
Original Assignee
Tongji University
Priority date
Filing date
Publication date
Application filed by Tongji University filed Critical Tongji University
Priority to CN201410333924.3A priority Critical patent/CN104102919B/en
Publication of CN104102919A publication Critical patent/CN104102919A/en
Application granted granted Critical
Publication of CN104102919B publication Critical patent/CN104102919B/en


Landscapes

  • Image Analysis (AREA)

Abstract

The invention relates to an image classification method that effectively prevents over-fitting of a convolutional neural network, comprising: obtaining an image training set and an image test set; training a convolutional neural network model; and classifying the images of the test set with the trained model. Training the model comprises: preprocessing the image data of the training set and performing sample augmentation to form training samples; forward-propagating the training samples to extract image features; computing the classification probability of each sample in a Softmax classifier; computing the training error from the probabilities y_i; and back-propagating the training error layer by layer from the last layer of the network while modifying the network weight matrix W with stochastic gradient descent (SGD). Compared with the prior art, the invention offers high classification accuracy, fast convergence and high computational efficiency.

Description

An Image Classification Method for Effectively Preventing Overfitting of Convolutional Neural Networks

Technical field

The invention relates to the field of image processing, and in particular to an image classification method that effectively prevents over-fitting of a convolutional neural network.

Background art

With the wide application of multimedia technology and computer networks, a large amount of image data has appeared on the network. In order to manage these image files effectively and provide users with a better experience, automatically identifying the content of these images has become increasingly important.

With the continuous improvement and development of machine learning methods, deep learning algorithms have received more and more attention. The convolutional neural network is an important deep learning algorithm and has become a research hotspot in speech analysis and image recognition. Convolutional neural networks break with the fully connected neuron pattern between layers of traditional neural networks; their weight-sharing structure makes them more similar to biological neural networks, reduces the complexity of the network model and reduces the number of weights. This advantage is most apparent when the network input is an image: the image can be fed directly into the network, avoiding the complicated feature extraction and data reconstruction of traditional recognition algorithms. A convolutional network is a multi-layer perceptron specially designed to recognize two-dimensional shapes, and this network structure is highly invariant to translation, scaling, tilting and other forms of deformation.

Image classification based on convolutional neural networks can automatically and effectively extract feature information from images, and the extracted features have very good image representation ability, so the technique has achieved satisfactory experimental results on some image classification problems. Nevertheless, the technique currently has the following drawbacks:

First, because the labelled data in image databases is limited, the number of weights that must be trained keeps growing as the convolutional neural network grows in scale, which inevitably causes the network to over-fit, i.e. the classification accuracy during training is far better than the classification accuracy during testing.

Second, to obtain better feature representation and therefore better classification accuracy, some researchers increase the network depth and enlarge the network scale. However, this greatly increases the computational complexity, and traditional CPU computation can no longer keep up with it.

Summary of the invention

The object of the present invention is to overcome the above defects of the prior art and provide an image classification method that effectively prevents over-fitting of a convolutional neural network, with high classification accuracy, fast convergence and high computational efficiency.

The object of the present invention can be achieved by the following technical solution:

An image classification method that effectively prevents over-fitting of a convolutional neural network, the method running on a GPU and comprising:

Step 1: obtain an image training set and an image test set;

Step 2: train the convolutional neural network model, which specifically includes the following steps:

a) Set the structure of the convolutional neural network and the upper limit N of the number of training iterations, and initialize the network weight matrix W; the structure includes the number of layers of the convolutional neural network and the number of feature maps in each layer;

b) Obtain image data from the image training set, preprocess it, and perform sample augmentation to form training samples;

c) Forward-propagate the training samples to extract image features; the forward propagation includes the computations of the convolutional layer, the nonlinear normalization layer and the mixed pooling layer;

d) Compute the classification probability of each sample in the Softmax classifier:

y_i = exp(s_i) / Σ_{k=1}^{n} exp(s_k)

where s_i denotes the output value of the i-th neuron of the Softmax classifier, s_i = F·η, F is the image feature vector of a training sample, η is the corresponding weight, and n is the number of categories to be classified;

e) Compute the training error from the probabilities y_i:

δ_i = Σ_{k=1}^{n} (y_k*/y_k) · y_k · (θ_ik − y_i)

where θ_ik = 1 when i = k, i denotes the i-th category, and when the original input belongs to category i the i-th element of the expected output vector y* equals 1 (all other elements are 0);

f) Back-propagate the training error layer by layer from the last layer of the convolutional neural network towards the front, while modifying the network weight matrix W with stochastic gradient descent (SGD);

g) Judge whether the model training is complete; if so, save the convolutional neural network model and the Softmax classifier and execute Step 3; if not, return to step b);

Step 3: use the trained convolutional neural network model to classify the images of the image test set.

In a) of Step 2, the elements of the initial weight matrix W take values in [-0.01, 0.01].

Step b) of Step 2 specifically comprises:

b1) For images whose height and width are equal, scale them with the cvResize function of OPENCV; the scaled image size is N×N;

b2) For images whose height and width are unequal, keep the short side S fixed, crop the S consecutive pixels in the middle of the long side to form an S×S image, and then repeat step b1) to finally form an N×N image;

b3) Compute the sum of the pixel values of all images and divide by the number of images to obtain a mean image; subtract the mean image from every image to obtain the input samples;

b4) Perform data augmentation on the input samples to form the final training samples.

In c) of Step 2, the computation of the convolutional layer is specifically:

y_k = max{w_k * x, 0}

where x denotes the output of the previous layer, i.e. the input of the current layer; y_k denotes the output of the k-th feature map; w_k denotes the k-th weight matrix connected to the output of the previous layer; and "*" denotes the two-dimensional inner-product operation;

the computation of the nonlinear normalization layer is specifically:

y_kij = x_kij / (1 + (α/N) · Σ_{l=k−N/2}^{k+N/2} (x_lij)²)^β

where x_kij is the output of the k-th feature map of the previous layer; the accumulation runs over the same position (i, j) of the N feature maps adjacent to the k-th feature map; α and β are preset normalization parameters; and y_kij is the newly generated feature map;

the computation of the mixed pooling layer is specifically:

y_kij = λ · max_{(p,q)∈R_ij} x_kpq + (1 − λ) · (1/|R_ij|) · Σ_{(p,q)∈R_ij} x_kpq

where λ is a random parameter taking the value 0 or 1, x_kpq is the output of the k-th feature map of the previous layer, and R_ij is the region to be down-sampled.

In step g), the criterion for judging whether the model training is complete is: the upper limit of the number of training iterations has been reached.

Compared with the prior art, the present invention has the following advantages:

First, the present invention is the first to use mixed down-sampling (Mixed Pooling) when down-sampling in a convolutional neural network; it effectively prevents over-fitting of the neural network, ultimately improves the classification accuracy, and is robust.

Second, the present invention uses Graphics Processing Units (GPUs) to accelerate the computation, so that the convolutional neural network can compute and converge quickly when processing large amounts of data.

Third, the recognition accuracy of the present invention is better than that of mainstream algorithms on the CIFAR-10, CIFAR-100 and SVHN data sets, with higher computational efficiency.

Brief description of the drawings

Fig. 1 is a schematic diagram of the model training process of the present invention;

Fig. 2 is a schematic diagram of the image classification process of the present invention.

Detailed description of the embodiments

The present invention is described in detail below with reference to the accompanying drawings and a specific embodiment. The embodiment is implemented on the premise of the technical solution of the present invention, and a detailed implementation and a specific operation process are given, but the protection scope of the present invention is not limited to the following embodiment.

As shown in Fig. 1 and Fig. 2, in this image classification method that effectively prevents over-fitting of a convolutional neural network, an image set M is first obtained from an image database and divided into a training set M_t and a test set M_y; a convolutional neural network model is then built from the training set M_t; finally, the trained convolutional neural network model is used to classify the images of the test set M_y.

As shown in Fig. 1, the training of the convolutional neural network model specifically includes the following steps:

In step S101: set the structure of the convolutional neural network and the upper limit N of the number of training iterations, and initialize the network weight matrix W, specifically:

1a) The number of layers of the convolutional neural network and the number of feature maps in each layer are preset according to the scale of the problem. The experiment uses the structure Input1-Conv64-LRN64-Pooling64-Conv64-LRN64-Pooling64-Softmax (Input denotes the input layer; the number n following a layer name denotes the number of feature maps of that layer; Conv denotes a convolutional layer; LRN denotes a nonlinear normalization layer; Pooling denotes a down-sampling layer; Softmax denotes the final classifier layer);

1b) Connect the layers and initialize the network weight matrix W; the initialization method is, for each element of W, to randomly generate a floating-point number in [-0.01, 0.01] and assign it.
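As an illustration of step 1b), the following Python sketch (not part of the patent; NumPy-based, and the layer shapes shown are hypothetical, since the patent does not give kernel sizes) draws every element of W uniformly from [-0.01, 0.01]:

```python
import numpy as np

def init_weights(layer_shapes, low=-0.01, high=0.01, rng=np.random):
    """Create one weight matrix per connection and fill every element with a
    random floating-point number in [low, high], as described in step 1b)."""
    return [rng.uniform(low, high, size=shape).astype(np.float32)
            for shape in layer_shapes]

# Hypothetical shapes for the two Conv64 stages of the Input1-Conv64-...-Softmax structure.
W = init_weights([(64, 3, 5, 5), (64, 64, 5, 5)])
```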

In step S102: obtain image data from the training set M_t, and preprocess and augment the obtained image data:

2a) For images whose height and width are equal, scale them directly with the cvResize function of OPENCV so that the scaled image is N×N pixels; in this embodiment, N = 32;

2b) For images whose height and width are unequal, keep the short side S fixed and crop the S pixels in the middle of the long side, forming an S×S image; then repeat step 2a) to finally form a 32×32 image;

2c) Use all images to compute the sum of the pixel values at each position and divide by the number of images to obtain a mean image; finally subtract the mean image from every image to obtain the input samples;

2d) Perform data augmentation on every 32×32 input sample: regard the sample as a 32×32 matrix and crop the upper-left 24×24 elements, the lower-left 24×24 elements, the upper-right 24×24 elements, the lower-right 24×24 elements and the central 24×24 elements as new input samples; then flip these 5 new input samples horizontally to form another 5 new input samples. In this way each original 32×32 sample yields 10 new 24×24 samples after data augmentation. A sketch of this preprocessing and augmentation is given below.
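The following Python sketch of steps 2a) to 2d) is not part of the patent: it uses NumPy arrays, and the modern OpenCV binding cv2.resize stands in for the cvResize call mentioned in the text.

```python
import numpy as np
import cv2  # cv2.resize stands in for the OPENCV cvResize function in the text

def preprocess(images, n=32):
    """2a)-2c): centre-crop each image to a square, resize it to n x n,
    then subtract the mean image computed over the whole set."""
    squares = []
    for img in images:
        h, w = img.shape[:2]
        s = min(h, w)                                # keep the short side S fixed
        top, left = (h - s) // 2, (w - s) // 2
        square = img[top:top + s, left:left + s]     # middle S pixels of the long side
        squares.append(cv2.resize(square, (n, n)).astype(np.float32))
    stack = np.stack(squares)
    mean_image = stack.mean(axis=0)                  # per-position mean over all images
    return stack - mean_image

def augment(sample, crop=24):
    """2d): cut the four 24 x 24 corners and the 24 x 24 centre out of a
    32 x 32 sample, then add the horizontal flip of each, giving 10 samples."""
    n = sample.shape[0]
    off = n - crop                                   # 8 when n = 32 and crop = 24
    origins = [(0, 0), (0, off), (off, 0), (off, off), (off // 2, off // 2)]
    crops = [sample[y:y + crop, x:x + crop] for y, x in origins]
    return crops + [np.fliplr(c) for c in crops]
```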

In step S103: use the new samples obtained in step S102 as the input layer Input1 of the convolutional neural network and perform forward propagation to extract image features. The forward propagation goes through Conv64-LRN64-Pooling64-Conv64-LRN64-Pooling64, i.e. the features are extracted by two stages of convolution. The specific steps are as follows:

3a) Convolutional layer (Conv) computation:

y_k = max{w_k * x, 0}

where x denotes the output of the previous layer (i.e. the input of the current layer): in the first Conv64 layer it is the output of the Input1 layer, and in the second Conv64 layer it is the output of the first Pooling64 layer; y_k denotes the output of the k-th feature map of the Conv64 layer; w_k denotes the k-th weight matrix connected to the output of the previous layer; and "*" denotes the two-dimensional inner-product operation.
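A sketch of the computation in 3a) is shown below. It is not taken from the patent; it uses SciPy's correlate2d for the two-dimensional inner product, and summing the per-channel responses before the max{·, 0} truncation is an assumption.

```python
import numpy as np
from scipy.signal import correlate2d

def conv_layer(x, weights):
    """x: (in_maps, H, W) output of the previous layer;
    weights: (out_maps, in_maps, kh, kw).
    Implements y_k = max{w_k * x, 0} with a valid 2-D correlation."""
    out = []
    for w_k in weights:
        acc = sum(correlate2d(x[c], w_k[c], mode="valid")
                  for c in range(x.shape[0]))
        out.append(np.maximum(acc, 0.0))   # the max{., 0} non-linearity
    return np.stack(out)                   # one output map per w_k
```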

3b) Nonlinear normalization (LRN) computation:

y_kij = x_kij / (1 + (α/N) · Σ_{l=k−N/2}^{k+N/2} (x_lij)²)^β

where x_kij is the output of the k-th feature map of the Conv64 layer (i.e. the k-th output component of the previous layer); the accumulation runs over the same position (i, j) of the N = 5 feature maps adjacent to the k-th feature map; α = 0.001 and β = 0.75 are the preset normalization parameters; the newly generated feature map of the LRN layer is denoted y_kij.
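The LRN computation in 3b) could be sketched in Python as follows (not from the patent; maps at the edges simply use fewer neighbours, an assumption the text does not settle):

```python
import numpy as np

def lrn(x, n=5, alpha=0.001, beta=0.75):
    """Local response normalisation across feature maps.
    x: (num_maps, H, W); every position (i, j) of map k is divided by
    (1 + alpha/n * sum of squares over the n maps centred on k) ** beta."""
    num_maps = x.shape[0]
    half = n // 2
    y = np.empty_like(x)
    for k in range(num_maps):
        lo, hi = max(0, k - half), min(num_maps, k + half + 1)
        denom = (1.0 + (alpha / n) * np.sum(x[lo:hi] ** 2, axis=0)) ** beta
        y[k] = x[k] / denom
    return y
```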

3c) Mixed pooling layer computation:

y_kij = λ · max_{(p,q)∈R_ij} x_kpq + (1 − λ) · (1/|R_ij|) · Σ_{(p,q)∈R_ij} x_kpq

where λ is a random parameter taking the value 0 or 1, x_kpq is the output of the k-th feature map of the preceding LRN64 layer (i.e. the k-th output component of the previous layer), and R_ij is the region to be down-sampled; the selected down-sampling region is 3×3.
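Below is a sketch of the mixed pooling rule in 3c). It is not from the patent; it assumes non-overlapping 3×3 regions (the text fixes only the region size, not the stride) and draws a fresh λ for every region.

```python
import numpy as np

def mixed_pooling(x, size=3, rng=np.random):
    """Mixed pooling of one feature map x (H x W): for every region R_ij a
    Bernoulli draw lambda in {0, 1} picks either the max or the mean of R_ij."""
    h, w = x.shape
    out_h, out_w = h // size, w // size
    y = np.empty((out_h, out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            region = x[i * size:(i + 1) * size, j * size:(j + 1) * size]
            lam = rng.randint(0, 2)        # lambda is 0 or 1
            y[i, j] = lam * region.max() + (1 - lam) * region.mean()
    return y
```

When λ = 1 the rule degenerates to traditional max down-sampling and when λ = 0 to traditional mean down-sampling, which is why the comparison tables below use those two methods as the baselines.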

The above three computations are executed in turn until all convolution stages are completed.

In step S104: use the image feature vector F obtained in step S103 to compute, in the Softmax classifier, the probability y_i that the sample belongs to each class:

y_i = exp(s_i) / Σ_{k=1}^{n} exp(s_k)

where s_i is the output value of the i-th neuron of the Softmax classifier, obtained as the dot product of the image feature vector F with the corresponding weights, i.e. s_i = F·η, where η is the corresponding weight and n is the number of categories to be classified. Assuming the images to be classified fall into n classes, an actual output vector Y = {y_1, y_2, …, y_n} is finally formed. If an image belongs to the i-th class, its expected output vector has its i-th element equal to 1 and all remaining elements equal to 0. From the actual and expected output vectors the error vector δ of the sample can be computed; the i-th training error δ_i is computed as:

δ_i = Σ_{k=1}^{n} (y_k*/y_k) · y_k · (θ_ik − y_i)

where θ_ik = 1 when i = k, i denotes the i-th category, and when the original input belongs to category i the i-th element of the expected output vector y* equals 1.
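As a minimal sketch of step S104 (not from the patent; NumPy-based, with η stored as an n × |F| weight matrix), the probabilities and the per-class error could be computed as:

```python
import numpy as np

def softmax_probs(f, eta):
    """s_i = F . eta_i, then y_i = exp(s_i) / sum_k exp(s_k)."""
    s = eta @ f
    e = np.exp(s - s.max())                # shifted for numerical stability
    return e / e.sum()

def error_vector(y, true_class):
    """Expected output Y* is one-hot at the true class; the sketch uses the
    cross-entropy form delta = Y* - Y, which is what the delta_i formula
    above reduces to when Y* is one-hot."""
    y_star = np.zeros_like(y)
    y_star[true_class] = 1.0
    return y_star - y
```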

In step S105: back-propagate the training error from the last layer of the convolutional neural network layer by layer towards the front, i.e. starting from the Softmax layer the error is propagated in turn to the pooling layers, the nonlinear normalization layers and the convolutional layers, while the network weight matrix W is modified with stochastic gradient descent (SGD).
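Step S105 ends with the SGD weight update. A minimal sketch follows (not from the patent; the learning rate and the absence of momentum or weight decay are assumptions):

```python
def sgd_step(weights, gradients, lr=0.01):
    """One stochastic-gradient-descent update of every weight matrix W,
    using the gradients produced by the back-propagation pass."""
    for w, g in zip(weights, gradients):
        w -= lr * g                        # in-place update of W
    return weights
```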

In step S106: after the back-propagation is completed, judge whether the number of training iterations has reached the upper limit N set in step S101; if it has, stop training and save the model; if not, return to step S102 and continue training.

In step S107: save the trained model and the Softmax classifier.

As shown in Fig. 2, the specific steps of classifying the images of the test set M_y with the trained convolutional neural network model are:

In step S201: extract test samples from the test set M_y, preprocess and augment them, and use the obtained data as the network input;

In step S202: perform the computations of the convolutional layers, the nonlinear normalization layers and the mixed pooling layers in turn, as in step S103, until all convolution stages are completed;

In step S203: obtain the feature vector of the test sample;

In step S204: use the feature vector and the Softmax classifier to compute the probability y_i that the sample belongs to each class, find the largest element of {y_1, y_2, …, y_n}, and, assuming that element is y_j, assign the sample to class j. A sketch of this final classification step is given below.
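This sketch of step S204 is not from the patent; it reuses the Softmax form above and assumes a single feature vector per test sample.

```python
import numpy as np

def classify(feature_vector, eta):
    """Compute the class probabilities with the saved Softmax classifier
    and return the index j of the largest probability y_j."""
    s = eta @ feature_vector
    y = np.exp(s - s.max())
    y /= y.sum()
    return int(np.argmax(y))               # predicted class j
```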

To verify the performance of the present invention, experiments were carried out in this embodiment on three public data sets (CIFAR-10, CIFAR-100 and SVHN), comparing a convolutional neural network that uses mixed down-sampling with convolutional neural networks that use only ordinary down-sampling. All experiments were trained and tested according to the experimental protocol of the corresponding data set. The comparisons in Tables 1, 2 and 3 show that mixed down-sampling still obtains a better test error than the traditional down-sampling methods even when its training error is relatively large, which fully demonstrates that mixed down-sampling plays an important role in preventing over-fitting of the convolutional neural network. On the above three data sets, the test errors of the present invention are 10.80%, 38.07% and 3.01% respectively. The experimental results outperform the currently published mainstream algorithms and show a high recognition rate.

The above are only preferred embodiments of the present invention and are not intended to limit the present invention. The present invention also includes technical solutions composed of any combination of the above technical features.

Table 1 Experimental data on the CIFAR-10 data set

Down-sampling method | Training error | Test error
Traditional max down-sampling | 3.01% | 11.36%
Traditional mean down-sampling | 4.52% | 13.75%
Mixed down-sampling (Mix Pooling) | 6.25% | 10.80%

Table 2 Experimental data on the CIFAR-100 data set

Down-sampling method | Training error | Test error
Traditional max down-sampling | 5.42% | 40.09%
Traditional mean down-sampling | 14.61% | 44.01%
Mixed down-sampling (Mix Pooling) | 25.71% | 38.07%

Table 3 Experimental data on the SVHN data set

Claims (5)

1. An image classification method that effectively prevents over-fitting of a convolutional neural network, characterised in that the method runs in a GPU and comprises:
Step 1: obtaining an image training set and an image test set;
Step 2: training a convolutional neural network model, specifically comprising the following steps:
a) setting the structure of the convolutional neural network and an upper limit N of the number of training iterations, and initializing the network weight matrix W, the structure including the number of layers of the convolutional neural network and the number of feature maps in each layer;
b) obtaining image data from the image training set, preprocessing it, and performing sample augmentation to form training samples;
c) forward-propagating the training samples to extract image features, the forward propagation including computations of a convolutional layer, a nonlinear normalization layer and a mixed pooling layer;
d) computing the classification probability of each sample in a Softmax classifier:
y_i = exp(s_i) / Σ_{k=1}^{n} exp(s_k)
where s_i denotes the output value of the i-th neuron of the Softmax classifier, s_i = F·η, F is the image feature vector of a training sample, η is the corresponding weight, and n is the number of categories to be classified;
e) computing the training error from the probabilities y_i:
δ_i = Σ_{k=1}^{n} (y_k*/y_k) · y_k · (θ_ik − y_i)
where θ_ik = 1 when i = k, i denotes the i-th category, and when the original input belongs to category i the i-th element of the expected output vector y* equals 1;
f) back-propagating the training error layer by layer from the last layer of the convolutional neural network towards the front, while modifying the network weight matrix W with stochastic gradient descent (SGD);
g) judging whether the model training is complete; if so, saving the convolutional neural network model and the Softmax classifier and executing Step 3; if not, returning to step b);
Step 3: performing image classification on the image test set with the trained convolutional neural network model.
2. The image classification method that effectively prevents over-fitting of a convolutional neural network according to claim 1, characterised in that in a) of Step 2 the elements of the initial weight matrix W take values in [-0.01, 0.01].
3. The image classification method that effectively prevents over-fitting of a convolutional neural network according to claim 1, characterised in that b) of Step 2 specifically comprises:
b1) for images whose height and width are equal, scaling them with the cvResize function of OPENCV, the scaled image size being N×N;
b2) for images whose height and width are unequal, keeping the short side S fixed, cropping the S consecutive pixels in the middle of the long side to form an S×S image, and repeating step b1) to finally form an N×N image;
b3) computing the sum of the pixel values of all images and dividing by the number of images to obtain a mean image, and subtracting the mean image from every image to obtain input samples;
b4) performing data augmentation on the input samples to form the final training samples.
4. The image classification method that effectively prevents over-fitting of a convolutional neural network according to claim 1, characterised in that in c) of Step 2 the computation of the convolutional layer is specifically:
y_k = max{w_k * x, 0}
where x denotes the output of the previous layer, i.e. the input of the current layer, y_k denotes the output of the k-th feature map, w_k denotes the k-th weight matrix connected to the output of the previous layer, and "*" denotes the two-dimensional inner-product operation;
the computation of the nonlinear normalization layer is specifically:
y_kij = x_kij / (1 + (α/N) · Σ_{l=k−N/2}^{k+N/2} (x_lij)²)^β
where x_kij is the output of the k-th feature map of the previous layer, the accumulation runs over the same position (i, j) of the N feature maps adjacent to the k-th feature map, α and β are preset normalization parameters, and y_kij is the newly generated feature map;
the computation of the mixed pooling layer is specifically:
y_kij = λ · max_{(p,q)∈R_ij} x_kpq + (1 − λ) · (1/|R_ij|) · Σ_{(p,q)∈R_ij} x_kpq
where λ is a random parameter taking the value 0 or 1, x_kpq is the output of the k-th feature map of the previous layer, and R_ij is the region to be down-sampled.
5. The image classification method that effectively prevents over-fitting of a convolutional neural network according to claim 1, characterised in that in step g) the criterion for judging whether the model training is complete is: the upper limit of the number of training iterations has been reached.
CN201410333924.3A 2014-07-14 2014-07-14 Image classification method capable of effectively preventing convolutional neural network from being overfit Active CN104102919B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410333924.3A CN104102919B (en) 2014-07-14 2014-07-14 Image classification method capable of effectively preventing convolutional neural network from being overfit

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410333924.3A CN104102919B (en) 2014-07-14 2014-07-14 Image classification method capable of effectively preventing convolutional neural network from being overfit

Publications (2)

Publication Number Publication Date
CN104102919A CN104102919A (en) 2014-10-15
CN104102919B true CN104102919B (en) 2017-05-24

Family

ID=51671059

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410333924.3A Active CN104102919B (en) 2014-07-14 2014-07-14 Image classification method capable of effectively preventing convolutional neural network from being overfit

Country Status (1)

Country Link
CN (1) CN104102919B (en)

Families Citing this family (46)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10650508B2 (en) * 2014-12-03 2020-05-12 Kla-Tencor Corporation Automatic defect classification without sampling and feature selection
CN106156807B (en) * 2015-04-02 2020-06-02 华中科技大学 Convolutional Neural Network Model Training Method and Device
CN106056529B (en) * 2015-04-03 2020-06-02 阿里巴巴集团控股有限公司 Method and equipment for training convolutional neural network for picture recognition
CN104850836B (en) * 2015-05-15 2018-04-10 浙江大学 Insect automatic distinguishing method for image based on depth convolutional neural networks
US10614339B2 (en) 2015-07-29 2020-04-07 Nokia Technologies Oy Object detection with neural network
CN105117739A (en) * 2015-07-29 2015-12-02 南京信息工程大学 Clothes classifying method based on convolutional neural network
CN105117330B (en) * 2015-08-07 2018-04-03 百度在线网络技术(北京)有限公司 CNN code test methods and device
CN105184313B (en) * 2015-08-24 2019-04-19 小米科技有限责任公司 Disaggregated model construction method and device
CN106485259B (en) * 2015-08-26 2019-11-15 华东师范大学 An Image Classification Method Based on Highly Constrained and Highly Dispersed Principal Component Analysis Network
CN105426930B (en) * 2015-11-09 2018-11-02 国网冀北电力有限公司信息通信分公司 A kind of substation's attribute dividing method based on convolutional neural networks
CN105426908B (en) * 2015-11-09 2018-11-02 国网冀北电力有限公司信息通信分公司 A kind of substation's attributive classification method based on convolutional neural networks
CN105512681A (en) * 2015-12-07 2016-04-20 北京信息科技大学 Method and system for acquiring target category picture
CN106874296B (en) * 2015-12-14 2021-06-04 阿里巴巴集团控股有限公司 Method and device for identifying style of commodity
CN106875203A (en) * 2015-12-14 2017-06-20 阿里巴巴集团控股有限公司 A kind of method and device of the style information for determining commodity picture
CN106874924B (en) * 2015-12-14 2021-01-29 阿里巴巴集团控股有限公司 Picture style identification method and device
CN107220641B (en) * 2016-03-22 2020-06-26 华南理工大学 Multi-language text classification method based on deep learning
CN109074472B (en) * 2016-04-06 2020-12-18 北京市商汤科技开发有限公司 Method and system for person identification
CN107341547B (en) * 2016-04-29 2021-04-20 中科寒武纪科技股份有限公司 Apparatus and method for performing convolutional neural network training
CN107346448B (en) 2016-05-06 2021-12-21 富士通株式会社 Deep neural network-based recognition device, training device and method
CN105957086B (en) * 2016-05-09 2019-03-26 西北工业大学 A kind of method for detecting change of remote sensing image based on optimization neural network model
CN106023154B (en) * 2016-05-09 2019-03-29 西北工业大学 Multidate SAR image change detection based on binary channels convolutional neural networks
CN107622272A (en) * 2016-07-13 2018-01-23 华为技术有限公司 A kind of image classification method and device
CN106250931A (en) * 2016-08-03 2016-12-21 武汉大学 A kind of high-definition picture scene classification method based on random convolutional neural networks
CN106297297B (en) * 2016-11-03 2018-11-20 成都通甲优博科技有限责任公司 Traffic jam judging method based on deep learning
CN106709421B (en) * 2016-11-16 2020-03-31 广西师范大学 Cell image identification and classification method based on transform domain features and CNN
CN106682697B (en) * 2016-12-29 2020-04-14 华中科技大学 An end-to-end object detection method based on convolutional neural network
CN106686472B (en) * 2016-12-29 2019-04-26 华中科技大学 A method and system for generating high frame rate video based on deep learning
CN106778910B (en) * 2017-01-12 2020-06-16 张亮 Deep learning system and method based on local training
US10546242B2 (en) 2017-03-03 2020-01-28 General Electric Company Image analysis neural network systems
CN107229968B (en) * 2017-05-24 2021-06-29 北京小米移动软件有限公司 Gradient parameter determination method, gradient parameter determination device and computer-readable storage medium
CN107067043B (en) * 2017-05-25 2020-07-24 哈尔滨工业大学 Crop disease and insect pest detection method
CN107358176A (en) * 2017-06-26 2017-11-17 武汉大学 Sorting technique based on high score remote sensing image area information and convolutional neural networks
CN107316066B (en) * 2017-07-28 2021-01-01 北京工商大学 Image classification method and system based on multi-channel convolutional neural network
TWI647658B (en) * 2017-09-29 2019-01-11 樂達創意科技有限公司 Device, system and method for automatically identifying image features
CN109685756A (en) * 2017-10-16 2019-04-26 乐达创意科技有限公司 Image feature automatic identifier, system and method
CN110399929B (en) * 2017-11-01 2023-04-28 腾讯科技(深圳)有限公司 Fundus image classification method, fundus image classification apparatus, and computer-readable storage medium
CN108009638A (en) * 2017-11-23 2018-05-08 深圳市深网视界科技有限公司 A kind of training method of neural network model, electronic equipment and storage medium
CN108596206A (en) * 2018-03-21 2018-09-28 杭州电子科技大学 Texture image classification method based on multiple dimensioned multi-direction spatial coherence modeling
CN110147873B (en) * 2018-05-18 2020-02-18 中科寒武纪科技股份有限公司 Convolutional neural network processor and training method
CN109325514A (en) * 2018-08-02 2019-02-12 成都信息工程大学 Image Classification Method Based on Simple Learning Framework of Improved CNN
CN111274422A (en) * 2018-12-04 2020-06-12 北京嘀嘀无限科技发展有限公司 Model training method, image feature extraction method and device and electronic equipment
CN110033035A (en) * 2019-04-04 2019-07-19 武汉精立电子技术有限公司 A kind of AOI defect classification method and device based on intensified learning
CN110222733B (en) * 2019-05-17 2021-05-11 嘉迈科技(海南)有限公司 High-precision multi-order neural network classification method and system
CN110490842B (en) * 2019-07-22 2023-07-04 同济大学 A detection method for strip surface defects based on deep learning
CN110599496A (en) * 2019-07-30 2019-12-20 浙江工业大学 Sun shadow displacement positioning method based on deep learning
CN112182214B (en) * 2020-09-27 2024-03-19 中国建设银行股份有限公司 Data classification method, device, equipment and medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102622585A (en) * 2012-03-06 2012-08-01 同济大学 Back propagation (BP) neural network face recognition method based on local feature Gabor wavelets
CN103914711A (en) * 2014-03-26 2014-07-09 中国科学院计算技术研究所 Improved top speed learning model and method for classifying modes of improved top speed learning model

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7146050B2 (en) * 2002-07-19 2006-12-05 Intel Corporation Facial classification of static images using support vector machines

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102622585A (en) * 2012-03-06 2012-08-01 同济大学 Back propagation (BP) neural network face recognition method based on local feature Gabor wavelets
CN103914711A (en) * 2014-03-26 2014-07-09 中国科学院计算技术研究所 Improved top speed learning model and method for classifying modes of improved top speed learning model

Also Published As

Publication number Publication date
CN104102919A (en) 2014-10-15

Similar Documents

Publication Publication Date Title
CN104102919B (en) Image classification method capable of effectively preventing convolutional neural network from being overfit
CN110263705B (en) Two phases of high-resolution remote sensing image change detection system for the field of remote sensing technology
CN110929603B (en) A Weather Image Recognition Method Based on Lightweight Convolutional Neural Network
CN106485251B (en) Classification of egg embryos based on deep learning
CN106600577B (en) A Cell Counting Method Based on Deep Deconvolutional Neural Networks
WO2019233166A1 (en) Surface defect detection method and apparatus, and electronic device
CN110827213A (en) Super-resolution image restoration method based on generation type countermeasure network
CN111898432B (en) Pedestrian detection system and method based on improved YOLOv3 algorithm
CN111507990A (en) Tunnel surface defect segmentation method based on deep learning
CN106408562A (en) Fundus image retinal vessel segmentation method and system based on deep learning
CN113034483B (en) Tobacco defect detection method based on deep transfer learning
CN106529447A (en) Small-sample face recognition method
CN110533683B (en) A radiomics analysis method integrating traditional features and deep features
CN106228124A (en) SAR image object detection method based on convolutional neural networks
CN108510485A (en) It is a kind of based on convolutional neural networks without reference image method for evaluating quality
CN110287777B (en) Golden monkey body segmentation algorithm in natural scene
CN110543906B (en) Automatic skin recognition method based on Mask R-CNN model
CN108875696A (en) The Off-line Handwritten Chinese Recognition method of convolutional neural networks is separated based on depth
CN105117739A (en) Clothes classifying method based on convolutional neural network
CN106250931A (en) A kind of high-definition picture scene classification method based on random convolutional neural networks
CN110197205A (en) A kind of image-recognizing method of multiple features source residual error network
CN111161278B (en) Deep network aggregation-based fundus image focus segmentation method
CN111861906A (en) A virtual augmentation model for pavement crack images and a method for image augmentation
CN111489364A (en) Medical image segmentation method based on lightweight full convolution neural network
CN112233129B (en) Deep learning-based parallel multi-scale attention mechanism semantic segmentation method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20230517

Address after: Unit 1001, 369 Weining Road, Changning District, Shanghai, 200336 (9th floor of actual floor)

Patentee after: DEEPBLUE TECHNOLOGY (SHANGHAI) Co.,Ltd.

Address before: 200092 Siping Road 1239, Shanghai, Yangpu District

Patentee before: TONGJI University

TR01 Transfer of patent right