
CN110110845B - Learning method based on parallel multi-level width neural network

Info

Publication number
CN110110845B
Authority
CN
China
Prior art keywords
neural network
level
test
sample set
width
Prior art date
Legal status
Active
Application number
CN201910331708.8A
Other languages
Chinese (zh)
Other versions
CN110110845A (en)
Inventor
席江波
房建武
吴田军
康梦华
Current Assignee
Chang'an University
Original Assignee
Chang'an University
Priority date
Filing date
Publication date
Application filed by Chang'an University
Priority to CN201910331708.8A
Publication of CN110110845A
Application granted
Publication of CN110110845B

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The invention discloses a learning method based on a parallel multi-level wide neural network, comprising the following steps: obtaining validation sets and constructing base classifiers; training and validating each level of the parallel M-level wide neural network to obtain the trained network and the validation output corresponding to each level; obtaining the decision threshold of each level by statistical calculation; and testing the validated parallel multi-level wide neural network on a test set. The neural network of the invention has a multi-level structure in which each level learns a different part of the data, and both training and testing can be parallelized. Each level uses a wide neural network for feature learning in the width direction; by connecting multiple wide neural networks again in the width direction as base classifiers, classifier ensembling in two width directions is achieved; incremental learning is achieved by adding a new level of wide neural network; and testing can be parallelized.

Description

A learning method based on a parallel multi-level wide neural network

Technical Field

The invention belongs to the technical field of artificial intelligence and machine learning, and in particular relates to a learning method based on a parallel multi-level wide neural network.

Background Art

With the great success of learning models dominated by deep networks in large-scale image processing, machine vision, and related fields, the complexity of these models has grown rapidly. They require large amounts of high-dimensional data for training, which greatly increases the required computing resources and computing time. Moreover, real data are often not homogeneous: some samples are very easy to classify, while many others are difficult. Most classification errors occur on hard inputs, such as samples from imbalanced distributions, abnormally acquired samples, and samples near the decision boundary or that are linearly inseparable.

In existing deep learning models, simple and complex samples are processed in the same way, which lowers the efficiency of computing-resource usage. Moreover, existing deep networks such as convolutional neural networks often have many layers, and every sample must pass through all of them, which makes generalizing or testing the network very time-consuming. Early parallel multi-level self-organizing networks, by contrast, let each level receive only the nonlinearly transformed samples rejected by the previous level; these samples are mapped into other spaces where they are easier to classify and are then classified again. However, the problem of how to adjust and allocate computing resources across high-dimensional data samples of different difficulty, so as to improve the speed and efficiency of learning and classification, has not been well solved.

Summary of the Invention

To address the above defects, the invention provides a learning method based on a parallel multi-level wide neural network. The network has a multi-level structure in which each level learns a different part of the data, and both training and testing can be parallelized. Each level uses a wide neural network for feature learning in the width direction; by connecting multiple wide neural networks again in the width direction as base classifiers, classifier ensembling in two width directions is achieved; incremental learning is achieved by adding a new level of wide neural network; and testing can be parallelized, which greatly shortens the learning and classification time for complex samples and improves the network's operating efficiency.

To achieve the above object, the invention adopts the following technical solutions.

A learning method based on a parallel multi-level wide neural network. The parallel multi-level wide neural network comprises multiple levels of wide neural networks, where each level contains an input layer, a hidden layer, a decision layer, and an output layer connected in sequence; the decision layer determines whether each test sample is output by the current level. The learning method comprises the following steps:

Step 1: obtain the original training sample set and build a parallel M-level wide neural network Net_1, …, Net_m, …, Net_M (m = 1, 2, …, M), with each level's wide neural network serving as the base classifier of that level. Apply M data transformations to the original training sample set to obtain M corresponding validation sets x_v_1, …, x_v_m, …, x_v_M;

where the total number of samples in the original training sample set is N_tr.

Step 2: use the original training sample set and the M validation sets x_v_1, …, x_v_m, …, x_v_M to train and validate each level of the parallel M-level wide neural network, obtaining the trained network and the validation output y_v_m (m = 1, 2, …, M) of each level. Use the minimum-error method to obtain the label y_v_ind_m corresponding to each validation output y_v_m, and from it the correctly classified sample set y_vc_m and the misclassified sample set y_vw_m of each trained level's validation set;

Step 3: perform statistical calculations on the correctly classified sample set y_vc_m and the misclassified sample set y_vw_m of each trained level's validation set to obtain the decision threshold T_m of each trained level. Use T_m as the decision basis of the corresponding level, giving a parallel M-level wide neural network with determined decision thresholds;

Step 4: obtain a test set and feed it in parallel, as input data, to every threshold-determined level for testing, obtaining each level's output. Obtain each level's error vector, judge each level's output against its threshold, and thereby obtain the label y_test_ind_m corresponding to each level's test output.

The features and further improvements of the technical solution of the invention are as follows:

(1) In step 1, the data transformation compresses or deforms the samples of the original sample set by elastic transformation (Elastic), or rotates, flips, enlarges, or shrinks them by affine transformation (Affine).
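As a rough illustration (not part of the patent), the following Python sketch uses NumPy and SciPy to produce an elastically deformed copy and an affine-transformed copy of a 28 × 28 image sample; all function names and parameter values are illustrative assumptions:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates, rotate

def elastic_transform(image, alpha=34.0, sigma=4.0, rng=None):
    """Elastic deformation: displace each pixel along a smoothed random field."""
    rng = np.random.default_rng() if rng is None else rng
    dx = gaussian_filter(rng.uniform(-1, 1, image.shape), sigma) * alpha
    dy = gaussian_filter(rng.uniform(-1, 1, image.shape), sigma) * alpha
    rows, cols = np.meshgrid(np.arange(image.shape[0]),
                             np.arange(image.shape[1]), indexing="ij")
    return map_coordinates(image, [rows + dy, cols + dx],
                           order=1, mode="reflect")

def affine_transform(image, angle=10.0, flip=False):
    """Affine variant: small rotation, optionally followed by a horizontal flip."""
    out = rotate(image, angle, reshape=False, mode="nearest")
    return np.fliplr(out) if flip else out
```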

(2) In step 2, training and validating each level of the parallel M-level wide neural network with the original training sample set and the M validation sets x_v_1, …, x_v_m, …, x_v_M comprises the following sub-steps:

Sub-step 2.1: use the original training sample set as the input samples of the level-1 wide neural network Net_1 and train Net_1 to obtain the trained level-1 wide neural network.

Sub-step 2.2: validate the trained level-1 wide neural network with the first validation set x_v_1 to obtain the misclassified sample set y_vw_1 of the level-1 validation set.

Sub-step 2.3: use the misclassified sample set y_vw_1 of the level-1 network as input samples A_v_1 of the level-2 wide neural network; then randomly draw a training sample set A_v_2 from the original training sample set so that the total input sample set {A_v_1 + A_v_2} has as many samples as the original training sample set, and use {A_v_1 + A_v_2} as the input samples of the level-2 network.

Sub-step 2.4: train the level-2 wide neural network with the total input sample set {A_v_1 + A_v_2}; validate the trained level-2 network with the second validation set x_v_2 to obtain the misclassified sample set y_vw_2 of the level-2 validation set.

Proceeding in the same way, train the level-3 through level-M wide neural networks to obtain the trained parallel M-level wide neural network and the corresponding validation output y_v_m (m = 1, 2, …, M) of each level.
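A minimal sketch of this cascade training loop (sub-steps 2.1 to 2.4), assuming hypothetical helpers `train(X, y)`, which fits and returns one base classifier, and `misclassified(net, Xv, yv)`, which returns the validation samples that the level classifies wrongly:

```python
import numpy as np

def train_cascade(X_train, y_train, val_sets, M, train, misclassified, rng=None):
    """Train M levels: each level is fit on the previous level's rejected
    validation samples plus a random refill from the original training set,
    so every level trains on roughly N_tr samples."""
    rng = np.random.default_rng() if rng is None else rng
    n_tr = len(X_train)
    nets, X_cur, y_cur = [], X_train, y_train
    for m in range(M):
        net = train(X_cur, y_cur)                    # fit level m
        nets.append(net)
        X_hard, y_hard = misclassified(net, *val_sets[m])
        k = max(n_tr - len(X_hard), 0)               # refill to N_tr samples
        idx = rng.choice(n_tr, size=k, replace=False)
        X_cur = np.concatenate([X_hard, X_train[idx]])
        y_cur = np.concatenate([y_hard, y_train[idx]])
    return nets
```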

(3) In step 2, the minimum-error method is:

First, let the total number of classes in the original training sample set be C and construct reference matrices R_j (1 ≤ j ≤ C);

where every element of the j-th row of R_j is 1 and all other elements are 0, and each R_j has dimension C × N_tr.

Next, from the validation output y_v_m of each trained level, obtain the error vector between y_v_m and the reference matrix R_j of the corresponding level:

J_v_mj = ||softmax(y_v_m) − R_j||_2, 1 ≤ j ≤ C;

where J_v_mj has dimension 1 × N_tr and y_v_m has dimension C × N_tr.

Finally, minimize the error vector J_v_mj between the validation output y_v_m and the reference matrix R_j of the corresponding level over j, obtaining the class label y_v_ind_m corresponding to each trained level:

y_v_ind_m = argmin_{1 ≤ j ≤ C} J_v_mj;

where y_v_ind_m has dimension 1 × N_tr.
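The minimum-error method maps directly onto array operations; a sketch in NumPy (outputs are stored one column per sample, so comparing against R_j amounts to comparing each softmax column with the one-hot vector e_j):

```python
import numpy as np

def softmax_cols(Y):
    """Column-wise softmax of a C x N output matrix."""
    E = np.exp(Y - Y.max(axis=0, keepdims=True))
    return E / E.sum(axis=0, keepdims=True)

def min_error_labels(Yv, C):
    """For each sample, pick the class j whose one-hot reference column
    is nearest to the softmax output column in the 2-norm."""
    P = softmax_cols(Yv)                               # C x N
    # J[j, n] = || P[:, n] - e_j ||_2 for class j, sample n
    J = np.stack([np.linalg.norm(P - np.eye(C)[:, [j]], axis=0)
                  for j in range(C)])                  # C x N
    return J.argmin(axis=0), J.min(axis=0)             # labels, minimum errors
```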

(4) In step 3, the statistical calculation comprises the following sub-steps:

Sub-step 3.1: let the correctly classified and misclassified sample sets of the m-th level of the trained parallel M-level network be y_vc_m and y_vw_m, with sample counts N_vc_m and N_vw_m respectively, where N_vc_m + N_vw_m = N_tr. The errors of the two sets are:

e_vc_m = ||softmax(y_vc_m) − t_vc_m||_2;

e_vw_m = ||softmax(y_vw_m) − t_vw_m||_2;

where t_vc_m is the true label of the correctly classified samples y_vc_m and t_vw_m is the true label of the misclassified samples y_vw_m of the level-m wide neural network.

Sub-step 3.2: from the correctly classified sample set y_vc_m and the misclassified sample set y_vw_m, compute the mean and standard deviation μ_c and σ_c of the correctly classified set and the mean and standard deviation μ_w and σ_w of the misclassified set; the Gaussian distributions corresponding to the two sets are

e_vc_m ~ N(μ_c, σ_c²);

e_vw_m ~ N(μ_w, σ_w²);

and the corresponding Gaussian probability density functions are

f_c(x) = (1 / (√(2π) σ_c)) exp(−(x − μ_c)² / (2σ_c²));

f_w(x) = (1 / (√(2π) σ_w)) exp(−(x − μ_w)² / (2σ_w²)).

Sub-step 3.3: from the error e_vw_m and the standard deviation σ_w of the misclassified sample set y_vw_m, obtain the decision threshold of the level-m wide neural network: T_m = min(e_vw_m) − ασ_w;

where α is a constant that provides a margin so that all misclassified samples y_vw_m are rejected at the current level.
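A sketch of the threshold computation under these definitions (`alpha` is the user-chosen margin constant; the true labels are assumed to be stored as one-hot columns, and `softmax_cols` is the helper from the previous sketch):

```python
import numpy as np

def softmax_cols(Y):
    E = np.exp(Y - Y.max(axis=0, keepdims=True))
    return E / E.sum(axis=0, keepdims=True)

def decision_threshold(Y_wrong, T_wrong, alpha=1.0):
    """T_m = min(e_vw_m) - alpha * sigma_w, from the per-sample 2-norm
    errors between softmax outputs (C x N) and true one-hot labels (C x N)."""
    e_vw = np.linalg.norm(softmax_cols(Y_wrong) - T_wrong, axis=0)
    sigma_w = e_vw.std()               # spread of the misclassified errors
    return e_vw.min() - alpha * sigma_w
```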

(5) In step 4, obtaining the test set means: obtain the original test sample set x_test; through M rounds of data expansion, obtain the M corresponding groups of test sample sets x_test_1, …, x_test_m, …, x_test_M, which form the test set.

Further, the data expansion is: apply the data transformation N_testD times to each sample of the original test sample set x_test, obtaining N_testD corresponding test sample sets as the test set x_test_m of the m-th level of the threshold-determined parallel M-level wide neural network;

where the total number of test samples in the original test sample set x_test is N_test_samples.

(6) In step 4, obtaining the error vector of each level's wide neural network comprises the following sub-steps:

Sub-step 4.1: feed the M groups of test sample sets x_test_1, x_test_2, …, x_test_M in parallel to the threshold-determined parallel M-level wide neural network, obtaining the N_testD outputs y_test_m_d (d = 1, 2, …, N_testD) of each threshold-determined level.

Sub-step 4.2: average the N_testD outputs y_test_m_d (d = 1, 2, …, N_testD) of each threshold-determined level to obtain its test output:

y_test_m = (1 / N_testD) Σ_{d = 1}^{N_testD} y_test_m_d.

Sub-step 4.3: let the total number of classes in the test set be C and construct reference matrices R_j (1 ≤ j ≤ C); obtain the error vector between the test output y_test_m and the reference matrix R_j of the corresponding level:

J_test_mj = ||softmax(y_test_m) − R_j||_2, 1 ≤ j ≤ C;

where every element of the j-th row of R_j is 1 and all other elements are 0; each R_j has dimension C × N_test_samples; J_test_mj has dimension 1 × N_test_samples; and y_test_m has dimension C × N_test_samples.

(7) Judging the output of each threshold-determined level of the wide neural network proceeds as follows:

When the minimum error of the current level's wide neural network is less than or equal to the current level's decision threshold, the current level is judged to be the correct classification output level for that sample:

min(J_test_mj) ≤ T_m.

When the minimum error of the current level's wide neural network is greater than the current level's decision threshold, the current level is judged unable to classify that sample correctly, and the sample is passed to the next level's wide neural network for testing; this repeats until the sample finds its correct classification output level:

min(J_test_mj) > T_m.

(8) In step 4, the label y_test_ind_m corresponding to the test output of each threshold-determined level of the wide neural network is:

y_test_ind_m = argmin_{1 ≤ j ≤ C} J_test_mj;

where y_test_ind_m has dimension 1 × N_test_samples.
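Putting (6) through (8) together, a sketch of the routing rule: a sample is emitted by the first level whose minimum reference error does not exceed that level's threshold, and by the last level otherwise. All names are illustrative:

```python
import numpy as np

def route_and_label(J_levels, thresholds):
    """J_levels: list of M arrays, each C x N (the errors J_test_mj per level).
    Returns predicted labels and the level index that emitted each sample."""
    M, N = len(J_levels), J_levels[0].shape[1]
    labels = np.empty(N, dtype=int)
    level_of = np.full(N, M - 1)
    pending = np.ones(N, dtype=bool)
    for m in range(M):
        J = J_levels[m]
        accept = (J.min(axis=0) <= thresholds[m]) & pending
        if m == M - 1:                 # the last level outputs whatever remains
            accept = pending
        labels[accept] = J.argmin(axis=0)[accept]
        level_of[accept] = m
        pending &= ~accept
    return labels, level_of
```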

Compared with the prior art, the beneficial effects of the invention are:

(1) The neural network of the invention has multi-level base classifiers, each level learning a different part of the data set. The structure of the network can be determined adaptively according to the complexity of the problem and the data set, optimizing the use of computing resources.

(2) The neural network of the invention supports incremental learning. When new training data become available, the current network is evaluated to determine whether it can classify the new data correctly; if it cannot, a new wide radial basis function network is added as a new level to learn the new samples, without retraining the entire network.

(3) The neural network of the invention can be tested in parallel: the test data are given to all levels of the network simultaneously, and the decision threshold of each level, obtained during training, determines which level finally outputs each test sample. The parallel testing process greatly reduces waiting time when the network is actually used.

(4) The neural network of the invention can serve as a general learning framework with strong flexibility; each level may use a BP neural network, a convolutional neural network, or another type of classifier according to actual needs.

Brief Description of the Drawings

The invention is described in further detail below with reference to the accompanying drawings and specific embodiments.

Fig. 1 is a schematic diagram of the parallel multi-level neural network of the invention and of its training and testing processes, in which Fig. 1(a) is a schematic diagram of the parallel multi-level wide neural network, Fig. 1(b) is a schematic diagram of its training and validation process, and Fig. 1(c) is a schematic diagram of its testing process.

Fig. 2 is a structural diagram of the parallel multi-level wide neural network of the invention.

Fig. 3(a) shows the error distribution of the validation set on one level of the parallel multi-level wide neural network of the invention; Fig. 3(b) shows the Gaussian probability density functions of the statistical parameters in Fig. 3(a).

Fig. 4 compares the test results of the parallel 26-level wide neural network on the MNIST data set in the embodiment of the invention with the classification results of existing learning models.

Detailed Description

Embodiments of the invention are described in detail below with reference to examples, but those skilled in the art will understand that the following examples only illustrate the invention and should not be regarded as limiting its scope.

The MNIST handwritten digit data set is used. Each image is an 8-bit grayscale image of a handwritten digit 0 to 9, of size 28 × 28, with 10 classes in total; 60,000 images form the original training sample set and 10,000 images form the test set. It is one of the important general-purpose image data sets for training and testing new learning models. For this data set, referring to Fig. 1 and Fig. 2, this embodiment uses a wide radial basis function network as the base classifier, i.e., every level of the parallel multi-level wide neural network is a wide radial basis function network, and the number of parallel levels is 26.

(1) Obtain the validation sets and construct the base classifiers.

First, apply 26 elastic transformations to the image samples of the original training sample set of N_tr = 60,000 images, obtaining M = 26 validation sets x_v_1, x_v_2, …, x_v_26. In this embodiment, to guarantee enough misclassified validation samples, each validation set contains N_val = 10 data sets obtained by transforming the original training set; that is, each validation set has N_val = 10 times as many samples as the original training sample set.

Second, design the parallel multi-level wide neural network with wide radial basis function networks as base classifiers: M = 26 wide radial basis function networks are connected together to form the parallel multi-level wide neural network Net_1, Net_2, …, Net_M; each base classifier forms one level and focuses on a different part of the data set.

Finally, construct the wide radial basis function network. The specific process is as follows:

Construct a radial basis function network with N_0k = 1000 Gaussian basis functions of the form

φ_i(x) = exp(−||x − c_i||² / (2σ²)), 1 ≤ i ≤ N_0k,

whose centers c_i are randomly taken from a subset of the original training sample set and whose standard deviation σ is a constant. Use a sliding window to extract multiple groups of local feature images from each image sample of the original training sample set, thereby obtaining multiple groups of local feature matrices; feeding these local feature matrices to the Gaussian basis functions yields multiple radial basis function networks, which together form the wide radial basis function network.

(2) Train and validate each level of the parallel M-level wide neural network to obtain the trained network and the validation output y_v_m (m = 1, 2, …, M) of each level.

The level-1 wide radial basis function network is trained with the original training sample set; after training, the misclassified training samples are sent to the level-2 wide radial basis function network as part of the second training set to train the level-2 network. The validation set obtained in step (1) validates the current level's trained network and at the same time provides more misclassified samples as part of the next level's training set. As shown in Figs. 1(a) and (b), this comprises the following sub-steps:

Sub-step 2.1: use the original training sample set as the input samples of the level-1 wide neural network Net_1 and train Net_1 to obtain the trained level-1 wide neural network.

Sub-step 2.2: validate the trained level-1 wide neural network with the first validation set x_v_1 to obtain its misclassified sample set y_vw_1.

Sub-step 2.3: use the misclassified sample set y_vw_1 of the level-1 network as input samples A_v_1 of the level-2 wide neural network; randomly draw a training sample set A_v_2 from the original training sample set so that the total input sample set {A_v_1 + A_v_2} has as many samples as the original training sample set, and use {A_v_1 + A_v_2} as the input samples of the level-2 network.

Sub-step 2.4: train the level-2 wide neural network with {A_v_1 + A_v_2}; validate the trained level-2 network with the second validation set x_v_2 to obtain its misclassified sample set y_vw_2.

Repeat sub-steps 2.3 and 2.4 to train the level-3 through level-M wide neural networks, obtaining the trained parallel M-level wide neural network and the corresponding validation output y_v_m (m = 1, 2, …, M) of each level.

The specific training and validation process of the above wide radial basis function network is as follows:

Take the image samples of the original training sample set as input data; the image size is M_1 × M_2 = 28 × 28. The sliding-window size is r = 13 × 13, the window's initial position is the upper-left corner of each image sample, and the sliding step is 1 pixel; the window slides from left to right and from top to bottom. The 3-dimensional image block formed by the 60,000 image samples inside the window is stretched into a matrix x_k ∈ R^(r×N): each local feature image is arranged pixel-wise into its original matrix, the 2nd through last columns of each original matrix are appended in order after the 1st column to form one column vector, and the N column vectors are arranged in order to form the local feature matrix x_k (1 ≤ k ≤ K) of one group of training image samples, each column of x_k representing one sample. The local feature matrix x_k is then input to the radial basis function network of N_0k = 1000 Gaussian basis functions, and its output is recorded as

Φ_k = [φ_1(x_k), φ_2(x_k), …, φ_{N_0k}(x_k)],

where each φ_i(x_k) is a column vector containing N = 60,000 elements.

Each slide of the window corresponds to one radial basis function network; after sliding is complete, K = (M_1 − 13 + 1) × (M_2 − 13 + 1) = (28 − 13 + 1) × (28 − 13 + 1) = 256 radial basis function networks are obtained.
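For concreteness, one sliding-window position can be turned into a local feature matrix as follows (a sketch; the exact flattening order of the patch pixels is immaterial as long as it is consistent):

```python
import numpy as np

def window_matrix(images, i, j, w=13):
    """images: N x H x W array. Flatten the w x w patch at position (i, j)
    of every image into a column, giving x_k of shape (w*w) x N."""
    patch = images[:, i:i + w, j:j + w]            # N x w x w
    return patch.reshape(images.shape[0], -1).T    # (w*w) x N

# K = (28 - 13 + 1) * (28 - 13 + 1) = 256 window positions for 28 x 28 images
```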

For each radial basis function network, sorting and downsampling are introduced on the nonlinearly transformed Gaussian output data Φ_k. Each column of the output data Φ_k of the wide radial basis function network is summed to obtain a row vector whose elements are the summed responses of the local positions over the images to be processed; arranging these sums in descending order gives the descending vector a_k, and an index s_k marks the original position corresponding to each entry of a_k, yielding the sorted output data Φ′_k = sort(Φ_k, s_k).

The sorted output data are downsampled with downsampling interval N_kS = 20, so the number of sampled outputs per window is

N_0k / N_kS = 1000 / 20 = 50,

and the total number of outputs of the wide radial basis function network is

K × (N_0k / N_kS) = 256 × 50 = 12800.

The sampled output is Φ_kS = subsample(Φ′_k, N_kS), and the output of the Gaussian basis layer is Φ = [Φ_1S, Φ_2S, …, Φ_KS].
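The patent states the ordering statistic tersely; the sketch below orders the Gaussian units of one window by their summed response over all samples and keeps every N_kS-th unit, which matches the stated reduction from N_0k = 1000 to 50 outputs per window (the interpretation of the ordering is an assumption):

```python
import numpy as np

def sort_and_subsample(Phi_k, n_sub=20):
    """Phi_k: n_basis x N (one row per Gaussian unit, as in the sketch above).
    Order the units by their summed response over all samples, then keep
    every n_sub-th row, reducing 1000 units to 50 per window."""
    order = np.argsort(Phi_k.sum(axis=1))[::-1]   # descending total response
    return Phi_k[order][::n_sub]
```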

Set the desired output as D = [D_1, D_2, …, D_C]; connect the Gaussian basis function outputs of the wide radial basis function network through a linear layer, whose weights are W = [W_1, W_2, …, W_C];

where C = 10 is the total number of classes of the original samples.

The class output of the wide radial basis function network is Y = [Y_1, Y_2, …, Y_C] = ΦW. Specifically, the least-mean-square estimate Ŵ of the linear-layer weights is computed by minimizing the squared error:

Ŵ = argmin_W ||ΦW − D||².

The least-mean-square estimate Ŵ of the linear-layer weights is computed through the pseudo-inverse of the Gaussian basis function output Φ of the wide radial basis function network:

Ŵ = Φ⁺ D;

where Φ⁺ is the pseudo-inverse of the Gaussian basis function output Φ of the wide radial basis function network.

Finally, the class output of the wide radial basis function network is computed as

Ŷ = Φ Ŵ.
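This readout is a one-liner; a sketch that mirrors Ŵ = Φ⁺D directly (here samples are stored as rows, so Φ is N × F and D is N × C):

```python
import numpy as np

def fit_linear_readout(Phi, D):
    """Least-mean-square weights of the linear output layer: W_hat = pinv(Phi) @ D.
    Phi: N x F matrix of Gaussian-layer outputs, D: N x C desired one-hot outputs."""
    return np.linalg.pinv(Phi) @ D

# usage: Y = Phi @ W_hat gives the C class scores for each of the N samples
```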

The trained wide radial basis function network is thus obtained; each trained level is validated with its corresponding validation set, yielding the validation output y_v_m (m = 1, 2, …, M) of each trained level.

From the obtained validation outputs y_v_m (m = 1, 2, …, M), the class label y_v_ind_m corresponding to each validation output is obtained as follows:

First, let the total number of classes of the original training sample set be C and construct reference matrices R_j (1 ≤ j ≤ C);

where every element of the j-th row of R_j is 1 and all other elements are 0, and each R_j has dimension C × N_tr.

Next, from the validation output y_v_m of each trained level, obtain the error vector between y_v_m and the reference matrix R_j of the corresponding level:

J_v_mj = ||softmax(y_v_m) − R_j||_2, 1 ≤ j ≤ C;

where J_v_mj has dimension 1 × N_tr and y_v_m has dimension C × N_tr.

Finally, minimize the error vector J_v_mj between the validation output y_v_m and the reference matrix R_j of the corresponding level over j, obtaining the class label y_v_ind_m of each trained level:

y_v_ind_m = argmin_{1 ≤ j ≤ C} J_v_mj;

where y_v_ind_m has dimension 1 × N_tr.

Comparing the class label y_v_ind_m of each trained level with the level's validation output y_v_m yields the correctly classified sample set y_vc_m and the misclassified sample set y_vw_m of each level.

(3) Obtain the decision threshold T_m of each level's wide neural network by statistical calculation.

The difficult part of this network is determining the decision threshold of each level, which decides, at test time, which level of the network should output each sample. After training and validation, statistical calculations are performed on the correctly classified and misclassified sample sets separately. Suppose that at level m the correctly classified and misclassified sample sets are y_vc_m and y_vw_m, with sample counts N_vc_m and N_vw_m respectively, where N_vc_m + N_vw_m = N_tr.

In the above validation process, to guarantee enough misclassified samples, each validation set may contain N_val validation sample sets obtained by data transformation of the original training sample set; that is, each validation set has N_val times as many samples as the original training set.

The errors of the two sample sets are calculated as:

e_vc_m = ||softmax(y_vc_m) − t_vc_m||_2;

e_vw_m = ||softmax(y_vw_m) − t_vw_m||_2;

where t_vc_m and t_vw_m are the true labels corresponding to the correctly classified samples y_vc_m and the misclassified samples y_vw_m at level m. Let the means and standard deviations of the error statistics of the correctly classified and misclassified sets be μ_c, μ_w, σ_c, σ_w; the two corresponding Gaussian distributions are

e_vc_m ~ N(μ_c, σ_c²);

e_vw_m ~ N(μ_w, σ_w²);

and their Gaussian probability density functions are

f_c(x) = (1 / (√(2π) σ_c)) exp(−(x − μ_c)² / (2σ_c²));

f_w(x) = (1 / (√(2π) σ_w)) exp(−(x − μ_w)² / (2σ_w²)).

On one level of the parallel multi-level wide neural network, the validation-set error distribution and its probability density functions are shown in Figs. 3(a) and (b); the decision threshold of the level-m wide neural network is then:

T_m = min(e_vw_m) − ασ_w;

where α is a constant that provides a margin so that all misclassified samples y_vw_m are rejected at the current level.

(4) Test the threshold-determined parallel multi-level wide neural network with the test set.

As shown in Fig. 1(c), the specific testing process is:

First, obtain the test set: obtain the original test sample set x_test; through M rounds of data expansion, obtain the M corresponding groups of test sample sets x_test_1, …, x_test_m, …, x_test_M, which form the test set. The total number of test samples in x_test is N_test_samples.

The above data expansion applies the data transformation N_testD times to each sample of x_test, obtaining N_testD corresponding test sample sets as the test set x_test_m of the m-th level of the threshold-determined parallel M-level wide neural network.

This way of obtaining the test set stabilizes the results in the subsequent testing process.

Second, input the M groups of test sample sets x_test_1, …, x_test_m, …, x_test_M in parallel to the threshold-determined parallel M-level wide neural network and test the test set: each group of test sample sets is input to the corresponding threshold-determined level for testing, giving the N_testD test-sample-set outputs of each threshold-determined level; averaging the outputs over the N_testD test sample sets gives the test output of each threshold-determined level:

y_test_m = (1 / N_testD) Σ_{d = 1}^{N_testD} y_test_m_d.

Third, let the total number of classes of the test set be C and construct reference matrices R_j (1 ≤ j ≤ C); obtain the error vector between the test output y_test_m and the reference matrix R_j of the corresponding level:

J_test_mj = ||softmax(y_test_m) − R_j||_2, 1 ≤ j ≤ C;

where every element of the j-th row of R_j is 1 and all other elements are 0; each R_j has dimension C × N_test_samples; J_test_mj has dimension 1 × N_test_samples; and y_test_m has dimension C × N_test_samples.

Finally, judge the output of each threshold-determined level: when the minimum error of the current level is less than or equal to the current level's decision threshold, i.e., min(J_test_mj) ≤ T_m, the current level is judged to be the correct classification output level for that sample.

When the minimum error of the current level is greater than the current level's decision threshold, i.e., min(J_test_mj) > T_m, the current level is judged unable to classify that sample correctly, and the sample is passed to the next level's wide neural network for testing; this repeats until the sample finds its correct classification output level. The label corresponding to the test output of each threshold-determined level is then obtained as

y_test_ind_m = argmin_{1 ≤ j ≤ C} J_test_mj,

where y_test_ind_m has dimension 1 × N_test_samples.

If a test sample cannot be output by any of the first 25 levels, it is output directly at the final, 26th level.

Finally, the output L_test of the test set over the whole network is obtained; both the correctly classified and the misclassified samples can be counted, giving the sample classification accuracy of the parallel multi-level wide neural network of the invention.

Comparative Example

Using the same original training sample set, validation sets, and test set as in the above embodiment, random forest (RF), multilayer perceptron (MP), traditional radial basis function network (RBF), support vector machine (SVM), broad learning system (BLS), conditional deep learning model (CDL), deep belief network (DBL), the convolutional neural network LeNet-5, deep Boltzmann machine (DBM), and the deep forest (gcForest) were used as base classifiers for learning and classification; the resulting classification accuracies of the various learning methods are shown in Fig. 4.

As can be seen from Fig. 4, compared with the current mainstream learning models (random forest (RF), multilayer perceptron (MP), traditional radial basis function network (RBF), support vector machine (SVM), broad learning system (BLS), conditional deep learning model (CDL), deep belief network (DBL), the convolutional neural network LeNet-5, deep Boltzmann machine (DBM), and deep forest (gcForest)), the classification accuracy of the parallel multi-level wide neural network (PMWNN) of the invention is highly competitive; the final classification accuracy of the method is 99.10% (WRBF denotes the wide radial basis function network). Compared with the deep forest learning model, the neural network of the invention has multiple levels of base networks, each learning a different part of the data set, and can adaptively determine its structure according to the complexity of the problem and the data set, optimizing computing resources. At the same time, the network can be tested in parallel: the test data are given to all levels simultaneously, and the decision threshold of each level, obtained during training, determines which level finally outputs each test sample; the parallel testing process greatly reduces waiting time when the network is actually used.

In addition, the parallel multi-level wide neural network of the invention supports incremental learning: when new data arrive, a new wide radial basis function network can be added to learn the new characteristics without retraining the whole parallel multi-level network, meaning the proposed network can learn new knowledge without forgetting old knowledge. The new training data are input to the current M-level network; if any samples are misclassified, they are combined with the data-expanded original training set to build a new training data set, a new wide radial basis function network is trained, a new validation set is used for validation, and the decision threshold is computed, thereby establishing the (M+1)-th level. The new parallel multi-level wide neural network then consists of M+1 levels of wide radial basis function networks. At the same time, the designed network can be tested in parallel: all test samples are sent to the wide radial basis function networks of all levels, and the decision thresholds determine which network each test sample is assigned to. This process does not need to wait for the outputs of other levels, so testing is parallelized and accelerated.
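A sketch of this incremental step, with hypothetical helpers `misclassified` (returns the new samples rejected by all current levels), `train` (fits one new base classifier), and `threshold_of` (computes the new level's decision threshold from a fresh validation set):

```python
import numpy as np

def add_level(nets, thresholds, X_new, y_new, X_train_aug, y_train_aug,
              train, misclassified, threshold_of):
    """If the existing M levels misclassify some of the new data, fit one
    extra level on those samples plus the data-expanded original training
    set, instead of retraining the whole cascade."""
    X_hard, y_hard = misclassified(nets, thresholds, X_new, y_new)
    if len(X_hard):
        net = train(np.concatenate([X_hard, X_train_aug]),
                    np.concatenate([y_hard, y_train_aug]))
        nets.append(net)
        thresholds.append(threshold_of(net))  # from a fresh validation set
    return nets, thresholds
```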

Each level of the parallel multi-level wide neural network of the invention may be a wide radial basis function network, a BP neural network, a convolutional neural network, or another classifier, and the base classifier types of the different levels may differ.

Obviously, those skilled in the art can make various changes and modifications to the invention without departing from its spirit and scope. Thus, provided that these changes and modifications fall within the scope of the claims of the invention and their equivalent technologies, the invention is also intended to include them.

Claims (9)

1. A learning method based on a parallel multi-level width neural network, the parallel multi-level width neural network comprising multiple levels of width neural networks, wherein each level of width neural network comprises an input layer, a hidden layer and an output layer connected in sequence, the learning method being characterized by comprising the following steps:
step 1, obtaining an original training sample set, and constructing a parallel M-level width neural network Net_1, …, Net_m, …, Net_M (m = 1, 2, …, M), each level of width neural network being used as a base classifier of the corresponding level; performing M data transformations on the original training sample set to correspondingly obtain M verification sets x_v_1, …, x_v_m, …, x_v_M;
wherein the total number of samples of the original training sample set is N_tr and each training sample is an image sample to be learned; the specific process of constructing the parallel M-level width neural network Net_1, …, Net_m, …, Net_M is as follows:
designing a parallel multi-level width neural network by adopting width radial basis function networks as base classifiers: M width radial basis function networks are connected together to form the parallel multi-level width neural network Net_1, Net_2, …, Net_M; each base classifier serves as one level;
the specific process of constructing the width radial basis function network comprises the following steps:
constructing a radial basis function network comprising N_0k Gaussian basis functions of the form

φ_i(x) = exp(−||x − c_i||² / (2σ²)), 1 ≤ i ≤ N_0k,

wherein the centers of the radial basis function network are a subset randomly taken from the original training sample set and the standard deviation takes a constant value; acquiring multiple groups of local characteristic images of each image sample to be learned in the original training sample set by means of a sliding window so as to obtain multiple groups of local characteristic matrices, and taking the multiple groups of local characteristic matrices as input data of the Gaussian basis functions to obtain multiple radial basis function networks, namely the width radial basis function network;
step 2, adopting the original training sample set and the M verification sets x_v_1, …, x_v_m, …, x_v_M to respectively train and verify each level of the parallel M-level width neural network, obtaining the trained parallel M-level width neural network and the verification output y_v_m (m = 1, 2, …, M) corresponding to each level of width neural network; obtaining the label y_v_ind_m corresponding to each verification output y_v_m by a minimum-error method, and further obtaining the correctly classified sample set y_vc_m and the misclassified sample set y_vw_m of the verification set of each level of the trained parallel M-level width neural network;
wherein adopting the original training sample set and the M verification sets x_v_1, …, x_v_m, …, x_v_M to respectively train and verify each level of the parallel M-level width neural network comprises the following sub-steps:
substep 2.1, using the original training sample set as input samples of the level-1 width neural network Net_1, and training Net_1 to obtain a trained level-1 width neural network;
substep 2.2, using a first verification set xv_1Verifying the trained 1 st-level width neural network to obtain an error classification sample set y of a verification set of the 1 st-level width neural networkvw_1
Substep 2.3, misclassification sample set y of first-level width neural networkvw_1Input samples A as a level 2 wide neural networkv_1(ii) a Then randomly extracting a training sample set A from the original training sample setv_2Let the total input sample set { A }v_1+Av_2The number of samples in (A) is equal to the number of samples in the original training sample set, and the total input sample set (A) isv_1+Av_2Taking the samples as input samples of a 2 nd-level width neural network;
substep 2.4, use the total input sample set { A }v_1+Av_2Training the 2 nd-level width neural network to obtain a trained 2 nd-level width neural network; using a second verification set xv_2Verifying the trained 2 nd-level width neural network to obtain an error classification sample set y of a verification set of the 2 nd-level width neural networkvw_2
And analogizing in sequence, respectively training the neural networks with the widths from the 3 rd level to the M th level to obtain the trained parallel neural networks with the widths of the M levels and the corresponding verification output y of the neural networks with the widths of each levelv_m
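The sub-steps above form a boosting-style cascade: each new level concentrates on what the previous level got wrong. The following Python sketch assumes fit and validate wrap the width-RBF training and verification routines; those names, and the random-refill details, are placeholders rather than the patent's exact procedure.

```python
import numpy as np

def train_cascade(train_set, val_sets, M, fit, validate, rng):
    """Level 1 trains on the full original training set; each later level
    trains on the previous level's misclassified verification samples,
    refilled with random originals so the input keeps the original size."""
    nets, inputs = [], list(train_set)
    n_tr = len(train_set)
    for m in range(M):
        net = fit(inputs)                          # train level m+1
        wrong, right = validate(net, val_sets[m])  # split verification set m+1
        nets.append((net, right, wrong))
        refill = [train_set[i] for i in rng.choice(n_tr, n_tr - len(wrong))]
        inputs = list(wrong) + refill              # input samples of next level
    return nets

# illustrative stand-ins so the sketch runs end to end
rng = np.random.default_rng(0)
fit = lambda samples: None                                           # dummy "training"
validate = lambda net, vs: (vs[:len(vs) // 10], vs[len(vs) // 10:])  # (wrong, right)
nets = train_cascade(list(range(1000)), [list(range(1000))] * 3, 3, fit, validate, rng)
```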
the training process of each level of width neural network comprises the following steps:
(a) using the image samples to be learned in the original training sample set as input data, setting the initial position of a sliding window at the upper-left corner of each image sample, choosing a sliding step of 1 pixel, and sliding the window from left to right and from top to bottom in sequence; the 3-dimensional image blocks of all image samples inside the window are stretched into a matrix: each local feature image forms a corresponding original matrix by pixels, the 2nd to last columns of each original matrix are appended in sequence below the 1st column to form a column vector, and the N column vectors arranged in sequence form the local feature matrix x_k, 1 ≤ k ≤ K, of a group of training image samples, each column of x_k representing one image sample to be learned;
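This window-to-column construction could be coded as follows, assuming an (N, H, W) stack of grayscale samples; local_feature_matrices and win are illustrative names.

```python
import numpy as np

def local_feature_matrices(images, win, stride=1):
    """Slide a win x win window over each image (left to right, top to
    bottom, 1-pixel steps by default) and stretch every window position
    into the column form of step (a).

    images  : (N, H, W) stack of N image samples
    returns : list of K matrices x_k of shape (win*win, N); column n of
              x_k is the column-major flattened window k of image n
    """
    N, H, W = images.shape
    mats = []
    for r in range(0, H - win + 1, stride):          # top to bottom
        for c in range(0, W - win + 1, stride):      # left to right
            block = images[:, r:r + win, c:c + win]  # (N, win, win)
            # append columns 2..last of each window below column 1
            col = block.transpose(0, 2, 1).reshape(N, -1)  # (N, win*win)
            mats.append(col.T)                             # (win*win, N)
    return mats  # K = len(mats) local feature matrices
```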
(b) inputting the local feature matrix x_k to the N_0k Gaussian basis functions
G_i(x) = exp(−||x − c_i||² / (2σ²)), i = 1, 2, …, N_0k;
the output is noted as:
Φ_k = [φ_k,1, φ_k,2, …, φ_k,N_0k],
wherein each φ_k,i = [G_i(x_k(:,1)), G_i(x_k(:,2)), …, G_i(x_k(:,N))]^T is a column vector containing N elements;
each position of the sliding window corresponds to one radial basis function network, so K radial basis function networks are obtained after the sliding is finished;
(c) for each radial basis function network, sorting and downsampling are introduced on the nonlinearly transformed output data Φ_k of the Gaussian basis functions:
each row of the output data Φ_k of the width radial basis function network is summed to obtain a row vector, each element of which is the sum of the pixels at one local specific position over the images to be learned; arranging these sums in descending order gives the descending-order vector a_k;
the index s_k of the descending-order vector a_k marks the original position corresponding to each local specific position of the images to be learned, giving the sorted output data Φ'_k = sort(Φ_k, s_k);
the sorted output data are downsampled: with a downsampling interval N_kS, the sampled output is Φ_kS = subsample(Φ'_k, N_kS), and the output of the Gaussian basis functions is Φ = [Φ_1S, Φ_2S, …, Φ_KS];
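Step (c) reduces to an argsort plus a strided slice, as in the sketch below; it assumes rows of Φ_k index local positions and columns index image samples, and reads the downsampling interval as keeping every N_kS-th row — both expository assumptions.

```python
import numpy as np

def sort_and_subsample(Phi_k, N_kS):
    """Order the rows of Phi_k by descending row sums a_k (index s_k keeps
    the original positions), then keep every N_kS-th sorted row."""
    row_sums = Phi_k.sum(axis=1)      # one sum per local position
    s_k = np.argsort(-row_sums)       # descending-order index s_k
    Phi_sorted = Phi_k[s_k]           # Phi'_k = sort(Phi_k, s_k)
    return Phi_sorted[::N_kS], s_k    # Phi_kS = subsample(Phi'_k, N_kS)
```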
(d) setting the desired output to D = [D_1, D_2, …, D_C]; the output of the Gaussian basis functions of the width radial basis function network is connected by a linear layer whose weights are W = [W_1, W_2, …, W_C],
wherein C is the total number of categories of the original samples;
the class output of the width radial basis function network is obtained as Y = [Y_1, Y_2, …, Y_C] = ΦW; specifically, the least-mean-square estimate Ŵ of the linear layer weights is calculated by minimizing the squared error:
Ŵ = arg min_W ||ΦW − D||²;
the least-mean-square estimate of the linear layer weights is obtained through the pseudo-inverse matrix of the Gaussian basis function output Φ of the width radial basis function network:
Ŵ = Φ⁺D,
wherein Φ⁺ is the pseudo-inverse matrix of the Gaussian basis function output Φ of the width radial basis function network;
finally, the calculated class output of the width radial basis function network is:
Ŷ = ΦŴ = ΦΦ⁺D;
the trained width radial basis function network is thereby obtained, completing the training process of each level of width neural network (a one-line sketch is given below);
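Because the hidden layer is fixed, the whole linear read-out is a single pseudo-inverse, as this minimal sketch shows (illustrative names):

```python
import numpy as np

def linear_layer(Phi, D):
    """Solve min_W ||Phi @ W - D||^2: W_hat = Phi^+ D, Y_hat = Phi Phi^+ D."""
    W_hat = np.linalg.pinv(Phi) @ D   # least-mean-square weight estimate
    Y_hat = Phi @ W_hat               # class output of the width RBF network
    return W_hat, Y_hat
```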
step 3, performing statistical calculation respectively on the correctly classified sample set y_vc_m and the misclassified sample set y_vw_m of the verification set of each level of width neural network of the trained parallel M-level width neural network, correspondingly obtaining the decision threshold T_m of each trained level of width neural network; the decision threshold T_m of each level of width neural network serves as the decision basis of the width neural network of the corresponding level, giving the parallel M-level width neural network determined by the decision thresholds;
step 4, obtaining a test set, taking the test set as input data of the parallel M-level width neural network determined by the decision thresholds and inputting it in parallel to each level of width neural network determined by the decision thresholds for testing, so as to obtain the output of each level of width neural network determined by the decision thresholds; obtaining the error vector of each level of width neural network and judging the output of each level of width neural network determined by the decision thresholds, thereby obtaining the label y_test_ind_m corresponding to the test output of each level of width neural network determined by the decision thresholds.
2. The learning method based on the parallel multi-level width neural network of claim 1, wherein in step 1 the data transformation compresses or deforms the samples in the original sample set by an elastic transformation, or rotates, flips, enlarges or shrinks the samples in the original sample set by an affine transformation.
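One possible form of such transformations is sketched below with SciPy; the specific angle and zoom ranges are assumptions, since the claim leaves them open.

```python
import numpy as np
from scipy.ndimage import rotate, zoom

def augment(sample, rng):
    """Apply one randomly chosen claim-2-style transformation (rotation,
    flip, or zoom) to a 2-D image sample."""
    op = rng.integers(3)
    if op == 0:
        return rotate(sample, angle=rng.uniform(-15, 15), reshape=False)
    if op == 1:
        return np.fliplr(sample)                  # horizontal flip
    out = zoom(sample, rng.uniform(0.9, 1.1))     # enlarge or shrink
    h, w = sample.shape                           # crop/pad back to size
    out = out[:h, :w]
    return np.pad(out, [(0, h - out.shape[0]), (0, w - out.shape[1])])
```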
3. The learning method based on the parallel multi-level width neural network of claim 1, wherein in step 2, the minimum error method is:
firstly, setting the total number of classes of the original training sample set to C and constructing reference matrices R_j, 1 ≤ j ≤ C,
wherein the elements in the j-th row of R_j are 1 and the remaining elements are 0, and each reference matrix R_j has dimension C × N_tr;
secondly, according to the verification output y_v_m of each trained level of width neural network, the error vector between the verification output y_v_m and each reference matrix R_j of that level is obtained:
J_v_mj = ||softmax(y_v_m) − R_j||_2, 1 ≤ j ≤ C,
wherein ||·||_2 denotes the 2-norm of a matrix and softmax() is the normalized exponential function; J_v_mj has dimension 1 × N_tr and y_v_m has dimension C × N_tr;
finally, minimizing over the error vectors J_v_mj between the verification output y_v_m and the reference matrices R_j of that level yields the class labels corresponding to each trained level of width neural network:
y_v_ind_m = arg min_j J_v_mj,
wherein y_v_ind_m has dimension 1 × N_tr.
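The minimum error method of claim 3 can be written in a few lines, assuming SciPy's softmax and one reference matrix per class; min_error_labels is an illustrative name.

```python
import numpy as np
from scipy.special import softmax

def min_error_labels(y, C):
    """y : (C, N) network output. Compare softmax(y) column-wise against
    each reference matrix R_j (row j all ones, rest zeros) and return the
    arg-min-error class label of every sample."""
    p = softmax(y, axis=0)
    # J[j, n] = || softmax(y)_n - R_j(:, n) ||_2
    J = np.stack([np.linalg.norm(p - np.eye(C)[:, [j]], axis=0) for j in range(C)])
    return J.argmin(axis=0)  # y_ind, shape (N,)
```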
4. The learning method based on the parallel multi-level width neural network of claim 1, wherein in step 3, the statistical calculation comprises the following sub-steps:
sub-step 3.1, letting the correctly classified sample set and the misclassified sample set of the m-th level width neural network of the trained parallel M-level width neural network be y_vc_m and y_vw_m, with total sample numbers N_vc_m and N_vw_m respectively, where N_vc_m + N_vw_m = N_tr; the errors of the correctly classified and misclassified sample sets are then:
e_vc_m = ||softmax(y_vc_m) − t_vc_m||_2,
e_vw_m = ||softmax(y_vw_m) − t_vw_m||_2,
wherein t_vc_m is the true label corresponding to the correctly classified samples y_vc_m of the m-th level width neural network, and t_vw_m is the true label corresponding to the misclassified samples y_vw_m of the m-th level width neural network;
sub-step 3.2, for the correctly classified sample set y_vc_m, calculating the mean u_c and standard deviation σ_c of its errors, and for the misclassified sample set y_vw_m, the mean u_w and standard deviation σ_w; the Gaussian distributions corresponding to the correctly classified sample set y_vc_m and the misclassified sample set y_vw_m are then:
e_vc_m ~ N(u_c, σ_c²), e_vw_m ~ N(u_w, σ_w²),
and the corresponding Gaussian probability density functions are:
f_c(e) = (1/(√(2π)·σ_c)) · exp(−(e − u_c)²/(2σ_c²)),
f_w(e) = (1/(√(2π)·σ_w)) · exp(−(e − u_w)²/(2σ_w²));
sub-step 3.3, obtaining the decision threshold of the m-th level width neural network from the error e_vw_m and the standard deviation σ_w of the misclassified sample set y_vw_m:
T_m = min(e_vw_m) − ασ_w,
wherein α is a constant providing a margin so that all misclassified samples y_vw_m are rejected at the current level.
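Sub-steps 3.1-3.3 amount to the short computation below; the one-hot encoding of t_wrong and the population standard deviation are assumptions consistent with the claim.

```python
import numpy as np
from scipy.special import softmax

def decision_threshold(y_wrong, t_wrong, alpha):
    """T_m = min(e_vw_m) - alpha * sigma_w for one level.

    y_wrong : (C, N_w) outputs of the misclassified verification samples
    t_wrong : (C, N_w) one-hot true labels of those samples
    """
    e_w = np.linalg.norm(softmax(y_wrong, axis=0) - t_wrong, axis=0)  # e_vw_m
    sigma_w = e_w.std()                  # spread of the misclassification errors
    return e_w.min() - alpha * sigma_w   # margin so all wrong samples are rejected
```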
5. The learning method based on the parallel multi-level width neural network of claim 2, wherein in step 4 the test set is acquired as follows: obtaining an original test sample set x_test, and correspondingly obtaining M groups of test sample sets x_test_1, …, x_test_m, …, x_test_M, i.e., the test set, through M rounds of data augmentation.
6. The learning method based on the parallel multi-level width neural network of claim 5, wherein the data augmentation is: performing N_testD data transformations on each sample of the original test sample set x_test to obtain N_testD test sample sets, which serve as the test set x_test_m of the m-th level width neural network of the parallel M-level width neural network determined by the decision thresholds,
wherein the total number of test samples in the original test sample set x_test is N_test_samples.
7. The learning method based on the parallel multi-level width neural network of claim 1, wherein in step 4, obtaining the error vector of each level of width neural network comprises the following sub-steps:
sub-step 4.1, inputting the M groups of test sample sets x_test_1, x_test_2, …, x_test_M in parallel to the parallel M-level width neural network determined by the decision thresholds, correspondingly obtaining the N_testD outputs y_test_m_d, d = 1, 2, …, N_testD, of each level of width neural network determined by the decision thresholds;
sub-step 4.2, averaging the N_testD outputs y_test_m_d, d = 1, 2, …, N_testD, of each level of width neural network determined by the decision thresholds to obtain the test output of that level:
y_test_m = (1/N_testD) · Σ_{d=1}^{N_testD} y_test_m_d;
sub-step 4.3, setting the total number of classes of the test set to C and constructing reference matrices R_j, 1 ≤ j ≤ C; the error vector between the test output y_test_m and each reference matrix R_j of that level is obtained:
J_test_mj = ||softmax(y_test_m) − R_j||_2, 1 ≤ j ≤ C,
wherein the elements in the j-th row of R_j are 1 and the remaining elements are 0; each reference matrix R_j has dimension C × N_test_samples; J_test_mj has dimension 1 × N_test_samples and y_test_m has dimension C × N_test_samples.
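Sub-steps 4.1-4.3 can be sketched as follows, assuming outputs stacks the N_testD outputs y_test_m_d of level m; test_error_vectors is an illustrative name.

```python
import numpy as np
from scipy.special import softmax

def test_error_vectors(outputs, C):
    """outputs : (N_testD, C, N) stack of a level's outputs.
    Average over the N_testD outputs, then form J_test_mj against every
    reference matrix R_j."""
    y_test_m = np.mean(outputs, axis=0)   # (C, N) mean test output
    p = softmax(y_test_m, axis=0)
    J = np.stack([np.linalg.norm(p - np.eye(C)[:, [j]], axis=0) for j in range(C)])
    return J                              # row j holds J_test_mj per sample
```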
8. The learning method based on the parallel multi-level width neural network of claim 7, wherein the output of each level of width neural network determined by the decision thresholds is judged as follows:
when the minimum error of the current level width neural network is less than or equal to the decision threshold of the current level, that is
min(J_test_mj) ≤ T_m,
the current level is judged to be the level that correctly classifies the output;
when the minimum error of the current level width neural network is greater than the decision threshold of the current level, that is
min(J_test_mj) > T_m,
the current level is judged unable to classify the output correctly, and the output is passed to the next level width neural network for testing, and so on until the output reaches a level that classifies it correctly.
9. The learning method based on the parallel multi-level width neural network of claim 8, wherein in step 4, the label y_test_ind_m corresponding to the test output of each level of width neural network determined by the decision thresholds is obtained as:
y_test_ind_m = arg min_j J_test_mj,
wherein y_test_ind_m has dimension 1 × N_test_samples.
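Claims 8 and 9 together give a per-sample cascade decision of the following form; the fallback to the last level's label when every level rejects is an assumption, since the claims do not state that case.

```python
import numpy as np

def cascade_decide(J_per_level, thresholds):
    """J_per_level[m] : (C,) error vector of level m for one test sample.
    Accept the first level whose minimum error is within its threshold T_m
    and return that level's arg-min class label y_test_ind_m."""
    for J_m, T_m in zip(J_per_level, thresholds):
        if J_m.min() <= T_m:          # min(J_test_mj) <= T_m: classify here
            return int(J_m.argmin())
    return int(J_per_level[-1].argmin())  # assumed fallback: last level decides
```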
CN201910331708.8A 2019-04-24 2019-04-24 Learning method based on parallel multi-level width neural network Active CN110110845B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910331708.8A CN110110845B (en) 2019-04-24 2019-04-24 Learning method based on parallel multi-level width neural network

Publications (2)

Publication Number Publication Date
CN110110845A (en) 2019-08-09
CN110110845B (en) 2020-09-22

Family

ID=67486407

Country Status (1)

Country Link
CN (1) CN110110845B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111008647B (en) * 2019-11-06 2022-02-08 长安大学 Sample extraction and image classification method based on void convolution and residual linkage
CN111340184B (en) * 2020-02-12 2023-06-02 北京理工大学 Method and device for controlling surface shape of deformable mirror based on radial basis function
CN113449569B (en) * 2020-03-27 2023-04-25 威海北洋电气集团股份有限公司 Mechanical signal health state classification method and system based on distributed deep learning
CN112966761B (en) * 2021-03-16 2024-03-19 长安大学 Extensible self-adaptive width neural network learning method

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107784312A (en) * 2016-08-24 2018-03-09 腾讯征信有限公司 Machine learning model training method and device
CN108351985A (en) * 2015-06-30 2018-07-31 亚利桑那州立大学董事会 Method and apparatus for large-scale machine learning

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9811775B2 (en) * 2012-12-24 2017-11-07 Google Inc. Parallelizing neural networks during training
US10242313B2 (en) * 2014-07-18 2019-03-26 James LaRue Joint proximity association template for neural networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant