CN107330294A - Application method of an online sequential multi-hidden-layer extreme learning machine with forgetting factor - Google Patents
Description
Technical Field
The present invention relates to a learning machine, and in particular to an application method of an online sequential multi-hidden-layer extreme learning machine with a forgetting factor.
Background Art
The variables of a batch process are strongly nonlinear and strongly coupled, and much of the real-time data changes over time, so the data are highly time-sensitive. Traditional single-hidden-layer or multi-hidden-layer extreme learning machines output predictions from a model established in advance and cannot adjust the model as the structure of the data changes; such models are too rigid. Moreover, a single-hidden-layer extreme learning machine does not optimize the structural parameters of the model thoroughly enough, cannot effectively suppress noise interference, and cannot guarantee that the final hidden output approaches the expected hidden output. An ensemble online sequential single-hidden-layer extreme learning machine (EOS-ELM), or one with a forgetting mechanism (FOS-ELM), is therefore needed: such machines can adjust the model as the data structure changes and adapt to different time stages, which matters because real-time data inevitably carry noise interference.
Currently, no online sequential learning machine that meets the above requirements has been reported.
Summary of the Invention
In view of the shortcoming of prior-art single-hidden-layer and multi-hidden-layer extreme learning machines that they cannot adjust the model according to changes in the data structure, the problem to be solved by the present invention is to provide an application method of an online sequential multi-hidden-layer extreme learning machine with a forgetting factor that can both adjust the model according to changes in the data structure and deeply optimize the model parameters.
To solve the above technical problem, the present invention adopts the following technical solution:
The application method of an online sequential multi-hidden-layer extreme learning machine with a forgetting factor according to the present invention comprises the following steps:
1) Build an extreme learning machine model with multiple hidden layers and obtain the output expression of this model;
2) Update the above multi-hidden-layer extreme learning machine model in real time and output the expression of the updated model.
In step 1), building an extreme learning machine model with multiple hidden layers and obtaining its output expression specifically comprises:
11) Given samples and a network structure with multiple hidden layers, let the activation function of the hidden layers be g; the network output is g(a, b, X), where a denotes the weights between the input layer and the first hidden layer, b the bias of the first hidden layer, and X the input matrix;
12) Assume the data change batch by batch and each batch remains valid for s units of time. The data of the k-th unit time are denoted χ_k = {(x_i, t_i) | i = 1, …, N_k}, where N_j is the number of samples in batch j; χ_k is valid within [k, k+s], j = 0, 1, …, k, and t_i is the target variable. The data of the (k+1)-th unit time are denoted χ_{k+1} = {(x_i, t_i) | i = 1, …, N_{k+1}} accordingly, where k is an arbitrarily large positive integer, x_i is an input sample, and t_i is its target;
13) Assume k ≥ s-1 and that the number of training samples is far larger than the number of hidden-layer nodes; let Z_{k+1} denote the result predicted for the (k+1)-th unit time, and set l = k-s+1, k-s+2, …, k. The output of the first hidden layer of the network for the data of time l is:

H_l = [ G(a_1, b_1, x_1) ⋯ G(a_L, b_L, x_1); ⋮ ; G(a_1, b_1, x_{N_l}) ⋯ G(a_L, b_L, x_{N_l}) ]  (an N_l × L matrix)
(a_i, b_i), i = 1, …, L, are the randomly initialized weights and thresholds between the input layer and the first hidden layer; G is the hidden-layer activation function; T_l is the target matrix of the l-th batch of data, and T collects the targets of the batches within [k-s+1, k]; l is a positive integer in [k-s+1, k];
The output weight β of the final hidden layer is then obtained as:

β_k = P_k Q_k,  where  P_k = ( Σ_{l=k-s+1}^{k} H_l^T H_l )^{-1}  and  Q_k = Σ_{l=k-s+1}^{k} H_l^T T_l.
14) Let the weights and bias of the second hidden layer be W_1 and B_1. The expected output of the second hidden layer is then H_1 = T β^+, where β^+ denotes the Moore–Penrose generalized inverse of β;
15) Let W_HE = [B_1 W_1]; the weights and bias of the second hidden layer are then computed as W_HE = g^{-1}(H_1) H_E^+, with H_E = [1 H]^T, where 1 is a one-dimensional row vector whose elements are all 1, g^{-1}(x) is the inverse of the activation function g(x), and W_HE and H_E are auxiliary variables;
16) Update the output of the second hidden layer as H_2 = g(W_HE H_E) (1); update the output weight β of the final hidden layer as β_new = H_2^+ T;
17) Let the weights and bias of the third hidden layer be W_2 and B_2; the expected output of the third hidden layer is then H_3 = T β_new^+.

18) Let W_HE1 = [B_2 W_2]; the weights and bias of the third hidden layer are then computed as W_HE1 = g^{-1}(H_3) H_E1^+, with H_E1 = [1 H_2]^T, where 1 is a one-dimensional row vector whose elements are all 1; g^{-1}(x) is the inverse of the activation function g(x); W_HE1 and H_E1 are auxiliary variables;
Update the output of the third hidden layer as:

H_4 = g(W_HE1 H_E1)   (2)
19) Update the output weight β of the final hidden layer as:

β_new1 = H_4^+ T   (3)

The final output is then f = β_new1 H_4.
For a network structure with more hidden layers, iterate formulas (1), (2) and (3) cyclically: three hidden layers require one iteration, four hidden layers two iterations, and N hidden layers N-2 iterations; at the end of each iteration set β_new = β_new1 and H_2 = H_4.
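The initial training phase of steps 11) through 19) can be summarized in code. The listing below is a minimal NumPy sketch, not the patent's own implementation: it assumes a sigmoid activation (so that g^{-1}(x) = ln(x/(1-x))), stores samples row-wise (so the matrix products appear transposed relative to the formulas above), and all function and variable names are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_inv(h, eps=1e-7):
    # inverse activation g^{-1}(x) = ln(x / (1 - x)); clipping keeps it defined
    h = np.clip(h, eps, 1.0 - eps)
    return np.log(h / (1.0 - h))

def deep_layer_refinement(H, T, beta):
    """Steps 14)-19): derive the second and third hidden layers from the
    first-layer output H (N x L), targets T (N x m) and output weights beta."""
    N = H.shape[0]
    H1 = T @ np.linalg.pinv(beta)                    # expected 2nd-layer output, H1 = T beta^+
    H_E = np.hstack([np.ones((N, 1)), H])            # H_E = [1 H]
    W_HE = np.linalg.pinv(H_E) @ sigmoid_inv(H1)     # W_HE = [B1 W1]
    H2 = sigmoid(H_E @ W_HE)                         # updated 2nd-layer output
    beta_new = np.linalg.pinv(H2) @ T                # beta_new = H2^+ T
    H3 = T @ np.linalg.pinv(beta_new)                # expected 3rd-layer output
    H_E1 = np.hstack([np.ones((N, 1)), H2])          # H_E1 = [1 H2]
    W_HE1 = np.linalg.pinv(H_E1) @ sigmoid_inv(H3)   # W_HE1 = [B2 W2]
    H4 = sigmoid(H_E1 @ W_HE1)                       # updated 3rd-layer output
    beta_new1 = np.linalg.pinv(H4) @ T               # final output weights
    return W_HE, W_HE1, beta_new1

def fos_melm_init(batches, L, seed=0):
    """Steps 11)-13): random first layer plus sliding-window least squares.
    `batches` is the list of (X_l, T_l) pairs valid within [k-s+1, k]."""
    rng = np.random.default_rng(seed)
    n, m = batches[0][0].shape[1], batches[0][1].shape[1]
    a = rng.uniform(-1.0, 1.0, (n, L))               # input -> 1st-layer weights
    b = rng.uniform(-1.0, 1.0, (1, L))               # 1st-layer thresholds
    K, Q, window = np.zeros((L, L)), np.zeros((L, m)), []
    for X, T in batches:
        H = sigmoid(X @ a + b)
        K += H.T @ H                                 # K_k = sum_l H_l^T H_l
        Q += H.T @ T                                 # Q_k = sum_l H_l^T T_l
        window.append((H, T))
    beta = np.linalg.solve(K, Q)                     # beta_k = P_k Q_k; assumes K nonsingular
    H_all = np.vstack([H for H, _ in window])
    T_all = np.vstack([T for _, T in window])
    W_HE, W_HE1, beta_new1 = deep_layer_refinement(H_all, T_all, beta)
    return dict(a=a, b=b, K=K, Q=Q, beta=beta, window=window,
                W_HE=W_HE, W_HE1=W_HE1, beta_new1=beta_new1)

def fos_melm_predict(model, X):
    """Forward pass through the three hidden layers, f = H4 beta_new1."""
    ones = np.ones((len(X), 1))
    H = sigmoid(X @ model["a"] + model["b"])
    H2 = sigmoid(np.hstack([ones, H]) @ model["W_HE"])
    H4 = sigmoid(np.hstack([ones, H2]) @ model["W_HE1"])
    return H4 @ model["beta_new1"]
```

For more than three hidden layers, the middle block of `deep_layer_refinement` would be iterated N-2 times, as the text describes.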
In step 2), updating the above multi-hidden-layer extreme learning machine model in real time and outputting the expression of the updated model specifically comprises:
21) Let Z_{k+2} denote the result predicted for the (k+2)-th unit time. With the result at time (k+1) known and the same (a_i, b_i), i = 1, …, L, as before, the output weight is expressed as:

β_{k+1} = P_{k+1} Q_{k+1},  with  P_{k+1} = ( Σ_{l=k-s+2}^{k+1} H_l^T H_l )^{-1}  and  Q_{k+1} = Σ_{l=k-s+2}^{k+1} H_l^T T_l.

P_{k+1} can then be expressed through P_k as

P_{k+1}^{-1} = P_k^{-1} - H_{k-s+1}^T H_{k-s+1} + H_{k+1}^T H_{k+1},

and Q_{k+1} = Q_k - H_{k-s+1}^T T_{k-s+1} + H_{k+1}^T T_{k+1}, so that

β_{k+1} = β_k + P_{k+1} [ H_{k+1}^T (T_{k+1} - H_{k+1} β_k) - H_{k-s+1}^T (T_{k-s+1} - H_{k-s+1} β_k) ].
22) Let the weights and bias of the second hidden layer be W_1 and B_1; the expected output of the second hidden layer is then H_1 = T β_{k+1}^+;
23) Let W_HE = [B_1 W_1]; the weights and bias of the second hidden layer are then computed as W_HE = g^{-1}(H_1) H_E^+, with H_E = [1 H]^T, where 1 is a one-dimensional row vector whose elements are all 1 and g^{-1}(x) is the inverse of the activation function g(x);
24) Update the output of the second hidden layer as H_2 = g(W_HE H_E) (4); update the output weight β of the final hidden layer as β_new = H_2^+ T;
25) Let the weights and bias of the third hidden layer be W_2 and B_2; the expected output of the third hidden layer is then H_3 = T β_new^+.

26) Let W_HE1 = [B_2 W_2]; the weights and bias of the third hidden layer are then computed as W_HE1 = g^{-1}(H_3) H_E1^+, with H_E1 = [1 H_2]^T, where 1 is a one-dimensional row vector whose elements are all 1 and g^{-1}(x) is the inverse of the activation function g(x);
27) Update the output of the third hidden layer as: H_4 = g(W_HE1 H_E1)   (5)
28) Update the output weight β of the final hidden layer as: β_new1 = H_4^+ T   (6)

The final output is then f = β_new1 H_4.
The present invention further comprises the following step:
For a network structure with more hidden layers, iterate formulas (4), (5) and (6) cyclically: three hidden layers require one iteration, four hidden layers two iterations, and N hidden layers N-2 iterations; at the end of each iteration set β_new = β_new1 and H_2 = H_4.
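Under the same assumptions as the previous listing (hypothetical names, sigmoid activation, row-wise samples), the online update of steps 21) through 28) might look as follows. For clarity, the output weights are recomputed from the running sums K_k and Q_k, which is algebraically equivalent to the recursive expression for β_{k+1} given above.

```python
def fos_melm_update(model, X_new, T_new):
    """Steps 21)-28): forget the oldest batch, absorb the newest, and
    refresh the deeper layers on the data remaining in the window."""
    H_new = sigmoid(X_new @ model["a"] + model["b"])
    H_old, T_old = model["window"].pop(0)            # oldest batch leaves the window

    # K_{k+1} = K_k - H_old^T H_old + H_new^T H_new, and likewise for Q
    model["K"] += H_new.T @ H_new - H_old.T @ H_old
    model["Q"] += H_new.T @ T_new - H_old.T @ T_old
    model["beta"] = np.linalg.solve(model["K"], model["Q"])  # beta_{k+1}

    model["window"].append((H_new, T_new))
    H_all = np.vstack([H for H, _ in model["window"]])
    T_all = np.vstack([T for _, T in model["window"]])
    model["W_HE"], model["W_HE1"], model["beta_new1"] = deep_layer_refinement(
        H_all, T_all, model["beta"])
    return model
```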
The present invention has the following beneficial effects and advantages:
1. The present invention processes the data changes of batch processes with an online sequential multi-hidden-layer extreme learning machine with a forgetting factor. The method can both adjust the model according to changes in the data structure and deeply optimize the model parameters, achieving better results and ensuring that the final hidden output is closer to the expected hidden output.
Brief Description of the Drawings
Figure 1 shows the root-mean-square error of the training set under the FOS-ELM and FOS-MELM models when each batch of data has only 2 valid time units;

Figure 2 shows the root-mean-square error of the test set under the FOS-ELM and FOS-MELM models when each batch of data has only 2 valid time units;

Figure 3 shows the root-mean-square error of the training set under the FOS-ELM and FOS-MELM models when each batch of data has only 3 valid time units;

Figure 4 shows the root-mean-square error of the test set under the FOS-ELM and FOS-MELM models when each batch of data has only 3 valid time units.
Detailed Description of the Embodiments
The present invention is further described below in conjunction with the accompanying drawings.
To make the training output closer to the actual output: a single-hidden-layer extreme learning machine does not optimize the structural parameters of the model thoroughly enough and cannot effectively suppress noise interference. Therefore, to ensure that the final hidden output is closer to the expected hidden output, and building on earlier improvements to the extreme learning machine, the present invention processes the data changes of a batch process with an online sequential multi-hidden-layer extreme learning machine with a forgetting factor. This method can both adjust the model according to changes in the data structure and deeply optimize the model parameters, achieving better results.
The application method of the online sequential multi-hidden-layer extreme learning machine with a forgetting factor of the present invention comprises the following steps:

1) Build an extreme learning machine model with multiple hidden layers and obtain the output expression of this model;

2) Update the above multi-hidden-layer extreme learning machine model in real time and output the expression of the updated model.
This embodiment uses an ELM network structure with three hidden layers as an example to analyze the algorithm steps of the online sequential multi-hidden-layer extreme learning machine with forgetting mechanism (FOS-MELM). In step 1), building an extreme learning machine model with multiple hidden layers and obtaining its output expression specifically comprises:
11) First, given samples and a network structure with three hidden layers (each containing L nodes), choose g as the hidden-layer activation function; the network output is g(a, b, X), where a denotes the weights between the input layer and the first hidden layer, b the bias of the first hidden layer, and X the input matrix;
12) Assume the data change batch by batch and each batch lasts s units of time. The data of the k-th unit time can thus be denoted χ_k = {(x_i, t_i) | i = 1, …, N_k}, where N_j is the number of samples in batch j; χ_k is valid within [k, k+s], j = 0, 1, …, k, and t_i is the target variable. The data of the (k+1)-th unit time can be denoted χ_{k+1} = {(x_i, t_i) | i = 1, …, N_{k+1}} accordingly;
13) Assume k ≥ s-1 and that the number of training samples is far larger than the number of hidden-layer nodes; let Z_{k+1} denote the result predicted for the (k+1)-th unit time, and set l = k-s+1, k-s+2, …, k. The output of the first hidden layer of the network for the data of time l is:

H_l = [ G(a_1, b_1, x_1) ⋯ G(a_L, b_L, x_1); ⋮ ; G(a_1, b_1, x_{N_l}) ⋯ G(a_L, b_L, x_{N_l}) ]
(a_i, b_i), i = 1, …, L, are the randomly initialized weights and thresholds between the input layer and the first hidden layer. The output weight β of the final hidden layer can then be obtained as:

β_k = P_k Q_k,  where  P_k = ( Σ_{l=k-s+1}^{k} H_l^T H_l )^{-1}  and  Q_k = Σ_{l=k-s+1}^{k} H_l^T T_l.
14) Let the weights and bias of the second hidden layer be W_1 and B_1; the expected output of the second hidden layer is then H_1 = T β^+, with β^+ the Moore–Penrose generalized inverse of β;
15) Let W_HE = [B_1 W_1]; the weights and bias of the second hidden layer are then computed as W_HE = g^{-1}(H_1) H_E^+, with H_E = [1 H]^T, where 1 is a one-dimensional row vector whose elements are all 1 and g^{-1}(x) is the inverse of the activation function g(x);
16) Update the output of the second hidden layer as H_2 = g(W_HE H_E) (1); update the output weight β of the final hidden layer as β_new = H_2^+ T;
17) Now let the weights and bias of the third hidden layer be W_2 and B_2; the expected output of the third hidden layer is then H_3 = T β_new^+.

18) Let W_HE1 = [B_2 W_2]; the weights and bias of the third hidden layer are then computed as W_HE1 = g^{-1}(H_3) H_E1^+, with H_E1 = [1 H_2]^T, where 1 is a one-dimensional row vector whose elements are all 1 and g^{-1}(x) is the inverse of the activation function g(x); update the output of the third hidden layer as H_4 = g(W_HE1 H_E1) (2);
19) Update the output weight β of the final hidden layer as β_new1 = H_4^+ T (3).

The final output is then f = β_new1 H_4.
For a network structure with more hidden layers, iterate formulas (1), (2) and (3) cyclically: three hidden layers require one iteration, four hidden layers two iterations, and N hidden layers N-2 iterations; at the end of each iteration set β_new = β_new1 and H_2 = H_4.
In step 2), updating the above multi-hidden-layer extreme learning machine model in real time and outputting the expression of the updated model specifically comprises:
21) Let Z_{k+2} denote the result predicted for the (k+2)-th unit time. With the result at time (k+1) known and the same (a_i, b_i), i = 1, …, L, as before, the output weight can be expressed as:

β_{k+1} = P_{k+1} Q_{k+1},  with  P_{k+1} = ( Σ_{l=k-s+2}^{k+1} H_l^T H_l )^{-1}  and  Q_{k+1} = Σ_{l=k-s+2}^{k+1} H_l^T T_l.

P_{k+1} can then be expressed through P_k as

P_{k+1}^{-1} = P_k^{-1} - H_{k-s+1}^T H_{k-s+1} + H_{k+1}^T H_{k+1},

and Q_{k+1} = Q_k - H_{k-s+1}^T T_{k-s+1} + H_{k+1}^T T_{k+1}, so that

β_{k+1} = β_k + P_{k+1} [ H_{k+1}^T (T_{k+1} - H_{k+1} β_k) - H_{k-s+1}^T (T_{k-s+1} - H_{k-s+1} β_k) ].
22) Now let the weights and bias of the second hidden layer be W_1 and B_1; the expected output of the second hidden layer is then H_1 = T β_{k+1}^+;
23) Now let W_HE = [B_1 W_1]; the weights and bias of the second hidden layer are then computed as W_HE = g^{-1}(H_1) H_E^+, with H_E = [1 H]^T, where 1 is a one-dimensional row vector whose elements are all 1 and g^{-1}(x) is the inverse of the activation function g(x);
24) Update the output of the second hidden layer as H_2 = g(W_HE H_E) (4); update the output weight β of the final hidden layer as β_new = H_2^+ T;
25) Now let the weights and bias of the third hidden layer be W_2 and B_2; the expected output of the third hidden layer is then H_3 = T β_new^+.

26) Now let W_HE1 = [B_2 W_2]; the weights and bias of the third hidden layer are then computed as W_HE1 = g^{-1}(H_3) H_E1^+, with H_E1 = [1 H_2]^T, where 1 is a one-dimensional row vector whose elements are all 1 and g^{-1}(x) is the inverse of the activation function g(x);
27) Update the output of the third hidden layer as H_4 = g(W_HE1 H_E1) (5);
28) Update the output weight β of the final hidden layer as β_new1 = H_4^+ T (6).

The final output is then f = β_new1 H_4.
For a network structure with more hidden layers, iterate formulas (4), (5) and (6) cyclically: three hidden layers require one iteration, four hidden layers two iterations, and N hidden layers N-2 iterations; at the end of each iteration set β_new = β_new1 and H_2 = H_4.
This embodiment takes a styrene polymerization reactor as the research object. The production goal of this reactor is to adjust the reactor temperature so that the conversion rate of the reaction and the number-average and weight-average chain lengths of the reaction product are close to optimal at the end of the reaction. The mechanistic model of the reaction is as follows:
Table 3.2: Nominal parameter values
In the model, T denotes the absolute temperature of the reactor and Tc the temperature in degrees Celsius; Aw and B are the correlation coefficients between the weight-average chain length and temperature; Am and Em are the frequency factor and activation energy of the monomer polymerization reaction; r1–r4 are density–temperature correction values; and Mm and χ are the monomer molecular weight and the polymer interaction parameter, respectively. The quality indices of the product are the conversion rate y_1, the dimensionless number-average chain length y_2, and the dimensionless weight-average chain length y_3, so the end-point quality index is y = [y_1(t_f) y_2(t_f) y_3(t_f)], where t_f denotes the end time. The total reaction time is set to 400 minutes; the control variable is the reactor temperature T, with the time divided evenly into 20 intervals and the temperature held constant within each interval, so the process variable is X = [T_1 T_2 T_3 … T_20]; the parameters in the equations are given in Table 3.2. Sixty batches of data X (60×20) are generated and applied to the reactor to obtain the quality-variable matrix Y (60×3). The first 20 samples are taken as training samples to build the initial model, the middle 20 as update samples, and the last 20 as test samples to verify the prediction performance of the model. The sigmoid function is selected as the activation function before modeling.
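Assuming the sketches given earlier, and assuming X and Y stand for the 60×20 process-variable matrix and the 60×3 quality matrix already generated from the mechanistic model, the experiment described above could be driven roughly as follows; the node count L=10 is illustrative, not a value from the patent.

```python
# first 20 batches build the initial model; middle 20 update it; last 20 test it
init_batches = [(X[i:i + 1], Y[i:i + 1]) for i in range(20)]
model = fos_melm_init(init_batches, L=10)

for i in range(20, 40):                              # online sequential updates
    model = fos_melm_update(model, X[i:i + 1], Y[i:i + 1])

Y_hat = fos_melm_predict(model, X[40:60])            # prediction on the test set
rmse = np.sqrt(np.mean((Y_hat - Y[40:60]) ** 2))
print(f"test RMSE: {rmse:.4f}")
```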
As shown in Figures 1–4: Figures 1 and 2 show the root-mean-square error of the training set and the test set, respectively, under the two models (FOS-ELM and FOS-MELM) with different numbers of hidden-layer nodes when each batch of data has only 2 valid time units; Figures 3 and 4 show the same when each batch of data has only 3 valid time units.
The figures above show that, because the parameter changes of the polymerization reactor are time-sensitive, the present method updates the system model sequentially online to predict the temperature values effectively, reducing or even avoiding the impact of drastic data changes on the model and its prediction output.