
CN111738482A - Method for adjusting process parameters in a polyester fiber polymerization process - Google Patents


Info

Publication number
CN111738482A
Authority
CN
China
Prior art keywords
short-term memory
weighted
value
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010311105.4A
Other languages
Chinese (zh)
Other versions
CN111738482B (en)
Inventor
郝矿荣
朱秀丽
陈磊
蔡欣
唐雪嵩
王彤
刘肖燕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Donghua University
Original Assignee
Donghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Donghua University
Priority to CN202010311105.4A
Publication of CN111738482A
Application granted
Publication of CN111738482B
Legal status: Active

Classifications

    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G06F18/253 Fusion techniques of extracted features
    • G06N20/00 Machine learning
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/045 Combinations of networks
    • G06Q50/04 Manufacturing
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • General Business, Economics & Management (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Tourism & Hospitality (AREA)
  • Marketing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Primary Health Care (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Manufacturing & Machinery (AREA)
  • Medical Informatics (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a method for adjusting process parameters in a polyester fiber polymerization process. A predicted value of each polyester fiber performance index is obtained, compared with its set value, and the process parameters of the polymerization process are adjusted according to the comparison result. The predicted values are obtained as follows: first, m feature data are collected from all the data gathered by the sensors; the m features are then ranked by feature relevance with a grid-searched extreme gradient boosting tree algorithm; the n most relevant of the m ranked features are selected; all samples of the n features are normalized; finally, a weighted two-stream bidirectional long short-term memory attention network processes the n normalized features and outputs the predicted values of the four polyester fiber performance indices. The method yields high-precision predictions of the polyester fiber performance indices and can therefore better guide industrial production.

Description

Method for adjusting process parameters in a polyester fiber polymerization process

Technical Field

The invention belongs to the technical field of chemical fiber production and relates to a method for adjusting process parameters in a polyester fiber polymerization process.

Background

The polymerization process is the first link studied in the full polyester fiber production flow. It comprises three stages: esterification, pre-polycondensation and final polycondensation. The process is characterized mainly by four performance indices: the average molecular weight Mn, the degree of polymerization Pn, the intrinsic viscosity MIV and the esterification rate Es. The average molecular weight Mn, for example, determines the strength and impact resistance of the product. In actual industrial production, however, three of these indices cannot be measured in real time by sensors, so the polyester fiber production process cannot be controlled directly by tracking all four indices in real time to ensure product quality. Specifically, the intrinsic viscosity MIV can be measured in real time by a sensor, while the esterification rate Es, the degree of polymerization Pn and the average molecular weight Mn cannot.

In the prior art, three performance indices of the polyester fiber polymerization process, namely the esterification rate, the degree of polymerization and the average molecular weight, cannot be measured directly by sensors.

In actual polyester polymerization production, operators adjust the process parameters based on production experience alone. This cannot adjust the required process parameters precisely, and thus cannot reliably improve the quality of polyester fiber products.

The prior-art methods for adjusting process parameters in the polyester fiber polymerization process therefore suffer from imprecision and untimeliness, problems that urgently need to be solved.

Summary of the Invention

The invention aims to solve the problems of the prior art by providing a method for adjusting process parameters in the polyester fiber polymerization process. Its object is a method that adjusts the process parameters according to predictions of the intrinsic viscosity MIV, the esterification rate Es, the degree of polymerization Pn and the average molecular weight Mn, so that the desired polymer is obtained.

The invention converts the value of the intrinsic viscosity MIV into the three remaining performance indices, the esterification rate Es, the degree of polymerization Pn and the average molecular weight Mn. The conversion equations are:

Pn = 187.95 × (MIV^1.46 − 0.001718);

MIV = K × Mn^α, i.e. Mn = (MIV / K)^(1/α);

K = 2.1×10⁻⁴, α = 0.82;

[Four further equations, relating the esterification rate Es to the end-group quantities OHV, OHV% and AV, appear only as images in the source and are not reproduced here.]

where OHV denotes the concentration of terminal hydroxyl groups, OHV% the percentage of terminal hydroxyl groups in the whole solution, AV the concentration of terminal carboxyl groups, and α and K are constants.
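As a rough numerical sketch of these conversions, assuming Pn = 187.95 × (MIV^1.46 − 0.001718) and the Mark-Houwink relation MIV = K × Mn^α (both forms are reconstructions from the source; the Es equations are not reproduced, so only Pn and Mn are shown):

```python
# Constants stated in the patent (Mark-Houwink constants for the melt)
K = 2.1e-4
ALPHA = 0.82

def degree_of_polymerization(miv: float) -> float:
    """Pn = 187.95 * (MIV^1.46 - 0.001718), as reconstructed above."""
    return 187.95 * (miv ** 1.46 - 0.001718)

def average_molecular_weight(miv: float) -> float:
    """Invert the assumed Mark-Houwink relation MIV = K * Mn^alpha."""
    return (miv / K) ** (1.0 / ALPHA)
```

With an illustrative fiber-grade MIV of about 0.64 dL/g, these give Pn near 100 and Mn in the tens of thousands, the expected order of magnitude for PET.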

That is, the invention only needs to measure the intrinsic viscosity MIV in real time by sensor. Using the current production process parameters, a prediction model suited to the current polymerization line is built by combining a grid-searched extreme gradient boosting tree algorithm with a weighted two-stream bidirectional long short-term memory attention network. During actual polymerization production, the polymerization result is then predicted in real time from the input process parameters, compared with the set ideal values, and the process parameters are adjusted accordingly.

To achieve the above object, the invention adopts the following technical scheme:

A method for adjusting process parameters in a polyester fiber polymerization process: after the predicted values of the polyester fiber performance indices are obtained, they are compared with the set values, and the process parameters of the polymerization process are adjusted according to the comparison result;

the polyester fiber performance indices are the intrinsic viscosity MIV, the esterification rate Es, the degree of polymerization Pn and the average molecular weight Mn;

the predicted values are obtained as follows:

first, m feature data are collected from all the feature data gathered by the sensors;

then the m features are ranked by feature relevance with the grid-searched extreme gradient boosting tree algorithm: a score function value is computed for each feature, and a higher score indicates greater relevance; the n most relevant of the m ranked features are selected, n < m;

next, all samples of the n features are normalized;

finally, the weighted two-stream bidirectional long short-term memory attention network processes the n normalized features and outputs the predicted values of the four polyester fiber performance indices;

the weighted two-stream bidirectional long short-term memory attention network is a network in which a merging layer concatenates the information obtained by the two streams lengthwise and feeds it through an attention layer into a fully connected layer;

the two streams are a first stream and a second stream in parallel; the first stream consists of two identical weighted bidirectional long short-term memory units I with a batch normalization layer inserted between them; the second stream consists of two identical weighted bidirectional long short-term memory units II with a batch normalization layer inserted between them;

the weighted bidirectional long short-term memory unit I is a bidirectional LSTM that weights the current-time information in the cell-state computation; its cell state is

c_t = f_t ⊙ c_{t−1} + λ₁ · i_t ⊙ c̃_t;

the weighted bidirectional long short-term memory unit II is a bidirectional LSTM that weights the past-time information in the cell-state computation; its cell state is

c_t = λ₂ · f_t ⊙ c_{t−1} + i_t ⊙ c̃_t;

where c̃_t denotes the information of the current time step, c_{t−1} the information of the past time step, λ₁ the weight that further scales the retained current-time information, λ₂ the weight that further scales the retained past-time information, i_t the proportion of current-time information retained in the cell state, and f_t the proportion of past-time information retained;

after the predicted value is obtained, it is compared with the set value:

if the predicted degree of polymerization and average molecular weight are within ±1 of their process set values, and the predicted intrinsic viscosity and esterification rate are within ±0.1 of theirs, no adjustment of the process parameters is needed;

if any prediction falls outside these valid ranges, the process parameters must be adjusted;

the specific adjustment procedure is:

following the feature-relevance ranking obtained by the grid-searched extreme gradient boosting tree algorithm, the most relevant (top-ranked) process parameter is adjusted first; if the requirement is still not met when its adjustment limit is reached, the next-ranked parameter is adjusted, and so on, until the predicted performance indices meet the requirements;

the adjustment step is 1% of the parameter's current value, and the adjustment limit is ±10% of that value;

the adjustment direction follows the comparison of the performance prediction with the set value: if the prediction is above the set value, the parameter is decreased; if it is below the set value, the parameter is increased.
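The adjustment loop described above can be sketched as follows; `predict` is a hypothetical stand-in for the trained prediction network, and the parameter name used in the example is invented:

```python
def adjust_parameters(params, ranked_names, predict, setpoint, tol):
    """Adjust process parameters, most relevant first, until the prediction
    is within `tol` of `setpoint`.  Step = 1% of the starting value,
    limit = +/-10% of it (i.e. at most 10 steps per parameter)."""
    for name in ranked_names:
        step = 0.01 * params[name]          # 1% of the current nominal value
        for _ in range(10):                 # +/-10% limit
            error = predict(params) - setpoint
            if abs(error) <= tol:
                return params               # within tolerance: stop adjusting
            # prediction too high -> decrease; too low -> increase
            params[name] += -step if error > 0 else step
    return params

# Illustrative use with a made-up linear "model" of one temperature parameter
if __name__ == "__main__":
    result = adjust_parameters(
        {"esterification_temp": 200.0}, ["esterification_temp"],
        predict=lambda p: 0.5 * p["esterification_temp"],
        setpoint=95.0, tol=1.0,
    )
    print(result["esterification_temp"])
```

If the top-ranked parameter exhausts its ±10% range without reaching tolerance, the loop moves on to the next-ranked parameter, exactly as the text prescribes.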

As a preferred scheme:

In the method described above, the weighted two-stream bidirectional long short-term memory attention network is modeled and trained as follows:

(1) Modeling:

the weighted two-stream bidirectional long short-term memory attention network comprises three parts: data preprocessing, information extraction and fusion, and regression output;

a) Data preprocessing:

this consists of three steps: feature data collection, feature-relevance ranking of the feature data, and selection of the best features;

first, m feature data are collected from all the data gathered by the sensors; sampled at equal intervals starting from the first sensor of each type across the whole polymerization process, these comprise round[(m−1)/5] temperature readings, round[(m−1)/5] pressure readings, one slurry-ratio value from the slurry preparation tank, round[(m−1)/5] liquid-level readings, round[(m−1)/5] flow readings, and round[(m−1)/5] rotation-speed readings;

the grid-searched extreme gradient boosting tree algorithm ranks the m features by feature relevance;

for feature selection, the top n features in the relevance ranking are kept;

the selected n features are normalized to obtain the final n feature data;
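The ranking-and-selection step can be sketched as follows; the `scores` dict stands in for the per-feature score-function values produced by the grid-searched boosting trees, and the sensor names are invented:

```python
def select_top_features(scores):
    """Rank features by score (higher = more relevant) and keep the top
    n = round(9m/20) of the m candidates, as the patent specifies."""
    m = len(scores)
    n = round(9 * m / 20)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n]

# Illustrative use: 20 hypothetical sensors with decreasing relevance scores
feature_scores = {f"sensor_{k}": float(20 - k) for k in range(20)}
selected = select_top_features(feature_scores)
```

For m = 20 candidate features this keeps the 9 highest-scoring ones, discarding the redundant and weakly correlated rest before normalization.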

b) Extraction and fusion:

first, the parallel two-stream framework of the first and second streams extracts, respectively, the past-time and current-time information of the n normalized features;

a merging layer then concatenates the information obtained by the two streams lengthwise, after which an attention layer computes a further weighted sum of the input information according to its correlation with the output variables;

the two streams of the framework are as follows: the first stream consists of two identical weighted bidirectional long short-term memory units I with a batch normalization layer inserted between them, i.e. a unit I followed by a batch normalization layer followed by another unit I; unit I is a bidirectional LSTM that weights the current-time information in the cell-state computation, with cell state

c_t = f_t ⊙ c_{t−1} + λ₁ · i_t ⊙ c̃_t;

the second stream consists of two identical weighted bidirectional long short-term memory units II with a batch normalization layer inserted between them, i.e. a unit II followed by a batch normalization layer followed by another unit II; unit II is a bidirectional LSTM that weights the past-time information in the cell-state computation, with cell state

c_t = λ₂ · f_t ⊙ c_{t−1} + i_t ⊙ c̃_t;

the merging layer then fuses the information extracted by the two-stream framework, and the fused information is fed into the attention layer for further extraction;

c) Regression output:

finally, the information extracted by the attention layer is fed into the fully connected layer, which maps it to the prediction results;

this completes the construction of the weighted two-stream bidirectional long short-term memory attention network;

(2) Training:

the network is trained by stochastic gradient descent; the objects of training are all trainable parameters of the weighted two-stream bidirectional long short-term memory attention network;

training terminates when the accuracy is sufficiently high or a predefined maximum number of iterations is reached; sufficiently high accuracy means MSE < 0.005 and MAE < 0.05; the MSE is computed as:

MSE = (1/M) Σ_{t=1}^{M} (y_t − ŷ_t)²;

where M is the number of samples, t the sample index, y_t the true value, and ŷ_t the predicted value;

the MAE is computed as:

MAE = (1/M) Σ_{t=1}^{M} |y_t − ŷ_t|;
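The two error measures and the early-stopping test they define are a direct transcription of the formulas above:

```python
def mse(y_true, y_pred):
    """Mean squared error over M paired samples."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean absolute error over M paired samples."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

def accurate_enough(y_true, y_pred):
    # Training stops early once MSE < 0.005 AND MAE < 0.05.
    return mse(y_true, y_pred) < 0.005 and mae(y_true, y_pred) < 0.05
```

Both thresholds must hold simultaneously; otherwise training continues until the iteration limit.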

(3) Model correction:

if MSE ≥ 0.005 or MAE ≥ 0.05 after the preset maximum number of iterations is reached, either the batch size is reduced in steps of 1, or the predefined number of epochs is increased in steps of 30, where the predefined maximum number of iterations is 100 epochs and the batch size is 32.

In the method described above, the bidirectional LSTM of weighted unit I consists of an input layer, an output layer, a forward layer and a backward layer, each layer composed of multiple neurons, where each neuron is one long short-term memory cell; the bidirectional LSTM of weighted unit II has the same structure; the batch normalization layer is a technique that supplies zero-mean, unit-variance inputs to any layer of a neural network.

In the method described above, i_t and f_t are both numbers between 0 and 1, and their values are obtained by training with stochastic gradient descent:

i_t = σ(W_i · [h_{t−1}, x_t] + b_i);

f_t = σ(W_f · [h_{t−1}, x_t] + b_f);

where σ is the sigmoid function, h_{t−1} the information at time t−1, x_t the input features at time t, and W_i, b_i, W_f and b_f the parameters to be trained;

the specific training procedure is:

the objective function is set to

J(θ) = (1/2m) Σ_{i=1}^{m} (h_θ(x_i) − y_i)²;

where J(θ) is the loss function; θ denotes the parameters to be trained, namely W_i, b_i, W_f and b_f, the values solved for iteratively; h_θ is the function to be fitted, i.e. the prediction of the four performance indices; y_i is the true value of the four performance indices; and m is the number of training samples. Since stochastic gradient descent trains on one sample at a time, m = 1 here. The required parameters are trained by minimizing the loss function with gradient descent.
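A toy single-sample gradient step on this objective, with a scalar linear h_θ(x) = θ·x standing in for the network (in the patent, θ is the full set W_i, b_i, W_f, b_f):

```python
def sgd_step(theta, x, y, lr=0.1):
    """One stochastic-gradient step on J(theta) = 0.5 * (theta*x - y)^2."""
    grad = (theta * x - y) * x   # dJ/dtheta for a single sample (m = 1)
    return theta - lr * grad

theta = 0.0
for _ in range(100):             # fit y = 2x from the single sample (1, 2)
    theta = sgd_step(theta, x=1.0, y=2.0)
```

Here θ converges geometrically to 2, since each step multiplies the residual error by (1 − lr·x²); the same update rule, applied per sample, drives the training described above.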

In the method described above, the score function is computed as follows:

Score_t(q) = −(1/2) Σ_{j=1}^{T} (Σ_{i∈I_j} g_i)² / (Σ_{i∈I_j} h_i + λ) + γT;

where Score_t(q) is the score function value of the t-th feature when the tree structure is q; T is the number of leaf nodes of the tree with structure q; I_j is the j-th node of that tree; g_i is the first derivative of the loss function; w_j is the weight of the j-th leaf node when the tree structure is q; h_i is the second derivative of the loss function; and λ and γ are hyperparameters. Every tree has one score function value for each feature, and the sum over all trees of the values for a feature is that feature's relevance score with respect to the output variables; the larger the score, the stronger the correlation between the feature and the output variables;

the loss function is the squared loss of the melt intrinsic viscosity MIV of the polyester fiber polymerization process, i.e. the sum of squared errors between the actual and predicted values.
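One common form of the XGBoost structure score, consistent with the symbols above, can be sketched as follows (this is the standard formula; whether the patent aggregates it per feature in exactly this way is an assumption):

```python
def structure_score(leaves, lam=1.0, gamma=0.0):
    """Score of a tree structure: -(1/2) * sum_j G_j^2 / (H_j + lam) + gamma*T,
    where G_j and H_j sum the loss gradients g_i and hessians h_i over leaf j,
    and T is the number of leaves.  The corresponding optimal leaf weight is
    w_j = -G_j / (H_j + lam).

    leaves: list of (gradients, hessians) pairs, one pair per leaf node."""
    total = 0.0
    for gs, hs in leaves:
        G, H = sum(gs), sum(hs)
        total += G * G / (H + lam)
    return -0.5 * total + gamma * len(leaves)
```

For the squared loss named above, g_i and h_i are simply the residual (prediction minus target) and the constant 1, respectively.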

In the method described above, the grid-searched extreme gradient boosting tree selects the top n of the m features, where n = round(9m/20).

In the method described above, the normalization formula is:

x_i = (X_i − X_min) / (X_max − X_min);

where x_i is the normalized result for the i-th sample; X is the set of all samples of a given feature, X_i the i-th sample in X, X_min the minimum over all samples in X, and X_max the maximum over all samples in X.
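This min-max normalization is a one-liner per feature:

```python
def min_max_normalize(samples):
    """Map all samples of one feature linearly onto [0, 1]."""
    lo, hi = min(samples), max(samples)
    return [(x - lo) / (hi - lo) for x in samples]
```

Note the implicit assumption X_max > X_min: a constant feature would divide by zero, and would in any case carry no predictive information.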

In the method described above, in the training of step (2), the parameters of the prediction model are trained by stochastic gradient descent, i.e. each iteration uses a single sample to update the parameters, which speeds up training;

the objective function for a single sample is:

J(θ) = (1/2m) Σ_{i=1}^{m} (h_θ(x_i) − y_i)²;

where J(θ) is the loss function, θ the parameters to be solved for iteratively, h_θ the function to be fitted, y_i the true value, and m the number of training samples; since stochastic gradient descent trains on one sample at a time, m = 1 here.

In the method for adjusting process parameters in a polyester fiber polymerization process as described above, λ1 ranges over 0–1 and λ2 ranges over 0–1; if MSE ≥ 0.005 or MAE ≥ 0.05 after the preset maximum number of iterations is reached, the two parameters are adjusted with a step size of 0.1.

Owing to its high prediction accuracy and its correlation-based ranking of features, the present invention effectively solves the uncertainty and untimeliness problems of the prior art.

Specifically, the present invention first ranks m features by feature correlation using a grid-search-based extreme gradient boosting tree and selects the top n features; the purpose is to remove redundant features and reduce the influence of irrelevant features on the prediction results. All sample data of the top n features are then normalized to reduce the influence of differing dimensional scales on feature correlation. Next, a weighted two-stream bidirectional long short-term memory attention network (TS-λBiLSTMs-attention) extracts and fuses information from the n features. Finally, a fully connected layer classifies the resulting information and outputs the predicted values of the four performance indicators.

In the weighted two-stream bidirectional long short-term memory attention network (TS-λBiLSTMs-attention) framework, the present invention introduces the parameters λ1 and λ2 to weight the information of the past moment and of the current moment, so that the first stream focuses on extracting current-moment information while the second stream focuses on extracting past-moment information. A bidirectional long short-term memory network is also adopted, which further takes the next-moment information into account. The information extracted and fused by the TS-λBiLSTMs-attention framework therefore covers the past moment, the current moment, and the next moment.

In summary, the proposed weighted two-stream bidirectional long short-term memory attention network (TS-λBiLSTMs-attention) achieves high accuracy.

The technical points of the present invention are as follows:

(1) Soft sensing

At present, soft sensors build predictive models from the large volumes of data measured and stored in the process industry. Such work has been reported to complement online instrument measurements for process monitoring and control.

Soft-sensor modeling falls into three main categories: first-principles models (FPM, white-box models), data-driven models (black-box models), and hybrid models. As the name suggests, a hybrid model combines a first-principles model with a data-driven model. A first-principles model rests on a deep understanding of the physical and chemical background of the process, which is usually time-consuming and difficult to obtain. With the development of sensor technology, large amounts of data can be acquired through distributed control systems (DCS), which makes data-driven modeling more attractive. The essence of a data-driven model is to mine the hidden information in the data; typical data-driven algorithms are ordinary least squares, support vector regression (SVR), and partial least squares (PLS). The present invention uses data collected by the plant DCS to build a data-driven soft-sensor prediction model. In industrial production, the process parameters can then be substituted into this prediction model to obtain the predicted values of the four performance indicators; these predictions are compared with the corresponding process set values, and the difference between the two determines whether the process parameters should be adjusted, thereby achieving high-quality production of polyester fiber.

(2) Grid-Xgboost algorithm (grid-search-based extreme gradient boosting tree)

The Xgboost algorithm, an efficient and scalable tree-boosting machine learning algorithm developed by Tianqi Chen et al., has been widely used by data scientists to obtain state-of-the-art results in many machine learning applications.

The extreme gradient boosting tree obtains feature scores by building boosted trees, indicating the importance of each feature to the model. The higher a feature's score, the higher its correlation with the output variable.

obj*(q) = Σ_{j=1..T} [ (Σ_{i∈I_j} g_i) w_j + ½ (Σ_{i∈I_j} h_i + λ) w_j² ] + γT

The above equation can be used as a scoring function to measure the correlation between a feature and the output variables.
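A sketch of this score following the usual XGBoost derivation: substituting the optimal leaf weight w_j* = −G_j/(H_j + λ) into the objective gives the closed-form structure score −½ Σ_j G_j²/(H_j + λ) + γT (the function and argument names are illustrative):

```python
def structure_score(leaves, lam, gamma):
    """Closed-form structure score of a tree with T leaves. Each leaf is
    a list of (g_i, h_i) pairs: the first and second derivatives of the
    loss for the samples falling into that leaf."""
    total = 0.0
    for leaf in leaves:
        G = sum(g for g, _ in leaf)   # sum of first derivatives in leaf j
        H = sum(h for _, h in leaf)   # sum of second derivatives in leaf j
        total += G * G / (H + lam)
    return -0.5 * total + gamma * len(leaves)
```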

Grid search is a typical method for parameter tuning; it methodically builds and evaluates a model for every parameter combination in a given grid. Xgboost is a successful machine learning method based on the gradient boosting algorithm proposed by Tianqi Chen. The second-order Taylor expansion of its objective function makes highly accurate prediction possible, which also accounts for its high success rate in Kaggle competitions.

In the present invention, the features most relevant to the output variables are selected by the Xgboost algorithm. The quality of the polymer is determined by four main performance indicators, so there are four output variables; the other three performance indicators are determined by the melt intrinsic viscosity MIV. The squared loss of the melt intrinsic viscosity MIV is therefore used as the loss function in the Xgboost algorithm to rank feature importance. The squared loss is the sum of squared errors between the actual values and the predicted values.

(3) Bidirectional long short-term memory networks (BiLSTMs)

LSTM, proposed on the basis of the recurrent neural network (RNN), solves the long-term dependency problem of time series. It has achieved state-of-the-art performance in various tasks such as industrial time-series analysis and recognition applications. An LSTM unit consists of three multiplicative gates: the input gate i_t, the output gate o_t, and the forget gate f_t, which respectively control the proportions of information that are input, output, or forgotten at the next moment. The core of the long short-term memory network is the memory cell state c_t, which stores the information of the previous moment and thus effectively solves the long-term dependency problem in industrial processes. The structure of a single LSTM unit is shown in Figure 1.

The principle of the LSTM is described by the following equations:

f_t = σ(W_f · [h_{t−1}, x_t] + b_f)

i_t = σ(W_i · [h_{t−1}, x_t] + b_i)

c̃_t = tanh(W_c · [h_{t−1}, x_t] + b_c)

c_t = f_t ⊙ c_{t−1} + i_t ⊙ c̃_t

o_t = σ(W_o · [h_{t−1}, x_t] + b_o)

h_t = o_t ⊙ tanh(c_t)

Here σ denotes a logistic sigmoid function; the values of i_t, f_t, and o_t all lie between 0 and 1, indicating the proportions of the current input, the output, and the information retained for the next moment, and tanh is a hyperbolic tangent function. The symbol ⊙ denotes element-wise multiplication, which realizes the fusion of current-moment information with past-moment information. The weight matrices W_* and bias matrices b_* are the parameters of the model, whose values are determined during training. c̃_t represents the information of the current moment, and c_{t−1} represents the information of the previous moment; i_t ⊙ c̃_t is the current-moment information retained in proportion i_t, and f_t ⊙ c_{t−1} is the past-moment information retained in proportion f_t.

In general, a model's prediction depends not only on the current input but also on the input at the next moment. On this basis, bidirectional long short-term memory networks (BiLSTMs) were proposed to improve the prediction accuracy of LSTM models.

As shown in Figure 2, a bidirectional long short-term memory network (BiLSTM) mainly comprises an input layer, an output layer, a forward layer, and a backward layer; the forward and backward layers are jointly connected to the output layer through six shared weights w1–w6. First, the forward layer computes forward from time 1 to time t, obtaining and saving the output of the hidden layer at each step. Next, the backward layer computes from time t to time 1, obtaining and saving the output of the hidden layer at each step. Finally, at each moment, the final output is obtained by combining the outputs of the forward and backward layers at the corresponding moment. The mathematical expressions are as follows:

h_t = f(w_1 x_t + w_2 h_{t−1})

h_t′ = f(w_3 x_t + w_5 h_{t+1}′)

o_t = g(w_4 h_t + w_6 h_t′)

where f is usually a sigmoid function; likewise, g is an activation function, chosen differently for different problems. h_{t−1}, h_t, and h_{t+1} denote the hidden-layer information of the past, current, and next moments, respectively, and x_t denotes the current input information.
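A scalar sketch of these three equations (the weights are passed as a dict w[1]..w[6]; f and g are the activations from the text, with tanh and the identity as illustrative defaults):

```python
import math

def bilstm(xs, w, f=math.tanh, g=lambda a: a):
    """Forward pass h_t = f(w1*x_t + w2*h_{t-1}), backward pass
    h'_t = f(w3*x_t + w5*h'_{t+1}), output o_t = g(w4*h_t + w6*h'_t)."""
    n = len(xs)
    h_fwd, h = [], 0.0
    for x in xs:                          # forward layer: time 1 -> t
        h = f(w[1] * x + w[2] * h)
        h_fwd.append(h)
    h_bwd, hb = [0.0] * n, 0.0
    for t in range(n - 1, -1, -1):        # backward layer: time t -> 1
        hb = f(w[3] * xs[t] + w[5] * hb)
        h_bwd[t] = hb
    # combine forward and backward outputs at each moment
    return [g(w[4] * a + w[6] * bck) for a, bck in zip(h_fwd, h_bwd)]
```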

(4) Attention mechanism

Over the past two years, attention mechanisms have been widely applied in natural language processing, image recognition, speech recognition, and other deep learning tasks. As one of the core techniques of deep learning, it deserves attention and in-depth understanding. The essence of the attention mechanism is to obtain more detailed information about the content that requires attention while suppressing other useless information. That is, its core operation is a set of weight parameters that learn the importance of each element in a sequence and then merge the elements according to that importance. The weight parameters form the attention-coefficient distribution, reflecting how much attention is assigned to each element.

Specifically, in the traditional attention mechanism, the previous states are first defined as H = {h_1, h_2, ..., h_{t−1}}. The weighted sum over the columns is denoted v_t. The vector h_{t−1} represents the information extracted from the past state, and h_t represents the information extracted from the current state. A score function f: R^m × R^m → R is then assumed in order to compute the correlation between the input variables. Finally, the vector v_t is computed by the following equations, where α_i denotes the weight of the correlation between the i-th feature and the output variable.

α_i = exp(f(h_i, h_t)) / Σ_{j=1..t−1} exp(f(h_j, h_t))

v_t = Σ_{i=1..t−1} α_i h_i
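These two equations amount to a softmax over scores followed by a weighted sum. A NumPy sketch, taking the dot product as the score function f (the text leaves f unspecified):

```python
import numpy as np

def attention(H, h_t):
    """alpha_i = exp(f(h_i, h_t)) / sum_j exp(f(h_j, h_t)),
    v_t = sum_i alpha_i * h_i, with f = dot product for illustration."""
    scores = H @ h_t
    scores = scores - scores.max()                # for numerical stability
    alpha = np.exp(scores) / np.exp(scores).sum() # attention coefficients
    v_t = alpha @ H                               # weighted sum of past states
    return alpha, v_t
```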

Beneficial effects:

(1) The present invention proposes a new feature extraction and fusion method. The introduced weighted two-stream bidirectional long short-term memory attention network (TS-λBiLSTMs-attention) framework considers not only the information of the past and current moments but also the information about to be input at the next moment; moreover, by introducing an attention mechanism, the features related to the output variables are weighted, further improving the prediction accuracy.

(2) The present invention solves the multi-input multi-output soft-sensor modeling problem of the polyester fiber polymerization process and establishes a soft-sensor model for the polyester fiber performance indicators, enabling real-time monitoring of the changes in the four performance indicators during industrial production and thereby achieving high-level production.

(3) With the present invention, given only the values of the process parameters, accurate predicted values can be obtained; by judging whether the obtained predicted values are close to the set values, producers can better guide industrial production.

Description of the drawings

Figure 1 is a schematic diagram of the structure of a long short-term memory (LSTM) network;

Figure 2 is a schematic diagram of the structure of a bidirectional long short-term memory network (BiLSTMs);

Figure 3 is the algorithm framework used by the present invention to obtain the predicted values of the polyester fiber performance indicators;

Figure 4 is a schematic diagram of the polyester fiber polymerization process;

Figure 5 compares the predicted values with the true values of the four performance indicators under the weighted two-stream bidirectional long short-term memory attention network (TS-λBiLSTMs-attention) algorithm;

Figure 6 compares the absolute errors of the MIV under different soft-sensing algorithms.

Detailed description of the embodiments

The present invention is further described below in conjunction with specific embodiments. It should be understood that these examples are intended only to illustrate the present invention, not to limit its scope. Furthermore, it should be understood that after reading the teachings of the present invention, those skilled in the art may make various changes or modifications to it, and such equivalent forms likewise fall within the scope defined by the claims appended to this application.

A method for adjusting the process parameters of a polyester fiber polymerization process: after the predicted values of the polyester fiber performance indicators are obtained, they are compared with the set values, and the process parameters of the polymerization process are adjusted according to the comparison results.

The polyester fiber performance indicators are the intrinsic viscosity MIV, the esterification rate E_s, the degree of polymerization P_n, and the average molecular weight M_n.

As shown in Figure 3, the predicted values are obtained as follows:

First, m feature data are collected from all the feature data gathered by the sensors.

Then the grid-search-based extreme gradient boosting tree algorithm ranks the m feature data by feature correlation, i.e., the value of the score function is computed for each feature, a higher score indicating greater feature correlation. From the ranked m features, the top n features with the greatest correlation are selected, n ≤ m, where the specific value of n is 9m/20 rounded to the nearest integer.

The specific procedure for computing the score function is:

obj*(q) = Σ_{j=1..T} [ (Σ_{i∈I_j} g_i) w_j + ½ (Σ_{i∈I_j} h_i + λ) w_j² ] + γT

where obj*(q) is the score function value of the t-th feature when the tree structure is q, T is the number of leaf nodes of the tree with structure q, I_j is the j-th node of the tree with structure q, g_i is the first derivative of the loss function, w_j is the weight of the j-th leaf node when the tree structure is q, h_i is the second derivative of the loss function, and λ and γ are hyperparameters. Every tree has a score function value for each feature; the sum over all trees of the score function values for a given feature is that feature's correlation score with respect to the output variables, and the larger the result of the score function, the greater the correlation between the feature and the output variables;

Next, all samples of the n feature data are normalized; the normalization formula is as follows:

x_i = (X_i − X_min) / (X_max − X_min)

where x_i is the normalized result of the i-th sample; the set of all samples of a given feature is denoted X, X_i is the i-th sample in X, X_min is the minimum over all samples in X, and X_max is the maximum over all samples in X;
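The formula maps each feature to [0, 1]; a minimal sketch (a constant feature is mapped to 0 here to avoid division by zero, a choice the text does not specify):

```python
def min_max_normalize(samples):
    """x_i = (X_i - X_min) / (X_max - X_min) over one feature's samples."""
    x_min, x_max = min(samples), max(samples)
    span = x_max - x_min
    if span == 0:                 # constant feature: assumed convention, map to 0
        return [0.0 for _ in samples]
    return [(x - x_min) / span for x in samples]
```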

Finally, the weighted two-stream bidirectional long short-term memory attention network processes the n normalized feature data and outputs the predicted values of the four polyester fiber performance indicators.

The weighted two-stream bidirectional long short-term memory attention network is a network in which a merge layer longitudinally concatenates the information obtained from the two streams, which then passes through an attention layer into a fully connected layer.

The two streams are a first stream and a second stream in parallel. The first stream consists of two identical weighted bidirectional long short-term memory network units I with a batch normalization layer inserted between them; the second stream consists of two identical weighted bidirectional long short-term memory network units II with a batch normalization layer inserted between them.

The weighted bidirectional long short-term memory network unit I is a bidirectional LSTM that weights the current-moment information, i.e., a bidirectional LSTM whose memory cell state computation weights the information of the current moment; the memory cell state of the weighted bidirectional long short-term memory network unit I is

c_t = λ_1 · i_t ⊙ c̃_t + f_t ⊙ c_{t−1}

The weighted bidirectional long short-term memory network unit II is a bidirectional LSTM that weights the past-moment information, i.e., a bidirectional LSTM whose memory cell state computation weights the information of the past moment; the memory cell state of the weighted bidirectional long short-term memory network unit II is

c_t = i_t ⊙ c̃_t + λ_2 · f_t ⊙ c_{t−1}

where c̃_t represents the information of the current moment, c_{t−1} represents the information of the past moment, λ_1 is the weight that further weights the retained current-moment information, λ_2 is the weight that further weights the retained past-moment information, i_t is the proportion of current-moment information retained in the memory cell state, and f_t is the proportion of past-moment information retained in the memory cell state;

Both i_t and f_t are numbers between 0 and 1; their values are obtained by training with stochastic gradient descent:

i_t = σ(W_i · [h_{t−1}, x_t] + b_i)

f_t = σ(W_f · [h_{t−1}, x_t] + b_f)

where σ is the sigmoid function, h_{t−1} is the information at time t−1, x_t denotes the input features at time t, and W_i, b_i, W_f, and b_f are the parameters to be trained;
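The two weighted cell states differ only in where the extra weight sits; a scalar sketch covering both branches (branch I uses λ2 = 1, branch II uses λ1 = 1):

```python
def weighted_cell_state(i_t, f_t, c_tilde, c_prev, lam1=1.0, lam2=1.0):
    """c_t = lam1 * i_t * c~_t + lam2 * f_t * c_{t-1}.
    With lam2 = 1 this is unit I (current-moment weighting);
    with lam1 = 1 it is unit II (past-moment weighting)."""
    return lam1 * i_t * c_tilde + lam2 * f_t * c_prev
```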

The specific training procedure is:

The objective function is set as

J(θ) = (1/2m) Σ_{i=1..m} (h_θ(x_i) − y_i)²

where J(θ) is the loss function, namely the squared loss of the melt intrinsic viscosity MIV, i.e., the sum of squared errors between the actual and predicted values; θ denotes the parameters to be trained, namely W_i, b_i, W_f, and b_f, the values to be solved iteratively; h_θ is the function to be fitted, i.e., the prediction of the four performance indicators; y_i is the true value of the four performance indicators; and m is the number of training samples. Because stochastic gradient descent trains on one sample at a time, m = 1 here. Training of the required parameters is achieved by minimizing the loss function via gradient descent.

The specific modeling and training procedure of the weighted two-stream bidirectional long short-term memory attention network is:

(1) Modeling;

The weighted two-stream bidirectional long short-term memory attention network comprises three parts: data preprocessing, information extraction and fusion, and regression output.

a) Data preprocessing:

This part consists of feature data acquisition, feature-correlation ranking of the feature data, and selection of the best features.

First, m feature data are collected from all the feature data gathered by the sensors. The m feature data are: the temperatures of [(m−1)/5] (rounded) temperature sensors selected at equal intervals starting from the first temperature sensor over the whole polymerization process; the pressures of [(m−1)/5] (rounded) pressure sensors selected at equal intervals starting from the first pressure sensor; one slurry-ratio datum from the slurry preparation tank; the liquid levels of [(m−1)/5] (rounded) level sensors selected at equal intervals starting from the first level sensor; the flows of [(m−1)/5] (rounded) flow sensors selected at equal intervals starting from the first flow sensor; and the rotation speeds of [(m−1)/5] (rounded) speed sensors selected at equal intervals starting from the first speed sensor.

The grid-search-based extreme gradient boosting tree algorithm ranks the m feature data by feature correlation.

In the selection step, the top n feature data are chosen according to the feature-correlation ranking.

The selected n feature data are normalized to obtain the new n feature data.

b) Extraction and fusion:

First, the two-stream framework of the parallel first and second streams is used to extract, respectively, the past-moment information and the current-moment information of the n normalized feature data.

The merge layer then longitudinally concatenates the information obtained by the two streams of the two-stream framework, and the attention layer further computes a weighted sum of the input information according to its correlation with the output variables.

The two streams of the adopted two-stream framework are as follows. The first stream consists of two identical weighted bidirectional long short-term memory network units I with a batch normalization layer inserted between them, i.e., a weighted bidirectional LSTM unit I followed by a batch normalization layer followed by another weighted bidirectional LSTM unit I. The weighted bidirectional LSTM unit I is a bidirectional LSTM that weights the current-moment information, i.e., a bidirectional LSTM whose memory cell state computation weights the information of the current moment; the memory cell state of the weighted bidirectional long short-term memory network unit I is

c_t = λ_1 · i_t ⊙ c̃_t + f_t ⊙ c_{t−1}

The bidirectional LSTM of weighted bidirectional long short-term memory network unit I consists of an input layer, an output layer, a forward layer, and a backward layer; each layer consists of multiple neurons, each of which is a long short-term memory unit. The bidirectional LSTM of weighted bidirectional long short-term memory network unit II likewise consists of an input layer, an output layer, a forward layer, and a backward layer, each layer consisting of multiple neurons, each of which is a long short-term memory unit. The batch normalization layer is a technique that provides zero-mean/unit-variance input to any layer in a neural network.

The second stream consists of two identical weighted bidirectional long short-term memory network units II with a batch normalization layer inserted between them, i.e., a weighted bidirectional LSTM unit II followed by a batch normalization layer followed by another weighted bidirectional LSTM unit II. The weighted bidirectional LSTM unit II is a bidirectional LSTM that weights the past-moment information, i.e., a bidirectional LSTM whose memory cell state computation weights the information of the past moment; the memory cell state of the weighted bidirectional long short-term memory network unit II is

c_t = i_t ⊙ c̃_t + λ_2 · f_t ⊙ c_{t−1}

The merge layer then fuses the information extracted by the two-stream framework, and the fused information is input to the attention layer for further extraction.

c) Regression output;

Finally, the information extracted by the attention layer is input to the fully connected layer for classification, and the prediction results are output.

The weighted two-stream bidirectional long short-term memory attention network is thus established.

(2) Training;

The weighted two-stream bidirectional long short-term memory attention network is trained by stochastic gradient descent; the objects of training are all hyperparameters of the network.

训练的终止条件为精度足够高或者达到预定义的最大迭代数;精度足够高是指MSE<0.005,同时MAE<0.05,MSE的计算公式如下:The termination condition of training is that the accuracy is high enough or reaches the predefined maximum number of iterations; the accuracy is high enough means that MSE<0.005, and MAE<0.05, the calculation formula of MSE is as follows:

MSE = (1/M) Σ_{t=1}^{M} (y_t − ŷ_t)²

where M is the number of samples, t is the sample index, y_t is the true value, and ŷ_t is the predicted value;

MAE is computed as follows:

MAE = (1/M) Σ_{t=1}^{M} |y_t − ŷ_t|
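A minimal sketch of this stopping criterion, combining the MSE and MAE thresholds stated above; the function names are illustrative:

```python
def mse(y_true, y_pred):
    """Mean squared error over M samples."""
    M = len(y_true)
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / M

def mae(y_true, y_pred):
    """Mean absolute error over M samples."""
    M = len(y_true)
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / M

def accuracy_high_enough(y_true, y_pred):
    """Termination test: MSE < 0.005 and MAE < 0.05 simultaneously."""
    return mse(y_true, y_pred) < 0.005 and mae(y_true, y_pred) < 0.05
```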

The parameters in the prediction model are trained by stochastic gradient descent, i.e., each iteration uses a single sample to update the parameters, which speeds up training;

For a single sample, the objective function is:

J(θ) = (1/(2m)) Σ_{i=1}^{m} (h_θ(x_i) − y_i)²

where J(θ) is the loss function, θ denotes the parameters, i.e., the values to be solved iteratively, h_θ is the function to be fitted, y_i is the true value, and m is the number of training samples; because stochastic gradient descent trains on one sample at a time, here m = 1;
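The single-sample stochastic-gradient-descent step can be sketched as follows for a linear hypothesis h_θ(x) = θ·x, for which the gradient of J(θ) = ½(h_θ(x) − y)² with respect to θ_j is (h_θ(x) − y)·x_j. The learning-rate name `lr` and the linear form of h are illustrative assumptions, not taken from the patent:

```python
def sgd_step(theta, x, y, lr=0.01):
    """One stochastic-gradient-descent update on a single sample (m = 1).

    theta, x: lists of equal length; y: scalar target.
    Loss: J(theta) = 0.5 * (h - y)**2 with h = sum(t * xi).
    """
    h = sum(t * xi for t, xi in zip(theta, x))   # linear hypothesis
    err = h - y                                  # dJ/dh
    return [t - lr * err * xi for t, xi in zip(theta, x)]
```

Iterating this update one sample at a time is what gives stochastic gradient descent the speed advantage mentioned in the text.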

(3) Model calibration;

If MSE ≥ 0.005 or MAE ≥ 0.05 after training reaches the preset maximum number of iterations, either the batch size is decreased, with an adjustment step of 1, or the predefined number of epochs is increased, with an adjustment step of 30 epochs; here the predefined maximum number of iterations is 100 epochs and the batch size is 32;

λ1 and λ2 each take values in the range 0–1; when MSE ≥ 0.005 or MAE ≥ 0.05 after the preset maximum number of iterations has been reached, these two parameters are adjusted with a step of 0.1;
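The calibration rules can be sketched as a simple retry loop. Here `retrain` stands for re-running training with the given settings and returning (MSE, MAE), and is a hypothetical callback; for brevity this sketch applies all three adjustments each round, whereas the text allows choosing among them:

```python
def calibrate(retrain, batch_size=32, max_epochs=100,
              lam1=0.9, lam2=0.9, max_rounds=5):
    """Re-train with adjusted settings until MSE < 0.005 and MAE < 0.05.

    retrain(batch_size, max_epochs, lam1, lam2) -> (mse, mae).
    Adjustment steps follow the text: batch size -1, epochs +30,
    lambda parameters in steps of 0.1 within [0, 1].
    """
    for _ in range(max_rounds):
        mse_val, mae_val = retrain(batch_size, max_epochs, lam1, lam2)
        if mse_val < 0.005 and mae_val < 0.05:
            return batch_size, max_epochs, lam1, lam2
        batch_size = max(1, batch_size - 1)     # shrink batch, step 1
        max_epochs += 30                        # extend training, step 30
        lam1 = max(0.0, round(lam1 - 0.1, 1))   # adjust lambdas, step 0.1
        lam2 = max(0.0, round(lam2 - 0.1, 1))
    return batch_size, max_epochs, lam1, lam2
```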

After the predicted values are obtained, they are compared with the set values:

If the predicted degree of polymerization and average molecular weight are within ±1 of the process set values, and the predicted intrinsic viscosity and esterification rate are within ±0.1 of the process set values, the process parameters need not be adjusted;

If these valid ranges are exceeded, the process parameters need to be adjusted;

The specific adjustment procedure is:

According to the feature-correlation ranking obtained by the grid-search extreme gradient boosting tree algorithm, the process parameter with the highest correlation, i.e., the top-ranked one, is adjusted first; if the requirement is still not met when that parameter reaches its adjustment limit, the next-ranked process parameter is adjusted, and so on, until the predicted performance indicators meet the requirements;

The adjustment step is 1% of the current value of the process parameter, and the adjustment limit is ±10% of the current value of that parameter;

The adjustment direction is determined by comparing the performance prediction with the set value: if the prediction is higher than the set value, the parameter is decreased; if the prediction is lower than the set value, the parameter is increased.

Example 1

A method for adjusting process parameters in a polyester fiber polymerization process, with the following steps:

(1) Collect m feature data, where the feature data are process parameter data related to the polyester fiber performance indicators;

As shown in Figure 4, the polyester fiber polymerization process comprises three main stages: esterification, pre-polycondensation, and final polycondensation. The research of the present invention mainly targets the widely used DuPont three-vessel process. First, the raw materials purified terephthalic acid (PTA) and ethylene glycol (EG) are mixed in a given ratio in the slurry mixing tank; the mixed slurry is then fed continuously from the slurry feed tank into the esterification reactor, where, under a given temperature and pressure, the raw materials react chemically to form oligomers. The oligomers are fed continuously into the pre-polycondensation reactor and the final polycondensation reactor to form high polymers. The water vapor and ethylene glycol vapor produced at each stage enter a vacuum system for removal or recycling. The production process is characterized by strong nonlinearity, slow time variation, and widely distributed parameters;

The feature data were collected from a factory in China at a frequency of one sample per second. Based on the factory's process knowledge and operator experience, the present invention finally selects 67 process parameters as feature data, i.e., m = 67;

(2) The grid-search-based extreme gradient boosting tree (Grid-Xgboost) algorithm selects, from the m feature data, the top n feature data most correlated with the polyester fiber performance indicators;

Because an extreme gradient boosting tree can rank feature importance for only one output at a time, and the present invention has four outputs, the grid-search-based extreme gradient boosting tree (Grid-Xgboost) algorithm is applied to rank feature importance for each of the four outputs separately;

Parameters must be tuned to obtain the best performance of the extreme gradient boosting tree. As with random forests, the extreme gradient boosting tree is tuned through hyperparameters, which fall into three categories: the first comprises general parameters that control the macro behavior; the second comprises booster parameters that control the booster (tree/regression) at each step; the last comprises learning-task parameters that control the performance of the training objective;

The present invention focuses on grid-searching the two most important parameters, max_depth and n_estimators. In particular, max_depth controls the depth of the tree structure, while n_estimators is an important tuning parameter because it relates to the complexity of the extreme gradient boosting tree model; it also represents the number of weak learners in the ensemble. Table 1 lists the ranges and optimal values of the two parameters:
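The grid search over max_depth and n_estimators can be sketched generically. `evaluate` stands for training an extreme-gradient-boosting model with the given setting and returning its validation error; it is a hypothetical callback (in practice this role would be played by, e.g., cross-validated XGBoost training):

```python
from itertools import product

def grid_search(evaluate, max_depths, n_estimators_list):
    """Evaluate every (max_depth, n_estimators) pair exhaustively and
    return the pair with the lowest validation error."""
    best_params, best_err = None, float("inf")
    for md, ne in product(max_depths, n_estimators_list):
        err = evaluate(md, ne)
        if err < best_err:
            best_params, best_err = (md, ne), err
    return best_params, best_err
```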

Table 1

[table content provided as an image in the original document]

Then, by applying the grid-search-based extreme gradient boosting tree to each feature datum, the feature-importance ranks for the four performance indicators are obtained separately; the results are shown in Table 2. Owing to space constraints, only the 40 features most correlated with the output variables are shown. The present invention finds that, for the four different performance indicators, the importance ordering of most features is the same:

Table 2

[table content provided as an image in the original document]

Furthermore, because the intrinsic viscosity MIV is dominant among the four performance indicators, this application selects the 30 feature data most correlated with MIV. To justify the choice of 30 feature data, a set of experiments was also conducted; the resulting statistical MSE values on the test set for different numbers of features are shown in Table 3:

Table 3

[table content provided as an image in the original document]

As can be seen from Table 3, when the number of feature data reaches 30, the test-set MSE begins to converge. In summary, this application selects the 30 feature data most correlated with the intrinsic viscosity MIV as the input of the weighted two-stream bidirectional long short-term memory attention network (TS-λBiLSTMs-attention), i.e., n = 30; these 30 feature data are highlighted in Figure 4;

(3) The 30 feature data comprise 10,000 samples in total; 8,000 samples are selected to form the training set, and the remainder form the test set;
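The 8,000/2,000 split can be sketched as a simple slice; whether the samples are shuffled before splitting is not stated in the text, so this sketch keeps the original order:

```python
def train_test_split(samples, n_train):
    """Split a sample list into a training set of n_train samples
    and a test set containing the remainder (order preserved)."""
    return samples[:n_train], samples[n_train:]
```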

(4) Construct and train the weighted two-stream bidirectional long short-term memory attention network (TS-λBiLSTMs-attention);

The training set is used to train the weighted two-stream bidirectional long short-term memory attention network (TS-λBiLSTMs-attention) by stochastic gradient descent; the objects of training are all the parameters of the network. To obtain the best values of the parameters λ1 and λ2, this application sweeps them from 0 to 1 with a step of 0.1; λ1 is finally set to its best value of 0.9, and λ2 takes the same value as λ1. The same method is then applied to the dropout rate, whose best value is 0.7. There is a trade-off between batch size and convergence: the larger the batch size, the faster the training. Finally, the batch size is set to 36 and the maximum number of iterations to 100. Training terminates when the accuracy is sufficiently high or the predefined number of epochs is reached; "sufficiently high accuracy" means MSE < 0.005 and, at the same time, MAE < 0.05. MSE is computed as follows:

MSE = (1/M) Σ_{t=1}^{M} (y_t − ŷ_t)²

where M is the number of samples, t is the sample index, y_t is the true value, and ŷ_t is the predicted value;

MAE is computed as follows:

MAE = (1/M) Σ_{t=1}^{M} |y_t − ŷ_t|

(5) The test data are input into the weighted two-stream bidirectional long short-term memory attention network (TS-λBiLSTMs-attention), which outputs the predicted values of the polyester fiber performance indicators. As shown in Figure 5, the predicted values are essentially consistent with the true values. The predicted values are compared with the set values, which are: intrinsic viscosity MIV 0.7, esterification rate Es 0.95, degree of polymerization Pn 125, average molecular weight Mn 20000; the process parameters in the polyester fiber polymerization process are adjusted according to the comparison results;

To demonstrate the effectiveness of the proposed algorithm framework, the present invention performs two sets of experiments:

In the first group, the absolute errors of the MIV performance indicator are shown in Figure 6 for the weighted two-stream bidirectional LSTM attention network (TS-λBiLSTMs-attention), the weighted two-stream bidirectional LSTM network (TS-λBiLSTMs), the two-stream bidirectional LSTM network (TS-BiLSTMs), the bidirectional LSTM attention network (BiLSTMs-attention), and the bidirectional LSTM network (BiLSTMs). These soft-sensor models have not appeared in the literature on multi-output soft-sensor models; this set of experiments is performed only to show that the proposed two-stream framework, the weighted BiLSTM, and their combination with the attention mechanism are highly effective. The absolute error (AE) is defined as the absolute value of the difference between the predicted value and the true value; the specific formula is given below, where m denotes the m-th test sample, y_m is the true value, and ŷ_m is the predicted value; the closer AE is to 0, the better the prediction:

AE_m = |y_m − ŷ_m|

It is worth mentioning that the five algorithms in Figure 6 use the same training set, test set, number of iterations, and batch size. It can be seen that the absolute errors of the MIV performance indicator for the weighted two-stream bidirectional LSTM attention network (TS-λBiLSTMs-attention), the bidirectional LSTM attention network (BiLSTMs-attention), and the bidirectional LSTM network (BiLSTMs) fluctuate between 0 and 0.2, while those of the two-stream bidirectional LSTM network (TS-BiLSTMs) and the weighted two-stream bidirectional LSTM network (TS-λBiLSTMs) fluctuate between 0 and 0.4. Specifically, the MIV absolute errors of the TS-λBiLSTMs-attention algorithm are concentrated mainly between 0 and 0.05, those of BiLSTMs-attention range from 0 to 0.1, and TS-λBiLSTMs is slightly better than TS-BiLSTMs, with absolute errors fluctuating mainly between 0.2 and 0.3. It can be concluded that introducing the parameter λ is effective. The multi-output two-stream λBiLSTM and two-stream BiLSTM are less effective than the BiLSTMs, mainly because the multiple outputs confuse the information so that the weight of each feature cannot be learned effectively; this also demonstrates the necessity and effectiveness of introducing the attention mechanism. In summary, the proposed weighted two-stream bidirectional LSTM attention network (TS-λBiLSTMs-attention) algorithm is superior to the other methods;

In the second group, the proposed weighted two-stream bidirectional LSTM attention network (TS-λBiLSTMs-attention) framework is compared experimentally with several widely used soft-sensor modeling methods: LSTMs (Long Short-Term Memory networks), GRU (Gated Recurrent Unit), PLS (Partial Least Squares), and SVR (Support Vector Regression). The averages over 10 statistical runs are shown in Table 5. As can be seen from Table 5, the MSE values of all four performance indicators for the proposed algorithm are concentrated around 0.0013, whereas the best MSE of the current state-of-the-art soft-sensor prediction methods is about 0.003, which shows that the algorithm proposed by the present invention clearly outperforms the other algorithms.

Table 5

[table content provided as an image in the original document]

Claims (9)

1. A method for adjusting process parameters in a polyester fiber polymerization process, characterized in that: after the predicted values of the polyester fiber performance indicators are obtained, they are compared with set values, and the process parameters in the polyester fiber polymerization process are adjusted according to the comparison results;

the polyester fiber performance indicators are intrinsic viscosity MIV, esterification rate Es, degree of polymerization Pn, and average molecular weight Mn;

the predicted values are obtained as follows:

first, m feature data are collected from all the feature data acquired by the sensors;

then a grid-search-based extreme gradient boosting tree algorithm sorts the m feature data by feature correlation, i.e., a score function is evaluated for each feature datum, a higher score indicating greater feature correlation; from the sorted m feature data, the top n feature data with the greatest feature correlation are selected, n ∈ m;

next, all samples of the n feature data are normalized;

finally, a weighted two-stream bidirectional long short-term memory attention network processes the normalized n feature data and outputs the predicted values of the four polyester fiber performance indicators;

the weighted two-stream bidirectional LSTM attention network is a network in which the information obtained from the two streams is concatenated longitudinally by a merging layer and then input through an attention layer to a fully connected layer;

the two streams are a first stream and a second stream in parallel; the first stream consists of two identical weighted bidirectional LSTM network units I with a batch normalization layer inserted between them; the second stream consists of two identical weighted bidirectional LSTM network units II with a batch normalization layer inserted between them;

the weighted bidirectional LSTM network unit I is a bidirectional LSTM that weights current-time information, i.e., a bidirectional LSTM whose memory-cell state computation applies an extra weight to the current-time information; the memory-cell state of the weighted bidirectional LSTM network unit I is

c_t = λ1 · i_t ∘ c̃_t + f_t ∘ c_{t-1};

the weighted bidirectional LSTM network unit II is a bidirectional LSTM that weights past-time information, i.e., a bidirectional LSTM whose memory-cell state computation applies an extra weight to the past-time information; the memory-cell state of the weighted bidirectional LSTM network unit II is

c_t = i_t ∘ c̃_t + λ2 · f_t ∘ c_{t-1};

where c̃_t represents the information of the current time step, c_{t-1} represents the information of the past time step, λ1 is a weight that further weights the retained current-time information, λ2 is a weight that further weights the retained past-time information, i_t is the proportion of current-time information retained in the memory-cell state, and f_t is the proportion of past-time information retained in the memory-cell state;

after the predicted values are obtained, they are compared with the set values:

if the predicted degree of polymerization and average molecular weight are within ±1 of the process set values, and the predicted intrinsic viscosity and esterification rate are within ±0.1 of the process set values, the process parameters need not be adjusted;

if these valid ranges are exceeded, the process parameters need to be adjusted;

the specific adjustment procedure is:

according to the feature-correlation ranking obtained by the grid-search extreme gradient boosting tree algorithm, the process parameter with the highest correlation, i.e., the top-ranked one, is adjusted first; if the requirement is still not met when that parameter reaches its adjustment limit, the next-ranked process parameter is adjusted, and so on, until the predicted performance indicators meet the requirements;

the adjustment step is 1% of the current value of the process parameter, and the adjustment limit is ±10% of the current value of that parameter;

the adjustment direction is determined by comparing the performance prediction with the set value: if the prediction is higher than the set value, the parameter is decreased; if the prediction is lower than the set value, the parameter is increased.
2. The method for adjusting process parameters in a polyester fiber polymerization process according to claim 1, characterized in that the specific modeling and training process of the weighted two-stream bidirectional LSTM attention network is:

(1) Modeling;

the weighted two-stream bidirectional LSTM attention network comprises three parts: data preprocessing, information extraction and fusion, and regression output;

a) data preprocessing part:

consisting of three steps: feature data acquisition, feature-correlation sorting of the feature data, and selection of the best features;

first, m feature data are collected from all the feature data acquired by the sensors; the m feature data are: the temperatures of round[(m-1)/5] temperature sensors selected at equal intervals starting from the first temperature sensor among all temperature sensors of the whole polymerization process; the pressures of round[(m-1)/5] pressure sensors selected at equal intervals starting from the first pressure sensor; one slurry-ratio datum of the slurry preparation tank; the levels of round[(m-1)/5] level sensors selected at equal intervals starting from the first level sensor; the flows of round[(m-1)/5] flow sensors selected at equal intervals starting from the first flow sensor; and the rotational speeds of round[(m-1)/5] speed sensors selected at equal intervals starting from the first speed sensor;

the grid-search-based extreme gradient boosting tree algorithm sorts the m feature data by feature correlation;

selection of the best features: the top n feature data are selected according to the feature-correlation ranking;

the selected n feature data are normalized to obtain n new feature data;

b) extraction and fusion part:

first, the two-stream framework of the parallel first and second streams extracts, respectively, the past-time information and the current-time information of the n normalized feature data;

the merging layer concatenates longitudinally the information obtained from the two streams of the two-stream framework, and the attention layer then computes a further weighted sum of the input information according to the correlation between the input information and the output variables;

the two streams in the two-stream framework are as follows: the first stream consists of two identical weighted bidirectional LSTM network units I with a batch normalization layer inserted between them, i.e., a weighted bidirectional LSTM unit I is followed by a batch normalization layer, which is in turn followed by another weighted bidirectional LSTM unit I; the weighted bidirectional LSTM unit I is a bidirectional LSTM that weights current-time information, i.e., a bidirectional LSTM whose memory-cell state computation applies an extra weight to the current-time information; the memory-cell state of the weighted bidirectional LSTM unit I is

c_t = λ1 · i_t ∘ c̃_t + f_t ∘ c_{t-1};

the second stream consists of two identical weighted bidirectional LSTM network units II with a batch normalization layer inserted between them, i.e., a weighted bidirectional LSTM unit II is followed by a batch normalization layer, which is in turn followed by another weighted bidirectional LSTM unit II; the weighted bidirectional LSTM unit II is a bidirectional LSTM that weights past-time information, i.e., a bidirectional LSTM whose memory-cell state computation applies an extra weight to the past-time information; the memory-cell state of the weighted bidirectional LSTM unit II is

c_t = i_t ∘ c̃_t + λ2 · f_t ∘ c_{t-1};

the merging layer then fuses the information extracted by the two-stream framework, and the fused information is input to the attention layer for further information extraction;

c) regression output;

finally, the information extracted by the attention layer is input to a fully connected layer, and the prediction results are output;

this completes the construction of the weighted two-stream bidirectional LSTM attention network;

(2) Training;

the weighted two-stream bidirectional LSTM attention network is trained by stochastic gradient descent; the objects of training are all the trainable parameters of the network;

training terminates when the accuracy is sufficiently high or a predefined maximum number of iterations is reached; "sufficiently high accuracy" means MSE < 0.005 and, at the same time, MAE < 0.05; MSE is computed as follows:

MSE = (1/M) Σ_{t=1}^{M} (y_t − ŷ_t)²

where M is the number of samples, t is the sample index, y_t is the true value, and ŷ_t is the predicted value;

MAE is computed as follows:

MAE = (1/M) Σ_{t=1}^{M} |y_t − ŷ_t|

(3) Model calibration;

if MSE ≥ 0.005 or MAE ≥ 0.05 after training reaches the preset maximum number of iterations, either the batch size is decreased, with an adjustment step of 1, or the predefined number of epochs is increased, with an adjustment step of 30 epochs; the predefined maximum number of iterations is 100 epochs and the batch size is 32.
3. The method for adjusting process parameters in a polyester fiber polymerization process according to claim 1 or 2, wherein the bidirectional LSTM unit of weighted bidirectional LSTM unit I consists of an input layer, an output layer, a forward layer and a backward layer; each layer consists of multiple neurons, each neuron being one LSTM unit. The bidirectional LSTM unit of weighted bidirectional LSTM unit II likewise consists of an input layer, an output layer, a forward layer and a backward layer; each layer consists of multiple neurons, each neuron being one LSTM unit. The batch normalization layer is a technique that provides zero-mean/unit-variance input to any layer in a neural network.

4. The method for adjusting process parameters in a polyester fiber polymerization process according to claim 1, wherein i_t and f_t are both numbers between 0 and 1, and their values are obtained by training with stochastic gradient descent:

i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i);

f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f);

where σ is the sigmoid function, h_{t-1} is the information at time t-1, x_t is the input feature at time t, and W_i, b_i, W_f and b_f are the parameters to be trained. The specific training process is: set the objective function
J(\theta) = \frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x_i) - y_i\right)^2

where J(θ) is the loss function; θ denotes the parameters to be trained, namely W_i, b_i, W_f and b_f, i.e. the values solved iteratively; h(θ) is the function to be fitted, i.e. the prediction of the four performance indicators; y_i is the true value of the four performance indicators; and m is the number of training samples; since stochastic gradient descent trains on one sample at a time, m = 1 here. Training of the required parameters is achieved by minimizing the loss function via gradient descent.
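The gate equations of claim 4 can be sketched numerically as follows (an illustrative example, not the patented implementation; the weight shapes and random values are hypothetical):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_gates(h_prev, x_t, W_i, b_i, W_f, b_f):
    # Claim 4: i_t = sigma(W_i [h_{t-1}, x_t] + b_i),
    #          f_t = sigma(W_f [h_{t-1}, x_t] + b_f)
    z = np.concatenate([h_prev, x_t])   # concatenation [h_{t-1}, x_t]
    return sigmoid(W_i @ z + b_i), sigmoid(W_f @ z + b_f)

rng = np.random.default_rng(0)
hidden, n_features = 4, 3
h_prev = rng.standard_normal(hidden)
x_t = rng.standard_normal(n_features)
W_i = rng.standard_normal((hidden, hidden + n_features)); b_i = np.zeros(hidden)
W_f = rng.standard_normal((hidden, hidden + n_features)); b_f = np.zeros(hidden)
i_t, f_t = lstm_gates(h_prev, x_t, W_i, b_i, W_f, b_f)
# The sigmoid keeps both gate vectors strictly between 0 and 1, as the claim states
print(i_t.shape, ((0 < i_t) & (i_t < 1)).all(), ((0 < f_t) & (f_t < 1)).all())
```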
5. The method for adjusting process parameters in a polyester fiber polymerization process according to claim 1, wherein the specific procedure of the score function calculation is:
\mathrm{Score}_t(q) = \sum_{j=1}^{T}\left[\left(\sum_{i\in I_j} g_i\right) w_j + \frac{1}{2}\left(\sum_{i\in I_j} h_i + \lambda\right) w_j^2\right] + \gamma T

where Score_t(q) is the score function value of the t-th feature when the tree structure is q; T is the number of leaf nodes of the tree with structure q; I_j is the j-th node of the tree with structure q; g_i is the first derivative of the loss function; w_j is the weight of the j-th leaf node when the tree structure is q; h_i is the second derivative of the loss function; and λ and γ are hyperparameters. Each tree has one score function value for each feature, and the sum of the score function values of all trees for a feature is that feature's relevance score with respect to the output variable: the larger the score function value, the stronger the correlation between the feature and the output variable.
The loss function is the squared loss of the melt intrinsic viscosity (MIV) in the polyester fiber polymerization process, where squared loss refers to the sum of squared errors between the actual and predicted values.
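A minimal sketch of the per-tree score of claim 5, together with the top-n rule of claim 6, might look like this; it assumes the XGBoost-style regularized objective with the optimal leaf weight w_j* = -G_j/(H_j + λ) plugged in, and the function names and example numbers are hypothetical:

```python
import numpy as np

def tree_structure_score(leaves, lam=1.0, gamma=0.1):
    """Score of one tree with fixed structure q (XGBoost-style sketch).

    `leaves` is a list of (g, h) pairs: arrays of first and second
    derivatives of the loss for the samples I_j falling in leaf j.
    The optimal leaf weight w_j* = -G_j / (H_j + lam) is plugged into
    sum_j [G_j * w_j + 0.5 * (H_j + lam) * w_j**2] + gamma * T.
    """
    score = 0.0
    for g, h in leaves:
        G, H = float(np.sum(g)), float(np.sum(h))
        w_opt = -G / (H + lam)                  # optimal weight of leaf j
        score += G * w_opt + 0.5 * (H + lam) * w_opt ** 2
    return score + gamma * len(leaves)          # gamma * T leaf-count penalty

def n_selected(m):
    # Claim 6: keep the top n = 9m/20 features, rounded half up
    return int(9 * m / 20 + 0.5)

# Two hypothetical leaves; h_i = 1 is the second derivative of the squared loss
leaves = [(np.array([0.2, -0.1]), np.ones(2)), (np.array([0.3]), np.ones(1))]
score = tree_structure_score(leaves)
print(score, n_selected(10))
```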
6. The method for adjusting process parameters in a polyester fiber polymerization process according to claim 1, wherein the grid-search-based extreme gradient boosting tree selects the top n features from the m candidate features, the value of n being 9m/20 rounded to the nearest integer.

7. The method for adjusting process parameters in a polyester fiber polymerization process according to claim 1, wherein the normalization formula is:
x_i = \frac{X_i - X_{\min}}{X_{\max} - X_{\min}}

where x_i is the normalized result of the i-th sample; the set of all samples of a given feature is denoted X, X_i is the i-th sample in X, X_min is the minimum of all samples in X, and X_max is the maximum of all samples in X.
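The min-max normalization of claim 7 reduces to a one-liner; an illustrative sketch with hypothetical values:

```python
def min_max_normalize(X):
    # Claim 7: x_i = (X_i - X_min) / (X_max - X_min)
    x_min, x_max = min(X), max(X)
    return [(x - x_min) / (x_max - x_min) for x in X]

# e.g. three hypothetical raw feature readings mapped onto [0, 1]
print(min_max_normalize([260.0, 270.0, 280.0]))  # [0.0, 0.5, 1.0]
```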
8. The method for adjusting process parameters in a polyester fiber polymerization process according to claim 2, wherein in the training of step (2), the hyperparameters in the prediction model are trained by stochastic gradient descent, i.e. each iteration uses a single sample to update the parameters, which speeds up training; the objective function for one sample is:
J(\theta) = \frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x_i) - y_i\right)^2

where J(θ) is the loss function; θ denotes the parameters, i.e. the values solved iteratively; h(θ) is the function to be fitted; y_i is the true value; and m is the number of training samples; since stochastic gradient descent trains on one sample at a time, m = 1 here.
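A minimal sketch of the one-sample stochastic gradient descent update of claim 8, using a linear h_θ purely for illustration (the model, data and learning rate are hypothetical, not the claimed network):

```python
import numpy as np

def sgd_step(theta, x, y, lr=0.05):
    # One SGD update on a single sample (m = 1):
    # J(theta) = 0.5 * (h_theta(x) - y)**2 with a linear h_theta(x) = theta . x
    grad = (theta @ x - y) * x      # dJ/dtheta for the squared loss
    return theta - lr * grad

theta = np.zeros(2)
samples = [(np.array([1.0, 2.0]), 5.0), (np.array([2.0, 1.0]), 4.0)]
for x, y in samples * 500:          # iterate sample-by-sample, as in the claim
    theta = sgd_step(theta, x, y)
print(theta)                        # approaches [1, 2], which fits both samples exactly
```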
9. The method for adjusting process parameters in a polyester fiber polymerization process according to claim 1, wherein λ_1 takes values in the range 0-1 and λ_2 takes values in the range 0-1; when MSE ≥ 0.005 or MAE ≥ 0.05 after the preset maximum number of iterations has been reached, these two parameters are adjusted with an adjustment step of 0.1.
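The λ_1/λ_2 adjustment of claim 9 amounts to searching a 0.1-step grid over [0, 1] in each parameter; an illustrative enumeration (the function name is hypothetical):

```python
def lambda_grid(step=0.1):
    # Claim 9: candidate (lambda1, lambda2) pairs in [0, 1] with step 0.1
    vals = [round(i * step, 1) for i in range(11)]
    return [(l1, l2) for l1 in vals for l2 in vals]

grid = lambda_grid()
print(len(grid))  # 121 candidate pairs
```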
CN202010311105.4A 2020-04-20 2020-04-20 A kind of adjustment method of process parameter in polyester fiber polymerization process Active CN111738482B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010311105.4A CN111738482B (en) 2020-04-20 2020-04-20 A kind of adjustment method of process parameter in polyester fiber polymerization process

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010311105.4A CN111738482B (en) 2020-04-20 2020-04-20 A kind of adjustment method of process parameter in polyester fiber polymerization process

Publications (2)

Publication Number Publication Date
CN111738482A true CN111738482A (en) 2020-10-02
CN111738482B CN111738482B (en) 2022-04-29

Family

ID=72646931

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010311105.4A Active CN111738482B (en) 2020-04-20 2020-04-20 A kind of adjustment method of process parameter in polyester fiber polymerization process

Country Status (1)

Country Link
CN (1) CN111738482B (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108345219A (en) * 2018-03-01 2018-07-31 东华大学 Fypro production technology based on class brain memory GRU
CN109766989A (en) * 2018-12-05 2019-05-17 东华大学 An intelligent configuration method for process parameters of polyester fiber production process
CN109829587A (en) * 2019-02-12 2019-05-31 国网山东省电力公司电力科学研究院 Zonule grade ultra-short term and method for visualizing based on depth LSTM network
CN110705692A (en) * 2019-09-25 2020-01-17 中南大学 A spatial and temporal attention-based long-term and short-term memory network for product quality prediction of industrial nonlinear dynamic processes

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
RUIMIN XIE 等: "Using Gated Recurrence Units Neural Network for Prediction of Melt Spinning Properties", 《2017 11TH ASIAN CONTROL CONFERENCE (ASCC)》 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112379594A (en) * 2020-11-09 2021-02-19 武汉理工大学 Data-driven carbon fiber stock solution temperature control parameter optimization method
CN112348278A (en) * 2020-11-18 2021-02-09 中铁工程装备集团有限公司 Method for predicting shield tunneling machine earth bin pressure based on XGboost algorithm
CN112613646A (en) * 2020-12-08 2021-04-06 上海交通大学烟台信息技术研究院 Equipment state prediction method and system based on multi-dimensional data fusion
CN112666047A (en) * 2021-01-14 2021-04-16 新疆大学 Liquid viscosity detection method
CN114565143A (en) * 2022-02-16 2022-05-31 北京工业大学 Effluent BOD concentration prediction method based on PSLSTM neural network
CN114694762A (en) * 2022-03-02 2022-07-01 东华大学 Segmented modeling prediction method for polyester fiber polymerization process
CN115142160A (en) * 2022-08-22 2022-10-04 无锡物联网创新中心有限公司 Identification method of yarn strong and weak ring and related device
CN115142160B (en) * 2022-08-22 2023-12-19 无锡物联网创新中心有限公司 Identification method and related device for strong weak ring of yarn

Also Published As

Publication number Publication date
CN111738482B (en) 2022-04-29

Similar Documents

Publication Publication Date Title
CN111738482B (en) A kind of adjustment method of process parameter in polyester fiber polymerization process
He et al. Novel soft sensor development using echo state network integrated with singular value decomposition: Application to complex chemical processes
CN101315557B (en) Propylene polymerization production process optimal soft survey instrument and method based on genetic algorithm optimization BP neural network
CN107400935B (en) Adjustment Method of Melt Spinning Process Based on Improved ELM
CN109472397B (en) Adjustment method of polymerization process parameters based on viscosity change
Suryo et al. Improved time series prediction using LSTM neural network for smart agriculture application
CN108345219A (en) Fypro production technology based on class brain memory GRU
CN110263380B (en) A method for configuring segmented interval parameters for cascading modeling of spinning process
CN116484747A (en) A sewage intelligent monitoring method based on adaptive optimization algorithm and deep learning
CN103226728B (en) High density polyethylene polymerization cascade course of reaction Intelligent Measurement and yield optimization method
CN116562115A (en) New energy power generation prediction method based on machine learning and meteorological similar moment
CN112949836A (en) Method for carrying out regression prediction on-line migration learning on time-varying distribution data
Sukpancharoen et al. Data-driven prediction of electrospun nanofiber diameter using machine learning: A comprehensive study and web-based tool development
CN118053514A (en) Polyimide glass transition temperature prediction method based on interpretable machine learning
CN118734258A (en) Medium- and long-term runoff prediction method, device and medium
CN110032069B (en) A method for configuring segmented parameters of polyester fiber spinning process based on error compensation
CN103675005B (en) The industrial melt index soft measurement instrument of optimum FUZZY NETWORK and method
CN103675010A (en) Supporting-vector-machine-based industrial melt index soft measuring meter and method
CN117828387A (en) A data enhancement method for polyester polymerization process
CN113743652B (en) A prediction method of sugarcane crushing process based on deep feature recognition
Zhao et al. Multiobjective backbone network architecture search based on transfer learning in steel defect detection
CN111797574B (en) Integrated Gaussian process regression model method for polymer molecular weight distribution
CN115293406A (en) Photovoltaic power generation power prediction method based on Catboost and Radam-LSTM
CN115935818A (en) Elevator Energy Consumption Prediction Method and Device Based on EEFG-LSTM-Attention
Miura et al. Vegetable mass estimation based on monocular camera using convolutional neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant