CN116108932A

CN116108932A - Method for establishing fusion model of steel production process data and mechanism

Info

Publication number: CN116108932A
Application number: CN202310003136.7A
Authority: CN
Inventors: 孙杰; 武文腾; 吴豪; 彭文; 张殿华
Original assignee: Northeastern University China
Current assignee: Northeastern University China
Priority date: 2023-01-03
Filing date: 2023-01-03
Publication date: 2023-05-12

Abstract

The invention discloses a method for establishing a fusion model of steel production process data and mechanism, comprising the following steps: S1: acquiring and processing raw data; S2: establishing a mechanism model; S3: establishing a data and mechanism fusion model; S4: Bayesian optimization of the XGBoost model; S5: solving the data and mechanism fusion model. By modeling data and mechanism jointly, the method achieves deep perception of the production process and thereby accurate prediction of the quality target; it can guide the actual on-site production process and effectively improve the stability of product quality.

Description

A method for establishing a steel production process data and mechanism fusion model

Technical Field

The present invention relates to the technical field of automatic control of steel rolling, and in particular to a method for establishing a steel production process data and mechanism fusion model.

Background Art

Steel production involves complex physical and chemical changes and many random external disturbances. The individual processes have different functions but are interrelated, mutually supportive, and mutually constraining; integrated in series, they form a complex production system. Under this mode of production organization, the origin of product quality problems is highly uncertain, which leads to fluctuations in product quality. Many key process parameters affect the quality of steel products, the parameters within a process are nonlinearly coupled, and quality problems accumulate and are inherited from one process to the next. Improving the model's adaptability to complex dynamic operating conditions is therefore the key to further improving product quality.

The mathematical models used at steel production sites take several forms. The first is the static or semi-dynamic process mechanism model. Limited by simplifying assumptions and by changing, uncertain boundary conditions, it can hardly support high-precision quality control under complex operating conditions, and its predictions deviate considerably when operating conditions and running states change. The second is the data model, which often achieves good results in complex modeling scenarios; however, data-based modeling requires a large amount of data, takes a long time, and is not interpretable. In addition, there are statistical models based on empirical knowledge, which generally model poorly and are usually used only for special process situations.

How to exploit the respective advantages of process mechanism, production data, and empirical knowledge, and to see accurately into the complex relationships between key parameters such as process and quality, is the key problem to be solved in building steel production process models. A method for establishing a steel production process data and mechanism fusion model is therefore urgently needed.

Summary of the Invention

The purpose of the present invention is to provide a method for establishing a steel production process data and mechanism fusion model, which achieves deep perception of the production process through fusion modeling of data and mechanism, and thereby accurate prediction of the quality target.

To achieve the above object, the present invention provides a method for establishing a steel production process data and mechanism fusion model, comprising the following steps:

S1: acquisition and processing of raw data;

S2: establishment of the mechanism model;

S3: establishment of the data and mechanism fusion model;

S4: Bayesian optimization of the XGBoost model;

S5: solution of the data and mechanism fusion model.

Preferably, in step S1, the raw data are acquired as follows: field data are obtained from the process automation level of the computer control system for plate and strip rolling, and are exported and stored.

Preferably, in step S1, the raw data processing comprises the following steps:

S11: data processing: records with null or missing values are deleted from the field data, and text data are deleted, to obtain the production data;

S12: establishment of the original data set: the production data are screened to obtain the input features of the model, and the original data set is established.

Preferably, the field data are the setting data of the production process and the actual on-site measurement data fed back by the basic automation level.

Preferably, in step S2, the mechanism model takes the actual on-site production process parameters as input, and the mechanism model prediction data are calculated from the mechanism formulas of the rolling process.

Preferably, in step S3, the establishment of the data and mechanism fusion model specifically comprises the following steps:

S31: determination and processing of model data: the mechanism model prediction data are added to the original data set as an input feature, and the two are merged into a new data set;

S32: the new data set is divided into a training set and a test set;

S33: the new data set is standardized.

Preferably, in step S4, the Bayesian optimization of the XGBoost model proceeds as follows:

S41: establish a hyperparameter space and randomly generate n initial hyperparameter combinations in it; train the model with each initial combination and calculate the $r^2$ of the predictions, giving the initial Bayesian data set:

$$D_0 = \{(X_1, y_1), (X_2, y_2), \ldots, (X_n, y_n)\}$$

where $X_n$ denotes the n-th hyperparameter combination and $y_n$ is the corresponding $r^2$;

S42: substitute the initial Bayesian data set $D_0$ into a Gaussian process, calculate its Gaussian distribution model, and use the expected improvement (EI) acquisition function to select the parameter combination expected to give the largest $r^2$, which becomes the next hyperparameter combination to be evaluated;

S43: calculate the $r^2$ of this next hyperparameter combination with the model, add it to the initial Bayesian data set $D_0$ to form a new Bayesian data set $D$, update the Gaussian process regression, and repeat steps S42 and S43 until the set maximum number of iterations is reached;

S44: after the optimization stops, select the hyperparameter combination with the largest $r^2$ as the optimal hyperparameters.

Preferably, in step S5, a model is built with the model parameters obtained by Bayesian optimization, the training set data are imported to train the model, and the model performance is evaluated on the test set data.

Preferably, the indicators for evaluating model performance include the mean square error (MSE), the mean absolute percentage error (MAPE), and the mean absolute error (MAE).

The method for establishing a steel production process data and mechanism fusion model adopted by the present invention therefore has the following technical effects: fusion modeling of data and mechanism achieves deep perception of the production process and thereby accurate prediction of the quality target, which can guide the actual on-site production process and effectively improve the stability of product quality.

The technical solution of the present invention is described in further detail below with reference to the accompanying drawings and embodiments.

Brief Description of the Drawings

FIG. 1 is a schematic diagram of the hot continuous rolling production line used in the method for establishing a steel production process data and mechanism fusion model according to the present invention;

FIG. 2 is the calculation flow of the mechanism model in the method according to the present invention;

FIG. 3 shows the modeling steps of the data and mechanism fusion model in the method according to the present invention;

FIG. 4 shows the Bayesian optimization iteration process of the data and mechanism fusion model in the method according to the present invention;

FIG. 5 shows the prediction results of the data and mechanism fusion model in the method according to the present invention;

FIG. 6 shows the prediction results of the pure data-driven model used for comparison with the method according to the present invention.

Reference numerals

1. Heating furnace; 2. Slab; 3. Roughing mill group; 4. Sensor; 5. Heat-retention cover; 6. Finishing mill group; 7. Laminar cooling; 8. Coiler; 9. Width gauge; 10. Hot metal detector; 11. Pyrometer; 12. Thickness gauge.

Detailed Description

The technical solution of the present invention is further described below with reference to the accompanying drawings and embodiments.

Unless otherwise defined, technical or scientific terms used in the present invention have the ordinary meanings understood by a person of ordinary skill in the field to which the present invention belongs.

It will be apparent to those skilled in the art that the present invention is not limited to the details of the exemplary embodiments described above, and that it can be implemented in other specific forms without departing from its spirit or essential features. The embodiments should therefore be regarded in all respects as exemplary and non-restrictive. The scope of the present invention is defined by the appended claims rather than by the above description, and all variations falling within the meaning and range of equivalents of the claims are intended to be included in the present invention. No reference numeral in the claims should be regarded as limiting the claim concerned.

In addition, it should be understood that although this specification is organized by embodiment, not every embodiment contains only one independent technical solution. The specification is written this way only for clarity. Those skilled in the art should treat the specification as a whole, and the technical solutions of the embodiments may be suitably combined to form other embodiments that those skilled in the art can understand. These other embodiments are also covered by the protection scope of the present invention.

It should also be understood that the specific embodiments described above serve only to explain the present invention and do not limit its protection scope. Any equivalent replacement or change made by a person skilled in the art, within the technical scope disclosed by the present invention and according to its technical solutions and inventive concept, shall be covered by the protection scope of the present invention.

Technologies, methods, and equipment known to a person of ordinary skill in the relevant field may not be discussed in detail, but where appropriate they should be regarded as part of the specification.

The contents of the prior art documents cited in this specification are incorporated herein by reference in their entirety and are therefore part of the disclosure of the present invention.

Embodiment 1

A method for establishing a steel production process data and mechanism fusion model. Figure 1 shows the layout of the hot continuous rolling production line. In the finishing area the rolled piece is deformed in six stands to give the final product. A width gauge at the exit of the last stand measures the width of the final product. The finishing spread is the difference between the final product width and the width at the finishing entry.

The specific implementation steps are as follows:

1. Raw data acquisition and processing

(1) Acquisition of raw data

On the hot continuous rolling line, the measured width of each strip is obtained on site, from which the rolling spread of the slab is derived. Each strip corresponds to one set of production process parameters, which serves as the source data for data-driven modeling. The field data are obtained from the process automation level of the computer control system for plate and strip rolling; they consist of the setting data of the production process and the actual on-site measurement data fed back by the basic automation level. The field data are exported and stored.

(2) Processing of raw data

Besides the required process parameters, the data obtained on site also contain some text-type data, such as material number and production time. These features are clearly unrelated to the finishing spread and are deleted manually. In addition, records with null values caused by missed measurements or by packet loss during data transmission are deleted, because they make up only a small proportion of the total data. In total, 8 records with missing data were removed; the remaining 1,860 records are the production data.

(3) Establishment of the original data set

The final data set available for calculation contains 1,860 production records. After unnecessary features are removed, 30 input features remain, comprising values measured by on-site instruments and setting values calculated by the rolling model. The measured data are: intermediate slab length, F1 entry temperature, F6 exit temperature, F1 entry slab width, F6 exit slab width, F1 entry slab thickness, F1 to F6 rolling force, and F1 to F6 rolling speed. The setting data calculated by the rolling model are: F1 to F6 roll gap and F1 to F6 threading tension. These data form the original data set.
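The cleaning steps described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the column names (`material_no`, `production_time`, `f1_entry_width`, `f6_exit_width`) and the helper `clean_records` are hypothetical, since the source names the kinds of fields but not their labels.

```python
from __future__ import annotations

from typing import Any

# Hypothetical labels for the text-type columns that are unrelated to spread.
TEXT_COLUMNS = {"material_no", "production_time"}


def clean_records(records: list[dict[str, Any]]) -> list[dict[str, float]]:
    """Drop text-type columns, then drop every record with a missing value."""
    cleaned = []
    for rec in records:
        numeric = {k: v for k, v in rec.items() if k not in TEXT_COLUMNS}
        # Discard the whole record if any remaining field is empty.
        if any(v is None or v == "" for v in numeric.values()):
            continue
        cleaned.append({k: float(v) for k, v in numeric.items()})
    return cleaned


raw = [
    {"material_no": "A001", "production_time": "2022-01-01 08:00",
     "f1_entry_width": 1250.0, "f6_exit_width": 1256.5},
    {"material_no": "A002", "production_time": "2022-01-01 08:05",
     "f1_entry_width": 1250.0, "f6_exit_width": None},  # e.g. a lost packet
]
production_data = clean_records(raw)

# Finishing spread = final product width minus finishing entry width.
spread = [r["f6_exit_width"] - r["f1_entry_width"] for r in production_data]
```

Deleting incomplete records rather than imputing them matches the source's reasoning that such records are a small share of the data.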

2. Establishment of the mechanism model

In hot continuous finishing rolling, the main factors affecting the change in slab width are the mill entry width, thickness, slab temperature, roll radius, roll speed, rolling speed, and friction coefficient. This embodiment therefore uses the Bakhtinov model as the mechanism model. The calculation formula is as follows:

$$\Delta b = 1.15\,\frac{\Delta h}{2 h_0}\left(\sqrt{R\,\Delta h} - \frac{\Delta h}{2\mu}\right) \tag{1}$$

where $\Delta b$ is the slab spread, $\Delta h$ is the reduction, $h_0$ is the initial slab thickness, $R$ is the roll radius, and $\mu$ is the friction coefficient.

The friction coefficient can be calculated from the following formula:

$$\mu = 0.8\,(1.05 - 0.0005\,\theta) \tag{2}$$

where $\theta$ is the temperature of the rolled piece, with 700 °C < $\theta$ < 1200 °C.

Figure 2 is the flow chart for calculating the multi-pass spread in finishing rolling. In hot continuous finishing rolling, the slab entry thickness cannot be measured at any stand except the first and can only be calculated cumulatively from the rolling schedule. The entry thickness of each stand is calculated, all other inputs are taken from actual process data, and these values are substituted into the mechanism formula to obtain the slab spread after each stand.
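The pass-by-pass calculation can be sketched as below. This sketch rests on two assumptions that should be checked against the original formula images: the single-stand spread uses the commonly cited form of Bakhtinov's formula, with the entry thickness in the denominator, and the exit thickness of each stand is taken as a known input. The function names and all numeric values are illustrative only.

```python
import math


def friction_coefficient(theta_c: float) -> float:
    """Equation (2): mu = 0.8 * (1.05 - 0.0005 * theta), theta in deg C."""
    if not 700.0 < theta_c < 1200.0:
        raise ValueError("formula is stated for 700 < theta < 1200 deg C")
    return 0.8 * (1.05 - 0.0005 * theta_c)


def bakhtinov_spread(h_in: float, h_out: float, roll_radius: float, mu: float) -> float:
    """Single-stand spread, commonly cited Bakhtinov form (assumed here):
    db = 1.15 * dh / (2 * h_in) * (sqrt(R * dh) - dh / (2 * mu))."""
    dh = h_in - h_out  # reduction at this stand
    return 1.15 * dh / (2.0 * h_in) * (math.sqrt(roll_radius * dh) - dh / (2.0 * mu))


def finishing_spread(h_entry, exit_thicknesses, roll_radii, temperatures):
    """Accumulate spread stand by stand; the entry thickness of the next
    stand is the exit thickness of the current one."""
    total, h_in, per_stand = 0.0, h_entry, []
    for h_out, radius, theta in zip(exit_thicknesses, roll_radii, temperatures):
        db = bakhtinov_spread(h_in, h_out, radius, friction_coefficient(theta))
        per_stand.append(db)
        total += db
        h_in = h_out
    return total, per_stand


# Illustrative six-stand schedule (mm and deg C); not data from the source.
total_spread, stand_spreads = finishing_spread(
    h_entry=35.0,
    exit_thicknesses=[20.0, 12.0, 8.0, 5.5, 4.2, 3.5],
    roll_radii=[400.0, 400.0, 400.0, 400.0, 350.0, 350.0],
    temperatures=[1000.0, 980.0, 960.0, 940.0, 920.0, 900.0],
)
```

The resulting `total_spread` plays the role of the mechanism model prediction that is later added to the data set as an extra feature.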

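Accumulated over the six stands, the final slab spread is:

$$\Delta b_{\text{total}} = \sum_{i=1}^{6} \Delta b_i$$

where $\Delta b_i$ is the spread at the $i$-th stand.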

3. Establishment of the data and mechanism fusion model

(1) Determination and processing of model data

The mechanism model prediction is added to the original data set as an input feature for the finishing spread, producing a new data set used to predict the finishing spread in hot continuous rolling. The original 30 features are supplemented with the spread predicted by the mechanism model as the 31st feature, forming the new data set.

The new data set is divided into a training set and a test set in a ratio of 8:2. The training set contains 1,488 records and the test set contains 372 records.

Because the features in the new data set differ in type and their units and scales differ widely, Z-Score standardization is applied; that is, data with different dimensions are converted into Z-Score values on a common scale. The formula is:

$$z = \frac{x - \mu}{\sigma}$$

where $x$ is the original value, $\mu$ is the mean of the original data, $\sigma$ is the standard deviation of the original data, and $N$ is the number of data.
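The data preparation described in this subsection (adding the mechanism prediction as the 31st feature, the 8:2 split, and Z-Score standardization) can be sketched in plain Python. The data below are synthetic placeholders, the helper names are hypothetical, and fitting the mean and standard deviation on the training set only is a choice made in this sketch to avoid leaking test information; the source states only that the new data set is standardized.

```python
import random
import statistics


def add_mechanism_feature(rows, mechanism_pred):
    """Append the mechanism-model spread prediction as one extra input column."""
    return [row + [pred] for row, pred in zip(rows, mechanism_pred)]


def train_test_split(rows, targets, train_ratio=0.8, seed=0):
    """Shuffle indices reproducibly, then cut at the 8:2 boundary."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    cut = round(len(idx) * train_ratio)
    tr, te = idx[:cut], idx[cut:]
    return ([rows[i] for i in tr], [targets[i] for i in tr],
            [rows[i] for i in te], [targets[i] for i in te])


def fit_zscore(rows):
    """Per-column mean and population standard deviation."""
    cols = list(zip(*rows))
    return [statistics.fmean(c) for c in cols], [statistics.pstdev(c) for c in cols]


def apply_zscore(rows, means, stds):
    """z = (x - mu) / sigma, column by column; constant columns map to 0."""
    return [[(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, stds)]
            for row in rows]


# Synthetic stand-in for the 1860 x 30 production data set.
rng = random.Random(42)
X = [[rng.gauss(0, 1) for _ in range(30)] for _ in range(1860)]
mech = [rng.gauss(5, 1) for _ in range(1860)]  # placeholder mechanism predictions
y = [rng.gauss(5, 1) for _ in range(1860)]     # placeholder measured spread

X_new = add_mechanism_feature(X, mech)         # 31 features per record
X_tr, y_tr, X_te, y_te = train_test_split(X_new, y)
means, stds = fit_zscore(X_tr)                 # statistics from training data only
X_tr_z = apply_zscore(X_tr, means, stds)
X_te_z = apply_zscore(X_te, means, stds)
```

With 1,860 records the split reproduces the 1,488 and 372 record counts given in the text.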

(2) Establishing the data and mechanism fusion model

XGBoost is an ensemble learning method that uses CART trees as base learners within a boosting framework and adds a regularization term to reduce model complexity and overfitting, which gives it strong generalization ability. A preliminary XGBoost model is built in Python with default parameter values and trained on the training set, giving an initial data and mechanism fusion model.

4. Bayesian optimization of the XGBoost model

The model parameters are selected by Bayesian optimization to improve the prediction performance of the model. Figure 3 shows the modeling and tuning process of the data and mechanism fusion model.

The Bayesian optimization process is as follows:

(1) Establish a hyperparameter space and randomly generate 5 initial hyperparameter combinations in it. Train the model with each initial combination and calculate the $r^2$ of the predictions, giving the initial Bayesian data set:

$$D_0 = \{(X_1, y_1), (X_2, y_2), \ldots, (X_n, y_n)\}$$

where $X_n$ denotes the n-th hyperparameter combination and $y_n$ is the corresponding $r^2$. The formula for $r^2$ is:

$$r^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$

where $y_i$ is the measured finishing spread for the $i$-th set of process parameters, $\hat{y}_i$ is the predicted spread for the $i$-th set of process parameters, $n$ is the number of data, and $\bar{y}$ is the mean of the measured data.
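As a small worked example of the coefficient of determination defined above, the following sketch computes $r^2$ for four hypothetical spread values. The function name `r_squared` and the numbers are illustrative and do not come from the source.

```python
def r_squared(y_true, y_pred):
    """r^2 = 1 - SS_res / SS_tot for measured and predicted values."""
    n = len(y_true)
    y_mean = sum(y_true) / n
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((y - y_mean) ** 2 for y in y_true)             # total sum of squares
    return 1.0 - ss_res / ss_tot


measured = [5.0, 6.0, 7.0, 8.0]    # hypothetical measured spread, mm
predicted = [5.5, 6.0, 6.5, 8.0]   # hypothetical predictions, mm
score = r_squared(measured, predicted)
```

Here the residual sum of squares is 0.5 and the total sum of squares is 5.0, so the score is 1 - 0.5/5.0 = 0.9.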

(2) Substitute $D_0$ into a Gaussian process, calculate its Gaussian distribution model, and use the expected improvement (EI) acquisition function to select the hyperparameter combination expected to give the largest $r^2$, which becomes the next hyperparameter combination to be evaluated;

(3) Substitute this next hyperparameter combination into the model, calculate its $r^2$, add the result to the initial Bayesian data set $D_0$ to form a new Bayesian data set $D$, and update the Gaussian process regression.

Repeat steps (2) and (3) until the set maximum number of iterations is reached.

(4) After the optimization stops, take the hyperparameter combination with the largest $r^2$ as the optimal hyperparameters.
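Steps (1) to (4) can be sketched as a small Gaussian-process loop with an expected improvement acquisition function. This is an illustration under stated assumptions rather than the authors' implementation: the source does not specify the kernel, its length scale, or how EI is maximized, so the RBF kernel, the random candidate sampling, and all names here (`bayes_opt`, `toy_r2`, and so on) are hypothetical. The toy objective stands in for "train XGBoost with these hyperparameters and return $r^2$", with hyperparameters scaled to the unit cube.

```python
import math

import numpy as np


def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two sets of points."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length_scale**2)


def gp_posterior(X, y, X_new, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP at X_new."""
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    K_s = rbf_kernel(X, X_new)
    mu = K_s.T @ np.linalg.solve(K, y)
    var = 1.0 - np.einsum("ij,ij->j", K_s, np.linalg.solve(K, K_s))
    return mu, np.sqrt(np.clip(var, 1e-12, None))


def expected_improvement(mu, sigma, best):
    """EI for maximisation: E[max(f - best, 0)] under N(mu, sigma^2)."""
    z = (mu - best) / sigma
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)
    return (mu - best) * cdf + sigma * pdf


def bayes_opt(objective, dim, n_init=5, n_iter=20, n_candidates=500, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.random((n_init, dim))              # step (1): random initial combinations
    y = np.array([objective(x) for x in X])    # their scores form D0
    for _ in range(n_iter):
        cand = rng.random((n_candidates, dim))
        y_c = y - y.mean()                     # centre scores for the zero-mean GP
        mu, sigma = gp_posterior(X, y_c, cand) # step (2): Gaussian process model
        ei = expected_improvement(mu, sigma, y_c.max())
        x_next = cand[np.argmax(ei)]           # candidate with the largest EI
        X = np.vstack([X, x_next])             # step (3): extend the data set D
        y = np.append(y, objective(x_next))
    best = int(np.argmax(y))                   # step (4): best observed score
    return X[best], y[best]


def toy_r2(x):
    """Stand-in objective with its maximum of 1.0 at (0.3, 0.3)."""
    return 1.0 - float(((x - 0.3) ** 2).sum())


x_best, y_best = bayes_opt(toy_r2, dim=2)
```

The sketch defaults to 20 iterations to stay quick; the embodiment uses a maximum of 200.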

Table 1 lists the model hyperparameters to be optimized, their search spaces, and their optimal values. The maximum number of iterations is 200, and the iteration process of the parameter optimization is shown in Figure 4. All model parameters not listed in the table keep their default values, and the optimal hyperparameter combination is finally obtained by Bayesian optimization.

Table 1 XGBoost parameters to be optimized

5. Solution of the data and mechanism fusion model

With the optimal hyperparameter combination obtained by Bayesian optimization, the data and mechanism fusion model is built and trained. To verify the accuracy of the model, its performance must be evaluated. The mean square error (MSE), the mean absolute percentage error (MAPE), and the mean absolute error (MAE) are used to evaluate the model's predictive ability. The formulas of the evaluation indicators are:

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|$$

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$$

where $y_i$ is the measured finishing spread for the $i$-th set of process parameters, $\hat{y}_i$ is the predicted spread for the $i$-th set of process parameters, $n$ is the number of data, and $\bar{y}$ is the mean of the measured data.
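The three evaluation indicators can be computed as in the sketch below, which uses the standard definitions of MSE, MAPE (as a mean absolute percentage error), and MAE. The function names and the three sample values are illustrative and are not taken from the source.

```python
def mse(y_true, y_pred):
    """Mean square error."""
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent; undefined if any y is zero."""
    return 100.0 * sum(abs((y - p) / y) for y, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


measured = [5.0, 10.0, 20.0]   # hypothetical measured spread, mm
predicted = [4.0, 11.0, 20.0]  # hypothetical predictions, mm

metrics = {
    "MSE": mse(measured, predicted),
    "MAPE": mape(measured, predicted),
    "MAE": mae(measured, predicted),
}
```

For these values the absolute errors are 1, 1, and 0, so MSE and MAE are both 2/3, and the percentage errors of 20%, 10%, and 0% average to a MAPE of 10%.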

Table 2 gives the performance metrics of the data and mechanism fusion model. On the test set the mean absolute error is 1.0809 mm and the coefficient of determination is 0.9598. The model generalizes well and is suitable for prediction tasks in the actual production process.

Table 2 Performance metrics of the data and mechanism fusion model

| Metric | MSE | MAPE | MAE |
| --- | --- | --- | --- |
| Training set | 0.2152 | 1.0964% | 0.3386 |
| Test set | 2.1909 | 3.3901% | 1.0809 |

Comparative Example 1

To verify the prediction performance of the data and mechanism fusion model, a pure data-driven model, also tuned by Bayesian optimization, was built and compared with it.

Table 3 gives the performance metrics of the pure data model. On the test set its MSE, MAPE, and MAE are all larger than those of the data and mechanism fusion model; that is, the pure data model performs worse.

Table 3 Performance metrics of the pure data model

| Metric | MSE | MAPE | MAE |
| --- | --- | --- | --- |
| Training set | 2.1533 | 0.3420% | 1.0807 |
| Test set | 3.2826 | 3.8288% | 1.2995 |

Figures 5 and 6 show the prediction results of the data and mechanism fusion model and of the pure data model, respectively. The predictions of the fusion model are closer to the actual values and are more accurate. Adding the mechanism model prediction as a feature guides the training of the model to some extent; the predictions are more reliable and have smaller errors, and the prediction performance of the model improves considerably.
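The size of the improvement can be read off the test-set rows of Tables 2 and 3. The short sketch below computes the relative reduction of each metric; the table values are from the source, while the helper name `relative_reduction` is introduced here for illustration.

```python
# Test-set metrics taken from Table 2 (fusion model) and Table 3 (pure data model).
fusion_test = {"MSE": 2.1909, "MAPE": 3.3901, "MAE": 1.0809}
pure_data_test = {"MSE": 3.2826, "MAPE": 3.8288, "MAE": 1.2995}


def relative_reduction(baseline: float, improved: float) -> float:
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline


reductions = {
    name: relative_reduction(pure_data_test[name], fusion_test[name])
    for name in fusion_test
}

for name, pct in reductions.items():
    print(f"{name}: {pct:.1f}% lower with the fusion model")
```

On these figures the fusion model lowers the test MSE by about a third, with smaller relative gains in MAE and MAPE.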

The method for establishing a steel production process data and mechanism fusion model adopted by the present invention therefore adds the mechanism model prediction to the steel production data as an additional dimension, merges the two into a new data set, uses the mechanism data to guide model training, and uses the Bayesian optimization algorithm to select the optimal combination of model parameters, achieving accurate prediction of the quality target. It can guide the actual on-site production process and effectively improve the stability of product quality.

Finally, it should be noted that the above embodiments only illustrate the technical solution of the present invention and do not limit it. Although the present invention has been described in detail with reference to the preferred embodiments, those of ordinary skill in the art should understand that the technical solution of the present invention may still be modified or replaced with equivalents, and that such modifications or equivalent replacements do not cause the modified technical solution to depart from the spirit and scope of the technical solution of the present invention.

Claims

1. A method for establishing a steel production process data and mechanism fusion model, characterized in that it includes the following steps:

S1: raw data acquisition and processing;

S2: Establishment of the mechanism model;

S3: Establishment of data and mechanism fusion model;

S4: Bayesian optimization of XGBoost model;

S5: Data and mechanism fusion model solution.

2. A method for establishing a steel production process data and mechanism fusion model according to claim 1, characterized in that: in step S1, the method for obtaining original data is: based on the process automation level of the plate and strip rolling computer control system, field data is obtained, and the field data is exported and stored.

3. The method for establishing a steel production process data and mechanism fusion model according to claim 1, characterized in that: in step S1, the original data processing method includes the following steps:

S11. Data processing: Delete null values and missing values from the field data, and delete the text data to obtain production data;

S12. Establishment of original data set: Screen the production data, obtain the input features of the model, and establish the original data set.

4. A method for establishing a steel production process data and mechanism fusion model according to claim 2, characterized in that the field data is the setting data of the production process and the actual field detection data fed back by the basic automation level.

5. A method for establishing a steel production process data and mechanism fusion model according to claim 1, characterized in that: in step S2, the establishment of the mechanism model takes the actual production process on site as input, and obtains the mechanism model prediction data based on the rolling process mechanism formula.

6. The method for establishing a steel production process data and mechanism fusion model according to claim 1, characterized in that: in step S3, the establishment of the data and mechanism model specifically includes the following steps:

S31, determination and processing of model data, adding the mechanism model prediction data as an input feature to the original data set and merging them into a new data set;

S32, dividing the new data set into a training set and a test set;

S33. Standardize the new data set.

7. The method for establishing a steel production process data and mechanism fusion model according to claim 1, characterized in that: in step S4, the specific process of the Bayesian optimization XGBoost model is as follows:

S41: Establish a hyperparameter space and randomly generate n initial hyperparameter combinations in it; train the model with each initial combination and calculate the $r^2$ of the predictions, giving the initial Bayesian data set:

$$D_0 = \{(X_1, y_1), (X_2, y_2), \ldots, (X_n, y_n)\}$$

where $X_n$ denotes the n-th hyperparameter combination and $y_n$ is the corresponding $r^2$;

S42: Substitute the initial Bayesian data set $D_0$ into a Gaussian process, calculate its Gaussian distribution model, and use the expected improvement (EI) acquisition function to select the parameter combination expected to give the largest $r^2$, which becomes the next hyperparameter combination to be evaluated;

S43: Calculate the $r^2$ of this next hyperparameter combination with the model, add it to the initial Bayesian data set $D_0$ to form a new Bayesian data set $D$, update the Gaussian process regression, and repeat steps S42 and S43 until the set maximum number of iterations is reached;

S44: After the optimization stops, select the hyperparameter combination with the largest $r^2$ as the optimal hyperparameters.

8. A method for establishing a steel production process data and mechanism fusion model according to claim 1, characterized in that: in step S5, a model is built based on the optimal hyperparameter combination obtained by Bayesian optimization, training set data is imported for model training, and model performance is evaluated using test set data.

9. A method for establishing a steel production process data and mechanism fusion model according to claim 8, characterized in that the indicators for model performance evaluation include mean square error (MSE), mean absolute percentage error (MAPE), and mean absolute error (MAE).