CN116231749A

CN116231749A - New energy power system dispatching method based on digital twin

Info

Publication number: CN116231749A
Application number: CN202310062843.3A
Authority: CN
Inventors: 白易杰; 张林音; 谭勇; 胡晓华; 王璐; 李�真
Original assignee: Neixiang Power Supply Co of State Grid Henan Electric Power Co Ltd
Current assignee: Neixiang Power Supply Co of State Grid Henan Electric Power Co Ltd
Priority date: 2023-01-17
Filing date: 2023-01-17
Publication date: 2023-06-06

Abstract

The invention belongs to the technical field of new energy power system dispatching, and particularly relates to a new energy power system dispatching method based on a digital twin. The method comprises the following steps: constructing a short-term direct probability prediction model suitable for photovoltaic power; constructing a power digital twin system that uses the ubiquitous Internet of Things and power data streams to assist decision-making for power grid operation, management and regulation through real-time situation awareness and real-time virtual deduction; and, based on the new energy interval prediction result, establishing a robust scheduling model that considers the automatic generation control (AGC) response of the units to cope with fluctuations in new energy output and ensure the power balance of the grid. The invention achieves a higher prediction area coverage rate and a smaller average prediction interval width ratio. By drawing on the real-time data already available at every level on the dispatching side, the prediction model can be updated in real time, which improves its applicability and generalization to local grid conditions; the existing dispatching system is optimized and improved on the basis of the real-time interval prediction results for new energy.

Description

New energy power system dispatching method based on digital twin

Technical Field

The present invention belongs to the technical field of new energy power system dispatching, and specifically relates to a new energy power system dispatching method based on a digital twin.

Background Art

New energy is being connected to the grid at large capacity and on a large scale, and integrated new energy power systems combining wind, solar, hydrogen and other sources are taking shape at an accelerating pace. Grid dispatching faces new challenges, and the risks to stable operation are gradually increasing.

To ensure the secure, stable and economical operation of large power grids, grid dispatching systems are equipped with load forecasting modules, which can anticipate large load fluctuations before they occur. Through emergency control measures such as adjusting system output and shedding load, they prevent power quality problems such as voltage sags or even interruptions, laying a solid foundation for the rational formulation of dispatching plans.

However, unlike traditional thermal power units, which are flexible and controllable, new energy sources are strongly volatile and random. Connecting large-scale new energy power stations increases the diversity of system dispatching but also reduces its reliability. How can dispatching plans be arranged reasonably while making full use of multiple forms of energy? How can the secure and stable operation of the grid be balanced against economic efficiency? These questions have become difficulties in dispatching new energy power systems, and adding a power prediction module for new energy units is one of the keys to solving them.

Existing new energy prediction systems are generally customized to the operation of a particular grid; they come in many varieties, and their accuracy and robustness need improvement. In addition, most existing new energy prediction systems are offline models trained on historical data. They do not make full use of real-time operating information such as real-time meteorological data, output adjustment strategies, sensor locations and channel information, so the models generalize poorly and cannot give more accurate predictions that reflect the actual conditions of the local grid. Against this background, existing dispatching platforms manage new energy in an offline mode: dispatchers cannot grasp the real-time operating status of new energy devices and systems in time, nor can they regulate reasonably according to the current actual output. As the digital transformation of the grid deepens and the requirements for refined operation grow, the master station side needs to predict the operating status of new energy power stations more accurately and adjust the grid dispatching plan reasonably according to current operating conditions, in order to further accommodate new energy and further improve dispatchers' ability to manage the operational security risks of new energy grids.

Summary of the Invention

The purpose of the present invention is to overcome the shortcomings of the prior art by providing a new energy power system dispatching method based on a digital twin. The method has a higher prediction area coverage rate and a smaller average prediction interval width ratio. Based on the real-time data already available at every level on the dispatching side, it allows the prediction model to be updated in real time, improving the model's applicability and generalization to local grid conditions, and it optimizes and improves the existing dispatching system on the basis of real-time interval prediction results for new energy.

The object of the present invention is achieved as follows: a new energy power system dispatching method based on a digital twin, comprising the following steps:

Step S1: construct a short-term direct probability prediction model suitable for photovoltaic power;

Step S2: construct a power digital twin system that uses the ubiquitous Internet of Things and power data streams to assist decision-making for power grid operation, management and regulation through real-time situation awareness and real-time virtual deduction;

Step S3: based on the new energy interval prediction results, establish a robust scheduling model that considers the automatic generation control (AGC) response of the units to cope with fluctuations in new energy output and ensure the power balance of the grid.
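The text does not give the formulation of the robust scheduling model in step S3. As a minimal sketch of the idea only, the Python below assumes an affine AGC policy: each unit offsets a fixed share of the photovoltaic deviation from the interval midpoint, and a schedule counts as robust if every unit stays within its limits at both ends of the predicted interval. The names `Unit`, `agc_share`, `agc_response` and `is_robust` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    base_mw: float    # scheduled output at the forecast midpoint
    p_min: float      # lower output limit (MW)
    p_max: float      # upper output limit (MW)
    agc_share: float  # assumed AGC participation factor; shares sum to 1


def agc_response(units, pv_forecast, pv_actual):
    """Each unit offsets its share of the PV deviation, keeping total supply constant."""
    deviation = pv_actual - pv_forecast
    return [u.base_mw - u.agc_share * deviation for u in units]


def is_robust(units, load_mw, pv_low, pv_high, tol=1e-6):
    """Check power balance and unit limits over the whole PV interval [pv_low, pv_high]."""
    pv_mid = 0.5 * (pv_low + pv_high)
    # The base schedule must balance the load at the interval midpoint.
    if abs(sum(u.base_mw for u in units) + pv_mid - load_mw) > tol:
        return False
    # Participation factors must absorb the full deviation.
    if abs(sum(u.agc_share for u in units) - 1.0) > tol:
        return False
    # Unit outputs are linear in the PV output, so the two endpoints are the worst cases.
    for pv in (pv_low, pv_high):
        outputs = agc_response(units, pv_mid, pv)
        if any(not (u.p_min <= p <= u.p_max) for u, p in zip(units, outputs)):
            return False
    return True
```

Under this assumption a narrower prediction interval directly enlarges the set of feasible schedules, which is one reason interval width matters to the dispatcher as well as coverage.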

Step S1, constructing a short-term direct probability prediction model suitable for photovoltaic power, includes: preprocessing the initial data set, removing outliers based on Tukey's test, and extracting strongly correlated meteorological variables using grey relational analysis; improving the NGBoost meta-model by means of a generalized natural gradient calculation method; and fusing models with a Blending architecture to further strengthen the learning effect;

The model data set D contains n_D samples and m features, that is, D = {(x_i, y_i)} (x_i ∈ R^m, y_i ∈ R), where x_i is the feature vector of the i-th sample, y_i is the label value (true value) of the i-th sample, and i ∈ (1, n_D).

The preprocessing of the initial data set, removing outliers based on Tukey's test and extracting strongly correlated meteorological variables using grey relational analysis, includes:

1) Normalize the time series of each variable. Take the k-th of the n meteorological variable sequences as the comparison sequence S^k(t) and the photovoltaic power sequence as the reference sequence S⁰(t), and compute the absolute value of their difference as the sequence Δ^k(t), where k ∈ (1, n);

Δ^k(t) = |S^k(t) - S⁰(t)| (1)

2) Calculate the correlation coefficient η^k(t):

Where: Min(·) and Max(·) denote the minimum and maximum values of a sequence; ρ is the resolution coefficient;

3) Solve for the correlation degree γ^k:

Where: T_n is the sequence length;

4) Set a threshold and select the variables whose correlation degree is greater than the threshold to form a new model data set.
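The formulas for the correlation coefficient and the correlation degree are not reproduced in this text. The sketch below therefore assumes the textbook form of grey relational analysis, which matches the symbols defined above: η^k(t) = (Δ_min + ρ·Δ_max) / (Δ^k(t) + ρ·Δ_max), with Δ_min and Δ_max taken over all variables and time steps, and γ^k as the mean of η^k(t) over the T_n time steps. Min-max normalization and ρ = 0.5 are likewise assumptions, and the function names are illustrative.

```python
def min_max_normalize(seq):
    """Scale a sequence to [0, 1]; a constant sequence maps to all zeros."""
    lo, hi = min(seq), max(seq)
    if hi == lo:
        return [0.0 for _ in seq]
    return [(v - lo) / (hi - lo) for v in seq]


def grey_relational_degrees(reference, comparisons, rho=0.5):
    """Return {variable name: correlation degree} against the reference sequence."""
    s0 = min_max_normalize(reference)
    # Step 1: absolute difference sequences, as in equation (1).
    deltas = {
        name: [abs(a - b) for a, b in zip(min_max_normalize(seq), s0)]
        for name, seq in comparisons.items()
    }
    d_min = min(min(d) for d in deltas.values())
    d_max = max(max(d) for d in deltas.values())
    degrees = {}
    for name, d in deltas.items():
        if d_max == 0.0:
            degrees[name] = 1.0  # every sequence coincides with the reference
            continue
        # Step 2: correlation coefficient at each time step (assumed standard form).
        eta = [(d_min + rho * d_max) / (x + rho * d_max) for x in d]
        # Step 3: correlation degree as the time average.
        degrees[name] = sum(eta) / len(eta)
    return degrees


def select_variables(degrees, threshold):
    """Step 4: keep the variables whose correlation degree exceeds the threshold."""
    return [name for name, g in degrees.items() if g > threshold]
```

A variable that tracks photovoltaic power exactly after normalization receives a degree of 1, and the threshold then decides how far below that a variable may fall and still enter the model data set.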

Improving the NGBoost meta-model by means of the generalized natural gradient calculation method includes:

The process of solving for the natural gradient is improved by using the Fisher information to establish a link between the ordinary gradient and the natural gradient, as follows:

A scoring function S(θ, y_i) is established based on the Shannon information of y_i:

S(θ, y_i) = -log P_θ(y_i) (4)

Where: P_θ(y_i) is the probability value of y_i under the predicted probability distribution; θ is the parameter vector of the predicted probability distribution;
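As a small illustration of equation (4), the sketch below evaluates the score for a Gaussian predictive distribution with θ = (μ, σ), the form the text adopts for each sample point. For a continuous variable, P_θ(y_i) is read here as the probability density, which is an interpretation; `log_score` is an illustrative name.

```python
import math


def log_score(mu, sigma, y):
    """S(theta, y) = -log P_theta(y) for a Gaussian density with theta = (mu, sigma)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (y - mu) / sigma
    return 0.5 * math.log(2.0 * math.pi) + math.log(sigma) + 0.5 * z * z
```

The score is lowest when the observation sits at the predicted mean, and it penalizes both a misplaced mean and an overconfident (too small) σ.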

Perform a Taylor expansion and discard the remainder terms of third order and above:

Where: d′ is the infinitesimal step vector by which θ moves along the natural gradient;

Transform the Euclidean space into a statistical manifold and treat equation (5) in Riemannian space:

The calculation of the first-order term can be simplified as:

The remaining part is expressed as:

Where:

The natural gradient can thus be calculated from the ordinary gradient:

The improved NGBoost meta-model is built on equation (10): taking θ⁰ as the initial parameter vector, at the m-th iteration the natural gradient of y_i and its corresponding parameter vector is calculated from the ordinary gradient through equation (10), and a new set of base learners is generated along that natural gradient direction to update the parameter vector. The final prediction result can be expressed as equation (11):

Where: α^m is the scaling factor; β is the uniform learning rate; B^m is the unified representation of the base learners. The value of each sample point follows a Gaussian distribution, that is, θ = (μ, σ), so the m-th training stage of θ produces two corresponding base learners, which are denoted collectively by B^m.
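Equations (5) to (11) are not reproduced in this text, so the following is not the patent's derivation. It is a sketch of the standard information-geometry identity the passage appeals to: for the logarithmic score, the natural gradient equals the inverse Fisher information matrix applied to the ordinary gradient. For a Gaussian with θ = (μ, σ), the Fisher information is diagonal with entries 1/σ² and 2/σ². The update in `natural_gradient_step` moves the parameters directly instead of fitting base learners, a simplification made here for brevity; all function names are illustrative.

```python
def ordinary_gradient(mu, sigma, y):
    """Gradient of S = -log N(y; mu, sigma) with respect to (mu, sigma)."""
    r = y - mu
    return (-r / sigma**2, 1.0 / sigma - r * r / sigma**3)


def fisher_information_diagonal(sigma):
    """Diagonal of the Gaussian Fisher information in (mu, sigma) coordinates."""
    return (1.0 / sigma**2, 2.0 / sigma**2)


def natural_gradient(mu, sigma, y):
    """Natural gradient = inverse Fisher information times ordinary gradient."""
    g_mu, g_sigma = ordinary_gradient(mu, sigma, y)
    f_mu, f_sigma = fisher_information_diagonal(sigma)
    return (g_mu / f_mu, g_sigma / f_sigma)


def natural_gradient_step(mu, sigma, y, learning_rate=0.1):
    """One descent step along the natural gradient (stand-in for fitted base learners)."""
    n_mu, n_sigma = natural_gradient(mu, sigma, y)
    new_mu = mu - learning_rate * n_mu
    new_sigma = max(sigma - learning_rate * n_sigma, 1e-6)  # keep sigma positive
    return new_mu, new_sigma
```

Because the Fisher matrix is diagonal here, the conversion needs only two divisions per sample, which illustrates why computing the natural gradient from the ordinary gradient is attractive in engineering practice.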

Fusing models with the Blending architecture to further strengthen the learning effect includes:

1) Splitting the original data set

The original training set is divided proportionally into a sub-training set DT and a test set DA, and the original prediction data set is defined as DP;

2) Model fusion

Given a confidence level, construct V NGBoost meta-models MO_1, MO_2, …, MO_V and use them to learn from DT. After training, output the prediction results DA_P and DP_P of DA and DP on the meta-models, where DA_P and DP_P are the initial statistical parameter vectors of the predicted values corresponding to DA and DP;

The predicted mean determined by DA_P and the actual results DA_OUT corresponding to the original DA data are combined into a new data set, and a new meta-model MO_DA is built and trained on it to obtain the prediction output MO_DA_P. MO_DA_P is the corrected statistical parameter vector of the prediction; compared with DA_P, it has higher accuracy and smaller sharpness, which reflects the advantage of model fusion.

MO_DA_P and DP_P are combined into a new data set, and a new meta-model MO_P is built and trained on it to output the final statistical parameter vector of the prediction. From this vector, the upper and lower limits of the predicted value at the given confidence level can be calculated, and connecting these points yields the upper and lower limit curves of the predicted value.
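The data flow of the Blending stages can be sketched as follows. Several substitutions are made so the example runs with the standard library alone, and none of them is the patent's implementation: a one-feature linear Gaussian model stands in for each NGBoost meta-model, one such model is built per feature column, the second-level model is an inverse-error weighted average fitted on DA, and the last stage is read as applying that second-level model to DP_P. All function names are hypothetical.

```python
import random
import statistics


def fit_linear_gaussian(xs, ys):
    """Stand-in meta-model: y ~ N(a + b*x, sigma^2), fitted by least squares."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    a = my - b * mx
    sigma = statistics.pstdev([y - (a + b * x) for x, y in zip(xs, ys)])
    return lambda x: (a + b * x, sigma)  # statistical parameter vector (mu, sigma)


def blend(train_rows, train_y, predict_rows, holdout_ratio=0.3, seed=0):
    """Return a final (mu, sigma) pair for every row of the prediction set DP."""
    # 1) Split the original training set into DT (fit) and DA (holdout).
    idx = list(range(len(train_rows)))
    random.Random(seed).shuffle(idx)
    n_da = max(1, int(len(idx) * holdout_ratio))
    da_idx, dt_idx = idx[:n_da], idx[n_da:]
    dt_y = [train_y[i] for i in dt_idx]
    da_out = [train_y[i] for i in da_idx]

    # 2) Train the first-level models on DT only, then predict DA and DP.
    n_features = len(train_rows[0])
    models = [
        fit_linear_gaussian([train_rows[i][j] for i in dt_idx], dt_y)
        for j in range(n_features)
    ]
    da_p = [[m(train_rows[i][j])[0] for j, m in enumerate(models)] for i in da_idx]
    dp_p = [[m(row[j])[0] for j, m in enumerate(models)] for row in predict_rows]

    # 3) Second level: learn weights from the predicted means on DA versus DA_OUT.
    mse = [
        statistics.fmean([(p[j] - y) ** 2 for p, y in zip(da_p, da_out)])
        for j in range(n_features)
    ]
    inv = [1.0 / (e + 1e-12) for e in mse]
    weights = [w / sum(inv) for w in inv]

    def combine(means):
        return sum(w * m for w, m in zip(weights, means))

    # Holdout residual spread becomes the corrected sigma.
    sigma = statistics.fmean([(y - combine(p)) ** 2 for p, y in zip(da_p, da_out)]) ** 0.5

    # 4) Apply the second-level model to the first-level predictions on DP.
    return [(combine(p), sigma) for p in dp_p]
```

Because the second level is fitted only on DA, which the first-level models never saw, the weights reflect out-of-sample error rather than training fit.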

Step S2, constructing a power digital twin system that uses the ubiquitous Internet of Things and power data streams to assist decision-making for power grid operation, management and regulation through real-time situation awareness and real-time virtual deduction, includes: training and updating the prediction model in real time based on the data already available at every level on the dispatching side, combined with quasi-real-time dispatching logs and maintenance orders as well as line conditions, unit power information and the like from the EMS of the dispatching master station.

Step S3, establishing a robust scheduling model based on the new energy interval prediction results that considers the automatic generation control (AGC) response of the units to cope with fluctuations in new energy output and ensure the power balance of the grid, includes:

The prediction area coverage rate I_F and the average prediction interval width ratio I_P are proposed as basic indicators, and the comprehensive score I_C is established as the final indicator. These indicators are calculated as follows:

1) Prediction area coverage rate

I_F is introduced to measure the accuracy of the probabilistic prediction results and thereby quantify the reliability of the model. The indicator is based on the number of actual values that fall within the confidence interval at a given confidence level; the larger the value, the more accurate the model.

Where: N_t is the number of prediction samples; Ω_i is a Boolean flag marking whether the i-th sample falls within the confidence interval, counted as 1 if it does and 0 if it does not;

2) Average prediction interval width ratio

I_P is introduced to measure the sharpness of the probabilistic prediction results, to avoid the situation in which pursuing I_F alone makes the confidence interval so wide that the prediction loses its reference value. The larger the value of I_P, the wider the confidence interval, the greater the sharpness of the predictive distribution, and the worse the prediction;

Where: I_P0 is the confidence interval width under the initial parameters; U_i and L_i are the upper and lower limits of the confidence interval corresponding to the i-th prediction sample;

3) Comprehensive score

I_C is introduced to evaluate I_F and I_P jointly. The higher the value of I_C, the better the overall performance of the model, which reduces sharpness while maintaining accuracy.
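The indicator formulas are not reproduced in this text. The sketch below computes I_F as the mean of the flags Ω_i, which follows from the definitions given, and computes I_P as the mean interval width divided by I_P0, which is an assumed normalization. The formula for I_C cannot be recovered from the text and is therefore left out. `gaussian_interval` shows how interval limits follow from a Gaussian parameter vector (μ, σ) at a given confidence level; the function names are illustrative.

```python
from statistics import NormalDist


def gaussian_interval(mu, sigma, confidence=0.9):
    """Central interval (L_i, U_i) of a Gaussian prediction at the given confidence level."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return mu - z * sigma, mu + z * sigma


def coverage_rate(actuals, lowers, uppers):
    """I_F: share of actual values that fall inside their prediction interval."""
    flags = [1 if lo <= y <= hi else 0 for y, lo, hi in zip(actuals, lowers, uppers)]
    return sum(flags) / len(flags)


def average_width_ratio(lowers, uppers, reference_width):
    """I_P: mean interval width relative to the reference width I_P0 (assumed form)."""
    widths = [hi - lo for lo, hi in zip(lowers, uppers)]
    return sum(widths) / (len(widths) * reference_width)
```

Reporting the two values together guards against the failure mode described above, in which very wide intervals achieve high coverage but carry little information.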

Constructing the power digital twin system includes:

Collecting offline information: parameters of the primary grid equipment involved, including information on plants and stations, generator units, main transformers, buses, lines, and security and stability control devices; collecting quasi-real-time information: dispatching instructions, maintenance orders, operation tickets and similar information; collecting real-time information: real-time meteorological data, fault identification by security and stability control devices, and the power flow of the relevant lines, unit power and relevant switch status for the units to be predicted.
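One way to organize the three tiers of information is a snapshot record per refresh cycle. The sketch below is a hypothetical data layout, not a structure defined in the patent: every class name, field name and the `feature_vector` helper is an assumption chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class OfflineInfo:
    # Static parameters of primary equipment, keyed by equipment identifier.
    equipment: dict = field(default_factory=dict)


@dataclass
class QuasiRealTimeInfo:
    dispatch_instructions: list = field(default_factory=list)
    maintenance_orders: list = field(default_factory=list)
    operation_tickets: list = field(default_factory=list)


@dataclass
class RealTimeInfo:
    timestamp: str
    weather: dict                 # e.g. {"irradiance": 640.0}
    line_flow_mw: dict = field(default_factory=dict)
    unit_power_mw: dict = field(default_factory=dict)
    switch_closed: dict = field(default_factory=dict)
    fault_flags: dict = field(default_factory=dict)


@dataclass
class TwinSnapshot:
    offline: OfflineInfo
    quasi_real_time: QuasiRealTimeInfo
    real_time: RealTimeInfo

    def feature_vector(self, unit_id, weather_keys):
        """Assemble model inputs for one unit: selected weather values, then unit power."""
        values = [self.real_time.weather[k] for k in weather_keys]
        values.append(self.real_time.unit_power_mw[unit_id])
        return values
```

Keeping the tiers separate makes their different refresh rates explicit: the offline tier changes rarely, while the real-time tier is replaced at every cycle that feeds the prediction model.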

Beneficial effects of the present invention: the new energy power system dispatching method based on a digital twin comprises step S1, constructing a short-term direct probability prediction model suitable for photovoltaic power; step S2, constructing a power digital twin system that uses the ubiquitous Internet of Things and power data streams to assist decision-making for power grid operation, management and regulation through real-time situation awareness and real-time virtual deduction; and step S3, establishing, based on the new energy interval prediction results, a robust scheduling model that considers the automatic generation control (AGC) response of the units to cope with fluctuations in new energy output and ensure the power balance of the grid. The method has a higher prediction area coverage rate and a smaller average prediction interval width ratio. Based on the real-time data already available at every level on the dispatching side, it allows the prediction model to be updated in real time, improving the model's applicability and generalization to local grid conditions, and it optimizes and improves the existing dispatching system on the basis of real-time interval prediction results for new energy.

Brief Description of the Drawings

FIG. 1 is a flow chart of the new energy power system dispatching method based on a digital twin according to the present invention.

FIG. 2 is a schematic diagram of the Blending fusion steps.

Detailed Description

The present invention is further described below with reference to the accompanying drawings.

The new energy power system dispatching method based on a digital twin, as shown in FIG. 1, includes the following steps:

To address the shortcomings of ensemble learning algorithms in probabilistic prediction problems, a Stanford University team led by Andrew Y. Ng proposed the Natural Gradient Boosting (NGBoost) model. Although it extends Boosting-type algorithms to this setting, it still has the following shortcomings when applied to the practical engineering problem of short-term direct probability prediction of photovoltaic power: 1) the model lacks a data preprocessing stage, so NGBoost generalizes weakly across different photovoltaic plants and has limited robustness; 2) the computation of the natural gradient is itself complex, making practical engineering application difficult; 3) a single NGBoost meta-model can hardly guarantee both the accuracy and the sharpness of the probabilistic prediction.

For better results, step S1, constructing a short-term direct probability prediction model suitable for photovoltaic power, includes: preprocessing the initial data set, removing outliers based on Tukey's test, and extracting strongly correlated meteorological variables using grey relational analysis; improving the NGBoost meta-model by means of a generalized natural gradient calculation method; and fusing models with a Blending architecture to further strengthen the learning effect;

Considering practical engineering conditions such as measurement error, the initial data set contains many outliers, which would bias the prediction model as a whole. This project therefore first uses the box plot proposed by the statistician John Tukey to remove outliers from the initial data.
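The box-plot rule can be sketched in a few lines. The text does not state the fence multiplier, so the conventional value k = 1.5 is assumed here, along with the "inclusive" quartile method of Python's `statistics` module; the function names are illustrative.

```python
import statistics


def tukey_fences(values, k=1.5):
    """Lower and upper fences: Q1 - k*IQR and Q3 + k*IQR (k = 1.5 is an assumption)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_outliers(samples, key=lambda s: s, k=1.5):
    """Drop the samples whose keyed value lies outside the Tukey fences."""
    lo, hi = tukey_fences([key(s) for s in samples], k)
    return [s for s in samples if lo <= key(s) <= hi]
```

For records with several fields, `key` selects the field to screen, for example `remove_outliers(rows, key=lambda r: r["power_mw"])`, where `power_mw` is a hypothetical field name.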

Photovoltaic power is correlated with meteorological variables such as humidity and cloud cover. However, owing to factors such as the geographical location of a photovoltaic plant and its local microclimate, the correlations between the various meteorological variables and photovoltaic power differ from plant to plant.

For better results, the preprocessing of the initial data set, removing outliers based on Tukey's test and extracting strongly correlated meteorological variables using grey relational analysis, follows steps 1) to 4) and equation (1) as set out in the Summary of the Invention.

The key to NGBoost lies in solving for the natural gradient. However, the underlying concepts come from information geometry, which is extremely complex, and this hinders its adoption in practical engineering.

For better results, the improvement of the NGBoost meta-model by means of the generalized natural gradient calculation method follows the derivation set out in the Summary of the Invention, from the scoring function of equation (4) through equations (5) to (11).

Fusing meta-models strengthens the learning effect without making the overall model excessively redundant, and in recent years it has been widely used for prediction problems, Stacking model fusion in particular. However, Stacking model fusion is overly complex, and during training it suffers from data leakage, in which the training data draw on global statistics, so it is not suitable for probabilistic prediction problems.

For better results, the fusion using the Blending architecture to further strengthen the model learning effect follows the two stages set out in the Summary of the Invention: 1) splitting the original data set and 2) model fusion.

For better results, step S2, constructing a power digital twin system that uses the ubiquitous Internet of Things and power data streams to assist decision-making for power grid operation, management and regulation through real-time situation awareness and real-time virtual deduction, includes: training and updating the prediction model in real time based on the data already available at every level on the dispatching side, combined with quasi-real-time dispatching logs and maintenance orders as well as line conditions, unit power information and the like from the EMS of the dispatching master station.

For better results, step S3, establishing a robust scheduling model based on the new energy interval prediction results that considers the automatic generation control (AGC) response of the units to cope with fluctuations in new energy output and ensure the power balance of the grid, uses the three indicators set out in the Summary of the Invention: 1) the prediction area coverage rate, 2) the average prediction interval width ratio, and 3) the comprehensive score.

For better results, constructing the power digital twin system includes collecting the offline, quasi-real-time and real-time information set out in the Summary of the Invention.

In summary, the new energy power system dispatching method based on a digital twin of the present invention comprises step S1, constructing a short-term direct probability prediction model suitable for photovoltaic power; step S2, constructing a power digital twin system that uses the ubiquitous Internet of Things and power data streams to assist decision-making for power grid operation, management and regulation through real-time situation awareness and real-time virtual deduction; and step S3, establishing, based on the new energy interval prediction results, a robust scheduling model that considers the automatic generation control (AGC) response of the units to cope with fluctuations in new energy output and ensure the power balance of the grid. The method has a higher prediction area coverage rate and a smaller average prediction interval width ratio. Based on the real-time data already available at every level on the dispatching side, it allows the prediction model to be updated in real time, improving the model's applicability and generalization to local grid conditions, and it optimizes and improves the existing dispatching system on the basis of real-time interval prediction results for new energy.

Claims

1. A new energy power system dispatching method based on digital twins, characterized in that it includes the following steps:

Step S1: construct a short-term direct probability prediction model suitable for photovoltaic power;

Step S2: construct a power digital twin system that uses the ubiquitous Internet of Things and power data streams to assist decision-making for power grid operation, management and regulation through real-time situation awareness and real-time virtual deduction;

Step S3: based on the new energy interval prediction results, establish a robust scheduling model that considers the automatic generation control (AGC) response of the units to cope with fluctuations in new energy output and ensure the power balance of the grid.

2. The new energy power system dispatching method based on digital twins according to claim 1, characterized in that step S1, constructing a short-term direct probability prediction model suitable for photovoltaic power, comprises: preprocessing the initial data set, removing outliers based on Tukey's test, and extracting strongly correlated meteorological variables using grey relational analysis; improving the NGBoost meta-model by means of a generalized natural gradient calculation method; and fusing models with a Blending architecture to further strengthen the learning effect;

The model data set D contains n_D samples and m features, that is, D = {(x_i, y_i)} (x_i ∈ R^m, y_i ∈ R), where x_i is the feature vector of the i-th sample, y_i is the label value (true value) of the i-th sample, and i ∈ (1, n_D).

3. The new energy power system dispatching method based on digital twins according to claim 2, characterized in that the preprocessing of the initial data set, removing outliers based on Tukey's test and extracting strongly correlated meteorological variables using grey relational analysis, comprises:

1) Normalize the time series of each variable. Take the k-th of the n meteorological variable sequences as the comparison sequence S^k(t) and the photovoltaic power sequence as the reference sequence S⁰(t), and compute the absolute value of their difference as the sequence Δ^k(t), where k ∈ (1, n);

Δ^k(t) = |S^k(t) - S⁰(t)| (1)

2) Calculate the correlation coefficient η^k(t):

Where: Min(·) and Max(·) denote the minimum and maximum values of a sequence; ρ is the resolution coefficient;

3) Solve for the correlation degree γ^k:

Where: T_n is the sequence length;

4) Set a threshold and select the variables whose correlation degree is greater than the threshold to form a new model data set.

4. The new energy power system dispatching method based on digital twins according to claim 2, characterized in that improving the NGBoost meta-model by means of the generalized natural gradient calculation method comprises:

The process of solving for the natural gradient is improved by using the Fisher information to establish a link between the ordinary gradient and the natural gradient, as follows:

A scoring function S(θ, y_i) is established based on the Shannon information of y_i:

S(θ, y_i) = -log P_θ(y_i) (4)

Where: P_θ(y_i) is the probability value of y_i under the predicted probability distribution; θ is the parameter vector of the predicted probability distribution;

Perform a Taylor expansion and discard the remainder terms of third order and above:

Where: d′ is the infinitesimal step vector by which θ moves along the natural gradient;

Transform the Euclidean space into a statistical manifold and treat equation (5) in Riemannian space:

The calculation of the first-order term can be simplified as:

The remaining part is expressed as:

Where:

The natural gradient can thus be calculated from the ordinary gradient:

The improved NGBoost meta-model is built on equation (10): taking θ⁰ as the initial parameter vector, at the m-th iteration the natural gradient of y_i and its corresponding parameter vector is calculated from the ordinary gradient through equation (10).

It is uniformly expressed as

5. The method for dispatching a new energy power system based on digital twins according to claim 2, characterized in that the fusion using the Blending architecture to further enhance the model learning effect includes:

1) Original dataset segmentation

The original training set is divided into a sub-training set DT and a test set DA in proportion, and the original prediction data set is defined as DP;

2) Model Fusion

Given a confidence level, construct V NGBoost meta-models MO_1, MO_2, …, MO_V, use these meta-models to learn DT, and after training, output the prediction results DA_P and DP_P of DA and DP on the meta-models, where DA_P and DP_P are the initial statistical parameter vectors of the corresponding prediction values of DA and DP;

The predicted mean determined by DA_P and the actual result DA_OUT corresponding to the original DA data are combined into a new data set, and a new meta-model MO_DA is established and trained on it to obtain the predicted output MO_DA_P, where MO_DA_P is the corrected prediction statistical parameter vector. Compared with DA_P, MO_DA_P has higher accuracy and smaller sharpness, reflecting the advantages of model fusion.

MO_DA_P and DP_P are combined into a new data set, and a new meta-model MO_P is established and trained on it to output the final prediction statistical parameter vector. From this vector, the upper and lower limits of the prediction value under the given confidence level can be calculated, and these points are connected to form the upper and lower limit curves of the prediction value.
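The two-level flow above can be sketched in simplified form. This sketch makes several assumptions that are not in the claim: a one-feature Gaussian linear model (the hypothetical `GaussianLinearModel`) stands in for each NGBoost meta-model, the V first-level models differ only in which slice of DT they leave out, their outputs are averaged, and the second-level model fitted on DA is applied directly to DP_P instead of training a separate MO_P.

```python
from statistics import NormalDist, fmean


class GaussianLinearModel:
    """Stand-in meta-model (not NGBoost): y ~ N(a * x + b, sigma^2)."""

    def fit(self, xs, ys):
        mx, my = fmean(xs), fmean(ys)
        sxx = sum((x - mx) ** 2 for x in xs)
        self.a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
        self.b = my - self.a * mx
        residuals = [y - (self.a * x + self.b) for x, y in zip(xs, ys)]
        self.sigma = (sum(r * r for r in residuals) / max(len(xs) - 2, 1)) ** 0.5
        return self

    def predict_mean(self, xs):
        return [self.a * x + self.b for x in xs]


def blend_predict(train_x, train_y, dp_x, split=0.7, n_models=3, confidence=0.9):
    """Return (mean, lower, upper) for the prediction set DP."""
    # 1) Split the original training set into DT and DA in proportion.
    cut = int(len(train_x) * split)
    dt_x, dt_y = train_x[:cut], train_y[:cut]
    da_x, da_out = train_x[cut:], train_y[cut:]

    # 2) First level: V meta-models learn DT, each leaving out a different slice.
    models = []
    for v in range(n_models):
        if n_models > 1:
            keep = [i for i in range(cut) if i % n_models != v]
        else:
            keep = list(range(cut))
        models.append(
            GaussianLinearModel().fit([dt_x[i] for i in keep], [dt_y[i] for i in keep])
        )

    def first_level_mean(xs):
        per_model = [m.predict_mean(xs) for m in models]
        return [fmean(column) for column in zip(*per_model)]

    da_p = first_level_mean(da_x)  # predicted means on DA
    dp_p = first_level_mean(dp_x)  # predicted means on DP

    # Second level: learn the correction from DA_P to the actual DA_OUT.
    mo_da = GaussianLinearModel().fit(da_p, da_out)
    mean = mo_da.predict_mean(dp_p)

    # Interval limits at the given confidence level under the normal assumption.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    lower = [m - z * mo_da.sigma for m in mean]
    upper = [m + z * mo_da.sigma for m in mean]
    return mean, lower, upper


if __name__ == "__main__":
    xs = [float(i) for i in range(20)]
    ys = [2.0 * x + 1.0 + 0.1 * (-1) ** i for i, x in enumerate(xs)]
    print(blend_predict(xs, ys, [20.0, 21.0]))
```

Connecting the `lower` and `upper` values across the prediction horizon gives the limit curves described above.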

6. The new energy power system dispatching method based on digital twins according to claim 1, characterized in that step S2, constructing a power digital twin system and using the ubiquitous Internet of Things and power data flow to assist power grid operation and management decision-making through real-time situation awareness and real-time virtual deduction, includes: training and updating the prediction model in real time based on the existing data at each level on the dispatching side, combined with quasi-real-time dispatching logs and maintenance orders, and with the line conditions and unit power information in the EMS system of the dispatching master station.

7. The method for dispatching a new energy power system based on digital twins according to claim 1, characterized in that step S3, establishing a robust dispatching model based on the new energy interval prediction results, considering the automatic generation control (AGC) response of the units to cope with fluctuations in new energy output, and ensuring the power balance of the power grid, includes:

The prediction area coverage rate I_F and the prediction interval average width ratio I_P are proposed as basic indicators, and the comprehensive score I_C is established as the final indicator. The specific calculation methods of the above indicators are as follows:

1) Prediction area coverage rate

I_F is introduced to measure the accuracy of the probability prediction results, which quantifies the reliability of the model. This indicator is based on the number of actual values falling within the confidence interval at a given confidence level. The larger the value, the more accurate the model.

Where: N_t is the number of predicted samples; Ω_i is a Boolean flag indicating whether the i-th sample falls within the confidence interval, counted as 1 if it does and 0 if it does not;

2) Prediction interval average width ratio

I_P is introduced to measure the sharpness of the probability prediction results. It guards against confidence intervals that become too wide through the simple pursuit of I_F, in which case the prediction results lose their reference value. The larger the I_P value, the wider the confidence interval, the larger the sharpness of the prediction distribution, and the worse the prediction effect.

Where: I_P0 is the confidence interval width under the initial parameters; U_i and L_i are the upper and lower limits of the confidence interval corresponding to the i-th prediction sample;

3) Comprehensive score

I_C is introduced to comprehensively evaluate I_F and I_P. The higher the I_C value, the better the overall performance of the model: a high score reflects reduced sharpness with accuracy maintained.
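The two basic indicators can be sketched as follows. The formulas are not printed in this text, so the sketch assumes I_F is the mean of the flags Ω_i over the N_t samples, and I_P is the mean interval width U_i - L_i divided by the baseline width I_P0. The form of I_C is not recoverable from the text and is not implemented. Function names are hypothetical.

```python
def coverage_rate(actual, lower, upper):
    """I_F (assumed form): share of actual values that fall inside [L_i, U_i]."""
    flags = [1 if lo <= y <= up else 0 for y, lo, up in zip(actual, lower, upper)]
    return sum(flags) / len(flags)


def average_width_ratio(lower, upper, base_width):
    """I_P (assumed form): mean interval width divided by the baseline width I_P0."""
    widths = [up - lo for lo, up in zip(lower, upper)]
    return sum(widths) / (len(widths) * base_width)


if __name__ == "__main__":
    actual = [1.0, 2.0, 3.0, 4.0]
    lower = [0.0, 2.5, 2.0, 3.0]
    upper = [2.0, 3.5, 4.0, 3.5]
    print(coverage_rate(actual, lower, upper))
    print(average_width_ratio(lower, upper, base_width=2.75))
```

Reporting both values together makes the trade-off visible: widening every interval raises I_F but also raises I_P.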

8. The method for dispatching a new energy power system based on digital twins according to claim 6, wherein the construction of a power digital twin system comprises:

Collect offline information: parameters of primary equipment in the power grid, including plant, generator set, main transformer, bus, line, safety and control device information, etc.; collect quasi-real-time information: dispatch instructions, maintenance orders, operation tickets and other information; collect real-time information: real-time meteorological data, safety and control device fault diagnosis, and related line flow, unit power, and related switch status information involved in the predicted unit.