CN108599152A

CN108599152A - Key state variable selection method and device for power system transient stability assessment

Info

Publication number: CN108599152A
Application number: CN201810439638.3A
Authority: CN
Inventors: 陈颖; 凡航; 黄少伟; 沈沉; 梅生伟; 周二专; 冯东豪; 史东宇; 严剑锋; 张磊
Original assignee: Tsinghua University; China Electric Power Research Institute Co Ltd CEPRI; Electric Power Research Institute of State Grid Shandong Electric Power Co Ltd; Beijing Kedong Electric Power Control System Co Ltd
Current assignee: Tsinghua University; China Electric Power Research Institute Co Ltd CEPRI; Electric Power Research Institute of State Grid Shandong Electric Power Co Ltd; Beijing Kedong Electric Power Control System Co Ltd
Priority date: 2018-05-09
Filing date: 2018-05-09
Publication date: 2018-09-28

Abstract

The present invention provides a key state variable selection method and device for power system transient stability assessment. The method includes: S1, obtaining a plurality of state variables from power system dynamic simulation data and preprocessing the plurality of state variables using an FFT algorithm; S2, performing feature selection on the preprocessed state variables to obtain a plurality of key state variables; S3, performing dimensionality reduction on each key state variable. Through the selection and dimensionality reduction of key state variables, the invention can significantly shorten the training time and classification time of the transient stability classifier without reducing its classification accuracy, which makes it better suited to online application.

Description

Key state variable selection method and device for power system transient stability assessment

Technical field

The invention relates to the field of power system transient stability assessment, and more specifically to a method and device for selecting key state variables for power system transient stability assessment.

Background

With the integration of various new energy sources and the adoption of UHV transmission lines, power systems have become more complex. During operation, a power system is often subjected to large disturbances of many kinds, in particular grounding and short-circuit faults on operating lines, which may expose the system to transient instability. Loss of transient stability is the main cause of catastrophic grid accidents. Fast and accurate transient stability assessment methods are therefore of great significance for ensuring safe and stable grid operation.

Traditional assessment of power system transient stability by numerical simulation suffers from complex models and slow speed. The rapid rise of artificial intelligence, the accumulation of grid operation data, and the adoption of big-data methods have opened up new approaches to predicting power system transient stability with machine learning. Machine learning turns transient stability assessment into a pattern recognition problem: there is a mapping between the transient stability of the system and certain feature quantities that describe its operating state. In essence this is a process of "offline simulation training, online matching application". Samples that adequately reflect the mapping are obtained through offline simulation analysis, and the unknown mapping function is learned from the sample set. Once this function is obtained, the transient stability of the system in its online operating state can be classified and assessed. Machine-learning-based transient stability assessment can thus speed up power system simulation.

At present, machine-learning-based transient stability assessment methods perform feature selection directly on the dynamic simulation data of all operating state variables of the grid. Although this makes full use of the operating state information of the power system, it may cause the "curse of dimensionality": analysis is difficult, training takes a long time, and information from key operating state variables is easily overlooked, which reduces the accuracy of the transient stability assessment.

Summary of the invention

The present invention provides a key state variable selection method and device for power system transient stability assessment that overcomes the above problems or at least partly solves them.

According to one aspect of the present invention, a key state variable selection method for power system transient stability assessment is provided, including:

S1, obtaining a plurality of state variables from power system dynamic simulation data, and preprocessing the plurality of state variables using an FFT algorithm;

S2, performing feature selection on the preprocessed state variables to obtain a plurality of key state variables;

S3, performing dimensionality reduction on each of the key state variables.

Wherein, the power system dynamic simulation data further includes: power system dynamic waveform data after fault clearing.

Wherein, step S1 further includes:

Using the FFT algorithm to extract the real and imaginary parts of the first n harmonics of the time series of each state variable in the power system dynamic simulation data, so that the time-series attribute of each state variable is converted into 2n real-number attributes, where n is a natural number greater than 1.

Wherein, step S2 further includes:

Based on the 2n real-number attributes of each state variable, using the Relief algorithm to screen out the state variables with the highest weights, thereby obtaining a plurality of key state variables.

Wherein, the step of using the Relief algorithm to screen out the state variables with the highest weights based on the 2n real-number attributes of each state variable, thereby obtaining a plurality of key state variables, further includes:

S21, randomly selecting one dynamic simulation data sample R from a plurality of dynamic simulation data samples, then finding the nearest-neighbor sample H in the set of samples of the same class as R, and the nearest-neighbor sample M in the set of samples of a different class from R;

S22, updating the state variable weights according to the following rules: if the distance between R and H on a given state variable is smaller than the distance between R and M on that state variable, the weight of that state variable is increased; if the distance between R and H on a given state variable is greater than the distance between R and M on that state variable, the weight of that state variable is decreased;

S23, repeating steps S21 and S22 p times to obtain the weight values of the state variables, sorting the state variables by weight in descending order, and taking the q state variables with the largest weights as the key state variables;

Wherein, the values of p and q are determined according to the requirements of the power system transient stability assessment.
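Steps S21 to S23 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the text does not say how the distance "on a state variable" is computed from its 2n attributes, so the sketch assumes a Euclidean distance over those attributes, and the function name `relief_state_variables` and the data layout are hypothetical.

```python
import math
import random

def relief_state_variables(samples, labels, p, q, seed=0):
    """Rank state variables with a Relief-style update and return the top q.

    samples[i][j] is the list of 2n real attributes of state variable j in
    simulation i; labels[i] is the class of simulation i (e.g. stable or not).
    Assumes every class contains at least two simulations.
    """
    rng = random.Random(seed)
    num_vars = len(samples[0])

    def var_dist(a, b, j):
        # Assumption: distance on one state variable is Euclidean over its attributes.
        return math.dist(a[j], b[j])

    def total_dist(a, b):
        return math.sqrt(sum(var_dist(a, b, j) ** 2 for j in range(num_vars)))

    weights = [0.0] * num_vars
    indices = range(len(samples))
    for _ in range(p):
        r = rng.randrange(len(samples))                                   # S21: pick R
        same = [i for i in indices if i != r and labels[i] == labels[r]]
        other = [i for i in indices if labels[i] != labels[r]]
        h = min(same, key=lambda i: total_dist(samples[r], samples[i]))   # nearest same-class H
        m = min(other, key=lambda i: total_dist(samples[r], samples[i]))  # nearest other-class M
        for j in range(num_vars):
            # S22: the weight rises when R is closer to H than to M on variable j,
            # and falls when R is closer to M than to H.
            weights[j] += (var_dist(samples[r], samples[m], j)
                           - var_dist(samples[r], samples[h], j)) / p
    # S23: sort by weight, largest first, and keep the top q state variables.
    ranked = sorted(range(num_vars), key=lambda j: weights[j], reverse=True)
    return ranked[:q], weights
```

In this sketch a state variable that separates the two classes accumulates a positive weight, while one that varies mostly within a class drifts negative, so the top-q list favours the former.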

Wherein, step S3 further includes:

Using principal component analysis to reduce the dimensionality of each of the key state variables.

Wherein, step S3 further includes:

Using a manifold learning method to reduce the dimensionality of each of the key state variables.

According to another aspect of the present invention, a key state variable selection device for power system transient stability assessment is provided, including:

A preprocessing module, configured to obtain a plurality of state variables from power system dynamic simulation data and to preprocess the plurality of state variables using an FFT algorithm;

A feature selection module, configured to perform feature selection on the preprocessed state variables to obtain a plurality of key state variables;

A dimensionality reduction module, configured to perform dimensionality reduction on each of the key state variables.

Wherein, the preprocessing module is specifically configured to:

Wherein, the feature selection module is specifically configured to:

Through the selection and dimensionality reduction of key state variables, the key state variable selection method and device for power system transient stability assessment proposed by the present invention can significantly shorten the training time and classification time of the transient stability classifier without reducing its classification accuracy, and are therefore better suited to online application.

Brief description of the drawings

Fig. 1 is a schematic flow chart of a key state variable selection method for power system transient stability assessment provided by an embodiment of the present invention;

Fig. 2 is a comparison of the time consumed by training on all dynamic data, training with the improved state variable extraction method, and training after dimensionality reduction, provided by an embodiment of the present invention;

Fig. 3 is a schematic structural diagram of a key state variable selection device for power system transient stability assessment provided by another embodiment of the present invention.

Detailed description

To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.

Fig. 1 is a schematic flow chart of a key state variable selection method for power system transient stability assessment provided by an embodiment of the present invention, including:

Specifically, machine-learning-based power system transient stability assessment can be divided into four parts: first, the overall framework of the assessment; second, the choice of input data and the selection of features; third, the choice of classification algorithm; and finally, the analysis of the classification results and the resulting improvements. The framework mainly used for machine-learning-based transient stability assessment is still the "offline training, online simulation" approach: patterns mined from offline data are matched against online data so that the transient stability of the power system can be judged quickly. The present invention improves the feature selection and input part of the classifier.

S1. When a machine-learning-based method is used to assess the transient stability of a power system, it works on dynamic simulation data. Dynamic simulation data contains many power system state variables and covers a long simulated time span, so it easily leads to the "curse of dimensionality". For example, if 2000 state variables are selected, the simulated duration is 10 s, and the step size is 0.01 s, then each simulation produces a 2000*1000 matrix. If 10,000 samples are used to train the transient stability assessment classifier, the sample set easily becomes too large to analyze, and the long training time hinders both repeated parameter tuning and online application of the classifier. The power system dynamic simulation data therefore needs to be processed. The present invention first screens key state variables out of the many state variables, which compresses the dynamic simulation data in terms of the number of state variables, and then applies a dimensionality reduction method to lower the dimension of the data further, thereby speeding up classifier training.
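The size of the example data set can be checked with a few lines of arithmetic. The figure of 8 bytes per value (double precision) is an assumption made for this sketch; the text does not state a storage format, and `dataset_size` is a hypothetical helper.

```python
def dataset_size(num_vars, duration_s, step_s, num_samples, bytes_per_value=8):
    """Return (time steps per simulation, total values, total bytes)."""
    steps = round(duration_s / step_s)          # 10 s / 0.01 s -> 1000 steps
    values = num_vars * steps * num_samples     # entries across all samples
    return steps, values, values * bytes_per_value

steps, values, nbytes = dataset_size(2000, 10, 0.01, 10_000)
print(steps, values, nbytes / 1e9)  # 1000 20000000000 160.0
```

Under that assumption the 10,000 samples hold 2 x 10^10 values, roughly 160 GB, which illustrates why the raw data is impractical to train on directly.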

Screening key state variables out of many state variables requires a feature selection algorithm, and conventional feature selection algorithms expect the feature attributes of a sample to be real numbers. The dynamic simulation result of a power system, however, is represented by the time series of m state variables, and the time series of each state variable is an attribute of the sample. To make conventional feature selection algorithms applicable, the FFT algorithm is first used to preprocess the state variables in the acquired power system dynamic simulation data, converting the attribute of each state variable from a time series into real numbers. Conventional feature selection can then be applied to the samples to screen out the key state variables.

S2. Feature selection, also called feature subset selection (FSS) or attribute selection, means choosing a subset of all available features so that the resulting model is better. In practical machine learning applications the number of features is often large. Some features may be irrelevant and some may depend on one another, which has the following consequences: the more features there are, the longer it takes to analyze them and train the model; and the more features there are, the more likely the "curse of dimensionality" becomes, the more complex the model grows, and the worse it generalizes. Feature selection removes irrelevant or redundant features, which reduces the number of features, improves model accuracy, and shortens running time. Selecting the truly relevant features also makes it easier for researchers to understand the process that generated the data. Building on step S1, an existing feature selection algorithm such as Relief can be applied to the preprocessed state variables to identify the key state variables for power system transient stability assessment.

S3. After steps S1 and S2, the key state variables for power system transient stability assessment have been selected. For a real power system with 2000 state variables simulated for 10 s at a step size of 0.01 s, however, even if 50 key state variables are selected out of the 2000, each simulation still produces a 50*1000 matrix. The simulation result is still large, and the "curse of dimensionality" may persist, so further dimensionality reduction is needed. There are two main kinds of commonly used dimensionality reduction methods: linear methods such as PCA, and nonlinear methods such as manifold learning.

After the key state variables have been screened out by the feature selection algorithm and reduced by the chosen dimensionality reduction method, the dynamic simulation results of the power system are reduced onto a low-dimensional plane. Training the transient stability classifier on the reduced data greatly shortens the time needed for model training without lowering the accuracy of the trained model.

Through the selection and dimensionality reduction of key state variables, the key state variable selection method for power system transient stability assessment proposed by the present invention can significantly shorten the training time and classification time of the transient stability classifier without reducing its classification accuracy, and is therefore better suited to online application.

Based on the above embodiment, the power system dynamic simulation data further includes: power system dynamic waveform data after fault clearing.

Specifically, there are usually two types of input data: static data based on the pre-fault power flow snapshot, and dynamic data after the fault has been cleared. Using all of the dynamic data after fault clearing takes more information into account and gives higher accuracy in the stability judgment. The present invention therefore uses the power system dynamic waveform data after fault clearing.

Based on the above embodiment, step S1 further includes:

Specifically, the FFT is a fast algorithm for the discrete Fourier transform, which transforms a signal from the time domain to the frequency domain. The FFT algorithm converts the time series of each state variable in the acquired power system dynamic simulation data into real numbers: it extracts the real and imaginary parts of the first n harmonics of each time series, so that each time series, and hence each state variable, is represented by 2n real numbers. Each sample of power system dynamic simulation data consists of the time series of m state variables, and these time series are the attributes of the sample, so after the conversion the attributes of a sample consist of the 2n real numbers of each state variable.
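The conversion of one time series into 2n real numbers can be sketched as below. To stay dependency-free the sketch evaluates the discrete Fourier transform sum directly for the n requested bins instead of calling an FFT routine; the coefficients are the same ones an FFT would return for those bins. Treating "the first n harmonics" as bins 1 to n (excluding the DC term) is an assumption, as is the function name `first_harmonics`.

```python
import cmath
import math

def first_harmonics(series, n):
    """Return 2n reals: real and imaginary parts of DFT bins 1..n of `series`."""
    length = len(series)
    features = []
    for k in range(1, n + 1):  # assumption: the DC bin (k = 0) is skipped
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / length)
                    for t, x in enumerate(series))
        features.extend([coeff.real, coeff.imag])
    return features

# One sine period sampled at 8 points; the signal energy sits in bin 1.
wave = [math.sin(2 * math.pi * t / 8) for t in range(8)]
features = first_harmonics(wave, n=2)  # 4 real attributes for this state variable
```

Applying the function to each of the m state variables of a sample yields m groups of 2n real attributes, which is the form the feature selection step expects.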

Based on the above embodiment, step S2 further includes:

Specifically, Relief (Relevant Features) is a well-known filter-type feature selection method. Relief is a family of algorithms that includes the original Relief and its later extensions Relief-F and RRelief-F. The original Relief targets two-class problems, Relief-F handles multi-class problems, and RRelief-F targets regression problems in which the target attribute is continuous. The original Relief algorithm, first proposed by Kira, is a feature-weighting algorithm that assigns different weights to features according to the correlation between each feature and the class. A feature whose weight falls below a given threshold is removed. In Relief, the weight of a feature is determined by its ability to discriminate between samples that lie close together.

The algorithm randomly selects a sample R from the training set D, then finds the nearest neighbor H among the samples of the same class as R, called the near hit, and the nearest neighbor M among the samples of a different class from R, called the near miss. It then updates the weight of each feature according to the following rules. If the distance between R and the near hit on a feature is smaller than the distance between R and the near miss on that feature, the feature helps to distinguish nearest neighbors of the same class from those of a different class, and its weight is increased. Conversely, if the distance between R and the near hit on a feature is greater than the distance between R and the near miss, the feature works against that distinction, and its weight is decreased. This process is repeated m times (here m denotes the number of sampling iterations, not the number of state variables), and the average weight of each feature is obtained. The larger the weight of a feature, the stronger its classification ability; the smaller the weight, the weaker its classification ability. The running time of Relief grows linearly with the number of sampling iterations m and with the number of original features, so the algorithm is very efficient.
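The update rule described above is often written in the following form. This is one common formulation from the Relief literature, given here as an aid to reading and not as the patent's own formula; the symbols $W_A$ (weight of feature $A$), $v_A(I)$ (value of feature $A$ in sample $I$), and the range normalization are notation introduced for this sketch.

```latex
W_A \leftarrow W_A - \frac{\operatorname{diff}(A, R, H)}{m} + \frac{\operatorname{diff}(A, R, M)}{m},
\qquad
\operatorname{diff}(A, I_1, I_2) = \frac{\lvert v_A(I_1) - v_A(I_2) \rvert}{\max_A - \min_A}
```

With this form, a feature on which $R$ is closer to the near hit $H$ than to the near miss $M$ receives a net positive increment, which matches the rule stated in the text.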

After step S1 has been executed, the attributes of each sample in the power system dynamic simulation data have been converted by the FFT algorithm from the time series of m state variables into real numbers. The conventional Relief algorithm can therefore be used for feature selection: based on the 2n real-number attributes of each state variable, the state variables with the highest weights are screened out, giving a plurality of key state variables.

Based on the above embodiment, the step of using the Relief algorithm to screen out the state variables with the highest weights based on the 2n real-number attributes of each state variable, thereby obtaining a plurality of key state variables, specifically includes:

The above gives the specific process of using the Relief algorithm to screen key state variables out of the multiple state variables of the power system.

Based on the above embodiment, step S3 further includes:

Using principal component analysis to reduce the dimensionality of each of the key state variables specifically means using principal component analysis to reduce the dimensionality of the time series of each key state variable. Principal component analysis is a mature technique and is not described further here.
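As a reminder of what the reduction involves, the sketch below finds the first principal component of a small data matrix by power iteration and projects the rows onto it. It assumes that each row holds the time series of one key state variable for one simulation sample; the text does not fix this layout, and the function names are hypothetical. A production implementation would normally use a linear algebra library and keep more than one component.

```python
import math
import random

def first_principal_component(rows, iters=100, seed=0):
    """Return (column means, unit vector of the first principal direction)."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in range(d)]  # random starting direction
    for _ in range(iters):
        # w = X^T (X v), i.e. the (unscaled) covariance matrix applied to v.
        scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
        w = [sum(scores[i] * centered[i][j] for i in range(n)) for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # the data has no variance along v; keep the current vector
        v = [x / norm for x in w]
    return mean, v

def project(rows, mean, v):
    """Coordinate of each row along the principal direction."""
    return [sum((r[j] - mean[j]) * v[j] for j in range(len(v))) for r in rows]

rows = [[float(t), 2.0 * t] for t in range(5)]  # points on the line y = 2x
mean, direction = first_principal_component(rows)
coords = project(rows, mean, direction)         # one number per row
```

Each row of length d is replaced by a single coordinate here; keeping k components instead of one would reduce each time series from d values to k.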

Specifically, dimensionality reduction by manifold learning assumes that the samples lie on a low-dimensional manifold, so a mapping onto that manifold can be found and used to reduce the dimension of the samples in such a way that distances between samples on the low-dimensional manifold stay consistent with those in the high-dimensional space. For a power system, the different post-fault dynamic waveforms can be regarded as a manifold spanned by combinations of operating conditions, fault types, and fault severities, so manifold learning can be used to reduce their dimension.

After the improved feature selection algorithm and the manifold learning dimensionality reduction have been applied, the dynamic simulation results of the power system are reduced onto a low-dimensional plane. Training the transient stability classifier on the reduced data greatly shortens the time needed for model training without lowering the accuracy of the trained model.

The beneficial effects of the method provided by the present invention are further illustrated below with a simulation example. Simulation tests were carried out on the 10-machine 39-bus power system. 4000 samples were generated on this system, of which 3200 were used as the training set and 800 as the test set. The feature attributes of each simulation sample are the time series of 158 state variables such as voltage, current, and power. Using the feature selection algorithm, the 20 state variables with the largest weights were selected from the 158 state variables, as shown in Table 1.

Table 1 Selection results of key state variables

Dimensionality reduction was then applied to the time series of the selected key state variables. On this basis, the results of training on all dynamic data, training with the improved state variable extraction method, and training after dimensionality reduction were compared. Fig. 2 compares the time consumed by the three training approaches.

A classifier generally makes some classification errors. Evaluation metrics are an important means of measuring classifier performance and guiding the tuning of model parameters. In power system transient stability assessment, classifying an unstable sample as stable and classifying a stable sample as unstable cause different losses to the system. More than one metric is therefore needed to measure the overall performance of the classifier.

The following metrics are defined:

True positive rate (TPR):

False positive rate (FPR):

False negative rate (FNR):

Accuracy (Acc):
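The formulas for these four metrics did not survive in this copy of the text. The standard confusion-matrix definitions that these names usually denote are given below as a reading aid. They are an assumption, not a quotation from the patent, and the surviving text does not state which class (stable or unstable) is treated as positive. $TP$, $FP$, $TN$, and $FN$ are the counts of true positives, false positives, true negatives, and false negatives.

```latex
\mathrm{TPR} = \frac{TP}{TP + FN}, \qquad
\mathrm{FPR} = \frac{FP}{FP + TN}, \qquad
\mathrm{FNR} = \frac{FN}{TP + FN}, \qquad
\mathrm{Acc} = \frac{TP + TN}{TP + FP + TN + FN}
```

Under these definitions $\mathrm{TPR} + \mathrm{FNR} = 1$, so a classifier with no missed positives has $\mathrm{FNR} = 0$.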

In addition, the receiver operating characteristic (ROC) curve takes TPR as the vertical axis and FPR as the horizontal axis. It is used to evaluate classifier performance: the larger the area under the curve, the better the classifier. Classifier performance can therefore be measured by computing the area under the curve (AUC).
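The AUC can be computed without drawing the curve, using the fact that the area under the ROC curve equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one (ties counted as one half). The sketch below uses that pairwise formulation; the function name `roc_auc` and the encoding of labels as 1 (positive) and 0 (negative) are assumptions for illustration.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # A correctly ordered pair counts 1, a tie counts 0.5, a reversed pair counts 0.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

perfect = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])  # every positive outranks every negative
partial = roc_auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0])  # one of the four pairs is reversed
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a test set of a few hundred samples; a sort-based method would be preferable for large sets.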

Table 2 shows the results of the accuracy analysis of the three training approaches based on the above metrics.

Table 2 Accuracy analysis of the three training approaches

Metric                          FPR     FNR     Acc      AUC
Raw data                        0.29%   0       99.71%   0.9981
Feature selection data          1.01%   0.87%   98.11%   0.9921
Dimensionality reduction data   0.43%   1.01%   98.56%   0.9927

通过以上数据可以说明，虽然利用全部动态数据，在精度上略好于关键状态变量选取的结果以及降维的结果，但是在训练时间和预测时间上，关键状态变量的选取以及降维后的方法的耗时远小于利用全部动态数据的方法，甚至存在数量级上的差异。From the above data, it can be shown that although using all the dynamic data, the accuracy is slightly better than the result of key state variable selection and dimensionality reduction, but in terms of training time and prediction time, the selection of key state variables and the method after dimensionality reduction The time-consuming is much less than the method of using all dynamic data, even there is an order of magnitude difference.

FIG. 3 is a schematic structural diagram of a key state variable selection device for power system transient stability assessment provided by another embodiment of the present invention. The device includes a preprocessing module 31, a feature selection module 32 and a dimensionality reduction module 33, wherein:

the preprocessing module 31 is configured to obtain a plurality of state variables in power system dynamic simulation data and to preprocess the plurality of state variables using an FFT algorithm;

the feature selection module 32 is configured to perform feature selection on the plurality of preprocessed state variables to obtain a plurality of key state variables;

the dimensionality reduction module 33 is configured to perform dimensionality reduction on each of the key state variables.

Specifically, when a machine-learning-based power system transient stability assessment method is used to assess the transient stability of a power system, dynamic simulation data are used. Dynamic simulation data contain many power system state variables and also cover a relatively long simulated time span, so the "curse of dimensionality" arises easily. For example, if 2000 state variables are selected, the simulation length is 10 s and the step size is 0.01 s, then the result of each simulation is a 2000*1000 matrix. If 10,000 samples are used to train the transient stability assessment classifier, the sample set easily becomes too large to analyze, and training takes a long time, which hinders repeated training and parameter tuning of the classifier as well as its online application. The power system dynamic simulation data therefore need to be processed. The key state variable selection device for power system transient stability assessment provided by this embodiment of the present invention first screens out the key state variables from the plurality of state variables, which compresses the dynamic simulation data in terms of the number of state variables, and then applies a dimensionality reduction method to reduce the dimension of the dynamic simulation data further, thereby speeding up classifier training.
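The scale of the example above can be checked with a short calculation. The following sketch is illustrative only: the function name `dataset_size` and the assumption of 8 bytes per stored value (double-precision floating point) are not part of the embodiment.

```python
def dataset_size(n_vars, duration_s, step_s, n_samples, bytes_per_value=8):
    """Return (time steps, values per simulation, total bytes for all samples).

    bytes_per_value=8 assumes each value is stored as a 64-bit float.
    """
    n_steps = round(duration_s / step_s)
    values_per_sim = n_vars * n_steps
    total_bytes = values_per_sim * n_samples * bytes_per_value
    return n_steps, values_per_sim, total_bytes


# 2000 state variables, 10 s simulated at a 0.01 s step, 10,000 samples.
steps, per_sim, total = dataset_size(2000, 10.0, 0.01, 10_000)
print(steps, per_sim, total / 1e9)  # time steps, values per simulation, gigabytes
```

Under these assumptions each simulation yields 1000 time steps and 2,000,000 values, and 10,000 samples occupy about 160 GB, which illustrates why the raw data are impractical to train on directly.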

Screening key state variables out of a plurality of state variables requires a feature selection algorithm, and in traditional feature selection algorithms the feature attributes of a sample are real numbers. The dynamic simulation result of a power system, however, is represented by the time series of m state variables, and the time series of each state variable is an attribute of the sample. To make traditional feature selection algorithms applicable, the preprocessing module 31 first uses the FFT algorithm to preprocess the plurality of state variables in the acquired power system dynamic simulation data, converting the attribute of each state variable from a time series into real numbers. A traditional feature selection algorithm can then be used to select features from the samples and screen out the key state variables.
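As a non-limiting illustration of this preprocessing, the following sketch converts one time series into 2n real numbers by taking the real and imaginary parts of its first n harmonics. Several points are assumptions made for the example: the function name `harmonic_features`, the choice to count harmonics from k = 1 (excluding the DC term), and the direct evaluation of the discrete Fourier transform, which yields the same coefficients that an FFT routine would compute but needs only the standard library.

```python
import cmath
import math


def harmonic_features(series, n):
    """Map a time series to 2n reals: (Re, Im) of DFT harmonics k = 1..n.

    Assumption: the "first n harmonics" start at k = 1, so the DC term
    (k = 0) is excluded. Coefficients are unnormalized.
    """
    length = len(series)
    features = []
    for k in range(1, n + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / length)
                    for t, x in enumerate(series))
        features.extend([coeff.real, coeff.imag])
    return features


# Hypothetical state variable: one cosine cycle sampled at 8 points.
series = [math.cos(2 * math.pi * t / 8) for t in range(8)]
print(harmonic_features(series, 2))
```

For long series an FFT routine is the practical choice; with NumPy available, `numpy.fft.rfft(series)[1:n + 1]` yields the same complex coefficients for n up to half the series length.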

Feature selection, also known as feature subset selection (FSS) or attribute selection, refers to selecting a subset of features from all features so that the resulting model is better. In practical machine learning applications the number of features is often large; some features may be irrelevant, and features may depend on one another, which easily leads to the following consequences: the more features there are, the longer it takes to analyze the features and train the model; and the more features there are, the more likely the "curse of dimensionality" becomes, the more complex the model is, and the poorer its generalization ability. Feature selection removes irrelevant or redundant features, thereby reducing the number of features, improving model accuracy and reducing running time. In addition, selecting the truly relevant features makes it easier for researchers to understand the process that generated the data. After the preprocessing module 31 has completed preprocessing, the feature selection module 32 can use an existing feature selection algorithm, such as the Relief algorithm, to perform feature selection on the preprocessed state variables, thereby identifying the key state variables for power system transient stability assessment.

After processing by the preprocessing module 31 and the feature selection module 32, the key state variables for power system transient stability assessment have been selected. For an actual power system, however, suppose there are 2000 state variables and 10 s are simulated with a step size of 0.01 s. Even if 50 key state variables are screened out of the 2000, the result of each simulation is still a 50*1000 matrix, which is still large, so the "curse of dimensionality" may persist. The dimensionality reduction module 33 is therefore needed to further reduce the dimension of the selected key state variables. There are two main kinds of commonly used dimensionality reduction methods: linear methods, such as PCA, and nonlinear methods, such as manifold learning.
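As a non-limiting illustration of the linear case, the following sketch applies PCA to one key state variable. It assumes that the data for that variable are arranged as a matrix with one row per simulation sample and one column per time step, and it finds the leading principal components by power iteration with deflation so that only the standard library is needed. The function name `pca_reduce`, the data layout and the iteration count are assumptions for the example, not the implementation of the embodiment.

```python
import math
import random


def pca_reduce(rows, k, iters=200, seed=0):
    """Project rows (samples x time steps) onto the top-k principal components."""
    n, t = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(t)]
    centered = [[r[j] - means[j] for j in range(t)] for r in rows]
    # Sample covariance matrix of the t columns.
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(t)]
           for a in range(t)]
    rng = random.Random(seed)
    components = []
    for _ in range(k):
        # Random start vector, so it is almost surely not orthogonal
        # to the leading eigenvector.
        v = [rng.uniform(-1.0, 1.0) for _ in range(t)]
        for _step in range(iters):
            w = [sum(cov[a][b] * v[b] for b in range(t)) for a in range(t)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break  # no variance left in the deflated matrix
            v = [x / norm for x in w]
        cov_v = [sum(cov[a][b] * v[b] for b in range(t)) for a in range(t)]
        eigval = sum(v[a] * cov_v[a] for a in range(t))
        components.append(v)
        # Deflation: remove the component just found from the covariance.
        cov = [[cov[a][b] - eigval * v[a] * v[b] for b in range(t)]
               for a in range(t)]
    scores = [[sum(c[j] * comp[j] for j in range(t)) for comp in components]
              for c in centered]
    return scores, components


# Hypothetical data: 4 samples of one state variable over 3 time steps.
rows = [[1.0, 2.0, 0.0], [2.0, 4.0, 0.0], [3.0, 6.0, 0.0], [4.0, 8.0, 0.0]]
scores, components = pca_reduce(rows, k=1)
print(scores)
```

In this toy data every sample lies on one line, so a single component captures all of the variance and each 3-point series is reduced to one number. For realistic sizes a library routine based on singular value decomposition would be preferable to power iteration.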

Based on the above embodiment, the preprocessing module 31 is specifically configured to:

Based on the above embodiment, the feature selection module 32 is specifically configured to:

After processing by the preprocessing module 31, the attributes of each sample in the power system dynamic simulation data have been converted by the FFT algorithm from the time series of m state variables into real numbers. The feature selection module 32 can therefore use the traditional Relief algorithm for feature selection: based on the 2n real-number attributes of each state variable, it screens out the state variables with the highest-ranked weights, thereby obtaining a plurality of key state variables.
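As a non-limiting illustration, a Relief-style ranking of state variables can be sketched as follows. Each sample is assumed to be a list of m state variables, each represented by its tuple of 2n real attributes. The names `var_dist`, `sample_dist` and `relief`, the use of Euclidean distance, and the classic Relief weight update (add the per-variable distance to the nearest sample of a different class, subtract the distance to the nearest sample of the same class) are assumptions for the example. With this update, the weight of a state variable rises whenever the same-class neighbor is closer on that variable than the different-class neighbor, and falls otherwise.

```python
import math
import random


def var_dist(a, b):
    """Euclidean distance between the attribute vectors of one state variable."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def sample_dist(s1, s2):
    """Distance between two samples over all state variables."""
    return math.sqrt(sum(var_dist(a, b) ** 2 for a, b in zip(s1, s2)))


def relief(samples, labels, p, q, seed=0):
    """Return (weights, indices of the q highest-weighted state variables).

    Requires at least two samples in every class.
    """
    rng = random.Random(seed)
    m = len(samples[0])
    weights = [0.0] * m
    for _ in range(p):
        i = rng.randrange(len(samples))
        r = samples[i]
        same = [s for k, s in enumerate(samples)
                if k != i and labels[k] == labels[i]]
        other = [s for k, s in enumerate(samples) if labels[k] != labels[i]]
        hit = min(same, key=lambda s: sample_dist(r, s))    # nearest same-class
        miss = min(other, key=lambda s: sample_dist(r, s))  # nearest other-class
        for j in range(m):
            weights[j] += (var_dist(r[j], miss[j]) - var_dist(r[j], hit[j])) / p
    ranked = sorted(range(m), key=lambda j: weights[j], reverse=True)
    return weights, ranked[:q]


# Hypothetical data: variable 0 separates the classes, variable 1 does not.
samples = [
    [(0.0, 0.0), (0.3, 0.1)],
    [(0.1, 0.0), (0.9, 0.5)],
    [(1.0, 1.0), (0.2, 0.8)],
    [(1.1, 1.0), (0.8, 0.2)],
]
labels = [0, 0, 1, 1]
weights, key_vars = relief(samples, labels, p=20, q=1)
print(weights, key_vars)
```

In this toy data set state variable 0 receives a positive weight and state variable 1 a negative one, so variable 0 is selected as the key state variable.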

The step of using the Relief algorithm to screen out the highest-weighted state variables based on the 2n real-number attributes of each state variable, thereby obtaining a plurality of key state variables, specifically includes:

Through key state variable selection and dimensionality reduction, the key state variable selection device for power system transient stability assessment proposed by the present invention can significantly shorten the training time and classification time of the transient stability classifier without reducing the classification accuracy of the transient stability discriminator, and is therefore better suited to online application.

Finally, the above method is only a preferred embodiment of the present invention and is not intended to limit its scope of protection. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A method for selecting key state variables for power system transient stability assessment, characterized in that it includes:

S1, obtaining a plurality of state variables in power system dynamic simulation data, and preprocessing the plurality of state variables using an FFT algorithm;

S2, performing feature selection on the plurality of preprocessed state variables to obtain a plurality of key state variables;

S3, performing dimensionality reduction on each of the key state variables.

2. The method according to claim 1, wherein the power system dynamic simulation data further comprises: power system dynamic waveform data after fault removal.

3. The method according to claim 1, wherein said step S1 further comprises:

using the FFT algorithm to extract the real and imaginary parts of the first n harmonics of the time series of each state variable in the power system dynamic simulation data, so that the time series attribute of each state variable is converted into 2n real-number attributes, where n is a natural number greater than 1.

4. The method according to claim 3, wherein said step S2 further comprises:

According to the 2n real number attributes of each state variable, the state variables with the highest weight are screened out by using the Relief algorithm, and multiple key state variables are obtained.

5. The method according to claim 4, wherein the step of using the Relief algorithm to screen out the state variables with the highest weights according to the 2n real-number attributes of each state variable, thereby obtaining a plurality of key state variables, further comprises:

S21, randomly selecting one dynamic simulation data R from a plurality of dynamic simulation data, then searching the sample set of the same class as the dynamic simulation data R for its nearest-neighbor dynamic simulation data H, and searching the sample set of a different class from the dynamic simulation data R for its nearest-neighbor dynamic simulation data M;

S22, updating the weight of each state variable according to the following rules: if the distance between the dynamic simulation data R and the dynamic simulation data H on a given state variable is smaller than the distance between the dynamic simulation data R and the dynamic simulation data M on that state variable, increasing the weight of that state variable; or, if the distance between the dynamic simulation data R and the dynamic simulation data H on a given state variable is greater than the distance between the dynamic simulation data R and the dynamic simulation data M on that state variable, reducing the weight of that state variable;

S23, repeating steps S21 and S22 p times to obtain the weight values of the plurality of state variables, sorting the state variables by weight value in descending order, and taking the q state variables with the largest weight values as the key state variables;

wherein the values of p and q are determined according to the requirements of the power system transient stability assessment.

6. The method according to claim 1, wherein said step S3 further comprises:

The principal component analysis method is used to reduce the dimensionality of each of the key state variables.

7. The method according to claim 1, wherein said step S3 further comprises:

A manifold learning method is used to reduce the dimensionality of each of the key state variables.

8. A key state variable selection device for power system transient stability assessment, characterized in that it includes:

A preprocessing module, configured to obtain a plurality of state variables in the power system dynamic simulation data, and use an FFT algorithm to preprocess the plurality of state variables;

A feature selection module, configured to perform feature selection on the preprocessed multiple state variables to obtain multiple key state variables;

A dimensionality reduction module, configured to perform dimensionality reduction on the multiple key state variables.

9. The device according to claim 8, wherein the preprocessing module is specifically used for:

10. The device according to claim 9, wherein the feature selection module is specifically used for: