CN112149873B

CN112149873B - Low-voltage station line loss reasonable interval prediction method based on deep learning

Info

Publication number: CN112149873B
Application number: CN202010867853.0A
Authority: CN
Inventors: 井友鼎; 付勇; 滕铁军; 郝增才; 陈小燕; 张伟
Original assignee: Beijing Hezhong Weiqi Technology Co ltd
Current assignee: Beijing Hezhong Weiqi Technology Co ltd
Priority date: 2020-08-25
Filing date: 2020-08-25
Publication date: 2023-07-21
Anticipated expiration: 2040-08-25
Also published as: CN112149873A

Abstract

The invention belongs to the technical field of station area line loss prediction, and in particular relates to a deep-learning-based method for predicting the reasonable interval of line loss in a low-voltage station area. The method comprises the following steps: collecting data; constructing indexes; constructing features; selecting feature factors; constructing a line loss prediction model; and displaying the model prediction effect. An accurate line loss prediction model improves the ability to reduce losses precisely, and accurate prediction of line loss supports lean management of line loss.

Description

Low-voltage station line loss reasonable interval prediction method based on deep learning

Technical Field

The invention belongs to the technical field of line loss prediction of a transformer area, and particularly relates to a reasonable interval prediction method of line loss of a low-voltage transformer area based on deep learning.

Background

The causes of line loss in a low-voltage distribution station area fall into three groups: fixed losses, management causes and technical causes. Fixed losses include the resistance and excitation losses of the transformer windings and iron core; the resistance loss of the cable lines that carry power through the grid; the energy loss of capacitors and reactors deployed in the transmission network; the energy loss of protection devices in the power network; dielectric loss; and the loss of grid metering devices. Management causes mainly refer to meter reading problems, insufficient control of electricity theft and the like. Technical causes mainly refer to inconsistencies between marketing and distribution records, incorrect user-to-transformer relationships and the like.

As the State Grid Corporation deepens its lean management of line loss, the traditional one-size-fits-all assessment of the line loss qualification rate no longer meets lean management requirements. Power supply enterprises need an effective line loss calculation method that dynamically predicts the reasonable line loss value and the upper bound of the reasonable interval for every station area, and that gives timely warning for station areas whose line loss exceeds that upper bound.

Earlier researchers have studied and verified the calculation of station area line loss extensively. Existing methods for calculating the grid line loss rate fall into traditional methods and machine-learning-based methods. Traditional methods mainly include the station area loss rate method, the voltage loss rate method, the equivalent resistance method, the power flow method and the like. Machine learning methods mainly include highly interpretable linear regression built on the traditional methods, ensemble regression based on decision trees, nonlinear regression based on neural networks (which is hard to interpret), calculation methods based on support vector machines, a calculation method based on a neural network and an improved adaptive quadratic mutation differential evolution algorithm, a calculation method based on improved K-means clustering and a BP neural network, and a short-term theoretical line loss prediction algorithm for low-voltage distribution networks based on K-means and LightGBM.

The invention patent with publication No. CN109272176B discloses a method for predicting and calculating the station area line loss rate using a K-means clustering algorithm, comprising: step 1, selecting the active power supply quantity X1, the reactive power supply quantity X2, the total length of the supply line X3, the power supply radius X4 and the total line resistance X5 as electrical characteristic parameters; step 2, standardizing the raw data of the electrical characteristic parameters; step 3, establishing a station area performance index function PI(i) from the electrical characteristic parameters, and selecting the initial cluster centers and the number of clusters K; step 4, predicting the station area line loss rate with an improved K-means clustering algorithm. That method uses the index function built from the electrical characteristic parameters as the criterion for choosing the initial cluster centers, which improves the accuracy of the clustering result. However, different initial values still lead to different classification results, and an inaccurate classification easily causes a large parameter reconstruction error.

Disclosure of Invention

In view of the problems in the prior art, the invention aims to provide a deep-learning-based method for predicting the reasonable interval of line loss in a low-voltage station area. With an accurate line loss prediction model, the method improves the ability to reduce losses precisely, and accurate prediction of line loss supports lean management of line loss.

The technical scheme of the invention is as follows:

a low-voltage station area line loss reasonable interval prediction method based on deep learning comprises the following steps:

S1, collecting data: data are collected from four systems (the data acquisition system, the marketing system, PMS and GIS) and include distribution transformer side data and user side data;

S2, constructing indexes: line loss influence factors are selected from the data collected in S1;

S3, constructing features: based on a correlation analysis between the selected influence factors and the line loss rate, the square of the load characteristic and the absolute value of the head-to-end voltage drop are used as feature factors input to the model;

S4, selecting feature factors: a Lasso regression model is constructed from the statistical index factors and the two constructed feature factors, and according to the Lasso variable screening result, eight factors are selected as the final feature factors: the on-grid electricity ratio, the power supply radius, the end-of-line electricity ratio, the station area power factor, the absolute value of the head-to-end voltage drop, the load rate, the square of the load characteristic and the three-phase imbalance coefficient;

S5, constructing a line loss prediction model: the line loss prediction model is built with an LSTM;

S6, displaying the model prediction effect.

Specifically, in the step S1, the data on the distribution transformer side includes distribution transformer capacity, voltage, current, power and electric quantity, and the data on the user side includes user capacity, daily electric quantity and user coordinates.

Specifically, in step S2, twenty-four station area line loss influence factors are selected across five classes: power-source indexes, grid-structure indexes, electric quantity indexes, operation indexes and capacity indexes.

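Specifically, in step S4, Lasso regression is a regularization method that performs parameter estimation and variable selection at the same time. The parameter estimate is defined as:

$$\hat{\beta}_{\text{lasso}} = \arg\min_{\beta} \left\{ \sum_{i=1}^{n} \Big( y_i - \sum_{j=1}^{p} x_{ij}\beta_j \Big)^2 + \lambda \sum_{j=1}^{p} |\beta_j| \right\}$$

where $\lambda$ is a non-negative regularization parameter and $\lambda \sum_{j=1}^{p} |\beta_j|$ is the penalty term.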
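As an illustration of how the Lasso penalty drives variable screening, the following minimal sketch solves the objective above by coordinate descent using only the Python standard library. The solver, the synthetic data, the value of `lam` and all function names are assumptions made for this example; the patent does not state which solver or regularization strength was used.

```python
import random


def soft_threshold(rho, lam):
    """Soft-thresholding operator: shrinks rho toward zero by lam."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise sum_i (y_i - sum_j x_ij*b_j)^2 + lam * sum_j |b_j|.

    X is a list of rows (assumed roughly centred), y a list of targets.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_wo_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_wo_j)
                z += X[i][j] ** 2
            # Setting the subgradient to zero gives a soft threshold at lam/2.
            beta[j] = soft_threshold(rho, lam / 2.0) / z if z > 0 else 0.0
    return beta


# Hypothetical data: four candidate factors, only factors 0 and 2 matter.
random.seed(0)
X = [[random.gauss(0, 1) for _ in range(4)] for _ in range(200)]
y = [3.0 * row[0] - 2.0 * row[2] + random.gauss(0, 0.1) for row in X]

beta = lasso_coordinate_descent(X, y, lam=40.0)
selected = [j for j, b in enumerate(beta) if abs(b) > 1e-8]
print("coefficients:", [round(b, 3) for b in beta])
print("selected factor indices:", selected)
```

Factors whose coefficients are shrunk exactly to zero are dropped, which is the sense in which Lasso performs estimation and selection in one step.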

Specifically, in step S4, screening the regression variables with the Lasso regression model includes rejecting outliers based on the estimated residuals, as follows:

1) Calculate the prediction residual

resid = y - y_pred

where y is the actual line loss rate and y_pred is the value predicted by the regression model.

2) Calculate the mean and standard deviation of the residual

mean_resid = mean(resid)

std_resid = std(resid)

3) Calculate the z-score

z_score = (resid - mean_resid) / std_resid

4) Outlier detection

|z_score| > 3

Line loss rate samples whose z-score exceeds 3 in absolute value are removed from the sample data.
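The four steps above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name, the sample values and the use of the population standard deviation are assumptions, since the text does not say which standard deviation estimator is used.

```python
import statistics


def reject_outliers(actual, predicted, threshold=3.0):
    """Return the indices of samples kept after z-score screening of residuals."""
    resid = [a - p for a, p in zip(actual, predicted)]
    mean_resid = statistics.mean(resid)
    std_resid = statistics.pstdev(resid)  # assumption: population std
    if std_resid == 0:
        return list(range(len(resid)))
    z_score = [(r - mean_resid) / std_resid for r in resid]
    return [i for i, z in enumerate(z_score) if abs(z) <= threshold]


# Hypothetical line loss rates (%): sample 7 is an abnormal reading.
actual = [2.0 + 0.1 * (i % 3) for i in range(20)]
actual[7] = 12.0
predicted = [2.1] * 20

kept = reject_outliers(actual, predicted)
print("kept", len(kept), "of", len(actual), "samples")
```

Note that with very small samples a single outlier inflates the standard deviation enough that its own z-score may never reach 3, so the rule is only meaningful on reasonably large data sets.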

Specifically, step S4 further includes normalizing the selected feature factors. Each feature factor is normalized to the range [0.1, 0.9] by min-max normalization. Given N samples $\{x^{(n)}\}_{n=1}^{N}$, for each feature dimension x the normalized feature is

$$\hat{x}^{(n)} = 0.1 + 0.8 \cdot \frac{x^{(n)} - \min(x)}{\max(x) - \min(x)}$$

where min(x) and max(x) are the minimum and maximum values of the feature factor x over all samples, respectively.
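A minimal sketch of min-max scaling into [0.1, 0.9] follows. The function name, the handling of a constant feature and the sample power supply radii are assumptions for illustration.

```python
def min_max_scale(values, low=0.1, high=0.9):
    """Linearly map values so that min -> low and max -> high."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # Assumption: a constant feature is mapped to the middle of the range.
        return [(low + high) / 2.0 for _ in values]
    span = v_max - v_min
    return [low + (high - low) * (v - v_min) / span for v in values]


# Hypothetical power supply radii in metres.
radius = [120.0, 250.0, 500.0, 380.0]
scaled = min_max_scale(radius)
print([round(s, 3) for s in scaled])
```

In practice the minimum and maximum would be taken from the training set and reused for new samples, so that training and prediction share the same scale.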

Specifically, in step S5, the line loss prediction model takes the eight feature factors selected by the Lasso algorithm as inputs of the LSTM model, and the network structure is optimized through iterative training, including the following steps:

1) The eight feature factors are normalized and used as the input of the LSTM model, the actual line loss rate is used as the model output, and the data are divided into a training set and a test set;

2) The basic parameters of the LSTM deep learning model are set, including the activation function, the number of layers of the deep neural network, and the gradient optimization algorithm, which is Adam;

3) The model parameters are trained on the training data, and the basic parameters of the network are fine-tuned, including increasing the number of layers and changing the activation function, until the evaluation function reaches the desired range; the final configuration uses 128 layers, a sigmoid activation function and a Dropout layer with a dropout rate of 0.5;

4) Finally, the trained model predicts the line loss rate and the upper bound of the reasonable interval for newly input feature factors.

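Specifically, the evaluation function of the LSTM model is RMSLE, defined as:

$$\mathrm{RMSLE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \log(\hat{y}_i + 1) - \log(y_i + 1) \right)^2}$$

where $y_i$ is the actual line loss value, $\hat{y}_i$ is the predicted line loss value and n is the number of samples.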

To strengthen the daily management of power supply enterprises, China has comprehensively implemented station area management for low-voltage distribution networks. Station area management is an important component of the grid's "four-division" line loss management, and station area line loss directly reflects the grid management level of a region. However, low-voltage station areas have many users and varied loads, the quality of grass-roots grid management and of grid-structure construction is uneven, station area ledgers are incomplete, and line layouts are complex and diverse, all of which make station area line loss management more complex. Given this situation, calculating the station area line loss rate accurately and quickly has become a problem that needs to be solved.

The traditional line loss calculation methods have the following problems. First, simplified algorithms such as the station area loss rate method and the voltage loss rate method have low calculation accuracy and cannot meet the requirements of lean station area management. Second, accurate algorithms such as the equivalent resistance method and the power flow method place high demands on the station area network topology, the equipment parameters and the operating data; because the distribution network is very large, carrying out field measurements across many station areas involves a heavy workload, so the overall line loss situation is hard to grasp. Third, with the development of new energy, the connection of 380 V photovoltaic and other sources in station areas has become common, and the existing line loss calculation methods cannot keep up with this development.

Machine learning algorithms greatly improve the accuracy of line loss calculation or prediction, but they have limitations. A calculation method based on a support vector machine trains slowly when the data volume is large, and a suitable kernel function is hard to find when the feature space has many dimensions. Calculation methods based on K-means clustering with a BP neural network, or on K-means with LightGBM, choose the initial cluster centers at random, so different initial values lead to different classification results, and an inaccurate classification easily causes a large parameter reconstruction error.

The beneficial effects of the invention are as follows: 1) Lasso regression is used for feature selection, which optimizes the model input and improves the running efficiency and stability of the model; 2) a deep learning model is built for line loss prediction; it has excellent nonlinear function approximation ability and can mine the deep relationships between the feature factors and the line loss rate, so the prediction results are more reasonable; 3) RMSLE replaces the traditional RMSE for model evaluation; RMSLE penalizes under-prediction more heavily than over-prediction, which effectively handles the long tail of the line loss rate distribution and makes the prediction results more reasonable and accurate; 4) the invention estimates the upper bound of the reasonable interval of the predicted line loss from the residual between the predicted and actual line loss.

Marketing-side line loss management covers the widest range of users, involves the most equipment and handles the largest volume of data. With an accurate line loss prediction model, the method provided by the invention improves the ability to reduce losses precisely and achieves lean management of line loss; the improved lean management is estimated to increase the company's benefit by about billions of yuan each year.

Drawings

Fig. 1 is a scatter plot of line loss rate versus load characteristic;

Fig. 2 is a scatter plot of line loss rate versus head-to-end voltage drop;

Fig. 3 is a schematic diagram of the LSTM structure;

Fig. 4 is a comparison of the predicted and actual line loss rates of the station areas.

Detailed Description

The technical scheme of the invention is described in detail below with reference to the accompanying drawings and the specific embodiments.

Example 1

The low-voltage station area line loss reasonable interval prediction method based on deep learning provided by the embodiment comprises the following steps:

S1, collecting data: data are collected from four systems (the data acquisition system, the marketing system, PMS and GIS) and include distribution transformer side data and user side data; the distribution transformer side data include distribution transformer capacity, voltage, current, power and electric quantity, and the user side data include user capacity, daily electric quantity and user coordinates;

S2, constructing indexes: line loss influence factors are selected from the data collected in S1; twenty-four station area line loss influence factors are selected across five classes, namely power-source indexes, grid-structure indexes, electric quantity indexes, operation indexes and capacity indexes;

S3, constructing features: based on a correlation analysis between the selected influence factors and the line loss rate, the square of the load characteristic and the absolute value of the head-to-end voltage drop are used as feature factors input to the model. The correlation analysis of the 24 influence factors against the line loss rate shows that some factors have a roughly linear correlation with the line loss rate; for example, the power supply radius, the end-of-line electricity ratio, the load rate and the three-phase imbalance are all positively correlated with the line loss rate overall. Other factors are nonlinearly correlated. Fig. 1 is a scatter plot of line loss rate versus load characteristic, and Fig. 2 is a scatter plot of line loss rate versus head-to-end voltage drop. The line loss rate is linearly related to the square of the load characteristic and positively linearly related to the absolute value of the head-to-end voltage drop, so the square of the load characteristic and the absolute value of the head-to-end voltage drop are used as feature inputs of the model;

S4, selecting feature factors: a Lasso regression model is constructed from the 22 statistical index factors and the two constructed feature factors, and according to the Lasso variable screening result, eight factors are selected as the final feature factors: the on-grid electricity ratio, the power supply radius, the end-of-line electricity ratio, the station area power factor, the absolute value of the head-to-end voltage drop, the load rate, the square of the load characteristic and the three-phase imbalance coefficient. Too many influence factors in model training reduce training efficiency and model stability, so the invention uses Lasso to screen the feature factors. Lasso is one of the most widely used methods for parameter estimation and variable selection and has been proven consistent under certain conditions. It is a regularization method that performs parameter estimation and variable selection at the same time, and the parameter estimate is defined as:

$$\hat{\beta}_{\text{lasso}} = \arg\min_{\beta} \left\{ \sum_{i=1}^{n} \Big( y_i - \sum_{j=1}^{p} x_{ij}\beta_j \Big)^2 + \lambda \sum_{j=1}^{p} |\beta_j| \right\}$$

where $\lambda$ is a non-negative regularization parameter and $\lambda \sum_{j=1}^{p} |\beta_j|$ is the penalty term;

S6, displaying the model prediction effect.
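As an illustration of the feature construction in step S3, the sketch below compares the Pearson correlation of raw and transformed factors with the line loss rate. All data values are hypothetical, and the assumption that the raw factors take both signs is made only so that the nonlinearity is visible in a small example; the patent's own data are shown only in Figs. 1 and 2.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


# Hypothetical raw factors and a loss rate that depends on them nonlinearly.
load_feature = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
voltage_drop = [-6.0, 4.0, -2.0, 0.5, 3.0, -5.0, 7.0]
loss_rate = [0.5 + 0.2 * l * l + 0.1 * abs(v)
             for l, v in zip(load_feature, voltage_drop)]

# Constructed features: square of the load feature, absolute voltage drop.
load_sq = [l * l for l in load_feature]
drop_abs = [abs(v) for v in voltage_drop]

r_load_raw = pearson(load_feature, loss_rate)
r_load_sq = pearson(load_sq, loss_rate)
r_drop_raw = pearson(voltage_drop, loss_rate)
r_drop_abs = pearson(drop_abs, loss_rate)
print(f"load: raw {r_load_raw:.2f}, squared {r_load_sq:.2f}")
print(f"voltage drop: raw {r_drop_raw:.2f}, absolute {r_drop_abs:.2f}")
```

The transformed features correlate strongly and linearly with the target while the raw ones barely do, which is the motivation for feeding the transformed versions to a linear screening step such as Lasso.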

In step S2, the twenty-four station area line loss index factors are, by class:

Power-source index: on-grid electricity ratio;

Capacity indexes: distribution transformer capacity, total capacity of single-phase users, total capacity of three-phase users, user capacity ratio, percentage of single-phase user capacity, and percentage of three-phase user capacity;

Grid-structure indexes: power supply radius, grid structure, and supply line length per household;

Electric quantity indexes: total daily power supply, total daily consumption of single-phase users, total daily consumption of three-phase users, percentage of consumption by single-phase users, percentage of consumption by three-phase users, and percentage of consumption by end-of-line users;

Operation indexes: power factor, average bus voltage, head-to-end voltage drop, average load rate, maximum load rate, load characteristic, three-phase imbalance, and maximum three-phase imbalance.

Outliers affect the stability of the model, so screening the regression variables with the Lasso regression model in step S4 includes rejecting outliers based on the estimated residuals. The outlier rejection process is as follows:

1) Calculate the prediction residual

resid = y - y_pred

where y is the actual line loss rate and y_pred is the value predicted by the regression model.

2) Calculate the mean and standard deviation of the residual

mean_resid = mean(resid)

std_resid = std(resid)

3) Calculate the z-score

z_score = (resid - mean_resid) / std_resid

4) Outlier detection

|z_score| > 3

Example 2

In general, the value ranges of the raw feature dimensions of a sample differ widely because the features come from different sources and use different units of measure. When the Euclidean distance between samples is calculated, features with a large value range dominate; the power supply radius and the load rate, for example, differ greatly in scale. This affects the convergence speed of the deep learning model built later, as well as its stability and accuracy, so the input feature factors need to be normalized.

This embodiment differs from Example 1 in that the feature factors selected in step S4 are normalized. Min-max normalization is used to map each feature factor into the range [0.1, 0.9]. Given N samples $\{x^{(n)}\}_{n=1}^{N}$, for each feature dimension x the normalized feature is

$$\hat{x}^{(n)} = 0.1 + 0.8 \cdot \frac{x^{(n)} - \min(x)}{\max(x) - \min(x)}$$

Example 3

In step S5, the eight feature factors selected by the Lasso algorithm are used as inputs of the LSTM model to build the line loss prediction model, and the network structure is optimized through iterative training.

LSTM is a recurrent neural network; the LSTM layer is a variant of the SimpleRNN layer. The algorithm was developed by Hochreiter and Schmidhuber in 1997 and effectively addresses the vanishing gradient problem of the simple recurrent neural network (SimpleRNN). It does so by adding a way to carry information across many time steps.

The basic principle can be pictured as a conveyor belt running parallel to the time sequence. Information from the sequence can jump onto the belt at any time step, be carried to a later time step, and jump off again intact when it is needed. This is the basic principle of LSTM: it saves information for later use, which prevents earlier signals from gradually fading during processing. The structure is shown in Fig. 3, where h(t) is the short-term state; c(t) is the long-term state; g(t) is the output of the main layer, whose basic function is to analyze the current input x(t) and the previous short-term state h(t-1); f(t) controls which parts of the long-term state should be discarded; i(t) controls which parts of g(t) are added to the long-term state; and o(t) controls which parts of the long-term state should be read and output at this time step.

The calculation formulas of the LSTM are as follows:

$$i_{(t)} = \sigma\left(W_{xi}^{T} x_{(t)} + W_{hi}^{T} h_{(t-1)} + b_i\right)$$

$$f_{(t)} = \sigma\left(W_{xf}^{T} x_{(t)} + W_{hf}^{T} h_{(t-1)} + b_f\right)$$

$$o_{(t)} = \sigma\left(W_{xo}^{T} x_{(t)} + W_{ho}^{T} h_{(t-1)} + b_o\right)$$

$$g_{(t)} = \tanh\left(W_{xg}^{T} x_{(t)} + W_{hg}^{T} h_{(t-1)} + b_g\right)$$

where $W_{xi}$, $W_{xf}$, $W_{xo}$, $W_{xg}$ are the weight matrices connecting each layer to the input vector $x_{(t)}$; $W_{hi}$, $W_{hf}$, $W_{ho}$, $W_{hg}$ are the weight matrices connecting each layer to the previous short-term state $h_{(t-1)}$; and $b_i$, $b_f$, $b_o$, $b_g$ are the bias terms of each layer.
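To make the gate equations concrete, here is a minimal sketch of a single LSTM time step in plain Python. The long-term and short-term state updates, c(t) = f(t)·c(t-1) + i(t)·g(t) and h(t) = o(t)·tanh(c(t)), are the standard LSTM updates and are not printed in the text above; the layer size, the random weights and all function names are assumptions for illustration, not the trained model of the patent.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W_x, W_h, b, x, h):
    """Compute W_x^T x + W_h^T h + b; weights are stored as [input][unit]."""
    return [
        sum(W_x[k][u] * x[k] for k in range(len(x)))
        + sum(W_h[k][u] * h[k] for k in range(len(h)))
        + b[u]
        for u in range(len(b))
    ]


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step; params maps gate name -> (W_x, W_h, b)."""
    i = [sigmoid(z) for z in affine(*params["i"], x, h_prev)]    # input gate
    f = [sigmoid(z) for z in affine(*params["f"], x, h_prev)]    # forget gate
    o = [sigmoid(z) for z in affine(*params["o"], x, h_prev)]    # output gate
    g = [math.tanh(z) for z in affine(*params["g"], x, h_prev)]  # main layer
    # Standard state updates (assumed; not given explicitly in the text).
    c = [f_u * c_u + i_u * g_u for f_u, c_u, i_u, g_u in zip(f, c_prev, i, g)]
    h = [o_u * math.tanh(c_u) for o_u, c_u in zip(o, c)]
    return h, c


def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Hypothetical sizes: 8 normalized feature factors in, 2 hidden units.
random.seed(1)
n_in, n_units = 8, 2
params = {
    gate: (rand_matrix(n_in, n_units), rand_matrix(n_units, n_units), [0.0] * n_units)
    for gate in "ifog"
}

h, c = [0.0] * n_units, [0.0] * n_units
sequence = [[0.1 + 0.8 * random.random() for _ in range(n_in)] for _ in range(3)]
for x_t in sequence:
    h, c = lstm_step(x_t, h, c, params)
print("short-term state h:", [round(v, 4) for v in h])
```

A real model would stack such cells inside a deep learning framework and learn the weights with an optimizer such as Adam; the sketch only shows how the gates combine the current input with the previous states.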

Example 4

Line loss prediction is a regression problem. Regression models are usually evaluated with MSE, RMSE, MAPE, R2 and similar metrics. However, the target variable for model training is the actual line loss rate, and analysis of its distribution shows that it is not symmetric and has a noticeable long tail, so RMSE is not the best choice. In this embodiment, RMSLE is used as the evaluation function of the LSTM model, defined as:

$$\mathrm{RMSLE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \log(\hat{y}_i + 1) - \log(y_i + 1) \right)^2}$$

where $y_i$ is the actual line loss value, $\hat{y}_i$ is the predicted line loss value and n is the number of samples.

Because the actual line loss rate has a long tail, an evaluation based on RMSE is dominated by a few abnormally large values: even if many small values are predicted accurately, the RMSE is large whenever a few abnormally large values are predicted poorly. RMSLE takes logarithms first and then computes the RMSE, which effectively mitigates this problem.
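The difference between the two metrics can be seen on a small hypothetical sample with one tail value. The data and function names below are assumptions for illustration, and the `log(1 + y)` form is the common RMSLE convention.

```python
import math


def rmse(actual, predicted):
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / n)


def rmsle(actual, predicted):
    n = len(actual)
    return math.sqrt(
        sum((math.log1p(p) - math.log1p(a)) ** 2 for a, p in zip(actual, predicted)) / n
    )


# Hypothetical line loss rates (%); the last value sits in the long tail.
actual = [2.0, 3.0, 2.5, 4.0, 40.0]
good_on_typical = [2.0, 3.0, 2.5, 4.0, 20.0]  # exact except for the tail value
poor_on_typical = [4.0, 6.0, 5.0, 8.0, 38.0]  # doubles typical values, near on tail

for name, pred in [("good_on_typical", good_on_typical),
                   ("poor_on_typical", poor_on_typical)]:
    print(f"{name}: RMSE={rmse(actual, pred):.3f}  RMSLE={rmsle(actual, pred):.3f}")
```

RMSE ranks the second model as better because it is driven by the single tail value, whereas RMSLE ranks the first model as better because it measures relative error across all samples. RMSLE is also asymmetric: for the same absolute error it penalizes an under-prediction more than an over-prediction.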

Example 5

Twenty station areas in a certain city were randomly sampled, and the relation between their actual and predicted line loss rates over the most recent 20 days was analyzed. In Fig. 4, blue is the actual line loss rate and yellow is the predicted line loss rate. Most predicted values lie in the middle of the range of actual values, and their distribution is similar to that of the actual line loss rate, which indicates that the model predicts well.

The upper bound of the prediction interval is estimated from the residual between the predicted and actual line loss, that is, from the RMSE of the trained model, and the interval is extended following the 3σ principle. Under a normal distribution, ±1δ covers 68.3% of values, ±2δ covers 95.4% and ±3δ covers 99.7%, while 93.32% of values lie below +1.5δ (one-sided). The invention extends the interval by 1.5δ, where δ = RMSE, and the RMSE is calculated as

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \hat{y}_i - y_i \right)^2}$$

The upper bound of the prediction interval of the invention is therefore

$$\hat{y}_{\text{upper}} = \hat{y} + 1.5 \cdot \mathrm{RMSE}$$

where $\hat{y}_{\text{upper}}$ is the upper bound of the reasonable interval, $\hat{y}$ is the predicted line loss value, and RMSE is the root mean square error obtained from model training.
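The interval extension and the resulting early warning can be sketched as follows. The training residuals, the new predictions and the function names are hypothetical; only the 1.5 x RMSE rule comes from the text.

```python
import math


def training_rmse(actual, predicted):
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / n)


def upper_bounds(predicted, rmse_value, k=1.5):
    """Upper bound of the reasonable interval: prediction + k * RMSE."""
    return [p + k * rmse_value for p in predicted]


def flag_exceedances(actual, bounds):
    """Indices of station areas whose actual line loss exceeds the upper bound."""
    return [i for i, (a, b) in enumerate(zip(actual, bounds)) if a > b]


# Hypothetical training results used to estimate the RMSE.
train_actual = [3.0, 4.0, 5.0, 6.0]
train_pred = [3.5, 3.5, 5.5, 5.5]
delta = training_rmse(train_actual, train_pred)

# Hypothetical new predictions and the line loss rates observed afterwards.
new_pred = [3.0, 4.2, 5.1]
new_actual = [3.2, 5.4, 5.0]

bounds = upper_bounds(new_pred, delta)
flagged = flag_exceedances(new_actual, bounds)
print("RMSE:", delta)
print("upper bounds:", [round(b, 2) for b in bounds])
print("station areas to warn:", flagged)
```

Only the second station area is flagged, because its observed line loss rate is above its own predicted value plus 1.5 times the training RMSE.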

The invention was applied to build a line loss prediction model for a certain region. Modeling and analysis were carried out on more than 300,000 samples from 10,000 station areas in a certain city, and in on-site verification the accuracy of the model exceeded 90%, which confirms the feasibility and effectiveness of the method.

The method provided by the invention differs from other machine learning methods in the following respects: 1) the station area line loss is predicted with a deep learning algorithm, which clearly improves prediction accuracy; 2) the process of feature index construction and feature selection is novel and well founded; 3) a Lasso regression model is used to screen the regression variables, and outliers are rejected based on the estimated residuals, which ensures the stability of the model; 4) model evaluation uses RMSLE rather than the traditional evaluation metrics, which effectively handles the long-tail effect; 5) the upper bound of the interval is extended by 1.5 times the RMSE.

Finally, it should be noted that the above embodiments only illustrate the technical scheme of the invention and do not limit it. Although the invention has been described in detail with reference to the preferred embodiments, those skilled in the art will appreciate that the specific embodiments may be modified, or some of their technical features replaced by equivalents; such modifications and substitutions, made without departing from the spirit of the invention, are intended to fall within the scope of the claims.

Claims

1. A low-voltage station area line loss reasonable interval prediction method based on deep learning is characterized in that,

the method comprises the following steps:

S4, selecting feature factors: a Lasso regression model is constructed from the statistical index factors and the two constructed feature factors, and according to the Lasso variable screening result, eight factors are selected as the final feature factors: the on-grid electricity ratio, the power supply radius, the end-of-line electricity ratio, the station area power factor, the absolute value of the head-to-end voltage drop, the load rate, the square of the load characteristic and the three-phase imbalance coefficient, wherein screening the regression variables with the Lasso regression model comprises the steps of:

1) Calculating the prediction residual

resid = y - y_pred

2) Calculating the mean and standard deviation of the residual

mean_resid = mean(resid)

std_resid = std(resid)

3) Calculating the z-score

z_score = (resid - mean_resid) / std_resid

4) Outlier detection

z_score > 3

Removing the line loss rate samples with a z-score greater than 3 from the sample data;

S5, constructing a line loss prediction model: the line loss prediction model is built with an LSTM, wherein the eight feature factors selected by the Lasso algorithm are used as the input of the LSTM model, and the network structure is optimized through iterative training, comprising the following steps:

4) Finally, predicting the corresponding line loss rate and the upper bound of the reasonable interval for newly input feature factors using the trained model;

S6, displaying the model prediction effect.

2. The method for predicting reasonable interval of line loss in low-voltage transformer area based on deep learning as claimed in claim 1, wherein in step S1, the data of the distribution transformer side includes distribution transformer capacity, voltage, current, power and electric quantity, and the data of the user side includes user capacity, daily electric quantity and user coordinates.

3. The method for predicting the reasonable interval of the line loss of the low-voltage transformer area based on the deep learning according to claim 1, wherein in the step S2, twenty-four transformer area line loss influencing factors including a transformer area power class index, a net rack class index, an electric quantity class index, an operation class index and a capacity class index are selected.

4. The method for predicting the reasonable interval of line loss in a low-voltage station area based on deep learning as claimed in claim 1, wherein the Lasso regression model in step S4 is a regularization method that performs parameter estimation and variable selection at the same time, and the parameter estimate is defined as follows:

$$\hat{\beta}_{\text{lasso}} = \arg\min_{\beta} \left\{ \sum_{i=1}^{n} \Big( y_i - \sum_{j=1}^{p} x_{ij}\beta_j \Big)^2 + \lambda \sum_{j=1}^{p} |\beta_j| \right\}$$

where $\lambda$ is a non-negative regularization parameter and $\lambda \sum_{j=1}^{p} |\beta_j|$ is the penalty term.

5. The method for predicting the reasonable interval of line loss in a low-voltage station area based on deep learning as claimed in claim 1, wherein step S4 further comprises normalizing the selected feature factors, each feature factor being normalized to the range [0.1, 0.9] by min-max normalization; given N samples $\{x^{(n)}\}_{n=1}^{N}$, for each feature dimension x the normalized feature is

$$\hat{x}^{(n)} = 0.1 + 0.8 \cdot \frac{x^{(n)} - \min(x)}{\max(x) - \min(x)}$$

6. The method for predicting the reasonable interval of line loss in a low-voltage station area based on deep learning as claimed in claim 1, wherein the evaluation function of the LSTM model is RMSLE, defined as follows:

$$\mathrm{RMSLE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \log(\hat{y}_i + 1) - \log(y_i + 1) \right)^2}$$

where $y_i$ is the actual line loss value, $\hat{y}_i$ is the predicted line loss value and n is the number of samples.