Background
With the rapid development of the national economy, the electric power industry is also developing rapidly. Accurate prediction of electricity sales is of great significance to power supply enterprises in many respects, such as adjusting future power supply, optimizing the power supply structure, and improving the operational safety of the power system.
The electricity sales prediction problem is, in essence, a time series prediction problem. A time series is a sequence formed by arranging the values of a statistical indicator at different moments in the order in which they occur, and time series data are the actual data recorded in such a series. Because changes in electricity sales depend not only on past electricity sales along the time axis but also on additional conditions, including whether the day is a weekend or a holiday, the season, the air temperature and the climate, electricity sales prediction is essentially a multi-condition time series prediction problem.
Common models for time series prediction include the classical autoregressive (AR) model, the moving average (MA) model, and the autoregressive moving average (ARMA) model that combines the two (ii transitions, 2015). Because time series data often exhibit non-stationary characteristics, researchers proposed the autoregressive integrated moving average (ARIMA) model, which differences the time series data one or more times so that the data become stationary (Journal of the American Statistical Association, 1970).
Besides being non-stationary, time series are usually non-linear, so the classical autoregressive models cannot fit them well. Non-linear models usually perform better on non-stationary, non-linear time series, and researchers have proposed regression models based on SVM and LS-SVM (Neural Networks and Brain, 2005). In addition, because neural networks can fit any Borel measurable function to any desired accuracy, scholars have proposed neural-network-based prediction models (Neural Networks, 1989), as well as hybrid models combining neural networks and ARIMA (Neurocomputing, 2003), which achieve better time series prediction results by exploiting the non-linear modeling of neural networks and the linear modeling of ARIMA, respectively.
However, the above prediction methods often have difficulty extracting good features efficiently, so the accuracy of their time series predictions is often low. Deep learning, proposed in recent years, is a very effective method for extracting features (Nature, 2015). Within deep learning, the recurrent neural network (RNN) model is mainly applied to sequence problems: its recurrent structure makes full use of the order information in sequence data and discovers the intrinsic rules and features of sequences. Chen et al. (IEEE International Conference on Big Data, 2015) use a recurrent neural network to predict stocks; Sutskever et al. (International Conference on Machine Learning, 2011) use recurrent neural networks for text generation; Gregor et al. (International Conference on Machine Learning, 2015) use a recurrent neural network to model pictures as sequences for automatic picture generation.
However, RNNs suffer from the vanishing gradient problem, which results in poor utilization of long-term history information.
Disclosure of Invention
In order to solve the above problems, the invention provides a method for predicting electricity sales based on a long short-term memory network, which can automatically learn the data features of electricity sales and of the relevant influence factors, and model the multiple conditions of electricity sales data based on the long short-term memory network, so as to predict electricity sales accurately.
To this end, the invention adopts the following technical scheme:
The invention discloses a method for predicting electricity sales based on a long short-term memory network, which comprises the following steps:
S1: determining the influence factors that affect the electricity sales data;
S2: for each industry to be analyzed, calculating the Pearson correlation coefficient r between its electricity sales data and each influence factor data, and constructing a correlation coefficient matrix;
S3: clustering the Pearson correlation coefficients r of all the industries using a k-means clustering algorithm to obtain a plurality of clusters, wherein each cluster comprises a plurality of industries and serves as a subset, and then accumulating the electricity sales data of all the industries in each cluster by date to obtain the daily total electricity consumption data of each cluster;
S4: normalizing the daily total electricity consumption data of each cluster, and normalizing the influence factor data;
S5: establishing an electricity sales prediction model based on a long short-term memory (LSTM) network, and training it with the normalized daily total electricity consumption data and influence factor data of each cluster as input, wherein M days are taken as the time step of the training data and a 3-dimensional input tensor of shape (batch size, time step, features) is constructed, batch size being the size of each input batch of samples, time step being the number of time steps, and features being the total number of features; obtaining the electricity sales prediction result of each cluster; and summing the prediction results of all the clusters to obtain the total electricity sales prediction result.
In this technical scheme, electricity sales involve a plurality of sectors, each sector comprises a plurality of industries, and each industry has its own influence factors on electricity sales, so the industries need to be classified. To this end, the correlation between the electricity sales data of each industry to be analyzed and each influence factor data is calculated, with the Pearson correlation coefficient r as the evaluation basis, for example: the correlation between electricity sales data and air temperature data, between electricity sales data and GDP data, between electricity sales data and investment data, and between electricity sales data and date data.
The electricity sales data and the influence factor data are then normalized, and the total electricity sales prediction result is finally obtained from the LSTM-based electricity sales prediction model.
Preferably, the Pearson correlation coefficient r is calculated as follows:
$$r = \frac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y} \tag{1}$$
wherein r represents the Pearson correlation coefficient, whose value range is [-1, 1]; r = 0 indicates no correlation; the closer r is to 1, the stronger the positive correlation, and the closer r is to -1, the stronger the negative correlation; x and y are two data feature variables; cov(x, y) denotes their covariance, and $\sigma_x$ and $\sigma_y$ denote their standard deviations.
Preferably, the k-means clustering algorithm in step S3 is as follows:
Input: the set $D = \{x_1, x_2, \ldots, x_m\}$ of Pearson correlation coefficients r between the electricity sales data and each influence factor data, where $x_i$ denotes the correlation coefficient vector of the i-th industry, each dimension of which corresponds to one influence factor related to the electricity sales data, and m denotes the number of industries; and the number of clusters k;
Output: k clusters.
Preferably, the normalization in step S4 is calculated as follows:
$$x^{*} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{2}$$
where x is the original value of each type of data, $x_{\min}$ is the minimum value of that type of data, $x_{\max}$ is the maximum value of that type of data, and $x^{*}$ is the normalized value calculated for the data.
Preferably, the method for establishing the LSTM-based electricity sales prediction model in step S5 comprises the following steps:
Let the electricity sales time series data be $X = (x_1, x_2, \ldots, x_{t-1}, x_t)$, where $x_t$ is the electricity sales amount at time t. Predicting electricity sales then means obtaining, from the known time series data X, the maximum likelihood estimate p(x) of $x_{t+1}$:
In the case of a multi-condition time series, formula (3) takes the following form:
wherein $x_t$ denotes the electricity sales amount at time t, and each additional condition variable denotes the value of the i-th influence factor at time t;
Formula (3) and formula (4) are the objectives of electricity sales prediction. The input gate, forget gate and output gate of the LSTM-RNN network control which history information is forgotten and which is memorized, thereby extracting the feature rules of the sequence data, and the prediction result is finally output. The calculation process at time t is as follows:
$$i_t = \mathrm{sigmoid}(W_{xi} x_t + W_{hi} h_{t-1} + W_{ci} c_{t-1} + b_i)$$
$$f_t = \mathrm{sigmoid}(W_{xf} x_t + W_{hf} h_{t-1} + W_{cf} c_{t-1} + b_f)$$
$$c_t = f_t c_{t-1} + i_t \tanh(W_{xc} x_t + W_{hc} h_{t-1} + b_c)$$
$$o_t = \mathrm{sigmoid}(W_{xo} x_t + W_{ho} h_{t-1} + W_{co} c_t + b_o)$$
$$h_t = o_t \tanh(c_t)$$
wherein i, f and o respectively denote the input gate, the forget gate and the output gate; c denotes the memory cell; h denotes the hidden layer output; W denotes a connection weight, whose subscript indicates which connection it belongs to; and b is a bias term.
The values of i, f and o lie in the range [0, 1] and control the proportion of history information that passes through the gate structure; the gate structure of the LSTM memory unit enables the recurrent neural network to learn history information over long intervals.
The beneficial effects of the invention are as follows: through the designed data feature selection, the data normalization process and the long short-term memory network, the data features of electricity sales and of the relevant influence factors can be learned automatically and electricity sales can be predicted accurately, which improves the accuracy of electricity sales prediction.
Detailed Description
The technical scheme of the invention is further specifically described by the following embodiments and the accompanying drawings.
Example: the method for predicting electricity sales based on a long short-term memory network in this embodiment, as shown in Fig. 1, comprises the following steps:
S1: determining the influence factors that affect the electricity sales data;
S2: for each industry to be analyzed, calculating the Pearson correlation coefficient r between its electricity sales data and each influence factor data, and constructing a correlation coefficient matrix;
the Pearson correlation coefficient r is calculated as follows:
$$r = \frac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y} \tag{1}$$
wherein r represents the Pearson correlation coefficient, whose value range is [-1, 1]; r = 0 indicates no correlation; the closer r is to 1, the stronger the positive correlation, and the closer r is to -1, the stronger the negative correlation; x and y are two data feature variables; cov(x, y) denotes their covariance, and $\sigma_x$ and $\sigma_y$ denote their standard deviations;
the calculation result shows whether two data feature variables are correlated, for example: the correlation between electricity sales data and air temperature data, between electricity sales data and GDP data, between electricity sales data and investment data, and between electricity sales data and date data;
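As a rough illustration of step S2, the sketch below computes r = cov(x, y) / (σx σy) for one industry against each influence factor, giving one row of the correlation coefficient matrix. The function names `pearson_r` and `correlation_vector`, the toy series, and the use of population (divide-by-n) covariance and standard deviation are assumptions made for the example, not details given in the text.

```python
import math

def pearson_r(x, y):
    """Pearson correlation r = cov(x, y) / (sigma_x * sigma_y)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Population covariance and standard deviations (assumed convention).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    sigma_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    sigma_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    if sigma_x == 0 or sigma_y == 0:
        raise ValueError("r is undefined for a constant series")
    return cov / (sigma_x * sigma_y)

def correlation_vector(sales, factors):
    """One row of the correlation coefficient matrix for a single industry."""
    return [pearson_r(sales, series) for series in factors.values()]

# Hypothetical toy data: five days of sales and two influence factors.
sales = [10.0, 12.0, 14.0, 16.0, 18.0]
factors = {
    "temperature": [1.0, 2.0, 3.0, 4.0, 5.0],  # rises with sales
    "holiday": [1.0, 1.0, 0.0, 0.0, 0.0],      # falls as sales rise
}
row = correlation_vector(sales, factors)
```

Stacking one such row per industry would give the matrix that step S3 clusters.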
S3: clustering the Pearson correlation coefficients r of all the industries using a k-means clustering algorithm to obtain a plurality of clusters, wherein each cluster comprises a plurality of industries whose electricity sales data correlate similarly with the influence factor data; each cluster is taken as a subset, and the electricity sales data of all the industries in each cluster are then accumulated by date to obtain the daily total electricity consumption data of each cluster;
the k-means clustering algorithm proceeds as follows:
Input: the set $D = \{x_1, x_2, \ldots, x_m\}$ of Pearson correlation coefficients r between the electricity sales data and each influence factor data, where $x_i$ denotes the correlation coefficient vector of the i-th industry, each dimension of which corresponds to one influence factor related to the electricity sales data, and m denotes the number of industries; and the number of clusters k;
Output: k clusters;
The steps are as follows:
(1) randomly select k correlation coefficient vectors from the correlation coefficient set D as the initial mean vectors $\mu = \{\mu_1, \mu_2, \ldots, \mu_k\}$, where $\mu_i$ is the i-th selected correlation coefficient vector;
(2) repeat
(3) for j = 1, 2, ..., m do
(4) calculate the distance between $x_j$ and each mean vector $\mu_i$ (i = 1, 2, ..., k), and assign $x_j$ to the cluster of the nearest mean vector;
(5) end for
(6) for i = 1, 2, ..., k do
(7) calculate the new mean vector $\mu'_i$ of cluster i;
(8) if $\mu'_i \neq \mu_i$, update $\mu_i$ to $\mu'_i$; otherwise keep $\mu_i$ unchanged;
(9) end for
(10) until no mean vector in $\mu$ is updated;
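The procedure above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions rather than the patent's implementation: squared Euclidean distance, the fixed random seed, the `max_iter` safeguard, the rule that an empty cluster keeps its previous mean, and the toy correlation vectors are all choices made for the example.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def k_means(vectors, k, seed=0, max_iter=100):
    """Cluster correlation-coefficient vectors; returns (labels, means)."""
    rng = random.Random(seed)
    # Pick k input vectors at random as the initial mean vectors.
    means = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(max_iter):
        # Assign every vector to the cluster of its nearest mean vector.
        labels = [
            min(range(k), key=lambda i: squared_distance(x, means[i]))
            for x in vectors
        ]
        # Recompute each mean; an empty cluster keeps its previous mean.
        new_means = []
        for i in range(k):
            members = [x for x, lab in zip(vectors, labels) if lab == i]
            if members:
                new_means.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_means.append(means[i])
        if new_means == means:  # no mean vector was updated: converged
            break
        means = new_means
    return labels, means

# Hypothetical correlation vectors (holiday, season, temperature) for 4 industries.
vectors = [
    [0.9, 0.8, 0.7],
    [0.85, 0.75, 0.8],
    [-0.6, -0.5, -0.7],
    [-0.7, -0.4, -0.6],
]
labels, means = k_means(vectors, k=2)
```

Because the result depends on the random initial means, a practical run would typically repeat the clustering with several seeds and keep the partition with the smallest within-cluster distance.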
S4: normalizing the daily total electricity consumption data of each cluster, and normalizing the influence factor data;
because the electricity sales values and the influence factor values differ greatly in magnitude, the data must be normalized, that is, the various values are scaled to the same scale; the normalization is calculated as follows:
$$x^{*} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{2}$$
where x is the original value of each type of data, $x_{\min}$ is the minimum value of that type of data, $x_{\max}$ is the maximum value of that type of data, and $x^{*}$ is the normalized value calculated for that type of data;
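A minimal sketch of min-max normalization, x* = (x - x_min) / (x_max - x_min), applied to one data series. The helper names and the toy daily totals are hypothetical; the inverse mapping `denormalize` is not described in the text and is included only on the assumption that predictions made on the normalized scale need to be read back in original units.

```python
def fit_min_max(values):
    """Return (x_min, x_max) for one type of data."""
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("min-max normalization is undefined for constant data")
    return lo, hi

def normalize(values, lo, hi):
    """Scale values into [0, 1] with x* = (x - x_min) / (x_max - x_min)."""
    return [(v - lo) / (hi - lo) for v in values]

def denormalize(scaled, lo, hi):
    """Map normalized values back to the original scale (assumed extra step)."""
    return [s * (hi - lo) + lo for s in scaled]

# Hypothetical daily total electricity consumption of one cluster.
daily_totals = [120.0, 150.0, 180.0, 240.0]
lo, hi = fit_min_max(daily_totals)
scaled = normalize(daily_totals, lo, hi)
restored = denormalize(scaled, lo, hi)
```

The minimum and maximum are computed per data type, so each cluster's totals and each influence factor would get its own (x_min, x_max) pair.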
S5: establishing an electricity sales prediction model based on a long short-term memory (LSTM) network, and training it with the normalized daily total electricity consumption data and influence factor data of each cluster as input, wherein M days are taken as the time step of the training data and a 3-dimensional input tensor of shape (batch size, time step, features) is constructed according to the data model, batch size being the size of each input batch of samples, time step being the number of time steps, and features being the total number of features; obtaining the electricity sales prediction result of each cluster; and summing the prediction results of all the clusters to obtain the total electricity sales prediction result.
Electricity sales involve a plurality of sectors, each sector comprises a plurality of industries, and each industry has its own influence factors on electricity sales, so the industries need to be classified. To this end, the correlation between the electricity sales data of each industry to be analyzed and each influence factor data is calculated, with the Pearson correlation coefficient r as the evaluation basis, for example: the correlation between electricity sales data and air temperature data, between electricity sales data and GDP data, between electricity sales data and investment data, and between electricity sales data and date data.
The electricity sales data and the influence factor data are then normalized, and the total electricity sales prediction result is finally obtained from the LSTM-based electricity sales prediction model.
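The 3-dimensional input tensor of step S5 can be illustrated with a sliding window over the normalized daily data. The sketch below is only one possible construction: it assumes that column 0 holds the cluster's normalized daily total and the remaining columns hold the influence factors, and that the training target is the next day's total, neither of which is stated in the text. The function name `make_windows` and the random toy data are hypothetical.

```python
import numpy as np

def make_windows(series, time_step):
    """Build LSTM inputs from a (num_days, features) array.

    Returns X with shape (samples, time_step, features) and y with shape
    (samples,), where y[i] is the column-0 value on the day after window i.
    """
    series = np.asarray(series, dtype=float)
    num_days = series.shape[0]
    if num_days <= time_step:
        raise ValueError("need more days than the time step")
    X = np.stack([series[i:i + time_step] for i in range(num_days - time_step)])
    y = series[time_step:, 0]
    return X, y

# Hypothetical data: 100 days, 1 sales column + 3 influence factors.
rng = np.random.default_rng(0)
data = rng.random((100, 4))
X, y = make_windows(data, time_step=90)
print(X.shape, y.shape)
```

Batches of the size chosen for training are then drawn along the first axis of `X`.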
The method for establishing the LSTM-based electricity sales prediction model in step S5 comprises the following steps:
Let the electricity sales time series data be $X = (x_1, x_2, \ldots, x_{t-1}, x_t)$, where $x_t$ is the electricity sales amount at time t. Predicting electricity sales then means obtaining, from the known time series data X, the maximum likelihood estimate p(x) of $x_{t+1}$:
In the case of a multi-condition time series, formula (3) takes the following form:
wherein $x_t$ denotes the electricity sales amount at time t, and each additional condition variable denotes the value of the i-th influence factor at time t;
Formula (3) and formula (4) are the objectives of electricity sales prediction. The input gate, forget gate and output gate of the LSTM-RNN network control which history information is forgotten and which is memorized, thereby extracting the feature rules of the sequence data, and the prediction result is finally output. The structure of the LSTM memory unit is shown in Fig. 2. The LSTM network is used to avoid the vanishing gradient problem that arises as the model hierarchy deepens. The calculation process at time t is as follows:
$$i_t = \mathrm{sigmoid}(W_{xi} x_t + W_{hi} h_{t-1} + W_{ci} c_{t-1} + b_i)$$
$$f_t = \mathrm{sigmoid}(W_{xf} x_t + W_{hf} h_{t-1} + W_{cf} c_{t-1} + b_f)$$
$$c_t = f_t c_{t-1} + i_t \tanh(W_{xc} x_t + W_{hc} h_{t-1} + b_c)$$
$$o_t = \mathrm{sigmoid}(W_{xo} x_t + W_{ho} h_{t-1} + W_{co} c_t + b_o)$$
$$h_t = o_t \tanh(c_t)$$
wherein i, f and o respectively denote the input gate, the forget gate and the output gate; c denotes the memory cell; h denotes the hidden layer output; W denotes a connection weight, whose subscript indicates which connection it belongs to, e.g. $W_{xi}$ denotes the connection weight from the input layer to the input gate; and b is a bias term.
The values of i, f and o lie in the range [0, 1] and control the proportion of history information that passes through the gate structure; the gate structure of the LSTM memory unit enables the recurrent neural network to learn history information over long intervals.
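To make the gate computations concrete, the sketch below runs one LSTM memory unit over a 90-day window in NumPy. It is an illustration under several assumptions: the input and forget gates read the previous cell state c_{t-1}, as in the usual peephole formulation, while the output gate reads c_t; the cell-to-gate weights W_ci, W_cf and W_co are full matrices; and the hidden size, random weights and random input are arbitrary. The names `lstm_step` and `init_params` are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One time step of an LSTM memory unit with cell-to-gate connections."""
    i_t = sigmoid(p["W_xi"] @ x_t + p["W_hi"] @ h_prev + p["W_ci"] @ c_prev + p["b_i"])
    f_t = sigmoid(p["W_xf"] @ x_t + p["W_hf"] @ h_prev + p["W_cf"] @ c_prev + p["b_f"])
    c_t = f_t * c_prev + i_t * np.tanh(p["W_xc"] @ x_t + p["W_hc"] @ h_prev + p["b_c"])
    o_t = sigmoid(p["W_xo"] @ x_t + p["W_ho"] @ h_prev + p["W_co"] @ c_t + p["b_o"])
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

def init_params(n_in, n_hidden, seed=0):
    """Small random weights and zero biases (illustrative initialization)."""
    rng = np.random.default_rng(seed)
    p = {}
    for gate in "ifco":
        p[f"W_x{gate}"] = rng.normal(0.0, 0.1, (n_hidden, n_in))
        p[f"W_h{gate}"] = rng.normal(0.0, 0.1, (n_hidden, n_hidden))
        p[f"b_{gate}"] = np.zeros(n_hidden)
    for gate in "ifo":
        p[f"W_c{gate}"] = rng.normal(0.0, 0.1, (n_hidden, n_hidden))
    return p

# Hypothetical window: 90 days, 4 features per day, 8 hidden units.
params = init_params(n_in=4, n_hidden=8)
h = np.zeros(8)
c = np.zeros(8)
window = np.random.default_rng(1).random((90, 4))
for x_t in window:
    h, c = lstm_step(x_t, h, c, params)
```

Since each gate value lies strictly between 0 and 1, the forget gate scales how much of c_{t-1} survives into c_t, which is the mechanism the text credits with retaining long-interval history.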
Taking the total electricity sales of 10 industries in the secondary and tertiary sectors of a certain city as an example, exemplary data information is shown in Table 1 below.
Table 1 Historical electricity sales data by industry
| Industry name | Industry code | Time span | Number of records |
| --- | --- | --- | --- |
| Mining of precious metals | 0920 | 20150101--20170525 | 870 |
| Asbestos and other non-metallic mining | 1090 | 20150101--20170525 | 870 |
| Telecommunications | 6010 | 20150101--20170525 | 870 |
| Real estate development and management | 7210 | 20150101--20170525 | 870 |
| Road passenger transport | 5210 | 20150101--20170525 | 870 |
| Non-ferrous metal rolling | 3550 | 20150101--20170525 | 870 |
| Structural metal product manufacture | 3410 | 20150101--20170525 | 870 |
| Cutting tool manufacture | 3421 | 20150101--20170525 | 870 |
| Other metal tool manufacture | 3429 | 20150101--20170525 | 870 |
| Manufacture of metal wire rope and its products | 3440 | 20150101--20170525 | 870 |
Taking the telecommunications industry in Table 1 as an example, its electricity sales data are shown in Table 2.
Table 2 Electricity sales data of the telecommunications industry
The specific implementation of the invention is further explained below using the electricity sales data of each industry and three influence factors, namely holidays, seasons and daily average air temperature. The steps are as follows:
S1: determining the influence factors that affect the electricity sales data as holidays, seasons and daily average air temperature;
S2: calculating the Pearson correlation coefficients r between the electricity sales data and the influence factor data of each of the 10 industries, and constructing a correlation coefficient matrix;
for example, for the telecommunications industry, calculating the correlation coefficient r between its electricity sales data and holidays, between its electricity sales data and seasons, and between its electricity sales data and the daily average air temperature yields the following correlation coefficient matrix,
wherein the first column is the correlation coefficient between the telecommunications industry electricity sales data and holidays, the second column is the correlation coefficient between the telecommunications industry electricity sales data and seasons, and the third column is the correlation coefficient between the telecommunications industry electricity sales data and the daily average air temperature;
S3: clustering the Pearson correlation coefficients r of the above 10 industries using the k-means clustering algorithm, with the number of clusters set to 2; the clustering result is as follows:
Cluster 1 contains industry codes 0920, 1090, 7210, 5210, 3350, 3421, 3429 and 3440;
Cluster 2 contains industry codes 1090 and 3410;
each cluster is taken as an industry subset, and the electricity sales of all the industries in the cluster are accumulated by date to obtain the daily total electricity consumption data of the cluster, finally giving 2 industry feature sets in total;
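The accumulation by date can be sketched as a grouped sum. The record layout `(date, industry_code, sales)`, the function name `cluster_daily_totals`, and the industry codes "A", "B", "C" with their values are placeholders invented for the example; they are not the clusters or data of this embodiment.

```python
from collections import defaultdict

def cluster_daily_totals(records, industry_to_cluster):
    """Sum electricity sales per cluster and date.

    records: iterable of (date, industry_code, sales) tuples.
    Returns {cluster: {date: total_sales}} with dates in ascending order.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for date, code, sales in records:
        totals[industry_to_cluster[code]][date] += sales
    return {c: dict(sorted(by_date.items())) for c, by_date in totals.items()}

# Hypothetical records for three placeholder industries over two days.
records = [
    ("20150101", "A", 10.0), ("20150101", "B", 5.0), ("20150101", "C", 7.0),
    ("20150102", "A", 11.0), ("20150102", "B", 6.0), ("20150102", "C", 8.0),
]
industry_to_cluster = {"A": 1, "B": 1, "C": 2}
totals = cluster_daily_totals(records, industry_to_cluster)
```

Each inner dictionary is one cluster's daily total series, which is what step S4 goes on to normalize.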
S4: because the electricity sales values differ greatly in magnitude from the holiday, season and daily average air temperature values, the data must be preprocessed by scaling the various values to the same scale; the scaling adopted is min-max normalization, which scales the data of every dimension into the range [0, 1];
taking the daily average air temperature as an example:
over the 870 days from 1 January 2015 to 25 May 2017, the maximum daily average air temperature is 33.5 degrees and the minimum is -2.5 degrees; the results of scaling each temperature value by formula (2) are shown in Table 3,
Table 3 Normalized daily average air temperature results
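As a worked illustration of formula (2) with the extremes stated above, the block below normalizes a daily average air temperature of 15.5 degrees. That value is hypothetical and chosen only to make the arithmetic easy to follow; it is not taken from Table 3.

```latex
x^{*} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}
      = \frac{15.5 - (-2.5)}{33.5 - (-2.5)}
      = \frac{18}{36}
      = 0.5
```

By the same formula, the minimum of -2.5 degrees maps to 0 and the maximum of 33.5 degrees maps to 1.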
S5: establishing an LSTM-based electricity sales prediction model and training it with the normalized daily total electricity consumption data and influence factor data of each cluster as input. 90 days are taken as the time step of the training data, and 3-dimensional input tensors of shape (batch size, time step, features) are constructed according to the data model, where batch size is the size of each input batch of samples, time step is the number of time steps, and features is the total number of features. In this example, batch size is 1, time step is 90 and features is 4. The detailed parameters of the model are shown in Table 4 below,
Table 4 Training parameters of the electricity sales prediction model
| Hyper-parameter | Value |
| --- | --- |
| Number of LSTM layers | 2 |
| Number of LSTM hidden units | 100 |
| Learning algorithm | RMSprop |
| Learning rate | 0.001 |
| Loss function | MSE |
| Dropout rate | 0.5 |
| Batch size | 128 |
| Time step | 90 |
| Number of iterations | 500 |
The input of the electricity sales prediction model is the 3-dimensional tensor described above, and its output is the future electricity sales data. Because the number of clusters here is 2, training yields 2 prediction models;
in the prediction stage, the same data preprocessing as in the training stage is used to construct a 3-dimensional input tensor of shape (1, 90, 4), which is input into the electricity sales prediction models; the prediction results of the 2 prediction models are summed to obtain the total electricity sales prediction result.
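The final summation can be sketched as follows. The sketch assumes that each cluster model outputs a value on the normalized [0, 1] scale, which is mapped back to original units with that cluster's own minimum and maximum before summing; the text does not spell out this inverse step, and the function name `total_forecast` and all numbers are hypothetical.

```python
def total_forecast(cluster_predictions, cluster_ranges):
    """Sum per-cluster forecasts after undoing min-max normalization.

    cluster_predictions: {cluster: normalized prediction in [0, 1]}
    cluster_ranges: {cluster: (x_min, x_max) used when normalizing that cluster}
    """
    total = 0.0
    for cluster, scaled in cluster_predictions.items():
        lo, hi = cluster_ranges[cluster]
        total += scaled * (hi - lo) + lo  # inverse of x* = (x - lo) / (hi - lo)
    return total

# Hypothetical outputs of the 2 cluster models and their normalization ranges.
predictions = {1: 0.5, 2: 0.25}
ranges = {1: (100.0, 300.0), 2: (40.0, 80.0)}
total = total_forecast(predictions, ranges)
```

Summing on the original scale matters because each cluster was normalized with a different range, so normalized outputs from different clusters are not directly comparable.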