
CN107967542B - A prediction method of electricity sales based on long short-term memory network - Google Patents

Info

Publication number
CN107967542B
Authority
CN
China
Prior art keywords
data
electricity sales
electricity
correlation coefficient
term memory
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711400097.5A
Other languages
Chinese (zh)
Other versions
CN107967542A (en)
Inventor
黎自若
方志强
王晓辉
夏通
周艳梅
付健艺
施进平
周晨
吴志华
吴中旻
朱好
吴昊铮
严辉敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lishui Power Supply Co of State Grid Zhejiang Electric Power Co Ltd
State Grid Corp of China SGCC
Original Assignee
Lishui Power Supply Co of State Grid Zhejiang Electric Power Co Ltd
State Grid Corp of China SGCC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lishui Power Supply Co of State Grid Zhejiang Electric Power Co Ltd, State Grid Corp of China SGCC
Priority to CN201711400097.5A
Publication of CN107967542A
Application granted
Publication of CN107967542B
Legal status: Active
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Marketing (AREA)
  • General Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Public Health (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Water Supply & Treatment (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method for predicting electricity sales based on a long short-term memory (LSTM) network. The method comprises the following steps: S1: determine the influencing factors that affect the electricity sales data; S2: calculate the Pearson correlation coefficient r between the electricity sales data of each industry to be analyzed and the data of each influencing factor; S3: cluster the Pearson correlation coefficients r of the industries with the k-means clustering algorithm to obtain several clusters; S4: normalize the daily total electricity consumption data of each cluster, and normalize the influencing factor data; S5: obtain the total electricity sales prediction result from an electricity sales prediction model based on the LSTM network. The invention automatically learns the data features of the electricity sales data and the related influencing factors, models multi-condition electricity sales data with the LSTM network, and thereby predicts electricity sales accurately.

Figure 201711400097

Description

Electricity sales prediction method based on a long short-term memory network
Technical Field
The invention relates to the technical field of electricity sales prediction, and in particular to a method for predicting electricity sales based on a long short-term memory (LSTM) network.
Background
With the rapid development of the national economy, the electric power industry is also developing rapidly. Accurate prediction of electricity sales is important to power supply enterprises in many respects, such as adjusting future power supply, optimizing the power supply structure, and improving the operational safety of the power system.
Electricity sales prediction is, in essence, a time series prediction problem. A time series is a sequence formed by arranging the values of a statistical indicator at different moments in order of occurrence, and time series data are the actual data reflected by that series. Because changes in electricity sales depend not only on past electricity sales along the time axis but also on additional conditions, including weekends, holidays, seasons, air temperature and climate, electricity sales prediction is essentially a multi-condition time series prediction problem.
Common models for time series prediction include the classical autoregressive model (AR), the moving average model (MA), and the autoregressive moving average model (ARMA) that combines the two (ii transitions, 2015). Because time series data are often non-stationary, researchers proposed the autoregressive integrated moving average model (ARIMA), which differences the time series one or more times so that the data become stationary (Journal of the American Statistical Association, 1970).
Besides being non-stationary, time series are usually non-linear, so the classical autoregressive models cannot fit them well. Non-linear models usually perform better on non-stationary, non-linear time series, and researchers have proposed regression models based on SVM and LS-SVM (Neural Networks and Brain, 2005). Moreover, because neural networks can fit any Borel measurable function to any accuracy, scholars have proposed neural-network-based prediction models (Neural Networks, 1989). Scholars have also proposed hybrid models that combine neural networks and ARIMA (Neurocomputing, 2003), which achieve better time series prediction results by exploiting the non-linear modeling of neural networks and the linear modeling of ARIMA.
However, these prediction methods often struggle to extract good features efficiently, so the accuracy of their time series predictions is often low. Deep learning, proposed in recent years, is a very effective method for extracting features (Nature, 2015). Within deep learning, the recurrent neural network (RNN) model is mainly applied to sequence problems: its recurrent structure makes full use of the order information in sequence data and discovers the intrinsic rules and features of sequences. Chen et al. (IEEE International Conference on Big Data, 2015) used a recurrent neural network to predict stocks; Sutskever et al. (International Conference on Machine Learning, 2011) used recurrent neural networks for text generation; Gregor et al. (International Conference on Machine Learning, 2015) modeled images sequentially with a recurrent neural network to generate images automatically.
However, RNNs suffer from the vanishing gradient problem, which results in poor use of long-term history information.
Disclosure of Invention
To solve the above problems, the invention provides a method for predicting electricity sales based on a long short-term memory (LSTM) network. The method automatically learns the data features of electricity sales and the relevant influencing factors, and models multi-condition electricity sales data with the LSTM network to predict electricity sales accurately.
To solve the above problems, the invention adopts the following technical scheme:
The method of the invention for predicting electricity sales based on a long short-term memory network comprises the following steps:
S1: determining the influencing factors that affect the electricity sales data;
S2: calculating the Pearson correlation coefficient r between the electricity sales data of each industry to be analyzed and the data of each influencing factor, and constructing a correlation coefficient matrix;
S3: clustering the Pearson correlation coefficients r of all the industries with the k-means clustering algorithm to obtain a plurality of clusters, wherein each cluster comprises a plurality of industries and is treated as a subset, and then accumulating the electricity sales data of all industries in each cluster by date to obtain the daily total electricity consumption data of each cluster;
S4: normalizing the daily total electricity consumption data of each cluster, and normalizing the influencing factor data;
S5: establishing an electricity sales prediction model based on a long short-term memory (LSTM) network, and training it with the normalized daily electricity consumption data and influencing factor data of each cluster as input, wherein M days are taken as the time step of the training data and a 3-dimensional input tensor of shape (batch size, time step, features) is constructed, batch size being the size of each input batch of samples, time step being the number of time steps, and features being the total number of features; obtaining the electricity sales prediction result of each cluster; and summing the prediction results of all clusters to obtain the total electricity sales prediction result.
In this technical scheme, electricity sales involve a plurality of sectors, each sector comprises a plurality of industries, and each industry has its own factors influencing its electricity sales, so the industries need to be classified. The correlation between the electricity sales data of each industry to be analyzed and the data of each influencing factor is therefore calculated, with the Pearson correlation coefficient r as the evaluation basis, for example: the correlation between electricity sales data and air temperature data, between electricity sales data and GDP data, between electricity sales data and investment data, and between electricity sales data and date data.
The electricity sales data and the influencing factor data are then normalized, and the total electricity sales prediction result is finally obtained from the LSTM-based electricity sales prediction model.
Preferably, the calculation formula of the Pearson correlation coefficient r is as follows:
$$ r = \frac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y} \tag{1} $$
where r is the Pearson correlation coefficient, with values in [-1, 1]; r = 0 indicates no correlation, the closer r is to 1 the stronger the positive correlation, and the closer r is to -1 the stronger the negative correlation; x and y are two data feature variables; cov(x, y) denotes their covariance, and $\sigma_x$, $\sigma_y$ denote their standard deviations.
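The coefficient described above can be computed directly from its definition. The following is a minimal sketch in plain Python; the function name `pearson_r` and the sample values are illustrative assumptions, not data or code from the patent.

```python
import math


def pearson_r(x, y):
    """Pearson correlation r = cov(x, y) / (sigma_x * sigma_y)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Population covariance and standard deviations; the 1/n factors cancel in r.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    sigma_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    sigma_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    if sigma_x == 0 or sigma_y == 0:
        raise ValueError("r is undefined for a constant series")
    return cov / (sigma_x * sigma_y)


# Made-up daily values for illustration only (not data from the patent).
sales = [120.0, 135.0, 150.0, 160.0, 170.0]
temperature = [10.0, 14.0, 19.0, 22.0, 25.0]
r = pearson_r(sales, temperature)
```

Computing one such value per influencing factor for an industry gives that industry's correlation coefficient vector, which is what the clustering step operates on.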
Preferably, the k-means clustering algorithm in step S3 is as follows:
Input: the set of Pearson correlation coefficients between the electricity sales data and the data of each influencing factor, $D = \{x_1, x_2, \ldots, x_m\}$, where $x_i$ is the correlation coefficient vector of the i-th industry, each dimension of which corresponds to one influencing factor related to the electricity sales data, m is the number of industries, and k is the number of clusters;
Output: k clusters.
Preferably, the calculation formula of the normalization processing in step S4 is as follows:
$$ x^{*} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{2} $$
where x is the original value of each type of data, $x_{\min}$ is the minimum value of that type of data, $x_{\max}$ is the maximum value of that type of data, and $x^{*}$ is the resulting normalized value.
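The min-max normalization described above can be sketched as follows. The helper names `fit_min_max` and `min_max_scale` are hypothetical; separating the two steps is an assumption made here so that the minimum and maximum found on the training data can be reused when new data are preprocessed.

```python
def fit_min_max(values):
    """Return (x_min, x_max) for one type of data, e.g. one feature column."""
    return min(values), max(values)


def min_max_scale(values, x_min, x_max):
    """Apply x* = (x - x_min) / (x_max - x_min) to every value."""
    if x_max == x_min:
        raise ValueError("cannot scale a constant series: x_max equals x_min")
    span = x_max - x_min
    return [(x - x_min) / span for x in values]


# Made-up values for illustration only.
raw = [2.0, 4.0, 6.0]
x_min, x_max = fit_min_max(raw)
scaled = min_max_scale(raw, x_min, x_max)
```

Each type of data (the cluster's daily total and every influencing factor) would be scaled with its own minimum and maximum.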
Preferably, the method for establishing the LSTM-based electricity sales prediction model in step S5 comprises the following steps:
let the electricity sales time series be $X = (x_1, x_2, \ldots, x_{t-1}, x_t)$, where $x_t$ is the electricity sales at time t; predicting electricity sales then means obtaining, from the known time series X, the maximum likelihood estimate of p(X) at time t+1:
[Formula (3): equation image BDA0001518692660000051, not reproduced in this text]
For a multi-condition time series, formula (3) takes the following form:
[Formula (4): equation image BDA0001518692660000052, not reproduced in this text]
where $x_t$ denotes the electricity sales at time t, and the symbol given in equation image BDA0001518692660000053 (not reproduced in this text) denotes the value of the i-th influencing factor at time t;
Formulas (3) and (4) are the objectives of electricity sales prediction. The input gate, forget gate and output gate of the LSTM-RNN network control which history information is forgotten and which is memorized, so that the characteristic rules of the sequence data are extracted and the prediction result is finally output. The calculation at time t is as follows:
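$$ i_t = \mathrm{sigmoid}(W_{xi} x_t + W_{hi} h_{t-1} + W_{ci} c_t + b_i) $$
$$ f_t = \mathrm{sigmoid}(W_{xf} x_t + W_{hf} h_{t-1} + W_{cf} c_t + b_f) $$
$$ c_t = f_t c_{t-1} + i_t \tanh(W_{xc} x_t + W_{hc} h_{t-1} + b_c) $$
$$ o_t = \mathrm{sigmoid}(W_{xo} x_t + W_{ho} h_{t-1} + W_{co} c_t + b_o) $$
$$ h_t = o_t \tanh(c_t) $$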
where i, f and o denote the input gate, forget gate and output gate, respectively; c denotes the memory cell; h denotes the hidden layer output; W denotes a connection weight, whose subscript indicates which connection it belongs to; and b is a bias term.
The values of i, f and o lie in [0, 1] and control the proportion of history information that passes through each gate. This gate structure of the LSTM memory cell enables the recurrent neural network to learn history information over long intervals.
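A single time step of the gate calculation can be sketched in plain Python as below. Two points are assumptions of this sketch rather than statements of the patent: the input and forget gates read the previous cell state $c_{t-1}$ through their peephole weights (as in the common peephole formulation, since $c_t$ is only available after those gates are computed), and the peephole weights $W_{ci}$, $W_{cf}$, $W_{co}$ are treated as element-wise vectors. The function and parameter names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * a for w, a in zip(row, v)) for row in W]


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step; p maps names such as 'W_xi', 'W_hi', 'W_ci', 'b_i' to weights."""

    def pre_activation(gate, cell=None):
        z = [a + b + c for a, b, c in zip(matvec(p["W_x" + gate], x_t),
                                          matvec(p["W_h" + gate], h_prev),
                                          p["b_" + gate])]
        if cell is not None:  # element-wise peephole term (assumption)
            z = [zi + w * ci for zi, w, ci in zip(z, p["W_c" + gate], cell)]
        return z

    i_t = [sigmoid(z) for z in pre_activation("i", c_prev)]   # input gate
    f_t = [sigmoid(z) for z in pre_activation("f", c_prev)]   # forget gate
    g_t = [math.tanh(z) for z in pre_activation("c")]         # candidate memory
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    o_t = [sigmoid(z) for z in pre_activation("o", c_t)]      # output gate
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


def zero_params(n_in, n_hidden):
    """All-zero weights, used only to make the example easy to check by hand."""
    p = {}
    for gate in ("i", "f", "c", "o"):
        p["W_x" + gate] = [[0.0] * n_in for _ in range(n_hidden)]
        p["W_h" + gate] = [[0.0] * n_hidden for _ in range(n_hidden)]
        p["b_" + gate] = [0.0] * n_hidden
    for gate in ("i", "f", "o"):
        p["W_c" + gate] = [0.0] * n_hidden
    return p


# 4 input features, 2 hidden units; with zero weights every gate equals 0.5.
params = zero_params(4, 2)
h_t, c_t = lstm_step([0.1, 0.0, 0.25, 0.5], [0.0, 0.0], [1.0, -1.0], params)
```

With all-zero weights the forget gate is 0.5, so the cell state is simply halved, which shows how a gate value in [0, 1] sets the proportion of history that is kept.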
The beneficial effects of the invention are as follows: through the designed data feature selection, the data normalization process and the long short-term memory network, the data features of electricity sales and the relevant influencing factors are learned automatically, which improves the accuracy of electricity sales prediction.
Drawings
FIG. 1 is a flow chart of the method of the invention for predicting electricity sales based on a long short-term memory network;
FIG. 2 is a structural diagram of the long short-term memory (LSTM) cell.
Detailed Description
The technical scheme of the invention is described in further detail below through an embodiment and with reference to the accompanying drawings.
Embodiment: as shown in FIG. 1, the method of this embodiment for predicting electricity sales based on a long short-term memory network comprises the following steps:
S1: determining the influencing factors that affect the electricity sales data;
S2: calculating the Pearson correlation coefficient r between the electricity sales data of each industry to be analyzed and the data of each influencing factor, and constructing a correlation coefficient matrix;
The Pearson correlation coefficient r is calculated as follows:
$$ r = \frac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y} \tag{1} $$
where r is the Pearson correlation coefficient, with values in [-1, 1]; r = 0 indicates no correlation, the closer r is to 1 the stronger the positive correlation, and the closer r is to -1 the stronger the negative correlation; x and y are two data feature variables; cov(x, y) denotes their covariance, and $\sigma_x$, $\sigma_y$ denote their standard deviations;
The result indicates whether two data feature variables are correlated, for example: the correlation between electricity sales data and air temperature data, between electricity sales data and GDP data, between electricity sales data and investment data, and between electricity sales data and date data;
S3: clustering the Pearson correlation coefficients r of all industries with the k-means clustering algorithm to obtain a plurality of clusters, wherein each cluster comprises a plurality of industries, and the electricity sales data of industries in the same cluster respond similarly to certain influencing factors; each cluster is treated as a subset, and the electricity sales data of all industries in each cluster are then accumulated by date to obtain the daily total electricity consumption data of each cluster;
The k-means clustering algorithm proceeds as follows:
Input: the set of Pearson correlation coefficients between the electricity sales data and the data of each influencing factor, $D = \{x_1, x_2, \ldots, x_m\}$, where $x_i$ is the correlation coefficient vector of the i-th industry, each dimension of which corresponds to one influencing factor related to the electricity sales data, m is the number of industries, and k is the number of clusters;
Output: k clusters;
The steps are as follows:
(1) randomly select the correlation coefficient vectors of k industries from the correlation coefficient set D as the initial mean vectors $\mu = \{\mu_1, \mu_2, \ldots, \mu_k\}$, where $\mu_i$ is initially the correlation coefficient vector of the i-th selected industry;
(2) do
(3)   let [equation image BDA0001518692660000071, not reproduced in this text];
(4)   for j = 1, 2, ..., m do
(5)     calculate the distance between $x_j$ and each mean vector $\mu_i$, and assign $x_j$ to the cluster of the nearest mean vector;
(6)   end for
(7)   for i = 1, 2, 3, ..., k do
(8)     calculate the new mean vector $\mu'_i$;
(9)     if $\mu'_i \neq \mu_i$, update $\mu_i$ to $\mu'_i$; otherwise leave it unchanged;
(10) until no mean vector $\mu_i$ is updated;
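The clustering steps listed above can be sketched in plain Python as follows. The function name `k_means`, the squared Euclidean distance, the fixed random seed, the iteration cap and the handling of an empty cluster are assumptions of this sketch; the correlation vectors are made-up values, not results from the patent.

```python
import random


def squared_distance(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def k_means(vectors, k, seed=0, max_iter=100):
    """Cluster correlation coefficient vectors; returns (clusters, means).

    clusters[i] holds the indices of the vectors assigned to cluster i.
    """
    rng = random.Random(seed)
    # Step (1): pick k of the input vectors as the initial mean vectors.
    means = [list(v) for v in rng.sample(vectors, k)]
    dim = len(vectors[0])
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):                       # steps (2)-(10)
        clusters = [[] for _ in range(k)]           # start from empty clusters
        for j, x in enumerate(vectors):             # steps (4)-(6)
            nearest = min(range(k), key=lambda i: squared_distance(x, means[i]))
            clusters[nearest].append(j)
        updated = False
        for i, members in enumerate(clusters):      # steps (7)-(9)
            if not members:
                continue  # assumption: an empty cluster keeps its old mean
            new_mean = [sum(vectors[j][d] for j in members) / len(members)
                        for d in range(dim)]
            if new_mean != means[i]:
                means[i] = new_mean
                updated = True
        if not updated:                             # stop once nothing changes
            break
    return clusters, means


# Made-up correlation vectors (holiday, season, temperature) for 4 industries.
correlation_vectors = [
    [0.90, 0.80, 0.70],
    [0.85, 0.75, 0.72],
    [-0.10, 0.00, 0.05],
    [-0.05, 0.02, 0.00],
]
clusters, means = k_means(correlation_vectors, k=2)
```

Industries whose sales correlate with the influencing factors in a similar way end up in the same cluster, here the first two and the last two vectors.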
S4: normalizing the daily total electricity consumption data of each cluster, and normalizing the influencing factor data;
Because the electricity sales values and the influencing factor values differ greatly in magnitude, the data must be normalized, that is, the various values are scaled to a common range. The normalization is calculated as follows:
$$ x^{*} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{2} $$
where x is the original value of each type of data, $x_{\min}$ is the minimum value of that type of data, $x_{\max}$ is the maximum value of that type of data, and $x^{*}$ is the resulting normalized value;
S5: establishing an electricity sales prediction model based on a long short-term memory (LSTM) network, and training it with the normalized daily electricity consumption data and influencing factor data of each cluster as input, wherein M days are taken as the time step of the training data and a 3-dimensional input tensor of shape (batch size, time step, features) is constructed according to the data model, batch size being the size of each input batch of samples, time step being the number of time steps, and features being the total number of features; obtaining the electricity sales prediction result of each cluster; and summing the prediction results of all clusters to obtain the total electricity sales prediction result.
Electricity sales involve a plurality of sectors, each sector comprises a plurality of industries, and each industry has its own factors influencing its electricity sales, so the industries need to be classified. The correlation between the electricity sales data of each industry to be analyzed and the data of each influencing factor is therefore calculated, with the Pearson correlation coefficient r as the evaluation basis, for example: the correlation between electricity sales data and air temperature data, between electricity sales data and GDP data, between electricity sales data and investment data, and between electricity sales data and date data.
The electricity sales data and the influencing factor data are then normalized, and the total electricity sales prediction result is finally obtained from the LSTM-based electricity sales prediction model.
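The construction of the (batch size, time step, features) input described in step S5 can be sketched as a sliding window over the normalized daily rows. This sketch assumes that column 0 of each row is the cluster's normalized daily total and that the training target is the sales value of the day immediately after each window; the patent does not state the forecast horizon, and the function name `build_windows` and the synthetic rows are hypothetical.

```python
def build_windows(rows, time_step):
    """Turn per-day feature rows into (samples, time_step, features) windows.

    rows[d] is the feature list for day d, with normalized sales in column 0.
    Returns (inputs, targets), where targets[n] is the sales of the day that
    follows window n (an assumption about the prediction target).
    """
    inputs, targets = [], []
    for start in range(len(rows) - time_step):
        window = rows[start:start + time_step]
        inputs.append([list(day) for day in window])
        targets.append(rows[start + time_step][0])
    return inputs, targets


# Synthetic rows for illustration: 870 days with 4 features per day
# (sales, holiday flag, season, temperature), all already scaled to [0, 1].
rows = [[(d % 7) / 7.0, 0.0, 0.25, 0.5] for d in range(870)]
inputs, targets = build_windows(rows, time_step=90)
# len(inputs) samples, each of shape (90, 4); batches are slices of this list.
```

With 870 days and a 90-day time step this yields 780 training samples, which are then grouped into batches to form the 3-dimensional tensor.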
The method for establishing the LSTM-based electricity sales prediction model in step S5 comprises the following steps:
let the electricity sales time series be $X = (x_1, x_2, \ldots, x_{t-1}, x_t)$, where $x_t$ is the electricity sales at time t; predicting electricity sales then means obtaining, from the known time series X, the maximum likelihood estimate of p(X) at time t+1:
[Formula (3): equation image BDA0001518692660000091, not reproduced in this text]
For a multi-condition time series, formula (3) takes the following form:
[Formula (4): equation image BDA0001518692660000092, not reproduced in this text]
where $x_t$ denotes the electricity sales at time t, and the symbol given in equation image BDA0001518692660000093 (not reproduced in this text) denotes the value of the i-th influencing factor at time t;
Formulas (3) and (4) are the objectives of electricity sales prediction. The input gate, forget gate and output gate of the LSTM-RNN network control which history information is forgotten and which is memorized, so that the characteristic rules of the sequence data are extracted and the prediction result is finally output. The LSTM memory cell structure is shown in FIG. 2. Using the LSTM network avoids the vanishing gradient problem that arises as the model becomes deeper. The calculation at time t is as follows:
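$$ i_t = \mathrm{sigmoid}(W_{xi} x_t + W_{hi} h_{t-1} + W_{ci} c_t + b_i) $$
$$ f_t = \mathrm{sigmoid}(W_{xf} x_t + W_{hf} h_{t-1} + W_{cf} c_t + b_f) $$
$$ c_t = f_t c_{t-1} + i_t \tanh(W_{xc} x_t + W_{hc} h_{t-1} + b_c) $$
$$ o_t = \mathrm{sigmoid}(W_{xo} x_t + W_{ho} h_{t-1} + W_{co} c_t + b_o) $$
$$ h_t = o_t \tanh(c_t) $$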
where i, f and o denote the input gate, forget gate and output gate, respectively; c denotes the memory cell; h denotes the hidden layer output; W denotes a connection weight, whose subscript indicates which connection it belongs to, e.g. $W_{xi}$ denotes the connection weight from the input layer to the input gate; and b is a bias term.
The values of i, f and o lie in [0, 1] and control the proportion of history information that passes through each gate. This gate structure of the LSTM memory cell enables the recurrent neural network to learn history information over long intervals.
Take the total electricity sales of 10 industries in the secondary and tertiary sectors of a certain city as an example; exemplary data information is shown in Table 1 below.
Table 1. Historical electricity sales data by industry
Industry name                                      Industry code   Time span              Number of records
Precious metal mining                              0920            20150101 to 20170525   870
Asbestos and other non-metallic mineral mining     1090            20150101 to 20170525   870
Telecommunications                                 6010            20150101 to 20170525   870
Real estate development and management             7210            20150101 to 20170525   870
Road passenger transport                           5210            20150101 to 20170525   870
Non-ferrous metal rolling                          3550            20150101 to 20170525   870
Structural metal product manufacturing             3410            20150101 to 20170525   870
Cutting tool manufacturing                         3421            20150101 to 20170525   870
Other metal tool manufacturing                     3429            20150101 to 20170525   870
Manufacture of metal wire rope and its products    3440            20150101 to 20170525   870
Taking the telecommunications industry in Table 1 as an example, its electricity sales data are shown in Table 2.
Table 2. Electricity sales data of the telecommunications industry
[Table 2 appears only as images (BDA0001518692660000101, BDA0001518692660000111) in the original publication and is not reproduced in this text.]
A specific implementation of the invention is further explained below using the electricity sales data of each industry and three influencing factors, namely holidays, seasons and daily average air temperature. The steps are as follows:
S1: the influencing factors affecting the electricity sales data are determined to be holidays, seasons and daily average air temperature;
S2: the Pearson correlation coefficients r between the electricity sales data of each of the 10 industries and the data of each influencing factor are calculated, and a correlation coefficient matrix is constructed;
For example, the correlation coefficient r between the telecommunications industry electricity sales data and holidays, the correlation coefficient r between the telecommunications industry electricity sales data and seasons, and the correlation coefficient r between the telecommunications industry electricity sales data and the daily average air temperature are calculated. This yields the following correlation coefficient matrix:
[Correlation coefficient matrix: equation image BDA0001518692660000112, not reproduced in this text]
The first column gives the correlation coefficient between the telecommunications industry electricity sales data and holidays, the second column gives the correlation coefficient between the telecommunications industry electricity sales data and seasons, and the third column gives the correlation coefficient between the telecommunications industry electricity sales data and the daily average air temperature;
S3: the Pearson correlation coefficients r of the above 10 industries are clustered with the k-means clustering algorithm, with the number of clusters set to 2. The clustering result is as follows:
cluster 1 contains the industry codes: 0920, 1090, 7210, 5210, 3350, 3421, 3429, 3440;
cluster 2 contains the industry codes: 1090, 3410;
Each cluster is treated as an industry subset, and the electricity sales of all industries in the cluster are accumulated by date to obtain the daily total electricity consumption data of the cluster, giving 2 industry feature sets in total;
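Accumulating the electricity sales of a cluster's industries by date can be sketched as below. The function name `daily_cluster_totals`, the nested-dictionary layout, the cluster membership and the sales amounts are all illustrative assumptions; only the idea of summing per date within each cluster comes from the text.

```python
from collections import defaultdict


def daily_cluster_totals(sales_by_industry, clusters):
    """Sum electricity sales by date within each cluster.

    sales_by_industry: {industry_code: {date: sales}}
    clusters: list of lists of industry codes
    Returns one {date: total_sales} dict per cluster, sorted by date.
    """
    totals = []
    for members in clusters:
        per_day = defaultdict(float)
        for code in members:
            for date, amount in sales_by_industry[code].items():
                per_day[date] += amount
        totals.append(dict(sorted(per_day.items())))
    return totals


# Made-up amounts and a made-up grouping, for illustration only.
sales_by_industry = {
    "0920": {"20150101": 10.0, "20150102": 12.0},
    "7210": {"20150101": 20.0, "20150102": 18.0},
    "3410": {"20150101": 5.0, "20150102": 7.0},
}
cluster_totals = daily_cluster_totals(sales_by_industry, [["0920", "7210"], ["3410"]])
```

Each resulting daily series, together with the influencing factor data, forms the feature set for one cluster's prediction model.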
S4: because the electricity sales values differ greatly in magnitude from the holiday, season and daily average air temperature values, the data must be preprocessed by scaling the various values to a common range. The scaling used is min-max normalization, which scales the data of every dimension to [0, 1];
Taking the daily average air temperature as an example:
over the 870 days from 1 January 2015 to 25 May 2017, the maximum daily average air temperature is 33.5 degrees and the minimum is -2.5 degrees; the results of scaling each temperature value with formula (2) are shown in Table 3.
Table 3. Normalized daily average air temperature
[Table 3 appears only as images (BDA0001518692660000121, BDA0001518692660000131) in the original publication and is not reproduced in this text.]
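As a worked illustration of min-max normalization with the stated extremes (maximum 33.5 degrees, minimum -2.5 degrees), consider a hypothetical daily average temperature of 15.5 degrees. This value is chosen only for illustration and is not taken from Table 3:

```latex
x^{*} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}
      = \frac{15.5 - (-2.5)}{33.5 - (-2.5)}
      = \frac{18}{36}
      = 0.5
```

The minimum temperature maps to 0 and the maximum to 1, so every scaled value lies in [0, 1].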
S5: an electricity sales prediction model based on a long short-term memory (LSTM) network is established and trained with the normalized daily electricity consumption data and influencing factor data of each cluster as input. Taking 90 days as the time step of the training data, a 3-dimensional input tensor of shape (batch size, time step, features) is constructed according to the data model, where batch size is the size of each input batch of samples, time step is the number of time steps, and features is the total number of features. In this example, batch size is 1, time step is 90 and features is 4. The detailed parameters of the model are shown in Table 4 below.
Table 4. Training parameters of the electricity sales prediction model
Hyperparameter                  Value
Number of LSTM layers           2
Number of LSTM hidden units     100
Learning algorithm              RMSprop
Learning rate                   0.001
Loss function                   MSE
Dropout rate                    0.5
Batch size                      128
Time step                       90
Number of iterations            500
The input of the electricity sales prediction model is the 3-dimensional tensor described above, and its output is the future electricity sales data. Because the number of clusters is 2, training yields 2 prediction models;
In the prediction stage, the same data preprocessing as in the training stage is used to construct a 3-dimensional input tensor of shape (1, 90, 4), which is input to the electricity sales prediction models; the prediction results of the 2 models are then summed to obtain the total electricity sales prediction result.
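The prediction stage can be sketched as follows. The trained LSTM models are represented by stand-in callables, since their implementation is not given here; mapping each cluster's normalized prediction back to original units before summing is an assumption of this sketch and is not stated in the text. All names and numbers are hypothetical.

```python
def inverse_min_max(x_scaled, x_min, x_max):
    """Invert x* = (x - x_min) / (x_max - x_min)."""
    return x_scaled * (x_max - x_min) + x_min


def predict_total(models, windows, sales_ranges):
    """Sum the per-cluster predictions into a total electricity sales forecast.

    models: one callable per cluster, taking a (1, time_step, features) window
            and returning a normalized sales prediction
    windows: one input window per cluster
    sales_ranges: (x_min, x_max) of each cluster's daily total in training
    """
    if not (len(models) == len(windows) == len(sales_ranges)):
        raise ValueError("expected one model, window and range per cluster")
    total = 0.0
    for model, window, (x_min, x_max) in zip(models, windows, sales_ranges):
        total += inverse_min_max(model(window), x_min, x_max)
    return total


def persistence_model(batch):
    """Stand-in for a trained model: repeats the last day's normalized sales."""
    return batch[0][-1][0]


# Two clusters, each with a window of shape (1, 90, 4) of made-up values.
window_a = [[[0.5, 0.0, 0.25, 0.5] for _ in range(90)]]
window_b = [[[0.25, 0.0, 0.25, 0.5] for _ in range(90)]]
total = predict_total([persistence_model, persistence_model],
                      [window_a, window_b],
                      [(100.0, 300.0), (0.0, 40.0)])
```

In this made-up example the two clusters contribute 200.0 and 10.0, giving a total of 210.0.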

Claims (4)

1. A method for predicting electricity sales amount based on a long-term and short-term memory network is characterized by comprising the following steps:
s1: determining influence factors influencing the electricity selling quantity data;
s2: calculating a Pearson correlation coefficient r of the electricity sales data and each influence factor data of each industry to be analyzed, and constructing a correlation coefficient matrix;
s3: clustering the Pearson correlation coefficients r of all the industries by using a k-means clustering algorithm to obtain a plurality of clusters after clustering, wherein each cluster comprises a plurality of industries and is used as a subset, and then accumulating the electricity sales data of all the industries in each cluster according to date to obtain the daily total electricity consumption data of each cluster;
s4: normalizing the daily total power consumption data of each cluster, and normalizing the influence factor data;
s5: establishing an electricity sales quantity prediction model based on a long-term and short-term memory network (LSTM), training the daily electricity consumption data and the influence factor data of each cluster after normalization processing as the input of the electricity sales quantity prediction model, taking M days as the time step of training data, constructing a 3-dimensional input tensor with the shape of (batch size, time step and features), wherein the batch size is the size of a sample input batch, the time step is the time step, and the features are the total characteristic number, obtaining the electricity sales quantity prediction result of each cluster, and summing the electricity sales quantity prediction results of all the clusters to obtain the total electricity sales quantity prediction result; the method for establishing the model for predicting the electricity sales amount based on the long-short term memory network LSTM in the step S5 includes the following steps:
letting the electricity sales time series be $X = (x_1, x_2, \ldots, x_{t-1}, x_t)$, where $x_t$ is the electricity sales amount at time $t$; electricity sales prediction then consists of obtaining, from the known time series $X$, the maximum likelihood estimate $p(X)$ of $x_{t+1}$:
Figure FDA0003015821510000021
in the case of the multi-conditional time series, the formula (3) becomes the following form:
Figure FDA0003015821510000022
wherein $x_t$ denotes the electricity sales amount at time $t$, and the symbol shown in
Figure FDA0003015821510000023
denotes the value of the $i$-th influence factor at time $t$;
formulas (3) and (4) are the objectives of electricity sales prediction; the input gate, forget gate and output gate of the LSTM-RNN network control which historical information is forgotten and which is memorized, so as to extract the characteristic patterns of the sequence data, and a prediction result is finally output; the calculation at time $t$ is as follows:
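$i_t = \mathrm{sigmoid}(W_{xi} x_t + W_{hi} h_{t-1} + W_{ci} c_t + b_i)$
$f_t = \mathrm{sigmoid}(W_{xf} x_t + W_{hf} h_{t-1} + W_{cf} c_t + b_f)$
$c_t = f_t c_{t-1} + i_t \tanh(W_{xc} x_t + W_{hc} h_{t-1} + b_c)$
$o_t = \mathrm{sigmoid}(W_{xo} x_t + W_{ho} h_{t-1} + W_{co} c_t + b_o)$
$h_t = o_t \tanh(c_t)$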
wherein i, f and o denote the input gate, the forget gate and the output gate respectively; c denotes the memory cell; h denotes the hidden layer output; W denotes a connection weight, whose subscript indicates the quantities it connects; and b is a bias term.
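A single time step of the gate computation above can be sketched in plain Python as follows. Two points are assumptions of this sketch rather than statements of the claim: the peephole weights $W_{ci}$, $W_{cf}$, $W_{co}$ are treated as element-wise vectors, and the input and forget gates read the previous cell state $c_{t-1}$ (as in the common peephole formulation), because $c_t$ as written in the claim is only available after those two gates have been evaluated. The function names and the parameter dictionary layout are hypothetical.

```python
import math
from typing import Dict, List, Optional, Tuple

Vector = List[float]
Matrix = List[List[float]]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def matvec(W: Matrix, v: Vector) -> Vector:
    """Matrix-vector product for nested lists."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def lstm_step(x_t: Vector, h_prev: Vector, c_prev: Vector,
              p: Dict[str, list]) -> Tuple[Vector, Vector]:
    """One peephole LSTM step; `p` maps names such as 'Wxi' to parameters."""

    def pre_activation(gate: str, c_peep: Optional[Vector] = None) -> Vector:
        a = matvec(p["Wx" + gate], x_t)
        b = matvec(p["Wh" + gate], h_prev)
        z = [ai + bi + bias for ai, bi, bias in zip(a, b, p["b" + gate])]
        if c_peep is not None:
            # Element-wise peephole connection (assumption of this sketch).
            z = [zi + w * ci for zi, w, ci in zip(z, p["Wc" + gate], c_peep)]
        return z

    # Assumption: input and forget gates peep at c_{t-1}, not c_t.
    i_t = [sigmoid(z) for z in pre_activation("i", c_prev)]
    f_t = [sigmoid(z) for z in pre_activation("f", c_prev)]
    g_t = [math.tanh(z) for z in pre_activation("c")]
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    o_t = [sigmoid(z) for z in pre_activation("o", c_t)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


def zero_params(n_in: int, n_hidden: int) -> Dict[str, list]:
    """All-zero parameters, useful only for checking the wiring."""
    p: Dict[str, list] = {}
    for gate in "ifco":
        p["Wx" + gate] = [[0.0] * n_in for _ in range(n_hidden)]
        p["Wh" + gate] = [[0.0] * n_hidden for _ in range(n_hidden)]
        p["b" + gate] = [0.0] * n_hidden
    for gate in "ifo":
        p["Wc" + gate] = [0.0] * n_hidden
    return p


params = zero_params(n_in=3, n_hidden=2)
h_t, c_t = lstm_step([1.0, 2.0, 3.0], [0.0, 0.0], [1.0, -2.0], params)
print(h_t, c_t)
```

With all-zero parameters every gate evaluates to 0.5 and the candidate term to 0, so the new cell state is half the previous one, which makes the wiring easy to check by hand.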
2. The method as claimed in claim 1, wherein the Pearson correlation coefficient r is calculated as follows:
$r = \dfrac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y}$
wherein r denotes the Pearson correlation coefficient, with value range [-1, 1]; r = 0 indicates no linear correlation, the closer r is to 1 the stronger the positive correlation, and the closer r is to -1 the stronger the negative correlation; x and y are two data feature variables; cov(x, y) denotes their covariance, and $\sigma_x$ and $\sigma_y$ denote their standard deviations.
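As an illustration of the correlation step, the sketch below computes r for two series and assembles one correlation vector per industry. The function names, the industry and factor names and the numbers are made up for the example and are not from the patent.

```python
import math
from typing import Dict, List, Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation: covariance divided by the product of std devs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/n factors of cov and of both standard deviations cancel out.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sy = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sx == 0.0 or sy == 0.0:
        raise ValueError("r is undefined for a constant series")
    return cov / (sx * sy)


def correlation_matrix(sales: Dict[str, Sequence[float]],
                       factors: Dict[str, Sequence[float]]) -> Dict[str, List[float]]:
    """One row per industry, one column per influence factor."""
    return {
        industry: [pearson_r(series, f) for f in factors.values()]
        for industry, series in sales.items()
    }


# Hypothetical daily data: two industries, two influence factors.
sales = {
    "industry_a": [10.0, 12.0, 14.0, 16.0],
    "industry_b": [9.0, 7.0, 5.0, 3.0],
}
factors = {
    "temperature": [20.0, 22.0, 24.0, 26.0],
    "is_holiday": [1.0, -1.0, -1.0, 1.0],
}
matrix = correlation_matrix(sales, factors)
print(matrix)
```

Each row of `matrix` is the correlation coefficient vector of one industry, which is the kind of vector the clustering step takes as input.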
3. The method for predicting electricity sales based on a long short-term memory network as claimed in claim 1, wherein the k-means clustering algorithm in step S3 is as follows: input: the set $D = \{x_1, x_2, \ldots, x_m\}$ of Pearson correlation coefficients r between the electricity sales data and each influence factor, where $x_i$ is the correlation coefficient vector of the i-th industry, each dimension of which corresponds to one influence factor related to the electricity sales data, m is the number of industries, and k is the number of clusters;
output: k clusters.
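The clustering input and output described above, together with the per-cluster accumulation of step S3, might look like the following sketch. It uses a plain Lloyd-style k-means with seeded random initialization; the claim does not specify the initialization or stopping rule, so those details, the function names and the sample data are assumptions.

```python
import random
from typing import List, Sequence


def kmeans(points: Sequence[Sequence[float]], k: int,
           iters: int = 100, seed: int = 0) -> List[int]:
    """Basic k-means; returns one cluster label per input point."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(list(points), k)]
    labels = [-1] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest centroid (squared Euclidean).
        new_labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Move each centroid to the mean of its members (keep it if empty).
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_daily_totals(sales: Sequence[Sequence[float]],
                         labels: Sequence[int], k: int) -> List[List[float]]:
    """Accumulate the daily sales of all industries in each cluster by date."""
    n_days = len(sales[0])
    totals = [[0.0] * n_days for _ in range(k)]
    for series, label in zip(sales, labels):
        for day, value in enumerate(series):
            totals[label][day] += value
    return totals


# Hypothetical correlation vectors for m = 4 industries and 2 influence factors.
D = [[0.90, 0.80], [0.85, 0.75], [-0.70, -0.60], [-0.80, -0.65]]
# Hypothetical daily electricity sales of the same 4 industries over 2 days.
daily_sales = [[10.0, 12.0], [20.0, 18.0], [5.0, 6.0], [7.0, 8.0]]

labels = kmeans(D, k=2)
totals = cluster_daily_totals(daily_sales, labels, k=2)
print(labels, totals)
```

Industries with similar correlation vectors end up in the same cluster, and each cluster then contributes one aggregated daily series to the later steps.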
4. The method for predicting electricity sales based on a long short-term memory network according to claim 1, 2 or 3, wherein the normalization in step S4 is calculated as follows:
$x^* = \dfrac{x - x_{\min}}{x_{\max} - x_{\min}}$
where x is the original value of each type of data, $x_{\min}$ is the minimum value of each type of data, $x_{\max}$ is the maximum value of each type of data, and $x^*$ is the resulting normalized value.
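A minimal sketch of this min-max normalization, applied to one data column at a time, is given below. The inverse mapping and the error raised for a constant column are additions of this example and are not described in the claim.

```python
from typing import List, Sequence, Tuple


def min_max_normalize(values: Sequence[float]) -> Tuple[List[float], float, float]:
    """Scale values to [0, 1]; also return (x_min, x_max) for later inversion."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # Choice of this sketch: refuse a constant column instead of dividing by 0.
        raise ValueError("cannot normalize a constant series")
    scaled = [(x - x_min) / (x_max - x_min) for x in values]
    return scaled, x_min, x_max


def denormalize(x_star: float, x_min: float, x_max: float) -> float:
    """Map a normalized value back to the original scale."""
    return x_star * (x_max - x_min) + x_min


scaled, lo, hi = min_max_normalize([10.0, 20.0, 30.0])
print(scaled, denormalize(scaled[1], lo, hi))
```

Keeping `x_min` and `x_max` from the training data allows a model output in normalized units to be converted back to an electricity quantity.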
CN201711400097.5A 2017-12-21 2017-12-21 A prediction method of electricity sales based on long short-term memory network Active CN107967542B (en)

Publications (2)

Publication Number Publication Date
CN107967542A CN107967542A (en) 2018-04-27
CN107967542B true CN107967542B (en) 2021-07-27

Family

ID=61995105

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant