CN113220671B

CN113220671B - Power load missing data restoration method based on power utilization mode decomposition and reconstruction

Info

Publication number: CN113220671B
Application number: CN202110409685.5A
Authority: CN
Inventors: 林振智; 卢峰; 金伟超; 刘晟源; 杨莉; 崔雪原; 林之岸
Original assignee: Zhejiang University ZJU
Current assignee: Zhejiang University ZJU
Priority date: 2021-04-16
Filing date: 2021-04-16
Publication date: 2022-06-17
Anticipated expiration: 2041-04-16
Also published as: CN113220671A

Abstract

The invention discloses a method for repairing missing power load data based on the decomposition and reconstruction of power consumption patterns, and relates to the field of power big data analysis and processing. The method first obtains the electricity load data of power users and divides the data set into a complete load data set and a load data set to be repaired. It then extracts, from the complete load data set, a basis-vector dictionary matrix that represents the user's power consumption sub-patterns. Next, based on the dictionary matrix, it decomposes and encodes the load curve to be repaired to determine its sub-pattern composition. Finally, based on the dictionary matrix, it reconstructs the load curve from the coding vector of the load curve to be repaired and fills in the missing part of the power load data. The method can be applied to repairing load data missing over multiple days or over continuous periods.

Description

A method for repairing missing data of power load based on the decomposition and reconstruction of power consumption patterns

Technical field

The invention relates to the field of power big data analysis and processing, and in particular to a method for repairing missing power load data based on the decomposition and reconstruction of power consumption patterns.

Background

The widespread adoption of smart meters and the construction of electricity consumption information collection systems provide a data foundation for the research and analysis of user-side load big data. However, because of meter failures, communication errors, and similar problems, load data are often incomplete. Studying methods for repairing missing load data not only improves data quality but is also a prerequisite for load data analysis, and is of great significance to smart grids and smart electricity use. Because of the randomness of users' electricity consumption and the start-stop behavior of equipment, power load data sequences change quickly and follow no fixed pattern. Missing load data can also be divided into three types: isolated missing, continuous missing, and completely missing. Conventional interpolation algorithms are not suited to repairing load data whose missing points are continuously distributed. Compared with geospatial data repair and image inpainting, repairing missing load data is therefore more difficult.

User load data have two main characteristics: sparsity and diversity. Sparsity means that a user's daily load can essentially be composed linearly of a few sub-patterns; for example, it can be decomposed into the consumption curves of the user's individual devices. Diversity means that one set of power consumption sub-patterns can be reconstructed into different daily load curves through different codes. Based on the sparsity and diversity of power user loads, sparse coding is used to decompose daily load curves into different load sub-patterns, and different load curves are described as linear combinations of these sub-patterns, so that the load can be reconstructed and the missing load data repaired.

Summary of the invention

The technical problem to be solved and the technical task proposed by the present invention are to refine and improve existing technical solutions by providing a method for repairing missing power load data based on the decomposition and reconstruction of power consumption patterns, so that continuously missing load data can be repaired effectively. To this end, the invention adopts the following technical solution.

A method for repairing missing power load data based on the decomposition and reconstruction of power consumption patterns, characterized by comprising the steps of:

1) Obtaining the electricity load data of power users from the electricity consumption information collection system and, according to whether each day's load data were collected completely, dividing the data set into a complete load data set and a load data set to be repaired;

2) Using the K-singular value decomposition (K-SVD) dictionary learning algorithm to extract, from the complete load data set, a basis-vector dictionary matrix representing the user's power consumption sub-patterns;

3) Decomposing and encoding the load curve to be repaired based on the basis-vector dictionary matrix to determine its power consumption sub-pattern composition;

4) Reconstructing the load curve, based on the basis-vector dictionary matrix, from the coding vector of the load curve to be repaired, and filling in the missing part of the power load data, that is, taking the consumption data at the corresponding instants of the reconstructed load curve as the repaired values of the missing load data.

This technical solution adopts the K-SVD dictionary learning algorithm. It first obtains the electricity load data of power users and, according to whether each day's load data were collected completely, divides the data set into a complete load data set and a load data set to be repaired. Based on the sparsity and diversity of power user loads, the K-SVD dictionary learning algorithm is used to extract, from the complete load data set, a basis-vector dictionary matrix representing the user's power consumption sub-patterns. Then, based on the dictionary matrix, the load curve to be repaired is decomposed and encoded to determine its sub-pattern composition. Finally, based on the dictionary matrix, the load curve is reconstructed from the coding vector of the load curve to be repaired, and the missing part of the power load data is filled in by taking the consumption data at the corresponding instants of the reconstructed load curve as the repaired values. Continuously missing load data can thus be repaired effectively.

As a preferred technical means: in step 1), the electricity load data of power users are obtained from the electricity consumption information collection system and, according to whether each day's load data were collected completely, the load data are divided into a complete load data set and a load data set to be repaired. The complete daily load sample set X_{N×M} of a given user can be expressed as:

where N is the number of daily load sampling points; M is the number of days of load collection; the j-th column of X_{N×M} is the daily load curve of day j, an N-dimensional feature vector; and the i-th row of X_{N×M} is the power vector at the i-th sampling instant across all load curves. For a load curve to be repaired, x = [x₁, x₂, …, x_N]^T, the entry x_i is a missing value for i ∈ Ω_nan = {c₁, c₂, …, c_L}, where c_l is the index of the l-th missing point, Ω_nan is the set of indices of the missing sampling points, and L is the number of missing samples in the load curve.
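The matrix formula referred to in step 1) is not preserved in this text. One layout consistent with the definitions above is sketched here; the symbols x_j (day-j column), x^{(i)} (instant-i row) and x_{i,j} are assumed notation, not taken from the original formula.

```latex
X_{N\times M} = [\,x_1, x_2, \dots, x_M\,] =
\begin{bmatrix}
x_{1,1} & x_{1,2} & \cdots & x_{1,M}\\
x_{2,1} & x_{2,2} & \cdots & x_{2,M}\\
\vdots  & \vdots  & \ddots & \vdots \\
x_{N,1} & x_{N,2} & \cdots & x_{N,M}
\end{bmatrix},
\qquad
x_j = [x_{1,j}, \dots, x_{N,j}]^{T},
\qquad
x^{(i)} = [x_{i,1}, \dots, x_{i,M}]
```

Under this reading, each column is one complete day and each row collects the same sampling instant across all M days.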

As a preferred technical means: in step 2), the K-SVD dictionary learning algorithm is used to extract, from the complete load data set, a basis-vector dictionary matrix representing the user's power consumption sub-patterns. The goal of dictionary learning is to learn a dictionary matrix B such that X_{N×M} is approximately decomposed as:

X ≈ BZ

where B ∈ R^{N×K} is the dictionary matrix, K is the size of the dictionary, and each column of B is a unit-norm atom vector, likewise an N-dimensional feature vector; Z = [z₁, z₂, …, z_M] ∈ R^{K×M} is the sparse coding matrix. The approximate decomposition must also keep Z as sparse as possible, so the approximate decomposition problem is expressed as:

where ||·||_F is the Frobenius norm, whose value is the square root of the sum of the squares of the matrix elements and which measures the size of the reconstruction error E_B (the smaller E_B is, the better the dictionary learning); ||·||₀ is the 0-norm, whose value is the number of non-zero entries; and T₀ is the sparsity constraint threshold, which limits the number of non-zero entries in each coding vector z_i. This problem can be solved with the orthogonal matching pursuit algorithm.
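The optimization formula itself is not preserved in this text. The standard sparse dictionary learning problem that matches the symbols described above (Frobenius-norm reconstruction error, 0-norm sparsity, threshold T₀) can be written as below. Whether the original uses the squared norm, and the exact definition of E_B, are assumptions.

```latex
\min_{B,\,Z}\ \lVert X - BZ \rVert_F^{2}
\quad \text{s.t.} \quad
\lVert z_i \rVert_0 \le T_0,\qquad i = 1, 2, \dots, M
```

Here the reconstruction error E_B would correspond to the residual X − BZ, and the constraint allows at most T₀ active sub-patterns per daily load curve.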

As a preferred technical means: in step 3), on the basis of the dictionary learned with the K-SVD algorithm, the load curve to be repaired is decomposed and encoded over the basis vectors to determine its power consumption sub-pattern composition. The curve is encoded using its successfully collected load data and the rows of the dictionary matrix at the corresponding instants. The encoding is expressed as:

x_{/Ω} = x − {x_i | i ∈ Ω_nan}

where x_{/Ω} is the successfully collected load data in the load curve x, of length N − L; the i-th row of B is the feature vector of its i-th dimension; B_{/Ω} is the dictionary matrix obtained by removing from the complete dictionary matrix B the feature row vectors corresponding to the missing instants; and z^g is the reconstruction vector, that is, the sparse coding vector obtained by decomposing x_{/Ω} over B_{/Ω}. Its value is the sub-pattern composition determined from the successfully collected load data, and it represents the likely power consumption pattern of the load curve to be repaired.
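Only the first of the encoding formulas survives in this text. A reconstruction of the remaining two that is consistent with the definitions above is sketched here; the row symbol b^{(i)} and the 2-norm objective are assumptions, chosen to mirror the dictionary learning problem of step 2).

```latex
B_{/\Omega} = B - \{\, b^{(i)} \mid i \in \Omega_{nan} \,\},
\qquad
z^{g} = \arg\min_{z}\ \lVert x_{/\Omega} - B_{/\Omega}\, z \rVert_2^{2}
\quad \text{s.t.} \quad \lVert z \rVert_0 \le T_0
```

Since x_{/Ω} has length N − L and B_{/Ω} has N − L rows, the code z^g still has K entries and can be applied to the complete dictionary B.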

As a preferred technical means: in step 4), based on the basis-vector dictionary matrix, the load curve is reconstructed from the coding vector of the load curve to be repaired:

x^g = B z^g

where x^g is the reconstructed load curve, obtained from the reconstruction vector z^g and the complete dictionary matrix B. On this basis, the missing part of the power load data is filled in: the consumption data at the corresponding instants of the reconstructed load curve are taken as the repaired values of the missing load data, which is expressed as:

in which the filled-in values are the load data of the reconstructed load x^g at the missing sampling instants.
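The filling formula is not preserved in this text. From the description, it presumably assigns to each missing sample the value of the reconstructed curve at the same instant; the component notation x^g_i is an assumption.

```latex
x_i = x^{g}_i, \qquad i \in \Omega_{nan}
```

Samples that were collected successfully (i ∉ Ω_nan) are left unchanged.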

Beneficial effects:

The invention proposes a method for repairing missing power load data based on the decomposition and reconstruction of power consumption patterns. It first obtains the electricity load data of power users and, according to whether each day's load data were collected completely, divides the data set into a complete load data set and a load data set to be repaired. Based on the sparsity and diversity of power user loads, the K-SVD dictionary learning algorithm is used to extract, from the complete load data set, a basis-vector dictionary matrix representing the user's power consumption sub-patterns. Then, based on the dictionary matrix, the load curve to be repaired is decomposed and encoded to determine its sub-pattern composition. Finally, based on the dictionary matrix, the load curve is reconstructed from the coding vector of the load curve to be repaired, and the missing part of the power load data is filled in by taking the consumption data at the corresponding instants of the reconstructed load curve as the repaired values. Continuously missing load data can thus be repaired effectively. Considering that a user's consumption habits and electrical equipment are relatively fixed, the invention divides the user load into several typical power consumption patterns based on historical load data and repairs the missing load data according to those patterns. Power load data managers can, according to their actual needs, apply the invention to repairing load data missing over multiple days or over continuous periods.

Description of the drawings

Figure 1 is a flow chart of the present invention;

Figure 2 shows the missing-data repair result for load curve 1;

Figure 3 shows the missing-data repair result for load curve 2;

Figure 4 shows the missing-data repair result for load curve 3;

Figure 5 shows the actual code and the reconstructed code of load curve 1;

Figure 6 shows the actual code and the reconstructed code of load curve 2;

Figure 7 shows the actual code and the reconstructed code of load curve 3;

Figure 8 shows the basis vectors corresponding to the actual and reconstructed codes of load curve 1;

Figure 9 shows the basis vectors corresponding to the actual and reconstructed codes of load curve 2;

Figure 10 shows the basis vectors corresponding to the actual and reconstructed codes of load curve 3.

Detailed description

The technical solution of the present invention is described in further detail below with reference to the accompanying drawings.

Figure 1 shows the flow of the method. First, the electricity load data of power users are obtained and, according to whether each day's load data were collected completely, the data set is divided into a complete load data set and a load data set to be repaired. Based on the sparsity and diversity of power user loads, the K-SVD dictionary learning algorithm is used to extract, from the complete load data set, a basis-vector dictionary matrix representing the user's power consumption sub-patterns. Then, based on the dictionary matrix, the load curve to be repaired is decomposed and encoded to determine its sub-pattern composition. Finally, based on the dictionary matrix, the load curve is reconstructed from the coding vector of the load curve to be repaired, and the missing part of the power load data is filled in by taking the consumption data at the corresponding instants of the reconstructed load curve as the repaired values. Continuously missing load data can thus be repaired effectively. The specific steps are:

Step 1. Obtain the electricity load data of power users from the electricity consumption information collection system and, according to whether each day's load data were collected completely, divide the load data into a complete load data set and a load data set to be repaired. The complete daily load sample set X_{N×M} of a given user can be expressed as:

where N is the number of daily load sampling points; M is the number of days of load collection; and the j-th column of X_{N×M} is the daily load curve of day j, an N-dimensional feature vector.
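As an illustration of Step 1, the split into a complete set and a to-be-repaired set can be sketched with NumPy. This is a minimal sketch, assuming that missing samples are stored as NaN and that each column holds one day; the function names `split_daily_loads` and `missing_indices` are hypothetical and do not come from the patent.

```python
import numpy as np

def split_daily_loads(loads):
    """Split an (N, D) array of daily load curves (one column per day).

    Assumption: a missing sample is encoded as NaN.
    Returns (complete, to_repair): the columns without and with gaps.
    """
    loads = np.asarray(loads, dtype=float)
    has_gap = np.isnan(loads).any(axis=0)   # True for days with at least one missing sample
    complete = loads[:, ~has_gap]           # plays the role of X_{N×M}
    to_repair = loads[:, has_gap]           # curves x that still need repair
    return complete, to_repair

def missing_indices(x):
    """Return the index set Ω_nan of one curve (0-based positions of NaN samples)."""
    return np.flatnonzero(np.isnan(np.asarray(x, dtype=float)))
```

With this layout, the number of columns of `complete` is M and the length of `missing_indices(x)` is L.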


Step 2. Use the K-SVD dictionary learning algorithm to extract, from the complete load data set, a basis-vector dictionary matrix representing the user's power consumption sub-patterns. The goal of dictionary learning is to learn a dictionary matrix B such that X_{N×M} is approximately decomposed as:

X ≈ BZ

where B ∈ R^{N×K} is the dictionary matrix, K is the size of the dictionary, and each column of B is a unit-norm atom vector.
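K-SVD alternates a sparse coding stage with a dictionary update stage. The sketch below shows only one dictionary update sweep in its textbook form, assuming the sparse codes Z have already been computed (for example by orthogonal matching pursuit). The function name `ksvd_update` is hypothetical, and this is not necessarily the exact variant used in the patent.

```python
import numpy as np

def ksvd_update(X, B, Z):
    """One K-SVD dictionary-update sweep; returns updated copies of B and Z.

    X: (N, M) complete load curves, B: (N, K) dictionary, Z: (K, M) sparse codes.
    The support (non-zero pattern) of Z is kept fixed during the sweep.
    """
    B = np.array(B, dtype=float)
    Z = np.array(Z, dtype=float)
    for k in range(B.shape[1]):
        users = np.flatnonzero(Z[k])         # days whose code uses atom k
        if users.size == 0:
            continue                         # unused atom: leave it unchanged
        Z[k, users] = 0.0                    # remove atom k's contribution
        E = X[:, users] - B @ Z[:, users]    # residual of those days without atom k
        U, s, Vt = np.linalg.svd(E, full_matrices=False)
        B[:, k] = U[:, 0]                    # best rank-1 fit gives a unit-norm atom
        Z[k, users] = s[0] * Vt[0]           # matching coefficients on the same support
    return B, Z
```

Each atom update is the best rank-1 approximation of its residual, so a sweep cannot increase ||X − BZ||_F while the sparsity pattern of Z stays the same.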

Step 3. On the basis of the dictionary learned with the K-SVD algorithm, decompose and encode the load curve to be repaired over the basis vectors to determine its power consumption sub-pattern composition. The curve is encoded using its successfully collected load data and the rows of the dictionary matrix at the corresponding instants. The encoding is expressed as:

x_{/Ω} = x − {x_i | i ∈ Ω_nan}
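Step 3 can be sketched as orthogonal matching pursuit restricted to the observed rows. This is a minimal sketch, assuming NaN marks missing samples. The names `omp` and `encode_observed` are hypothetical, and scaling the atom scores by the norms of the row-reduced atoms is a design choice of this sketch, not something the patent specifies.

```python
import numpy as np

def omp(D, y, t0, tol=1e-10):
    """Greedy orthogonal matching pursuit with at most t0 non-zero coefficients."""
    norms = np.linalg.norm(D, axis=0)
    norms[norms == 0] = 1.0                  # atoms that vanish on the observed rows
    z = np.zeros(D.shape[1])
    support = []
    residual = np.array(y, dtype=float)
    for _ in range(t0):
        score = np.abs(D.T @ residual) / norms
        if support:
            score[support] = 0.0             # never pick the same atom twice
        k = int(np.argmax(score))
        if score[k] <= tol:
            break                            # nothing left to explain
        support.append(k)
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef  # refit on the whole support each time
        z[support] = coef
    return z

def encode_observed(B, x, t0):
    """Sparse-code a gappy curve x using only its successfully collected samples."""
    x = np.asarray(x, dtype=float)
    observed = ~np.isnan(x)
    B_obs = B[observed]                      # B_{/Ω}: rows at missing instants removed
    x_obs = x[observed]                      # x_{/Ω}: length N - L
    return omp(B_obs, x_obs, t0)             # z^g, with K entries
```

Because the coefficients are fitted against the row-reduced atoms without rescaling, the returned code can be multiplied directly by the complete dictionary B.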

Step 4. Based on the basis-vector dictionary matrix, reconstruct the load curve from the coding vector of the load curve to be repaired:

x^g = B z^g

The values used for the repair are the load data of the reconstructed load x^g at the missing sampling instants.
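Step 4 reduces to one matrix-vector product followed by a masked assignment. The sketch below assumes NaN marks the missing samples and that the code z^g has already been obtained in Step 3; the function name `fill_missing` is hypothetical.

```python
import numpy as np

def fill_missing(B, x, z_g):
    """Reconstruct x^g = B z^g and copy it into the missing positions of x.

    Returns (repaired, x_g). Collected samples of x are kept as they are.
    """
    x = np.asarray(x, dtype=float)
    x_g = B @ np.asarray(z_g, dtype=float)   # reconstructed load curve
    missing = np.isnan(x)                    # positions in Ω_nan
    repaired = x.copy()
    repaired[missing] = x_g[missing]         # only the missing samples are replaced
    return repaired, x_g
```

Keeping the collected samples untouched means the reconstruction error on observed instants never leaks into the repaired curve.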

The invention is further described below with a specific example.

1. Data source

The example data come mainly from the 48-point daily load data (48 samples per day) of a residential user from May to October 2019. The load curves of three randomly selected days are used to construct samples with missing data, each set to have load data missing at 10 consecutive sampling instants.
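A test sample of this kind can be built by blanking 10 consecutive points of a complete 48-point curve. In the sketch below the gap position is drawn at random, which is an assumption, since the text does not say how the position was chosen; the function name `make_gap` is hypothetical.

```python
import numpy as np

def make_gap(curve, gap_len=10, seed=None):
    """Return (gappy, start): a copy of a complete daily curve with gap_len
    consecutive samples set to NaN, and the index where the gap begins."""
    rng = np.random.default_rng(seed)
    curve = np.asarray(curve, dtype=float)
    start = int(rng.integers(0, curve.size - gap_len + 1))  # gap must fit inside the day
    gappy = curve.copy()                                    # leave the original intact
    gappy[start:start + gap_len] = np.nan
    return gappy, start
```

Keeping the original curve allows the repaired values to be compared with the true ones afterwards.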

2. Repair results for missing load data

The three load curves are repaired with the proposed technical solution. 100 other completely collected daily load curves of the same user (M = 100) are selected as the training set for dictionary learning; the dictionary size is set to 20 (K = 20) and the sparsity constraint threshold to T₀ = 5. The repair results for the three load curves are shown in Figures 2, 3 and 4; the sparse codes of the actual and reconstructed load curves are shown in Figures 5, 6 and 7; and the basis vectors corresponding to the codes are shown in Figures 8, 9 and 10.

Figures 2, 3 and 4 show that the proposed algorithm repairs the missing load data well, whether the load during the missing period is flat or rises and falls sharply. Figures 5, 6 and 7 show that, for a load curve with part of its data missing, the code obtained from the dictionary is close to the code of the actual complete load curve, and the entries with larger code values are essentially the same. This indicates that a load curve with missing data can still, from its successfully collected part, obtain through the dictionary matrix a code consistent with that of the actual complete load curve. Because the reconstructed code is consistent with the code of the original load decomposition, the load curve reconstructed from the reconstructed code and the complete dictionary essentially matches the actual complete load curve, so the missing load data can be repaired. Figures 8, 9 and 10 show that the basis vector corresponding to the largest code value (basis vector 13 in Figure 8, basis vector 18 in Figure 9, and basis vector 16 in Figure 10) is fairly close to the actual complete load curve, while the basis vectors corresponding to the remaining codes refine and approximate it further, so that the curve is finally expressed through the dictionary basis vectors.

The method for repairing missing power load data based on the decomposition and reconstruction of power consumption patterns shown in Figure 1 above is a specific embodiment of the present invention and already embodies its substantive features and progress. Equivalent modifications in shape, structure, and the like, made according to actual needs under the teaching of the present invention, all fall within the protection scope of this solution.

Claims

1. A method for repairing missing power load data based on the decomposition and reconstruction of power consumption patterns, characterized by comprising the steps of:

1) obtaining the electricity load data of power users from the electricity consumption information collection system and, according to whether each day's load data were collected completely, dividing the data set into a complete load data set and a load data set to be repaired;

2) using the K-singular value decomposition (K-SVD) dictionary learning algorithm to extract, from the complete load data set, a basis-vector dictionary matrix representing the user's power consumption sub-patterns;

3) decomposing and encoding the load curve to be repaired based on the basis-vector dictionary matrix to determine its power consumption sub-pattern composition;

4) reconstructing the load curve, based on the basis-vector dictionary matrix, from the coding vector of the load curve to be repaired, and filling in the missing part of the power load data, that is, taking the consumption data at the corresponding instants of the reconstructed load curve as the repaired values of the missing load data;

specifically, in step 2), the K-SVD dictionary learning algorithm is used to extract, from the complete load data set, the basis-vector dictionary matrix representing the user's power consumption sub-patterns, the goal of dictionary learning being to learn a dictionary matrix B such that X_{N×M} is approximately decomposed as:

X ≈ BZ

where B ∈ R^{N×K} is the dictionary matrix, K is the size of the dictionary, and each column of B is a unit-norm atom vector, likewise an N-dimensional feature vector; Z = [z₁, z₂, …, z_M] ∈ R^{K×M} is the sparse coding matrix; the approximate decomposition must also keep Z as sparse as possible, so the approximate decomposition problem is expressed as:

where ||·||_F is the Frobenius norm, whose value is the square root of the sum of the squares of the matrix elements and which measures the size of the reconstruction error E_B, a smaller E_B meaning better dictionary learning; ||·||₀ is the 0-norm, whose value is the number of non-zero entries; T₀ is the sparsity constraint threshold, which limits the number of non-zero entries in each coding vector z_i; and the problem can be solved with the orthogonal matching pursuit algorithm.

2. The method for repairing missing power load data based on the decomposition and reconstruction of power consumption patterns according to claim 1, characterized in that: in step 1), the electricity load data of power users are obtained from the electricity consumption information collection system and, according to whether each day's load data were collected completely, the load data are divided into a complete load data set and a load data set to be repaired, wherein the complete daily load sample set X_{N×M} of a user is:

where N is the number of daily load sampling points; M is the number of days of load collection; the j-th column of X_{N×M} is the daily load curve of day j, an N-dimensional feature vector; the i-th row of X_{N×M} is the power vector at the i-th sampling instant across all load curves; and, for a load curve to be repaired, x = [x₁, x₂, …, x_N]^T, the entry x_i is a missing value for i ∈ Ω_nan = {c₁, c₂, …, c_L}, where c_l is the index of the l-th missing point, Ω_nan is the set of indices of the missing sampling points, and L is the number of missing samples in the load curve.

3. The method for repairing missing power load data based on the decomposition and reconstruction of power consumption patterns according to claim 1, characterized in that: in step 3), on the basis of the dictionary learned with the K-SVD algorithm, the load curve to be repaired is decomposed and encoded over the basis vectors to determine its power consumption sub-pattern composition, the curve being encoded using its successfully collected load data and the rows of the dictionary matrix at the corresponding instants, the encoding being expressed as:

x_{/Ω} = x − {x_i | i ∈ Ω_nan}

where x_{/Ω} is the successfully collected load data in the load curve x, of length N − L; the i-th row of B is the feature vector of its i-th dimension; and B_{/Ω} is the dictionary matrix obtained by removing from the complete dictionary matrix B the feature row vectors corresponding to the missing instants.

4. The method for repairing missing power load data based on the decomposition and reconstruction of power consumption patterns according to claim 1, characterized in that: in step 4), based on the basis-vector dictionary matrix, the load curve is reconstructed from the coding vector of the load curve to be repaired, expressed as:

x^g = B z^g

where x^g is the reconstructed load curve, obtained from the reconstruction vector z^g and the complete dictionary matrix B; on this basis, the missing part of the power load data is filled in, that is, the consumption data at the corresponding instants of the reconstructed load curve are taken as the repaired values of the missing load data, which is expressed as: