CN113220671B - Power load missing data restoration method based on power utilization mode decomposition and reconstruction - Google Patents
- Publication number
- CN113220671B (application CN202110409685.5A)
- Authority
- CN
- China
- Prior art keywords
- load
- data
- missing
- vector
- power
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
- G06F18/2136—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on sparsity criteria, e.g. with an overcomplete basis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/28—Determining representative reference patterns, e.g. by averaging or distorting; Generating dictionaries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Business, Economics & Management (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Evolutionary Computation (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Economics (AREA)
- Databases & Information Systems (AREA)
- Water Supply & Treatment (AREA)
- General Health & Medical Sciences (AREA)
- Human Resources & Organizations (AREA)
- Marketing (AREA)
- Primary Health Care (AREA)
- Strategic Management (AREA)
- Tourism & Hospitality (AREA)
- General Business, Economics & Management (AREA)
- Quality & Reliability (AREA)
- Public Health (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
The invention discloses a method for repairing missing power load data based on the decomposition and reconstruction of electricity consumption patterns, and relates to the field of power big data analysis and processing. The method first obtains the electricity load data of power users and divides the data set into a complete load data set and a to-be-repaired load data set. Exploiting the sparsity and diversity of user power loads, it applies the K-singular value decomposition (K-SVD) dictionary learning algorithm to the complete load data set to extract a basis-vector dictionary matrix that represents the user's electricity consumption sub-patterns. Based on this dictionary matrix, the load curve to be repaired is then decomposed and encoded to determine its sub-pattern composition. Finally, the load curve is reconstructed from the dictionary matrix and the coding vector of the curve to be repaired, and the missing part of the power load data is filled in. The method can be applied to repairing load data that is missing over multiple days or over continuous periods.
Description
Technical Field
The invention relates to the field of power big data analysis and processing, and in particular to a method for repairing missing power load data based on the decomposition and reconstruction of electricity consumption patterns.
Background
The wide adoption of smart meters and the construction of electricity consumption information collection systems provide the data basis for research on and analysis of user-side load big data. However, because of meter failures, communication errors and similar problems, load data is often incomplete. Studying methods for repairing missing load data not only improves data quality but is also a prerequisite for load data analysis, and is of great significance to smart grids and smart electricity use. Because users consume electricity randomly and equipment starts and stops, a power load data series changes quickly and follows no fixed rule. In addition, missing load data falls into three types: isolated missing, continuous missing and entirely missing. Conventional interpolation algorithms are not suited to repairing load data whose missing points are distributed continuously. Compared with geospatial data repair and image inpainting, repairing missing load data is therefore more difficult.
User load data has two main characteristics: sparsity and diversity. Sparsity means that a user's daily load can essentially be composed linearly from a few sub-patterns, for example by decomposing it into the consumption curves of the user's individual devices. Diversity means that one set of electricity consumption sub-patterns can be recombined, through different codes, into different daily load curves. Based on this sparsity and diversity, sparse coding is used to decompose daily load curves into different load sub-patterns and to describe each load curve as a linear combination of sub-patterns. This allows the load to be reconstructed and the missing load data to be repaired.
Summary of the Invention
The technical problem to be solved and the technical task proposed by the invention are to refine and improve existing technical solutions by providing a method for repairing missing power load data based on the decomposition and reconstruction of electricity consumption patterns, so that continuously missing load data can be repaired effectively. To this end, the invention adopts the following technical solution.
A method for repairing missing power load data based on the decomposition and reconstruction of electricity consumption patterns, characterized by comprising the steps of:
1) obtaining the electricity load data of power users from the electricity consumption information collection system and, according to whether each day's load data was collected completely, dividing the data set into a complete load data set and a to-be-repaired load data set;
2) using the K-singular value decomposition (K-SVD) dictionary learning algorithm to extract, from the complete load data set, a basis-vector dictionary matrix that represents the user's electricity consumption sub-patterns;
3) based on the basis-vector dictionary matrix, decomposing and encoding the load curve to be repaired to determine its sub-pattern composition;
4) based on the basis-vector dictionary matrix, reconstructing the load curve from the coding vector of the load curve to be repaired, and filling in the missing part of the power load data, that is, taking the values of the reconstructed load curve at the corresponding times as the repaired values of the missing load data.
This technical solution uses the K-SVD dictionary learning algorithm. It first obtains the electricity load data of power users and, according to whether each day's load data was collected completely, divides the data set into a complete load data set and a to-be-repaired load data set. Based on the sparsity and diversity of user loads, it extracts from the complete load data set a basis-vector dictionary matrix that represents the user's electricity consumption sub-patterns. The load curve to be repaired is then decomposed and encoded against this dictionary to determine its sub-pattern composition. Finally, the load curve is reconstructed from the dictionary matrix and the coding vector, and the values of the reconstructed curve at the missing times are used as the repaired values. Continuously missing load data can thereby be repaired effectively.
As a preferred technical means: in step 1), the electricity load data of power users is obtained from the electricity consumption information collection system and, according to whether each day's load data was collected completely, divided into a complete load data set and a to-be-repaired load data set. The complete daily load sample set $X_{N\times M}$ of a given user can be expressed as:
$$X_{N\times M} = \left[\mathbf{x}^{1}, \mathbf{x}^{2}, \dots, \mathbf{x}^{M}\right] = \begin{bmatrix} x_{1}^{1} & x_{1}^{2} & \cdots & x_{1}^{M} \\ x_{2}^{1} & x_{2}^{2} & \cdots & x_{2}^{M} \\ \vdots & \vdots & & \vdots \\ x_{N}^{1} & x_{N}^{2} & \cdots & x_{N}^{M} \end{bmatrix}$$
where $N$ is the number of daily load sampling points; $M$ is the number of days of load collection; $\mathbf{x}^{j} = [x_{1}^{j}, x_{2}^{j}, \dots, x_{N}^{j}]^{T}$ is the daily load curve of day $j$, an $N$-dimensional feature vector; and $\mathbf{x}_{i} = [x_{i}^{1}, x_{i}^{2}, \dots, x_{i}^{M}]$ is the power vector at the $i$-th sampling time across all load curves. For a load curve to be restored, $x = [x_1, x_2, \dots, x_N]^{T}$, the entry $x_i$ is a missing value for $i \in \Omega_{nan} = \{c_1, c_2, \dots, c_L\}$, where $c_l$ is the index of the $l$-th missing point, $\Omega_{nan}$ is the set of indices of the missing sampling points, and $L$ is the number of missing samples in the load curve.
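The split described in step 1) can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function name `split_days`, the convention that a failed reading is stored as `NaN`, and the toy numbers are assumptions, not part of the patent.

```python
import numpy as np

def split_days(load):
    """Split an (N, D) load array, one column per day, into complete and incomplete days.

    Assumption for this sketch: a reading that failed to be collected is stored as NaN.
    """
    load = np.asarray(load, dtype=float)
    has_gap = np.isnan(load).any(axis=0)   # True for days with at least one missing sample
    X_complete = load[:, ~has_gap]         # complete load data set (training data)
    X_repair = load[:, has_gap]            # to-be-repaired load data set
    return X_complete, X_repair

# Toy data: N = 4 samples per day, 3 days, the second day has two gaps.
load = np.array([[1.0, 2.0, 3.0],
                 [1.5, np.nan, 3.5],
                 [2.0, 2.5, 4.0],
                 [2.5, np.nan, 4.5]])
X_complete, X_repair = split_days(load)

# Index set of missing points (0-based here, whereas the text counts from 1).
omega_nan = np.flatnonzero(np.isnan(X_repair[:, 0]))
print(X_complete.shape, X_repair.shape, omega_nan)
```

Keeping the days as columns matches the layout of $X_{N\times M}$ in the text, so the complete set can be passed to dictionary learning without transposing.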
As a preferred technical means: in step 2), the K-SVD dictionary learning algorithm is used to extract, from the complete load data set, a basis-vector dictionary matrix that represents the user's electricity consumption sub-patterns. The goal of dictionary learning is to learn a dictionary matrix $B$ such that $X_{N\times M}$ is approximately decomposed as:
$$X \approx BZ$$
where $B \in R^{N\times K}$ is the dictionary matrix, $K$ is the size of the dictionary, and each column of $B$ is a unit-norm atom vector, which like a daily load curve is an $N$-dimensional feature vector; $Z = [z_1, z_2, \dots, z_M] \in R^{K\times M}$ is the sparse coding matrix. The approximate decomposition must also keep $Z$ as sparse as possible, so the problem is expressed as:
$$\min_{B,Z} E_B = \left\| X - BZ \right\|_F^2 \quad \text{s.t.} \quad \left\| z_i \right\|_0 \le T_0, \; \forall i$$
where $\|\cdot\|_F$ is the Frobenius norm, the square root of the sum of the squared matrix elements, which measures the reconstruction error $E_B$; the smaller $E_B$ is, the better the dictionary has been learned. $\|\cdot\|_0$ is the 0-norm, the number of non-zero entries; $T_0$ is the sparsity constraint threshold, which limits the number of non-zero entries in each coding vector $z_i$. The sparse coding in this expression can be solved with the orthogonal matching pursuit algorithm.
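To make the role of the sparsity threshold $T_0$ concrete, the following is a minimal NumPy sketch of orthogonal matching pursuit for one load curve. The function name `omp`, the stopping tolerance and the synthetic orthonormal dictionary are assumptions made for illustration; the patent names the algorithm but does not give an implementation.

```python
import numpy as np

def omp(B, x, T0, tol=1e-10):
    """Greedy sparse code z with at most T0 non-zeros such that B @ z approximates x."""
    x = np.asarray(x, dtype=float)
    z = np.zeros(B.shape[1])
    support = []                      # indices of the atoms selected so far
    coef = np.zeros(0)
    residual = x.copy()
    for _ in range(T0):
        if np.linalg.norm(residual) < tol:
            break                     # x is already explained by the chosen atoms
        corr = np.abs(B.T @ residual)
        if support:
            corr[support] = 0.0       # never pick the same atom twice
        support.append(int(np.argmax(corr)))
        # Least-squares refit of all selected atoms (the "orthogonal" step).
        coef, *_ = np.linalg.lstsq(B[:, support], x, rcond=None)
        residual = x - B[:, support] @ coef
    if support:
        z[support] = coef
    return z

# Synthetic check with an orthonormal dictionary (N = 48, K = 20, as in the example
# section), where a 3-sparse code is recovered exactly.
rng = np.random.default_rng(0)
B, _ = np.linalg.qr(rng.standard_normal((48, 20)))
z_true = np.zeros(20)
z_true[[3, 7, 15]] = [2.0, -1.5, 0.7]
z = omp(B, B @ z_true, T0=5)
print(np.count_nonzero(np.abs(z) > 1e-8))
```

A learned load dictionary is generally not orthonormal, so exact recovery as in this toy check should not be expected on real data; the sketch only shows how at most $T_0$ atoms are selected and refitted.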
As a preferred technical means: in step 3), after the dictionary has been learned with the K-SVD algorithm, the load curve to be repaired is decomposed and encoded on the basis vectors to determine its sub-pattern composition. The successfully collected part of the load curve and the dictionary rows at the corresponding times are used for the encoding, which is expressed as:
$$x_{/\Omega} = x - \{x_i \mid i \in \Omega_{nan}\}$$
$$B_{/\Omega} = B - \{b_i \mid i \in \Omega_{nan}\}$$
$$z_g = \arg\min_{z} \left\| x_{/\Omega} - B_{/\Omega} z \right\|_2^2 \quad \text{s.t.} \quad \left\| z \right\|_0 \le T_0$$
where $x_{/\Omega}$ is the successfully collected load data in load curve $x$, of length $N-L$; $b_i$ is the $i$-th dimension (row) feature vector of $B$; $B_{/\Omega}$ is the dictionary matrix obtained from the complete dictionary matrix $B$ by removing the row vectors that correspond to the missing sampling times; and $z_g$ is the reconstruction vector, the sparse coding vector obtained by decomposing $x_{/\Omega}$ on $B_{/\Omega}$. Its value is the sub-pattern composition determined from the successfully collected load data and represents the likely electricity consumption pattern of the load curve to be repaired.
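The row removal that produces $x_{/\Omega}$ and $B_{/\Omega}$ amounts to applying one boolean mask to both the curve and the dictionary. The sketch below shows only that bookkeeping; the helper name `drop_missing_rows`, the `NaN` convention and the small random dictionary are assumptions for illustration.

```python
import numpy as np

def drop_missing_rows(x, B):
    """Return the observed part of x, the matching rows of B, and the missing indices."""
    x = np.asarray(x, dtype=float)
    observed = ~np.isnan(x)                     # True where the sample was collected
    return x[observed], B[observed, :], np.flatnonzero(~observed)

# Toy sizes: N = 6 sampling points, K = 3 atoms, L = 2 missing points.
rng = np.random.default_rng(1)
B = rng.standard_normal((6, 3))
B /= np.linalg.norm(B, axis=0)                  # unit-norm atoms, as in step 2)
x = np.array([0.4, 0.5, np.nan, np.nan, 0.9, 0.7])

x_obs, B_obs, omega_nan = drop_missing_rows(x, B)
print(x_obs.shape, B_obs.shape, omega_nan)      # lengths N - L = 4 and indices of the gaps
```

One detail the text leaves open: after rows are removed, the columns of $B_{/\Omega}$ are no longer unit-norm. Whether to rescale them before sparse coding is an implementation choice that this sketch does not make.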
As a preferred technical means: in step 4), based on the basis-vector dictionary matrix, the load curve is reconstructed from the coding vector of the load curve to be repaired, which is expressed as:
$$x_g = B z_g$$
where $x_g$ is the reconstructed load curve, obtained from the reconstruction vector $z_g$ and the complete dictionary matrix $B$. On this basis, the missing part of the power load data is filled in, that is, the values of the reconstructed load curve at the corresponding times are taken as the repaired values of the missing load data:
$$x_i = x_{g,i}, \quad i \in \Omega_{nan}$$
where $x_{g,i}$ is the load value of the reconstructed load $x_g$ at the missing sampling time $i$.
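The fill-in rule of step 4) keeps every collected sample and copies reconstructed values only into the gaps. A minimal sketch follows, assuming a coding vector `z_g` is already available; the function name `fill_missing` and the tiny dictionary are hypothetical.

```python
import numpy as np

def fill_missing(x, B, z_g):
    """Reconstruct x_g = B @ z_g and copy it into the NaN positions of x only."""
    x = np.asarray(x, dtype=float)
    x_g = B @ z_g                      # reconstructed load curve from the complete dictionary
    repaired = x.copy()
    gaps = np.isnan(x)
    repaired[gaps] = x_g[gaps]         # collected samples are left untouched
    return repaired, x_g

# Tiny worked example: N = 3, K = 2, the middle sample is missing.
B = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0]])
z_g = np.array([2.0, 1.0])
x = np.array([2.0, np.nan, 1.0])

repaired, x_g = fill_missing(x, B, z_g)
print(repaired)
```

Here $x_g = Bz_g = [2, 3, 1]^T$, so the missing middle sample is repaired as 3 while the two collected samples keep their original values.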
Beneficial effects:
The invention proposes a method for repairing missing power load data based on the decomposition and reconstruction of electricity consumption patterns. It first obtains the electricity load data of power users and, according to whether each day's load data was collected completely, divides the data set into a complete load data set and a to-be-repaired load data set. Based on the sparsity and diversity of user loads, the K-SVD dictionary learning algorithm extracts from the complete load data set a basis-vector dictionary matrix that represents the user's electricity consumption sub-patterns. The load curve to be repaired is then decomposed and encoded against this dictionary to determine its sub-pattern composition. Finally, the load curve is reconstructed from the dictionary matrix and the coding vector, and the values of the reconstructed curve at the missing times are used as the repaired values, so that continuously missing load data can be repaired effectively.
Considering that a user's electricity consumption habits and electrical equipment are relatively fixed, the invention divides the user load into several typical electricity consumption patterns based on historical load data and repairs the missing load data according to these patterns. Power load data managers can, according to their actual needs, apply the invention to repairing load data that is missing over multiple days or over continuous periods.
Description of the Drawings
Figure 1 is a flow chart of the present invention.
Figure 2 is the missing-data repair result for load curve 1;
Figure 3 is the missing-data repair result for load curve 2;
Figure 4 is the missing-data repair result for load curve 3;
Figure 5 shows the actual code and the reconstructed code of load curve 1;
Figure 6 shows the actual code and the reconstructed code of load curve 2;
Figure 7 shows the actual code and the reconstructed code of load curve 3;
Figure 8 shows the basis vectors corresponding to the actual and reconstructed codes of load curve 1;
Figure 9 shows the basis vectors corresponding to the actual and reconstructed codes of load curve 2;
Figure 10 shows the basis vectors corresponding to the actual and reconstructed codes of load curve 3.
Detailed Description
The technical solution of the invention is described in further detail below with reference to the accompanying drawings.
Figure 1 shows the flow of the method. First, the electricity load data of power users is obtained and, according to whether each day's load data was collected completely, the data set is divided into a complete load data set and a to-be-repaired load data set. Based on the sparsity and diversity of user loads, the K-SVD dictionary learning algorithm extracts from the complete load data set a basis-vector dictionary matrix that represents the user's electricity consumption sub-patterns. The load curve to be repaired is then decomposed and encoded against this dictionary to determine its sub-pattern composition. Finally, the load curve is reconstructed from the dictionary matrix and the coding vector, and the values of the reconstructed curve at the missing times are used as the repaired values, so that continuously missing load data can be repaired effectively. The specific steps are:
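Step 1. Obtain the electricity load data of power users from the electricity consumption information collection system and, according to whether each day's load data was collected completely, divide the load data into a complete load data set and a to-be-repaired load data set. The complete daily load sample set $X_{N\times M}$ of a given user can be expressed as:
$$X_{N\times M} = \left[\mathbf{x}^{1}, \mathbf{x}^{2}, \dots, \mathbf{x}^{M}\right] = \begin{bmatrix} x_{1}^{1} & x_{1}^{2} & \cdots & x_{1}^{M} \\ x_{2}^{1} & x_{2}^{2} & \cdots & x_{2}^{M} \\ \vdots & \vdots & & \vdots \\ x_{N}^{1} & x_{N}^{2} & \cdots & x_{N}^{M} \end{bmatrix}$$
where $N$ is the number of daily load sampling points; $M$ is the number of days of load collection; $\mathbf{x}^{j} = [x_{1}^{j}, x_{2}^{j}, \dots, x_{N}^{j}]^{T}$ is the daily load curve of day $j$, an $N$-dimensional feature vector; and $\mathbf{x}_{i} = [x_{i}^{1}, x_{i}^{2}, \dots, x_{i}^{M}]$ is the power vector at the $i$-th sampling time across all load curves. For a load curve to be restored, $x = [x_1, x_2, \dots, x_N]^{T}$, the entry $x_i$ is a missing value for $i \in \Omega_{nan} = \{c_1, c_2, \dots, c_L\}$, where $c_l$ is the index of the $l$-th missing point, $\Omega_{nan}$ is the set of indices of the missing sampling points, and $L$ is the number of missing samples in the load curve.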
Step 2. Use the K-SVD dictionary learning algorithm to extract, from the complete load data set, a basis-vector dictionary matrix that represents the user's electricity consumption sub-patterns. The goal of dictionary learning is to learn a dictionary matrix $B$ such that $X_{N\times M}$ is approximately decomposed as:
$$X \approx BZ$$
where $B \in R^{N\times K}$ is the dictionary matrix, $K$ is the size of the dictionary, and each column of $B$ is a unit-norm atom vector, which like a daily load curve is an $N$-dimensional feature vector; $Z = [z_1, z_2, \dots, z_M] \in R^{K\times M}$ is the sparse coding matrix. The approximate decomposition must also keep $Z$ as sparse as possible, so the problem is expressed as:
$$\min_{B,Z} E_B = \left\| X - BZ \right\|_F^2 \quad \text{s.t.} \quad \left\| z_i \right\|_0 \le T_0, \; \forall i$$
where $\|\cdot\|_F$ is the Frobenius norm, the square root of the sum of the squared matrix elements, which measures the reconstruction error $E_B$; the smaller $E_B$ is, the better the dictionary has been learned. $\|\cdot\|_0$ is the 0-norm, the number of non-zero entries; $T_0$ is the sparsity constraint threshold, which limits the number of non-zero entries in each coding vector $z_i$. The sparse coding in this expression can be solved with the orthogonal matching pursuit algorithm.
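K-SVD alternates between sparse coding with the dictionary fixed and a dictionary update with the sparsity pattern fixed. The text specifies only the first half, so the sketch below illustrates the second half as it appears in the standard K-SVD algorithm: each atom is refitted by a rank-1 SVD of the residual restricted to the curves that use it. The function name `ksvd_update` and the random test data are assumptions, not taken from the patent.

```python
import numpy as np

def ksvd_update(X, B, Z):
    """One K-SVD dictionary-update sweep; returns updated copies of B and Z."""
    B, Z = B.copy(), Z.copy()
    for k in range(B.shape[1]):
        users = np.flatnonzero(Z[k, :])          # load curves whose code uses atom k
        if users.size == 0:
            continue                             # unused atom: left unchanged in this sketch
        Z[k, users] = 0.0
        E_k = X[:, users] - B @ Z[:, users]      # residual with atom k's contribution removed
        U, s, Vt = np.linalg.svd(E_k, full_matrices=False)
        B[:, k] = U[:, 0]                        # new unit-norm atom
        Z[k, users] = s[0] * Vt[0, :]            # new coefficients on the same support
    return B, Z

# Random stand-in data: N = 8, K = 5, M = 30, T0 = 2 non-zeros per code.
rng = np.random.default_rng(0)
N, K, M, T0 = 8, 5, 30, 2
X = rng.standard_normal((N, M))
B0 = rng.standard_normal((N, K))
B0 /= np.linalg.norm(B0, axis=0)
Z0 = np.zeros((K, M))
for j in range(M):
    idx = rng.choice(K, size=T0, replace=False)
    Z0[idx, j] = rng.standard_normal(T0)

err_before = np.linalg.norm(X - B0 @ Z0)         # Frobenius norm of the reconstruction error
B1, Z1 = ksvd_update(X, B0, Z0)
err_after = np.linalg.norm(X - B1 @ Z1)
print(err_after <= err_before)
```

Because each rank-1 refit is optimal for its atom with the other atoms held fixed, the Frobenius reconstruction error cannot increase during the sweep, and the number of non-zeros per code stays within $T_0$. A full training loop would alternate this sweep with a sparse coding pass over all $M$ curves.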
Step 3. After the dictionary has been learned with the K-SVD algorithm, decompose and encode the load curve to be repaired on the basis vectors to determine its sub-pattern composition. The successfully collected part of the load curve and the dictionary rows at the corresponding times are used for the encoding, which is expressed as:
$$x_{/\Omega} = x - \{x_i \mid i \in \Omega_{nan}\}$$
$$B_{/\Omega} = B - \{b_i \mid i \in \Omega_{nan}\}$$
$$z_g = \arg\min_{z} \left\| x_{/\Omega} - B_{/\Omega} z \right\|_2^2 \quad \text{s.t.} \quad \left\| z \right\|_0 \le T_0$$
where $x_{/\Omega}$ is the successfully collected load data in load curve $x$, of length $N-L$; $b_i$ is the $i$-th dimension (row) feature vector of $B$; $B_{/\Omega}$ is the dictionary matrix obtained from the complete dictionary matrix $B$ by removing the row vectors that correspond to the missing sampling times; and $z_g$ is the reconstruction vector, the sparse coding vector obtained by decomposing $x_{/\Omega}$ on $B_{/\Omega}$. Its value is the sub-pattern composition determined from the successfully collected load data and represents the likely electricity consumption pattern of the load curve to be repaired.
Step 4. Based on the basis-vector dictionary matrix, reconstruct the load curve from the coding vector of the load curve to be repaired, which is expressed as:
$$x_g = B z_g$$
where $x_g$ is the reconstructed load curve, obtained from the reconstruction vector $z_g$ and the complete dictionary matrix $B$. On this basis, the missing part of the power load data is filled in, that is, the values of the reconstructed load curve at the corresponding times are taken as the repaired values of the missing load data:
$$x_i = x_{g,i}, \quad i \in \Omega_{nan}$$
where $x_{g,i}$ is the load value of the reconstructed load $x_g$ at the missing sampling time $i$.
The invention is further described below with a specific example:
1. Data source
The example data comes mainly from the 48-point daily load data (48 samples per day) of a residential user from May to October 2019. The load curves of three randomly selected days are used to construct missing-data samples, and in each of them the load data of 10 consecutive sampling times is set as missing.
2. Repair results for the missing load data
The three load curves are repaired with the technical solution proposed by the invention. 100 other completely collected daily load curves of the same user (M=100) are selected as the training set for dictionary learning, the dictionary size is set to 20 (K=20), and the sparsity constraint threshold is set to T0=5. The repair results for the three load curves are shown in Figures 2, 3 and 4; the sparse codes of the actual and reconstructed load curves are shown in Figures 5, 6 and 7; and the basis vectors corresponding to the codes are shown in Figures 8, 9 and 10.
Figures 2, 3 and 4 show that the proposed algorithm repairs the missing load data well, whether the load during the missing period is flat or rises and falls sharply. Figures 5, 6 and 7 show that, for a load curve with part of its data missing, the code obtained from the dictionary is close to the code of the actual complete load curve, and the components with large coding values are essentially the same. This indicates that a load curve with missing data can still, from its successfully collected part and the dictionary matrix, obtain a code consistent with that of the actual complete load curve. Because the reconstructed code agrees with the code of the original load decomposition, the load curve reconstructed from the reconstructed code and the complete dictionary is essentially consistent with the actual complete load curve, so the missing load data can be repaired. Figures 8, 9 and 10 show that the basis vector with the largest coding value (basis vector 13 in Figure 8, basis vector 18 in Figure 9 and basis vector 16 in Figure 10) is close to the actual complete load curve, while the basis vectors of the remaining codes further patch and refine the approximation, so that the load curve is ultimately rebuilt from the dictionary basis vectors.
The method for repairing missing power load data based on the decomposition and reconstruction of electricity consumption patterns shown in Figure 1 above is a specific embodiment of the invention and already embodies its substantive features and progress. Equivalent modifications to its shape, structure and the like, made according to actual needs and under the inspiration of the invention, all fall within the protection scope of this solution.
Claims (4)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110409685.5A CN113220671B (en) | 2021-04-16 | 2021-04-16 | Power load missing data restoration method based on power utilization mode decomposition and reconstruction |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110409685.5A CN113220671B (en) | 2021-04-16 | 2021-04-16 | Power load missing data restoration method based on power utilization mode decomposition and reconstruction |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113220671A CN113220671A (en) | 2021-08-06 |
CN113220671B true CN113220671B (en) | 2022-06-17 |
Family
ID=77087575
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110409685.5A Active CN113220671B (en) | 2021-04-16 | 2021-04-16 | Power load missing data restoration method based on power utilization mode decomposition and reconstruction |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113220671B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114168574B (en) * | 2021-10-27 | 2024-09-24 | 清华大学 | A data missing processing method and device for industrial loads |
CN120011148B (en) * | 2025-04-18 | 2025-07-08 | 浙江大学 | Missing load recovery method based on similar block embedding and first-order polynomial interpolation |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108537380A (en) * | 2018-04-04 | 2018-09-14 | 福州大学 | A kind of Methods of electric load forecasting based on rarefaction representation |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3441896B1 (en) * | 2012-09-14 | 2021-04-21 | InteraXon Inc. | Systems and methods for collecting, analyzing, and sharing bio-signal and non-bio-signal data |
US10545919B2 (en) * | 2013-09-27 | 2020-01-28 | Google Llc | Decomposition techniques for multi-dimensional data |
CN104361054A (en) * | 2014-10-30 | 2015-02-18 | Electric Power Research Institute of Guangdong Power Grid Co., Ltd. | Method and system for restructuring, positioning and visualizing line loss of electric power system |
US10705168B2 (en) * | 2017-01-17 | 2020-07-07 | Case Western Reserve University | System and method for low rank approximation of high resolution MRF through dictionary fitting |
CN111159638B (en) * | 2019-12-26 | 2023-12-08 | South China University of Technology | Distribution network load missing data recovery method based on approximate low-rank matrix completion |
CN111508043B (en) * | 2020-03-24 | 2022-11-25 | Donghua University | A Texture Reconstruction Method of Woven Fabric Based on Discriminant Shared Dictionary |
Also Published As
Publication number | Publication date |
---|---|
CN113220671A (en) | 2021-08-06 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN102938649B (en) | Power quality data self-adapting reconstruction decompression method based on compressive sensing theory | |
CN107832837B (en) | Convolutional neural network compression method and decompression method based on compressed sensing principle | |
CN111159638B (en) | Distribution network load missing data recovery method based on approximate low-rank matrix completion | |
CN113220671B (en) | Power load missing data restoration method based on power utilization mode decomposition and reconstruction | |
CN104506752B (en) | A kind of similar image compression method based on residual error compressed sensing | |
CN113141008A (en) | Data-driven power distribution network distributed new energy consumption capacity assessment method | |
CN109992930A (en) | A method and device for estimating weather-sensitive load power | |
Huang et al. | ECG compression using the context modeling arithmetic coding with dynamic learning vector–scalar quantization | |
CN108197425B (en) | A Smart Grid Data Decomposition Method Based on Non-negative Matrix Factorization | |
CN111612319A (en) | Loading Curve Deep Embedding Clustering Method Based on 1D Convolutional Autoencoder | |
CN113469189A (en) | Method, system and device for filling missing values of power utilization acquisition data | |
CN109635946A (en) | A kind of combined depth neural network and the clustering method constrained in pairs | |
CN110278444A (en) | A Geometry-Guided Sparse Representation 3D Point Cloud Compression Method | |
CN111193254A (en) | A kind of residential daily electricity load forecasting method and equipment | |
CN116977763A (en) | Model training method, device, computer readable storage medium and computer equipment | |
CN113158134B (en) | Method, device and storage medium for constructing non-invasive load identification model | |
Liu et al. | Prediction of Temperature Time Series Based on Wavelet Transform and Support Vector Machine. | |
CN116859140A (en) | Cloud edge cooperation-based non-invasive load monitoring data online compressed sensing method | |
CN114881120B (en) | Method and system for identifying household transformer relation of platform based on depth self-encoder and clustering | |
CN118036605A (en) | A text representation system and method based on Word2vec-QCNN model and its application in the construction of a vocabulary in the power field | |
CN114638905B (en) | Image generation method, device, equipment and storage medium | |
CN113901679B (en) | Reliability analysis method and device for power system and computer equipment | |
CN116091120B (en) | Full stack type electricity price consulting and managing system based on knowledge graph technology | |
CN112051040A (en) | An intelligent identification method for mechanical faults of on-load tap-changers | |
CN117590267A (en) | Early warning method and system for performance degradation of energy storage battery | |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||