CN114926303A - Electric larceny detection method based on transfer learning - Google Patents
- Publication number: CN114926303A
- Application number: CN202210451618.4A
- Authority: CN (China)
- Prior art keywords: data, neural network, network model, target, electricity
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0639—Performance analysis of employees; Performance analysis of enterprise or organisation operations
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Description
Technical Field
The invention relates to the technical field of electricity theft detection, and in particular to an electricity theft detection method based on transfer learning.
Background
Electricity theft causes serious harm to power companies, society and the state. It causes huge economic losses in many countries every year and also affects the stability of the power grid. With the development of the smart grid, power systems are gradually being digitized and massive amounts of user electricity consumption data are becoming available, which provides a basis for electricity theft detection based on deep learning. However, a single smart meter has no theft detection function and can only supply large volumes of consumption data without theft status labels. Identifying each user's theft status from the massive consumption data supplied by many smart meters and assigning a theft status label to every user would require enormous labor cost. As a result, very little consumption data carries theft status labels, so deep learning detection methods that depend on labeled data are difficult to train effectively in practice, cannot deliver their detection performance, and fail to identify electricity-stealing users correctly.
Summary of the Invention
The prior art requires electricity consumption data that carries a large number of theft status labels, which leaves too little training data in practice. To overcome this problem, the invention proposes an electricity theft detection method based on transfer learning.
The primary purpose of the invention is to solve the above technical problem. The technical solution is as follows:
An electricity theft detection method based on transfer learning, characterized by comprising the following steps:
S1: Acquire the sub-meter readings of all users in the target area under inspection and construct the target domain test set $D_{target}$. Divide the target area into multiple sub-regions and read the master meter of each sub-region. Determine whether electricity theft exists in each sub-region from its master meter reading, the sub-meter readings of all users in the sub-region, the technical energy loss and an error threshold, assign a theft status label accordingly, and construct the source domain data set $D_{source}$. Acquire the electricity consumption data and theft status labels of a small number of historical users and construct the target domain training set $D_{train}$;
S2: Preprocess the target domain test set, the source domain data set and the target domain training set from step S1 separately: convert every electricity consumption data sequence into electricity consumption data matrices using the day, the week and the month as time periods, then perform missing value recovery, data cleaning and data normalization;
S3: Divide the source domain data set from step S2 into a source domain training set and a source domain test set;
S4: Build the source domain and target domain CNN neural network models, each of which comprises three input layers, multiple convolutional layers and multiple fully connected layers;
S5: Pre-train the source domain CNN neural network model on the source domain training set and evaluate it on the source domain test set;
S6: Save the parameters of the source domain CNN neural network model that passed the evaluation in step S5 and transfer them to the target domain CNN neural network model to be trained, thereby initializing the parameters of the target domain CNN neural network model;
S7: Train the target domain CNN neural network model initialized in step S6 on the electricity consumption data of all users in the target domain training set and their corresponding theft status labels;
S8: Feed the electricity consumption data of all users in the target domain test set into the target domain CNN neural network model trained in step S7, classify each user, and identify the electricity-stealing users in the target area.
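Step S3 does not state how the source domain data set is divided. The following is a minimal sketch, assuming a shuffled split performed separately for each label so that both classes appear in both parts. The function name `split_source_domain`, the 80/20 ratio and the toy data are hypothetical and are not taken from the patent.

```python
import random

def split_source_domain(samples, test_fraction=0.2, seed=0):
    """Split (sequence, label) pairs into train and test lists, class by class."""
    rng = random.Random(seed)
    train, test = [], []
    for label in sorted({y for _, y in samples}):
        group = [s for s in samples if s[1] == label]
        rng.shuffle(group)
        # Keep at least one test sample per class when the class has more than one member.
        n_test = max(1, round(len(group) * test_fraction)) if len(group) > 1 else 0
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test

# Toy source domain: ten sub-regions with alternating labels (illustrative only).
source = [([float(i)] * 4, i % 2) for i in range(10)]
source_train, source_test = split_source_domain(source)
```

Splitting per class matters here because theft labels are typically a minority, and a purely random split could leave the test set without any positive sub-region.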
In this solution, the target domain test set $D_{target}$, the source domain data set $D_{source}$ and the target domain training set $D_{train}$ described in step S1 are constructed as follows:
S101: Record the readings of all user sub-meters in the target area as the electricity consumption data of the corresponding users, and construct the target domain test set $D_{target}$ from these sequences, where $d_m$ denotes the electricity consumption data sequence of user $m$ and $d_{m,n}$ denotes the $n$-th sampled reading of user $m$;
S102: Divide the target area into multiple sub-regions and take the master meter reading of each sub-region as the sub-region electricity consumption data, where $d_{sub,g}$ denotes the electricity consumption data sequence of sub-region $g$ and $d_{sub,g,n}$ denotes the $n$-th sampled reading of sub-region $g$;
S103: For each sub-region, add up the sub-meter readings of the users located in it to obtain the total user electricity consumption data of that sub-region, where $d_{reg,g}$ denotes the total user electricity consumption data sequence of sub-region $g$ and $d_{reg,g,n}$ denotes its $n$-th sampled value;
S104: Calculate the technical energy loss $d_{TL,g}$ of the transmission lines between the master meter of each sub-region and the user sub-meters, and compute the theft status label $y_{reg,g}$ of each sub-region from $d_{sub,g}$, $d_{reg,g}$, $d_{TL,g}$ and the error threshold $\alpha$, where $y_{reg,g}=1$ denotes the theft state and $y_{reg,g}=0$ denotes the normal state;
S105: Combine the total user electricity consumption data sequence of each sub-region with its theft status label to construct the source domain data set $D_{source}$;
S106: Construct the target domain training set $D_{train}$ from the electricity consumption data and theft status labels of a small number of historical users, where $d_{his,k}$ denotes the electricity consumption data sequence of historical user $k$, $d_{his,k,n}$ denotes the $n$-th sampled reading of historical user $k$, and $y_{his,k}$ denotes the theft status label of historical user $k$.
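The labeling formula of step S104 is not reproduced in the text, so the sketch below is only one plausible reading of steps S103 to S105. It assumes that a sub-region is labeled as theft when the energy recorded by the master meter exceeds the summed sub-meter energy plus the technical loss by more than $\alpha$ over the whole recording window, and that `tech_loss` is a single total for that window. The function names, the dictionary layout and the numbers are hypothetical.

```python
def region_total(user_sequences):
    """S103: element-wise sum of the sub-meter sequences of one sub-region."""
    return [sum(values) for values in zip(*user_sequences)]

def region_label(d_sub, d_reg, d_tl, alpha):
    """S104 (assumed rule): 1 if the unexplained energy exceeds alpha, else 0."""
    unexplained = sum(d_sub) - sum(d_reg) - d_tl
    return 1 if unexplained > alpha else 0

def build_source_domain(regions, alpha):
    """S105: pair each sub-region's summed user sequence with its label."""
    d_source = []
    for region in regions:
        d_reg = region_total(region["users"])
        y = region_label(region["master"], d_reg, region["tech_loss"], alpha)
        d_source.append((d_reg, y))
    return d_source

# Two toy sub-regions with identical users but different master meter readings.
regions = [
    {"users": [[1.0, 2.0], [3.0, 1.0]], "master": [4.1, 3.1], "tech_loss": 0.2},
    {"users": [[1.0, 2.0], [3.0, 1.0]], "master": [6.0, 5.0], "tech_loss": 0.2},
]
d_source = build_source_domain(regions, alpha=0.5)
```

In the first sub-region the gap between master meter and sub-meters is fully explained by the technical loss, so it is labeled normal. In the second the gap is far larger than the loss plus the threshold, so it is labeled as theft.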
In this solution, the data preprocessing described in step S2 comprises the following steps:
S201: For the target domain test set, the source domain data set and the target domain training set described in step S1, convert every electricity consumption data sequence into electricity consumption data matrices using the day, the week and the month as time periods. Each row of such a matrix holds the electricity consumption data of a single time period. This yields the daily period data matrix $D_{day}$, the weekly period data matrix $D_{week}$ and the monthly period data matrix $D_{month}$, where $o$ is the number of days sampled by the meter, $p$ is the number of weeks sampled, and $q$ is the number of months sampled;
S202: Perform missing value recovery, data cleaning and data normalization on the daily period data matrix $D_{day}$, the weekly period data matrix $D_{week}$ and the monthly period data matrix $D_{month}$.
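The matrix construction and preprocessing of steps S201 and S202 can be sketched as below. The patent does not name the recovery or normalization methods, so neighbour averaging and min-max scaling are assumptions, as is dropping samples that do not fill a complete period. A month is taken as 30 days, matching the input size given in step S401. For brevity the sketch cleans the flat sequence before reshaping, which gives the same values as cleaning the matrix entries in row order. All function names are hypothetical.

```python
def to_period_matrix(sequence, samples_per_period):
    """Reshape a 1-D sequence into rows of one period each (incomplete tail dropped)."""
    rows = len(sequence) // samples_per_period
    return [sequence[r * samples_per_period:(r + 1) * samples_per_period]
            for r in range(rows)]

def recover_missing(sequence):
    """Fill None entries with the mean of the nearest known neighbours (assumed method)."""
    filled = list(sequence)
    for i, value in enumerate(filled):
        if value is None:
            prev = next((filled[j] for j in range(i - 1, -1, -1)
                         if filled[j] is not None), None)
            nxt = next((sequence[j] for j in range(i + 1, len(sequence))
                        if sequence[j] is not None), None)
            known = [v for v in (prev, nxt) if v is not None]
            filled[i] = sum(known) / len(known) if known else 0.0
    return filled

def min_max_normalize(sequence):
    """Scale values to [0, 1]; a constant sequence maps to all zeros."""
    lo, hi = min(sequence), max(sequence)
    if hi == lo:
        return [0.0 for _ in sequence]
    return [(v - lo) / (hi - lo) for v in sequence]

samples_per_day = 4                      # hypothetical: one reading every 6 hours
raw = [float(i % 4) for i in range(60 * samples_per_day)]   # 60 days of toy data
raw[5] = None                            # simulate a lost reading
clean = min_max_normalize(recover_missing(raw))
d_day = to_period_matrix(clean, samples_per_day)          # o x 4
d_week = to_period_matrix(clean, samples_per_day * 7)     # p x 28
d_month = to_period_matrix(clean, samples_per_day * 30)   # q x 120
```

With 60 days of data this gives 60 daily rows, 8 complete weekly rows and 2 monthly rows, so the three matrices view the same readings at three different periodicities.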
In this solution, the source domain and target domain CNN neural network models described in step S4 are built as follows:
S401: The CNN neural network model has three input layers: the daily period data input $X_{day}$, the weekly period data input $X_{week}$ and the monthly period data input $X_{month}$, whose sizes are $o \times (24 \times 60 / n)$, $p \times (24 \times 60 \times 7 / n)$ and $q \times (24 \times 60 \times 30 / n)$ respectively;
S402: Apply multiple convolutional layers to each of the three inputs separately:
$S_1 = f(W X_{day} + b)$
$S_2 = f(W X_{week} + b)$
$S_3 = f(W X_{month} + b)$
where $S_1$, $S_2$ and $S_3$ are the feature outputs of the three inputs after their convolutional layers, $W$ and $b$ are the weights and biases of the convolutional layers, and $f(\cdot)$ is the activation function of the convolutional layers;
S403: Fuse $S_1$, $S_2$ and $S_3$ with a fusion layer to obtain $S_4$;
S404: Extract features from $S_4$ with multiple convolutional layers and fully connected layers, and output the class of the inspected target:
$S_5 = f(W S_4 + b)$
$y = g(V S_5 + c)$
where $S_5$ is the feature output of the convolutional layers that follow the fusion layer, $V$ and $c$ are the weights and biases of the fully connected layers, $g(\cdot)$ is the activation function of the fully connected layers, and $y$ is the class of the inspected target, with $y=1$ denoting electricity theft and $y=0$ denoting normal use.
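To make the data flow of steps S401 to S404 concrete, the following is a deliberately tiny forward pass written with the standard library only. It is a sketch under several assumptions that the patent does not state: one 3×3 convolution per branch, ReLU for $f$, a sigmoid for $g$, global mean and max pooling so that branches of different size can be fused, and concatenation as the fusion operation. It also reads $24 \times 60 / n$ as the number of readings per day when $n$ is the sampling interval in minutes. A practical model would be written in a deep learning framework with several layers per stage.

```python
import math
import random

def relu(x):
    return x if x > 0.0 else 0.0

def conv2d_relu(matrix, kernel, bias):
    """Valid 2-D convolution (cross-correlation) followed by ReLU: f(WX + b)."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(matrix) - kh + 1):
        row = []
        for j in range(len(matrix[0]) - kw + 1):
            acc = bias
            for a in range(kh):
                for c in range(kw):
                    acc += kernel[a][c] * matrix[i + a][j + c]
            row.append(relu(acc))
        out.append(row)
    return out

def global_pool(feature_map):
    """Reduce a feature map to [mean, max] (assumed pooling, not in the patent)."""
    flat = [v for row in feature_map for v in row]
    return [sum(flat) / len(flat), max(flat)]

def conv1d_relu(vector, kernel, bias):
    """Valid 1-D convolution over the fused vector followed by ReLU."""
    k = len(kernel)
    return [relu(bias + sum(kernel[a] * vector[i + a] for a in range(k)))
            for i in range(len(vector) - k + 1)]

def dense_sigmoid(vector, weights, bias):
    """Fully connected output unit: g(V S5 + c) with a sigmoid for g."""
    z = bias + sum(w * v for w, v in zip(weights, vector))
    return 1.0 / (1.0 + math.exp(-z))

def init_params(seed=0):
    rng = random.Random(seed)
    def kernel2d():
        return [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(3)]
    return {
        "branch": {name: {"W": kernel2d(), "b": 0.0}
                   for name in ("day", "week", "month")},
        "post": {"W": [rng.uniform(-0.5, 0.5) for _ in range(3)], "b": 0.0},
        "fc": {"V": [rng.uniform(-0.5, 0.5) for _ in range(4)], "c": 0.0},
    }

def forward(params, x_day, x_week, x_month):
    s4 = []                                   # fusion by concatenation (assumed)
    for name, x in (("day", x_day), ("week", x_week), ("month", x_month)):
        p = params["branch"][name]
        s4 += global_pool(conv2d_relu(x, p["W"], p["b"]))    # S1, S2, S3
    s5 = conv1d_relu(s4, params["post"]["W"], params["post"]["b"])
    prob = dense_sigmoid(s5, params["fc"]["V"], params["fc"]["c"])
    return prob, (1 if prob >= 0.5 else 0)

def reshape(seq, width):
    return [seq[r * width:(r + 1) * width] for r in range(len(seq) // width)]

# 90 days at 4 readings per day (n = 360 minutes), toy values in [0, 1].
series = [((i * 7) % 11) / 10.0 for i in range(90 * 4)]
params = init_params(seed=0)
prob, label = forward(params, reshape(series, 4), reshape(series, 28),
                      reshape(series, 120))
```

The weights here are random, so `prob` carries no meaning until the model is trained. The sketch only shows that three inputs of sizes 90×4, 12×28 and 3×120 can pass through separate branches, be fused, and produce a single binary decision.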
In this solution, the pre-training of the source domain CNN neural network model on the source domain training set and its evaluation on the source domain test set, as described in step S5, comprise the following steps:
S501: Use the daily period data matrix $D_{day}$, the weekly period data matrix $D_{week}$ and the monthly period data matrix $D_{month}$ of each sub-region in the source domain training set and the source domain test set as the daily period data input $X_{day}$, the weekly period data input $X_{week}$ and the monthly period data input $X_{month}$ of the source domain CNN neural network model described in step S4;
S502: Pre-train and evaluate the CNN neural network model described in step S4 using the electricity consumption data and theft status labels of the source domain training set and the source domain test set.
In this solution, the initialization of the target domain CNN neural network model parameters described in step S6 is performed by transferring the weights and biases of the source domain CNN neural network model from step S5 to the target domain CNN neural network model, where they serve as the initial values of its weights and biases.
In this solution, the initialized target domain CNN neural network model is trained in step S7 by using the daily period data matrix $D_{day}$, the weekly period data matrix $D_{week}$ and the monthly period data matrix $D_{month}$ of the small number of historical users in the target domain training set as the daily period data input $X_{day}$, the weekly period data input $X_{week}$ and the monthly period data input $X_{month}$ of the target domain CNN neural network model from step S6.
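Steps S5 to S8 follow the usual pre-train, copy and fine-tune pattern. The sketch below illustrates that pattern with a logistic-regression stand-in instead of the CNN, purely to keep the example short and runnable. The two-feature toy data, the learning rates and the epoch counts are hypothetical, and theft is modelled as unusually low normalized consumption only for illustration.

```python
import copy
import math

def predict(params, x):
    z = params["b"] + sum(w * v for w, v in zip(params["W"], x))
    return 1.0 / (1.0 + math.exp(-z))

def train(params, data, epochs, lr):
    """Plain stochastic gradient descent on the binary cross-entropy loss."""
    for _ in range(epochs):
        for x, y in data:
            err = predict(params, x) - y
            params["W"] = [w - lr * err * v for w, v in zip(params["W"], x)]
            params["b"] -= lr * err
    return params

def accuracy(params, data):
    hits = sum((predict(params, x) >= 0.5) == bool(y) for x, y in data)
    return hits / len(data)

# S5: pre-train on labeled sub-regions (source domain) and evaluate.
source_train = [([0.9, 0.8], 0), ([0.8, 0.9], 0), ([0.2, 0.1], 1), ([0.1, 0.3], 1)]
source_test = [([0.85, 0.85], 0), ([0.15, 0.15], 1)]
source_params = train({"W": [0.0, 0.0], "b": 0.0}, source_train, epochs=200, lr=0.5)
source_accuracy = accuracy(source_params, source_test)

# S6: transfer the weights and biases as the initial values of the target model.
target_params = copy.deepcopy(source_params)

# S7: fine-tune on the few labeled historical users (target domain training set).
target_train = [([0.85, 0.9], 0), ([0.15, 0.2], 1)]
train(target_params, target_train, epochs=20, lr=0.1)

# S8: classify the unlabeled users of the target domain test set.
target_test = [[0.9, 0.95], [0.1, 0.05]]
predictions = [1 if predict(target_params, x) >= 0.5 else 0 for x in target_test]
```

The deep copy in S6 keeps the saved source parameters unchanged while the target model is fine-tuned, so the same pre-trained state can be reused for another target area.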
Compared with the prior art, the technical solution of the invention has the following beneficial effects:
Transfer learning and a CNN neural network model are used to inspect the users in the target area, which overcomes the dependence of existing deep learning electricity theft detection methods on large amounts of data carrying theft status labels: good detection performance is obtained with only a small amount of labeled consumption data. In addition, because only a small amount of source domain data and target domain training data is used to train the CNN neural network model, the training time of the deep neural network for electricity theft detection is greatly reduced while electricity-stealing users are still identified accurately. Daily, weekly and monthly period data are used as the inputs of the CNN neural network model, and daily, weekly and monthly periodic features are extracted separately, so that the extracted features describe electricity consumption behavior more accurately and thereby improve the theft detection performance of the CNN neural network model.
Description of Drawings
FIG. 1 is a flowchart of the electricity theft detection method based on transfer learning proposed by the invention;
FIG. 2 shows the CNN neural network model of an embodiment of the invention.
Detailed Description
To make the above objects, features and advantages of the invention easier to understand, the invention is described in further detail below with reference to the accompanying drawings and specific embodiments. It should be noted that the embodiments of the present application and the features in those embodiments may be combined with one another provided they do not conflict.
Many specific details are set forth in the following description to give a full understanding of the invention. However, the invention may also be implemented in ways other than those described here, so the scope of protection of the invention is not limited by the specific embodiments disclosed below.
在一个具体的实施例中,如图1所示,一种基于迁移学习的窃电检测方法,包括以下步骤:In a specific embodiment, as shown in FIG. 1 , a method for detecting electricity theft based on transfer learning includes the following steps:
S1:获取检测目标区域内所有用户的子表读数,构建目标域测试集Dtarget;将目标区域划分为多个分区域,检测分区域总表读数,并根据分区域总表读数、分区域内所有用户子表读数、电能技术损耗和误差阈值判断分区域是否存在窃电,并设置窃电状态标签,构建源域数据集Dsource;获取少量历史用户的用电数据及其窃电状态标签,构建目标域训练集Dtrain;S1: Obtain the sub-meter readings of all users in the detection target area, and construct the target domain test set D target ; divide the target area into multiple sub-regions, detect the sub-region master meter readings, and determine All user sub-meter readings, power technical losses and error thresholds determine whether there is electricity stealing in the sub-region, and set the electricity stealing status label to construct the source domain data set D source ; obtain a small number of historical users' electricity consumption data and electricity stealing status labels, Construct the target domain training set D train ;
S2:将步骤S1所述的目标域测试集、源域数据集、目标域训练集分别进行数据预处理,将其中所有用电数据序列分别以日、周、月为时间周期转化为用电数据矩阵,然后进行缺失值恢复、数据清洗、数据归一化;S2: Perform data preprocessing on the target domain test set, the source domain data set, and the target domain training set described in step S1, respectively, and convert all the electricity consumption data sequences into electricity consumption data with daily, weekly, and monthly time periods respectively. matrix, and then perform missing value recovery, data cleaning, and data normalization;
S3:将步骤S2所述的源域数据集划分为源域训练集和源域测试集;S3: Divide the source domain data set described in step S2 into a source domain training set and a source domain test set;
S4:搭建源域和目标域CNN神经网络模型,所述CNN神经网络模型包括三个输入层、多个卷积层和多个全连接层;S4: Build source domain and target domain CNN neural network models, the CNN neural network model includes three input layers, multiple convolution layers and multiple fully connected layers;
S5:使用源域训练集对源域CNN神经网络模型进行预训练,使用源域测试集对源域CNN神经网络模型进行评估;S5: Use the source domain training set to pre-train the source domain CNN neural network model, and use the source domain test set to evaluate the source domain CNN neural network model;
S6:保存步骤S5评估合格的源域CNN神经网络模型参数,并将其迁移到待训练的目标域CNN神经网络模型中,对目标域CNN神经网络模型参数进行初始化;S6: Save the source domain CNN neural network model parameters that are qualified in the evaluation of step S5, and migrate them to the target domain CNN neural network model to be trained, and initialize the target domain CNN neural network model parameters;
S7:采用目标域训练集中所有用户的用电数据及其对应的窃电状态标签对步骤S6所述已初始化的目标域CNN神经网络模型进行训练;S7: Use the electricity consumption data of all users in the target domain training set and their corresponding electricity stealing state labels to train the initialized target domain CNN neural network model described in step S6;
S8:将目标域测试集中所有用户的用电数据输入到步骤S7训练的目标域CNN神经网络模型,分类用户类型,寻找目标区域内窃电用户。S8: Input the electricity consumption data of all users in the target domain test set into the target domain CNN neural network model trained in step S7, classify user types, and find electricity stealing users in the target area.
本方案中,步骤S1所述的目标域测试集Dtarget、源域数据集Dsource和目标域训练集Dtrain,其具体组成步骤如下:In this solution, the target domain test set D target , the source domain data set D source and the target domain training set D train described in step S1 are composed of the following steps:
S101:记录目标区域内所有用户子表的读数作为对应用户的用电数据,并构建目标域测试集Dtarget如下:S101: Record the readings of all user sub-meters in the target area as the electricity consumption data of the corresponding users, and construct the target domain test set D target as follows:
其中,dm表示用户m的用电数据序列;dm,n表示用户m第n个采样记录的数据;Among them, d m represents the power consumption data sequence of user m; d m,n represents the data recorded by the nth sample of user m;
S102:将目标区域划分为多个分区域,检测分区域总表读数,作为分区域用电数据如下:S102: Divide the target area into a plurality of sub-areas, detect the sub-area total meter reading, and use the sub-area electricity consumption data as follows:
其中,dsub,g表示分区域g的用电数据序列;dsub,g,n表示分区域g第n个采样记录的数据;Among them, d sub, g represents the power consumption data sequence of sub-region g; d sub, g, n represents the data recorded by the nth sampling of sub-region g;
S103:按所在区域将各个分区域内的用户子表读数相加,得到各个分区域用户总用电数据如下:S103: Add up the readings of user sub-meters in each sub-area according to the area, and obtain the total electricity consumption data of users in each sub-area as follows:
其中,dreg,g表示分区域g的用户总用电数据序列;dreg,g,n表示分区域g第n个采样记录的数据;Among them, d reg,g represents the total electricity consumption data sequence of users in sub-region g; d reg, g, n represents the data recorded by the nth sample in sub-region g;
S104:计算各个分区域总表到用户子表之间输电线路的电能技术损耗dTL,g;根据下式计算各分区域的窃电状态标签:S104: Calculate the electric energy technical loss d TL,g of the transmission line between the master meter of each sub-region and the user sub-meter; calculate the electricity stealing status label of each sub-region according to the following formula:
其中,yreg,g表示分区域g的窃电状态标签,yreg,g=1表示为窃电状态,yreg,g=0表示为正常状态;α为误差阈值;Wherein, y reg,g represents the electricity stealing state label of sub-region g, y reg, g =1 represents the electricity stealing state, y reg, g =0 represents the normal state; α is the error threshold;
S105:将各个分区域的用户总用电数据序列及其窃电状态标签组合构建源域数据集Dsource:S105: Construct a source domain data set D source by combining the user's total electricity consumption data sequence of each sub-area and its electricity stealing state label:
S106:根据少量历史用户的用电数据及其窃电状态标签,构建目标域训练集Dtrain如下:S106: According to the electricity consumption data of a small number of historical users and their electricity stealing state labels, construct the target domain training set D train as follows:
其中,dhis,k表示历史用户k的用电数据序列;dhis,k,n表示历史用户k第n个采样记录的数据;yhis,k表示历史用户k的窃电状态标签。Among them, d his,k represents the electricity consumption data sequence of the historical user k; d his,k,n represents the data of the nth sampling record of the historical user k; y his,k represents the electricity stealing status label of the historical user k.
本方案中,步骤S2所述的数据预处理,其具体组成步骤如下:In this scheme, the data preprocessing described in step S2, its specific composition steps are as follows:
S201:针对步骤S1所述的目标域测试集、源域数据集和目标域训练集,将其中所有用电数据序列分别以日、周、月为时间周期,转化为用电数据矩阵,所述用电数据矩阵的每一行代表单个时间周期内的用电数据,构建日周期数据矩阵Dday、周周期数据矩阵Dweek、月周期数据矩阵Dmonth如下:S201: For the target domain test set, the source domain data set, and the target domain training set described in step S1, convert all the power consumption data sequences into a power consumption data matrix with day, week, and month as time periods, respectively. Each row of the electricity consumption data matrix represents electricity consumption data in a single time period. The daily period data matrix D day , the weekly period data matrix D week , and the monthly period data matrix D month are constructed as follows:
其中,o表示电表采样的天数;p表示电表采样的周数;q表示电表采样的月数;Among them, o represents the number of days for meter sampling; p represents the number of weeks for meter sampling; q represents the number of months for meter sampling;
S202:将日周期数据矩阵Dday、周周期数据矩阵Dweek、月周期数据矩阵Dmonth进行缺失值恢复、数据清洗、数据归一化处理。S202: Perform missing value recovery, data cleaning, and data normalization processing on the daily period data matrix D day , the weekly period data matrix D week , and the monthly period data matrix D month .
本方案中,步骤S4所述的搭建源域和目标域CNN神经网络模型,如图2所示,其具体组成步骤如下:In this solution, the construction of the source domain and target domain CNN neural network models described in step S4 is shown in Figure 2, and its specific composition steps are as follows:
S401:所述的CNN神经网络模型有三个输入层,包括日周期数据输入Xday、周周期数据输入Xweek和月周期数据输入Xmonth,其大小分别为o×(24×60/n)、p×(24×60×7/n)和q×(24×60×30/n);S401: The CNN neural network model has three input layers, including daily cycle data input X day , weekly cycle data input X week and monthly cycle data input X month , whose sizes are o×(24×60/n), p×(24×60×7/n) and q×(24×60×30/n);
S402:分别使用多个卷积层对三组输入数据进行卷积操作:S402: Use multiple convolution layers to perform convolution operations on three sets of input data:
S1=f(WXday+b)S 1 =f(WX day +b)
S2=f(WXweek+b)S 2 =f(WX week +b)
S3=f(WXmonth+b)S 3 =f(WX month +b)
其中,S1、S2和S3分别三组输入经过多个卷积层后的特征输出;W和b分别表示卷积层的权重和偏置;f(.)表示卷积层的激活函数;Among them, S 1 , S 2 and S 3 respectively input three sets of feature outputs after multiple convolutional layers; W and b represent the weight and bias of the convolutional layer, respectively; f(.) represents the activation function of the convolutional layer ;
S403:使用融合层对S1、S2和S3进行融合,得到S4;S403: Use a fusion layer to fuse S 1 , S 2 and S 3 to obtain S 4 ;
S404:使用多个卷积层和全连接层对S4进行特征提取,并输出检测目标的类型:S404: Use multiple convolutional layers and fully connected layers to perform feature extraction on S4, and output the type of detection target:
S5=f(WS4+b)S 5 =f(WS 4 +b)
y=g(VS5+c)y=g(VS 5 +c)
其中,S5为融合层后多个卷积层的特征输出;V和c分别表示全连接层的权重和偏置;g(.)表示全连接层的激活函数;y表示检测目标的类别,y=1表示窃电,y=0表示正常。Among them, S5 is the feature output of multiple convolutional layers after the fusion layer; V and c represent the weight and bias of the fully connected layer, respectively; g(.) represents the activation function of the fully connected layer; y represents the detection target category, y=1 means stealing electricity, y=0 means normal.
In this solution, step S5 pre-trains the source-domain CNN neural network model on the source-domain training set and evaluates it on the source-domain test set. The specific steps are as follows:
S501: Use the daily-cycle data matrix $D_{day}$, the weekly-cycle data matrix $D_{week}$, and the monthly-cycle data matrix $D_{month}$ of each sub-region in the source-domain training set and source-domain test set as, respectively, the daily-cycle data input $X_{day}$, the weekly-cycle data input $X_{week}$, and the monthly-cycle data input $X_{month}$ of the source-domain CNN neural network model described in step S4;
S502: Pre-train and evaluate the CNN neural network model described in step S4 using the electricity consumption data of the source-domain training set and source-domain test set together with their electricity-theft status labels.
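The text does not name the metrics used to evaluate the model on the source-domain test set. As one possible reading, the sketch below computes accuracy, precision, and recall from the theft labels ($y=1$ theft, $y=0$ normal). The choice of metrics and the function name `evaluate` are assumptions, and the label lists are toy values, not results.

```python
def evaluate(y_true, y_pred):
    """Return accuracy, precision, and recall for theft (1) vs. normal (0) labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # theft correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal correctly passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # normal wrongly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # theft missed
    total = len(pairs)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Toy labels for six test users.
metrics = evaluate([1, 0, 1, 0, 1, 0], [1, 0, 0, 0, 1, 1])
```

Because theft users are typically a small minority, precision and recall are reported alongside accuracy in this sketch.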
In this solution, step S6 initializes the parameters of the target-domain CNN neural network model as follows: the weights and biases of the source-domain CNN neural network model from step S5 are transferred to the target-domain CNN neural network model and used as the initial values of its weights and biases.
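The parameter transfer in step S6 amounts to copying the trained source-domain weights and biases into a target-domain model of the same structure. A minimal sketch follows, assuming the parameters are held in a nested dictionary. The layer names and numeric values are hypothetical placeholders.

```python
import copy

def init_target_from_source(source_params):
    """Return an independent copy of the source-domain parameters (step S6)."""
    return copy.deepcopy(source_params)

# Hypothetical pre-trained source-domain parameters.
source_params = {
    "conv_day": {"W": [[0.1, -0.2]], "b": [0.0]},
    "fc": {"V": [0.5, 0.3], "c": [0.1]},
}

target_params = init_target_from_source(source_params)

# Later training updates only the target copy; the source model is untouched.
target_params["fc"]["c"][0] = 0.9
```

A deep copy is used so that subsequent training of the target-domain model cannot alter the source-domain model.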
In this solution, step S7 trains the initialized target-domain CNN neural network model as follows: the daily-cycle data matrix $D_{day}$, the weekly-cycle data matrix $D_{week}$, and the monthly-cycle data matrix $D_{month}$ of a small number of historical users in the target-domain training set are used as, respectively, the daily-cycle data input $X_{day}$, the weekly-cycle data input $X_{week}$, and the monthly-cycle data input $X_{month}$ of the target-domain CNN neural network model from step S6.
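The overall flow of steps S5 to S7 (pre-train on the source domain, transfer the parameters, then train on a small number of target-domain users) can be sketched end to end. To keep the example short, a logistic-regression classifier stands in for the CNN, and both domains are synthetic random data. These are illustrative assumptions only and say nothing about the performance of the actual method.

```python
import numpy as np

def log_loss(w, b, x, y):
    """Mean binary cross-entropy of the logistic model (w, b) on (x, y)."""
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
    eps = 1e-12
    return float(-np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps)))

def train(w, b, x, y, lr=0.1, steps=200):
    """Plain gradient descent on the logistic loss, starting from (w, b)."""
    w, b = w.copy(), float(b)
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
        grad = p - y
        w -= lr * x.T @ grad / len(y)
        b -= lr * grad.mean()
    return w, b

rng = np.random.default_rng(0)

def make_domain(n_users, shift):
    """Synthetic features; theft users (y=1) are given a lower mean (arbitrary choice)."""
    y = (np.arange(n_users) % 2).astype(float)
    x = rng.normal(0.0, 1.0, (n_users, 5)) - 1.5 * y[:, None] + shift
    return x, y

x_src, y_src = make_domain(400, shift=0.0)  # plentiful labeled source-domain users
x_tgt, y_tgt = make_domain(20, shift=0.3)   # a small number of target-domain users

w_src, b_src = train(np.zeros(5), 0.0, x_src, y_src)        # S5: pre-train on source
loss_before = log_loss(w_src, b_src, x_tgt, y_tgt)          # S6: transferred initial values
w_tgt, b_tgt = train(w_src, b_src, x_tgt, y_tgt, steps=50)  # S7: train on target users
loss_after = log_loss(w_tgt, b_tgt, x_tgt, y_tgt)
```

Starting the target-domain training from `w_src` and `b_src` rather than from zeros is the point of the transfer: the few target-domain samples only need to adjust an already useful set of parameters.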
Claims (7)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210451618.4A CN114926303A (en) | 2022-04-26 | 2022-04-26 | Electric larceny detection method based on transfer learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114926303A (en) | 2022-08-19 |
Family
ID=82807494
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210451618.4A Pending CN114926303A (en) | 2022-04-26 | 2022-04-26 | Electric larceny detection method based on transfer learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114926303A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20220819 |