
2020 Joint International Conference on Digital Arts, Media and Technology with ECTI Northern Section Conference on

Electrical, Electronics, Computer and Telecommunications Engineering (ECTI DAMT & NCON)

Simple Load Disaggregation Library based on NILMTK

Kitisak Osathanunkul
Department of Digital Technology Innovation
Faculty of Science, Maejo University
Chiang Mai, Thailand
kit_o@mju.ac.th

Khukrit Osathanunkul*
Department of Information Technology
International College, Payap University
Chiang Mai, Thailand
osathank@gmail.com

978-1-7281-6398-7/20/$31.00 ©2020 IEEE

Abstract: Load disaggregation is a method for predicting appliance power usage from a household power meter reading. A powerful open-source tool for the load disaggregation task is the Non-Intrusive Load Monitoring Toolkit (NILMTK). The toolkit provides many features, such as built-in basic statistics, evaluation metrics and disaggregation algorithm comparison. However, it has a steep learning curve, which can make it difficult and troublesome for an inexperienced researcher to get started. Thus, this paper proposes a Simple Load Disaggregation (SLD) library that allows users to get started with the load disaggregation task quickly. The proposed library allows users to train a model and disaggregate an appliance load within a few lines of code. SLD modifies the disaggregation algorithms from NILMTK. It removes all unnecessary elements and focuses only on making the load disaggregation task clean and simple.

Keywords: non-intrusive load monitoring, nilm, load disaggregation, energy disaggregation.

I. INTRODUCTION

Load disaggregation is a technique for predicting the electricity used by an individual appliance from a household power meter reading. This technique is called Non-Intrusive Appliance Load Monitoring (NALM or NILM). The idea was first introduced in the mid-1980s by George Hart [1]. The technique uses pattern recognition and event-based methods. Another technique [2] makes use of combinatorial optimization. It learns the sum of appliance power consumption and then compares this sum with the household meter reading. These two techniques have been widely studied and have been developed as disaggregation algorithms in an open-source community project called the Non-Intrusive Load Monitoring Toolkit (NILMTK) [3].

NILMTK is an open-source Python toolkit for Non-Intrusive Load Monitoring (NILM). It is designed to allow researchers in this field to work to the same standard, so that different disaggregation algorithms can be benchmarked against each other using the toolkit. The features of NILMTK are not limited to load disaggregation; it also provides data statistics, algorithm benchmark tools, evaluation metrics, etc.

Because NILMTK has so many features, researchers need to study the NILMTK environment before they can get started. Even to use only the load disaggregation function in NILMTK, users must go through many steps. The whole NILMTK package needs to be installed. Users then need to learn how to convert a dataset into the NILMTK format and how to use the basic NILMTK commands and functions. Simply knowing the Python language is not enough; inexperienced users face a steep learning curve.

In this paper, Simple Load Disaggregation is introduced to fill the gap. It is designed for users with little to no experience in this area. Load disaggregation can be done by simply importing a library and training a model, after which an appliance load can be predicted right away. It skips all the unnecessary procedures required by NILMTK.

Section II discusses related work. SLD is introduced in Section III. Section IV compares SLD with the original NILMTK in several aspects, and Section V concludes the paper.

II. RELATED WORKS

A. NILMTK

The Non-Intrusive Load Monitoring Toolkit, or NILMTK [3], is a tool for analysing load usage in a building. It is an open-source toolkit written in Python and is available on GitHub with a large community. Before NILMTK was published, it was almost impossible to compare findings and experiments in the NILM literature. There was no standard for experiment setup, data acquisition, data format, etc. NILMTK was introduced to tackle these issues. It is designed to be a standard tool for NILM tasks and to be used among researchers. With NILMTK, researchers have guidelines on how measured data and predicted data should be collected, stored, compared, evaluated and even represented in a similar manner or format. Thus, disaggregation results can be compared and the performance of the algorithms used can be discussed.

NILMTK includes several features such as dataset conversion tools (to import a dataset into the NILMTK environment), several disaggregation algorithms and evaluation tools.

B. Public Dataset

In NILM, many public datasets are available for researchers to work on. NILM datasets consist of both main power meter readings and appliance power readings. The sampling interval of a dataset typically ranges from 1 second to several seconds, but some datasets also provide a higher sampling rate such as 16 kHz [4]. Popular NILM datasets are REDD [5], BLUED [6] and UK-DALE [4].

1) REDD: A public data set for energy disaggregation [5] was published in 2011. It was the first dataset available for NILM research. At that time, this dataset became a standard dataset for researchers to benchmark their NILM algorithms. The REDD dataset contains data from 6 different houses in Massachusetts, US. REDD provides both high and low frequency sampling rates. However, the lower-frequency data is more useful, as the higher-frequency data is often redundant.

2) BLUED: The Building-Level fUlly-labelled dataset for Electricity Disaggregation [6] was published by Anderson et al. from Electrical and Computer Engineering, Carnegie Mellon University, USA. The dataset provides both voltage and current data sampled at 12 kHz with a duration of one full



week. The strong point of this dataset is that the appliance power data is labelled with timestamps. That means ground truth can be used to confirm disaggregation results when evaluating an algorithm.

3) UK-DALE: UK recording Domestic Appliance-Level Electricity [4] is the first public UK dataset. The dataset comes with a sampling rate of 16 kHz for the whole-house power meter reading and one sample every 6 seconds for individual appliances. UK-DALE includes data from 5 houses over periods of 2 to 4 years. Such a long data collection period allows researchers to investigate the nature of the data at different times of the year. For example, during winter a heater tends to be switched on more often, or to consume more electricity, than in summer. This dataset also introduced a wireless device used for collecting its data.

C. Factorial Hidden Markov Model (FHMM)

The use of the Factorial Hidden Markov Model (FHMM) algorithm for load disaggregation was introduced by Zoha et al. [7]. The idea of the FHMM algorithm is to learn the load patterns of each appliance. Zoha's experiments show that, with 5 multi-state appliances, FHMM can achieve an F-measure of 0.614. In addition, Kolter and Johnson [8] also use FHMM for load disaggregation in their experiments. Their results on the REDD dataset show an average accuracy of 47.7%. Beyond the typical FHMM algorithm, many researchers have proposed variants to improve its performance. For example, Kim et al. apply probabilistic models with FHMM in [9], and Parson et al. adopt HMMs and modify the Viterbi algorithm in [10].

D. Combinatorial Optimisation (CO)

CO is another disaggregation algorithm, proposed by Hart [1]. CO minimises the difference between the sum of the predicted appliance loads and the household power meter reading by finding the optimal combination of appliance states. It computes the sum of the appliance power usage for a combination of states and then compares this sum with the household power meter reading.

III. SIMPLE LOAD DISAGGREGATION

Originally, the disaggregation algorithm modules in NILMTK are tied to the NILMTK environment. The modules deal with power data stored in the Hierarchical Data Format (HDF5). This format is well organized, but it can be complicated to understand. To allow inexperienced users to work without a steep learning curve, a modification of the modules is needed. Here, Simple Load Disaggregation (SLD) is introduced to fill this gap.

The SLD presented in this paper is based on the original NILMTK version 0.2. The purpose of this library is to make the load disaggregation task simple and easy to use for an online or near real-time application. It is also written in Python, and it focuses on using Python dataframes rather than HDF5 as in the original NILMTK. SLD is published as a GitHub repository [11].

There are two disaggregation algorithms provided in SLD: Factorial Hidden Markov Model (FHMM) and Combinatorial Optimisation (CO). Both algorithms are modified from NILMTK in a similar manner: they are changed to work with Python dataframes, and all unnecessary features apart from the load disaggregation function are removed.

The following subsections explain the requirements of SLD, how to prepare data, how to create and train a model, and finally how to disaggregate a whole-house power meter reading.

A. Software and Library Requirement

The disaggregation algorithms in SLD are modifications of the disaggregation algorithms of NILMTK, so SLD inherits most of the major requirements of the original. However, SLD requires only the software and libraries needed to perform load disaggregation: Python, numpy, pandas and hmmlearn.

B. Data Preparing and Preprocessing

Before getting started with the load disaggregation process, the data needs to be preprocessed into a certain format. To keep the task simple, the SLD library requires much less complicated data preprocessing. As in any machine learning preparation process, the SLD library requires two sets of data: training data and testing data.

1) Training data

Training data is the data used to train a model. A model can be more or less accurate depending on the training data fed to it. Here, training data must include both main meter data (the aggregated electricity meter reading) and individual appliance meter data.

To prepare a dataframe for training a model, both the main meter data and the appliance meter data must be contained in the same dataframe. It must include at least three columns: timestamp, power and one appliance meter reading. An example of the dataframe is shown below.

     timestamp   power   app1  app2  app3
0   1525689485   23.30      0     0     0
1   1525689490   318.4  295.1     0     0
2   1525689495   318.7  295.4     0     0
...

1.1) Timestamp

The first required column is the timestamp. It is used to identify the order in which events happened. Unlike in NILMTK, time zone and daylight saving time are not considered. The first timestamp is used as the reference point. It is recommended to keep the timestamp in Unix epoch format, i.e., a 10-digit decimal number. This column should be named "timestamp".

1.2) Power

The second column is the main meter data used for training. It is typically a reading from the mains of a household, or it can be a reading from a set of appliances, depending on how the meter data can be acquired. The unit of the reading can be watts or kilowatts. This column should be named "power".

1.3) Appliance

The rest of the columns are the readings for individual appliances. At least one appliance meter reading must be included in the dataframe, as it will be used to train a model.
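As a minimal sketch of the training layout described above, the following Python loads the sample rows into a pandas dataframe and checks that the "timestamp" and "power" columns and at least one appliance column are present. The in-memory CSV text, the helper name check_training_frame and its validation rules are assumptions made for illustration; they are not part of SLD.

```python
import io

import pandas as pd

# Hypothetical CSV content mirroring the sample rows; a real file path
# could be passed to pd.read_csv() instead of this in-memory buffer.
CSV_TEXT = """timestamp,power,app1,app2,app3
1525689485,23.30,0,0,0
1525689490,318.4,295.1,0,0
1525689495,318.7,295.4,0,0
"""


def check_training_frame(df):
    """Return the appliance column names after basic layout checks.

    Illustrative helper (not an SLD function): it only verifies the
    column layout that the text describes.
    """
    for required in ("timestamp", "power"):
        if required not in df.columns:
            raise ValueError(f"missing required column: {required}")
    appliances = [c for c in df.columns if c not in ("timestamp", "power")]
    if not appliances:
        raise ValueError("at least one appliance column is required")
    return appliances


train_df = pd.read_csv(io.StringIO(CSV_TEXT))
appliance_columns = check_training_frame(train_df)
print(train_df)
print(appliance_columns)
```

A testing dataframe would be expected to fail the appliance check in this sketch, since it carries only the timestamp and power columns.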


In Python, data from a CSV file can be imported into a dataframe by simply calling the read_csv() function from pandas.

2) Testing Data

Similar to the training data, testing data should contain the timestamp and the house meter power. However, it does not require individual appliance power data. The format of the testing dataframe is therefore similar to the following dataframe.

     timestamp  power
0   1525689485   23.3
1   1525689490  318.4
2   1525689495  318.7
...

The power data in the testing dataframe will later be disaggregated into individual appliance power data. The results of the disaggregation depend on the model. If the model has been trained with data for 3 appliances, the result will also contain disaggregated data for 3 appliances. The details of the model are discussed in the following subsection.

C. Creating and Training a Model

To disaggregate main meter data, a model needs to be trained first. In SLD, a model can be trained simply by calling the train() function. The function requires two arguments: a dataframe and a list of appliances. The dataframe is the one prepared in the previous data preprocessing step. The list of appliances must be stored as a Python list containing at least one appliance name to be trained. Each appliance name is stored as a string.

The list of appliances defines how many appliances are trained in the model, so that when the main power is disaggregated, the number of appliances in the result is the same as the number of appliances in the list. For example, if the three appliances 'kettle', 'microwave' and 'fan' are of interest, the Python list should be as follows.

list_of_appliance = ['kettle', 'microwave', 'fan']

In addition, each appliance name in the list should be exactly the same as the corresponding column name in the dataframe. During the training process, the appliances are used to train the model one by one until all appliances in the list have been included.

Given that "df" is a dataframe containing data for a house, "list_of_appliance" is a list of appliance names and fhmm is an object of the FHMM algorithm class, the training function can be called as follows.

fhmm.train(df, list_of_appliance)

After the train() function is called, the model is created. This model will later be used to disaggregate appliance power from the main power.

After a model is created (that is, after the first training), the model can be trained further with another dataframe containing data from another house. Doing so allows the model to learn from different data from different houses.

The model can be stored for later use. This is useful for further training, which helps to make the model more accurate. Because the saved model is small and fast to load, it can be used in a near real-time application. To save or store a model, it is recommended to use the Python "pickle" library. With pickle, the model is saved in a file with the "pkl" extension.

SLD also provides functions to save and load a model, save() and load(). For example, a model can be saved to and loaded from a file with the following commands.

fhmm.save("fhmm_trained_model")

fhmm.load("fhmm_trained_model")

On saving, the model is stored in a file called "fhmm_trained_model.pkl". Once the model is reloaded, it is ready to use, as it is the same model.

D. Load Disaggregation

Load disaggregation is the main function of this library. It is used to extract appliance meter data from main meter data. The resulting disaggregated appliance meter data depends on the trained model and the selected algorithm. A power meter reading is disaggregated by calling the disaggregate() function. This function requires only one argument, the testing dataframe. An example call is as follows.

prediction = fhmm.disaggregate(df)

This section has introduced SLD, including the functions and data format needed to get the load disaggregation task done. The next section discusses the differences between SLD and the original NILMTK.

IV. COMPARISONS BETWEEN SLD AND NILMTK

This section compares SLD with the original NILMTK in different aspects: software and library dependency, data preparation and preprocessing, disaggregation performance, and features and support.

A. Software and Library Dependency

NILMTK requires a lot of software and libraries for its feature-rich toolkit. Installing NILMTK for the first time can be troublesome. Dependency conflicts are the main cause of errors during the installation process. As time goes by, some libraries are upgraded to newer versions, while other libraries require a specific version of a dependency. This can cause the installation to fail, and some attention is needed to resolve the issue.

In contrast, SLD focuses only on load disaggregation and does not consider the details or environment of the data. SLD therefore requires only the few software packages and libraries needed for the disaggregation function to work: Python, numpy, pandas and hmmlearn. With so few requirements, SLD is easy to install or import and quick to get started with.
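The short requirement list above lends itself to a quick check before use. The sketch below reports which of the listed packages are installed in the current environment. The helper name installed_versions is hypothetical and not part of SLD; importlib.metadata is in the Python standard library from version 3.8 onward, so an older interpreter would need a different approach.

```python
from importlib import metadata

# Packages the text lists as SLD requirements (besides Python itself).
SLD_REQUIREMENTS = ["numpy", "pandas", "hmmlearn"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


report = installed_versions(SLD_REQUIREMENTS)
for name, version in report.items():
    print(f"{name}: {version if version is not None else 'not installed'}")
```

The function reports missing packages instead of raising, so the whole list can be reviewed in one pass.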


Table I lists the requirements of NILMTK and SLD. SLD clearly requires fewer software and library dependencies to get started with the load disaggregation task.

TABLE I. SOFTWARE REQUIREMENTS OF NILMTK AND SIMPLE LOAD DISAGGREGATION (SLD) LIBRARY

Software and library    NILMTK       SLD
Python                  >= 3.6       >= 3.6
Numpy                   >= 1.13.3    >= 1.13.3
Pandas                  >= 0.25.0    >= 0.25.0
Cython                  >= 0.27.3    Not required
Bottleneck              >= 1.2.1     Not required
Numexpr                 >= 2.6.4     Not required
Matplotlib              >= 3.1.0     Not required
Networkx                == 2.1       Not required
Scipy                   >= 1.0.0     Not required
Scikit-learn            >= 0.21.2    Not required
Hmmlearn                Any          Any
Pytables                Any          Not required
Jupyter                 Any          Not required
iPython                 Any          Not required
iPykernel               Any          Not required
Nose                    Any          Not required
Coverage                Any          Not required
Psycopg2                Any          Not required
Coveralls               Any          Not required

B. Data Preparation and Preprocessing

In NILMTK, this process can be complicated and confusing for an inexperienced researcher. It requires many steps and some learning to get started. Data needs to be arranged into a specific format called NILMTK-DF, described in [3]. The NILMTK-DF format is well organized and uses the Hierarchical Data Format (HDF5). It holds a lot of information, such as the metadata of a dataset, electricity meter readings, water meter readings, gas meter readings and even on-off switch data. The purpose of this kind of format is to preserve as much information as possible, so that other researchers can learn and understand when, where and how the data was obtained.

NILMTK does provide tools to convert some public datasets into the NILMTK-DF format, for example the well-known REDD, BLUED and UK-DALE datasets. Users can convert these datasets into the NILMTK-DF format with a single command. However, converting a custom dataset into the NILMTK-DF format is not an easy task. A deep understanding of the NILMTK structure is mandatory, and converting a custom dataset into the right format can be time consuming. In contrast, SLD requires data to be prepared in a simple Python dataframe. The format is clean and simple to understand. Users can simply prepare data in Microsoft Excel and import it into a dataframe easily. In this respect, SLD allows users to prepare data and get started quickly.

C. Disaggregation Performance

In terms of disaggregation performance, NILMTK and SLD predict exactly the same results. This is because SLD modifies the disaggregation algorithm modules from NILMTK. The modifications remove all unnecessary features and leave the disaggregation function itself untouched. As a result, the SLD library predicts an appliance load with the same results.

D. Feature and Supports

NILMTK has been published as a GitHub repository since 2014. At the time of writing, it has been forked almost 300 times and has over 20 contributors. It contains many features such as dataset conversion, power meter selection (in case one house has several meters), basic statistics, evaluation metrics, disaggregation algorithm comparison, etc. As this is an open-source community, NILMTK has volunteers who offer support when users or researchers get stuck at some point.

On the other hand, SLD is simple. It has only one feature: load disaggregation. SLD has far fewer features than NILMTK, but it is also easy to understand and to get going with.

V. CONCLUSIONS

The Simple Load Disaggregation (SLD) library is introduced. It predicts the electricity used by appliances from a house electricity meter reading. The main idea of SLD is to make the load disaggregation job easy. The disaggregation algorithms from NILMTK are modified: SLD removes all unnecessary elements and leaves the core of the disaggregation algorithms untouched. SLD is simple and easy to use, while it still produces the same results as the original NILMTK.

REFERENCES

[1] G. W. Hart. Prototype nonintrusive appliance load monitor. Technical report, MIT Energy Laboratory and Electric Power Research Institute, Sept. 1985.
[2] G. W. Hart. Nonintrusive appliance load monitoring. Proceedings of the IEEE, 80(12):1870–1891, Dec. 1992. doi:10.1109/5.192069.
[3] N. Batra, J. Kelly, O. Parson, H. Dutta, W. Knottenbelt, A. Rogers, A. Singh, M. Srivastava. NILMTK: An Open Source Toolkit for Non-intrusive Load Monitoring. In: 5th International Conference on Future Energy Systems (ACM e-Energy), Cambridge, UK, 2014.
[4] J. Kelly, W. Knottenbelt. The UK-DALE dataset, domestic appliance-level electricity demand and whole-house demand from five UK homes. Scientific Data, 2:150007, 2015. doi:10.1038/sdata.2015.7.
[5] J. Z. Kolter, M. J. Johnson. REDD: A public data set for energy disaggregation research. In: Workshop on Data Mining Applications in Sustainability (SIGKDD), San Diego, CA, USA, 2011, pp. 59–62.
[6] K. Anderson, A. Ocneanu, D. Benitez, D. Carlson, A. Rowe, M. Berges. BLUED: A Fully Labeled Public Dataset for Event-Based Non-Intrusive Load Monitoring Research. In: Proceedings of the 2nd KDD Workshop on Data Mining Applications in Sustainability (SustKDD), Beijing, China, 2012.
[7] A. Zoha, A. Gluhak, M. Nati, M. A. Imran. Low-power appliance monitoring using Factorial Hidden Markov Models. In: Proceedings of the 2013 IEEE 8th International Conference on Intelligent Sensors, Sensor Networks and Information Processing: Sensing the Future (ISSNIP '13), April 2013, pp. 527–532.
[8] J. Z. Kolter, M. J. Johnson. REDD: A public data set for energy disaggregation research. In: Workshop on Data Mining Applications in Sustainability (SIGKDD), San Diego, CA, USA, 2011, pp. 59–62.
[9] H. Kim, M. Marwah, M. Arlitt, G. Lyon, J. Han. Unsupervised disaggregation of low frequency power measurements. SDM, 2011, pp. 747–758.
[10] O. Parson, S. Ghosh, M. Weal, A. Rogers. Non-intrusive load monitoring using prior models of general appliance types. In: Proceedings of the 26th AAAI Conference on Artificial Intelligence and the 24th Innovative Applications of Artificial Intelligence Conference, July 2012, pp. 356–362.
[11] Kitisak O. Simple Load Disaggregation Python Library. GitHub repository, 2019. https://github.com/amzkit/load-disaggregation
