PDM For Conveyor Belts

Uploaded by

Owais Jafri

Received June 2, 2020, accepted June 26, 2020, date of publication July 2, 2020, date of current version July 15, 2020.
Digital Object Identifier 10.1109/ACCESS.2020.3006788

An Effective Predictive Maintenance Framework for Conveyor Motors Using Dual Time-Series Imaging and Convolutional Neural Network in an Industry 4.0 Environment
KAHIOMBA SONIA KIANGALA AND ZENGHUI WANG, (Member, IEEE)
Department of Electrical and Mining Engineering, University of South Africa, Johannesburg 1710, South Africa
Corresponding author: Zenghui Wang (wangzengh@gmail.com)
This work was supported in part by the South African National Research Foundation under Grant 112108 and Grant 112142, and in part by
the Tertiary Education Support Program (TESP) of South African ESKOM.

ABSTRACT The ascent of Industry 4.0 and smart manufacturing has emphasized the use of intelligent
manufacturing techniques, tools, and methods such as predictive maintenance. The predictive maintenance
function facilitates the early detection of faults and errors in machinery before they reach critical stages.
This study proposes an experimental predictive maintenance framework for conveyor motors
that efficiently detects a conveyor system's impairments and considerably reduces the risk of incorrect fault
diagnosis in the plant. We achieve this by developing a machine learning model that classifies
whether the abnormalities observed are production-threatening or not. We build a classification model using
a combination of time-series imaging and convolutional neural network (CNN) for better accuracy. In this
research, time-series represent different observations recorded from the machine over time. Our framework is
designed to accommodate both univariate and multivariate time-series as inputs of the model, offering more
flexibility to prepare for an Industry 4.0 environment. Because multivariate time-series are challenging to
manipulate and visualize, we apply a feature extraction approach called principal component analysis (PCA)
to reduce their dimensions to a maximum of two channels. The time-series are encoded into images via the
Gramian Angular Field (GAF) method and used as inputs to a CNN model. We add a parameterized
rectifier linear unit (PReLU) activation function option to the CNN model to improve the performance
of larger networks. Together, these features contribute to a robust,
future-proof predictive maintenance framework. The experimental results achieved in this study show the
advantages of our predictive maintenance framework over traditional classification approaches.

INDEX TERMS Convolutional neural network (CNN), Gramian angular field (GAF), industry 4.0 (I40),
predictive maintenance, principal component analysis (PCA), smart manufacturing, time-series imaging.

The associate editor coordinating the review of this manuscript and approving it for publication was Qinfen Lu.

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
VOLUME 8, 2020
K. S. Kiangala, Z. Wang: Effective Predictive Maintenance Framework for Conveyor Motors

I. INTRODUCTION
The recent explosion of smart manufacturing applications, the Internet of things (IoT), and big data has considerably increased the amount of data collected and analyzed in different areas such as health care, transportation, power energy, food and beverage, multimedia, environment, finance, and logistics. Several types of predictions, production forecasting, fault detection, and predictive maintenance result from analyzing various datasets [1], [2]. One of the most common data types collected in this growing era of Industry 4.0 is time-series data. Time-series data are observations sequentially recorded over time [3], [4].

Time-series data are intensively analyzed, as a preventive tool, in the manufacturing industry, where unforeseen failures of machinery can lead to very long production downtime and losses. Studying and analyzing data to detect faults and threats in devices before they occur, and taking appropriate measures to reduce the risk of failures, is called "predictive maintenance" [5]. As per [6], predictive maintenance is an ensemble of activities that detect any abnormal physical condition changes in equipment (signs of failure) in order to carry out the required maintenance tasks and boost the service life of equipment without increasing the risk of failure. In recent years, predictive maintenance has been the subject of much research aimed at improving it. One of the current innovative trends for this concept is the use of machine learning (ML) techniques in combination with advanced technological concepts to offer better predictive maintenance results.

Machine learning (ML) is a field of Artificial Intelligence (AI) that extracts useful insights from various data (here, time-series data) [7] through paradigms such as supervised learning, semi-supervised learning, unsupervised learning, and reinforcement learning [8]. It is also commonly described as the study of means by which machines can make correct decisions on their own and execute tasks without explicit assistance from human beings. Deep learning is a branch of ML that has the capability of extracting data representations. Some popular deep learning methods are Artificial Neural Network (ANN), Convolutional Neural Network (CNN), Deep Belief Network, Recurrent Neural Network, and Stacked Auto-Encoders [9]. In this research, we focus on CNN, a deep learning technique that tries to imitate the operations of a human brain, especially its ability to recognize and classify objects based on their appearance. This feature has made CNN the conventional method used for image classification and identification [10].

In 2015, [11] initiated an inventive approach that improved classification and imputation by encoding univariate time-series (UTS) data into images and using them as inputs to CNN models. The idea of transforming time series into images comes from computer vision. By learning spatially invariant features from raw time series (inputs to the model), the CNN method can reduce two risks associated with the traditional multi-layer perceptron approach: losing temporal information, and learning features that are no longer time-invariant [12]. The outcome of this study generated better results than traditional machine learning techniques for classification, such as decision tree (DT), random forest (RF), or Support Vector Machine (SVM). Since then, a few more studies have been conducted in the same vein, using time-series image encoding and deep learning approaches to improve classification modeling in various sectors. Reference [13] developed a similar framework that uses a Relative Position Matrix with CNN. The method was named RPMCNN and performs classification on 2D images transformed from the time-series data received as inputs. Their results displayed improved performance. In the manufacturing sector, [14] introduced an approach that uses multivariate time-series (MTS) data as input to a CNN model for tool wear classification. Because of the large volume of MTS data, and in order to ease data processing, this approach divided MTS inputs into three channels before converting them to images and feeding them into the CNN model. Reference [3] conducted further research in that direction by converting MTS data to colored images and feeding them as inputs to a CNN model for sensor classification. Their research encodes MTS data into multiple images that are combined into a single larger image used as an input to a CNN model.

Our research takes a step further on previous work done in the manufacturing sector on this innovative concept by developing an experimental framework that:
1) Generates accurate predictive maintenance flags for conveyor motors by classifying whether observed system parameter inputs are threats or not.
2) Combines the use of UTS and MTS in a single platform to increase the flexibility of the system, so that separate models are not needed.
3) Facilitates the input and manipulation of MTS data in CNN by reducing their size to two channels through a feature extraction method called principal component analysis (PCA).
4) Offers an option for a future-proof CNN model by using the parameterized rectifier linear unit (PReLU) configuration to improve the performance of larger networks.

Our paper is structured as follows: Section 2 presents a literature review of concepts such as time-series, deep learning, predictive maintenance with machine learning, and imaging time-series for classification. Section 3 describes the methodologies, technological approaches, and architecture of our predictive maintenance framework. Section 4 presents the experimental results obtained, and the conclusion and suggestions for future research are provided in Section 5.

II. LITERATURE REVIEW
A. TIME-SERIES DATA
As mentioned previously, a time series can be defined as a sequence of observations recorded over successive time points [15]. Time series data can be grouped into two main categories: Univariate Time Series (UTS) and Multivariate Time Series (MTS). A UTS is composed of a single variable observed over a regular period of time. An MTS is made of two or more variables recorded over a successive period of time [16].

Equation (1) is a mathematical representation of a UTS:

$$B = [b_1, b_2, b_3, \cdots, b_n, \cdots, b_t] \tag{1}$$

where $b_n \in \mathbb{R}$ and $t \in \mathbb{N}$ represents the size of the time series data.

On the other hand, (2) is an expression for an MTS:

$$D = [B_1, B_2, B_3, \cdots, B_i, \cdots, B_m] \tag{2}$$

where $m \in \mathbb{N}$ represents the size of the MTS, which is equal to the number of univariate time series in D, and i is the unique position identification for each UTS in D. As per (2), D contains several UTS similar to those defined in (1). For an MTS D regrouping a number of UTS B, a single UTS object can be defined by (3) as:

$$B_i = [b_{i(1)}, b_{i(2)}, b_{i(3)}, \cdots, b_{i(n)}, \cdots, b_{i(t)}] \tag{3}$$

where $t \in \mathbb{N}$ is the size of the UTS [17], [18] and i is the unique position identification for each UTS in the MTS.


From these mathematical representations, we can conclude that a UTS represents a vector while an MTS represents a matrix (a combination of multiple vectors). Based on the above parameters, let us assume an MTS D where t = 4 and m = 4. The MTS data can be defined by (4) in a matrix format as:

$$D = \begin{bmatrix} b_{11} & b_{21} & b_{31} & b_{41} \\ b_{12} & b_{22} & b_{32} & b_{42} \\ b_{13} & b_{23} & b_{33} & b_{43} \\ b_{14} & b_{24} & b_{34} & b_{44} \end{bmatrix} \tag{4}$$

The graphical representations of UTS and MTS are displayed in Fig. 1 and Fig. 2 respectively.

FIGURE 1. Example of a UTS graphical representation.

FIGURE 2. Example of a MTS graphical representation.

In Fig. 1, a single time series variable (UTS) named 'Var1', with values ranging from 0 to 6, is recorded between 05 May 2016 and 26 September 2016. Var1 can represent any parameter measured over an interval. The main difference between Fig. 1 and Fig. 2 is simply the number of variables monitored over time. Fig. 2 records three different variables: Var1, Var2, and Var3 (MTS). Var1 to Var3 represent any parameter values observed over the same period. More complex MTS have more than three variables.

B. CONVERTING TIME-SERIES DATA TO IMAGES
Transforming time series to images is one type of data transformation. One notable data transformation approach is to reduce the size or dimension (dimension reduction) of massive datasets, from high-dimensional data of more than three features to only two (2-D) or sometimes three dimensions (3-D), providing a better understanding of the data, especially when it comes to visualization [24], [25]. Another data transformation method, which works in the opposite direction, is "dimension augmentation": it involves increasing the size of a particular dataset, for example going from 1-dimensional (1-D) data to 2-D or even 3-D. Dimension augmentation is a crucial step, especially when considering using a CNN model, which we intend to do in this study. Reference [11] introduced an approach named Gramian Angular Field (GAF) for encoding time series into images to improve classification and imputation. GAF uses a polar-coordinate-based matrix to encode time series into images, since polar coordinates have the advantage of preserving temporal correlation, unlike Cartesian coordinates [11]. GAF can generate two types of images: Gramian angular summation field (GASF) and Gramian angular differential field (GADF). The steps to obtain GAF images are as follows:

1) NORMALIZING TIME SERIES DATA INPUT
The original time series (1) (as described in the previous section) is scaled or normalized to values in the interval [−1, 1]. The normalization method is defined in (5):

$$\tilde{b}_{-1}^{\,i} = \frac{(b_i - \max(B)) + (b_i - \min(B))}{\max(B) - \min(B)} \tag{5}$$

where $\tilde{b}_{-1}^{\,i}$ is the scaled or normalized value of each original time series observation $b_i$ (written $\tilde{b}_i$ below).

2) CONVERTING SCALED TIME-SERIES DATA TO POLAR COORDINATES
The second step of imaging time series with the GAF method consists of representing the normalized time series $\tilde{B}$ in polar coordinates. The polar coordinates are computed by taking the arccosine of each normalized value as the angle and representing the time stamp as the radius. The polar coordinates are defined by (6) and (7):

$$\theta = \arccos(\tilde{b}_i), \quad -1 \le \tilde{b}_i \le 1, \ \tilde{b}_i \in \tilde{B} \tag{6}$$

$$r = \frac{t_i}{N}, \quad t_i \in \mathbb{N} \tag{7}$$

In (6), θ represents the time series value of each observation in the polar coordinates format. In (7), $t_i$ represents the time stamp of the time series data and N is a constant factor that stabilizes the polar coordinate system's space.

3) FINDING THE GRAMIAN ANGULAR SUMMATION/DIFFERENCE FIELD (GASF & GADF)
After obtaining the polar coordinates of the time series, we make use of the trigonometric sum and difference to find the spatial correlation between each pair of polar points and determine their GASF and GADF. The GASF mathematical representation is presented in (8) and (9). GADF is defined in (10) and (11).

$$\mathrm{GASF} = [\cos(\theta_i + \theta_j)] \tag{8}$$

$$\mathrm{GASF} = \tilde{b}' \cdot \tilde{b} - \left(\sqrt{1 - \tilde{b}^2}\right)' \cdot \sqrt{1 - \tilde{b}^2} \tag{9}$$

$$\mathrm{GADF} = [\sin(\theta_i - \theta_j)] \tag{10}$$

$$\mathrm{GADF} = \left(\sqrt{1 - \tilde{b}^2}\right)' \cdot \tilde{b} - \tilde{b}' \cdot \sqrt{1 - \tilde{b}^2} \tag{11}$$

A popular mathematical representation of GAF is done in a matrix format and defined by (12) and (13):

$$\mathrm{GASF} = \begin{bmatrix} \cos(\theta_1 + \theta_1) & \cdots & \cos(\theta_1 + \theta_n) \\ \vdots & \ddots & \vdots \\ \cos(\theta_m + \theta_1) & \cdots & \cos(\theta_m + \theta_n) \end{bmatrix} \tag{12}$$

$$\mathrm{GADF} = \begin{bmatrix} \sin(\theta_1 - \theta_1) & \cdots & \sin(\theta_1 - \theta_n) \\ \vdots & \ddots & \vdots \\ \sin(\theta_m - \theta_1) & \cdots & \sin(\theta_m - \theta_n) \end{bmatrix} \tag{13}$$

A graphical representation of the steps to convert time series to GAF is displayed in Fig. 3.

FIGURE 3. From time series to Gramian angular fields process.

In this research, we focus on the GAF method for image encoding since it preserves the temporal correlation of time series data inputs, which is needed for our predictive maintenance framework.

C. CONVOLUTIONAL NEURAL NETWORK (CNN)
Convolutional neural network (CNN) is a deep learning algorithm successfully used for image classification problems. The outstanding performance of CNN in image classification (computer vision) is due to its ability to extract meaningful spatial correlations and create feature information from input data, which is then used to detect patterns. The animal visual cortex was the inspiration behind the CNN concept, following the work of Hubel and Wiesel, two neurophysiologists who did extensive research on the visual cortical neurons of monkeys and cats [26]. The first modern CNN framework was called LeNet and was published by [27]. After this first model, several other successful architectures followed, such as ResNet [28], AlexNet [10], VGGNet [29], and Inception v3 [30].

An underlying CNN architecture has the following layers:

1) A CONVOLUTIONAL LAYER
This layer extracts the input image features by using some filters (feature detectors) and generating a smaller image containing the original input image features. The result of the convolutional layer is called a feature map. Before going to the next layer, in most CNN architectures, an activation function is applied to the feature maps to increase the non-linearity of the image (useful because most images have predominantly non-linear features). One of the most popular activation functions used in deep learning for the past few years is the Rectifier Linear Unit (ReLU) [31].

2) A POOLING LAYER
The pooling layer's objectives are to generate a spatially invariant feature for the image (the ability to recognize the image in positions different from the input image) and to reduce the size of the feature maps. One standard pooling method is "Max Pooling" [32]. Many other convolutional and pooling layers can be added before the flattening layer to improve the accuracy of a CNN model.

3) A FLATTENING LAYER
The flattening layer converts the pooled feature map matrix (2D) into a vector (1D) input for the neural network (next layer) [3].

4) A FULL CONNECTION LAYER
The full connection layer is a neural network composed of several layers of neurons interconnected through synapses and converging to the final outputs. The full connection layer is where all the classification intelligence of the CNN happens. The first layer of neurons receives its input from the previous flattening layer, and the signal goes through several hidden layers before producing the results.

Note: One way to improve a CNN model's accuracy is to pass through the model forward and backward several times, adjusting the weights (iterating over the dataset) based on the obtained output results until the desired accuracy is achieved. The number of times the dataset is iterated is called the number of epochs [33].

A graphical representation of the basic structure of a CNN architecture is presented in Fig. 4.

FIGURE 4. Basic structure of CNN architecture.

D. PREDICTIVE MAINTENANCE APPROACHES USING MACHINE LEARNING TECHNIQUES
Predictive maintenance is one advantageous approach to ensure smooth and reliable operations of production processes. In recent years, researchers have conducted many studies to improve predictive maintenance techniques. One innovative trend introduced in the research field is the use of machine learning techniques to improve this concept's outcomes. A study was proposed by [19] to apply ML for the predictive maintenance of gas turbines. They focused on using SVM and Regularized Least Square (RLS) to predict the appropriate maintenance time of the gas turbines based on their speed, compressor decay, and gas turbine decay. The results showed that the SVM outperformed the RLS model. Another method, proposed by Leahy et al. [20], uses SVM to perform predictive maintenance on wind turbines. Their approach used ML to create a classification model for six faults in wind turbines: fault/no-fault, feeding faults, air cooling faults, excitation faults, generator heating faults, and mains failure. By using SVM hyperparameter optimization by randomized search, they reached better results on detecting generator heating faults, correctly classifying 100% of the cases. However, on the fault/no-fault dataset, they got only 90% recall and 8% precision. Susto et al. [21] introduced another ensemble approach to detect the best moment to replace tungsten filaments during ion implantation, a step in the process of manufacturing semiconductors. The authors tested ensemble predictive maintenance classifiers built with K-Nearest Neighbor (KNN) and with SVM; the predictive maintenance with SVM gave slightly better results than the KNN approach. In [22], the authors used the random forest (RF) ML technique to generate a predictive maintenance approach for a cutting machine. The RF model used different rotor statuses of the cutting machine to perform classification in the predictive maintenance scheme. Kulkarni et al. [23] worked on a refrigeration and cold storage system by developing an ML-based approach that performs predictive maintenance by detecting early faults on the machinery involved in the refrigeration. They applied a feature extraction step in the pre-processing phase of the model, which consisted of learning the pattern of the dataset and seasonality decomposition by dynamic time warping and clustering. They also built an RF classifier to recognize whether the pattern was abnormal or not.


Following the same path to improve the quality of predictive maintenance approaches implemented for systems, our study applies CNN modeling that can extract feature representations to offer better results than traditional ML techniques. Unlike traditional ML techniques such as RF or SVM, CNN has the advantage of accommodating a very high number of features by quickly determining which ones have higher weights (more influential to the system) than the others, therefore eliminating unnecessary ones. In this era of IoT and big data, where massive amounts of data are available every day, it is convenient to integrate such a feature into modeling techniques.

Having a sound theoretical background on all useful concepts used in this study, let us have a detailed look at the methodology applied to construct the predictive maintenance framework. This methodology focuses on reducing the dataset dimension (PCA) and using CNN to achieve accurate classification.

III. PREDICTIVE MAINTENANCE FRAMEWORK METHODOLOGY
This experimental predictive maintenance framework aims to classify conveyor motor states as dangerous or not dangerous by encoding time series as images and feeding them into a CNN model that performs the classification task. The framework consists of the following stages:

1) FEEDING STAGE
To accommodate dual time-series types, this stage is responsible for separating MTS and UTS inputs. This stage has two inputs. The feeding stage has a sub-stage for MTS data inputs. The sub-stage is called the "Dimensionality Reduction Stage" and aims to reduce the size of MTS inputs to two channels using an approach called principal component analysis (PCA). By reducing the size of MTS data, the system's complexity decreases and the performance improves (data processing volume reduces considerably).

2) IMAGING STAGE
At this stage, time series received from the feeding step (either a UTS or a reduced MTS) are converted into images using the GAF method.

3) CNN CLASSIFICATION MODELING STAGE
This stage receives encoded images from the previous step and performs a classification task using the CNN method. In this research, we add an option in the CNN model that uses the Parameterized Rectifier Linear Unit (PReLU) activation function to improve the non-linearity feature of input images and to achieve better accuracy at the output when using extensive input networks. Since we built our predictive model for small manufacturing industries, the performance results obtained using CNN with the classic rectifier linear unit (ReLU) and with PReLU are very similar.

Note: Although the performance improvement between CNN models with ReLU and those using PReLU has been shown by some authors to be very small (about 1% to 2% accuracy improvement), this could make a massive difference in the manufacturing environment, where availability of systems and machines is essential to production. An extra 1% of accuracy could be the information needed to avoid chaos in the plant.

A. PRINCIPAL COMPONENT ANALYSIS (PCA) TECHNIQUE
PCA is an unsupervised learning method, meaning that it does not use the dependent variable to perform its operations. A. Yunusa-Kaltungo et al. [54] describe a similar approach to reduce data dimension for fault diagnostics on rotating machines. To achieve dimensionality reduction using PCA, we go through the following steps:

1) PRE-REDUCTION OF DATASET DIMENSION
In this study, we consider datasets of time series variables, which are observations of different conveyor motor parameters indicating threats to the system. These could be observations of any other system. Our experimental data is composed of 12 parameters: 11 observed variables (vibration speed, motor torque, acceleration, motor speed, air pressure, product weight, deceleration, current, belt tension, motor tension, and temperature) and one outcome, which is the type of fault detected in the system. Each parameter has about 15,000 observations or values recorded during a specific interval. We express the overall dataset dimension as p + 1, where p is the number of independent time series variables and 1 accounts for the dependent variable or label (in our case, the type of fault generated). We discard the label and keep p as the new dimension of our dataset; in this case, p = 11.

2) CALCULATE THE AVERAGE OF EVERY DIMENSION OF THE NEW DATASET
Since the new size of our dataset is p = 11, the dataset is composed of eleven time series variables or eleven vectors of observations. In this research, they can be detailed as follows, where $p_{k,i}$ denotes the i-th observation of variable k:

$$P1\,(\text{vibration speed}) = [p_{1,1}, p_{1,2}, \cdots, p_{1,n}]$$
$$P2\,(\text{motor torque}) = [p_{2,1}, p_{2,2}, \cdots, p_{2,n}]$$
$$P3\,(\text{acceleration}) = [p_{3,1}, p_{3,2}, \cdots, p_{3,n}]$$
$$P4\,(\text{motor speed}) = [p_{4,1}, p_{4,2}, \cdots, p_{4,n}]$$
$$P5\,(\text{air pressure}) = [p_{5,1}, p_{5,2}, \cdots, p_{5,n}]$$
$$P6\,(\text{product weight}) = [p_{6,1}, p_{6,2}, \cdots, p_{6,n}]$$
$$P7\,(\text{deceleration}) = [p_{7,1}, p_{7,2}, \cdots, p_{7,n}]$$
$$P8\,(\text{current}) = [p_{8,1}, p_{8,2}, \cdots, p_{8,n}]$$
$$P9\,(\text{belt tension}) = [p_{9,1}, p_{9,2}, \cdots, p_{9,n}]$$
$$P10\,(\text{motor tension}) = [p_{10,1}, p_{10,2}, \cdots, p_{10,n}]$$
$$P11\,(\text{temperature}) = [p_{11,1}, p_{11,2}, \cdots, p_{11,n}]$$

where n is the length of each time series variable.

In this experiment, let us assume n = 15,000. We can generate a matrix (14) of size n × p (15,000 × 11), with one column per variable and one row per observation, representing the new dataset as:

$$D = \begin{bmatrix} p_{1,1} & \cdots & p_{6,1} & \cdots & p_{11,1} \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ p_{1,7500} & \cdots & p_{6,7500} & \cdots & p_{11,7500} \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ p_{1,15000} & \cdots & p_{6,15000} & \cdots & p_{11,15000} \end{bmatrix} \tag{14}$$

The average of each dimension of the dataset D can be computed by equation (15):

$$\mathrm{Avg}P1 = \frac{\sum_{i=1}^{n} p_{1,i}}{n} = \overline{P1} \tag{15}$$

From (14) and (15), the average matrix originating from D can be summarized as follows:

$$\bar{D} = \left[ \frac{\sum_{i=1}^{n} p_{1,i}}{n} \ \cdots \ \frac{\sum_{i=1}^{n} p_{6,i}}{n} \ \cdots \ \frac{\sum_{i=1}^{n} p_{11,i}}{n} \right] \tag{16}$$

$$\bar{D} = [\overline{P1} \ \cdots \ \overline{P6} \ \cdots \ \overline{P11}] \tag{17}$$

3) GENERATE THE VARIANCE-COVARIANCE MATRIX OF THE DATASET D
The variance-covariance matrix, or covariance matrix, is computed by establishing a variance relationship between each pair of variables of the dataset with the following formula:

$$VC(P1, P2) = \frac{1}{n} \sum_{i=1}^{n} \left(p_{1,i} - \overline{P1}\right)\left(p_{2,i} - \overline{P2}\right) \tag{18}$$

The variance-covariance matrix is a square matrix of size p × p; in this research, its size is 11 × 11. Table 1 is a sample of the variance-covariance matrix. For space's sake, we illustrate a sample of the matrix with some of the vectors.

TABLE 1. Variance-covariance vector relationship.

4) FIND THE EIGENVALUES AND THEIR EIGENVECTORS
An Eigenvector is, in simple terms, a vector whose direction does not change when the linear transformation represented by the matrix is applied to it [34]. Let us assume our square variance-covariance matrix to be defined by (19):

$$CVM = \begin{bmatrix} VC(P1, P1) & \cdots & VC(P1, P6) & \cdots & VC(P1, P11) \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ VC(P6, P1) & \cdots & VC(P6, P6) & \cdots & VC(P6, P11) \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ VC(P11, P1) & \cdots & VC(P11, P6) & \cdots & VC(P11, P11) \end{bmatrix} \tag{19}$$

The mathematical expression to find Eigenvalues for the CVM matrix can be presented as follows:

$$\det(CVM - \lambda I) = 0 \tag{20}$$

where λ is the Eigenvalue associated with CVM and I is the identity matrix. An identity matrix corresponding to (19) can be expressed as follows:

$$I = \begin{bmatrix} 1 & \cdots & 0 & \cdots & 0 \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ 0 & \cdots & 1 & \cdots & 0 \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ 0 & \cdots & 0 & \cdots & 1 \end{bmatrix} \tag{21}$$

The identity matrix is also a square matrix with the same size as the CVM (11 × 11). Substituting (19) and (21) into (20) and computing the operation results in an eleventh-degree equation with λ as the unknown. The equation can be represented as:

$$a\lambda^{11} + b\lambda^{10} + \cdots + h\lambda^{4} + i\lambda^{3} + j\lambda^{2} + k\lambda + l = 0 \tag{22}$$

After solving (22), eleven values are found: $\lambda_1, \lambda_2, \lambda_3, \lambda_4, \lambda_5, \lambda_6, \lambda_7, \lambda_8, \lambda_9, \lambda_{10}$ and $\lambda_{11}$; these are the Eigenvalues of our dataset matrix. The next step is to find the corresponding Eigenvector for each Eigenvalue. Let us assume the following Eigenvector for each Eigenvalue:

$$\lambda_k \rightarrow E_k = [e_{k,1}, e_{k,2}, \cdots, e_{k,11}], \quad k = 1, 2, \cdots, 11$$

5) REDUCE DATASET DIMENSION BY KEEPING EIGENVECTORS WITH HIGHEST EIGENVALUES
The Eigenvector with the smallest Eigenvalue carries the least information about our data. To effectively reduce the dimension of the dataset, we focus on the Eigenvectors corresponding to the highest Eigenvalues. Since we would like to reduce the size p = 11 to a dimension of 2 (two-channel input), we only select the two highest Eigenvalues and their Eigenvectors. If we consider $\lambda_1$ and $\lambda_2$ to be the two Eigenvalues with the highest values, with $\lambda_1 > \lambda_2$, their corresponding Eigenvectors can be combined into a new matrix (23):

$$G = \begin{bmatrix} e_{1,1} & e_{1,2} & \cdots & e_{1,11} \\ e_{2,1} & e_{2,2} & \cdots & e_{2,11} \end{bmatrix} \tag{23}$$
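The PCA steps above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names and the toy 4 × 3 dataset are hypothetical, power iteration with deflation is swapped in for solving the characteristic polynomial (22), and the data are centered before the final projection onto the two leading Eigenvectors (common PCA practice, assumed here).

```python
import math
import random


def covariance_matrix(d):
    """Return (CVM, centered data) for an n x p dataset, using the 1/n factor of (18)."""
    n, p = len(d), len(d[0])
    means = [sum(row[k] for row in d) / n for k in range(p)]
    centered = [[row[k] - means[k] for k in range(p)] for row in d]
    cvm = [[sum(r[a] * r[b] for r in centered) / n for b in range(p)]
           for a in range(p)]
    return cvm, centered


def mat_vec(m, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def leading_eigenpair(m, iters=500, seed=0):
    """Power iteration: dominant eigenpair of a symmetric positive semi-definite matrix."""
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in m]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iters):
        w = mat_vec(m, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # zero matrix: keep the current unit vector
            break
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(m, v)))  # Rayleigh quotient
    return eigenvalue, v


def top_two_eigenvectors(cvm):
    """Return ((lam1, lam2), G), where G stacks E1 and E2 as rows (2 x p)."""
    lam1, e1 = leading_eigenpair(cvm, seed=0)
    p = len(cvm)
    # Deflation: subtract lam1 * E1 * E1^T so the next dominant pair is (lam2, E2).
    deflated = [[cvm[a][b] - lam1 * e1[a] * e1[b] for b in range(p)]
                for a in range(p)]
    lam2, e2 = leading_eigenpair(deflated, seed=1)
    return (lam1, lam2), [e1, e2]


def project(centered, g):
    """Multiply the centered n x p data by G^T, giving an n x 2 reduced dataset."""
    return [[sum(x * e for x, e in zip(row, vec)) for vec in g] for row in centered]


# Toy dataset: 4 observations of 3 variables, constructed so that the
# covariance Eigenvalues are 18, 8 and 1.
data = [
    [15.0, 21.0, 2.0],
    [11.0, 25.0, 0.0],
    [9.0, 15.0, 0.0],
    [5.0, 19.0, 2.0],
]
cvm, centered = covariance_matrix(data)
(lam1, lam2), G = top_two_eigenvectors(cvm)
Z = project(centered, G)
print(round(lam1, 6), round(lam2, 6))  # 18.0 8.0
print(len(Z), len(Z[0]))               # 4 2
```

For the 15,000 × 11 conveyor dataset the same functions would return an 11 × 11 covariance matrix, a 2 × 11 matrix G, and a 15,000 × 2 reduced dataset. Power iteration is used here only to keep the sketch dependency-free; a numerical library's symmetric eigen-solver would be the usual choice in practice.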


The reduced dimension of dataset D is computed by the


following expression:
Z = DGT (24)
where GT is the transpose of matrix G. GT is defined in (25):
 
e11 e21
 e12 e22 
G= . ..  (25)
 
 .. . 
e111 e211

The multiplication of D (14), a 15,000 × 11 matrix, by GT ,


a 11 × 2 matrix, by (14) results in new matrix of size 15,000 FIGURE 5. ReLU graphical representation.
× 2, with 2 the number of columns of the reduced dataset.
The same principal applies for reducing the dimension of any
other dataset depending on its size.

B. PARAMETRIC RECTIFIED LINEAR UNIT (PRELU) ACTIVATION FUNCTION OPTION TO IMPROVE ACCURACY OF LARGER NETWORKS IN CNN MODEL
Improving model performance in machine learning usually implies building more powerful models and designing effective strategies against overfitting. In recent years, several strategies have been applied to neural networks to create models more capable of fitting training data: smaller strides [29], [35]–[37], greater depth [29], [38], new nonlinear activations [39]–[44], enlarged width [35], [36], sophisticated layer designs [38], [45], etc. Researchers have also introduced other advanced approaches to achieve improved generalization, such as aggressive data augmentation [10], [29], [38], [46], large-scale data [10], [29], and effective regularization techniques [44], [47]–[49]. The Rectified Linear Unit (ReLU), a rectifier neuron [31], [39]–[41], is one of the approaches that has furthered the success of deep networks [10].

We utilize CNN modeling to perform the classification task for the predictive maintenance framework resources. In this research, we add an optional section to improve the quality of CNN models built for extensive networks by replacing the traditional Rectified Linear Unit (ReLU) activation function with the Parametric Rectified Linear Unit (PReLU). Using PReLU, we improve the model data fitting capability with reduced risk of overfitting when training those models and achieve better accuracy than using traditional ReLU. The PReLU function incorporates ReLU and adds extra parameters that make this technique appropriate for deep networks [50]. As previously mentioned, the PReLU activation function is useful for vast and deep networks, in which it improves performance (accuracy). In this research, the improvement is negligible because we work with data from a small manufacturing entity. However, the CNN model we build in this study offers a future-proof option for a more substantial amount of data.

Equation (26) is a mathematical expression for an activation function:

$$f(z_i) = \begin{cases} z_i & \text{if } z_i > 0 \\ c_i z_i & \text{if } z_i \le 0 \end{cases} \tag{26}$$

When $c_i$ is equal to zero, (26) defines a traditional ReLU; when $c_i$ is a nonzero, learnable parameter, (26) represents PReLU. The graphical representation of (26) for both ReLU and PReLU is presented in Fig.5 and Fig.6 respectively.

FIGURE 6. PReLU graphical representation.

From (26), we define $z_i$ as the input of the activation function f on the ith input channel and $c_i$ as the coefficient of the slope. (26) can further be expanded as:

$$f(z_i) = \max(0, z_i) + c_i \min(0, z_i) \tag{27}$$

To train PReLU, the back propagation method can be used [51]. Its optimization is done simultaneously for all layers [50]. Equation (28) is used to find the gradient of $c_i$ for one layer:

$$\frac{\partial \varepsilon}{\partial c_i} = \sum_{z_i} \frac{\partial \varepsilon}{\partial f(z_i)} \frac{\partial f(z_i)}{\partial c_i} \tag{28}$$

From (28), ε represents the objective function and $\frac{\partial \varepsilon}{\partial f(z_i)}$ the gradient propagated from the deeper layer of the neural network. To find the gradient of the activation function, (29) is used:

$$\frac{\partial f(z_i)}{\partial c_i} = \begin{cases} 0 & \text{if } z_i > 0 \\ z_i & \text{if } z_i \le 0 \end{cases} \tag{29}$$

The next section presents the workflow and steps of our predictive maintenance framework and the main components used for the overall architecture of the system. Elements detailed in previous sections (UTS, MTS, PCA, and CNN


steps) are backbones of the final architecture of the framework.

FIGURE 7. Flow chart diagram.

IV. PREDICTIVE MAINTENANCE FRAMEWORK FLOW DIAGRAM AND OVERALL ARCHITECTURE
The flow diagram that summarizes operations of our experimental predictive maintenance framework is presented in Fig.7. The overall predictive maintenance framework is displayed in Fig.10.

V. EXPERIMENTAL RESULTS
We used data from the conveyor system of a small manufacturing plant to test the effectiveness of our experimental framework. The conveyor system is composed of a conveyor AC motor, a variable frequency drive (VFD), and a conveyor belt with its components. Data preparation is the first step of our experimental framework process. The small manufacturing plant previously always relied on the motors' vibration speed readings to initiate predictive maintenance on them. Depending on the type of machinery (motor size), vibration thresholds determine their states: normal, warning, or alarm. Fig.8 illustrates the vibration velocity warning and alarm states.

FIGURE 8. Machine vibration trend according to ISO 10816 [52].

One of the simplest ways to study and understand a vibration signal's behavior is to consider it as a sine wave with all its characteristics: amplitude (displacement), period, and frequency, as displayed in Fig.9. The velocity, or speed, of a vibration is the first derivative of the amplitude (displacement), that is, the change in displacement over a certain amount of time. When analyzing vibrations on their own in systems and equipment, it is not easy to automatically establish whether the recorded vibrations are harmless to the smooth operation of the system. Several rules, such as Table 2, have been developed to address this issue. Another critical practice in this regard is frequent vibration monitoring, which is useful for detecting early impairment in the machinery. The vibration velocity is measured by special vibration sensors (analog signal input) and stored in a controller.

FIGURE 9. Vibration as a sine wave [55].

A machine breaks when exposed to several deformations, which are, in reality, changes in amplitude (displacement) that repeatedly occur at a specific frequency. In other words, the severity of an impairment like vibration depends on its displacement and its frequency. As previously mentioned, velocity is also a function of displacement and time (frequency); therefore, the velocity of a vibration can be considered an excellent indicator of vibration severity. For machines operating between 10 Hz and 1000 Hz, vibration velocity is a good indicator of severity. For those running at frequencies above 1000 Hz, acceleration is one of the most reliable measures of vibration severity [56].
The vibration severity criteria in Table 2 depend on four classes related to the type and size of the motor used.
Class I is for small-sized equipment (from 0 to 15 kW).
Class II is for medium-sized equipment (from 15 to 75 kW).
Class III is for large-sized equipment (powered > 75 kW) mounted on "Rigid Support" structures and foundations.
Class IV is for large-sized machines (powered > 75 kW) mounted on "Flexible Support" structures [5].


TABLE 2. Vibration severity criteria based on ISO 2372 [5].

TABLE 3. Time-series dataset variables.

The small manufacturing plant in this experimental research uses Class II (medium-sized) equipment. From Table 2, any vibration velocity above 4.5 mm/s indicates a high severity of system faults. However, the plant supervisors observed that not every "critical" vibration speed resulted in a "critical" fault or in a need to perform predictive maintenance on the system.
Conversely, not all apparently "non-critical" vibration speeds were safe for the conveyor motors, since some still caused critical or minor faults. For a long time, these false diagnoses caused either unnecessary, repetitive maintenance costs, because the existing system requested a predictive maintenance action when none was required, or prolonged failures, because critical faults in need of immediate maintenance were misinterpreted. There was, therefore, a need to take more than one parameter (motor vibration speed) into consideration to accurately determine predictive maintenance schedules. Several other parameters, whose combination supervisors observed to potentially influence system failure, were recorded from the conveyor system over a successive period using specialized sensors (IoT sensors), VFD readings, and controller input/output (I/O) readings. They are part of the time-series dataset and are presented in Table 3. For confidentiality reasons, the small manufacturing plant did not disclose its identity or the overall data. The conveyor system operates at three main speeds controlled by the VFD: f1 = 15 Hz, f2 = 30 Hz, and f3 = 50 Hz. The sampling frequency used for the experiment is fs = 2.56 f3, that is, fs = 128 Hz. Plant operators detected three primary types of faults in the conveyor system:
- Imbalance: Unattended broken parts in the conveyor system result in an imbalance in the running machinery that causes vibrations. This fault goes unnoticed at the lowest speed, but it becomes very severe when operating at 30 Hz and worst at 50 Hz, causing vibration velocities in the unsatisfactory range of Table 2 for Class II. Undesirable vibrations reduce the accuracy of the conveyor system. An excessively vibrating conveyor used in a bottling plant spills the contents of containers, stops them at undesired spots, makes them fall on the conveyor, and creates congestion in the chain.
- Misalignment: Because of the continuous motion of the conveyor system, the machine's shaft quickly gets out of line and causes a misalignment fault that results in undesirable vibrations.
- Looseness: In normal conditions, the conveyor system's structure and components need to be stiff and solid to operate smoothly. A decrease in stiffness or loose parts cause vibrations in the system. Vibrations caused by this fault are not very severe and can remain unnoticed without proper monitoring. In the small manufacturing plant, this is the least severe of all three faults.
Misalignment or looseness, individually, causes a minor fault in the plant (lower vibrations nearing the unsatisfactory border: 4.5 mm/s in Table 2). When they occur simultaneously, however, they result in vibration velocities in the unsatisfactory range of the vibration severity criteria in Table 2.
Parameters in Table 3 represent the independent variables of our deep learning model. A combination of these variables determines three states of the conveyor motor: (1) No-Fault, (2) Minor Fault, and (3) Critical Fault with urgent need of maintenance. For confidentiality's sake, we display, in Table 4 (in the appendix section), only one combination sample for each state. Fig.11 presents the time series variables from Table 3. We selected a portion of the dataset over a smaller period for visibility purposes. There is a need for data conversion and scaling in controllers and Supervisory Control And Data Acquisition (SCADA) for field operators to interpret information easily.
This research built a classification model that studies in depth the combination of all the above parameters and generates a more accurate way to detect critical faults in need of immediate maintenance, more negligible minor faults, and no faults. From our dataset, we can tell that we are dealing with MTS input data. Therefore, as per our framework


workflow, we apply PCA to reduce the dimension of the independent variables to a maximum of two channels.

FIGURE 10. Overall system's architecture.

FIGURE 11. Independent variables MTS plot.

A. ALGORITHM SETTINGS SUMMARY: PCA
Table 4 is a summary of important settings used for the dimensionality reduction of the MTS with PCA (the R platform was used for PCA dimensionality reduction):

TABLE 4. PCA settings.

The PCA algorithm generates two new independent variables replacing all variables in Table 3. The two variables are named PCAvar1 and PCAvar2, and their values differ from the original raw data. Fig.12 is a graphical representation of these two variables. We used the same time interval for both Fig.11 and Fig.12.


FIGURE 12. PCA variables representing reduced independent MTS variables.

B. ALGORITHM SETTINGS SUMMARY: GAF
We load the new independent variables from Fig.12 into the image encoding section where they are (1) normalized, (2) converted to polar coordinates, and (3) transformed to GAF images. We used the Python platform to generate GAF images for each case of the dataset. The following are the most critical settings and steps for this section:
• Gramian angular field library imported from pyts.image.
• Separate each motor condition case from the PCA variables.
• Load each of the three cases ('1', '2' and '3') individually in the GAF code.
• Image_Size: 3
• Generate and save "summation" (GASF) and "difference" (GADF) images for each line of data: from X_gasf[0], X_gadf[0] to X_gasf[n], X_gadf[n] (n is the last item number of each case).
• Save images in separate folders based on cases.
Fig.13, Fig.14, and Fig.15 are the image samples of the three motor states, No-fault ('3'), Minor fault ('2'), and Critical fault ('1'), classified by our framework.

FIGURE 13. "No Fault (NF)" motor status sample on GAF images.

FIGURE 14. "Minor Fault (MF)" motor status sample on GAF images.

FIGURE 15. "Critical Fault (CF)" motor status sample on GAF images.

Most critical faults in the system were caused by an imbalanced system when running at 50 Hz. Some others were caused by a combination of at least two faults, for example, misalignment and looseness occurring at the same time. The minor fault was usually a result of misalignment or looseness individually.
Our research aims to build an effective classification model that uses several motor parameters and observations as inputs of the system. These inputs are:


TABLE 5. SVM settings. TABLE 6. CNN parameters settings.

• Vibration speed (VS),
• Motor torque (MT),
• Acceleration (ACC),
• Motor speed (MS),
• Air pressure (AP),
• Product weight (PW),
• Deceleration (DEC),
• Current (CUR),
• Belt tension (BT),
• Motor tension (MT-S),
• Temperature (TMP).
The output of the model is the ‘‘Fault Severity in the System’’
that determines a predictive maintenance schedule. The fault
severity or the output of the system has three conditions dis-
played in Fig.13, Fig.14, and Fig.15: No-Fault, Minor Fault,
and Critical Fault (that requires immediate action), respec-
tively. Fig.16 is a representation of our predictive framework
inputs/Outputs (I/O) model.
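The three encoding steps of the GAF section (normalize, convert to polar coordinates, build the summation and difference fields) can be sketched from scratch as follows. This is not the pyts implementation used in the study; the function names are hypothetical, and the three-point series is made up to match an image size of 3.

```python
import math


def rescale(series, low=-1.0, high=1.0):
    """Min-max rescale a series to [low, high] (step 1: normalization)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("a constant series cannot be rescaled")
    return [low + (high - low) * (x - lo) / (hi - lo) for x in series]


def gramian_angular_fields(series):
    """Return (GASF, GADF) as nested lists for a one-dimensional series."""
    x = rescale(series)
    # Step 2: polar encoding; clamp guards against rounding outside [-1, 1].
    phi = [math.acos(max(-1.0, min(1.0, v))) for v in x]
    # Step 3: summation field cos(phi_i + phi_j), difference field sin(phi_i - phi_j).
    gasf = [[math.cos(a + b) for b in phi] for a in phi]
    gadf = [[math.sin(a - b) for b in phi] for a in phi]
    return gasf, gadf


gasf, gadf = gramian_angular_fields([0.0, 0.5, 1.0])  # made-up sample values
print(len(gasf), len(gasf[0]))  # image height and width
```

Each sample of PCAvar1 or PCAvar2 would yield one such pair of matrices, which can then be saved as images per motor state.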

C. ALGORITHM SETTINGS SUMMARY: SVM AND CNN


As previously mentioned, our framework machine learning
section offers an optional section for more extensive net-
works by building a CNN model based on the PReLU activa-
tion function instead of the standard ReLU. To evaluate our
results, we use three machine learning models:(1) Support
Vector Machine (SVM), (2) Standard CNN (using ReLU),
and (3) CNN + PReLU.
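The difference between the two CNN variants comes down to the activation in (26) and (27) and the slope gradient in (28) and (29). A minimal scalar sketch follows; the function names and the sample values are illustrative and do not come from the trained models.

```python
def prelu(z, c):
    """PReLU as in (27): max(0, z) + c * min(0, z)."""
    return max(0.0, z) + c * min(0.0, z)


def relu(z):
    """ReLU is the special case of PReLU with slope coefficient c = 0."""
    return prelu(z, 0.0)


def dprelu_dc(z):
    """Gradient of the activation with respect to c, as in (29)."""
    return 0.0 if z > 0 else z


def grad_c(zs, upstream):
    """Gradient of the objective with respect to c for one layer, as in (28).

    zs: activation inputs; upstream: gradients arriving from the deeper layer.
    """
    return sum(g * dprelu_dc(z) for z, g in zip(zs, upstream))


print(prelu(-2.0, 0.25), relu(-2.0))
print(grad_c([-1.0, 2.0, -3.0], [0.5, 1.0, 2.0]))
```

In a deep learning library, c would be a trainable tensor updated by the optimizer together with the weights.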
The SVM model is built on the R platform using the PCA variables. The critical parameters used to generate it are listed in Table 5.
The CNN models are built on the Python platform using the images generated with GAF and saved in different folders. We divided all converted images into two categories: a training set and a test set. The training set is used to train and build our CNN models, and the test set is used to measure the accuracy of the model. This step is the data pre-processing phase of modeling our CNNs. Unlike the SVM model, where all this is done automatically in the machine learning script, this part is done manually for the CNNs in this study.
The parameter settings in Table 6 are worth mentioning for the CNN models.
Since we deal with three classification models, the evaluation metrics used for the above models are:
• Accuracy: The accuracy of a classification model can be defined as the percentage of correct predictions of the overall model over the total number of samples used for prediction. Let CP be the number of correct predictions in a model and n the total number of instances used in the test set to evaluate the model; the accuracy can be represented by (30) as follows:

$$\text{accuracy} = \frac{CP}{n} \times 100\% \tag{30}$$

FIGURE 16. Framework Inputs/Outputs architecture.

• Precision: Another evaluation metric is precision, defined for each class individually as the percentage of correct predictions over the total number of instances predicted for that class. In this research, our dataset has three classes for the classification model: no fault (NF), minor fault (MF), and critical fault (CF). Equation (31) is the mathematical expression for the precision:

$$\text{precision} = \frac{CP_m}{P_m} \times 100\%, \quad m \in \mathbb{N} \tag{31}$$

where m indexes the classes of the dataset, $CP_m$ is the number of correct predictions for class m, and $P_m$ is the total number of instances predicted for that class (correct and incorrect).
• Recall: The recall is the percentage of instances of a class that were correctly predicted. In other words, it is the ratio of the number of correct predictions of a class over the sum of correct predictions and missed correct predictions. Its mathematical expression is presented in (32).

$$\text{recall} = \frac{CP_m}{CP_m + IP_m} \times 100\%, \quad m \in \mathbb{N} \tag{32}$$

where m indexes the classes of the dataset, $CP_m$ is the number of correct predictions for class m, and $IP_m$ is the number of instances of that class that the model missed (predicted as another class).

TABLE 7. Confusion matrix result of SVM model.

TABLE 8. Confusion matrix result of CNN + RELU.

TABLE 9. Confusion matrix result of CNN + PRELU.
While accuracy is an evaluation metric for the overall model (all classes included), precision and recall give insight into individual classes and help interpret the behavior of each class better.
A confusion matrix is an essential tool that displays a summary of classification results, mainly the actual labels versus the predicted ones [53]. It also allows the accuracy, precision, and recall of a classification model to be computed. Tables 7, 8, and 9 are the confusion matrices of the three experimental classification models used in our predictive maintenance framework.
In these confusion matrices, the green-colored cells are the number of correct predictions made by the model for each class. The remaining uncolored cells contain the number of incorrect predictions. The meaning of the labels is (1) CF: critical fault, (2) MF: minor fault, and (3) NF: no-fault. We reduced the dimension of the input data for all three machine learning models by applying PCA.
Note: The confusion matrix results of both CNN models are quite impressive, with 100% correct predictions. We achieved this by running three epochs for the training and testing of these models, where an epoch is one pass of the CNN algorithm over the available training set data. Training and


validation (testing) accuracies obtained at the last epoch for both models are very close, differing by less than 1% (99.xx% to 100%), which reflects models with lower chances of overfitting. Table 10 summarizes the evaluation metric results of the three classification models used in this research:

TABLE 10. Classification models results summary.

As per the results in Table 10, using CNN for predictive maintenance increases the experimental system's accuracy by almost 50% as opposed to utilizing a traditional SVM machine learning model. Although the preparation and modeling steps of a predictive maintenance system using CNN may seem tedious and demanding, the hardest part of the modeling is done once at the beginning. The remaining operations consist of fine-tuning parameters and loading new observations into the system for the algorithm to improve its performance. Depending on the plant activities, the supervisors can perform this operation once a month or during maintenance and shutdown. Results obtained for both CNN+ReLU and CNN+PReLU are identical for relatively small datasets but could differ for more extensive networks. This option is, therefore, quite handy for a future-proof model, which will undoubtedly have to deal with a larger amount of data. The best overall accuracy achieved is 100%. These are outstanding results that bring more effectiveness and reliability to the system and make a big difference when predicting machine (conveyor motor) conditions, since an incorrect prediction could result in either a critical breakdown or unnecessary maintenance expenses. As production/manufacturing systems differ, it is essential to conduct a proper study of each system's behavior before implementing an adequate predictive maintenance approach.

VI. CONCLUSION
This research presents an experimental framework that triggers effective predictive maintenance for conveyor motors in a small manufacturing industry, utilizing a classification model built through time-series input data imaging and CNN, with a future-proof option for more extensive networks using the PReLU activation function to improve model performance. Several conveyor system parameters observed sequentially are converted into images using GASF, which has the advantage of preserving temporal features. Experimental results obtained show the relevance of using deep learning algorithms such as CNN to improve the accuracy of classification models. Our predictive maintenance framework architecture accommodates both UTS and MTS data input. It classifies the conveyor motor status into three categories based on input parameters: critical fault, minor fault, and no-fault. The best overall accuracy achieved by our experimental framework is about 100%, which is sufficient for initiating predictive maintenance schedules. With these results, our framework reduces the high risk of missing critical faults in the system, which could lead to a more prolonged breakdown, or of unnecessarily initiating maintenance on motors due to incorrect predictions, leading to a waste of resources.
For future work, we would like to extend the framework's ability to deal with diverse data types by adding a feature that includes non-linear time series input. We could reduce the dimensionality of the non-linear data through the Kernel PCA algorithm; we would also fine-tune our CNN models by incorporating additional parameters such as "Dropout," which can reduce the risk of overfitting. Furthermore, we would like to incorporate the CNN classification results of the predictive maintenance framework in the operational technology (OT) environment, where we include the classified motor statuses in Supervisory Control And Data Acquisition (SCADA), displayed on a Human Machine Interface (HMI) and remotely accessible in cloud-based applications by supervisors and operators.

REFERENCES
[1] R. Wan, S. Mei, J. Wang, M. Liu, and F. Yang, "Multivariate temporal convolutional network: A deep neural networks approach for multivariate time series forecasting," Electronics, vol. 8, no. 8, p. 876, Aug. 2019.
[2] R. Zhang, F. Zheng, and W. Min, "Sequential behavioral data processing using deep learning and the Markov transition field in online fraud detection," Math., Comput. Sci., vol. 1808, no. 05329, pp. 1–5, Aug. 2018. [Online]. Available: https://arxiv.org/pdf/1808.05329.pdf
[3] C. L. Yang, Z. X. Chen, and C. Y. Yang, "Sensor classification using convolutional neural network by encoding multivariate time series as two-dimensional colored image," Sensors, vol. 20, no. 1, 168, pp. 1–15, Dec. 2019, doi: 10.3390/s20010168.
[4] M. Gahirwal and M. Vijayalakshmi, "Inter time series sales forecasting," Int. Journ. Adv. Stud. Comp. Sci. Eng., vol. 2, no. 1, pp. 55–66, Mar. 2013. [Online]. Available: https://arxiv.org/pdf/1303.0117.pdf
[5] K. S. Kiangala and Z. Wang, "Initiating predictive maintenance for a conveyor motor in a bottling plant using industry 4.0 concepts," Int. J. Adv. Manuf. Technol., vol. 97, nos. 9–12, pp. 3251–3271, May 2018.
[6] K. Wang, "Intelligent predictive maintenance (IPdM) system-industry 4.0 scenario," WIT Trans. Eng. Sci., vol. 113, no. 10, pp. 259–268, 2016.
[7] P. Domingos, "A few useful things to know about machine learning," Commun. ACM, vol. 55, no. 10, pp. 78–87, Oct. 2012.
[8] C. M. Bishop, Pattern Recognition and Machine Learning (Information Science and Statistics). New York, NY, USA: Springer-Verlag, 2006.
[9] Q. Zhang, L. T. Yang, Z. Chen, and P. Li, "A survey on deep learning for big data," Inf. Fusion, vol. 42, pp. 146–157, Jul. 2018.
[10] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Proc. Adv. Neural Inf. Process. Syst. (NIPS), vol. 25. Red Hook, NY, USA: Curran Associates, Dec. 2012, pp. 1097–1105.
[11] Z. Wang and T. Oates, "Imaging time-series to improve classification and imputation," 2015, arXiv:1506.00327. [Online]. Available: http://arxiv.org/abs/1506.00327
[12] Z. Wang and T. Oates, "Encoding time series as images for visual inspection and classification using tiled convolutional neural networks," in Proc. AAAI Workshops, Jan. 2015, pp. 40–46.


[13] W. Chen and K. Shi, "A deep learning framework for time series classification using relative position matrix and convolutional neural network," Neurocomputing, vol. 359, pp. 384–394, Sep. 2019.
[14] G. Martínez-Arellano, G. Terrazas, and S. Ratchev, "Tool wear classification using time series imaging and deep learning," Int. J. Adv. Manuf. Technol., vol. 104, nos. 9–12, pp. 3647–3662, Oct. 2019.
[15] R. Adhikari and R. K. Agrawal, "An introductory study on time series modeling and forecasting," 2013, arXiv:1302.6613. [Online]. Available: http://arxiv.org/abs/1302.6613
[16] L. Sadouk, "CNN approaches for time series classification," in Time Series Analysis Methods and Applications for Flight Data. London, U.K.: IntechOpen, 2018. [Online]. Available: https://www.intechopen.com/books/time-series-analysis-data-methods-and-applications/cnn-approaches-for-time-series-classification, doi: 10.5772/intechopen.81170.
[17] G. E. P. Box, G. M. Jenkins, G. C. Reinsel, and G. M. Ljung, Time Series Analysis: Forecasting and Control, 5th ed. New York, NY, USA: Wiley, 2015.
[18] T. Górecki and M. Łuczak, "Multivariate time series classification with parametric derivative dynamic time warping," Expert Syst. Appl., vol. 42, no. 5, pp. 2305–2312, Apr. 2015.
[19] A. Coraddu, L. Oneto, A. Ghio, S. Savio, D. Anguita, and M. Figari, "Machine learning approaches for improving condition-based maintenance of naval propulsion plants," Proc. Inst. Mech. Eng., Part M, J. Eng. Maritime Environ., vol. 230, no. 1, pp. 136–153, Feb. 2016.
[20] K. Leahy, R. L. Hu, I. C. Konstantakopoulos, C. J. Spanos, and A. M. Agogino, "Diagnosing wind turbine faults using machine learning techniques applied to operational data," in Proc. IEEE Int. Conf. Prognostics Health Manage. (ICPHM), Jun. 2016, pp. 1–8.
[21] G. A. Susto, A. Schirru, S. Pampuri, S. McLoone, and A. Beghi, "Machine learning for predictive maintenance: A multiple classifier approach," IEEE Trans. Ind. Informat., vol. 11, no. 3, pp. 812–820, Jun. 2015.
[22] M. Paolanti, L. Romeo, A. Felicetti, A. Mancini, E. Frontoni, and J. Loncarski, "Machine learning approach for predictive maintenance in industry 4.0," in Proc. 14th IEEE/ASME Int. Conf. Mech. Embedded Syst. Appl. (MESA), Jul. 2018, pp. 1–6.
[23] K. Kulkarni, U. Devi, A. Singhee, J. Hazra, and P. Rao, "Predictive maintenance for supermarket refrigeration systems using only case temperature data," in Proc. Ann. Amer. Contr. Conf. (ACC), Milwaukee, WI, USA, Jun. 2018, pp. 4640–4645.
[24] J. Tenenbaum, V. de Silva, and J. C. Langford, "A global geometric framework for nonlinear dimensionality reduction," Science, vol. 290, no. 5500, pp. 2319–2323, Dec. 2000.
[25] S. T. Roweis and L. K. Saul, "Nonlinear dimensionality reduction by locally linear embedding," Science, vol. 290, no. 5500, pp. 2323–2326, Dec. 2000.
[26] J. Patterson and A. Gibson, Deep Learning: A Practitioner's Approach. Sebastopol, CA, USA: O'Reilly Media, 2017.
[27] Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel, "Handwritten digit recognition with a back-propagation network," in Proc. Adv. Neural Inf. Process. Syst. Conf., 1990, pp. 396–404.
[28] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," 2015, arXiv:1512.03385. [Online]. Available: http://arxiv.org/abs/1512.03385
[29] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," 2014, arXiv:1409.1556. [Online]. Available: http://arxiv.org/abs/1409.1556
[30] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, "Rethinking the inception architecture for computer vision," 2015, arXiv:1512.00567. [Online]. Available: http://arxiv.org/abs/1512.00567
[31] X. Glorot, A. Bordes, and Y. Bengio, "Deep sparse rectifier neural networks," in Proc. Conf. Artif. Intell. Statist., 2011, pp. 315–323.
[32] Y. T. Zhou and R. Chellappa, "Computation of optical flow using a neural network," in Proc. IEEE Int. Conf. Neural Netw., vol. 2, Jul. 1988, pp. 71–78.
[33] A. Devarakonda, M. Naumov, and M. Garland, "AdaBatch: Adaptive batch sizes for training deep neural networks," 2017, arXiv:1712.02029. [Online]. Available: http://arxiv.org/abs/1712.02029
[34] M. Munir, "Eigenvalues-theory and applications," Dept. Math., Govern. Postgrad. College, Karl-Franzens-Universität, Graz, Austria, Tech. Rep., 2015. [Online]. Available: https://www.researchgate.net/publication/309012418_Eigenvalues-Theory_and_Applications, doi: 10.13140/RG.2.2.15926.91201.
[35] M. D. Zeiler and R. Fergus, "Visualizing and understanding convolutional networks," in Proc. Eur. Conf. Comput. Vis., 2014, pp. 818–833.
[36] P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus, and Y. LeCun, "Overfeat: Integrated recognition, localization and detection using convolutional networks," in Proc. Int. Conf. Learn. Represent., Scottsdale, AZ, USA, 2014, pp. 1–16.
[37] K. Chatfield, K. Simonyan, A. Vedaldi, and A. Zisserman, "Return of the devil in the details: Delving deep into convolutional nets," 2014, arXiv:1405.3531. [Online]. Available: http://arxiv.org/abs/1405.3531
[38] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, "Going deeper with convolutions," 2014, arXiv:1409.4842. [Online]. Available: http://arxiv.org/abs/1409.4842
[39] V. Nair and G. E. Hinton, "Rectified linear units improve restricted Boltzmann machines," in Proc. 27th Int. Conf. Mach. Learn. (ICML), 2010, pp. 807–814.
[40] A. L. Maas, A. Y. Hannun, and A. Y. Ng, "Rectifier nonlinearities improve neural network acoustic models," in Proc. Int. Conf. Mach. Learn. (ICML), 2013, pp. 1–6.
[41] M. D. Zeiler, M. Ranzato, R. Monga, M. Mao, K. Yang, Q. V. Le, P. Nguyen, A. Senior, V. Vanhoucke, J. Dean, and G. E. Hinton, "On rectified linear units for speech processing," in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., Vancouver, BC, Canada, May 2013, pp. 3517–3521.
[42] M. Lin, Q. Chen, and S. Yan, "Network in network," 2013, arXiv:1312.4400. [Online]. Available: http://arxiv.org/abs/1312.4400
[43] R. K. Srivastava, J. Masci, S. Kazerounian, F. Gomez, and J. Schmidhuber, "Compete to compute," in Proc. Adv. Neural Inf. Process. Syst. (NIPS), 2013, pp. 2310–2318.
[44] I. J. Goodfellow, D. Warde-Farley, M. Mirza, A. C. Courville, and Y. Bengio, "Maxout networks," in Proc. Int. Conf. Mach. Learn., vol. 28, no. 3, Jun. 2013, pp. 1319–1327.
[45] K. He, X. Zhang, S. Ren, and J. Sun, "Spatial pyramid pooling in deep convolutional networks for visual recognition," IEEE Trans. Pattern Anal. Mach. Intell., vol. 37, no. 9, pp. 1904–1916, Sep. 2015.
[46] A. G. Howard, "Some improvements on deep convolutional neural network based image classification," 2013, arXiv:1312.5402. [Online]. Available: http://arxiv.org/abs/1312.5402
[47] G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. R. Salakhutdinov, "Improving neural networks by preventing co-adaptation of feature detectors," 2012, arXiv:1207.0580. [Online]. Available: http://arxiv.org/abs/1207.0580
[48] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: A simple way to prevent neural networks from overfitting," J. Mach. Learn. Res., vol. 15, no. 1, pp. 1929–1958, 2014.
[49] L. Wan, M. Zeiler, S. Zhang, Y. L. Cun, and R. Fergus, "Regularization of neural networks using dropconnect," in Proc. Int. Conf. Mach. Learn., 2013, pp. 1058–1066.
[50] K. He, X. Zhang, S. Ren, and J. Sun, "Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification," in Proc. IEEE Int. Conf. Comput. Vis. (ICCV), Dec. 2015, pp. 1026–1034.
[51] Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel, "Backpropagation applied to handwritten zip code recognition," Neural Comput., vol. 1, no. 4, pp. 541–551, Dec. 1989.
[52] IFM Electronics. (2013). From Process Monitoring to Vibration Analysis. Condition Monitoring Systems. pp. 5–12. [Online]. Available: http://www.ifm.com/download/files/ifm-efector-octavis-brochure-GB-2013/$file/ifm-efector-octavis-brochure-GB-2013.pdf
[53] S. Visa, B. Ramsay, A. L. Ralescu, and E. Van Der Knaap, "Confusion matrix-based feature selection," in Proc. Midwest Artif. Intel. Cognit. Scienc. Conf., vol. 710, 2011, pp. 120–127.
[54] A. Yunusa-Kaltungo, J. K. Sinha, and A. D. Nembhard, "A novel fault diagnosis technique for enhancing maintenance and reliability of rotating machines," Struct. Health Monit., Int. J., vol. 14, no. 6, pp. 604–621, Nov. 2015, doi: 10.1177/1475921715604388.
[55] C. Sanders. (2011). A guide to vibration analysis and associated techniques in condition monitoring. DAK Consulting-Chiltern House. [Online]. Available: http://www.dakacademy.com/newsite/index.php?option=com_k2&Itemid=500&id=94_007cd4b8b347e375bc10dbe5efbccc28&lang=en&task=download&view=item
[56] J. Alsalaet. (Dec. 2012). Vibration Analysis and Diagnostic Guide. [Online]. Available: https://www.researchgate.net/publication/311420765_Vibration_Analysis_and_Diagnostic_Guide


KAHIOMBA SONIA KIANGALA received the B.Tech. degree in electrical engineering from the Tshwane University of Technology (TUT), South Africa, in 2014, and the master’s degree in electrical engineering from the University of South Africa (UNISA), in 2019, where she is currently pursuing the Ph.D. degree in science, engineering and technology (SET). Her research interest includes adapting artificial intelligence techniques to improve automation techniques for small to medium scale industries in an Industry 4.0 environment.

ZENGHUI WANG (Member, IEEE) received the B.Eng. degree in automation from the Naval Aviation Engineering Academy, China, in 2002, and the Ph.D. degree in control theory and control engineering from Nankai University, China, in 2007. He is currently a Professor with the Department of Electrical and Mining Engineering, University of South Africa (UNISA), South Africa. His research interests are industry 4.0, control theory and control engineering, engineering optimization, image/video processing, artificial intelligence, and chaos.

