Development of Advanced Artificial Intel
PII: S0169-8095(19)31123-8
DOI: https://doi.org/10.1016/j.atmosres.2020.104845
Reference: ATMOS 104845
Please cite this article as: B.T. Pham, L.M. Le, T.-T. Le, et al., Development of advanced
artificial intelligence models for daily rainfall prediction, Atmospheric Research(2019),
https://doi.org/10.1016/j.atmosres.2020.104845
This is a PDF file of an article that has undergone enhancements after acceptance, such
as the addition of a cover page and metadata, and formatting for readability, but it is
not yet the definitive version of record. This version will undergo additional copyediting,
typesetting and review before it is published in its final form, but we are providing this
version to give early visibility of the article. Please note that, during the production
process, errors may be discovered which could affect the content, and all legal disclaimers
that apply to the journal pertain.
Rainfall Prediction
Binh Thai Pham1* ; Lu Minh Le 2 ; Tien-Thinh Le 3* ; Kien-Trinh Thi Bui4 ; Vuong Minh Le 2 ; Hai-
1 University of Transport Technology, Hanoi 100000, Vietnam
2 Faculty of Engineering, Vietnam National University of Agriculture, Gia Lam, Hanoi 100000, Vietnam
3 Institute of Research and Development, Duy Tan University, Da Nang 550000, Vietnam
4 Geomatics Center, Thuyloi University, Hanoi 100000, Vietnam
5 Department of Science & Technology, Bhaskarcharya Institute for Space Applications and Geo-
Tien-Thinh Le (letienthinh@duytan.edu.vn)
Abstract:
In this study, the main objective is to develop and compare several advanced Artificial Intelligence (AI)
models, namely Adaptive Network based Fuzzy Inference System optimized with Particle Swarm
Optimization (PSOANFIS), Artificial Neural Networks (ANN) and Support Vector Machines (SVM),
for the prediction of daily rainfall in Hoa Binh province, Vietnam. For this purpose, meteorological
variables such as maximum temperature, minimum temperature, wind speed, relative humidity and
solar radiation were collected and used as input parameters, with daily rainfall as the output parameter
of the models. The developed models were validated using various quality assessment criteria
such as correlation coefficient (R), Mean Absolute Error (MAE), Skill Score (SS), Probability of
Journal Pre-proof
Detection (POD), Critical Success Index (CSI), and False Alarm Ratio (FAR). The results showed that
all the AI models provided reasonable predictions of daily rainfall, but SVM was found to be the
best method for predicting rainfall. This method was also found to be the most robust and efficient
prediction model when input variability was taken into account using the Monte Carlo approach. This
AI based study would be helpful in quick and accurate prediction of daily rainfall.
Keywords: Rainfall; Artificial Neural Networks; Robustness analysis; Support Vector Machines;
1. Introduction
Rainfall is one of the most crucial meteorological factors and has a direct influence on many fields
including agriculture (Trinh, 2018), hydroelectric production (Haddad, 2011), and water resources
management (Hartmann et al., 2016a; Serinaldi and Kilsby, 2012). In fact, the agricultural production of
developing countries depends significantly on rainfall. About 65% of the total agricultural production of
developing countries like India and Vietnam depends on monsoon rainfall (Sahai et al., 2000; Trinh,
2018; Le et al., 2019b). Hydroelectric generation all over the world depends mainly on the rainfall
in the catchment area. Adequate rainfall is important for recharging depleting groundwater resources
(Hartmann et al., 2016a, 2016b). In addition, many catastrophic natural disasters are directly
related to rainfall intensity and duration, such as floods (Bezak et al., 2016; Bui et al., 2019; Faccini et
al., 2018; Janizadeh et al., 2019; Khosravi et al., 2019; Tien Bui et al., 2019, 2016), droughts (Deo et al.,
2018; Mouatadid et al., 2018), landslides (Abedini et al., 2019; Bezak et al., 2016; Chen et al., 2017;
Dou et al., 2019; Pham et al., 2019b, 2018, 2017), and local sea-level rise (Senior et al., 2002). Therefore, it
is extremely important to have a quick and accurate method of predicting rainfall intensity and
duration for proper water resources management of an area (Mislan et al., 2015). Moreover, timely and
precise rainfall prediction helps in planning agricultural activities, even in case of unusual
precipitation (Abbot and Marohasy, 2017; Navone and Ceccatto, 1994). Accurate and timely rainfall
prediction is also important to prevent and minimize adverse effects of natural disasters such as
landslides, floods, and droughts (Bezak et al., 2016; Hung et al., 2009).
Prediction of rainfall is generally done by using operational numerical weather models in conjunction
with meteorological radar data (He et al., 2013; Lynch, 2008). These models have been widely used in
many works involving multiple regression and climatological average methods (Abbot and Marohasy,
2014a; He et al., 2013; Lynch, 2008; Mahmud and Ross, 2005; Shao and Li, 2013; Tanessong et al.,
2014), numerical methods (Azadi et al., 2013; Novak et al., 2014), and empirical formulations
(Silvestro and Rebora, 2014). These traditional models are based on long-term measurement of rainfall
and its statistical correlation with other meteorological factors such as air temperature, solar radiation,
cloud information, wind speed, sunshine duration, and relative humidity (G.Gouda et al., 2019; Mghouchi
et al., 2016; Mousavi et al., 2017; Paoli et al., 2010), and geographical parameters (i.e. latitude,
longitude and altitude) (Chegaar and Chibani, 2001; TürkToğrul and Onat, 1999). Various empirical
equations for estimating rainfall intensity have been proposed (AlHassoun, 2011; Al Mamun et al.,
2018; Awadallah et al., 2017). Niu and Zhang (2015) proposed operational numerical weather models
for rainfall forecasting over a large range of locations in China to explore the influence of spatial
inputs on the meteorological variables. In another study, Rajeevan et al. (2007) used a multiple linear
regression model for the prediction of rainfall in India using six relevant predictors based on
experimental data. Linear regression and statistical methods have also been combined for forecasting
rainfall in the Caribbean area considering various characteristics of the sea (Ashby et al., 2005). Many
researchers have attempted to improve the prediction capability of numerical weather models using
different meteorological inputs obtained from field surveys and radar data (Villarini et al., 2014; Wang
et al., 2015) to reduce
Recently, Artificial Intelligence (AI) based approaches such as Artificial Neural Networks (ANN) and
Adaptive Neuro-Fuzzy Inference Systems (ANFIS) have been adopted for predicting rainfall intensity
considering climate and weather conditions in many countries such as Australia (Abbot and Marohasy,
2014b; Deo and Şahin, 2015), India (Acharya et al., 2014), Thailand (Hung et al., 2009), Greece
(Nastos et al., 2013) and Italy (Faccini et al., 2018). For this, rainfall data at different time scales
have been analyzed in the models: hourly (Hung et al., 2009; Wei, 2013), daily (Mehdi Keshtkar et al.,
2013; Nastos et al., 2014), weekly (El-Shafie et al., 2012; Warsito et al., 2016; Le et al., 2020),
monthly (Abbot and Marohasy, 2014b; Mislan et al., 2015), seasonal (Garric et al., 2002) and annual
(Philip and Joseph, 2003).
Dabhi and Chaudhary (2014) employed a hybrid wavelet-postfix-GP model for the prediction of daily
rainfall in the Anand region, India, using various meteorological variables such as maximum
temperature, minimum temperature, evaporation index and relative humidity. Discrete wavelet
transform was applied as a data preprocessing technique for reducing noise in the raw database to
improve the prediction capability. Wu et al. (2010) investigated the modeling of daily rainfall
prediction based on three parametric studies covering model inputs, modeling methods and data
preprocessing techniques. AI based methods such as ANN and K-nearest-neighbors, and principal
component analysis as a processing technique, were employed for the prediction of rainfall.
Altunkaynak and Nigussie (2015) integrated the Seasonally Adjusted Series (SAS) technique in a
multilayer perceptron model for rainfall prediction based on data collected from two stations in
Turkey. Partal et al. (2015) utilized daily mean temperature, daily max temperature, daily min
temperature, daily total specific humidity and daily total evaporation for predicting daily total
precipitation using different AI techniques such as feed forward back propagation, radial basis
function and generalized regression neural network. The use of Support Vector Machine (SVM) in daily
rainfall prediction was investigated by Ortiz-García et al. (2014), involving meteorological variables
such as temperature, wind speed, wind direction, and humidity. In that study, the SVM model exhibited
greater prediction performance than alternative AI approaches, namely multi-layer perceptron,
extreme learning machine, decision trees and K-nearest neighbor models.
In terms of hybrid AI models, which combine a single AI model with a meta-heuristic algorithm,
Nasseri et al. (2008) combined a neural network model and a genetic algorithm for
rainfall prediction based on data from rain gauges in the Upper Parramatta catchment, Sydney,
Australia. A real coded genetic algorithm has also been applied for optimizing the weight parameters of
ANN for daily rainfall–runoff prediction in the Ourika basin, Morocco (Sedki et al., 2009). Even though
many works using AI based algorithms have been done on rainfall prediction, there is still scope
for improving the accuracy and effectiveness of the models. In addition, prediction of rainfall remains a
big challenge as meteorological parameters are stochastic in nature (Nielsen et al., 2014; Villarini et
al., 2014). Rainfall in any area depends on various meteorological parameters such as temperature,
humidity, and wind, besides time and space (Abbot and Marohasy, 2017; Navone and Ceccatto, 1994;
Until now, most of the developed AI rainfall prediction models have been deterministic, and thus require
sensitivity analysis to evaluate their robustness under variable inputs (Khosravi et al., 2010). If
the model is not robust enough, uncertainties could be produced in the predicted output (Hong et al.,
2006; Kasiviswanathan and Sudheer, 2013). Moreover, sensitivity analysis can provide relevant
information about the complex relationship between various meteorological variables while the
AI models are running, and can also identify irrelevant parameters which may not be required as input
in the model.
The main objective of the present study is to develop and compare several advanced Artificial
Intelligence (AI) models, namely Adaptive Network based Fuzzy Inference System optimized with
Particle Swarm Optimization (PSOANFIS), ANN and SVM, for the prediction of daily rainfall in Hoa
Binh province, Vietnam. The significance of the present study is that the hybrid PSOANFIS model has
been developed and applied for the first time for accurate prediction of daily rainfall. Moreover, in
this study, the influence of randomness of the database on the prediction capability of the AI models
was investigated and analyzed using the Monte Carlo technique as well as statistical analysis. To
construct and validate the predictive models, daily meteorological data (minimum temperature, maximum
temperature, wind speed, relative humidity, solar radiation and rainfall) of the study area were used.
Validation of the models was done using various criteria such as Mean Absolute Error (MAE),
Correlation Coefficient (R), Skill Score (SS), Probability of Detection (POD), Critical Success Index
(CSI), and False Alarm Ratio (FAR).
2. Materials and Methods
ANN is a computational model based on the structure and functions of biological neural networks
(McCulloch and Pitts, 1943). The neural network itself is not an algorithm but rather a
framework in which many machine learning algorithms can work and process complex input data.
ANN does not require any conditions to be defined on the input data. Once the learning process has
been completed, ANN can process new data and predict the outcome of a problem for previously
unseen input data. There are three main types of layer in a neural network: an input layer, hidden
layers and an output layer (Fig 1) (Narayanakumar and Raja, 2016). There are many algorithms for
training an ANN. In this study, the scaled conjugate gradient algorithm was applied for training the
neural network as it accelerates the convergence by guiding the search in the conjugate directions
(Møller, 1991).
Fig 1. Architecture of ANN used in this study.
2.1.2. Adaptive Network Based Fuzzy Inference System (ANFIS)
ANFIS combines the learning rules of adaptive networks with a fuzzy inference system
in order to make precise predictions in many fields of human knowledge. The inference
system is based on if-then rules (Takagi and Sugeno, 1983), and the adaptive network is based
on gradient descent and the chain rule introduced by Werbos (1974). Basically, the ANFIS structure
consists of five main layers (Fig 2); each layer contains node functions of the same function family
(Jang, 1993). An overview of these layers is presented in the literature (Dao et al., 2019a; Ly et al.,
Fig 2. Illustration of basic ANFIS structure with two input parameters.
2.1.3. Particle Swarm Optimization (PSO)
PSO is an efficient swarm intelligence technique proposed and developed based on the social behavior
of bird flocks for solving complex optimization problems (Kennedy and Eberhart, 1995; Shi and
Eberhart, 1998). In PSO, the swarm is the population of potential solutions and a particle is an
individual in that population. The particles fly in the domain space according to some simple
formulae to find their best positions, which are the optimized solutions of an optimization problem (Bui
SVM is a supervised learning algorithm that analyzes data for classification and regression, and was
developed using the idea of statistical learning theory (Cortes and Vapnik, 1995; Vapnik, 1998).
The main objective of the SVM algorithm is to find a hyperplane in an N-dimensional space that distinctly
separates data points, maximizing the distance between the two groups of data (Fig 3). In SVM, support
vectors are data points that are close to the hyperplane and that contribute to the determination of the
position and the orientation of the hyperplane. This hyperplane lies in a feature space induced by a
kernel K, which defines a dot product in that space (Evgeniou and Pontil, n.d.; Wahba, 1990). Thus, SVM
tries to maximize the distance between the hyperplane and the data points through a cost function for solving
Fig 3. Illustration of SVM algorithm.
In the present work, various validation criteria such as Correlation Coefficient (R), Mean Absolute
Error (MAE), Skill Score (SS), Probability of Detection (POD), Critical Success Index (CSI) and False
Alarm Ratio (FAR) were used to validate the developed AI models. Out of these, R allows us to
identify the statistical relationship between actual and predicted values (Pearson Karl and Galton
Francis, 1895). Its absolute value lies between 0 and 1 inclusive, where 0 is no correlation and 1 is total
correlation (Ly et al., 2019c; Dao et al., 2019b; Pham et al., 2020). In the case of MAE, which has the
same units as the quantity being estimated, a low value of MAE indicates good accuracy of
the prediction output of the models (Montavon et al., 2013; Ly et al., 2019e; Nguyen et al., 2019b;
Thanh et al., 2020). In terms of SS, it is given by the following formula (Murphy, 1988; Benedetti,
$$SS = \frac{A - A_{\mathrm{ref}}}{A_{\mathrm{perf}} - A_{\mathrm{ref}}} \times 100, \qquad (1)$$
where A is the measure of accuracy (i.e. MAE or R) of the proposed forecast, A_ref is the same measure
for the reference forecast and A_perf is that of a perfect forecast. In this case, A_perf is taken as
what actually happened. SS has a range of negative infinity to 100. A positive value of SS means the
proposed forecast is an improvement over the reference forecast. On the other hand, a negative value
of SS means that the proposed forecast performs worse than the reference forecast. In this study, the
reference forecast was chosen as the Global Forecast System (GFS) produced by the United States'
National Weather Service (Kanamitsu et al., 1991).
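As a small worked illustration of Eq. (1), the sketch below computes SS for one error-type and one correlation-type measure. It assumes that a perfect forecast corresponds to MAE = 0 and R = 1; the function name and the numerical values are hypothetical and are not results from this study.

```python
def skill_score(a, a_ref, a_perf):
    """Skill score of Eq. (1): (A - A_ref) / (A_perf - A_ref) * 100."""
    if a_perf == a_ref:
        raise ValueError("reference forecast is already perfect; SS is undefined")
    return (a - a_ref) / (a_perf - a_ref) * 100.0


# Hypothetical accuracy values, for illustration only.
# Assumption: a perfect forecast has MAE = 0 and R = 1.
ss_mae = skill_score(a=3.0, a_ref=6.0, a_perf=0.0)  # MAE halved w.r.t. reference
ss_r = skill_score(a=0.8, a_ref=0.5, a_perf=1.0)    # R improved w.r.t. reference
ss_worse = skill_score(a=9.0, a_ref=6.0, a_perf=0.0)  # larger MAE than reference

print(ss_mae, ss_r, ss_worse)
```

With these inputs the first two scores are positive (improvement over the reference) and the third is negative, matching the interpretation given above.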
For the contingency scores, POD, CSI and FAR at a given threshold are calculated as follows
(Scharf and Demeure, 1991; Gneiting and Raftery, 2007):
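$$\mathrm{POD} = \frac{\mathrm{Hits}}{\mathrm{Hits} + \mathrm{Misses}}, \qquad (2)$$

$$\mathrm{CSI} = \frac{\mathrm{Hits}}{\mathrm{Hits} + \mathrm{Misses} + \mathrm{FalseAlarms}}, \qquad (3)$$

$$\mathrm{FAR} = \frac{\mathrm{FalseAlarms}}{\mathrm{Hits} + \mathrm{FalseAlarms}} \qquad (4)$$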
where Number of Hits, Number of Misses, Number of False Alarms and Number of Correct Negative

Forecast   Rain      Number of Hits     Number of False Alarms
           No rain   Number of Misses   Number of Correct Negative
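The counting behind Eqs. (2) to (4) can be sketched as follows. This is only an illustration: it assumes that a "rain" event means a daily value greater than or equal to the threshold (the text does not state whether the comparison is strict), and the function name and sample values are hypothetical.

```python
def contingency_scores(observed, forecast, threshold):
    """Count hits, misses, false alarms and correct negatives, then
    derive POD, CSI and FAR for one rainfall threshold (mm)."""
    hits = misses = false_alarms = correct_negatives = 0
    for obs, fc in zip(observed, forecast):
        obs_rain = obs >= threshold  # assumption: event means value >= threshold
        fc_rain = fc >= threshold
        if fc_rain and obs_rain:
            hits += 1
        elif fc_rain and not obs_rain:
            false_alarms += 1
        elif obs_rain:
            misses += 1
        else:
            correct_negatives += 1

    nan = float("nan")  # returned when a denominator is zero
    pod = hits / (hits + misses) if hits + misses else nan
    csi = (hits / (hits + misses + false_alarms)
           if hits + misses + false_alarms else nan)
    far = false_alarms / (hits + false_alarms) if hits + false_alarms else nan
    return {"hits": hits, "misses": misses, "false_alarms": false_alarms,
            "correct_negatives": correct_negatives,
            "POD": pod, "CSI": csi, "FAR": far}


# Hypothetical daily rainfall values (mm), for illustration only.
observed = [0.0, 5.0, 12.0, 3.0, 20.0, 1.0]
forecast = [1.0, 6.0, 2.0, 7.0, 18.0, 0.0]
scores = contingency_scores(observed, forecast, threshold=5.0)
print(scores)
```

Calling the function for each threshold in a list gives one value of each score per threshold, which is the form in which such scores are usually plotted.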
In this study, the Monte Carlo method, which is highly efficient for propagating the variability of input
parameters through the models (Robert and Casella, 2004; Ly et al., 2019a; Hun et al., 2019), has been
used in order to investigate the robustness of the three AI models: ANN, PSOANFIS and SVM. In this
method, the output is calculated repeatedly for inputs drawn at random according to the probability
density function of each input (Mordechai, 2011). In this way, the statistical behavior of each input is
fully propagated to the output through the models. Statistical analysis of the output is then carried out
in order to explore the influence of input variability and the robustness of the applied models (Goffart
et al., 2017;
In Monte Carlo simulation, statistical analysis of the output usually requires convergence of the
variability propagation, i.e. an optimal number of realizations. In this work, the statistical
convergence function CONV of the mean of the output result has been calculated using the following
equation (Guilleminot
$$\mathrm{CONV}(n_{MC}) = \frac{1}{n_{MC}} \sum_{i=1}^{n_{MC}} \mathrm{Output}_i, \qquad (5)$$
where n_MC is the number of Monte Carlo realizations. The optimal value of n_MC indicates the time to
convergence to the stationary solution. Various other statistical parameters such as mean, median,
standard deviation and quantiles were also used in this paper in order to fully quantify the behavior of
the probability distribution. Q25, Q50 and Q75 denote, respectively, the 25th, 50th and 75th percentiles.
For illustration, Q25 splits off the highest 75% of data values from the lowest 25%.
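The convergence function of Eq. (5) is a running mean over the realizations, and the quantiles can be read from the same list of outputs. The sketch below is a minimal illustration using only the Python standard library; the synthetic outputs, the function names and the choice of the "inclusive" quantile method are assumptions.

```python
import random
import statistics


def convergence_curve(outputs):
    """CONV(n) = (1/n) * sum of the first n outputs, for n = 1..len(outputs)."""
    curve, running_total = [], 0.0
    for n, value in enumerate(outputs, start=1):
        running_total += value
        curve.append(running_total / n)
    return curve


def quartiles(values):
    """Return (Q25, Q50, Q75) of the values."""
    q25, q50, q75 = statistics.quantiles(values, n=4, method="inclusive")
    return q25, q50, q75


# Synthetic outputs standing in for 1000 Monte Carlo realizations
# (hypothetical values, not results from the study).
rng = random.Random(0)
outputs = [rng.gauss(0.85, 0.02) for _ in range(1000)]

curve = convergence_curve(outputs)
q25, q50, q75 = quartiles(outputs)
print(f"CONV(10) = {curve[9]:.4f}, CONV(1000) = {curve[-1]:.4f}")
print(f"Q25 = {q25:.4f}, Q50 = {q50:.4f}, Q75 = {q75:.4f}")
```

Plotting `curve` against the realization index shows where the running mean flattens, which is one way to read off a suitable number of realizations.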
2.2. Data used
In this study, rainfall data was collected from a meteorological gauge located at latitude 20.763° and
longitude 105.312° in the Cao Phong district, Hoa Binh province, Vietnam (Fig 4). Weather data was
obtained from the Global Weather Data for SWAT (Dile and Srinivasan, 2014; Fuka et al., 2014)
(available at: https://globalweather.tamu.edu). Hoa Binh province is a tropical monsoon region with two
separate seasons: a dry season (November to April) and a rainy season (May to October). The maximum
temperature is in July (32.4°C) whereas the minimum temperature is in January (12.8°C). The highest
monthly rainfall is in September (332 mm) (Table 2). This province has the largest hydroelectric dam in
Vietnam, 128 m in height and 970 m in length, which provides the required electricity for most of the
northern provinces of Vietnam. The present study is important as rainfall is the main water source of
the Black River Reservoir, where the water is stored for the dam operation.
Table 2. Monthly meteorological characteristics of Hoa Binh province (https://en.climate-data.org/asia/vietnam/hoa-binh-province/hoa-binh-4274/#climate-graph).
Parameters | January | February | March | April | May | June | July | August | September | October | November | December
Avg. Temperature (°C) | 16.5 | 17.8 | 20.3 | 24.2 | 27.7 | 29 | 28.6 | 28.5 | 27.3 | 24.8 | 21.8 | 18.4
Min. Temperature (°C) | 12.8 | 14.4 | 16.7 | 20.5 | 23.4 | 25.1 | 24.9 | 24.9 | 23.8 | 20.7 | 17.8 | 14.4
Max. Temperature (°C) | 20.3 | 21.3 | 23.9 | 28 | 32 | 32.9 | 32.4 | 32.1 | 30.9 | 29 | 25.9 | 22.5
(mm)
In modeling, a total of five parameters namely maximum temperature, minimum temperature, wind
speed, relative humidity, and solar radiation were used as input variables and daily rainfall data was
used as an output variable for generating training and testing datasets. In total 3653 data samples were
collected during the period January 01, 2004 to December 31, 2013.
Initial analysis of the data used is presented in Table 3. It can be observed that the daily rainfall varies
from 0 mm (i.e. no rain) to 59.1507 mm, with a mean value of 7.4195 mm and a standard deviation of
8.8599 mm. The maximum temperature ranges from 8.0804°C to 42.2402°C, with a mean value of
26.8487°C and a standard deviation of 6.4102°C. The minimum temperature varies from -0.8650°C to
27.2890°C, with a mean value of 17.7903°C and a standard deviation of 5.0405°C. The wind speed ranges
from 0.6414 m/s to 2.8707 m/s, with a mean value of 1.3493 m/s and a standard deviation of 0.3070 m/s.
The relative humidity varies from 0.2198 to 0.9822, with a mean value of 0.8260 and a standard deviation
of 0.1063. The solar radiation ranges from 1.5933 MJ/m2 to 29.5522 MJ/m2 with mean value of
3. Methodology
Daily rainfall prediction using the AI models was carried out in four main steps (Fig 5): (1)
generation of datasets, (2) construction of the models, (3) validation of the models, and (4) robustness
analysis.
Step 1, Generation of datasets: In this step, the collected data was randomly split into two parts. The
first part, comprising 70% of the data, was used to create the training dataset. The second part, the
remaining 30% of the data, was used to create the testing dataset.
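A random 70/30 split of this kind can be sketched as below. This is a minimal illustration, not the procedure actually used by the authors: the function name, the seed and the rounding of the training-set size are assumptions, while the sample count of 3653 is the one reported in Section 2.2.

```python
import random


def train_test_split_indices(n_samples, train_fraction=0.7, seed=None):
    """Shuffle sample indices and split them into training and testing parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(train_fraction * n_samples)  # assumption: nearest integer
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = train_test_split_indices(3653, train_fraction=0.7, seed=42)
print(len(train_idx), len(test_idx))
```

Changing the seed gives a different random combination of samples, which is the source of variability examined later in the robustness analysis.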
Step 2, Construction of the models: In this step, the training dataset was used to train and construct the
models. For the PSOANFIS model, PSO was used with 25 particles, an inertia weight of 0.4 and 1000
iterations to optimize the consequent and antecedent parameters of the ANFIS model. Using
the optimal parameters, ANFIS was trained with the Gaussian membership function and the c-means
clustering algorithm. For ANN, the scaled conjugate gradient algorithm was used to train the model, and a
trial-and-error test selected 10 as the best number of hidden layers. To construct SVM, a
third degree polynomial kernel function was chosen, the regularization constant (c) of box constraint
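Step 2 mentions PSO with 25 particles, an inertia weight of 0.4 and 1000 iterations. As a rough illustration only, the sketch below runs a basic PSO loop with those three settings on a toy objective (the sphere function) rather than on ANFIS parameters. The acceleration coefficients `c1` and `c2`, the search bounds, the zero initial velocities and all function names are assumptions that are not taken from the study.

```python
import random


def pso_minimize(objective, bounds, n_particles=25, inertia=0.4,
                 c1=1.5, c2=1.5, n_iterations=1000, seed=0):
    """Basic global-best PSO. c1 and c2 are assumed values."""
    rng = random.Random(seed)
    dim = len(bounds)
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in positions]           # personal best positions
    best_val = [objective(p) for p in positions]   # personal best values
    g = min(range(n_particles), key=best_val.__getitem__)
    gbest_pos, gbest_val = best_pos[g][:], best_val[g]

    for _ in range(n_iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + c1 * r1 * (best_pos[i][d] - positions[i][d])
                    + c2 * r2 * (gbest_pos[d] - positions[i][d])
                )
                lo, hi = bounds[d]
                # Keep the particle inside the search domain.
                positions[i][d] = min(max(positions[i][d] + velocities[i][d], lo), hi)
            value = objective(positions[i])
            if value < best_val[i]:
                best_pos[i], best_val[i] = positions[i][:], value
                if value < gbest_val:
                    gbest_pos, gbest_val = positions[i][:], value
    return gbest_pos, gbest_val


def sphere(x):
    """Toy objective with its minimum (zero) at the origin."""
    return sum(v * v for v in x)


best_x, best_value = pso_minimize(sphere, bounds=[(-5.0, 5.0)] * 2)
print(best_x, best_value)
```

In a hybrid model of the kind described here, the objective would instead be a training error evaluated for a candidate set of model parameters.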
Step 3, Validation of the models: The constructed models were validated using both training and
testing datasets. Various criteria namely R, MAE, SS, CSI, POD, and FAR were used to validate and
Step 4, Robustness analysis: Monte Carlo simulation was used in order to investigate the
robustness of the AI methods. A total of 1000 Monte Carlo runs were performed with each of the three
AI methods in order to compare statistical information. Statistical and convergence analyses were
finally carried out to deduce efficient feedback.
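The robustness loop of Step 4 can be sketched as repeated random splitting, refitting and scoring. In the illustration below, a one-variable least-squares line is used purely as a stand-in for ANN, PSOANFIS or SVM, the data are synthetic, and all function names are hypothetical; only the 70/30 split and the use of R and MAE follow the text.

```python
import math
import random
import statistics


def mean_absolute_error(targets, predictions):
    return sum(abs(t - p) for t, p in zip(targets, predictions)) / len(targets)


def pearson_r(x, y):
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fit_line(x, y):
    """Least-squares line y = slope * x + intercept (stand-in model)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx


def monte_carlo_robustness(x, y, n_runs=1000, train_fraction=0.7, seed=0):
    """Refit on a new random 70/30 split in every run; collect test R and MAE."""
    rng = random.Random(seed)
    n_train = round(train_fraction * len(x))
    r_values, mae_values = [], []
    for _ in range(n_runs):
        idx = list(range(len(x)))
        rng.shuffle(idx)
        train, test = idx[:n_train], idx[n_train:]
        slope, intercept = fit_line([x[i] for i in train], [y[i] for i in train])
        predictions = [slope * x[i] + intercept for i in test]
        targets = [y[i] for i in test]
        r_values.append(pearson_r(targets, predictions))
        mae_values.append(mean_absolute_error(targets, predictions))
    return r_values, mae_values


# Synthetic data, for illustration only.
data_rng = random.Random(1)
x = [data_rng.uniform(0.0, 10.0) for _ in range(200)]
y = [2.0 * v + data_rng.gauss(0.0, 1.0) for v in x]

r_values, mae_values = monte_carlo_robustness(x, y, n_runs=100)
print(statistics.fmean(r_values), statistics.stdev(r_values))
print(statistics.fmean(mae_values), statistics.stdev(mae_values))
```

The spread of the collected R and MAE values across runs is what indicates how sensitive a model is to the choice of training samples.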
4.1. Validation of the AI models
The performance of the three AI prediction models is presented in Fig 6 and Table 4. Figs 6a,
b, and c present the outputs of ANN, PSOANFIS and SVM for the training part (70% of
data) as a function of the corresponding targets. Figs 6d, e, and f present the outputs of ANN,
PSOANFIS and SVM for the testing part as a function of the corresponding testing targets. Linear
fits “Predicted = a*Target + b” are also presented in each case to estimate the correlation of the
results. In Figs 6a, b and c, the slopes of the linear fits are a = 0.717, a = 0.758 and a = 0.659,
meaning that the angles between the linear fit and the horizontal line were 35.64°, 37.16° and 33.38° for
training ANN, PSOANFIS and SVM, respectively. This indicates that PSOANFIS provided the closest
linear fit to the diagonal line (45°) compared to ANN and SVM. This was also confirmed by the values of R
(Table 4). For the training part, PSOANFIS gave the highest value of R compared to ANN and SVM
(i.e. R=0.844, 0.861, 0.850 for ANN, PSOANFIS and SVM, respectively). On the contrary, SVM
provided the best training result as per the MAE criterion (i.e. MAE=3.171, 3.141, 2.853 for ANN,
Fig 6. Predicted versus target rainfall for the training dataset using: (a) ANN, (b) PSOANFIS, and (c)
SVM. Predicted versus target rainfall for the testing dataset using: (d) ANN, (e) PSOANFIS, and (f)
SVM.
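The angles quoted for the training fits follow from the slopes as angle = arctan(a), with the ideal diagonal "Predicted = Target" at 45°. The short sketch below reproduces this arithmetic for the training slopes stated in the text; the function name is hypothetical.

```python
import math


def fit_angle_degrees(slope):
    """Angle between a line of the given slope and the horizontal axis."""
    return math.degrees(math.atan(slope))


# Slopes of the training linear fits as quoted in the text.
training_slopes = {"ANN": 0.717, "PSOANFIS": 0.758, "SVM": 0.659}

for name, slope in training_slopes.items():
    angle = fit_angle_degrees(slope)
    print(f"{name}: angle = {angle:.2f} deg, "
          f"gap to the 45 deg diagonal = {45.0 - angle:.2f} deg")
```

The model with the smallest gap to 45° is the one whose linear fit lies closest to the diagonal.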
For the testing dataset, the SVM algorithm exhibited the best prediction results with respect to both MAE
and R criteria (MAE = 3.209, 3.281, 2.728 and R = 0.829, 0.844, 0.863 using ANN, PSOANFIS and
SVM, respectively). However, the linear fit of SVM does not have the smallest
deviation from the diagonal line. As shown in Fig 6, the slopes are a = 0.683, 0.819 and 0.733
using ANN, PSOANFIS and SVM, respectively. This indicates that the angles between linear equation
Table 4. Summary information of prediction capability of three AI models.
Dataset | Method | MAE | R
Testing | ANN | 3.209 | 0.829
Testing | PSOANFIS | 3.281 | 0.844
Fig 7. Histogram of error between predicted result and corresponding target using three AI models for
As shown in Fig 7, the error histogram of the SVM method exhibited the highest peak, centered at zero.
Statistical information on the histograms of the error e is also given in Table 5. The mean and median
values of e are -0.065 and 0.227 using ANN, 0.317 and 0.542 for PSOANFIS, and -0.333 and 0.224 for SVM,
respectively. Based on these values, ANN and SVM provided the mean and median values closest to zero
error, respectively. The SVM model presented the lowest variation of error distribution with respect to
all of the Q25, Q75 and standard deviation (StD) values. In the case of Q25, ANN, PSOANFIS and SVM
presented values of -1.600, -1.829, and -1.441, respectively. As regards Q75, ANN, PSOANFIS and SVM
produced values of 2.236, 2.574, and 1.544, respectively. Considering StD, ANN, PSOANFIS and SVM gave
values of 4.935, 4.634, and 4.244, respectively. These observations confirmed the highest concentration
of error distribution around zero with the SVM model. From the overall analysis of error distribution,
SVM gave the smallest deviation between prediction output and target rainfall.
Table 5. Error distribution of the AI techniques.
Method Q25 (mm) Q50 (mm) Q75 (mm) Mean (mm) StD (mm)
The accuracy of the proposed AI models is referenced to the accuracy of the GFS forecast. As the size
of the GFS data is very large (i.e. approximately 1 Gb for one day of information in GRB2 format at 1°
grid size) and it is not possible to request a subset of the archived database, only GFS data from June
01, 2010 to August 31, 2010 were retrieved for the calculation of SS. Indeed, based on the experimental
data, all 92 days from June 01, 2010 to August 31, 2010 had rain at the study area (the minimum
rainfall recorded was 1.27 mm). In addition, the average measured rainfall was 6.27, 13.92 and 19.18
mm for June, July and August 2010, respectively. Therefore, the proposed AI models and the reference
forecast were compared only under real rainy conditions. Consequently, around 90 Gb
of GRB2 files were downloaded from the NOAA database. GRB2 files were visualized
using zyGrib (Version 8.0.1) and Matlab (Version R2018). The cumulative daily rainfall was
downscaled from the global GFS forecast map at the location of the study area (i.e. longitude 105.312 and
latitude 20.763). Fig 8 presents an example of visualization of the GFS rainfall forecast for the Northern
region of Vietnam including the study area.
Fig 8. GFS rainfall forecast map for Northern region of Vietnam (August 24, 2010, at 12h), the study
area is marked by red square. GFS rainfall forecast 24h ahead is 5.66 mm/h at the study area.
Fig 9 shows the variation of daily rainfall from June 1, 2010 to August 31, 2010, including the
measured data and the values predicted by ANN, PSOANFIS, SVM and GFS. Table 6 gives the
values of SS for the three AI models based on two quality assessment criteria, MAE
and R. Based on MAE, the values of SS are 62.56, 67.17, and 66.16 using ANN, PSOANFIS and
SVM, respectively. Based on R, the values of SS are 71.59, 80.19, and 71.83 using ANN, PSOANFIS
and SVM, respectively. All values of SS given in Table 6 are positive, indicating that the
proposed AI models exhibited an improvement compared to the reference forecast. However, it should
also be noted that this calculation of SS was performed only for three months in the summer of 2010. A
longer range of days data analysis (for instance, by trying to overpass the GFS archived data size
Fig 9. Daily rainfall from June 2010 to August 2010, from measured data, predicted by ANN,
Table 6. Values of SS of the models based on different criteria such as MAE and R.
4.1.3. Using contingency scores
Contingency scores (POD, CSI and FAR) were evaluated to assess the performance of the AI models. The
daily rainfall threshold values range from 2 to 30 mm. Fig 10 presents the variation of CSI, POD and
FAR as a function of the daily rainfall threshold, whereas all these values are given in Table
7. As shown in Figs 10a and 10b, the values of CSI and POD decrease as the threshold increases
for all three AI models. In addition, the variation is not linear.
At a threshold of 2 mm, the values of CSI are 0.76, 0.75 and 0.78; the values of POD are
0.91, 0.94 and 0.89; and the values of FAR are 0.18, 0.21 and 0.14, using ANN, PSOANFIS and SVM,
respectively. At a threshold of 10 mm, the values of CSI decrease to 0.69, 0.70 and 0.70; the
values of POD decrease to 0.85, 0.80 and 0.81; and the values of FAR are 0.21, 0.15 and 0.15, using ANN,
PSOANFIS and SVM, respectively. Based on the contingency scores at different
thresholds, it could be deduced that all three AI models exhibit good performance for thresholds ranging
from 2 to 18 mm. It is also interesting to note that the SVM model provides the lowest FAR for thresholds
from 2 to 26 mm.
Fig 10. Contingency scores (a) CSI, (b) POD and (c) FAR as a function of threshold.
Table 7. Values of contingency scores of the models at different thresholds.
Threshold (mm) | ANN (CSI, POD, FAR) | PSOANFIS (CSI, POD, FAR) | SVM (CSI, POD, FAR)
26 | 0.20, 0.23, 0.38 | 0.38, 0.58, 0.48 | 0.16, 0.18, 0.35
A robustness analysis of the AI models was conducted to evaluate their prediction capability with
respect to the selection of the training dataset. In this study, 70% of the data were randomly selected
for the development of the ANN, PSOANFIS and SVM algorithms. In order to investigate the robustness
of each AI technique under input variability, 1000 Monte Carlo simulations were performed to take into
account random combinations of data indices. Consequently, 1000 values of the R and MAE criteria
were obtained by calculating the deviation between the testing targets and the corresponding outputs
of each model. Statistical analysis of the R and MAE distributions was then conducted to assess the
robustness of each AI model in the presence of input variability.
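The resampling procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions: a least-squares line stands in for the trained ANN, PSOANFIS or SVM model, the data are synthetic, and all function names are hypothetical. Only the 70% training fraction and the 1000 repetitions follow the text.

```python
import random
import statistics


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def fit_line(xs, ys):
    """Least-squares line; a hypothetical stand-in for training an AI model."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def monte_carlo_split(xs, ys, n_mc=1000, train_fraction=0.7, seed=0):
    """Repeat random train/test splits and collect R and MAE on each testing part."""
    rng = random.Random(seed)
    n_train = int(train_fraction * len(xs))
    r_values, mae_values = [], []
    for _ in range(n_mc):
        index = list(range(len(xs)))
        rng.shuffle(index)  # random combination of data indices
        train, test = index[:n_train], index[n_train:]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        target = [ys[i] for i in test]
        output = [slope * xs[i] + intercept for i in test]
        r_values.append(pearson_r(target, output))
        mae_values.append(statistics.fmean([abs(t - o) for t, o in zip(target, output)]))
    return r_values, mae_values


# Synthetic data (illustrative only): y depends linearly on x plus Gaussian noise.
data_rng = random.Random(42)
xs = [data_rng.uniform(0.0, 10.0) for _ in range(200)]
ys = [2.0 * x + data_rng.gauss(0.0, 1.0) for x in xs]

r_values, mae_values = monte_carlo_split(xs, ys, n_mc=1000)
print(f"mean R = {statistics.fmean(r_values):.3f}, mean MAE = {statistics.fmean(mae_values):.3f}")
```

Fixing the seed makes the sequence of splits reproducible, so the same set of splits can be reused to compare several models.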
The statistical convergence of the mean of R and MAE was first investigated in order to determine the
optimal number of Monte Carlo simulations, nMC, as shown in Fig 11. A large variation of both R and
MAE was observed when using the ANN technique with a small number of simulations (i.e. nMC < 150).
This means that the output obtained by ANN exhibited a high level of fluctuation (i.e. a low degree of
robustness in the presence of input variability). The stationary solution using ANN was reached at
nMC = 500 and 200 for the R and MAE criteria, respectively. In the cases of PSOANFIS and SVM, the
amplitude of variation was smaller than that of ANN. Consequently, these two models converged faster
than the ANN algorithm. The optimal values of nMC were approximately 400 and 200 for R using
PSOANFIS and SVM, respectively, whereas they were 150 and 100 in the case of MAE. It is worth
noting that for both the R and MAE criteria, the SVM method provided rather small variation (high
degree of robustness) and quick convergence (less time-consuming) to the stationary result, which
could indicate that the SVM technique is the most stable prediction algorithm while accounting for
input uncertainty. Besides, the number of 1000 realizations was proven to be
Fig 11. Statistical convergence function using the AI models: (a) R and (b) MAE
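The convergence check can be illustrated with a running mean. This is a minimal sketch, assuming convergence is judged from the cumulative mean of a criterion divided by its mean over all realizations; the exact convergence function used in the study is not restated in this section. The function names, the tolerance and the synthetic values are hypothetical.

```python
import random
from itertools import accumulate


def normalized_running_mean(values):
    """Cumulative mean after 1..n realizations, divided by the mean over all n."""
    final_mean = sum(values) / len(values)
    return [total / (k * final_mean) for k, total in enumerate(accumulate(values), start=1)]


def first_stable_index(curve, tolerance=0.005):
    """Smallest number of realizations after which the curve stays within tolerance of 1."""
    last_violation = 0
    for k, value in enumerate(curve, start=1):
        if abs(value - 1.0) > tolerance:
            last_violation = k
    return last_violation + 1


# Synthetic MAE-like values (illustrative only, not results from the study).
rng = random.Random(0)
mae_values = [rng.gauss(3.0, 0.3) for _ in range(1000)]

curve = normalized_running_mean(mae_values)
n_stable = first_stable_index(curve)
print(f"running mean stays within 0.5% of the final mean from realization {n_stable}")
```

A criterion with a wider spread needs more realizations before its running mean settles, which matches the slower convergence reported for ANN.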
Normalized histograms of the 1000 values of the R and MAE criteria, corresponding to the 1000 Monte
Carlo simulations, are presented in Fig 12. A quantitative summary of the statistical information is
also presented in Table 8. In the case of R, the values of the quantile Q25 were 0.817, 0.836 and 0.843,
the values of the median were 0.841, 0.846 and 0.849, the values of the quantile Q75 were 0.854, 0.854
and 0.855, and the values of the mean were 0.829, 0.844 and 0.849 for the ANN, PSOANFIS and SVM
algorithms, respectively. These results indicate that the SVM model could provide better prediction
performance than the others. It is also noted that the SVM model yielded the lowest dispersion of R
(StD = 0.009) compared to ANN and
Fig 12. Normalized histograms after 1000 Monte Carlo realizations: (a) R (with a resolution of 0.01)
Similar observations were also made for the statistical results of MAE. The MAE values of Q25, Q50,
Q75 and mean were 3.004, 3.164, 3.490 and 3.294 for the ANN method, whereas those of the PSOANFIS
model were 3.069, 3.145, 3.233 and 3.155, respectively. The lowest values of Q25, Q50, Q75 and mean
of MAE were observed using the SVM method (i.e. 2.782, 2.844, 2.911 and 2.846, respectively). The
standard deviation of MAE in the case of SVM was also the smallest, StD = 0.097, compared with 0.407
and 0.129 for ANN and PSOANFIS, respectively.
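The summary statistics quoted above (Q25, Q50, Q75, mean and StD) can be computed from a list of Monte Carlo results as sketched below. This is a minimal illustration: the function name is hypothetical, the quantile interpolation method and the use of the sample standard deviation are assumptions, and the input values are synthetic rather than results from the study.

```python
import random
import statistics


def summarize(values):
    """Quartiles, mean and standard deviation of a list of criterion values."""
    # assumption: quartiles by linear interpolation over the observed range
    q25, q50, q75 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "Q25": q25,
        "Q50": q50,
        "Q75": q75,
        "mean": statistics.fmean(values),
        "StD": statistics.stdev(values),  # assumption: sample standard deviation
    }


# Synthetic MAE-like values in mm (illustrative only).
rng = random.Random(1)
mae_values = [rng.gauss(3.0, 0.1) for _ in range(1000)]

summary = summarize(mae_values)
print("  ".join(f"{name}={value:.3f}" for name, value in summary.items()))
```

A narrow interquartile range together with a small StD indicates that the criterion changes little from one random split to another, which is the sense in which the text describes a model as robust.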
Table 8. R and MAE distribution of the AI models.
MAE (mm) | Q25 | Q50 | Q75 | Mean | StD
PSOANFIS | 3.069 | 3.145 | 3.233 | 3.155 | 0.129
5. Discussion
Accurate prediction of rainfall is one of the key problems for proper water resource management, flood
control and agriculture. In this study, three advanced AI techniques, namely PSOANFIS, ANN and
SVM, were developed and applied in Hoa Binh province, Vietnam, for daily rainfall prediction. The
results of these models were compared in order to select the best one. Comparative results of these
models, along with the results of other published studies, are presented in Table 9.
Analysis of results indicated that performance of these models in comparison to previous model studies
for prediction of rainfall 24h ahead (i.e. RSVM=0.829, MAESVM=2.728 mm, PODSVM(2mm)=0.89,
Rainfall is a dichotomous variable. Therefore, validating the ability of the proposed AI models to
predict rainfall during the transition from a non-rain day to a rainy day is desirable to indicate the
strengths and/or weaknesses of the models. In view of this, daily rainfall over four years, from 2010 to
2013, was evaluated for the measured data and the data predicted by ANN, PSOANFIS and SVM (Fig 13).
In the study area, the non-rain period (i.e. daily rainfall lower than 10 mm) typically occurs from
November to April, whereas the rain period occurs between May and October. Based on the plotted
rainfall time series, the three proposed AI models, especially SVM, exhibit a strong ability to track the
behavior of the rain process in passing from a non-rain period to a rain period. The meteorological
record indicated that the period from January to April 2010 was a non-rain period. During this period,
the ANN, PSOANFIS and SVM models also predicted very small quantities of rainfall (except for a high
peak of rainfall observed in January, as rainfall is a dichotomous variable). For the period from April
to October 2010, and from July to October (wet season), the ability of the proposed AI models to
predict precipitation was very good. For the years 2011, 2012 and 2013, the prediction performance of
the proposed AI models when passing from a non-rain period to a rainy day was also very good.
Table 9. Comparison between this study and previously published works for short-term rainfall prediction.
Reference | AI methods used | Lead time | Input variables | Space scale | Nr of stations | Metrics used | Values of metrics
(Kuligowski and Barros, 1998) | Back-propagation neural network | 6h, 24h | Moisture: total column precipitable water, column-average relative humidity, pressure levels, equivalent potential temperature. Vertical lift: vertical velocity, zonal/meridional velocity, horizontal divergence, absolute vorticity advection, thermal advection, vertical lapse rates | Medium-scale (Youghiogheny basin and Swatara Creek basin, USA) | 4 | R, RMSE, Threat score | RMSE=2.552 mm, R=0.549, Threat score6h max=0.32, Threat score24h max=0.47
(Sedki et al., 2009) | ANN-back-propagation (BP); ANN-genetic algorithm-BP | 24h | Rainfall and runoff daily data | Local (Ourika, Morocco) | 3 | RMSE, R2 | RMSEGABP=0.162 mm/day, R2GABP=0.91, RMSEBP=0.199 mm/day, R2BP=0.87
(Hung et |  |  | temperature, air pressure, wind speed | (Bangkok, Thailand) |  | Efficiency Index | R4h=0.64, R5h=0.62, R6h=0.60, RMSE1h=0.87 mm, RMSE2h=1.36 mm, RMSE3h=1.72 mm,
(Partal and Cigizoglu, 2009) |  |  | Mean temperature, daily max temperature, min temperature, total specific humidity, total evaporation | Global (Turkey) | 28 | MSE, R2 | R2min=0.701, R2max=0.882, MSEmin=2.12 mm2, MSEmax=9.37 mm2
(Wu and Chau, 2013) | ANN-Moving Average; ANN-Singular Spectrum | 24h | Max temperature, min temperature | Local (Wuxi, China) | 1 | RMSE, Willmott’s index | RMSEANN-MA=5.3 mm, RMSEANN-SSA=5.8 mm
(Dabhi and Chaudhary, 2014) | Wavelet-PostFix-Genetic Programming; Wavelet ANN | 24h | Max temperature, min temperature, mean temperature evaporation, relative humidity | Local (Anand, India) | - | MAE, MSE, R, Adjusted fitness | MAEGP=7.02 mm, MSEGP=49.25 mm2, RGP=0.67, MAEW-ANN=7.23 mm, MSEW-ANN=52.66 mm2, RW-ANN=0.69
(Sánchez-Monedero et al., 2014) | Hierarchical nominal–ordinal support vector classifier | 24h | Water vapor: total precipitable water, equivalent potential temperature, humidity, pressure levels. Vertical movements: temperature, wind speed, wind direction, CAPE, CIN | Local (Santiago de Compostela, Spain) | 1 | Accuracy, Geometric mean (GM), Average of the Mean Absolute Error (AMAE) | Accuracy=80.440, GM=35.470, AMAE=0.710 mm/6h
(Altunkaynak and Nigussie, 2015) | Wavelet - Multilayer Perceptron - Season algorithm | 24h | Rainfall | Local (Develi and Tomarza, Turkey) | 2 | RMSE, R2, CE, Skill Score | R2Develi=0.952, CEDeveli=0.921, RMSEDeveli=0.907 mm/day, R2Tomarza=0.941, CETomarza=0.919, RMSETomarza=0.968 mm/day
(Sehad et | Support vector | 3h, 24h | Optical and microphysical | Medium- domain with | 304 | R, RMSE, | Bias24h=0.61 mm, MAE24h=1.14 mm
(Asanjan et al., 2018) | Recurrent neural network; Long Short-Term Memory - PERSIANN algorithm | From 0.5h to 6h every 0.5h | Longwave IR channel of GOES provided by Climatic Prediction Center | Medium-scale (Oregon, Oklahoma, and Florida, USA) | - | R, RMSE, POD, CSI, FAR | For state of Oregon: RMSE1h=3.18 mm/h, R1h=0.65, POD1h=0.68, CSI1h=0.5, FAR1h=0.35, RMSE6h=2.20 mm/h, R6h=0.5, POD6h=0.35, CSI6h=0.35, FAR6h=0.45
(Jing et al., 2019) | Multi-Level Correlation Long Short-Term Memory | 0.5, 1 and 1.5h | Raw CINRAD/SA radar observations | Medium-scale (Hangzhou, Nanjing, Xiamen, Changsha, and Fuzhou, China) | 5 | POD, FAR, CSI, Heidke skill score (HSS) | POD1h=0.6937, FAR1h=0.2404, CSI1h=0.5810, HSS1h=0.6779, POD1.5h=0.6538, FAR1.5h=0.2818, CSI1.5h=0.5348, HSS1.5h=0.6298
Our study | ANN; PSOANFIS; SVM | 24h | Min temperature, max temperature, solar radiation, wind speed, relative humidity | Local (Hoa Binh, Vietnam) | 1 | R, MAE, POD, CSI, FAR, Robustness analysis | RSVM=0.829, MAESVM=2.728 mm, PODSVM(2mm)=0.89, CSISVM(2mm)=0.78, FARSVM(2mm)=0.14
Fig 13. Evaluation of daily rainfall for four years from 2010 to 2013 based on measured data and
However, it should be noted that the group of input variables employed in this study for training the AI
models was not the most appropriate for the 24-h physical process of precipitation. It has been pointed
out in many published works that the mechanisms producing precipitation in the next 24 hours are
2014) have classified information of clouds into three main groups: condensation nuclei, water vapor
and vertical movements. Therefore, in their studies, a robust ensemble of input variables was selected
at different pressure levels, involving total precipitable water, equivalent potential temperature and
humidity (water vapor group), and wind speed, wind direction, convective available potential energy
(CAPE) and convective inhibition (CIN) (vertical movements group). Hashim et al. (2016)
have also confirmed the significant influence of information of clouds when studying rainfall in Patna
city, India, based on a set of variables including cloud cover, vapor pressure, max and min temperature
and wet day frequency. Thus, as rainfall is a complex non-linear atmospheric process that largely
depends on the local spatial scale (Applequist et al., 2002), it is still difficult to identify the most
suitable set of meteorological variables for training the AI models while taking into account the
physical mechanism of the 24-h rainfall process. As investigated by Kuligowski and Barros (1998), a
combination of ten small-scale meteorological variables (grouping moisture and vertical lift
properties) at different altitudes gave an RMSE of around 2.5 mm when applying a neural network for
recommended to select meteorological variables taking into account characteristics of clouds, as
suggested by many researchers for daily rainfall prediction problems (Kuligowski and Barros, 1998;
Hall et al., 1999; Sánchez-Monedero et al., 2014; Ortiz-García et al., 2014).
Despite the constraints mentioned above, the present study suggests that SVM performs best for daily
rainfall prediction in comparison with the other two models (PSOANFIS and ANN), as this technique
provided the highest mean value of R and the lowest mean value of MAE over 1000 Monte Carlo
simulations.
6. Conclusion
In the present study, three advanced Artificial Intelligence (AI) models, namely PSOANFIS, ANN and
SVM, were applied for the prediction of daily rainfall in Hoa Binh province. Maximum temperature,
minimum temperature, wind speed, relative humidity and solar radiation were used as input parameters
and daily rainfall as the output parameter of the models. The models were validated using R, MAE, SS,
POD, CSI and FAR. Performance of these models in prediction of daily rainfall is very
FARSVM(2mm)=0.14).
The present study suggested that SVM performs best for daily rainfall prediction in comparison with
the other two models (PSOANFIS and ANN), as this technique provided the highest mean value of R and
the lowest mean value of MAE over 1000 Monte Carlo simulations. This model was also found to be
the most robust and efficient prediction model using the Monte Carlo approach.
The AI-based study would be helpful for quick and accurate prediction of daily rainfall. However, as
rainfall is a dichotomous variable, validation of the proposed AI models for the transition from a
non-rain day to a rainy day is desirable to indicate the strengths and/or weaknesses of the models.
The analysis indicated that for the study years 2010, 2011, 2012 and 2013, the prediction performance
of the proposed AI models was very good. However, as the mechanisms producing precipitation in the
next 24 hours are highly related to information of clouds, an integrated study may have to be adopted
to refine the AI models for the daily prediction of rainfall.
Conflict of Interest: The authors declare that there is no conflict of interest.
Reference
Abbot, J., Marohasy, J., 2017. Skilful rainfall forecasts from artificial neural networks with long duration series
and single-month optimization. Atmospheric Research 197, 289–299.
https://doi.org/10.1016/j.atmosres.2017.07.015
Abbot, J., Marohasy, J., 2014a. Input selection and optimisation for monthly rainfall forecasting in Queensland,
Australia, using artificial neural networks. Atmospheric Research 138, 166–178.
https://doi.org/10.1016/j.atmosres.2013.11.002
Abbot, J., Marohasy, J., 2014b. Input selection and optimisation for monthly rainfall forecasting in Queensland,
Australia, using artificial neural networks. Atmospheric Research 138, 166–178.
https://doi.org/10.1016/j.atmosres.2013.11.002
Abedini, M., Ghasemian, B., Shirzadi, A., Shahabi, H., Chapi, K., Pham, B.T., Ahmad, B.B., Bui, D.T., 2019. A
novel hybrid approach of Bayesian Logistic Regression and its ensembles for landslide susceptibility
assessment. Geocarto International 34, 1427–1457. https://doi.org/10.1080/10106049.2018.1499820
Acharya, N., Shrivastava, N.A., Panigrahi, B.K., Mohanty, U.C., 2014. Development of an artificial neural
network based multi-model ensemble to estimate the northeast monsoon rainfall over south peninsular
India: an application of extreme learning machine. Climate Dynamics 43, 1303–1310.
Al Mamun, A., bin Salleh, Md.N., Noor, H.M., 2018. Estimation of short-duration rainfall intensity from daily
rainfall values in Klang Valley, Malaysia. Appl Water Sci 8, 203. https://doi.org/10.1007/s13201-018-
0854-z
AlHassoun, S.A., 2011. Developing an empirical formulae to estimate rainfall intensity in Riyadh region. Journal
of King Saud University - Engineering Sciences 23, 81–88. https://doi.org/10.1016/j.jksues.2011.03.003
Altunkaynak, A., Nigussie, T.A., 2015. Prediction of daily rainfall by a hybrid wavelet-season-neuro technique.
Journal of Hydrology 529, 287–301. https://doi.org/10.1016/j.jhydrol.2015.07.046
Applequist, S., Gahrs, G.E., Pfeffer, R.L., Niu, X.-F., 2002. Comparison of Methodologies for Probabilistic
Quantitative Precipitation Forecasting. Wea. Forecasting 17, 783–799. https://doi.org/10.1175/1520-
0434(2002)017<0783:COMFPQ>2.0.CO;2
Asanjan, A.A., Yang, T., Hsu, K., Sorooshian, S., Lin, J., Peng, Q., 2018. Short-Term Precipitation Forecast
Based on the PERSIANN System and LSTM Recurrent Neural Networks. Journal of Geophysical
Research: Atmospheres 123, 12,543-12,563. https://doi.org/10.1029/2018JD028375
Ashby, S.A., Taylor, M.A., Chen, A.A., 2005. Statistical models for predicting rainfall in the Caribbean. Theor.
Appl. Climatol. 82, 65–80. https://doi.org/10.1007/s00704-004-0118-8
Awadallah, A.G., Magdy, M., Helmy, E., Rashed, E., 2017. Assessment of Rainfall Intensity Equations Enlisted
in the Egyptian Code for Designing Potable Water and Sewage Networks [WWW Document]. Advances
in Meteorology. https://doi.org/10.1155/2017/9496787
Azadi, M., Taghizadeh, E., Memarian, M.H., Dmitrieva-Arrago, L.R., 2013. Comparing the results of
precipitation forecast based on mesoscale models on the territory of Iran during the cold season. Russian
Meteorology and Hydrology.
Benedetti, R., 2010. Scoring Rules for Forecast Verification. Mon. Wea. Rev. 138, 203–211.
https://doi.org/10.1175/2009MWR2945.1
Bezak, N., Šraj, M., Mikoš, M., 2016. Copula-based IDF curves and empirical rainfall thresholds for flash floods
and rainfall-induced landslides. Journal of Hydrology, Flash floods, hydro-geomorphic response and risk
management 541, 272–284. https://doi.org/10.1016/j.jhydrol.2016.02.058
Black, T.L., 1994. The New NMC Mesoscale Eta Model: Description and Forecast Examples. Wea. Forecasting
9, 265–278. https://doi.org/10.1175/1520-0434(1994)009<0265:TNNMEM>2.0.CO;2
Bui, D.T., Tsangaratos, P., Ngo, P.-T.T., Pham, T.D., Pham, B.T., 2019. Flash flood susceptibility modeling
using an optimized fuzzy rule based feature selection technique and tree based ensemble methods.
Science of The Total Environment 668, 1038–1054. https://doi.org/10.1016/j.scitotenv.2019.02.422
Bui, K.-T.T., Tien Bui, D., Zou, J., Van Doan, C., Revhaug, I., 2018. A novel hybrid artificial intelligent
approach based on neural fuzzy inference model and particle swarm optimization for horizontal
displacement modeling of hydropower dam. Neural Comput & Applic 29, 1495–1506.
https://doi.org/10.1007/s00521-016-2666-0
Chegaar, M., Chibani, A., 2001. Global solar radiation estimation in Algeria. Energy Conversion and
Management 42, 967–973. https://doi.org/10.1016/S0196-8904(00)00105-9
Chen, W., Panahi, M., Pourghasemi, H.R., 2017. Performance evaluation of GIS-based new ensemble data
mining techniques of adaptive neuro-fuzzy inference system (ANFIS) with genetic algorithm (GA),
differential evolution (DE), and particle swarm optimization (PSO) for landslide spatial modelling.
CATENA 157, 310–324. https://doi.org/10.1016/j.catena.2017.05.034
Cortes, C., Vapnik, V., 1995. Support-Vector Networks. Machine Learning 20, 273–297.
Dabhi, V.K., Chaudhary, S., 2014. Hybrid Wavelet-Postfix-GP Model for Rainfall Prediction of Anand Region
of India [WWW Document]. Advances in Artificial Intelligence. https://doi.org/10.1155/2014/717803
Dao, D.V., Ly, H.-B., Trinh, S.H., Le, T.-T., Pham, B.T., 2019a. Artificial Intelligence Approaches for
Prediction of Compressive Strength of Geopolymer Concrete. Materials 12, 983.
https://doi.org/10.3390/ma12060983
Dao, D.V., Trinh, S.H., Ly, H.-B., Pham, B.T., 2019b. Prediction of Compressive Strength of Geopolymer
Concrete Using Entirely Steel Slag Aggregates: Novel Hybrid Artificial Intelligence Approaches.
Applied Sciences 9, 1113. https://doi.org/10.3390/app9061113
Deo, R.C., Şahin, M., 2015. Application of the Artificial Neural Network model for prediction of monthly
Standardized Precipitation and Evapotranspiration Index using hydrometeorological parameters and
climate indices in eastern Australia. Atmospheric Research 161–162, 65–81.
https://doi.org/10.1016/j.atmosres.2015.03.018
Deo, R.C., Salcedo-Sanz, S., Carro-Calvo, L., Saavedra-Moreno, B., 2018. Chapter 10 - Drought Prediction
With Standardized Precipitation and Evapotranspiration Index and Support Vector Regression Models,
in: Samui, P., Kim, D., Ghosh, C. (Eds.), Integrating Disaster Science and Management. Elsevier, pp.
151–174. https://doi.org/10.1016/B978-0-12-812056-9.00010-5
Dile, Y.T., Srinivasan, R., 2014. Evaluation of CFSR climate data for hydrologic prediction in data-scarce
watersheds: an application in the Blue Nile River Basin. JAWRA Journal of the American Water
Resources Association 50, 1226–1241. https://doi.org/10.1111/jawr.12182
Dou, J., Yunus, A.P., Tien Bui, D., Merghadi, A., Sahana, M., Zhu, Z., Chen, C.-W., Khosravi, K., Yang, Y.,
Pham, B.T., 2019. Assessment of advanced random forest and decision tree algorithms for modeling
rainfall-induced landslide susceptibility in the Izu-Oshima Volcanic Island, Japan. Science of The Total
Environment 662, 332–346. https://doi.org/10.1016/j.scitotenv.2019.01.221
El-Shafie, A., Noureldin, A., Taha, M., Hussain, A., Mukhlisin, M., 2012. Dynamic versus static neural network
model for rainfall forecasting at Klang River Basin, Malaysia. Hydrology and Earth System Sciences 16,
1151–1169. https://doi.org/10.5194/hess-16-1151-2012
Evgeniou, T., Pontil, M., n.d. WORKSHOP ON SUPPORT VECTOR MACHINES: THEORY AND
APPLICATIONS.
Faccini, F., Luino, F., Paliaga, G., Sacchini, A., Turconi, L., de Jong, C., 2018. Role of rainfall intensity and
urban sprawl in the 2014 flash flood in Genoa City, Bisagno catchment (Liguria, Italy). Applied
Geography 98, 224–241. https://doi.org/10.1016/j.apgeog.2018.07.022
Fuka, D.R., Walter, M.T., MacAlister, C., Degaetano, A.T., Steenhuis, T.S., Easton, Z.M., 2014. Using the
Climate Forecast System Reanalysis as weather input data for watershed models. Hydrological
Processes 28, 5613–5623. https://doi.org/10.1002/hyp.10073
Garric, G., Douville, H., Déqué, M., 2002. Prospects for improved seasonal predictions of monsoon precipitation
over Sahel. International Journal of Climatology 22, 331–345. https://doi.org/10.1002/joc.736
Gouda, S.G., Hussein, Z., Luo, S., Yuan, Q., 2019. Model selection for accurate daily global solar
radiation prediction in China. Journal of Cleaner Production 221, 132–144.
Gneiting, T., Raftery, A.E., 2007. Strictly Proper Scoring Rules, Prediction, and Estimation. Journal of the
American Statistical Association 102, 359–378. https://doi.org/10.1198/016214506000001437
Goffart, J., Mara, T., Wurtz, E., 2017. Generation of stochastic weather data for uncertainty and sensitivity
analysis of a low-energy building. Journal of Building Physics 41, 41–57.
https://doi.org/10.1177/1744259116668598
Guilleminot, J., Le, T.T., Soize, C., 2013. Stochastic framework for modeling the linear apparent behavior of
complex materials: Application to random porous materials with interphases. Acta Mechanica Sinica 29,
773–782. https://doi.org/10.1007/s10409-013-0101-7
Haddad, M.S., 2011. Capacity choice and water management in hydroelectricity systems. Energy Economics 33,
168–177. https://doi.org/10.1016/j.eneco.2010.05.005
Hall, T., Brooks, H.E., Doswell, C.A., 1999. Precipitation Forecasting Using a Neural Network. Wea.
Kennedy, J., Eberhart, R., 1995. Particle swarm optimization, in: Proceedings of ICNN’95 - International
Conference on Neural Networks. Presented at the Proceedings of ICNN’95 - International Conference
on Neural Networks, pp. 1942–1948 vol.4. https://doi.org/10.1109/ICNN.1995.488968
Khosravi, A., Nahavandi, S., Creighton, D., 2010. A prediction interval-based approach to determine optimal
structures of neural network metamodels. Expert Systems with Applications 37, 2377–2387.
https://doi.org/10.1016/j.eswa.2009.07.059
Khosravi, K., Shahabi, H., Pham, B.T., Adamawoski, J., Shirzadi, A., Pradhan, B., Dou, J., Ly, H.-B., Gróf, G.,
Ho, H.L., 2019. A Comparative Assessment of Flood Susceptibility Modeling Using Multi-Criteria
Decision-Making Analysis and Machine Learning Methods. Journal of Hydrology.
Kuligowski, R.J., Barros, A.P., 1998. Localized Precipitation Forecasts from a Numerical Weather Prediction
Ly, H.-B., Monteiro, E., Le, T.-T., Le, V.M., Dal, M., Regnier, G., Pham, B.T., 2019d. Prediction and Sensitivity
Analysis of Bubble Dissolution Time in 3D Selective Laser Sintering Using Ensemble Decision Trees.
Materials 12, 1544.
Ly, H.-B., Pham, B.T., Dao, D.V., Le, V.M., Le, L.M., Le, T.-T., 2019e. Improvement of ANFIS Model for
Prediction of Compressive Strength of Manufactured Sand Concrete. Applied Sciences 9, 3841.
https://doi.org/10.3390/app9183841
Lynch, P., 2008. The origins of computer weather prediction and climate modeling. Journal of Computational
Physics, Predicting weather, climate and extreme events 227, 3431–3444.
https://doi.org/10.1016/j.jcp.2007.02.034
Mahmud, M., Ross, R.S., 2005. Precipitation assessment of a superensemble forecast over South-East Asia.
Meteorological Applications 12, 177–186.
McCulloch, W.S., Pitts, W., 1943. A logical calculus of the ideas immanent in nervous activity. Bulletin of
Mathematical Biophysics 5, 115–133. https://doi.org/10.1007/BF02478259
Mehdi Keshtkar, M., Merad, L., Sidi Mohammed, M., Miraoui, A., BOUSAHLA, M., Hanaeia, H., Kakaei
Lafdani, E., Moghaddam Nia, A., Pahlavanravi, A., Ahmadi, A., JAJARMIZADEH, M., 2013. Daily
Rainfall-Runoff Prediction and Simulation Using ANN, ANFIS and Conceptual Hydrological
MIKE11/NAM Models. International Journal of Engineering & Technology Sciences 1.
Mghouchi, Y.E., T. Ajzoul, A., Bouardi, E., 2016. Prediction of daily solar radiation intensity by day of the year
in twenty-four cities of Morocco. Renewable and Sustainable Energy Reviews 53, 823–831.
Mislan, Haviluddin, Hardwinarto, S., Sumaryono, Aipassa, M., 2015. Rainfall Monthly Prediction Based on
Artificial Neural Network: A Case Study in Tenggarong Station, East Kalimantan - Indonesia. Procedia
Computer Science, International Conference on Computer Science and Computational Intelligence
(ICCSCI 2015) 59, 142–151. https://doi.org/10.1016/j.procs.2015.07.528
Møller, M.F., 1991. A Scaled Conjugate Gradient Algorithm for Fast Supervised Learning. Neural Networks.
Montavon, G., Rupp, M., Gobre, V., Vazquez-Mayagoitia, A., Hansen, K., Tkatchenko, A., Müller, K.-R.,
Lilienfeld, O.A. von, 2013. Machine learning of molecular electronic properties in chemical compound
optimization to predict the Standardized Precipitation and Evaporation Index in a drought-prone region.
Atmospheric Research 212, 130–149. https://doi.org/10.1016/j.atmosres.2018.05.012
Mousavi, S.M., Mostafavi, E.S., Jiao, P., 2017. Next generation prediction model for daily solar radiation on
horizontal surface using a hybrid neural network and simulated annealing method. Energy Conversion
and Management 153, 671–682. https://doi.org/10.1016/j.enconman.2017.09.040
Murphy, A.H., 1988. Skill Scores Based on the Mean Square Error and Their Relationships to the Correlation
Nguyen, H.-L., Le, T.-H., Pham, C.-T., Le, T.-T., Ho, L.S., Le, V.M., Pham, B.T., Ly, H.-B., 2019a.
Development of Hybrid Artificial Intelligence Approaches and a Support Vector Machine Algorithm for
Predicting the Marshall Parameters of Stone Matrix Asphalt. Applied Sciences 9, 3172.
https://doi.org/10.3390/app9153172
Nguyen, H.-L., Pham, B.T., Son, L.H., Thang, N.T., Ly, H.-B., Le, T.-T., Ho, L.S., Le, T.-H., Tien Bui, D.,
2019b. Adaptive Network Based Fuzzy Inference System with Meta-Heuristic Optimizations for
International Roughness Index Prediction. Applied Sciences 9, 4715.
https://doi.org/10.3390/app9214715
Nielsen, J.E., Thorndahl, S., Rasmussen, M.R., 2014. Improving weather radar precipitation estimates by
combining two types of radars. Atmospheric Research 139, 36–45.
https://doi.org/10.1016/j.atmosres.2013.12.013
Niu, J., Zhang, W., 2015. Comparative analysis of statistical models in rainfall prediction, in: 2015 IEEE
International Conference on Information and Automation. Presented at the 2015 IEEE International
Conference on Information and Automation, pp. 2187–2190.
https://doi.org/10.1109/ICInfA.2015.7279650
Novak, D.R., Bailey, C., Brill, K.F., Burke, P., Hogsett, W.A., Rausch, R., Schichtel, M., 2014. Precipitation and
Temperature Forecast Performance at the Weather Prediction Center. Weather and Forecasting 29, 489–
504.
Ortiz-García, E.G., Salcedo-Sanz, S., Casanova-Mateo, C., 2014. Accurate precipitation prediction with support
vector classifiers: A study including novel predictive variables and observational data. Atmospheric
Research 139, 128–136. https://doi.org/10.1016/j.atmosres.2014.01.012
Paoli, C., Voyant, C., Muselli, M., Nivet, M.-L., 2010. Forecasting of preprocessed daily solar radiation time
series using neural networks. Solar Energy 84, 2146–2160. https://doi.org/10.1016/j.solener.2010.08.011
Partal, T., Cigizoglu, H.K., 2009. Prediction of daily precipitation using wavelet-neural networks.
Hydrological Sciences Journal 54, 234–246. https://doi.org/10.1623/hysj.54.2.234
Partal, T., Cigizoglu, H.K., Kahya, E., 2015. Daily precipitation predictions using three different wavelet neural
network algorithms by meteorological data. Stoch Environ Res Risk Assess 29, 1317–1329.
https://doi.org/10.1007/s00477-015-1061-1
Pearson Karl, Galton Francis, 1895. VII. Note on regression and inheritance in the case of two parents.
Proceedings of the Royal Society of London 58, 240–242. https://doi.org/10.1098/rspl.1895.0041
Pham, B.T., Nguyen, M.D., Dao, D.V., Prakash, I., Ly, H.-B., Le, T.-T., Ho, L.S., Nguyen, K.T., Ngo, T.Q.,
Hoang, V., Son, L.H., Ngo, H.T.T., Tran, H.T., Do, N.M., Van Le, H., Ho, H.L., Tien Bui, D., 2019a.
Development of artificial intelligence models for the prediction of Compression Coefficient of soil: An
application of Monte Carlo sensitivity analysis. Science of The Total Environment 679, 172–184.
https://doi.org/10.1016/j.scitotenv.2019.05.061
Pham, B.T., Nguyen, M.D., Ly, H.-B., Pham, T.A., Hoang, V., Van Le, H., Le, T.-T., Nguyen, H.Q., Bui, G.L.,
2020. Development of Artificial Neural Networks for Prediction of Compression Coefficient of Soft
Soil, in: Ha-Minh, C., Dao, D.V., Benboudjema, F., Derrible, S., Huynh, D.V.K., Tang, A.M. (Eds.),
CIGOS 2019, Innovation for Sustainable Infrastructure, Lecture Notes in Civil Engineering. Springer
Singapore, pp. 1167–1172.
Pham, B.T., Prakash, I., Jaafari, A., Bui, D.T., 2018. Spatial Prediction of Rainfall-Induced Landslides Using
Aggregating One-Dependence Estimators Classifier. J Indian Soc Remote Sens 46, 1457–1470.
https://doi.org/10.1007/s12524-018-0791-1
Pham, B.T., Prakash, I., Singh, S.K., Shirzadi, A., Shahabi, H., Tran, T.-T.-T., Bui, D.T., 2019b. Landslide
susceptibility modeling using Reduced Error Pruning Trees and different ensemble techniques: Hybrid
machine learning approaches. CATENA 175, 203–218. https://doi.org/10.1016/j.catena.2018.12.018
Pham, B.T., Tien Bui, D., Pham, H.V., Le, H.Q., Prakash, I., Dholakia, M.B., 2017. Landslide Hazard
Assessment Using Random SubSpace Fuzzy Rules Based Classifier Ensemble and Probability Analysis
of Rainfall Data: A Case Study at Mu Cang Chai District, Yen Bai Province (Viet Nam). J Indian Soc
Remote Sens 45, 673–683. https://doi.org/10.1007/s12524-016-0620-3
Philip, N.S., Joseph, K.B., 2003. A neural network tool for analyzing trends in rainfall. Computers &
Geosciences 29, 215–223. https://doi.org/10.1016/S0098-3004(02)00117-6
Rajeevan, M., Pai, D.S., Anil Kumar, R., Lal, B., 2007. New statistical models for long-range forecasting of
southwest monsoon rainfall over India. Clim Dyn 28, 813–828. https://doi.org/10.1007/s00382-006-
0197-6
Robert, C., Casella, G., 2004. Monte Carlo Statistical Methods, 2nd ed, Springer Texts in Statistics. Springer-
Verlag, New York.
Sahai, A.K., Soman, M.K., Satyan, V., 2000. All India summer monsoon rainfall prediction using an artificial
neural network. Climate Dynamics 16, 291–302.
Sánchez-Monedero, J., Salcedo-Sanz, S., Gutiérrez, P.A., Casanova-Mateo, C., Hervás-Martínez, C., 2014.
Simultaneous modelling of rainfall occurrence and amount using a hierarchical nominal–ordinal support
vector classifier. Engineering Applications of Artificial Intelligence 34, 199–207.
https://doi.org/10.1016/j.engappai.2014.05.016
Scharf, L.L., Demeure, C., 1991. Statistical Signal Processing: Detection, Estimation, and Time Series Analysis.
Addison-Wesley Publishing Company.
Sedki, A., Ouazar, D., El Mazoudi, E., 2009. Evolving neural network using real coded genetic algorithm for
daily rainfall–runoff forecasting. Expert Systems with Applications 36, 4523–4527.
https://doi.org/10.1016/j.eswa.2008.05.024
Sehad, M., Lazri, M., Ameur, S., 2017. Novel SVM-based technique to improve rainfall estimation over the
Mediterranean region (north of Algeria) using the multispectral MSG SEVIRI imagery. Advances in
Space Research 59, 1381–1394. https://doi.org/10.1016/j.asr.2016.11.042
Senior, C.A., Jones, R.G., Lowe, J.A., Durman, C.F., Hudson, D., 2002. Predictions of extreme precipitation and
sea-level rise under climate change. Philos Trans A Math Phys Eng Sci 360, 1301–1311.
https://doi.org/10.1098/rsta.2002.1001
Proceedings. IEEE World Congress on Computational Intelligence (Cat. No.98TH8360), pp. 69–73.
https://doi.org/10.1109/ICEC.1998.699146
Silvestro, F., Rebora, N., 2014. Impact of precipitation forecast uncertainties and initial soil moisture conditions
interpolation and Bayesian gauge-based adjustment. Journal of Hydrology, Hydrologic Applications of
Weather Radar 531, 408–426. https://doi.org/10.1016/j.jhydrol.2015.05.049
Warsito, B., Gernowo, R., Sugiharto, A., 2016. Rainfall Prediction by Using Wavelet General Regression Neural
Network. International Journal of Applied Mathematics and Statistics TM 54, 32–41.
Wei, C.-C., 2013. Soft computing techniques in ensemble precipitation nowcast. Applied Soft Computing 13,
793–805. https://doi.org/10.1016/j.asoc.2012.10.006
Werbos, P., 1974. Beyond regression: New tools for prediction and analysis in the behavioral sciences.
Wilks, D.S., 2011. Statistical Methods in the Atmospheric Sciences. Academic Press.
Wu, C.L., Chau, K.W., 2013. Prediction of rainfall time series using modular soft computing methods.
https://doi.org/10.1016/j.engappai.2012.05.023
Wu, C.L., Chau, K.W., Fan, C., 2010. Prediction of rainfall time series using modular artificial neural networks
coupled with data-preprocessing techniques. Journal of Hydrology 389, 146–167.
Yuan, W., Göncü, A., Ökten, G., 2015. Estimating sensitivities of temperature-based weather derivatives.
Applied Economics 47, 1942–1955. https://doi.org/10.1080/00036846.2014.1002888
Zellou, B., Rahali, H., 2019. Assessment of the joint impact of extreme rainfall and storm surge on the risk of
flooding in a coastal area. Journal of Hydrology 569, 647–665.
https://doi.org/10.1016/j.jhydrol.2018.12.028
Highlights:
Rainfall prediction was carried out using AI methods, namely ANN, PSOANFIS and SVM;
R, MAE, Skill Score and contingency scores were employed to validate the models;
The Monte Carlo method was applied to analyze the robustness of the AI models;
The AI-based approach could support quick and accurate prediction of daily rainfall.
Author statement:
Vuong Minh Le: Data curation, Visualization, Writing- Original draft preparation