Forest Fire Prediction System Using Machine Learning
https://doi.org/10.22214/ijraset.2020.32546
International Journal for Research in Applied Science & Engineering Technology (IJRASET)
ISSN: 2321-9653; IC Value: 45.98; SJ Impact Factor: 7.429
Volume 8 Issue XII Dec 2020- Available at www.ijraset.com
Abstract: Forest fires are uncontrolled fires that usually occur in forest areas or wild land. They cause significant damage to
natural and human resources and are among the most dangerous disasters for the ecological environment. The recommended
system uses machine learning and artificial intelligence techniques together with a wireless network that collects weather data
continuously, 24 hours a day, which gives a high chance of accurately reflecting the status of the forest environment. With this
system, we can decide which days carry a high possibility of forest fire, so that forest guards can pay special attention to
prevention on those days. Forest fire prediction is a significant component of forest fire management and plays a major role in
resource allocation, mitigation, and recovery efforts. This paper analyzes forest fire prediction methods based on machine
learning. A novel forest fire risk prediction algorithm, based on support vector machines, is presented. The algorithm relies on
previous weather conditions and data to predict the fire hazard level of a forest. The algorithm is implemented using the present
data and accurately predicts the hazard of fire occurrence.
Index Terms: wildfires; susceptibility mapping; machine learning; random forest; spatial cross-validation; correlation and
regression
I. INTRODUCTION
Forest fires have become one of the major disasters of recent years. Their effects on the environment are lasting: they lead to
deforestation and contribute to global warming, which is in turn one of the major causes of their occurrence. Forest fires can be
dealt with by collecting satellite images of the forest and notifying the authorities when a fire emergency arises, so that they can
neutralize its effects. However, by the time the authorities learn of the situation, the fire will often already have caused a lot of
damage to the affected sector. Data mining and machine learning techniques offer an efficient prevention approach, in which data
associated with forests are used to predict the places with a high possibility of forest fire. Numerous algorithms, such as
Logistic Regression, Support Vector Machine, Random Forest, and K-Nearest Neighbors, in addition to Bagging and Boosting
predictors, are used, both with and without Principal Component Analysis (PCA). Among the models to which PCA was applied,
Logistic Regression gave the highest F1 score of 68.26, and among the models without PCA, Gradient Boosting gave the highest
score of 68.36. Geostationary satellite remote sensing systems are a useful tool for forest fire detection and monitoring because of
their high temporal resolution over large areas. These computerized systems are capable of capturing, storing, analyzing, and
displaying geographically referenced information, that is, data identified according to location.
Forest fires are a major environmental issue, creating economic damage and ecological imbalance while threatening human lives.
Fast detection is a key element in controlling such phenomena. To achieve this, one alternative is to use automatic tools based on
local sensors, such as those provided by meteorological stations.
B. Existing System
1) The current system consists of data mining techniques and sensors that are capable of sensing smoke and fire. Meteorological
conditions (e.g. temperature, wind) are a major factor in forest fires, and several fire indexes, such as the forest Fire Weather
Index (FWI), use such data. In this work, we explore a Data Mining (DM) approach to predict the burned area of forest
fires.
2) Five different DM techniques, e.g. Support Vector Machines (SVM) and Random Forests, and four distinct feature selection
setups (using spatial, temporal, FWI components and weather attributes), were tested on real-world data collected from the
northeast region of Portugal. The best configuration uses an SVM and four meteorological inputs (i.e. temperature, relative
humidity, rain and wind) and it is capable of predicting the burned area caused by small fires, which are more frequent. Such
knowledge is particularly useful for improving firefighting resource management (e.g. prioritizing targets for air tankers and
ground crews).
3) Our system uses imagery with high temporal and spatial resolution to prevent this destruction. Geostationary satellite remote
sensing systems are a useful tool for forest fire detection and monitoring because of their high temporal resolution over
large areas. We propose a combined 3-step forest fire detection algorithm (i.e., thresholding, machine learning-based
modeling, and post processing) that relies on geostationary satellite data.
4) The threshold-based algorithm filtered the forest fire candidate pixels using adaptive threshold values that consider the diurnal
cycle and seasonality of forest fires, while allowing a high rate of false alarms. The random forest (RF) machine learning model
then effectively removed the false alarms from the results of the threshold-based algorithm (overall accuracy ~99.16%,
probability of detection (POD) ~93.08%, probability of false detection (POFD) ~0.07%, and a 96% reduction of the false alarmed
pixels for validation), and the remaining false alarms were removed through post processing using the forest map.
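The detection scores quoted in the list above are derived from a confusion matrix of fire versus non-fire pixels. The following is a minimal sketch, assuming the standard definitions POD = TP / (TP + FN) and POFD = FP / (FP + TN). The function name and the counts in the usage lines are hypothetical illustration values, not figures from the cited study.

```python
def detection_scores(tp, fp, fn, tn):
    """Return (accuracy, POD, POFD) from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    pod = tp / (tp + fn)     # probability of detection (recall on fire pixels)
    pofd = fp / (fp + tn)    # probability of false detection
    return accuracy, pod, pofd


# Hypothetical counts, for illustration only.
acc, pod, pofd = detection_scores(tp=90, fp=5, fn=10, tn=895)
print(f"accuracy={acc:.3f} POD={pod:.3f} POFD={pofd:.4f}")
```

A low POFD alongside a high POD is what the random forest filtering step is meant to achieve: few non-fire pixels flagged, most fire pixels kept.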
IV. METHODOLOGY
In this project, we tried to predict the burned area within Montesinho park. The Forest Fires Data Set was used for this
analysis. The data were clustered, and stepwise regression methods were applied to choose the single best predictor. It is
interesting to see which predictor has the biggest impact on the burned area in each cluster.
C. Method 3: Bagging
Bagging, short for bootstrap aggregation, is a classic ensemble method. A bagging algorithm consists of many classifiers, each of
which uses only a portion of the data in each iteration; their outputs are then combined through a model averaging technique. The
idea is to reduce overfitting within the class of models. The bootstrap method in bagging creates a random subset of data from a
given dataset by sampling with replacement.
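The bootstrap-and-average idea can be illustrated with a short sketch. It assumes a regression setting (a burned-area style target) with a one-feature least-squares line as the base learner; the paper does not specify its base learner, so that choice and all function names here are hypothetical.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:                      # degenerate resample: fall back to a constant
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def bagging_fit(xs, ys, n_models=25, seed=0):
    """Fit n_models base learners, each on its own bootstrap resample."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]   # sampling with replacement
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def bagging_predict(models, x):
    """Average the base-learner predictions (model averaging)."""
    return mean(a + b * x for a, b in models)


# Toy data on the exact line y = 2x + 1, for illustration only.
xs = list(range(1, 9))
ys = [2 * x + 1 for x in xs]
models = bagging_fit(xs, ys)
pred = bagging_predict(models, 10)
```

Because every resample sees a slightly different subset, the averaged prediction varies less than any single model fitted to noisy data, which is the overfitting reduction described above.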
V. IMPLEMENTATION
The linear regression, ridge regression, and lasso regression algorithms are implemented in a Jupyter Notebook. Jupyter Notebook
is an open-source tool for writing and executing Python in the browser, and it is widely used for the implementation
of machine learning algorithms such as regression, classification, and clustering.
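To show how the three regression variants differ, the sketch below computes the slope of a one-feature model under each penalty in closed form. It is an illustration under stated assumptions, not the notebook code used in this work: the objective scalings in the docstring, the function name, and the penalty strength `lam` are all chosen here for the example.

```python
from statistics import mean


def one_feature_fits(xs, ys, lam):
    """Slope b of y ~ b*x on centered data under three objectives.

    OLS   minimises sum((y - b*x)**2)
    ridge minimises sum((y - b*x)**2) + lam * b**2
    lasso minimises 0.5 * sum((y - b*x)**2) + lam * abs(b)
    """
    mx, my = mean(xs), mean(ys)
    xc = [x - mx for x in xs]
    yc = [y - my for y in ys]
    sxx = sum(x * x for x in xc)
    sxy = sum(x * y for x, y in zip(xc, yc))

    ols = sxy / sxx
    ridge = sxy / (sxx + lam)                 # penalty shrinks the slope smoothly
    # Soft-thresholding: pull sxy toward zero by lam, clipping at zero.
    shrunk = max(abs(sxy) - lam, 0.0)
    lasso = (shrunk if sxy >= 0 else -shrunk) / sxx
    return ols, ridge, lasso


# Toy data, for illustration only.
ols, ridge, lasso = one_feature_fits([0, 1, 2, 3, 4], [1, 3, 5, 7, 9], lam=10.0)
```

Ridge shrinks the coefficient but never sets it exactly to zero, whereas a large enough lasso penalty removes the predictor altogether.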
A. Data Extraction
The data is extracted from the UCI Machine Learning Repository. It consists of meteorological data, FWI system data, and the
amount of area burned during fires over the period 2000-2003 in Montesinho park in Portugal. The factors that mainly affect
forest fires are the climatic conditions of the forest. The data set gives a clear description of these conditions: relative humidity,
temperature, wind speed, and rainfall in the forest. The data is collected from local sensors that are available in Portugal.
Portugal has around 162 weather stations, so such data is easy to obtain. The FWI system is widely used as a fire danger rating
system. The data also contains the day, the month, and the X and Y coordinates of the location where the fire occurred. From the
day and month, we can separate the fires into weekday and weekend fires. The FWI data comprises the moisture code, fire index,
drought code, and spread index, which depend mainly on the weather conditions. These values, calculated by the FWI system,
are a direct indicator of fire intensity. The relative humidity varies over the day: it is high in the morning and keeps falling
toward its minimum value as the hours pass. Wind speed is a major factor, since it can make the fire spread rapidly. From the
data, we can say that when the wind speed is around 15/hr the chance of fire is high. One of the most important features in the
dataset is the temperature of the forest, which can cause fire.
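A minimal sketch of reading such records and deriving the weekday/weekend split is given below. The column names are assumed to match the UCI Forest Fires CSV (X, Y, month, day, FFMC, DMC, DC, ISI, temp, RH, wind, rain, area). The two data rows are made-up placeholders rather than real records, and `load_fires` is a hypothetical helper name.

```python
import csv
import io

# Header assumed to follow the UCI file; both rows are placeholder values.
SAMPLE_CSV = """X,Y,month,day,FFMC,DMC,DC,ISI,temp,RH,wind,rain,area
4,5,aug,sat,90.0,100.0,600.0,8.0,22.0,40,4.0,0.0,1.5
6,3,mar,tue,85.0,30.0,90.0,5.0,10.0,55,6.0,0.0,0.0
"""

NUMERIC = ("FFMC", "DMC", "DC", "ISI", "temp", "RH", "wind", "rain", "area")


def load_fires(text):
    """Parse CSV text into dicts with numeric fields and a weekend flag."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        row = {"X": int(rec["X"]), "Y": int(rec["Y"]),
               "month": rec["month"], "day": rec["day"]}
        for name in NUMERIC:
            row[name] = float(rec[name])
        # The day column holds three-letter names; sat and sun form the weekend.
        row["is_weekend"] = rec["day"] in ("sat", "sun")
        rows.append(row)
    return rows


fires = load_fires(SAMPLE_CSV)
print([(r["day"], r["is_weekend"]) for r in fires])
```

To use the real file, the text of the downloaded CSV would be passed to `load_fires` in place of `SAMPLE_CSV`.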
B. Clustering Data
First, the coordinates were clustered. The number of clusters was chosen using the elbow method. The K-Elbow Visualizer
implements the “elbow” method of selecting the optimal number of clusters for K-means clustering. K-means is a simple
unsupervised machine learning algorithm that groups data into a specified number (k) of clusters. Because the user must specify
k in advance, the algorithm is somewhat naive: it assigns all members to k clusters even if that is not the right k for
the dataset. The elbow method runs k-means clustering on the dataset for a range of values of k (say, from 1 to 10) and then, for
each value of k, computes an average score for all clusters.
There is a last bend somewhere near the fifth point, after which the curve becomes smoother. The optimal number of
clusters is therefore 5, and the k-means algorithm with the same configuration was applied to find the clusters.
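The elbow procedure can be sketched in plain Python. This is not the K-Elbow Visualizer itself: it is a small k-means written for illustration, scored by the within-cluster sum of squared distances, and it uses a deterministic farthest-first initialisation (an assumption made here to keep the example reproducible). The toy points form three well-separated groups and are not the park coordinates.

```python
def dist2(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def mean_point(group):
    n = len(group)
    return (sum(p[0] for p in group) / n, sum(p[1] for p in group) / n)


def farthest_first(points, k):
    """Start from the first point, then repeatedly add the point that is
    farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    return centers


def kmeans_inertia(points, k, n_iter=100):
    """Run k-means and return the within-cluster sum of squared distances."""
    centers = farthest_first(points, k)
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: dist2(p, centers[j]))
            groups[nearest].append(p)
        # Update step: move each center to the mean of its group.
        updated = [mean_point(g) if g else centers[j]
                   for j, g in enumerate(groups)]
        if updated == centers:
            break
        centers = updated
    return sum(min(dist2(p, c) for c in centers) for p in points)


# Toy data: three unit squares of points, far apart from each other.
points = [(cx + dx, cy + dy)
          for cx, cy in [(0, 0), (10, 0), (0, 10)]
          for dx, dy in [(0, 0), (1, 0), (0, 1), (1, 1)]]
inertias = [kmeans_inertia(points, k) for k in range(1, 6)]
```

Plotting `inertias` against k shows a sharp drop up to k = 3 for this toy data and only small gains afterwards; that bend is the elbow used to pick the number of clusters.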
Figure 3: Clusters
For each cluster, the burned area was then predicted using a regression method. The correlations between the predictors and the
dependent variable are very small, so stepwise regression methods were applied to choose the best predictors.
Figure 4: cluster 0
Figure 5 :cluster 1
Figure 6: cluster 2
Figure 7: cluster 3
Figure 8: cluster 4
Figure 9: Predictor
The DMC predictor was chosen, with an R-squared of 14%. The best model for this cluster is therefore area = 0.1324 * DMC
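The reported model has the form area = coefficient * predictor, with no intercept. The sketch below shows how one such single-predictor model could be fitted and how candidates could be compared, which corresponds to the first step of forward stepwise selection. It assumes a least-squares fit through the origin and an uncentered R-squared; the paper does not state how its R-squared was computed, so both are assumptions. The function names and the sample values are hypothetical.

```python
def fit_through_origin(xs, ys):
    """Least-squares slope b for y = b*x, plus the uncentered R^2."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    syy = sum(y * y for y in ys)
    b = sxy / sxx
    sse = sum((y - b * x) ** 2 for x, y in zip(xs, ys))
    return b, 1.0 - sse / syy


def best_single_predictor(columns, ys):
    """Fit each candidate alone and keep the one with the highest R^2."""
    scored = {name: fit_through_origin(xs, ys) for name, xs in columns.items()}
    best = max(scored, key=lambda name: scored[name][1])
    return best, scored[best]


# Made-up values for illustration; not taken from the Forest Fires data.
columns = {"DMC": [10.0, 20.0, 30.0, 40.0], "temp": [5.0, 25.0, 10.0, 20.0]}
area = [1.0, 2.0, 3.0, 4.0]
name, (slope, r2) = best_single_predictor(columns, area)
print(name, round(slope, 4), round(r2, 4))
```

Backward elimination works in the opposite direction: it starts from all predictors and drops the weakest one at each step.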
F. Evaluation
In the evaluation section, we select the best model in terms of accuracy from the various results of the predictive models. The
best model for this cluster is the result of the backward elimination algorithm: area = 0.9878 * temp