Computer Networks

International Journal of Innovative Technology and Exploring Engineering (IJITEE)

ISSN: 2278-3075, Volume-9 Issue-1, November 2019

Prediction of Crops based on Environmental Factors using IoT & Machine Learning Algorithms
Ashok Tatapudi, P Suresh Varma

Revised Manuscript Received on November 22, 2019.
* Correspondence Author
Ashok Tatapudi*, Assistant Professor, Department of Computer Science & Engineering, University College of Engineering, AdiKavi Nannaya University, Rajamahendravaram, India.
P Suresh Varma, Professor, Department of Computer Science & Engineering, University College of Engineering, AdiKavi Nannaya University, Rajamahendravaram, India. Email: sureshvarmap@gmail.com

Abstract: India is an agricultural country, and most of its economy depends on crop yield. Agriculture is largely dependent on rainwater and also relies on soil variables such as nitrogen, phosphorus, and potassium and on climate variables such as temperature and precipitation. Technological growth in agriculture will increase crop productivity. Remote sensing systems such as IoT systems are increasingly used in smart farming, and these systems produce large amounts of data. Machine learning is an active area of work that has been applied to forecast crops based on data trends. The proposed system integrates pH, moisture, rainfall, temperature, and humidity sensors, analyzes the information from these sensors, and applies machine learning algorithms: Linear Regression, Decision Trees, Random Forest, and GDBoost. The most suitable crops are predicted according to the current environment. This work gives farmers a better prediction of which crops to plant in their fields based on the criteria listed above, in order to improve the productivity of smart farming.

Keywords: Agriculture, IoT, Machine Learning, Data Analytics, Prediction

In this paper, the proposed system is a crop forecasting program that aims to increase production using two key technologies: the Internet of Things and machine learning. Sensor technology has matured, and humidity, temperature, soil moisture, and pH sensors are used to measure the factors relevant to crop prediction. Machine learning predicts the crop based on the sensor data. These technologies help the farmer achieve a better production rate in agriculture.

1.1 AGRICULTURE
Improving the production and value of crops while reducing operating costs and environmental degradation is a key objective of agriculture. Potential growth and yield depend on many attributes such as the environment, soil properties, and the management of irrigation and fertilizer. Agriculture is the backbone of every economy. In a nation such as India, where food demand keeps increasing with population growth, developments in the agricultural sector are needed to meet that demand. Since ancient times, farming has been regarded as India's primary and foremost practice. Ancient people grew crops on their own land, suited to their needs. Natural crops were therefore grown and used by many creatures, such as humans, animals, and birds. Farming has been steadily declining with the development of new technologies and techniques. Because of this, many people have focused on developing artificial, hybrid products, which contribute to an unhealthy life. Nowadays, people lack knowledge about cultivating crops at the right time and place. Such cultivation practices also alter seasonal climatic conditions and affect fundamental resources such as land, water, and air, which contributes to food insecurity. Considering all of these problems, together with factors such as climate and temperature, there is still no proper solution or technology to address the situation farmers face. In India, there are several ways to boost agricultural economic growth, and several directions in which crop production and crop value can be enhanced and strengthened.

The Internet of Things is an interconnection of electronic systems, physical or virtual machines, and objects with unique identifiers and the ability to transfer information over a network without human-to-human or human-to-computer interaction. In the field of agriculture, a few researchers have suggested IoT-based systems with machine learning to forecast crop type. Such a system monitors how the crop grows. The proposed system helps to make accurate judgments: data obtained from the sensors is processed on the server and evaluated using machine learning algorithms. The data sensed for parameters such as humidity, temperature, precipitation, and pH is stored by the IoT system and then used to predict the crop varieties, which have a direct impact on crop growth. Once a predictive decision is made, it is forwarded to the end user for further action.

1.3 MACHINE LEARNING
Machine learning is a tool that is commonly used to tackle agricultural problems. It is used to analyze large data sets and to establish useful classifications and patterns in them. The overall goal of a machine learning system is to extract information from a data set and transform it into an understandable framework for further use. This paper's main aim is to predict the crop based on the properties of soil and weather.


The world's population is growing and is expected to keep increasing by billions in the coming years, so crop production must improve to feed those billions of people. While the population is increasing, agricultural land is declining for various reasons: industries, retail markets, and residential buildings are being built on agricultural land. To feed these billions we need to increase production, and this can be achieved by introducing appropriate technology in agriculture. Smart agriculture is among the most important needs of everyday life.

Section 2 outlines related work, Section 3 describes the proposed system and its structure, Section 4 presents the findings, and Section 5 concludes the paper.

II. RELATED WORK

A statistical approach, multiple linear regression, and a data mining approach, a size-dependent clustering strategy, were used to predict crop yields [1]. A Kalman filter (KF) is used with predictive analysis in the proposed technique to acquire quality data without noise and to transmit this data in cluster-based WSNs. A decision tree with predictive analytics is used for crop yield prediction, seed identification, soil classification, weather prediction, and crop disease prediction to support decision making. This platform combines IoT modules such as &Cube (IoT gateway) and Mobius (IoT service platform) to provide users with a smart crop growth monitoring solution [2]. In [3], a machine learning algorithm is developed using logistic regression to process raw data and forecast outcomes. It produces a result but is less reliable than other algorithms [3]. The use of spatial data mining in agriculture is explained by the authors of [4]. They used the K-means algorithm together with an incremental optimization approach to study spatial associations. Temperature and precipitation are provided as the initial spatial information and analyzed in order to increase crop yield and reduce crop losses. In [5], the authors consider the problem of predicting the average yield of a type of crop (e.g., soybean) for a region of interest based on a sequence of remotely sensed images taken before the harvest, and apply convolutional neural networks to these data for the prediction. The review in [6] covers research advances over the past 15 years on machine learning techniques for accurate crop yield prediction and nitrogen status estimation, and concludes that rapid advances in sensing technology and ML techniques will provide cost-effective and comprehensive solutions for improved crop and environmental state estimation and decision making. A Remote Monitoring System (RMS) that combines internet and wireless communications is introduced in [7]. Its main goal is to capture real-time information on the agricultural production environment and provide easy access to agricultural facilities, such as alerts through Short Messaging Service (SMS) and advice on weather patterns, crops, etc.

The proposed research work focuses on the use of effective IoT devices and machine learning for prediction. In the system design, we have included the flow of communication between the different system components and the input and output of the different modules present in the system. Sensed information is compared with a data set built from past experience, and a result is generated. The system architecture is shown in Fig. 1. The main components of the system are:

• IoT devices
• Machine learning algorithm for prediction

Based on the result obtained from the analysis, the farmer decides on the best crop for that particular soil in order to increase the production rate of the crop.

3.1 IOT DEVICES
In IoT-based smart agriculture, sensors (light, humidity, temperature, soil moisture, etc.) are used to monitor the crop field and optimize the irrigation system. Farmers are able to monitor field conditions from anywhere. This work aims to provide a device that tracks temperature and humidity values, soil and water pH, and fertilizer control, with a soil moisture sensor for detecting soil moisture levels. All these devices are used together to produce the information needed to track the crop. The IoT system is interfaced with the following sensors:

Fig. 1. IoT Architecture

3.1.1 FC-28 SOIL MOISTURE SENSOR
The soil moisture sensor is simple to use. The two wide exposed pads serve as probes for the sensor and together act as a variable resistor. The more moisture in the soil, the better the conductivity between the pads, resulting in lower resistance and a higher SIG output. [8]

Table- I: FC-28 Soil Moisture Sensor
Input Voltage   Output Voltage   Input Current   Output Signal
3.3-5V          0-4.2V           35mA            Analog & Digital

Fig. 2. FC-28 Soil Moisture Sensor
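As a rough illustration of how the SIG output of the FC-28 could be turned into a moisture value, the sketch below maps the voltage linearly onto a 0-100 % scale. The 0-4.2 V span is taken from Table I; the linear mapping, the 10-bit ADC with a 5 V reference, and the function names are assumptions made for illustration and are not part of the proposed system. A real probe would need to be calibrated in dry and saturated soil.

```python
def adc_to_volts(adc_value: int, v_ref: float = 5.0, adc_max: int = 1023) -> float:
    """Convert a raw ADC count to volts (assumed 10-bit ADC, 5 V reference)."""
    if not 0 <= adc_value <= adc_max:
        raise ValueError("adc_value outside the ADC range")
    return v_ref * adc_value / adc_max


def moisture_percent(sig_volts: float, v_dry: float = 0.0, v_wet: float = 4.2) -> float:
    """Map the SIG voltage linearly onto 0-100 %, clamping out-of-range values.

    v_dry and v_wet are hypothetical calibration points; the defaults are
    simply the 0-4.2 V output span listed in Table I.
    """
    if v_wet <= v_dry:
        raise ValueError("v_wet must be greater than v_dry")
    fraction = (sig_volts - v_dry) / (v_wet - v_dry)
    return 100.0 * min(1.0, max(0.0, fraction))


if __name__ == "__main__":
    for count in (0, 430, 860, 1023):
        volts = adc_to_volts(count)
        print(f"ADC {count:4d} -> {volts:.2f} V -> {moisture_percent(volts):5.1f} % moisture")
```

Clamping keeps readings above the nominal output span from producing values over 100 %.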



The DHT11 is a combined temperature and humidity sensor with a calibrated digital signal output. Its proprietary digital signal processing technique and its temperature and humidity sensing technology ensure high reliability and excellent long-term stability. The device incorporates a resistive humidity measurement component and an NTC temperature measurement component and connects to a high-performance 8-bit microcontroller, offering excellent quality, fast response, anti-interference capability, and cost-effectiveness. [9]

Table- II: DHT11 Sensor
Item    Measurement Range     Humidity Accuracy   Temperature Accuracy   Resolution   Package
DHT11   20-90 % RH, 0-50 ℃    -                   ±2℃                    1            4 Pin Single Row

Fig. 3. DHT11 Sensor

The rain sensor module is a simple tool for detecting rain. It can be used as a switch when raindrops fall on the rain board, and it can also be used to measure rainfall intensity. [10]

Table- III: KG004 Rain Drop Sensor
Item    Measurement Range   Output Voltage   Output Signal
KG004   −0.3 to +36 V       +36 V            Digital and

Fig. 4. KG004 Rain Drop Sensor

pH is a measure of a solution's acidity or alkalinity; the pH scale ranges from 0 to 14. The pH indicates the concentration of hydrogen ions [H]+ in a solution. It can be measured accurately by a sensor that detects the potential difference between two electrodes: a reference electrode (silver/silver chloride) and a glass electrode that is sensitive to hydrogen ions. This is what forms the probe. An electronic circuit is also needed to condition the signal properly so that the sensor can be used with a microcontroller such as an Arduino. [11]

Table- IV: REES52 pH Sensor
Supply Voltage   Current   Consumption   Working Temperature
5V               5-10mA    ≤ 0.5 W       10-50 ºC

Fig. 5. REES52 pH Sensor

Machine learning is widely applied to problems in agriculture. It is used to study large data sets and to identify valuable classifications and patterns in them. The overall goal of the machine learning process is to extract information from a data set and transform it into an understandable structure for further use.
Based on available data, this paper analyzes crop yield. Machine learning methods were used to forecast crop yields in order to maximize crop profitability. Figure 6 illustrates the flow for estimating the expected crop yield.

Fig. 6. System Architecture
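To make the step from sensing to prediction more concrete, the sketch below shows one way the readings from the sensors described in this section could be packed into a feature vector for a classifier. The class and field names are hypothetical; the validity limits for temperature and humidity are the DHT11 ranges from Table II and the pH limits are the 0-14 scale mentioned above. It is only an illustration, not the data format used by the proposed system.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SensorReading:
    """One sample from the field sensors (field names are hypothetical)."""
    temperature_c: float      # DHT11 temperature
    humidity_pct: float       # DHT11 relative humidity
    soil_moisture_pct: float  # derived from the soil moisture sensor
    ph: float                 # pH probe
    raining: bool             # rain sensor used as a switch

    def validate(self) -> None:
        """Reject values outside the ranges quoted for the sensors."""
        limits = {
            "temperature_c": (0.0, 50.0),      # DHT11 range, Table II
            "humidity_pct": (20.0, 90.0),      # DHT11 range, Table II
            "soil_moisture_pct": (0.0, 100.0),
            "ph": (0.0, 14.0),                 # pH scale
        }
        for name, (low, high) in limits.items():
            value = getattr(self, name)
            if not low <= value <= high:
                raise ValueError(f"{name}={value} outside [{low}, {high}]")

    def to_features(self) -> List[float]:
        """Return the numeric feature vector a classifier would consume."""
        self.validate()
        return [self.temperature_c, self.humidity_pct,
                self.soil_moisture_pct, self.ph, float(self.raining)]


if __name__ == "__main__":
    reading = SensorReading(27.4, 60.0, 45.0, 6.5, False)
    print(reading.to_features())
```

Validating before building the vector keeps obviously faulty readings, such as a pH above 14, away from the model.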


As shown in the previous figure, sensors are installed on the farm to detect data related to humidity, temperature, precipitation, and pH. Logistic Regression, Decision Tree, Random Forest, and GDBoost algorithms are used to classify the sensed data. The forecast result shows which soil may be ideal for different crops and the state of the soils.

4.1. OVERVIEW OF DATA

Data was gathered from different sources to create the datasets used in this process, and these datasets are used for analysis. Internet sources such as Data.gov.in and indiastat.org were among those used to produce the data. The table below contains sample data values; we cover almost all the crops that are useful to the farmer. Duration is the crop duration in months, and the Min and Max attributes give the temperature range required for the crop. The N, P, and K values are the fertilizer requirements for the specific crop. The pH min and max values are used for soil management. Rainfall is the range for that area.

Table V. Overview of Sample Data Points

Table VI. State Wise Sample Data Points
State       N    P    K
A&N         VL   VL   L
AP          L    VH   M
Karnataka   H    M    M
Assam       M    L    VL
Bihar       VL   VH   H
Goa         M    VL   M
Gujarat     VL   VH   H

ppm (parts per million)
Level   Nitrogen   Phosphorus   Potassium
VL      < 10       < 5          < 100
L       10-20      5-10         100-150
M       20-30      10-20        150-250
H       30-40      20-30        250-300
VH      40+        30+          300+

Table VI. Sample Temperature Data Points
Station Name          Month     Max Temp   Min Temp   Mean
Bangalore Karnataka   January   27.4       15         4.9
Bangalore Karnataka   Feb       30.1       16.6       7.9
Bangalore Karnataka   March     32.8       19         10
Bangalore Karnataka   April     33.9       21.3       43.9
Bangalore Karnataka   May       33.1       21.1       111.9
Bangalore Karnataka   June      29.4       19.8       79.7
Bangalore Karnataka   July      27.7       19.3       109.7

4.2. MACHINE LEARNING MODELS

Machine learning is the process of discovering previously unknown and potentially useful patterns in large datasets. The extracted information is used to build statistical and classification models. Datasets obtained from IoT devices and multiple sources tend to be considerably more complex than the datasets traditionally used in machine learning. Machine learning is broadly categorized as analytical or predictive, and predictive methods are the ones mostly used in agriculture. Classification and clustering are the two primary methods. Some of the methods used to extract answers from the collected data are described below.

The proposed technique has two phases: a training phase and a testing phase. During the training phase, the data is collected and pre-processed, and the pre-processed data is used to train the model. In the testing phase, the yield value is predicted based on the rules learned. Work begins with pre-processing, in which some records are removed from the data set: some of the fields were unsuitable for crop production, so their data is deleted. The models used in the training and testing phases are described below:

A. Logistic Regression
Logistic regression is a method used to relate a dependent variable to one or more independent variables. The independent variables are also called predictors. Here the crop type (c) is the dependent variable, and temperature, humidity variation, soil moisture, and pH are the independent variables. The formula that has been established is:

Y = B_0 + B_1 X_1 + B_2 X_2 + B_3 X_3    (1)

B. Decision Trees
The decision tree is one of the classification algorithms that can be used in machine learning. Decision tree learning is a model of inductive learning: a model is built from data or observations according to some criteria, with the aim of discovering a general rule from the observed instances. Decision trees can perform two different tasks depending on whether the target attribute is discrete or continuous. In the first case the result is a classification tree, and in the second case a regression tree. [12] [13]

C. Random Forest
Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error of a forest converges almost surely to a limit as the number of trees in the forest becomes large.


The generalization error of a forest of tree classifiers depends on the strength of the individual trees in the forest and the correlation between them. Using a random selection of features to split each node produces error rates that are more robust to noise than those of AdaBoost. Internal estimates monitor error, strength, and correlation, and these are used to show the response to increasing the number of features used in the splitting. Internal estimates are also used to measure variable importance. These ideas apply to regression as well. [14]
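The two ingredients described above, bootstrap-sampled trees and a random selection of features at each split, can be illustrated with a small pure-Python sketch. This is not the implementation used in this work: the Gini impurity criterion, the function names, and the toy rows of [temperature, humidity, pH] with crop labels are assumptions chosen only to keep the example self-contained. The default depth limit of 14 mirrors the threshold reported in the results.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values()) if n else 0.0


def best_split(rows, labels, feature_ids):
    """Return the (feature, threshold) that lowers weighted Gini most, or None."""
    best, best_score, n = None, gini(labels), len(rows)
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best


def build_tree(rows, labels, max_depth, n_features, rng):
    """Grow one tree; a leaf is a label, an inner node is (feature, threshold, left, right)."""
    majority = Counter(labels).most_common(1)[0][0]
    if max_depth == 0 or len(set(labels)) == 1:
        return majority
    k = min(n_features, len(rows[0]))
    # Random feature subset considered at this node.
    split = best_split(rows, labels, rng.sample(range(len(rows[0])), k))
    if split is None:
        return majority
    f, t = split
    left = [i for i, r in enumerate(rows) if r[f] <= t]
    right = [i for i, r in enumerate(rows) if r[f] > t]

    def grow(idx):
        return build_tree([rows[i] for i in idx], [labels[i] for i in idx],
                          max_depth - 1, n_features, rng)

    return (f, t, grow(left), grow(right))


def predict_tree(node, row):
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node


def fit_forest(rows, labels, n_trees=25, max_depth=14, n_features=2, seed=0):
    """Train each tree on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx],
                                 max_depth, n_features, rng))
    return forest


def predict_forest(forest, row):
    """Majority vote over all trees."""
    return Counter(predict_tree(tree, row) for tree in forest).most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical training rows: [temperature, humidity, pH]
    rows = [[30, 80, 5.5], [31, 82, 5.6], [32, 85, 5.8], [33, 88, 6.0],
            [15, 40, 6.5], [16, 42, 6.6], [17, 45, 6.8], [18, 48, 7.0]]
    labels = ["rice"] * 4 + ["wheat"] * 4
    forest = fit_forest(rows, labels)
    print(predict_forest(forest, [29, 78, 5.7]))
```

Because every tree sees a different bootstrap sample and a different feature subset at each node, the trees are decorrelated, which is what the strength and correlation argument above relies on.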

Fig. 9. Humidity Sensor Data

Fig. 10. pH Sensor Data

D. Gradient Boosting
When we try to predict the target variable using any machine learning method, the main causes of difference between actual and predicted values are noise, variance, and bias. Ensembles help to reduce these factors. Gradient boosting trains many models in a gradual, additive, and sequential manner. Each new model is fitted using the gradients of the loss function (in y = ax + b + e, the term e needs a special mention as it is the error term). The loss function is a measure of how well the model's coefficients fit the underlying data. A logical interpretation of the loss function depends on what we are trying to optimize. [15]
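The additive, sequential training described above can be sketched in a few lines for the squared-error loss, where the negative gradient of the loss is simply the residual y minus the current prediction. The sketch below is an illustration under stated assumptions, not the GDBoost model used in this work: it uses one-feature regression stumps as the weak learners, and the function names, learning rate, and toy data (y = 2x + 1, without noise) are hypothetical.

```python
def fit_stump(xs, residuals):
    """Least-squares regression stump: returns (threshold, left_mean, right_mean)."""
    best = None
    for t in sorted(set(xs))[:-1]:  # every split that leaves both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:
        raise ValueError("need at least two distinct x values")
    return best[1:]


def fit_gradient_boosting(xs, ys, n_rounds=100, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For the loss 1/2 * (y - p)^2 the negative gradient is the residual y - p.
        residuals = [y - p for y, p in zip(ys, preds)]
        t, lm, rm = fit_stump(xs, residuals)
        stumps.append((t, lm, rm))
        preds = [p + learning_rate * (lm if x <= t else rm) for x, p in zip(xs, preds)]
    return base, learning_rate, stumps


def predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(lm if x <= t else rm for t, lm, rm in stumps)


def mean_squared_error(model, xs, ys):
    return sum((y - predict(model, x)) ** 2 for x, y in zip(xs, ys)) / len(ys)


if __name__ == "__main__":
    xs = list(range(10))
    ys = [2 * x + 1 for x in xs]
    for rounds in (0, 10, 100):
        model = fit_gradient_boosting(xs, ys, n_rounds=rounds)
        print(rounds, round(mean_squared_error(model, xs, ys), 4))
```

With zero rounds the model is just the mean of the targets; every additional stump lowers the training error a little, which is the gradual behaviour the learning rate controls.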


IoT Analytics is a fully managed service that makes it easy to run and operationalize advanced analytics on massive volumes of IoT data without having to worry about the cost and complexity usually required to build an IoT analytics system. It is a convenient way to run analytics on IoT data and gain insights for making better and more accurate decisions in IoT applications and machine learning use cases.

Fig. 7. Soil Moisture Data

Fig. 8. Temperature Sensor Data

After extracting the features from the collected information, we can predict the type of crop based on those features. To evaluate the efficiency and effectiveness of our experiment, we compare classification and ensemble algorithms on the data. The classification algorithms are Logistic Regression and Decision Tree, which achieve accuracies of 85.2% and 94.14% respectively. The ensemble algorithms are Random Forest and GDBoost. Random Forest is an ensemble method that controls bias and variance in the data; its accuracy is higher when the depth of the trees is high and lower when the depth is low. To avoid overfitting, we put a threshold on the depth of the model, with a value of 14. With that depth it achieves 96.32% accuracy with a 3.68% error rate, as shown in Fig. 11. Gradient Boost is also an ensemble method that boosts model performance; it achieves the highest accuracy, 96.69%, with an error rate of 3.61%. The accuracies are listed in Table VII.

Table VII. Accuracies of Machine Learning Models

Classifier            Error Rate (%)   Accuracy (%)
Logistic Regression   14.8             85.2
Decision Tree         5.86             94.14
Random Forest         3.68             96.32
Gradient Boost        3.61             96.69
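Accuracy and error rate, as reported in Table VII, are complementary percentages computed from predicted and true labels. A minimal sketch of that computation is given below; the crop labels and helper names are hypothetical and are not the data or code used in this work.

```python
def accuracy_pct(y_true, y_pred):
    """Percentage of predictions that match the true labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)


def error_rate_pct(y_true, y_pred):
    """Percentage of predictions that do not match the true labels."""
    return 100.0 - accuracy_pct(y_true, y_pred)


def best_model(predictions_by_model, y_true):
    """Name of the model with the highest accuracy on y_true."""
    return max(predictions_by_model,
               key=lambda name: accuracy_pct(y_true, predictions_by_model[name]))


if __name__ == "__main__":
    # Hypothetical held-out labels and predictions from two models.
    y_true = ["rice", "rice", "wheat", "maize", "wheat", "rice", "maize", "wheat"]
    preds = {
        "model_a": ["rice", "wheat", "wheat", "maize", "wheat", "rice", "rice", "wheat"],
        "model_b": ["rice", "rice", "wheat", "maize", "wheat", "rice", "rice", "wheat"],
    }
    for name, y_pred in preds.items():
        print(name, accuracy_pct(y_true, y_pred), error_rate_pct(y_true, y_pred))
    print("best:", best_model(preds, y_true))
```

Evaluating every model on the same held-out labels is what makes the accuracies comparable.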


Fig. 11. Random Forest Algorithm Depth Generation

Fig. 12. Scoring Values of Models on Predicted Value

Fig. 13. Error Rate of Models

Fig. 14. Accuracies of Machine Learning Models

The proposed system lists all potential crops that are feasible in a given area, allowing the farmer to decide which crop to grow. The system carries out a careful examination of the climate, weather, and pH information and recommends the most viable crops that can be grown under the given environmental conditions. The system also analyzes past production data, which helps the farmer gain insight into the market demand and value of different crops. Because all crop varieties are covered by the system, farmers can learn about crops that may never have been grown before. IoT connects all farming devices together through the internet. The different types of sensors deployed on the farm give real-time data on farm conditions, and the devices can be used to adjust moisture, acidity, etc. accordingly. Further the best profitable crop can also be found in light of the monetary and inflation

I wish to express my sincere appreciation to those who have contributed to this thesis and supported me in one way or the other during this amazing journey. I am extremely grateful to my thesis guide, Professor Suresh Varma, for his guidance and all the useful discussions and brainstorming sessions, especially during the difficult conceptual development stage. His deep insights helped me at various stages of my research.

1. M.K. Gayathri, Dr. G.S. Anandha Mala, J. Jayasakthi, “Providing Smart Agricultural Solutions to Farmers for better yielding using IoT”, TIAR
2. Asian Journal of Pharmaceutical and Clinical Research, Vol. 10, no. 13, Apr. 2017, pp. 148-52, doi:10.22159/ajpcr.2017.v10s1.19601.
3. Thomas Truong; Anh Dinh; Khan Wahid. An IoT environmental data collection system for fungal detection in crop fields [M]//2017 IEEE 30th Canadian Conference on Electrical and Computer Engineering
4. D, Rajesh. (2011). Application of Spatial Data mining for Agriculture. International Journal of Computer Applications. 15.
5. You, J.; Li, X.; Low, M.; Lobell, D.; Ermon, S. Deep Gaussian Process for Crop Yield Prediction Based on Remote Sensing Data. AAAI Conference on Artificial Intelligence, North America, Feb. 2017.
6. Anna Chlingaryan, Salah Sukkarieh, Brett Whelan, Machine learning approaches for crop yield prediction and nitrogen status estimation in precision agriculture: A review, Computers and Electronics in Agriculture, Volume 151, 2018, Pages 61-69, ISSN 0168-1699.
7. K. A. Patil and N. R. Kale, "A model for smart agriculture using IoT," 2016 International Conference on Global Trends in Signal Processing, Information Computing and Communication (ICGTSPICC), Jalgaon, 2016, pp. 543-545.
8. https://www.electronicscomp.com/soil-moisture-sensor-module-india.
9. https://www.mouser.com/ds/2/758/DHT11-Technical-Data-Sheet-Tra
10. https://www.rhydolabz.com/sensors-weather-sensors-c-137_147/raind
11. https://scidle.com/how-to-use-a-ph-sensor-with-arduino/
12. Georg Ruß, Rudolf Kruse, Martin Schneider, and Peter Wagner. Estimation of neural network parameters for wheat yield prediction. In Max Bramer, editor, Artificial Intelligence in Theory and Practice II, volume 276 of IFIP International Federation for Information Processing, pages 109–118. Springer, July 2008.


13. Iván Mejía-Guevara and Ángel Kuri-Morales. Evolutionary feature and parameter selection in support vector regression. In Lecture Notes in Computer Science, volume 4827, pages 399–408. Springer, Berlin, Heidelberg, 2007.
14. Breiman, L. (2001) Random Forests. Machine Learning, 45, 5-32.
15. https://towardsdatascience.com/understanding-gradient-boosting-mac

