Journal of Organizational and End User Computing
Volume 29 • Issue 4 • October-December 2017

IoT Based Agriculture as a Cloud and Big Data Service: The Beginning of Digital India

Sukhpal Singh Gill, CLOUDS Lab, School of Computing and Information Systems, University of Melbourne, Melbourne, Australia
Inderveer Chana, Computer Science and Engineering Department, Thapar University, Patiala, India
Rajkumar Buyya, CLOUDS Lab, School of Computing and Information Systems, University of Melbourne, Melbourne, Australia

ABSTRACT
Cloud computing has emerged as a new model for managing and delivering applications as services efficiently. Convergence of cloud computing with technologies such as wireless sensor networking, Internet of Things (IoT) and Big Data analytics offers new applications of cloud services. This paper proposes a cloud-based autonomic information system for delivering Agriculture-as-a-Service (AaaS) through the use of cloud and big data technologies. The proposed system gathers information from various users through preconfigured devices and IoT sensors, processes it in the cloud using big data analytics, and provides the required information to users automatically. The performance of the proposed system has been evaluated in a cloud environment, and experimental results show that the proposed system offers better service and that the Quality of Service (QoS) is also better in terms of QoS parameters.

KEYWORDS
Agriculture as a Service, Autonomic Management, Big Data, Cloud Computing, Internet of Things

1. INTRODUCTION
The emergence of ICT (Information and Communication Technologies) plays an important role in the agriculture sector by providing services through computer-based agriculture systems (Singh and Chana, 2015). However, these agriculture systems are not able to fulfill the needs of today’s generation because of the large amount of data to be processed and the lack of important requirements like processing speed, data storage space, reliability, availability, scalability etc. 
and even the resources used in computer-based agriculture systems are not utilized efficiently. Agriculture-as-a-Service (AaaS) applications exhibit Big Data characteristics. For example, the volume of the agriculture datasets captured by environments such as Open Government Data Platform India (data.gov.in, 2015), India Agriculture and Climate Data Set (Sanghi et al.), and regional land and climate modelling in China (Shangguan et al., 2012) can be in the order of 1,000,000 records with a size of 3.5 GB. The data arrives in large variety and volume both from users, in the form of images such as crops damaged by weather or insects, and from devices, through Internet of Things (IoT) sensors and satellites (GPS systems) that send weather-related images. As a result of regular capturing and collection, the datasets grow with a velocity of 80.72 KB/minute or more (data.gov.in, 2015).

DOI: 10.4018/JOEUC.2017100101
Copyright © 2017, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited.

To solve the problem of existing agriculture systems, there is a need to develop a cloud-based service that can easily manage different types of agriculture-related data based on different domains (crop, weather, soil, pest, fertilizer, productivity, irrigation, cattle, and equipment) through these steps: i) gather data from various sensors through preconfigured devices, ii) classify the gathered data (heterogeneous, high-volume big data) into various classes through analysis, iii) store the classified information in a cloud repository for future use, and iv) diagnose the agriculture status automatically. As a large number of users are using agriculture systems operating on large datasets simultaneously, there is a need for a highly scalable and elastic distributed computing environment such as cloud computing. 
In addition, cloud-based autonomic information system should be able to identify the QoS (Quality of Service) requirements of user request and resources should be allocated efficiently to execute the user request based on these requirements. The main aim of this paper is to design architecture of Agriculture-as-a-Service (AaaS) that manages various types of agriculture-related data based on different domains. This is realized through the following objectives: i) propose an autonomic resource management technique which is used to a) gather the information from various users through preconfigured devices, IoT sensors, GPS (Global Positioning System), etc. b) extract the attributes, c) analyze the information by creating various classes based on the information received, d) store the classified information in cloud repository for future use and e) diagnose the agriculture status automatically and ii) perform resource allocation automatically at infrastructure level after identification of QoS requirements of user request. The rest of the paper is organized as follows. Section 2 presents related work of existing agricultures systems. Proposed architecture is presented in Section 3. Section 4 presents Autonomic Resource Management. Sections 5 describe the experimental setup and present the results of evaluation. Section 6 presents conclusions and future scope. 2. RELATED wORK Existing research reported that few agriculture systems have been developed with limited functionality. Related work of existing agriculture systems has been presented in this section. 2.1. Existing Agriculture Systems Ranya et al. (2013) presented ALSE (Agriculture Land Suitability Evaluator) to study various types of land to find the appropriate land for different types of crops by analyzing geo-environmental factors. ALSE used GIS (Global Information System) capabilities to evaluate land using local environment conditions through digital map and based on this information decisions can be made. 
Raimo et al. (2010) proposed FMIS (Farm Management Information System), used to find the precision agriculture requirements for information systems through a web-based approach. The authors identified that the management of GIS data is a key requirement of precision agriculture. Sorensen et al. (2010) studied the FMIS to analyze the dynamic needs of farmers in order to improve decision processes and their corresponding functionalities. Further, they reported that identifying the process used for the initial analysis of user needs is mandatory for the actual design of an FMIS. Zhao (2002) presented an analysis of web-based agricultural information systems and identified various challenges and issues still pending in these systems. Due to the lack of automation in existing agriculture systems, these systems take longer and have difficulty handling the dynamic needs of users, which leads to customer dissatisfaction. Sorensen et al. (2011) identified various functional requirements of FMIS and presented an information model based on these requirements to refine decision processes. They found that the complexity of FMIS increases as functional requirements increase and that an autonomic system is needed to reduce this complexity. Yuegao et al. (2004) proposed WASS (Web-based Agricultural Support System) and identified the functionalities (information, collaborative work and decision support) and characteristics of WASS. Based on these characteristics, the authors divided WASS into three subsystems: production, research-education and management. Reddy et al. (1995) proposed a GIS-based DSS (Decision Support System) framework in which a Spatial DSS has been designed for watershed management and for management of crop productivity at regional and farm level. GIS is used to gather and analyze graphical images in order to make new rules and decisions for effective management of data. Shitala et al. 
(2013) presented mobile computing based framework for agriculturists called AgroMobile for cultivation and marketing and analysis of crop images. Further, AgroMobile is used to detect the disease through image processing and also discussed how dynamic needs of user affects the performance of system. Seokkyun et al. (2013) proposed cloud based Disease Forecasting and Livestock Monitoring System (DFLMS) in which sensor networks has been used to gather information and manages virtually. DFLMS provides an effective interface for user but due to temporary storage mechanism used, it is unable to store and retrieve data in databases for future use. The proposed QoS-aware Cloud Based Autonomic Information System (AaaS) has been compared with existing agriculture systems as described in Table 1. All the above research works have focused on different domains of agriculture with different QoS parameters. None of the existing agriculture systems considers self-management of resources. Due to lack of automation of resource management, services become inefficient which further leads to customer dissatisfaction. The proposed system is a novel QoS-aware cloud based autonomic information system and considers various domains of agriculture and, allocates and manages the resources automatically which is not considered in other existing agriculture systems. 3. AGRICULTURE-AS-A-SERVICE ARCHITECTURE The existing agriculture systems are not able to fulfill the needs of today’s generation due to lacking in important requirements like processing speed, data storage space, reliability, availability, scalability etc. Even resources used in computer based agriculture systems are not utilized efficiently. To solve the problem of existing agriculture systems, there is a need to develop a cloud-based autonomic information system that delivers Agriculture-as-a-Service. 
This section presents the architecture of a cloud-based autonomic information system for agriculture service called AaaS that manages various types of agriculture-related data based on different domains. The architecture of AaaS is shown in Figure 1. QoS parameters (execution time and cost) must be identified before the allocation of resources. AaaS is the key mechanism that ensures that the resource manager can serve a large number of requests without violating SLA terms and dynamically manages the resources based on the QoS requirements identified by the QoS manager. The services of AaaS have been divided into three types: SaaS (Software as a Service), PaaS (Platform as a Service) and IaaS (Infrastructure as a Service). In SaaS, a user interface is designed through which users can interact with the system. Aneka is a .NET-based application development PaaS, which is used as a scalable cloud middleware for interaction between the cloud subsystem and the user subsystem. In IaaS, an autonomic resource manager manages the resources automatically based on the identified QoS requirements of a particular request. The architecture of AaaS comprises two subsystems: i) user and ii) cloud.

Table 1. Comparisons of existing agriculture systems with proposed system (AaaS)
Agriculture System | Mechanism | QoS-aware (Parameter) | Domains | Data Classification | Resource Management | Big Data
ALSE (Elsheikh et al., 2013) | Non-Autonomic | Yes (Suitability) | Soil | Yes | No | No
FMIS (Nikkila et al., 2010) | Non-Autonomic | No | Pest and Crop | No | No | No
WASS (Hu et al., 2004) | Non-Autonomic | No | Productivity | No | No | No
AgroMobile (Prasad et al., 2013) | Non-Autonomic | Yes (Data accuracy) | Crop | Yes | No | No
DFLMS (Jeong et al., 2013) | Non-Autonomic | No | Crop | No | Yes | No
Proposed System (AaaS) | Autonomic | Yes (Cost, Time, Resource Utilization, Latency, Throughput and Attack Detection Rate) | Crop, Weather, Soil, Pest, Fertilizer and Irrigation | Yes | Yes | Yes

3.1. 
User Subsystem This subsystem provides a user interface, in which different type of users interact with AaaS to provide and get useful information about agriculture based on different domains. Nine types of information of different domains in agriculture has been considered: crop, weather, soil, pest, fertilizer, productivity, irrigation, cattle, and equipment. Users are basically classified in three categories: i) agriculture expert, ii) agriculture officer, and iii) farmer. The agriculture expert shares professional knowledge by answering farmer queries and updates the AaaS database based on the latest research done in the field of agriculture with respect to their domain. Agriculture officers are the government officials that provide the latest information about new agriculture policies, schemes, and rules passed by the government. Farmer is an important entity of AaaS who can take maximum advantage by asking his queries and getting automatic reply after analysis. Users can monitor any data related to their domain and get their response without visiting the agriculture help center. It integrates the different domains of agriculture with AaaS. The queries received from user(s) are forwarded to cloud repository for updates and response sends back to particular user on their preconfigured devices (tablets, mobile phones, laptops etc.) via internet. 3.2. Cloud Subsystem This subsystem contains the platform in which agriculture service is hosted on a cloud. Details about users and agriculture information are stored in a cloud repository in different classes for different domains with unique identification number. The information is monitored, analyzed, and processed continuously by AaaS. The analysis process consists of various sub processes: selection, data preprocessing, transformation, classification and interpretation as shown in Figure 1. Different classes for every domain and sub classes for further categorization of information have been designed. 
In the storage repository, user data is categorized based on different predefined classes of every domain. This information is further forwarded to agriculture experts and agriculture officers for final validation through preconfigured devices. Further, since a number of users can use the cloud-based agriculture service, the QoS manager and the autonomic resource manager have been integrated in the cloud subsystem. The QoS manager identifies the QoS requirements based on the number and type of user queries, as discussed in previous research work (Jeong et al., 2013; Singh and Chana, 2015; Singh et al., 2015). Based on the QoS requirements, the autonomic resource manager identifies resource requirements automatically and allocates and executes the resources at infrastructure level. The performance monitor is used to verify the performance of the system and also to maintain it automatically. If the system is not able to handle a request automatically, it generates an alert.

Figure 1. Agriculture-as-a-Service architecture
Figure 2. Functional aspects of AaaS

3.2.1. Cloud-Based Agriculture Service
The cloud-based agriculture service provides a user platform through which users can access the agriculture service, as shown in Figure 2. First, the agriculture service allows the user to create a profile for interaction with AaaS. After profile creation, the user is required to provide personal details along with the details of the information domain. AaaS analyses the information by performing various checks to verify whether the data is complete enough for further processing. The data is then processed, redundancy is removed, and the data is used to select the domain to which it belongs. The information is classified in order, with a unique identification number. This information is forwarded to agriculture experts and agriculture officers for final validation through preconfigured devices. 
After successful validation of the information, it is stored in the AaaS database. If a user wants to know the response to their query, the system automatically diagnoses the user query and sends the response back to that user.

3.2.2. Detailed Methodology
AaaS allows users to upload data related to different domains of agriculture through preconfigured devices and classifies it based on the domains specified in the database. The subtasks of information gathering and provision in AaaS are: i) selection, ii) preprocessing, iii) transformation, iv) classification and v) interpretation. In selection, target datasets are created based on the relevant information that will be considered for analysis in the next sub process. In preprocessing, different users have different information regarding agriculture. To develop a final training set, preprocessing steps are needed because the data might contain missing samples or noise components. In AaaS, data preprocessing contains four different sub processes: i) data cleaning, ii) data integration, iii) data conversion and iv) data reduction. Data transformation provides an interface between the data analysis sub process (classification) and data preprocessing. After data preprocessing, this process converts the labeled data into a format suitable for classification. In classification, AaaS classifies the agriculture information of different users of different domains based on the extracted data. The K-NN (k-Nearest Neighbor) classification mechanism has been used in this research work to identify the different class labels of users. K-NN is a supervised machine learning technique that classifies unknown data using a training data set. K-NN is used to identify the productivity level through the Training Instance Dataset (TID). Figure 3 describes the K-NN Algorithm. 
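As a rough illustration of the K-NN classification step, a minimal Python sketch follows. It is not the authors' implementation: the Hamming distance over categorical attributes, the majority vote, the function names `hamming_distance` and `knn_classify`, and every training value in `TID` are assumptions made for the example. Only the six attribute names and the A to E productivity labels come from the text.

```python
from collections import Counter

# Attribute order follows the six attributes named in the text:
# (crop name, temperature, soil texture, season, pesticide, fertilizer).
# All training values below are hypothetical, not taken from the paper's TID.
TID = [
    (("Soybean", "21-27 C", "Silty Loam Clay", "Winter", "Organochlorine", "Urea"), "C"),
    (("Soybean", "21-27 C", "Silty Loam Clay", "Winter", "Organochlorine", "Potash"), "C"),
    (("Soybean", "21-27 C", "Loam Clay", "Winter", "Carbamate", "Urea"), "B"),
    (("Rice", "15-18 C", "Silty Clay", "Summer", "Carbamate", "Potash"), "A"),
    (("Maize", "35-40 C", "Silty", "Rainy", "Pyrethroid", "Compost"), "E"),
]


def hamming_distance(a, b):
    """Count the attributes on which two instances differ (assumed metric)."""
    return sum(x != y for x, y in zip(a, b))


def knn_classify(query, training, k=3):
    """Return the majority productivity label among the k nearest instances."""
    # Distance is computed from the query to every training instance.
    ranked = sorted(training, key=lambda item: hamming_distance(query, item[0]))
    # The k instances with minimum distance vote on the output class label.
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


query = ("Soybean", "21-27 C", "Silty Loam Clay", "Winter", "Organochlorine", "Urea")
level = knn_classify(query, TID, k=3)
print("Productivity level:", level)
```

With numeric attributes, a Euclidean distance would be the more usual choice; the categorical encoding here is only one possible reading of the attribute set.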
In the K-NN algorithm, the distance is computed from one specific instance to every training instance in order to classify that unknown instance. Both the k nearest neighbors and the k minimum distances are determined, and the output class label is identified among the k classes. During the training phase, the K-NN algorithm utilizes training data. Figure 4 illustrates the classification process used in this research work. The K-NN model is used to identify the productivity level through the Training Instance Dataset (TID). Five levels of productivity (A - E) have been fixed, as shown in Table 2. Level ‘A’ indicates that the productivity is very high, while level ‘E’ indicates that the productivity is very low. Based on the given information, TID identifies the class to which the given data belongs. Test data is the input of this model; it is compared with TID, and the class in which the data lies is identified using the following rule:

Rule: If {Crop Name ˄ Temperature ˄ Soil Texture ˄ Season ˄ Pesticide ˄ Fertilizer} then Productivity

Figure 3. Pseudo code of K-NN algorithm
Figure 4. Classification process

Table 2. Productivity Levels
Productivity Level | Description
A | Very High Productivity
B | High Productivity
C | Neutral Productivity
D | Low Productivity
E | Very Low Productivity

The final step is to interpret the agriculture data submitted by different users of different domains, which helps the user to understand the classified datasets. AaaS is able to diagnose the agriculture status based on the information entered by the user and sends the diagnosed agriculture status to that particular user automatically. Six attributes have been considered: Crop Name, Temperature, Soil Texture, Season, Pesticide and Fertilizer, with one output: Productivity. Based on these six attributes, AaaS designs rules. The values of the six variables are considered as TID. For example, refer to Table 3. 
AaaS uses the rule shown in Table 3 to find the productivity level using TID (see Table 4). Similarly, users can ask any type of query related to the different domains, and AaaS executes the user query and sends the response back to that particular user automatically, based on the rules defined in the AaaS database. Through AaaS, users can easily diagnose the agriculture status automatically.

Table 3. User wants to retrieve the productivity level using AaaS
User Query
Crop Name | Temperature | Soil Texture | Season | Pesticide | Fertilizer | Productivity
Soybean | 21-27 °C | Silty Loam Clay | Winter | Organochlorine | Urea | ?

Table 4. AaaS response used to find the productivity level using TID
AaaS Response
Crop Name | Temperature | Soil Texture | Season | Pesticide | Fertilizer | Productivity
Soybean | 21-27 °C | Silty Loam Clay | Winter | Organochlorine | Urea | C

3.2.3. Infrastructure Management (IaaS)
Efficient management of infrastructure in the cloud is mandatory to maintain the performance of Agri-Info. It comprises two sub units: QoS Manager and Resource Manager.

3.2.3.1. QoS Manager
The user submits a request to Agri-Info to retrieve some specific agriculture-related information. Agri-Info identifies the QoS parameters required to process the user request by analyzing the request. Based on the key QoS requirements of a particular user request, the QoS Manager puts the user request into the critical or the non-critical queue through QoS assessment. For QoS assessment, the QoS Manager calculates the execution time of the user request and finds the approximate completion time of the request. If the completion time is less than the desired deadline, the request executes immediately with the available resources, and the resource(s) are then released back to the resource manager for another execution. Otherwise, the QoS Manager calculates the number of extra resources required and provides them from the reserved stock for the current execution.

3.2.3.2. 
Resource Manager
Further, two resource scheduling policies (Singh and Chana, 2015) are used to schedule the resources for execution of user queries: the time based and the cost based scheduling policy. The time based scheduling policy works as follows. First, the allocation agent computes the Deadline Time of the user request within the given budget. Resources are allocated based on time: the user request with the shortest Deadline Time executes first. If two requests have the same deadline time, the request with the smaller execution time executes first. The allocation agent then schedules the requests with the smallest execution time on the resources that provide high QoS. The rules for the time based scheduling policy are described in Table 5 along with their conditions. The cost based scheduling policy works as follows. First, the allocation agent computes the cost of each request and then sorts the requests, as priority is given to the request with the maximum budget. If two requests have the same budget, the request with the smaller execution time executes first. The allocation agent then schedules the requests with a high budget on the resources that provide high QoS. Finally, all other requests are scheduled on the available resource set. The rules for the cost based scheduling policy are described in Table 6 along with their conditions. Details of both the time and cost based scheduling policies are given in previous research work (Singh and Chana, 2015).

Table 5. Rules of time based resource scheduling
Request Pending Yes Urgency Yes Add Resource Reserve Request Submit Yes No Available Submit No - - Finish

Table 6. Rules of cost based resource scheduling
Request Pending | RA > 0 | Et > Wd | BA > Pr | Status
Yes | True | True | True | Add Resource
Yes | False | True | True | Add Resource
No | - | - | - | Finish
Yes | True | False | True | Finish
Yes | True | True | False | Finish
RA = Resource Available, Et = Estimated Time, Pr = Resource Price, Wd = Desired Deadline and BA = Available Budget.

4. AUTONOMIC RESOURCE MANAGEMENT
The working of the autonomic element of Agri-Info is based on IBM’s autonomic model, which considers four steps of an autonomic system: i) monitor, ii) analyze, iii) plan and iv) execute, as shown in Figure 1. The objective of resource provisioning in autonomic resource management is to provision the resources needed to process user requests. The submitted requests should be executed within their budget and deadline. Requests submitted by users to the resource provisioner are stored as a bulk of workloads for execution. All the submitted workloads are analyzed based on their QoS requirements. Based on the importance of each attribute, weights are calculated for every cloud workload. After that, workloads are clustered using a k-means based clustering algorithm for better resource provisioning (Singh et al., 2015). If the workloads can execute within deadline and budget, and the values of Resource Consumption and Requests Missed are less than the threshold value, resources are provisioned; otherwise an alert is generated so that the workload is analyzed again. After successful provisioning of resources, the Resource Scheduler (RS) takes the information from the appropriate workload after analyzing the various workload details demanded by the user request (Singh and Chana, 2015). The Knowledge Base contains details of all the resources available in the resource pool and the reserve resource pool. Based on the cloud consumer details, RS assigns resources and executes cloud workloads. During execution of a particular cloud workload, the Resource Executor (RE) checks the current workload. If the resources are sufficient for execution, it continues with execution; otherwise it requests more resources. 
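The request ordering used by the two scheduling policies of Section 3.2.3.2 can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the scheduler used in the paper: the `Request` class, its field names, and the sample values are hypothetical, and only the ordering rules (shortest deadline first, or maximum budget first, with ties broken by the smaller execution time) are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class Request:
    # Hypothetical record of a user request; field names are assumptions.
    name: str
    deadline: float        # desired deadline in seconds
    execution_time: float  # estimated execution time in seconds
    budget: float          # available budget in C$


def time_based_order(requests):
    """Shortest deadline first; ties broken by the smaller execution time."""
    return sorted(requests, key=lambda r: (r.deadline, r.execution_time))


def cost_based_order(requests):
    """Largest budget first; ties broken by the smaller execution time."""
    return sorted(requests, key=lambda r: (-r.budget, r.execution_time))


requests = [
    Request("q1", deadline=200, execution_time=50, budget=30),
    Request("q2", deadline=100, execution_time=40, budget=10),
    Request("q3", deadline=100, execution_time=20, budget=30),
]

time_order = [r.name for r in time_based_order(requests)]
cost_order = [r.name for r in cost_based_order(requests)]
print("time based:", time_order)
print("cost based:", cost_order)
```

The sketch covers only the ordering of requests; mapping the ordered requests onto resources that provide high QoS, and the conditions of Tables 5 and 6, are left out.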
If the values of Resource Consumption and Requests Missed are less than the threshold value, RE executes the workloads; otherwise RE generates an alert. After successful execution of cloud workloads, RE releases the free resources to the resource pool and is ready to execute new cloud workloads. During execution of user requests, performance is monitored continuously by the performance monitor sub unit to maintain the efficiency of Agri-Info, and an alert is generated in case of performance degradation. Alerts are generally generated under two conditions: i) if resource consumption exceeds the threshold value of resource consumption for executing the user request (Action: Reallocate resources) and ii) if the number of missed requests is greater than the threshold value (Action: Predict QoS Requirements Again). The same action is performed twice; if Agri-Info still fails to correct the problem, the system is treated as down. The components of the autonomic system are described below.

4.1. Sensors
Sensors get information about the performance of the other nodes used in the system and their current state. First, the updated information from the processing nodes is transferred to the manager node, and the manager node then transfers this information to the sensors. The updated information includes information about QoS parameters (execution time, execution cost, resource utilization etc.).

4.2. Monitor
Initially, monitors collect the information from sensors in order to monitor performance variations continuously by comparing expected and actual performance, and they monitor the values of resource consumption and missed requests. Actual performance is observed based on QoS parameters, and this information is transferred to the next module for further analysis.

4.3. 
Analysis and Plan
The analyze and plan module analyzes the information received from the monitoring module and makes a plan of adequate actions for the corresponding alert. The following formula is used to calculate Resource Consumption (Equation 1):

\[
\text{Resource Consumption} = \sum_{i=1}^{n} \left( \frac{\text{Actual Resource Usage}}{\text{Predicted Resource Usage}} \right) \tag{1}
\]

where Actual Resource Usage is the usage of a resource to execute a particular number of user requests, Predicted Resource Usage is the resource usage estimated before actual execution, and n is the number of resources. It is assumed that Predicted Resource Usage ≤ Actual Resource Usage. The value of Resource Consumption is generally more than 1 because Actual Resource Usage is more than Predicted Resource Usage, but ideally it will be 1 when both are equal. In this research work, a maximum value for Resource Consumption has been fixed, which is called the threshold value. The following formula is used to calculate the number of requests missed (Requests Missed) in a particular period of time (Equation 2):

\[
\text{Requests Missed} = \text{Number of Requests Executed Successfully} - \text{Number of Requests Missed Deadline} \tag{2}
\]

For successful execution of resources, the value of Requests Missed must be less than the threshold value. Algorithm 1 is used to analyze the performance of resource management. With the help of Equation 1 and Equation 2, resource consumption is calculated, the resources are allocated for execution, and the resource consumption is then compared with the threshold value ($Th_c$). If resource consumption is less than the threshold value and the value of Requests Missed is less than the threshold value ($Th_m$), execution of resources continues; otherwise no resource is allocated and the process of reallocation is started using Algorithm 1. After meeting this condition, resources are allocated for further execution, and the values of resource consumption and Requests Missed are checked periodically. If a value exceeds its threshold, an alert is generated by the performance monitor.

Algorithm 1. Analyzing and Planning Unit (AU)

4.4. Executor
The executor implements the plan once the analysis is complete. The main objective of the executor is to reduce the execution time and execution cost and to improve resource utilization. Based on the output of the analysis, the executor tracks new user request submissions and resource additions, and takes action according to the rules described in the knowledge base.

4.5. Effector
The effector is used to exchange updated information and to transfer new policies, rules and alerts to other nodes with updated information.

5. PERFORMANCE EVALUATION
The aim of this performance evaluation is to demonstrate that it is feasible to implement and deploy agriculture as a service on real cloud resources. The tools used for setting up the cloud environment for performance analysis are Microsoft Visual Studio 2010 (SaaS), Aneka (PaaS), SQL Server 2008, and Citrix Xen Server (IaaS). Aneka has been installed along with its requirements on all the nodes that provide cloud service. Nodes in this system can be added or removed based on the requirement. AaaS is installed on the main server and tested on a virtual cloud environment that has been established at CLOUDS Lab, University of Melbourne, Australia. Different numbers of virtual machines have been installed on different servers, and AaaS has been deployed on them to measure the variations. In this experimental setup, three different cloud platforms are used: Software as a Service (SaaS), Platform as a Service (PaaS) and Infrastructure as a Service (IaaS) as shown in Figure 5. 
At SaaS level, Microsoft Visual Studio is used to develop an e-agriculture web service that provides a user interface through which users can access the service from any geographical location. At PaaS level, the Aneka cloud application platform is used as a scalable cloud middleware for interaction between IaaS and SaaS, and to continually monitor the performance of the system. At IaaS level, three different servers (consisting of virtual nodes) have been created through Citrix Xen Server, and SQL Server has been used for data storage. The scheduler, as shown in Figure 5, runs at IaaS level on Citrix Xen Server.

Figure 5. Deployment of components at runtime and their interaction

Computing nodes used in this experimental work are further categorized into three categories as shown in Table 7. The execution cost is calculated based on the user request and deadline (if the deadline is too early (urgent), it will be more costly because greater processing speed and free resources are needed to process the particular request with urgency). An individual price is fixed (artificially) for each resource because all the resources work in a coordinated manner to fulfill the demand of the user (which changes dynamically). The experimental setup uses 3 servers on which virtual nodes (12 = 6 (Server 1) + 4 (Server 2) + 2 (Server 3)) are created. Every virtual node has a different number of Execution Components (ECs) to process user requests, and every EC has its own cost (C$/EC time unit (Sec)). Table 7 shows the characteristics of the resources used and their Execution Component (EC) access cost per time unit in Cloud dollars (C$); the access cost in C$ is manually assigned for experimental purposes. The access cost of an EC in C$/time unit does not necessarily reflect the cost of execution when ECs have different capabilities. 
The execution agent needs to translate the access cost into a cost in C$ for each resource. Such translation helps in identifying the relative cost of resources for executing user requests on them. Due to the limited number of resources, cost increases as the number of user requests increases. Cost varies in two cases: i) relaxed deadline and ii) tight deadline. In both cases, when the deadline is low (e.g. 200 secs), the number of user requests processed increases as the budget increases. When a higher budget is available, the execution agent uses expensive resources to process more user requests within the deadline. Alternatively, when scheduling with a low budget, the number of user requests processed increases as the deadline is relaxed. Several experiments have been performed comparing AaaS (QoS-aware autonomic), as discussed in Section 4, with a non-autonomic resource management technique (non-autonomic), in which no autonomic scheduling mechanism is considered while allocating resources to process user requests.

5.1. Datasets

The datasets used in this research work are downloaded from the Open Government Data Platform India (data.gov.in, 2015), the India Agriculture and Climate Data Set (Sanghi et al.), and the dataset for regional land and climate modelling in China (Shangguan et al., 2012). They can be on the order of 1,000,000 records, with a size of 3.5 GB. The data arrive in large variety and volume from two sources: users, in the form of images such as crops damaged by weather, insects etc., and devices, through Internet of Things (IoT) sensors and satellites (GPS systems) that send weather-related images. As a result of regular capturing and collection, the datasets grow at a velocity of 80.72 KB/minute or more (Sanghi et al.). Five different tables are used to process the different types of data, as described in Table 8 to Table 12.

Table 7.
Configuration Details of Cloud Environment

| Resource_Id | Configuration Specifications | Operating System | Number of Virtual Nodes | Number of ECs | Price (C$/EC Time Unit) |
|---|---|---|---|---|---|
| R1 | Intel Core 2 Duo 2.4 GHz, 1 GB RAM and 160 GB HDD | Windows | 6 | 18 | 2 |
| R2 | Intel Core i5-2310 2.9 GHz, 1 GB RAM and 160 GB HDD | Linux | 4 | 12 | 3 |
| R3 | Intel XEON E5-2407 2.2 GHz, 2 GB RAM and 320 GB HDD | Linux | 2 | 6 | 4 |

Table 8. Crop information

| CropId | Crop Name | Crop Type | Soil Texture | Min Land | Growing Period | Seed Type | Price | Quantity |
|---|---|---|---|---|---|---|---|---|
| C1 | Rice | Kharif | Silty Clay | 5 Acre | 3 Months | Wet | 1200 Rs./Kg | 2 Kg/Acre |
| C2 | Maize | Rabi | Silty Loam Clay | 4 Acre | 4 Months | Dry | 1600 Rs./Kg | 1 Kg/Acre |
| C3 | Wheat | Zaid | Loam Clay | 3 Acre | 3 Months | Wet | 1000 Rs./Kg | 2 Kg/Acre |
| C4 | Sugarcane | Cash | Silty | 4 Acre | 6 Months | Dry | 800 Rs./Kg | 6 Kg/Acre |

Table 9. Weather information

| Crop Name | Temperature | Season | Pressure (CFM) | Wind Speed | Rainfall | Location |
|---|---|---|---|---|---|---|
| Rice | 15-18 °C | Winter | 0.75 to 1.5 | 16 Km/h | 300–650 mm | Ambala |
| Maize | 17-22 °C | Summer | 0.05 to 0.5 | 12 Km/h | 100–150 mm | Amritsar |
| Wheat | 25-30 °C | Rainy | 1.5 to 5.2 | 17.3 Km/h | 200–250 mm | Ganga Nagar |
| Sugarcane | 35-40 °C | Summer | 1 to 10 | 8 Km/h | 400–600 mm | Pathankot |

Table 10. Soil information

| Soil Texture | Bulk Density | Inorganic Material | Organic Material | Water | Air | Color | Structure | Infiltration |
|---|---|---|---|---|---|---|---|---|
| Silty Clay | 2.60 to 2.75 grams per cm3 | Sand and Clay | Plant and animal residues | 25% | 28% | Brown | Plate-like | 15 mm/hour |
| Silty Loam Clay | 2.7 to 2.75 grams per cm3 | Sand and Silt | Animal residues | 22% | 18% | Red | Prism-like | 10 mm/hour |
| Loam Clay | 2.60 to 2.75 grams per cm3 | Clay and Silt | Plant residues | 37% | 21% | Brown | Block-like | 18 mm/hour |
| Silty | 2.60 to 2.75 grams per cm3 | Sand, Silt and Clay | Plant and animal residues | 31% | 29% | Black | Sphere-like | 22 mm/hour |

Table 11. Pest information

| Crop Type | Crop Disease | Effect | Treatment | Pesticide Name | Solubility in Water | Price | Outcome |
|---|---|---|---|---|---|---|---|
| Kharif | Bacterial brown spot | Degrade soil fertility | Reduce Irrigation | Carbonate | Yes | Rs. 1500/L | Improve Productivity |
| Rabi | Zonate eye spot | Degrade productivity | Distribute Soil | Organophosphate | No | Rs. 2200/L | Improve soil fertilization |
| Zaid | Dwarf bunt | Increase risk of other disease | Spray irrigation | Parathyroid | Yes | Rs. 2300/L | Reduce risk of other diseases |
| Cash | Ergot | Degrade productivity | Drip Irrigation | Parathyroid | Yes | Rs. 1800/L | Reduce productivity |

Table 12. Fertilizer information

| Crop Type | Fertilizer Name | Nutrient Composition | Price |
|---|---|---|---|
| Kharif | Urea | Nitrogen in form of urea (amide) (N) | 7000 Rs./10 Kg |
| Rabi | Ammonium-Nitrate | Ammoniacal Nitrogen, Nitrogen Nitrate and Urea Nitrogen | 9100 Rs./10 Kg |
| Zaid | Ammonium-Sulphate | Ammoniacal nitrogen and Sulphur | 6200 Rs./10 Kg |
| Cash | Urea-Ammonium | Ammoniacal nitrogen and Neutral ammonium citrate Soluble phosphate | 13200 Rs./10 Kg |

5.2. Performance Metrics

The following metrics are used to calculate the execution cost, execution time, resource utilization, latency, detection rate and scalability for processing user requests, as taken from previous work (Singh and Chana, 2015; Singh et al., 2015; Singh and Chana, 2016):

Execution Time is the ratio of the difference between request finish time ($WF_i$) and request start time ($WStart_i$) to the number of requests. The following formula is used to calculate Execution Time (ET) (Equation 3):

\[ ET_i = \sum_{i=1}^{n} \left( \frac{WF_i - WStart_i}{n} \right) \tag{3} \]

where n is the number of requests to be executed.

Execution Cost is defined as the total cost spent per hour on the execution of requests, measured in Cloud Dollars (C$). The following formula is used to calculate execution cost (C) (Equation 4):

\[ C = ET_i \times Price \tag{4} \]

Latency is defined as the difference between the time a cloud workload is input and the time the output for that workload is produced.
The following formula is used to calculate Latency (Equation 5):

\[ Latency_i = \sum_{i=1}^{n} \left( \text{time of output produced after execution} - \text{time of input of cloud workload} \right) \tag{5} \]

where n is the number of workloads.

Resource Utilization is defined as the ratio of the actual time a resource spends executing workloads to the total uptime of that resource, for a single resource. The following formula is used to calculate resource utilization (Equation 6):

\[ ResourceUtilization_i = \sum_{i=1}^{n} \left( \frac{\text{actual time spent by resource to execute workload}}{\text{total uptime of resource}} \right) \tag{6} \]

where n is the number of workloads.

Security is measured in terms of detection rate. Experiments have been conducted with different types of attacks (DoS, R2L, U2R and Probing); the tools used to launch them are the Metasploit framework for DoS, Hydra for R2L, NetCat for U2R and NMAP for probing. Detection Rate is the ratio of the total number of true positives to the total number of intrusions (Sørensen et al., 2010):

\[ \text{Detection Rate} = \frac{\text{Total Number of True Positives}}{\text{Total Number of Intrusions}} \tag{7} \]

Scalability is measured in terms of throughput, which is the ratio of the total number of workloads to the total amount of time required to execute them. The following formula is used to calculate throughput (Equation 8):

\[ \text{Throughput} = \frac{\text{Total Number of Workloads } (W_n)}{\text{Total amount of time required to execute the workloads } (W_n)} \tag{8} \]

5.3. Experimental Results -- Based on Modelling and Simulation using CloudSim

Experiments have been conducted with 180 user requests to verify execution cost, execution time, resource utilization, latency, detection rate and scalability. Latency increases as the number of user requests increases. Latency in the QoS-aware autonomic system is lower than in non-autonomic resource scheduling at different numbers of user requests, as shown in Figure 6.
The maximum value of latency is 193 seconds and the minimum is 59 seconds in the QoS-aware autonomic resource management technique. Average latency in QoS-aware autonomic is 15.22% lower than in the non-autonomic resource management technique.

Figure 6. Effect of change in number of user requests on latency

The average cost for both the QoS-aware cloud-based autonomic resource management technique and non-autonomic resource management is calculated for different numbers of user requests, as shown in Figure 7. Average cost increases with the number of user requests. At 180 user requests, average cost in QoS-aware autonomic is 25.36% lower than in the non-autonomic resource management technique. QoS-aware autonomic performs well across different numbers of user requests. Execution cost in QoS-aware autonomic is 27.65% lower than in the non-autonomic resource management technique.

As shown in Figure 8, execution time increases with the number of user requests. At 90 user requests, execution time in the QoS-aware autonomic resource management technique is 24.66% lower than in the non-autonomic technique. After 120 user requests, execution time increases abruptly in the non-autonomic technique, while QoS-aware autonomic continues to perform better. Average execution time in QoS-aware autonomic is 18.96% lower than in the non-autonomic resource management technique.

The percentage of resource utilization increases with the number of user requests. Resource utilization in the QoS-aware autonomic resource management technique is higher than in non-autonomic resource management (non-autonomic) at different numbers of user requests, as shown in Figure 9.
The maximum resource utilization is 94.66%, reached at 180 user requests in QoS-aware autonomic, which performs better than the non-autonomic technique. Average resource utilization in QoS-aware autonomic is 31.96% higher than in the non-autonomic resource management technique.

Scalability is measured in terms of throughput. A number of software, network and hardware faults (fault percentage) have been injected to verify the throughput of the proposed system with 100 user requests. Figure 10 compares the throughput of the QoS-aware autonomic resource management approach and the non-QoS-based resource management technique (non-autonomic) at 100 user requests, and shows that QoS-aware autonomic performs better than non-autonomic. In this experiment, the maximum value of throughput was found at a fault percentage of 45%, where QoS-aware autonomic has 26% more throughput than non-autonomic.

Detection rate increases with time and takes into account the number of blocked and detected attacks. When a new attack or intrusion is detected, the database is updated with new signatures, and new policies and rules are generated to avoid the same attack. Experiments have been conducted for known attacks; Figure 11 shows that QoS-aware autonomic performs better than the Snort anomaly detector (non-autonomic). Further, signatures of some known attacks have been removed from the database to verify the working of the proposed system.

Figure 7. Effect of change in number of user requests on execution cost

Figure 8. Effect of execution time with change in number of user requests

Figure 9. Effect of change in number of user requests on resource utilization
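The metrics behind the comparisons above (Equations 3 to 8 in Section 5.2) can be sketched in a few lines. This is a minimal illustration rather than the authors' measurement code: the function names and the sample timings are hypothetical, and only the formulas follow the definitions in the text.

```python
from typing import Sequence


def execution_time(start: Sequence[float], finish: Sequence[float]) -> float:
    """Equation 3: sum of (finish - start) over the n requests, divided by n."""
    n = len(start)
    return sum(f - s for s, f in zip(start, finish)) / n


def execution_cost(exec_time: float, price: float) -> float:
    """Equation 4: execution time multiplied by the resource price (C$)."""
    return exec_time * price


def latency(input_times: Sequence[float], output_times: Sequence[float]) -> float:
    """Equation 5: summed (output time - input time) over the workloads."""
    return sum(o - i for i, o in zip(input_times, output_times))


def resource_utilization(busy_times: Sequence[float], uptime: float) -> float:
    """Equation 6: time spent executing workloads over total resource uptime."""
    return sum(busy_times) / uptime


def detection_rate(true_positives: int, intrusions: int) -> float:
    """Equation 7: true positives divided by total intrusions."""
    return true_positives / intrusions


def throughput(workloads: int, total_seconds: float) -> float:
    """Equation 8: workloads executed per unit of time."""
    return workloads / total_seconds


if __name__ == "__main__":
    # Hypothetical timings in seconds, for illustration only.
    et = execution_time([0.0, 10.0, 20.0], [30.0, 50.0, 70.0])
    print("execution time:", et)
    print("execution cost:", execution_cost(et, price=2.0))
    print("latency:", latency([0.0, 5.0], [12.0, 20.0]))
    print("utilization:", resource_utilization([30.0, 40.0, 20.0], uptime=100.0))
    print("detection rate:", detection_rate(45, 50))
    print("throughput:", throughput(180, 600.0))
```

Keeping each metric as a separate pure function makes it straightforward to apply the same definitions to both the autonomic and the non-autonomic runs before comparing them.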
Table 13 compares the execution time needed to process different numbers of workloads (45 and 90) in the cloud environment for the proposed system with different numbers of Virtual Machines (VMs). The number of VMs used to execute the workloads was incremented gradually, showing how the total execution time was reduced when more VMs were added to the cloud. With one virtual node running on Server R1, execution of 45 workloads finished in 436.12 seconds. With 12 virtual nodes (6 running on R1, 4 on R2 and 2 on R3), the application took 276.16 seconds. Execution time therefore decreases as virtual nodes are added.

Figure 10. Throughput [100 user requests] vs. Fault percentage (%)

Figure 11. Detection rate vs. Attacks

Table 13. Total execution time of a bulk of cloud workloads distributed in three servers

| Number of Workloads | Virtual Nodes on R1 | Virtual Nodes on R2 | Virtual Nodes on R3 | Total Workers | Execution Time (Seconds) |
|---|---|---|---|---|---|
| 45 | 1 | 0 | 0 | 1 | 436.12 |
| 45 | 1 | 1 | 0 | 2 | 428.69 |
| 45 | 2 | 1 | 0 | 3 | 418.97 |
| 45 | 2 | 2 | 0 | 4 | 407.55 |
| 45 | 3 | 2 | 0 | 5 | 398.17 |
| 45 | 4 | 2 | 0 | 6 | 380.30 |
| 45 | 4 | 2 | 1 | 7 | 361.66 |
| 45 | 4 | 3 | 1 | 8 | 345.18 |
| 45 | 5 | 3 | 1 | 9 | 331.21 |
| 45 | 5 | 3 | 2 | 10 | 315.03 |
| 45 | 5 | 4 | 2 | 11 | 299.97 |
| 45 | 6 | 4 | 2 | 12 | 276.16 |
| 90 | 1 | 0 | 0 | 1 | 1803.11 |
| 90 | 1 | 1 | 0 | 2 | 1771.18 |
| 90 | 2 | 1 | 0 | 3 | 1759.66 |
| 90 | 2 | 2 | 0 | 4 | 1736.15 |
| 90 | 3 | 2 | 0 | 5 | 1691.77 |
| 90 | 4 | 2 | 0 | 6 | 1668.96 |
| 90 | 4 | 2 | 1 | 7 | 1636.11 |
| 90 | 4 | 3 | 1 | 8 | 1625.19 |
| 90 | 5 | 3 | 1 | 9 | 1578.21 |
| 90 | 5 | 3 | 2 | 10 | 1551.68 |
| 90 | 5 | 4 | 2 | 11 | 1529.11 |
| 90 | 6 | 4 | 2 | 12 | 1503.11 |

5.4. Statistical Analysis

The statistical significance of the results has been analyzed using the Coefficient of Variation (Coff. of Var.), a statistical method. Coff. of Var. is a statistical measure of the distribution of data about the mean value. It is used to compare different means and, furthermore, to offer an overall analysis of the performance of the technique used for creating the statistics.
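As a minimal sketch of this measure, the coefficient of variation can be computed as the standard deviation divided by the mean, expressed as a percentage. The function name and the sample values below are hypothetical, and the use of the population standard deviation (`statistics.pstdev`) is an assumption, since the text does not say whether a population or a sample deviation is used.

```python
import statistics
from typing import Sequence


def coefficient_of_variation(values: Sequence[float]) -> float:
    """Return SD / mean * 100, using the population standard deviation."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return statistics.pstdev(values) / mean * 100


if __name__ == "__main__":
    # Hypothetical execution times (seconds) from repeated runs.
    samples = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
    print(f"{coefficient_of_variation(samples):.2f}%")  # prints 40.00%
```

A lower value means the measurements cluster more tightly around their mean, which is the sense in which small values are read as a sign of stability.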
Coff. of Var. states the deviation of the data as a proportion of its average value, and is calculated as follows (Equation 9):

\[ \text{Coff. of Var.} = \frac{SD}{M} \times 100 \tag{9} \]

where SD is the standard deviation and M is the mean. Coff. of Var. of execution time and execution cost has been studied for the QoS-aware autonomic resource management technique and the non-autonomic resource management technique, as shown in Figure 12 and Figure 13. The range of Coff. of Var. (0.25% - 1.69%) for execution time and (0.37% - 1.96%) for cost confirms the stability of the QoS-aware autonomic resource management technique. A small value of Coff. of Var. signifies that the QoS-aware autonomic resource management technique is more efficient in resource scheduling in situations where the number of user requests changes. The value of Coff. of Var. decreases as the number of user requests increases.

Figure 12. CoV for execution time with each scheduling algorithm

Figure 13. CoV for execution cost with each scheduling algorithm

6. CONCLUSION AND FUTURE DIRECTIONS

A cloud-based autonomic information system (AaaS) for agriculture service has been presented, which manages various types of agriculture-related data across different domains, collected through different user preconfigured devices. The K-NN (k-Nearest Neighbor) classification mechanism is used to classify the agriculture data. The classified data is then interpreted, and users can easily diagnose the agriculture status automatically through AaaS. In addition, AaaS uses two resource scheduling policies (time and cost) for efficient resource allocation at infrastructure level after identifying the QoS requirements of a user request. The performance of the proposed system has been evaluated in a cloud environment, and experimental results show that the proposed system performs better in terms of execution time, cost, resource utilization, latency, scalability and security. In future, the proposed technique can be extended by incorporating other QoS parameters such as network bandwidth, availability, customer satisfaction and computing capacity. It can also be extended by developing a pluggable scheduler, in which resource scheduling can be changed easily based on the requirements.

ACKNOWLEDGMENT

One of the authors, Dr. Sukhpal Singh Gill [Post Doctorate Fellow], gratefully acknowledges the CLOUDS Lab, School of Computing and Information Systems, The University of Melbourne, Australia, for awarding him the Fellowship to carry out this research work.

REFERENCES

Agriculture Data, Government of India. (n. d.). Retrieved from https://data.gov.in/catalogs/sector/Agriculture-9212
Elsheikh, R., Mohamed Shariff, A. R. B., Amiri, F., Ahmad, N. B., Balasundram, S. K., & Soom, M. A. M. (2013). Agriculture Land Suitability Evaluator (ALSE): A decision and planning support tool for tropical and subtropical crops. Computers and Electronics in Agriculture, 93, 98–110. doi:10.1016/j.compag.2013.02.003
Hu, Y., Quan, Z., & Yao, Y. (2004). Web-based Agricultural Support Systems. Proceeding of the Workshop on Web-based Support Systems (pp. 75-80).
India Agriculture And Climate Data Set. (n. d.). Retrieved from https://ipl.econ.duke.edu/dthomas/dev_data/datafiles/india_agric_climate.htm
Jeong, S., Jeong, H., Kim, H., & Yoe, H. (2013). Cloud Computing based Livestock Monitoring and Disease Forecasting System. International Journal of Smart Home, 7(6), 313–320. doi:10.14257/ijsh.2013.7.6.30
Narayana Reddy, M., & Rao, N. H. (1995).
GIS Based Decision Support Systems in Agriculture. National Academy of Agricultural Research Management Rajendranagar.
Nikkilä, R., Seilonen, I., & Koskinen, K. (2010). Software architecture for farm management information systems in precision agriculture. Computers and Electronics in Agriculture, 70(2), 328–336. doi:10.1016/j.compag.2009.08.013
Prasad, S., Peddoju, S. K., & Ghosh, D. (2013). AgroMobile: A Cloud-Based Framework for Agriculturists on Mobile Platform. International Journal of Advanced Science and Technology, 59, 41–52. doi:10.14257/ijast.2013.59.04
Ruixue, Z. (2002). Study on Web-based Agricultural Information System Development Method. Proceedings of the Third Asian Conference for Information Technology in Agriculture, China (pp. 601-605).
Shangguan, W., Dai, Y., Liu, B., Ye, A., & Yuan, H. (2012, February 29). A soil particle-size distribution dataset for regional land and climate modelling in China. Geoderma, 171, 85–91. doi:10.1016/j.geoderma.2011.01.013
Singh, S., & Chana, I. (2015). QoS-aware Autonomic Resource Management in Cloud Computing: A Systematic Review. ACM Computing Surveys, 48(3), 1–46. doi:10.1145/2843889
Singh, S., & Chana, I. (2015). QRSF: QoS-aware resource scheduling framework in cloud computing. The Journal of Supercomputing, 71(1), 241–292. doi:10.1007/s11227-014-1295-6
Singh, S., & Chana, I. (2015). Q-aware: Quality of service based cloud resource provisioning. Computers & Electrical Engineering, 47, 138–160. doi:10.1016/j.compeleceng.2015.02.003
Singh, S., & Chana, I. (2016). EARTH: Energy-aware Autonomic Resource Scheduling in Cloud Computing. Journal of Intelligent and Fuzzy Systems, 30(3), 1581–1600. doi:10.3233/IFS-151866
Sørensen, C. G., Fountas, S., Nash, E., Pesonen, L., Bochtis, D., Pedersen, S. M., & Blackmore, S. B. et al. (2010). Conceptual model of a future farm management information system. Computers and Electronics in Agriculture, 72(1), 37–47. doi:10.1016/j.compag.2010.02.003
Sørensen, C.
G., Pesonen, L., Bochtis, D. D., Vougioukas, S. G., & Suomi, P. (2011). Functional requirements for a future farm management information system. Computers and Electronics in Agriculture, 76(2), 266–276. doi:10.1016/j.compag.2011.02.005

Sukhpal Singh Gill joined the Computer Science and Engineering Department of Thapar University, Patiala, India, in 2016 as a faculty member. Presently, Dr. Gill is working as a Post Doctorate Fellow at CLOUDS Lab, School of Computing and Information Systems, The University of Melbourne, Australia. Dr. Gill obtained the Degree of Master of Engineering in Software Engineering from Thapar University, as well as a Doctoral Degree with specialization in “Autonomic Cloud Computing” from Thapar University. Dr. Gill received the Gold Medal in Master of Engineering in Software Engineering. Dr. Gill is a DST Inspire Fellow [2013-2016] and worked as a SRF-Professional on a DST Project, Government of India. He has completed certifications in Cloud Computing Fundamentals, including Introduction to Cloud Computing and Aneka Platform (US Patented) by ManjraSoft Pty Ltd, Australia, and Certification of Rational Software Architect (RSA) by IBM India. His research interests include Software Engineering, Cloud Computing, Internet of Things and Fog Computing. He has more than 40 research publications in reputed journals and conferences.

Inderveer Chana joined the Computer Science and Engineering Department of Thapar University, Patiala, India, in 1997 as Lecturer and is presently serving as Professor in the department. She holds a Ph.D. in Computer Science with specialization in Grid Computing, an M.E. in Software Engineering from Thapar University and a B.E. in Computer Science and Engineering. Her research interests include Grid and Cloud computing, and other areas of interest are Software Engineering and Software Project Management.
She has more than 100 research publications in reputed journals and conferences. Under her supervision, more than 40 M.E. theses and seven Ph.D. theses have been awarded, and five Ph.D. theses are ongoing. She is also working on various research projects funded by the Government of India.

Rajkumar Buyya is a Fellow of IEEE, Professor of Computer Science and Software Engineering and Director of the Cloud Computing and Distributed Systems (CLOUDS) Laboratory at the University of Melbourne, Australia. He is also serving as the founding CEO of Manjrasoft, a spin-off company of the University, commercialising its innovations in Cloud Computing. He has authored over 500 publications and four text books. He is one of the highly cited authors in computer science and software engineering worldwide (h-index 110+, 60000+ citations). He has served as the founding Editor-in-Chief (EiC) of IEEE Transactions on Cloud Computing and is now serving as Co-EiC of Journal of Software: Practice and Experience.