1. Introduction
Healthcare is a fundamental concern globally, where errors in health systems can have profound and irreversible consequences [1]. Among the myriad health conditions, diabetes stands out as a particularly perilous disease, with any lapse in addressing abnormalities potentially leading to severe complications, including diabetic coma or even death. More than half a billion people are living with diabetes worldwide, affecting men, women, and children of all ages in every country, and that number is projected to more than double to 1.3 billion people in the next 30 years, with every country seeing an increase. Diabetes ranks among the most pervasive and deadly diseases, generating vast amounts of healthcare data daily. However, much of these data remain untapped, lacking analysis or actionable insights. Herein lies the opportunity for technologies such as data mining to unravel the myriad threats and disorders associated with diabetes, enabling timely interventions and informed decision-making. Advancements in data mining algorithms offer a glimmer of hope in enhancing the detection of abnormalities in diabetic patients, particularly by analyzing blood sugar levels to glean deeper insights into the patient’s physiological state. While traditional diabetes control systems are available, the integration of fog computing offers distinct advantages, particularly in improving response times. Incorporating local data mining not only strengthens system reliability but also reduces dependence on cloud-based services, potentially leading to cost savings and enhanced operational efficiency. To focus this study, we make the following assumptions:
A local analysis enhances the reliability of diabetic patient monitoring systems by minimizing reliance on external networks and enhancing data processing efficiency.
Fog computing and local data mining technologies accelerate information transmission and processing, resulting in decreased response times for detecting blood sugar abnormalities.
Data mining techniques enable the detection of abnormalities tailored to each patient’s unique biological profile, thereby improving the effectiveness of alert mechanisms.
In alignment with the above assumptions, our primary research question is as follows:
“How can we optimize diabetic patient monitoring systems by leveraging local analysis to enhance reliability, utilizing fog computing and local data mining to expedite information transmission and processing for reduced response times, and employing data mining to detect abnormalities tailored to the patient’s biological profile?”
In this study, a novel IoT healthcare system leveraging fog computing and data mining techniques is proposed, focusing on diabetes management. The research addresses two major challenges in healthcare: the timely detection of abnormalities and accurate diagnosis tailored to individual patient conditions. By utilizing fog computing, which processes data at the network edge, and the KNN algorithm for data analysis, the system enhances response times and reliability. Through simulations and real-world testing with diabetic patient data, the study demonstrates the system’s effectiveness in providing prompt alerts and accurate notifications. This approach highlights the potential for improving patient care by integrating advanced technologies to optimize monitoring and intervention processes.
The motivation behind this study stems from the urgent need to enhance diabetes management and patient care through advanced technology. Diabetes is a widespread chronic condition requiring continuous monitoring and timely intervention to prevent severe complications. Traditional healthcare systems often struggle with delayed responses and generalized treatment approaches that may not cater to individual patient needs. By integrating fog computing and data mining techniques, this study aims to address these challenges by enabling real-time data processing and personalized analysis. The goal is to improve the accuracy of diagnoses, speed up response times, and ultimately provide more effective and timely care for diabetic patients, potentially reducing healthcare costs and improving quality of life.
With the described motivation, the main contributions of this paper are listed as follows:
Integration of IoT and Fog Computing: This study presents a novel integration of IoT and fog computing technologies within healthcare systems, specifically targeting the management of chronic conditions like diabetes. The approach enhances real-time data analysis and decision-making at the edge of the network, enabling more proactive and personalized patient care.
Implementation of Raspberry Pi in Fog Layer: The research showcases the practical implementation of Raspberry Pi microcomputers within the fog layer. By employing the KNN algorithm on these devices, the system effectively analyzes patient data, providing timely alerts and insights for healthcare providers.
Experimental Validation: The proposed system was experimentally tested and evaluated using IBM SPSS Modeler and a real-world dataset. The results validated the system’s efficiency, demonstrating its potential to improve healthcare outcomes through prompt abnormality detection and notification.
The upcoming sections of this paper unfold as follows: The following section offers an in-depth exploration of the existing literature on healthcare and data mining, providing a comprehensive overview. Subsequently, we outline the proposed methodology and the necessary systems for implementation. The succeeding section is dedicated to conducting simulations and executing the proposed approach. Finally, the concluding section encapsulates our findings and conclusions and provides recommendations for future research.
2. Literature Review
The pervasive influence of emerging technologies on our daily lives underscores the belief that there exists an Information Technology (IT) solution for nearly every societal challenge. One such solution gaining increasing prominence is the Internet of Things (IoT) [2].
As described in an IEEE special report, the IoT embodies a self-configuring, adaptive, and intricate network through which “things” communicate via standard protocols. These interconnected entities possess sensing and activation capabilities, programmability, and unique detectability [3].
The IoT represents a groundbreaking concept where physical objects, ranging from wearables to vehicles and from the literature to environments, can be interconnected, addressed, and managed remotely. Advancements in technologies like radio frequency identification (RFID) sensors have accelerated the evolution of the IoT [
1].
This transformative technology has permeated various domains, including healthcare, transportation, computing, and manufacturing [
4]. Among its most critical and compelling applications lies pharmacovigilance and healthcare. The IoT holds the potential to revolutionize facets of healthcare delivery, including telehealth, fitness monitoring, chronic disease management, and eldercare. It facilitates remote patient monitoring, enhances reporting mechanisms, and streamlines homecare services. Through IoT-enabled healthcare solutions, medical costs stand to decrease, quality of life to improve, and user experiences to be enhanced. Moreover, for healthcare providers, the IoT enables remote device management, potentially reducing downtime and facilitating rapid equipment replacement [
5].
Because technology continues to advance rapidly, healthcare industries are still migrating their legacy and outdated eHealth systems and applications to provide a better experience to their customers and patients [6].
The healthcare sector, characterized by its unpredictable nature and critical time-sensitive scenarios, demands swift and accurate communication and localization during emergencies. Cloud-based mechanisms often struggle to deliver timely notifications during emergencies [
7].
Traditional cloud servers are unable to meet the low-latency requirements of IoT medical equipment and consumers. For IoT data transfer, it is therefore vital to reduce network latency, computation delay, and energy consumption. With fog computing (FC), data can be stored, processed, and analyzed at the network edge rather than in a distant cloud, which reduces latency [8]. In scenarios where a sensor-to-cloud architecture is impractical or prohibited, a reliable healthcare system capable of real-time patient monitoring becomes imperative. One such solution is fog computing [9].
Fog computing, an IoT-based architecture, leverages established technologies such as cloud computing, distributed control systems, cloudlets, and wireless networks. Unlike traditional IT architectures where intelligence resides either in the cloud or at the endpoint, this architecture offers edge computing capabilities, enabling low latency, geographic distribution, and real-time analytics [
10,
11]. Despite its potential, the vast amounts of data generated by IoT devices pose challenges and opportunities alike. Data mining emerges as a pivotal technique to extract actionable insights from large datasets, thereby enhancing IoT systems’ intelligence [
12]. Nowadays, big data have become an increasingly popular domain in various fields that are associated with society, technology, science, and engineering. A huge amount of data is recorded and produced from diverse sectors, from distinct resources like sensor networks, mobile applications, high throughput instruments, and streaming machines and also in other fields like the healthcare industry [
13].
In the realm of healthcare, the IoT has garnered significant attention, particularly in disease management. Among these, diabetes stands out as a focal point. Diabetes, a metabolic disorder characterized by high glucose levels and inadequate insulin, poses severe health risks, including blindness, organ failure, and cardiovascular complications. While existing systems focus on monitoring glucose levels, they often fall short in addressing other vital signs’ fluctuations and associated risks. Leveraging IoT data analytics, however, holds promise in mitigating these risks and improving patient outcomes. By analyzing sensor data in real time, healthcare providers can make informed decisions promptly, potentially averting critical health incidents [
14].
In 2023, Yuqian Yang et al. [15] proposed a fault-tolerant control protocol that dynamically addresses fault data during system operation in multi-agent systems. Such studies have made the use of multiple sensors and agents more reliable in healthcare fields.
Recent technological advancements such as radio frequency identification (RFID) and web development have facilitated machine-to-machine communication through the Internet, resulting in the emergence of the Internet of Things (IoT). This global network facilitates ubiquitous processing, enabling entities to sense their environments, interact with others, and make decisions. Healthcare systems have embraced the IoT, with care beds equipped with sensors that relay various information to clinicians.
2.1. Healthcare Systems
In 2015, Islam et al. [5] categorized IoT-based healthcare services into single-state and multi-state groups, addressing either a specific disease or multiple diseases. Key categories include Ambient Assisted Living (AAL), utilizing artificial intelligence for eldercare, and M-IoT, leveraging 4G networks for mobile healthcare. Semantic Medicine involves analyzing big data using established rules from relevant concepts and sciences. The categories are presented in Figure 1. They cover a wide range of healthcare systems for different groups, ages, and conditions. Tele-healthcare can combine medical sensors with computing and communication technologies. Elder healthcare systems extend life with services like medication control. Drug Reaction Investigation systems check the compatibility of a patient’s health records with pharmaceutical information. In addition, wearable devices play a vital role in healthcare systems. Infant healthcare can offer services to improve nutritional habits, and emergency health systems play a critical role in natural disasters and human errors, with a bundle of solutions such as notifications based on data collected from the environment.
Intriguingly, IoT-generated data in healthcare often traverses the cloud, prompting studies on the relationship between wearable devices, the cloud, and big data [
14]. Yin Zhang et al. [
16] introduced a cyber–physical system, “Health-CPS”, aided by cloud infrastructure and big data analysis. This three-layered system encompasses data gathering, management, and data service layers.
2.2. Disease Prediction
A 2017 study explored disease prediction using big data, categorizing data into structured (e.g., age, gender) and unstructured (e.g., medical records) types. Machine learning algorithms such as naïve Bayes, KNN, and decision tree were employed for prediction [
17]. Emergency situations necessitate alerting systems in healthcare services.
2.3. Alert-Based Healthcare Systems
Oryema et al. (2017) proposed an interoperable messaging system for IoT healthcare services utilizing constrained application protocol (CoAP) and message queue telemetry transport (MQTT) [
18].
2.4. Healthcare and Edge Computing Technology
The accuracy, speed, and continuity of patient monitoring systems are critical in healthcare. Cloud dependency poses risks, leading to the adoption of edge computing. Challenges in cloud implementation may render some applications impractical. Edge computing alternatives include mobile edge computing (MEC), fog computing, and cloudlets. Fog computing, with its features like internode cooperation and short delays, proves suitable for healthcare systems [
18]. There are some challenges for IoT-based cloud implementation, so it may be impossible to implement some applications inside the cloud layer. Some of the challenges are summarized in
Figure 2. Challenges such as security issues may have destructive social effects as well as interruption issues, delays in sending and receiving data, and the need for high bandwidth, making the cloud an unsuitable tool for healthcare applications.
2.5. Types of Edge Computing
A 2017 study compared edge technologies regarding their computing, caching, and communication convergence [19]. The potential benefits of mobile edge computing (MEC), which provides cloud servers at base stations, are short delays, high bandwidth, context awareness, and real-time services. Fog computing uses near-to-user edge devices, such as edge routers, for computing and caching. It has the same benefits as MEC. Cloudlets are another edge computing-based technology, located only one step from the user. They are self-configuring and energy efficient. These technologies are compared in Table 1. The differences and common features of fog, cloudlet, and MEC technologies are shown in Figure 3 [20].
Fog Computing
Fog computing is an emerging technology to address computing and networking bottlenecks in the large-scale deployment of IoT applications. It is a promising complementary computing paradigm to cloud computing where computational, networking, storage, and acceleration elements are deployed at the edge and network layers in a multi-tier, distributed, and possibly cooperative manner [
21].
Internode cooperation, context awareness, the ability to have more than one layer, content awareness, and short delays are the main features of fog computing that make it a suitable choice for healthcare systems. Fog computing is a process that brings some of the cloud processing structures to the network edge. It adds new features to the network, including shorter delays due to processing data at the edge of the network (2018).
In 2016, Chiang et al. described the general structure of fog computing [
22]. As shown in
Figure 4, fog–cloud communication manages fog with the cloud; exchanges information bilaterally, when necessary; provides required services to each other; and offers end-to-end services. The fog nodes communicate with each other to support applications, share data, and collaborate for computing, storage, or backup. The fog–user interaction is necessary for presentations and data transfer.
Establishing communication every time with the cloud is not required with the introduction of fog, and thus, the latency is reduced. Healthcare is a latency-sensitive application area. Therefore, the deployment of fog computing in this area is of vital importance. Proper analytics and research may lead to better care, improved treatment, and enhanced patient satisfaction [
23]. Fog computing benefits healthcare by enhancing services. A proposed platform combines fog and cloud technologies for medical data collection, automatic medication prescription, and robotic medication delivery [
24].
As shown in
Figure 5, the platform consists of three parts: (1) medical data collecting, (2) automatic medication prescription, and (3) robotic medication delivery.
During the fog implementation and operation phases, the gateways are adjacent to the IoT domain and accessible through short-range radio communications such as Bluetooth. The cloud is further away and accessible through a wide area network (WAN). It provides the needs of healthcare systems through the platform as a service (PaaS) and infrastructure as a service (IaaS). Fog computing has changed the quality, accuracy, and speed of healthcare services.
A fog-based approach to improve the IoT structure for healthcare was presented in 2017 [25]. The approach consisted of three layers: (1) sensors and actuators, (2) a smart gateway network, and (3) the final system. The function of the middle layer is to process data locally to improve the response speed to medical situations, to filter and preprocess the data at the edge, and to compress and store the data locally.
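As a rough illustration of the middle-layer functions described above, the following Python sketch filters a batch of glucose readings and compresses it for local storage. The function names, the plausibility bounds, and the JSON-plus-zlib encoding are assumptions made for this example; they are not taken from [25].

```python
import json
import zlib

def preprocess(readings, low=20, high=600):
    """Drop missing or implausible glucose values (bounds are assumed, in mg/dL)."""
    return [r for r in readings if r is not None and low <= r <= high]

def compress(readings):
    """Serialize the readings as JSON and compress them for local storage."""
    return zlib.compress(json.dumps(readings).encode("utf-8"))

def decompress(blob):
    """Inverse of compress(): recover the list of readings."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))

# Hypothetical batch containing one missing value and one sensor glitch.
raw_batch = [104, None, 98, 9999, 131]
clean_batch = preprocess(raw_batch)
stored_blob = compress(clean_batch)
```

Only the cleaned, compressed batch would need to be kept at the gateway or forwarded to the cloud, which is the bandwidth and latency benefit attributed to the middle layer.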
2.6. Data Mining
Data mining, the technique of discovering patterns in large datasets, plays a vital role in extracting valuable information from raw data. In 2017, Liu et al. reviewed some of the major supervised algorithms [
26], namely decision tree (DT), naïve Bayes (NB), support vector machine (SVM), radial basis function neural network (RBFNN), and KNN algorithms. Various datasets were used to test the algorithms. The memory and CPU usage of the algorithms are compared in
Figure 6.
2.6.1. Data Mining for IoT
A review study conducted in 2015 [
12] defined data mining as the best-practice process for predicting or developing a descriptive model for a large volume of data so that even new data can be generated. Generally, the purpose of data mining is to find interesting information among a large body of stored data. The sequence of a typical data mining process is shown in
Figure 7.
Shikhar et al. (2017) [27] showed that real-time IoT analytics is not limited to databases and can also be applied at the network edge. As the analysis reaches the edge, data-generating things might be able to analyze the data locally instead of sending them to databases. This can reduce processing delays, which is critical for real-time analysis.
2.6.2. Data Mining in Healthcare
Data mining is a process that interacts with a large dataset to determine complex, interesting patterns from unknown structured data. It is strongly associated with high-performance computing, computer graphics, multimedia systems, human–computer interaction, and pattern recognition [
28]. Numerous organizations utilize data mining to analyze enormous datasets, to enhance the decision-making process, and to obtain better long-term results [
29]. Data mining techniques enable IoT platforms to generate valuable insights from static healthcare data that would otherwise remain unused.
2.7. Diabetes
Many diseases require constant care and control. As the third and fifth leading cause of death in the world and Iran, respectively, diabetes is a critical and pervasive condition. Many people are born or diagnosed with diabetes every year. Many studies and services have been presented to control this disease.
IoT and Diabetes
Diabetes, a pervasive and critical condition, demands constant care and control. IoT-based approaches for diabetes management include self-managing systems, RFID-enabled devices for insulin management, and healthcare systems establishing bilateral connections between patients and clinicians [
30,
31,
32].
A healthcare platform with a humanoid robot showcased a centralized IoT approach for diabetes care [
33]. In summary, the literature review highlights the pervasive influence of the IoT in healthcare, emphasizing its role in disease prediction, alert-based systems, edge computing, fog computing, data mining, and specific applications for diabetes management. This groundwork sets the stage for our proposed study, which integrates fog computing, data mining, and the IoT for enhancing healthcare services, with a focus on diabetic patient monitoring.
2.8. Analysis of Methods, Systems, and Steps Taken
Ensuring the accuracy of diagnosis and post-care, along with promptly responding to critical abnormalities, is essential. While global standards dictate the monitoring of blood sugar and vital signs in many countries, they may not always align with individual patient needs. Some individuals may exhibit normal blood sugar and blood pressure levels that deviate from these standards, rendering traditional monitoring systems less effective in detecting potential abnormalities.
In such cases, leveraging data mining techniques becomes imperative to tailor monitoring approaches to individual patient profiles, thereby enhancing accuracy and responsiveness. Additionally, deploying nodes at the fog layer helps mitigate delays in alerting caregivers, ensuring swift intervention in critical situations.
3. Proposed Method
As shown in Figure 8, the proposed method was carried out in two stages:
To implement the method, we needed a fog space to process data, classify and detect anomalies, and send notifications. Our research focuses on leveraging Raspberry Pi within the fog layer as a specific implementation example.
To analyze the data, we used machine learning algorithms with IBM SPSS Modeler.
3.1. IBM SPSS Modeler
We utilized IBM SPSS Modeler v18, a comprehensive data mining and analysis simulator. This software offers a wide array of libraries for implementing data mining algorithms, ensuring compatibility with various Excel files or databases. The visual presentation of outputs enhances the interpretation of results, facilitating insightful analysis.
3.2. Datasets
The proposed method necessitates blood sugar data, which were sourced from the UCI website, renowned for its comprehensive datasets in machine learning research. The dataset utilized comprises 10,000 entries and includes three primary categories: test time, a code indicating the patient’s condition before the test (refer to
Table 2), and glucose level.
Table 2 presents the dataset codes, indicating various patient conditions and their corresponding codes. Notably, each condition is associated with a distinct standard glucose level. For instance, the standard fasting blood sugar (code 58) for diabetic patients typically falls within the range of 90–125 mg/dL, as per the American Diabetes Association website. The data were organized in an Excel file and categorized by week, as illustrated in
Table 3.
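To make the weekly organization concrete, the following Python sketch groups records of the form (test time, condition code, glucose level) by calendar week. The records, the timestamp format, and the function name are hypothetical and serve only to illustrate the layout; they are not values from the dataset.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical records: (test time, condition code, glucose in mg/dL).
records = [
    ("2024-01-02 07:30", 58, 100),
    ("2024-01-02 18:05", 62, 119),
    ("2024-01-03 07:35", 58, 216),
    ("2024-01-09 07:29", 58, 149),
]

def group_by_week(rows):
    """Group (code, glucose) pairs by the ISO (year, week) of the test time."""
    weeks = defaultdict(list)
    for stamp, code, glucose in rows:
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        iso = when.isocalendar()
        weeks[(iso[0], iso[1])].append((code, glucose))
    return dict(weeks)

weekly = group_by_week(records)
```

Each key of `weekly` identifies one week, and each value lists the condition codes and glucose levels recorded in that week.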
3.3. Data Mining Algorithms
A plethora of algorithms exists for classifying and analyzing data. Among them, the KNN algorithm stands out for its simplicity and efficiency. Being a lazy algorithm, KNN requires no training phase: it simply stores the data and defers computation until a new input must be classified, making it particularly suitable for implementation on fog layer nodes. Hence, we opted for KNN as the data mining algorithm for our proposed method.
3.4. KNN Algorithms
KNN is a simple algorithm that stores all the received data and classifies them based on their similarity. New input is classified by comparing the similarity (or distance) of the new input to all the neighboring data. The KNN can use various methods to calculate the similarity, two of which are described below.
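- 1
Euclidean distance
The length of the line segment between two points is taken as the distance. It is calculated by the following equation:
$$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$$
where x is the new input, y is the neighboring point, and n is the number of features.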
- 2
Cosine similarity
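Cosine similarity measures the similarity between two non-zero vectors by calculating the cosine of the angle formed between them. The equation below illustrates this calculation:
$$\cos(\theta) = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^2} \, \sqrt{\sum_{i=1}^{n} B_i^2}}$$
where A is the new input and B is the neighboring point.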
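The two similarity measures described in this section can be sketched in Python as follows. This is a minimal illustration using standard definitions; the function names are our own, and the code is not the implementation used in the experiments.

```python
import math

def euclidean_distance(x, y):
    """Length of the straight line between points x and y."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors a and b."""
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)
```

A smaller Euclidean distance means a closer neighbor, whereas a larger cosine similarity means a closer neighbor, so the ordering of neighbors must be reversed when switching between the two measures.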
An important consideration in utilizing the KNN algorithm is selecting an appropriate value for K, representing the number of nearest neighbors to consider. Two common approaches for this are arbitrary selection and cross-validation, which determines the optimal K. In our proposed method, cross-validation was employed to ascertain an optimal K.
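As an illustration of how cross-validation can select K, the sketch below uses leave-one-out cross-validation on one-dimensional glucose readings labeled with condition codes. The cross-validation scheme, the candidate values of K, the data, and all function names are assumptions for this example; the study does not specify these details.

```python
from collections import Counter

def knn_predict(train, value, k):
    """Majority label among the k training readings closest to value."""
    nearest = sorted(train, key=lambda item: abs(item[0] - value))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def loo_accuracy(data, k):
    """Leave-one-out accuracy: classify each reading using all the others."""
    hits = 0
    for i, (value, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        if knn_predict(rest, value, k) == label:
            hits += 1
    return hits / len(data)

def best_k(data, candidates):
    """Candidate K with the highest leave-one-out accuracy (first one on ties)."""
    return max(candidates, key=lambda k: loo_accuracy(data, k))

# Hypothetical (glucose in mg/dL, condition code) pairs.
data = [(95, 58), (100, 58), (105, 58), (110, 58),
        (150, 62), (155, 62), (160, 62), (165, 62)]
chosen_k = best_k(data, [1, 3, 5, 7])
```

On a resource-constrained fog node, preferring the smallest K among equally accurate candidates keeps the per-reading classification cost low.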
3.5. Fog Layer Implementation
3.5.1. Data Collection
During this stage, measurements of the glucose level, blood pressure, and body temperature of a diabetic patient over two consecutive days were extracted from the dataset. In accordance with the UCI standard, the measurements were taken under different conditions, including before and after insulin injection, before the main meal, and under fasting conditions.
The data were stored in an Excel file, as shown in
Table 4.
3.5.2. Raspberry Pi
Raspberry Pi computers serve as the fog layer nodes in our system. These microcomputers, roughly the size of a credit card, come in multiple versions. For our implementation, we utilized the Raspberry Pi 2 model, which offers the computing power necessary for executing our proposed method.
Raspbian v9 was used as the operating system of Raspberry Pi. Raspbian is an open-source operating system.
3.5.3. Node-RED
Node-RED is a powerful programming tool based on JavaScript. We leveraged Node-RED for programming tasks at the fog layer in our system. The interface of Node-RED provides a user-friendly environment for creating and deploying workflows. Its intuitive visual interface simplifies the development process, making it accessible for both novice and experienced programmers.
4. Simulation, Implementation, and Results
In this section, we employed two distinct methods to validate our proposed approach: simulation using IBM SPSS Modeler and fog layer implementation. Here is a breakdown of each method.
- (a)
Simulation using IBM SPSS Modeler:
As illustrated in Figure 9, the Excel file is accessed by the modeler for data processing. Prior to importing the file into IBM SPSS Modeler, a new row labeled “Online Blood Glucose” was appended to include input data for another week, while any aberrant data points in the “Blood Glucose” column were filtered out. Within the Excel node configuration, time serves as the input, the code serves as the classifier, and the blood glucose level is designated as the input data. Notably, the target variable is denoted as the online glucose level, as depicted in Figure 10.
As depicted in
Figure 11, the KNN algorithm was operationalized through the utilization of the type and KNN nodes. The system was configured to classify the data from the “Blood Glucose” column based on the corresponding values in the “Code” column. The overarching objective is to enable a comparative analysis between “Online Glucose” data and the classes derived from “Blood Glucose” to detect abnormalities effectively.
Figure 12 illustrates the classification of data based on the condition codes.
Figure 13 demonstrates the comparison of a blood sugar of 197 mg/dL, which exceeds the normal range, with three neighboring data points.
Figure 14 illustrates that the blood sugar level of 197 mg/dL deviates significantly from its neighboring data points, indicating that it does not belong to the correct class.
Consequently, the system categorizes this reading as abnormal and may trigger alerts to caregivers as necessary.
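The class-mismatch check described above can be sketched in Python as follows: a new reading is classified by its nearest historical readings, and it is flagged when the resulting class differs from the condition code reported with it. The historical values, the choice of three neighbors, and the function names are hypothetical; this is an illustration of the idea rather than the SPSS Modeler stream itself.

```python
from collections import Counter

def knn_label(history, glucose, k=3):
    """Condition code held by the majority of the k closest past readings."""
    nearest = sorted(history, key=lambda r: abs(r[0] - glucose))[:k]
    return Counter(code for _, code in nearest).most_common(1)[0][0]

def is_abnormal(history, glucose, reported_code, k=3):
    """Flag a reading whose nearest neighbors belong to a different class."""
    return knn_label(history, glucose, k) != reported_code

# Hypothetical past readings: (glucose in mg/dL, condition code).
history = [(98, 58), (104, 58), (110, 58),
           (185, 60), (192, 60), (201, 60)]

fasting_197 = is_abnormal(history, 197, reported_code=58)
fasting_102 = is_abnormal(history, 102, reported_code=58)
```

One limitation of this simple check is that a reading far outside every class is still assigned to its nearest class, so a practical system might add a maximum-distance test before accepting a match.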
- (b)
Fog Layer Implementation
As shown in
Figure 15, the fog implementation process starts with data collection and goes to data analysis with the help of Raspberry Pi and Node-RED.
Data collected from a diabetic individual were employed to deploy the fog layer. Blood sugar level fluctuations are depicted in
Figure 16 and
Figure 17. Throughout the observed days, the patient exhibited normal conditions with no indications of abnormal spikes or drops in sugar levels except for the second day, where the second blood sugar measurement reached 250 mg/dL. Notably, it is evident that the patient’s normal blood sugar levels exceed the global standard. Therefore, comparing the blood sugar level to the global standard values may lead to false alerts within the system.
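The false-alert problem can be illustrated with a short Python sketch that applies the fixed fasting range quoted in Section 3.2 (90–125 mg/dL) to a patient whose usual fasting readings lie above it. The patient readings and the function name are hypothetical.

```python
# Fasting range quoted in Section 3.2, in mg/dL.
GLOBAL_FASTING_RANGE = (90, 125)

def outside_global_range(glucose, bounds=GLOBAL_FASTING_RANGE):
    """True when a reading falls outside the fixed global range."""
    low, high = bounds
    return glucose < low or glucose > high

# Hypothetical fasting readings that are typical for one particular patient.
usual_fasting = [138, 142, 135, 140, 145]
flags = [outside_global_range(g) for g in usual_fasting]
false_alerts = sum(flags)
```

Every one of these routine readings would raise an alert under the fixed range, which is why the proposed method compares new readings with the patient’s own history instead.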
As previously mentioned, we utilized Raspberry Pi 2 for the fog layer implementation. The patient data files were converted to .csv format, and the FTP tool integrated into FileZilla software (3.56.2) facilitated the transfer of these converted files to Raspberry Pi. Subsequently, Node-RED was employed to execute the transferred files and initiate the data processing tasks. The parameters selected for analysis included the blood sugar level and patient condition codes. These parameters were transmitted to the function node as inputs, where the process of abnormality detection was conducted, as illustrated in
Figure 18.
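As a rough analogue of the input step, the following Python sketch parses a .csv file into the two parameters passed on for analysis, namely the blood sugar level and the condition code. The column names and the sample rows are assumptions for this example; the actual flow was built in Node-RED.

```python
import csv
import io

# Hypothetical file content; the real column names may differ.
SAMPLE_CSV = """glucose,code
105,58
250,58
"""

def read_measurements(text):
    """Parse CSV text into a list of (glucose in mg/dL, condition code) pairs."""
    reader = csv.DictReader(io.StringIO(text))
    return [(int(row["glucose"]), int(row["code"])) for row in reader]

measurements = read_measurements(SAMPLE_CSV)
```

Each pair in `measurements` corresponds to one message that would be handed to the abnormality detection step.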
The KNN algorithm was executed within the function node using JavaScript, as depicted in
Figure 19.
The UI node visually displayed the two parameters alongside the condition results. Subsequently, two sample fasting datasets were employed: one representing normal conditions and the other exhibiting abnormal characteristics, as depicted in
Figure 20 and
Figure 21.
Data point #105 was established as the benchmark for normal blood sugar levels, while data point #250, indicating an anomalous blood sugar level, served as the threshold for triggering alerts. Consequently, the responses generated by Raspberry Pi were considered appropriate based on these criteria.