Background: COVID-19 prompted a global shift to online learning, including video conference-assisted online learning (VCAOL), which required educators to understand students' perspectives. Objective: This study aims to develop a machine learning (ML) model with model-agnostic interpretability that can predict students' academic performance in VCAOL. Material and methods: The Synthetic Minority Over-sampling Technique (SMOTE) and data augmentation were used to handle imbalanced data from small-scale datasets. The prediction model was developed using Random Forest (RF), Support Vector Machine (SVM), and Gaussian Naive Bayes (GNB). SHAP model-agnostic interpretability was used to interpret the prediction findings. The data were gathered from September 2022 to January 2023, resulting in 361 records. The research variables included students' academic performance as the dependent variable, and the video conference application (VC), learning material (LM), internet connection (IC), students' ability to learn (SL), and student knowledge (SK) as independent variables, which were mapped into 28 attributes. Result: SMOTE improved the performance of all three algorithms, with RF outperforming SVM and GNB in almost all tests, achieving an accuracy of 79.45%, precision of 75.71%, and recall of 79.45%. SHAP bar plots, which rank attributes by importance, showed that "Performance", "Frequency Constraint", and "Increase Value" had a significant impact on prediction results. When we mapped these three attributes to our study perspectives, we determined that SK and SL were the most important perspectives for students to perform well in VCAOL. SHAP's beeswarm plot revealed that students' performance in VCAOL was positively correlated with "Performance", "Increase Value", "Completing Project", "Adequate Method", "User Interface", and "Feature". When we mapped these attributes to our study perspectives, we found that SK, LM, SL, and VC were positively related to students' performance in VCAOL. Conclusion: The study highlighted the potential of ML in developing data-driven decision-making tools for predicting students' academic performance and identifying critical attributes in a prediction model.
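To make the attribution idea behind SHAP concrete, the following is a minimal sketch of exact Shapley values for a tiny model, computed by averaging each feature's marginal contribution over all feature orderings. It is not the study's code and does not use the SHAP library, which relies on approximations for realistic feature counts; `shapley_values`, `toy_model`, the instance, and the all-zero baseline are hypothetical names and values chosen for illustration.

```python
from itertools import permutations

def shapley_values(predict, instance, baseline):
    """Exact Shapley values: average marginal contribution of each feature
    over every ordering in which features are switched from the baseline
    value to the instance value. Only feasible for a handful of features."""
    n = len(instance)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = predict(current)
        for i in order:
            current[i] = instance[i]      # reveal feature i
            value = predict(current)
            totals[i] += value - previous  # marginal contribution of i
            previous = value
    return [t / len(orderings) for t in totals]

# Hypothetical model with one interaction term between features 0 and 2.
def toy_model(x):
    return 2.0 * x[0] + 1.0 * x[1] + 0.5 * x[0] * x[2]

phi = shapley_values(toy_model, instance=[1.0, 1.0, 1.0], baseline=[0.0, 0.0, 0.0])
print(phi)
```

The attributions sum to the difference between the prediction for the instance and the prediction for the baseline, and the interaction term is split evenly between the two features that share it. A bar plot of mean absolute attributions per feature is what ranks attributes by importance.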
2023 International Seminar on Intelligent Technology and Its Applications (ISITIA)
According to WHO, cardiovascular illnesses are the main cause of death globally, killing 17.9 million people every year. Machine learning analysis of patients' Electronic Medical Records (EMR) data is helpful for disease risk prediction. The goal of our research was to construct a machine learning-based prediction model for atherosclerotic heart disease. This research investigated three machine learning algorithms, namely AdaBoost, Random Forest, and Naive Bayes. The study began with data collection from the electronic medical records of Harapan Kita National Heart Center patients for the period 2016-2021. Data preprocessing produced a dataset of 4691 records. The prediction features were selected in discussion with a medical professional: thrombocyte, khermchc, erythrocyte, hematocrit, hermch, hemoglobin, age, leukocyte, and gender. The outcomes revealed that AdaBoost reached the highest ROC AUC score (68%), followed by Naive Bayes (66%) and Random Forest (56%). Subsequently, we used Shapley Additive Explanations (SHAP) and a beeswarm plot to produce an information-dense summary, to interpret the prediction results, and to describe how each attribute affected the prediction model. We found that thrombocyte was the most important feature for the prediction model. This research contributed by paving the way for a prediction framework and by improving the information-dense summary of the arteriosclerosis heart disease prediction model.
International Journal of Fuzzy Logic and Intelligent Systems, 2023
In the field of medical data mining, imbalanced data classification occurs frequently and typically leads to classifiers with low predictive accuracy for the minority class. This study aims to construct a classifier model for imbalanced data using the SMOTE oversampling algorithm and a heart disease dataset obtained from Harapan Kita Hospital. The classification models were logistic regression, decision tree, random forest, bagging logistic regression, and bagging decision tree. SMOTE improved the models' prediction accuracy on imbalanced data, particularly for minority classes.
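The core idea of SMOTE can be sketched in a few lines: each synthetic minority sample is placed at a random position on the segment between a real minority sample and one of its nearest minority neighbours. The sketch below is a simplified illustration in plain Python, not the study's implementation and not the imbalanced-learn API; `smote_sketch`, its parameters, and the sample points are hypothetical.

```python
import random

def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Hypothetical minority-class samples with two features each.
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote_sketch(minority, n_new=6)
print(new_points)
```

Because every synthetic point is a convex combination of two real minority samples, it always lies inside the region the minority class already occupies. Oversampling is normally applied to the training split only, so that synthetic points do not leak into the evaluation data.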
Telemedicine is a solution to the problem of limited public access to health services, especially during the COVID-19 pandemic. The purpose of this research is to develop a mobile-based telemedicine application to help people obtain health services. The research method consists of data collection, problem identification, application design, and evaluation. The problem tree method was used to identify problems. The application was designed using the System Development Life Cycle (SDLC) Waterfall model. The results of the study showed a positive public response to the telemedicine application as a means of easing access to health services during the COVID-19 pandemic.
Telemedicine refers to remote clinical services. Telemedicine technology is frequently used for consultation and other clinical services that can be delivered remotely. This research aimed to develop a health care mobile application for sub-district primary health care. The application can be considered an additional channel for health care services that connects healthcare personnel and patients without boundaries. The main features developed were Registration, Consultation, Visiting schedule, Referral hospital, and News. The System Development Life Cycle (SDLC) waterfall model was used as the application development method. The resulting application was evaluated using the User Experience Questionnaire (UEQ) method. The UEQ-based evaluation showed positive results for the attractiveness and novelty aspects but negative results for the perspicuity, efficiency, dependability, and stimulation aspects.
Objectives: The number of deaths from cardiovascular disease is projected to reach 23.3 million by 2030. As a contribution to preventing this phenomenon, this paper proposed a machine learning (ML) model to predict patients with arteriosclerotic heart disease (AHD). We also interpreted the prediction model results based on the ML approach and deployed model-agnostic ML methods to identify informative features and their interpretations. Methods: We used a hematology Electronic Health Record (EHR) with information on erythrocytes, hematocrit, hemoglobin, mean corpuscular hemoglobin, mean corpuscular hemoglobin concentration, leukocytes, thrombocytes, age, and sex. To detect and predict AHD, we explored random forest (RF), XGBoost, and AdaBoost models. We examined the prediction model results based on the confusion matrix and accuracy measures. We used the Shapley Additive exPlanations (SHAP) framework to interpret the ML model and quantify the contribution of features to predictions. Res...
Proceeding of the 3rd International Conference on Electronics, Biomedical Engineering, and Health Informatics, 2022
Early detection has already reduced pregnancy risk, complications, emergency situations, and maternal mortality. Our study's goal was to build an intelligent application for early pregnancy risk prediction based on machine learning. We examined data from 997 patients with 114 attributes, taken from the electronic medical records in the primary health care cohort data of the ENA System of the Sawah Besar Primary Health Care. Subsequently, eight attributes were chosen as classifier attributes, based on the Indonesian Ministry of Health's Maternal and Child Health handbook and under the supervision of a medical doctor. Machine learning and the Knowledge Discovery from Data (KDD) technique were applied to build the intelligent prediction. We investigated the decision tree C4.5, random forest, and naive Bayes algorithms to see which was the best match for our application. The accuracy values for decision tree C4.5, random forest, and naive Bayes were 98.01%, 98.51%, and 68.81%, respectively. On most accuracy measures, the random forest algorithm exceeded the decision tree C4.5 and naive Bayes algorithms. Consequently, we employed random forest to build the web-based application. Additionally, all three algorithms obtained AUCs ranging from 0.95 to 0.99, indicating very high discriminative performance. Our study's contribution was to pave the way for machine learning in intelligent applications for early pregnancy risk prediction. In conclusion, we successfully developed an intelligent application for pregnancy risk prediction based on machine learning and revealed its potential for self-checking and early detection of pregnancy risk.
Proceeding of the 3rd International Conference on Electronics, Biomedical Engineering, and Health Informatics, 2022
Nutritional problems such as stunting and wasting among Indonesian children are still faced by the Indonesian government. Moreover, the COVID-19 pandemic can have various negative impacts on children in the short, medium, or long term. UNICEF estimates show that the number of children experiencing wasting or malnutrition increased by about 15% during the COVID-19 pandemic. Children at risk of wasting tend to experience stunting and are vulnerable to long-term developmental disorders. Early detection of child malnutrition is the key to successful prevention and treatment [2]. If this activity runs optimally, many cases of malnutrition can be prevented and treated quickly and appropriately so that the children's condition does not get worse. Therefore, this study compares three machine learning models used to predict nutritional status among children under five years of age based on physical examination. The machine learning models are C4.5 Decision Tree, K-Nearest Neighbors (KNN), and Naive Bayes. Healthcare record data from 360 patients with 4 attributes were taken from the Sawah Besar Community Health Service. Model performance was assessed using accuracy and F1-score. The C4.5 Decision Tree, K-Nearest Neighbors, and Naive Bayes algorithms can all be used to predict the nutritional status of children under five years of age. Moreover, the model can help mothers know the nutritional status of their children and can be used by the Sawah Besar Community Health Service to monitor the nutritional health of children under five years of age. The results show that the C4.5 Decision Tree has the highest performance, with 89.87% accuracy and a 91.67% F1-score. Based on the experimental results, the C4.5 Decision Tree is the best of the three methods for predicting the nutritional status of children under five years of age.
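As an illustration of the two evaluation measures named above, the following minimal sketch computes accuracy and F1-score from predicted and true labels. It assumes a binary labelling for simplicity (the abstract does not state how many nutritional-status classes were used), and the function name and the label lists are hypothetical, not data from the study.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels, treating `positive`
    as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, f1

# Hypothetical labels: 1 = at-risk, 0 = normal.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]
acc, f1 = accuracy_and_f1(y_true, y_pred)
print(acc, f1)
```

Accuracy counts all correct predictions equally, while F1 ignores true negatives, which is why the two figures can differ when the classes are imbalanced.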
Atherosclerosis is a common condition characterized by the buildup of plaque in the arteries [1]. This accumulation can obstruct blood flow throughout the body. Consequently, when atherosclerosis affects the heart's blood vessels, it can lead to coronary heart disease and heart attacks [1]. Cardiovascular disease (CVD) is the primary cause of death globally, accounting for approximately 17.9 million fatalities annually [2]. The Sample Registration System (SRS) 2019 report from the Ministry of Health Republic of Indonesia
Background: COVID-19 prompted a global shift to online learning, including video conference-assis... more Background: COVID-19 prompted a global shift to online learning, including video conference-assisted online learning (VCAOL), which necessitated educators understanding students' perspectives. Objective: This study aims to develop machine learning (ML) model-agnostic interpretability that could predict students' academic performance in VCAOL. Material and methods: Synthetic Minority Over-sampling Technique (SMOTE) and data augmentation were used to handle imbalanced data from small-scale datasets. The prediction model was developed using Random Forest (RF), Support Vector Machine (SVM), and Gaussian Naive Bayes (GNB). SHAP model-agnostic interpretability was used to interpret and comprehend prediction findings. The data was gathered from September 2022 to January 2023, resulting in 361 records. The research variables included students' academic performance as the dependent variable, and the video conference application (VC), learning material (LM), internet connection (IC), students' ability to learn (SL), and student knowledge (SK) as independent variables, which were mapped into 28 attributes. Result: The SMOTE improved the performance of three algorithms, with RF outperforming SVM and GNB in almost all tests, achieving an accuracy of 79.45%, precision of 75.71%, and recall of 79.45%. SHAP bar plots ranked attributes by importance demonstrated that "Performance," "Frequency Constraint," and "Increase Value" had a significant impact on prediction results. When we mapped the three attributes to our study perspective, we determined that SK and SL were the most important views for students to perform well in VCAOL. SHAP's beeswarm revealed students' performance in VCAOL was positively correlated with "Performance", "Increase Value", "Completing Project", "Adequate Method", "User Interface", and "Feature". 
As we mapped the three attributes to our study perspective, we found that SK, LM, SL, and VC were positively related to students' performance in VCAOL. Conclusion: The study highlighted the potential of ML in developing data-driven decision-making tools for predicting students' academic performance and identifying critical attributes in a prediction model.
2023 International Seminar on Intelligent Technology and Its Applications (ISITIA)
According to WHO, cardiovascular illnesses are the main cause of death globally, killing 17.9 mil... more According to WHO, cardiovascular illnesses are the main cause of death globally, killing 17.9 million people every year. Machine learning analysis of patients' Electronic Medical Records (EMR) data was helpful for disease risk prediction. Goal of our research was for constructing machine learning-based prediction model for atherosclerotic heart disease prediction. This research investigated machine learning algorithms, namely AdaBoost, Random Forest and Naive Bayes. The study began with data collection based on the electronic medical records from Harapan Kita National Heart Center patients from the period of 2016-2021. Data preprocessing produced 4691 records for dataset. Discussion with a medical professional was used to select the features predictions, which were thrombocyte, khermchc, erythrocyte, hematocrit, hermch, hemoglobin, age, leukocyte, and gender. The outcomes revealed AdaBoost reached the utmost ROC AUC score (68%), followed by Naive Bayes (66%) and Random Forest (56%). Subsequently we used Shapley Additive Explanations (SHAP) and beeswarm plot to reveal information-dense summary, to interpret the prediction result and to describe how each attribute affected the prediction model. We revealed that thrombocyte was the most important feature for the prediction model. This research contributed by paving the way a framework for predicting and improved the information-dense summary of arteriosclerosis heart disease prediction model.
INTERNATIONAL JOURNAL of FUZZY LOGIC and INTELLIGENT SYSTEMS
In the field of medical data mining, imbalanced data categorization occurs frequently, which typi... more In the field of medical data mining, imbalanced data categorization occurs frequently, which typically leads to classifiers with low predictive accuracy for the minority class. This study aims to construct a classifier model for imbalanced data using the SMOTE oversampling algorithm and a heart disease dataset obtained from Harapan Kita Hospital. The categorization model utilized logistic regression, decision tree, random forest, bagging logistic regression, and bagging decision tree. SMOTE improved the model prediction accuracy with imbalanced data, particularly for minority classes.
Background: COVID-19 prompted a global shift to online learning, including video conference-assis... more Background: COVID-19 prompted a global shift to online learning, including video conference-assisted online learning (VCAOL), which necessitated educators understanding students' perspectives. Objective: This study aims to develop machine learning (ML) model-agnostic interpretability that could predict students' academic performance in VCAOL. Material and methods: Synthetic Minority Over-sampling Technique (SMOTE) and data augmentation were used to handle imbalanced data from small-scale datasets. The prediction model was developed using Random Forest (RF), Support Vector Machine (SVM), and Gaussian Naive Bayes (GNB). SHAP model-agnostic interpretability was used to interpret and comprehend prediction findings. The data was gathered from September 2022 to January 2023, resulting in 361 records. The research variables included students' academic performance as the dependent variable, and the video conference application (VC), learning material (LM), internet connection (IC), students' ability to learn (SL), and student knowledge (SK) as independent variables, which were mapped into 28 attributes. Result: The SMOTE improved the performance of three algorithms, with RF outperforming SVM and GNB in almost all tests, achieving an accuracy of 79.45%, precision of 75.71%, and recall of 79.45%. SHAP bar plots ranked attributes by importance demonstrated that "Performance," "Frequency Constraint," and "Increase Value" had a significant impact on prediction results. When we mapped the three attributes to our study perspective, we determined that SK and SL were the most important views for students to perform well in VCAOL. SHAP's beeswarm revealed students' performance in VCAOL was positively correlated with "Performance", "Increase Value", "Completing Project", "Adequate Method", "User Interface", and "Feature". 
As we mapped the three attributes to our study perspective, we found that SK, LM, SL, and VC were positively related to students' performance in VCAOL. Conclusion: The study highlighted the potential of ML in developing data-driven decision-making tools for predicting students' academic performance and identifying critical attributes in a prediction model.
Telemedicine is a solution to overcome the problem of limited public access to health services, e... more Telemedicine is a solution to overcome the problem of limited public access to health services, especially during the COVID-19 pandemic. The purpose of this research is to develop mobile-based telemedicine application to help people get health services. The method used in this research consists of data collection method, problem identification method, application design method and evaluation method. Problem tree method is used to identify problems in this research. Application design method using System Development Life Cycle (SDLC) Waterfall model. The results of the study showed a positive response from the public to telemedicine application as a means for ease of health services during the COVID-19 pandemic.
Telemedicine refers to remote clinical services. Telemedicine technology was frequently used for ... more Telemedicine refers to remote clinical services. Telemedicine technology was frequently used for consultation and other clinical services that could be delivered remotely. This research aimed to develop a health care mobile application for sub-district primary health care. This application could be considered as an additional channel for health care services to relate healthcare personnel and patients without boundaries. The main features were developed namely Registration, Consultation, Visiting schedule, Referral hospital and News. System Development Life Cycle (SDLC) waterfall was performed as an application development method. Evaluation based on the User Experience Questionnaire (UEQ) method was performed to evaluate the application result. User experience evaluation based on UEQ approach was revealed the acceptance of users for this application on attractive and novelty aspect shown positive result but on the other hand negative result shown for perspicuity, efficiency, dependability and stimulation aspects.
2023 International Seminar on Intelligent Technology and Its Applications (ISITIA), 2023
According to WHO, cardiovascular illnesses are the main cause of death globally, killing 17.9 mil... more According to WHO, cardiovascular illnesses are the main cause of death globally, killing 17.9 million people every year. Machine learning analysis of patients' Electronic Medical Records (EMR) data was helpful for disease risk prediction. Goal of our research was for constructing machine learning-based prediction model for atherosclerotic heart disease prediction. This research investigated machine learning algorithms, namely AdaBoost, Random Forest and Naive Bayes. The study began with data collection based on the electronic medical records from Harapan Kita National Heart Center patients from the period of 2016-2021. Data preprocessing produced 4691 records for dataset. Discussion with a medical professional was used to select the features predictions, which were thrombocyte, khermchc, erythrocyte, hematocrit, hermch, hemoglobin, age, leukocyte, and gender. The outcomes revealed AdaBoost reached the utmost ROC AUC score (68%), followed by Naive Bayes (66%) and Random Forest (56%). Subsequently we used Shapley Additive Explanations (SHAP) and beeswarm plot to reveal information-dense summary, to interpret the prediction result and to describe how each attribute affected the prediction model. We revealed that thrombocyte was the most important feature for the prediction model. This research contributed by paving the way a framework for predicting and improved the information-dense summary of arteriosclerosis heart disease prediction model.
Objectives: The number of deaths from cardiovascular disease is projected to reach 23.3 million b... more Objectives: The number of deaths from cardiovascular disease is projected to reach 23.3 million by 2030. As a contribution to preventing this phenomenon, this paper proposed a machine learning (ML) model to predict patients with arteriosclerotic heart disease (AHD). We also interpreted the prediction model results based on the ML approach and deployed modelagnostic ML methods to identify informative features and their interpretations.Methods: We used a hematology Electronic Health Record (EHR) with information on erythrocytes, hematocrit, hemoglobin, mean corpuscular hemoglobin, mean corpuscular hemoglobin concentration, leukocytes, thrombocytes, age, and sex. To detect and predict AHD, we explored random forest (RF), XGBoost, and AdaBoost models. We examined the prediction model results based on the confusion matrix and accuracy measures. We used the Shapley Additive exPlanations (SHAP) framework to interpret the ML model and quantify the contribution of features to predictions.Res...
Proceeding of the 3rd International Conference on Electronics, Biomedical Engineering, and Health Informatics, 2022
Early detection has already reduced pregnancy risk, complications, emergency situations, and also... more Early detection has already reduced pregnancy risk, complications, emergency situations, and also maternal mortality cases. Our study's goal was to build on the intelligent application for early risk pregnancy prediction based on machine learning. We examined 997 patient data and 114 attributes from the electronic medical records on primary health care cohort data the ENA System of the Sawah Besar Primary Health Care. Subsequently, eight attributes were chosen based on the Indonesian Ministry of Health, Maternal and Child Health handbook and medical doctor-supervised as classifier attributes. Machine learning and Knowledge Discovery from Data (KDD) technique was also applied to build an intelligent prediction in this work. In addition, we investigated the decision tree C4.5, random forest, and naive bayes algorithms for seeing which one was the right match for our application. The accuracy values for decision tree C4.5, random forest, and naive bayes were 98,01 %, 98,51 %, and 68,81 %, respectively. On most accuracy measures, the random forest algorithm exceeded the decision tree C4.5 and the naive bayes algorithm. As a consequence, we employed random forest to build the web-based application. Additionally, all three algorithms obtained AUCs ranging from 0.95 to 0.99, indicating perfect prediction accuracy. Our study's contribution was to pave the way for machine learning potential in intelligent applications for early risk pregnancy prediction. In conclusion, we successfully developed an intelligent application for risk pregnancy prediction based on machine learning and revealed potential implications in providing self-checking and early detection of pregnancy risk based on machine learning.
Proceeding of the 3rd International Conference on Electronics, Biomedical Engineering, and Health Informatics, 2022
Nutritional problems such as stunting and wasting on Indonesian children are still facing by Indo... more Nutritional problems such as stunting and wasting on Indonesian children are still facing by Indonesia government. Moreover, the COVID-19 pandemic can provide various negative impacts on children in short, medium, or long-term. UNICEF estimates show that the number of children who experiencing wasting or malnutrition increased by about 15% during the pandemic COVID-19. Children at risk of wasting will tend to experience stunting and vulnerable to long-term developmental disorders. Early detection of children malnutrition is the key to successful prevention and treatment [2]. If this activity runs optimally, many cases of malnutrition can be prevented and treated quickly and appropriately so that their condition does not get worse. Therefore, this study contrasts three machine learning models utilized to predict nutritional status among children under 5 years of age based on physical examination. The machine learning models are C4.5 Decision Tree, K-Nearest Neighbors (KNN), and Naive Bayes. Healthcare Record data from 360 patients with 4 attributes were taken from Sawah Besar Community Health Service. Model performance was assessed using accuracy and F1-Score. C4.5 Decision Tree, K-Nearest Neighbors, and Naïve Bayes algorithms can be utilized for predicting nutritional status of children under five years of age. Moreover, the model can help mothers to know nutritional status of her children and can be used by the Sawah Besar Community Health Service to monitor the nutritional health of children under five years of age. The result shows that C4.5 Decision Tree has the highest performance with 89.87% of accuracy and 91.67% of F1-Score. Based on the experimental results, the C4.5 Decision Tree method is the best method in predicting the nutritional status of children under five years of age.
Atherosclerosis is a common condition characterized by the buildup of plaque in the arteries [1]. This accumulation can obstruct blood flow throughout the body. Consequently, when atherosclerosis affects the heart's blood vessels, it can lead to coronary heart disease and heart attacks [1]. Cardiovascular disease (CVD) is the primary cause of death globally, accounting for approximately 17.9 million fatalities annually [2]. The Sample Registration System (SRS) 2019 report from the Ministry of Health Republic of Indonesia
The International Journal of Fuzzy Logic and Intelligent Systems, 2023
In the field of medical data mining, imbalanced data categorization occurs frequently, which typically leads to classifiers with low predictive accuracy for the minority class. This study aims to construct a classifier model for imbalanced data using the SMOTE oversampling algorithm and a heart disease dataset obtained from Harapan Kita Hospital. The categorization model utilized logistic regression, decision tree, random forest, bagging logistic regression, and bagging decision tree. SMOTE improved the model prediction accuracy with imbalanced data, particularly for minority classes.
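The oversampling idea behind SMOTE can be illustrated with a minimal sketch. It assumes numeric feature tuples and at least two minority samples. The function name `smote_sketch`, its parameters, and the sample points are hypothetical and are not taken from the study, which does not describe its implementation.

```python
import random

def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the base point, excluding itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to mate
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, mate)))
    return synthetic

# Hypothetical two-feature minority samples (illustration only, not study data).
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote_sketch(minority, n_new=6)
```

Each synthetic point lies on a line segment between two existing minority samples, so the minority class grows without exact duplication of records.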
Springer in series Lecture Notes in Artificial Intelligence LNCS/LNAI, 2019
Regional land use planning and monitoring remain an issue in many developing countries. Efficient solutions for both tasks depend on remote sensing technology to capture and analyze remotely sensed data of the region of interest. Although a plethora of methods for land cover classification have been reported, the problem remains a challenging task in the computer vision field. The advent of deep learning methods in the past decade has been instrumental in developing robust methods for land cover classification using satellite imagery as input. The objective of this paper was to present empirical results on using a CNN as a land cover classifier model with Sentinel-2 spatial satellite imagery. Prior to model training, the input image representation was extracted using eCognition to produce texture, brightness, shape, and vegetation index features. Land cover labeling followed the Land Cover Class in Medium Resolution Optical Imagery Interpretation document provided by the Indonesian National Standardization Agency. The CNN model achieved 0.98 mean training accuracy and 0.98 mean testing accuracy. For comparison, another model, a Gradient Boosting Model (GBM), was trained on the same data and the same features. The results revealed that the training accuracy and testing accuracy with the GBM were 0.98 and 0.95, respectively. The CNN model showed a small improvement in land cover classification accuracy using the image features (NDVI, brightness, GLCM homogeneity, and rectangular fit).
This study aims to learn decision rules from an input dataset using decision tree learning as a supervised classification method to predict cardiovascular risk level for adult patients (above 30 years old). The dataset for this study was provided by a blood chemistry lab at a private hospital in Southern Jakarta and used with permission. The experimental results using the CART algorithm found an optimum decision tree to represent the decision rule previously made by a physician in predicting cardiovascular risk level. The tree, with a depth of 9 and 13 features, achieved 97.3% training accuracy and 96.8% testing accuracy. Further decision tree simplification discovered a set of rules to predict the level of cardiovascular risk despite incomplete predictors as input.
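As an illustration of how a CART-style tree chooses a split, the sketch below scores candidate thresholds on a single numeric feature by weighted Gini impurity, the criterion CART commonly uses for classification. The feature name `cholesterol`, the risk labels, and all values are hypothetical and do not come from the study's dataset.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_threshold(values, labels):
    """Return (threshold, weighted_gini) of the best binary split on one feature."""
    best = (None, float("inf"))
    # Skip the maximum value: splitting there would leave the right side empty.
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best

# Hypothetical measurements and risk labels (illustration only).
cholesterol = [180, 195, 210, 240, 260, 280]
risk = ["low", "low", "low", "high", "high", "high"]
threshold, impurity = best_threshold(cholesterol, risk)
```

A full tree repeats this search over every feature and recurses on each side of the chosen split until a stopping rule, such as a maximum depth, is reached.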
This paper aimed to present a formal representation design, or classification concept, for relating real-world descriptions to image descriptions in a land cover domain using an ontology approach. The paper focused on the airport domain in Indonesia, since detecting airport imagery features is an open challenge because various features are used to describe the visual existence of an airport. The main characteristics of the real-world definition of the airport domain were identified using references from Klasifikasi Penutup Lahan SNI 7645 Badan Standardisasi Nasional (Land Cover Classification SNI 7645 National Standardization Body) and the Ministry of Transportation of the Republic of Indonesia. Subsequently, the image description of the airport domain was defined based on the real-world description of the airport. This description used image features, namely spatial and spectral (color and texture) features, which were organized as ontological properties. Finally, the formal representation of the airport domain was designed by mapping the real-world description of an object into an image description. The paper showed that an ontology approach could be used to design the formal representation of land cover in this way. The ontology itself could be used for image identification, classification, or clustering in the airport domain or other image domains.
The research goal was to conduct a comparative study, based on SentiWordNet, of the Naïve Bayes and Support Vector Machine classification methods for analyzing sentiment towards the election in Indonesia, with Twitter as the data source. Sentiment analysis was used as a substitute for surveys ahead of the upcoming election in Indonesia, in order to know how people feel towards the candidates. The research began by crawling tweets from Twitter, pre-processing the tweets gathered, determining the label of each tweet with SentiWordNet, weighting and selecting words in each tweet, conducting sentiment analysis with k-fold cross-validation, and then generating a confusion matrix. The sentiment analysis with the Support Vector Machine classification method achieved 5.91% higher accuracy than the Naïve Bayes method.
The aim of the research was to conduct a comparative study, based on SentiWordNet, of the Naïve Bayes and Support Vector Machine classification methods for sentiment analysis of the general election, with Twitter as the data source. Sentiment analysis was used as a substitute for surveys to meet the needs of running the general election, so that the public's feelings towards the election candidates could be known. The research began with collecting tweets from Twitter, pre-processing the tweets, determining labels with SentiWordNet, weighting and selecting the words in each tweet, conducting sentiment analysis with k-fold cross-validation, and then testing with a confusion matrix. The sentiment analysis with the Support Vector Machine classification method achieved 5.91% higher accuracy than the Naïve Bayes method.
Keywords: Sentiment Analysis, Twitter, SentiWordNet, Naïve Bayes, SVM
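The evaluation steps named in the abstract above (k-fold cross-validation followed by a confusion matrix) can be sketched as follows. This is a minimal illustration under stated assumptions: the labels are hypothetical, and a majority-class baseline stands in for the classifiers, since the study's Naïve Bayes and SVM pipelines are not described in enough detail to reproduce.

```python
from collections import Counter

def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, test_idx
        start = stop

def confusion_matrix(actual, predicted):
    """Count (actual, predicted) label pairs."""
    return Counter(zip(actual, predicted))

def majority_label(labels):
    """Stand-in 'classifier': always predict the most frequent training label."""
    return Counter(labels).most_common(1)[0][0]

# Hypothetical sentiment labels for ten tweets (illustration only).
labels = ["pos", "neg", "pos", "pos", "neg", "pos", "pos", "pos", "neg", "pos"]

matrix = Counter()
for train_idx, test_idx in k_fold_indices(len(labels), k=5):
    guess = majority_label([labels[i] for i in train_idx])
    actual = [labels[i] for i in test_idx]
    matrix += confusion_matrix(actual, [guess] * len(actual))

correct = sum(count for (a, p), count in matrix.items() if a == p)
accuracy = correct / sum(matrix.values())
```

Summing the per-fold matrices gives one confusion matrix covering every sample exactly once, from which accuracy and other metrics can be derived for each method being compared.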
Papers by eka miranda
Keywords: Sentiment Analysis, Twitter, SentiWordNet, Naïve Bayes, SVM