Journal of
Clinical Medicine
Article
Comparing Multiple Linear Regression and Machine Learning
in Predicting Diabetic Urine Albumin–Creatinine Ratio in a
4-Year Follow-Up Study
Li-Ying Huang 1 , Fang-Yu Chen 1 , Mao-Jhen Jhou 2 , Chun-Heng Kuo 1 , Chung-Ze Wu 3,4 , Chieh-Hua Lu 5 ,
Yen-Lin Chen 6 , Dee Pei 1 , Yu-Fang Cheng 7 and Chi-Jie Lu 2,8,9, *
1 Division of Endocrinology and Metabolism, Department of Internal Medicine, Department of Medical Education, Fu Jen Catholic University Hospital, School of Medicine, College of Medicine, Fu Jen Catholic University, New Taipei City 24352, Taiwan; liyinghuang@yahoo.com (L.-Y.H.); julia0770@yahoo.com.tw (F.-Y.C.); cpp0103@gmail.com (C.-H.K.); peidee@gmail.com (D.P.)
2 Graduate Institute of Business Administration, Fu Jen Catholic University, New Taipei City 242062, Taiwan; aaa73160@gmail.com
3 Division of Endocrinology, Department of Internal Medicine, Shuang Ho Hospital, New Taipei City 23561, Taiwan; chungze@yahoo.com.tw
4 Division of Endocrinology and Metabolism, Department of Internal Medicine, School of Medicine, College of Medicine, Taipei Medical University, Taipei 11031, Taiwan
5 Division of Endocrinology and Metabolism, Department of Internal Medicine, Tri-Service General Hospital, School of Medicine, National Defense Medical Center, Taipei 11490, Taiwan; undeca2001@gmail.com
6 Department of Pathology, Tri-Service General Hospital, National Defense Medical Center, Taipei 11490, Taiwan; anthonypatho@gmail.com
7 Department of Endocrinology and Metabolism, Changhua Christian Hospital, Changhua 50051, Taiwan; cch143989@gmail.com
8 Artificial Intelligence Development Center, Fu Jen Catholic University, New Taipei City 242062, Taiwan
9 Department of Information Management, Fu Jen Catholic University, New Taipei City 242062, Taiwan
* Correspondence: 059099@mail.fju.edu.tw; Tel.: +886-2-2905-2973
Citation: Huang, L.-Y.; Chen, F.-Y.; Jhou, M.-J.; Kuo, C.-H.; Wu, C.-Z.; Lu, C.-H.; Chen, Y.-L.; Pei, D.; Cheng, Y.-F.; Lu, C.-J. Comparing Multiple Linear Regression and Machine Learning in Predicting Diabetic Urine Albumin–Creatinine Ratio in a 4-Year Follow-Up Study. J. Clin. Med. 2022, 11, 3661. https://doi.org/10.3390/jcm11133661
Academic Editor: Fernando
Gómez-Peralta
Received: 29 April 2022
Accepted: 22 June 2022
Published: 24 June 2022
Publisher’s Note: MDPI stays neutral
with regard to jurisdictional claims in
published maps and institutional affiliations.
Abstract: An elevated urine albumin–creatinine ratio (uACR) is a warning sign of deteriorating renal
function in type 2 diabetes (T2D), and the early detection of an elevated uACR has become an important issue.
Multiple linear regression (MLR) has traditionally been used to explore the relationships between
risk factors and endpoints. Recently, machine learning (ML) methods have been widely applied in
medicine. In the present study, four ML methods were used to predict the uACR in a T2D cohort. We
hypothesized that (1) ML outperforms traditional MLR and (2) different ranks of the importance of the
risk factors will be obtained. A total of 1147 patients with T2D were followed up for four years. MLR,
classification and regression tree, random forest, stochastic gradient boosting, and eXtreme gradient
boosting methods were used. Our findings show that the prediction errors of the ML methods are
smaller than those of MLR, which indicates that ML is more accurate. The six most important
factors were baseline creatinine level, systolic and diastolic blood pressure, high-density lipoprotein
cholesterol, glycated hemoglobin, and fasting plasma glucose. In conclusion, ML might be more
accurate in predicting uACR in a T2D cohort than the traditional MLR, and the baseline creatinine
level is the most important predictor, followed by systolic and diastolic blood pressure, high-density
lipoprotein cholesterol, glycated hemoglobin, and fasting plasma glucose in Chinese patients with T2D.
Keywords: type 2 diabetes; nephropathy; urine albumin-creatinine ratio; machine learning
Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
1. Introduction
Type 2 diabetes (T2D) has become a growing global issue in recent decades. According
to the 2021 Atlas of the International Diabetes Federation, an estimated 537 million adults
worldwide have diabetes, and this number is projected to rise to 783 million by 2045 [1].
Not surprisingly, a similar epidemic has been noted in Taiwan. According to the data bank of the
National Health Insurance Company, the total number of diabetic patients increased from
1.32 million to 2.2 million within 10 years (2005 to 2014). This represents an astonishing 66%
increase [2]. Diabetes is now the fifth leading cause of death. In 2020, spending on T2D
exceeded 10 billion USD, which is approximately 4.66% of the annual budget of the National
Health Insurance Company. The accompanying complications, such as micro- and
macrovascular diseases, impose heavy burdens on individuals and their families, as well
as on health providers and society [3,4]. This trend is particularly
prominent among people aged <40 and ≥80 years [5].
Among all the complications, diabetic nephropathy is the leading cause of chronic
kidney disease and end-stage renal disease (ESRD) [6], which are associated with high
morbidity and mortality rates. According to the annual report of the US Renal Data System,
Taiwan has the highest incidence (523 per million population) and prevalence of treated
ESRD requiring renal replacement therapy [7]. In 2019, there were 84,615 dialysis patients
and the National Health Insurance spent 1.54 billion, which is approximately 8.7–9.3% of
the annual budget [8,9]. Therefore, its early detection and prevention are urgently required.
It is well known that urine albumin–creatinine ratio (uACR) is a strong predictor
of the subsequent decline of the glomerular filtration rate in T2D, with an average of
0.93 mL per minute per month in approximately 35% of the subjects [10]. The underlying
pathophysiology is due to the increased glomerular pressure, which is independent of
hyperfiltration or hyperglycemia [11–13].
Traditionally, most studies have used multiple linear regression (MLR) to explore
the relationships between risk factors and outcomes (complications) in medical research.
Nevertheless, artificial intelligence using machine learning (ML), which enables machines
to learn from past data or experiences without being explicitly programmed, has now
become a new modality for data analysis that is competitive with MLR [14–16]. Because
ML can capture nonlinear relationships in data and complex interactions among multiple
predictors, it has the potential to outperform conventional MLR in disease prediction [17].
To our knowledge, only one study has attempted to predict the uACR in a T2D cohort.
Thus, in the present study, we applied four different ML methods and pursued the following
two aims in a diabetic cohort that was followed up for four years:
1. Compare the prediction accuracy between ML and traditional MLR.
2. Rank the importance of risk factors, such as demographic and biochemistry data.
2. Methods
2.1. Participant and Study Design
Data for this study were obtained from the diabetic outpatient clinic of the Cardinal
Tien Hospital in Taiwan from 2013 to 2019. This study is a prospective study, as we have
collected our patients from 2013 to 2016. We designated this cohort as the Cardinal Tien
Diabetes Study Cohort. Informed consent was obtained from all participants, and data
were collected anonymously. The study protocol was approved by the Institutional Review
Board of the hospital. In total, 1682 T2D patients were enrolled. After exclusions for
various reasons, 1147 subjects remained for analysis (women: 608, men: 539) and were
followed up for 4 years. The inclusion criteria were as follows: (1) type 2 diabetes;
(2) age between 50 and 75 years; (3) body mass index in the range of 22–30 kg/m²;
(4) glycated hemoglobin level between 6.5 and 10.5%; and (5) no regular dialysis.
A flowchart of participant selection is displayed in Figure 1.
On the day of the study, senior nursing staff recorded the subject’s medical history,
including information on any current medications, and a physical examination was performed. The waist circumference was measured horizontally at the level of the natural waist.
The body mass index (BMI) was calculated as the participant’s body weight (kg) divided by
the square of the participant’s height (m). The systolic blood pressure (SBP) and diastolic
blood pressure (DBP) were measured using standard mercury sphygmomanometers on the
right arm of each subject while seated.
Figure 1. Flowchart of sample selection from the Cardinal Tien Hospital Diabetes Study Cohort.
The procedures for collecting demographic and biochemical data have been published
previously [18] and are as follows. After fasting for 10 h, blood samples were collected for biochemical
analyses. Plasma was separated from the blood within 1 h of collection and stored at 30 °C
until the analysis of fasting plasma glucose (FPG) and lipid profiles. FPG was measured
using the glucose oxidase method (YSI 203 glucose analyzer; Yellow Springs Instruments,
Yellow Springs, OH, USA). The total cholesterol and triglyceride (TG) levels were measured
using the dry multilayer analytical slide method with a Fuji Dri-Chem 3000 analyzer
(Fuji Photo Film, Tokyo, Japan). The serum high-density lipoprotein cholesterol (HDL-C)
and low-density lipoprotein cholesterol (LDL-C) concentrations were analyzed using an
enzymatic cholesterol assay, following dextran sulfate precipitation. A Beckman Coulter
AU 5800 biochemical analyzer was used to determine the urine ACR by turbidimetry.
Table 1 lists the definitions of the 15 baseline clinical variables (the independent variables:
sex, age, BMI, duration of diabetes, smoking, alcohol use, FPG, glycated hemoglobin,
triglyceride, HDL-C, LDL-C, alanine aminotransferase, creatinine (Cr), SBP, and DBP) used
in this study. The uACR at the end of the follow-up, a numerical variable, was used as the
dependent (target) variable, and the 15 baseline variables were used as predictor variables.
Table 1. Variable definition.
Variables | Description | Unit
Sex | Male/Female | -
Age | Patient age | year
Body mass index | Body mass index | kg/m²
Duration of diabetes | Duration of diabetes | year
Smoking | No/Yes | -
Alcohol | No/Yes | -
Baseline fasting plasma glucose | Fasting plasma glucose baseline | mg/dL
Baseline glycated hemoglobin | HbA1c (Glycated hemoglobin) baseline | %
Baseline triglyceride | Triglyceride baseline | mg/dL
Baseline high-density lipoprotein cholesterol | High-density lipoprotein cholesterol baseline | mg/dL
Baseline low-density lipoprotein cholesterol | Low-density lipoprotein cholesterol baseline | mg/dL
Baseline alanine aminotransferase | Alanine aminotransferase baseline | U/L
Baseline creatinine | Creatinine baseline | mg/dL
Baseline systolic blood pressure | Systolic blood pressure baseline | mmHg
Baseline diastolic blood pressure | Diastolic blood pressure baseline | mmHg
uACR at the end of follow-up | Urine albumin to creatinine ratio = albumin (mg/dL)/urine creatinine (mg/dL), follow up 4 year | mg/g
uACR: urine albumin–creatinine ratio.
2.2. Proposed Scheme
This research proposed a scheme based on four machine learning methods, namely
classification and regression tree (CART), random forest (RF), stochastic gradient boosting
(SGB), and eXtreme gradient boosting (XGBoost), to construct predictive models for predicting diabetic uACR and to identify the importance of these risk factors. These ML methods
have been applied in various healthcare applications and do not have prior assumptions
regarding data distribution [19–28]. MLR was used as the benchmark for comparison.
The first method, CART, is a tree-structured method [29]. It is composed of a root node,
branches, and leaf nodes. The tree grows recursively from the root node, and each node
is split based on the Gini index to produce branches and leaf nodes, each associated
with a decision rule. The overgrown tree is then pruned to its optimal size using the
cost-complexity criterion, and the resulting decision rules make up the complete
tree [30,31].
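As a hedged illustration of the recursive splitting step described above, the following Python sketch searches for the single best split of one numeric predictor. Because the target in this study (uACR) is numeric, the sketch scores each candidate threshold by the sum of squared errors that remains after the split, which is the usual criterion for regression trees. That criterion, the function names, and the toy data are assumptions made for illustration; they are not taken from the authors' R code.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y):
    """Return (threshold, sse_after_split) for the best binary split of x.

    Candidate thresholds are midpoints between consecutive distinct x values.
    Returns (None, sse(y)) when no split improves on the unsplit node.
    """
    pairs = sorted(zip(x, y))
    best_threshold, best_score = None, sse(list(y))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # identical x values cannot be separated
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [yy for xx, yy in pairs if xx <= threshold]
        right = [yy for xx, yy in pairs if xx > threshold]
        score = sse(left) + sse(right)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Hypothetical toy data: the outcome jumps when the predictor exceeds 1.0.
x_toy = [0.6, 0.7, 0.8, 1.2, 1.3, 1.4]
y_toy = [20.0, 22.0, 21.0, 90.0, 95.0, 92.0]
threshold, score = best_split(x_toy, y_toy)
```

A full tree would apply `best_split` recursively to the two child nodes and over all predictors, and then prune the result.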
RF, the second method in this study, is an ensemble learning decision tree algorithm
that combines bootstrap resampling and bagging [32]. RF randomly generates many
different, unpruned CART decision trees, in which the decrease in Gini impurity is
regarded as the splitting criterion, and combines all generated trees into a forest.
The outputs of all the trees in the forest are then averaged or put to a vote to
produce a robust final model [33].
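The bootstrap-and-average idea behind RF can be sketched as follows. This is a simplified illustration under stated assumptions: it bags one-split regression stumps on a single predictor and omits the random selection of candidate variables at each split that a full random forest performs. All names and the toy data are hypothetical.

```python
import random


def fit_stump(x, y):
    """Fit a one-split regression stump; return (threshold, left_mean, right_mean)."""
    overall = sum(y) / len(y)
    best, best_err = (None, overall, overall), sum((v - overall) ** 2 for v in y)
    for t in sorted(set(x))[:-1]:  # rule: x <= t goes left, x > t goes right
        left = [yy for xx, yy in zip(x, y) if xx <= t]
        right = [yy for xx, yy in zip(x, y) if xx > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((v - lm) ** 2 for v in left) + sum((v - rm) ** 2 for v in right)
        if err < best_err:
            best, best_err = (t, lm, rm), err
    return best


def predict_stump(stump, value):
    threshold, left_mean, right_mean = stump
    if threshold is None or value <= threshold:
        return left_mean
    return right_mean


def bagged_stumps(x, y, n_trees=25, seed=0):
    """Fit one stump per bootstrap sample (sampling rows with replacement)."""
    rng = random.Random(seed)
    n = len(x)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([x[i] for i in idx], [y[i] for i in idx]))
    return forest


def predict_forest(forest, value):
    """Average the predictions of all stumps (the regression form of voting)."""
    return sum(predict_stump(s, value) for s in forest) / len(forest)


# Hypothetical toy data.
x_toy = [0.6, 0.7, 0.8, 1.2, 1.3, 1.4]
y_toy = [20.0, 22.0, 21.0, 90.0, 95.0, 92.0]
forest = bagged_stumps(x_toy, y_toy)
low_pred = predict_forest(forest, 0.6)
high_pred = predict_forest(forest, 1.4)
```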
The third method, SGB, is a tree-based gradient boosting learning algorithm that
combines both bagging and boosting techniques to minimize the loss function to solve
the overfitting problem of traditional decision trees [34,35]. In SGB, many stochastic weak
learners of trees are sequentially generated through multiple iterations, in which each
tree concentrates on correcting or explaining errors of the tree generated in the previous
iteration, that is, the residual of the previous iteration tree is used as the input for the
newly generated tree. This iterative process is repeated until a convergence condition
is met or the maximum number of iterations is reached. Finally, the cumulative
results of the many trees are used to determine the final robust model.
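The residual-fitting loop described above can be sketched in a few lines of Python. This is a minimal illustration assuming a squared-error loss, one-split stumps as the weak learners, and a random subsample drawn without replacement in each iteration (the "stochastic" part). The learning rate, subsample fraction, function names, and toy data are hypothetical and do not reproduce the settings of the "gbm" package used in the study.

```python
import random


def fit_stump(x, y):
    """Fit a one-split regression stump; return (threshold, left_mean, right_mean)."""
    overall = sum(y) / len(y)
    best, best_err = (None, overall, overall), sum((v - overall) ** 2 for v in y)
    for t in sorted(set(x))[:-1]:
        left = [yy for xx, yy in zip(x, y) if xx <= t]
        right = [yy for xx, yy in zip(x, y) if xx > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((v - lm) ** 2 for v in left) + sum((v - rm) ** 2 for v in right)
        if err < best_err:
            best, best_err = (t, lm, rm), err
    return best


def predict_stump(stump, value):
    threshold, left_mean, right_mean = stump
    if threshold is None or value <= threshold:
        return left_mean
    return right_mean


def boost(x, y, n_rounds=50, learning_rate=0.1, subsample=0.8, seed=0):
    """Sequentially fit stumps to the residuals of the current ensemble."""
    rng = random.Random(seed)
    n = len(x)
    base = sum(y) / n            # initial prediction: the mean of the target
    fitted = [base] * n
    stumps = []
    k = max(2, int(subsample * n))
    for _ in range(n_rounds):
        residuals = [yi - fi for yi, fi in zip(y, fitted)]
        idx = rng.sample(range(n), k)  # random subsample, without replacement
        stump = fit_stump([x[i] for i in idx], [residuals[i] for i in idx])
        stumps.append(stump)
        fitted = [fi + learning_rate * predict_stump(stump, xi)
                  for fi, xi in zip(fitted, x)]
    return {"base": base, "learning_rate": learning_rate, "stumps": stumps}


def predict(model, value):
    total = sum(predict_stump(s, value) for s in model["stumps"])
    return model["base"] + model["learning_rate"] * total


def mse(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical toy data.
x_toy = [0.6, 0.7, 0.8, 1.2, 1.3, 1.4]
y_toy = [20.0, 22.0, 21.0, 90.0, 95.0, 92.0]
model = boost(x_toy, y_toy)
baseline_mse = mse(y_toy, [model["base"]] * len(y_toy))
boosted_mse = mse(y_toy, [predict(model, v) for v in x_toy])
```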
XGBoost, the fourth method of this study, is a gradient boosting technique that is an
optimized extension of SGB [36]. Its principle is to train many weak models sequentially
and combine their outputs by gradient boosting, which achieves a better
prediction performance. In XGBoost, a second-order Taylor expansion is used to approximate
the objective function for arbitrary differentiable loss functions, which accelerates
the convergence of model construction [37]. XGBoost then applies a regularized boosting
technique that penalizes the complexity of the model and reduces overfitting, thus increasing
model accuracy [36].
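For reference, the second-order approximation and the regularization term can be written as below. The notation follows the original XGBoost paper by Chen and Guestrin rather than anything defined in this study: $g_i$ and $h_i$ are the first and second derivatives of the loss $l$ with respect to the previous prediction $\hat{y}_i^{(t-1)}$, $f_t$ is the tree added at iteration $t$, $T$ is its number of leaves, $w$ is its vector of leaf weights, and $\gamma$ and $\lambda$ are regularization hyperparameters.

```latex
\mathcal{L}^{(t)} \simeq \sum_{i=1}^{n} \left[ l\!\left(y_i, \hat{y}_i^{(t-1)}\right) + g_i f_t(x_i) + \tfrac{1}{2} h_i f_t^{2}(x_i) \right] + \Omega(f_t),
\qquad
\Omega(f_t) = \gamma T + \tfrac{1}{2} \lambda \lVert w \rVert^{2}
```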
A flowchart of the proposed prediction and important variable identification scheme
that combines the four ML methods is shown in Figure 2. First, patient data were collected
using the proposed method to prepare the dataset. The dataset was then randomly divided
into an 80% training dataset for model building and a 20% testing dataset for model
testing. In the training process, each ML method has hyperparameters that must
be tuned to construct a well-performing model. In this study, 10-fold cross-validation (CV) was used for hyperparameter tuning. The training dataset was
further randomly divided into a training subset, used to build models with different sets of
hyperparameters, and a validation subset, used for model validation. All possible combinations
of the hyperparameters were investigated using a grid search. The model with the lowest
root mean square error for the validation dataset was viewed as the best model for each
ML method. The best tuned RF, SGB, CART, and XGBoost models were generated, and
the corresponding variable importance ranking information was obtained.
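The tuning procedure described above (an 80/20 split, followed by a 10-fold CV grid search that keeps the hyperparameter value with the lowest validation RMSE) can be sketched as follows. This is an illustrative sketch only: the study used the "caret" R package, whereas this example is plain Python, and a one-dimensional k-nearest-neighbour regressor stands in for the tree-based models so that the example stays short. The data, the grid, and all function names are hypothetical.

```python
import math
import random


def knn_predict(train_x, train_y, query, k):
    """Predict by averaging the targets of the k nearest training points."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - query))[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def k_fold_indices(n, n_folds, rng):
    """Shuffle 0..n-1 and deal the indices into n_folds roughly equal folds."""
    order = list(range(n))
    rng.shuffle(order)
    return [order[f::n_folds] for f in range(n_folds)]


def cross_validated_rmse(x, y, k, n_folds, rng):
    """Mean validation RMSE over the folds for one hyperparameter value."""
    scores = []
    for held_out in k_fold_indices(len(x), n_folds, rng):
        held = set(held_out)
        tr_x = [x[i] for i in range(len(x)) if i not in held]
        tr_y = [y[i] for i in range(len(x)) if i not in held]
        preds = [knn_predict(tr_x, tr_y, x[i], k) for i in held_out]
        scores.append(rmse([y[i] for i in held_out], preds))
    return sum(scores) / len(scores)


def grid_search(x, y, grid, n_folds=10, seed=0):
    """Return (best_k, scores), where scores maps each k to its mean CV RMSE."""
    scores = {}
    for k in grid:
        # Re-seed so that every candidate is scored on the same fold assignment.
        scores[k] = cross_validated_rmse(x, y, k, n_folds, random.Random(seed))
    return min(scores, key=scores.get), scores


# Hypothetical data: 100 points from a noisy nonlinear curve.
rng = random.Random(42)
xs = [rng.uniform(0.0, 10.0) for _ in range(100)]
ys = [math.sin(v) * 10.0 + rng.gauss(0.0, 1.0) for v in xs]

# 80% training / 20% testing split.
order = list(range(100))
rng.shuffle(order)
train_idx, test_idx = order[:80], order[80:]
train_x, train_y = [xs[i] for i in train_idx], [ys[i] for i in train_idx]
test_x, test_y = [xs[i] for i in test_idx], [ys[i] for i in test_idx]

best_k, cv_scores = grid_search(train_x, train_y, grid=[1, 3, 5, 10, 40])
test_preds = [knn_predict(train_x, train_y, q, best_k) for q in test_x]
test_rmse = rmse(test_y, test_preds)
```

The testing set is touched only once, after the hyperparameter has been chosen on the training portion, which mirrors the separation between the training and testing processes in the scheme.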
During the testing process, the testing dataset was used to evaluate the predictive
performance of the best RF, SGB, CART, and XGBoost models. As the target variable of the
models built in this study is a numerical variable, the metrics used for model performance
comparison are the mean absolute percentage error (MAPE), symmetric MAPE (SMAPE),
and relative absolute error (RAE), which are shown in Table 2.
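Table 2. Equation of Performance Metrics.
Metrics | Description | Calculation
MAPE | Mean Absolute Percentage Error | $\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left\lvert\frac{y_i-\hat{y}_i}{y_i}\right\rvert \times 100$
SMAPE | Symmetric Mean Absolute Percentage Error | $\mathrm{SMAPE} = \frac{1}{n}\sum_{i=1}^{n}\frac{\lvert y_i-\hat{y}_i\rvert}{(\lvert y_i\rvert+\lvert\hat{y}_i\rvert)/2} \times 100$
RAE | Relative Absolute Error | $\mathrm{RAE} = \sqrt{\frac{\sum_{i=1}^{n}(y_i-\hat{y}_i)^2}{\sum_{i=1}^{n}(y_i)^2}}$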
where ŷi and yi represent the predicted and actual values, respectively, and n is the number of instances.
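The three metrics can be computed directly from their definitions in Table 2. The following Python sketch is illustrative; the function names and the example values are hypothetical, and `rae` implements the formula exactly as it is given in Table 2 (a square root of a ratio of sums of squares).

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Requires nonzero actual values."""
    n = len(actual)
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n * 100


def smape(actual, predicted):
    """Symmetric mean absolute percentage error, in percent."""
    n = len(actual)
    return sum(abs(a - p) / ((abs(a) + abs(p)) / 2)
               for a, p in zip(actual, predicted)) / n * 100


def rae(actual, predicted):
    """RAE as defined in Table 2: sqrt(sum (y - yhat)^2 / sum y^2)."""
    numerator = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    denominator = sum(a ** 2 for a in actual)
    return math.sqrt(numerator / denominator)


# Hypothetical actual and predicted values.
actual = [100.0, 200.0, 50.0]
predicted = [110.0, 180.0, 50.0]
mape_value = mape(actual, predicted)
smape_value = smape(actual, predicted)
rae_value = rae(actual, predicted)
```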
To provide a more robust comparison, the training and testing processes described
above were repeated 10 times with random splits. The averaged metrics of the RF, SGB, CART, and
XGBoost models were compared with those of the benchmark MLR
model, which used the same training and testing datasets as the ML methods. An ML model
with an average metric lower than that of MLR was considered a convincing model.
Because all of the ML methods used can produce an importance ranking of the
predictor variables, we assigned rank 1 to the most critical risk factor in each model
and rank 15 to the least important one. Different ML
methods may produce different variable importance rankings because they have different
modeling characteristics; therefore, we integrated the variable importance rankings of the
convincing ML models to enhance the stability and integrity of the re-ranked importance
of the risk factors. In the final stage of the proposed scheme, we summarize and discuss our
significant findings regarding the convincing ML models and identify important variables.
Figure 2. Proposed ML prediction scheme.
In this study, all methods were performed using R software version 4.0.5 and RStudio
version 1.1.453 with the required packages installed (http://www.R-project.org, accessed
on 1 February 2022; https://www.rstudio.com/products/rstudio/, accessed on 1 February
2022). RF, SGB, CART, and XGBoost were implemented with the “randomForest”
R package version 4.6-14 [38], the “gbm” R package version 2.1.8 [39], the “rpart” R package
version 4.1-15 [40], and the “xgboost” R package version 1.5.0.2 [41], respectively. In addition,
the “caret” R package version 6.0-90 was used to estimate the best hyperparameter set
for the CART, RF, SGB, and XGBoost methods [42]. The MLR was
implemented using the “stats” R package version 4.0.5, and the default settings were used to
construct the models.
3. Results
A total of 1147 participants were enrolled in the study (men: 539, women: 608). The
demographic data are shown in Table 3 (mean ± standard deviation). The results of the
comparison between the traditional MLR and the four ML methods (i.e., RF, SGB, CART,
and XGBoost) in predicting diabetic uACR in a 4-year follow-up cohort are shown in Table 4.
All four ML methods yielded lower prediction errors
than the MLR method and were therefore all convincing ML models. To determine whether the
four ML methods significantly outperformed the MLR method, the Wilcoxon signed-rank
test was used. The Wilcoxon signed-rank test is one of the most popular distribution-free, non-parametric statistical tests for comparing the performance of two prediction
models [43]. Table 5 shows the test results for the four ML methods against the MLR method.
The prediction error values of all ML methods were
significantly different from those of the MLR method. Therefore, the ML methods used
in this study significantly outperformed traditional MLR in
predicting uACR at the end of the follow-up in terms of prediction error.
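To make the test concrete, the following Python sketch computes the Wilcoxon signed-rank statistic for paired prediction errors from two models, with a two-sided p-value from the normal approximation. It is an assumption-laden illustration: the text does not state which paired quantities or which software routine were used, so the statistics in Table 5 may have been computed differently. The sketch drops zero differences, assigns average ranks to ties, and applies no tie or continuity correction; the error values are hypothetical.

```python
import math


def wilcoxon_signed_rank(a, b):
    """Return (w_plus, w_minus, z, two_sided_p) for paired samples a and b."""
    diffs = [x - y for x, y in zip(a, b) if x != y]  # drop zero differences
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(w_plus, w_minus) - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return w_plus, w_minus, z, p


# Hypothetical absolute prediction errors of two models on the same 10 cases.
mlr_err = [12.0, 15.0, 9.0, 20.0, 14.0, 11.0, 18.0, 16.0, 13.0, 17.0]
ml_err = [10.0, 11.0, 8.5, 14.0, 15.0, 9.5, 12.0, 13.0, 12.0, 11.0]
w_plus, w_minus, z, p_value = wilcoxon_signed_rank(mlr_err, ml_err)
```

For small samples an exact p-value is generally preferable to the normal approximation used here.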
Table 3. Participant demographics.
Variables | Mean ± SD | N
Age | 63.82 ± 11.49 | 1123
BMI | 26.45 ± 3.95 | 1134
Duration of diabetes | 14.13 ± 7.65 | 1137
Baseline fasting plasma glucose | 149.84 ± 42.80 | 1146
Baseline glycated hemoglobin | 7.74 ± 1.49 | 1140
Baseline triglyceride | 142.99 ± 94.55 | 1144
Baseline high-density lipoprotein cholesterol | 44.87 ± 12.00 | 845
Baseline low-density lipoprotein cholesterol | 98.82 ± 27.73 | 1129
Baseline alanine aminotransferase | 29.38 ± 21.48 | 1134
Baseline creatinine | 0.90 ± 0.37 | 1093
Baseline systolic blood pressure | 131.13 ± 14.07 | 969
Baseline diastolic blood pressure | 75.91 ± 11.66 | 969
uACR at the end of follow-up | 195.30 ± 711.98 | 1147
Variables | N (%) | N
Sex: Male | 608 (53.01%) | 1147
Sex: Female | 539 (46.99%) |
Smoking: No | 430 (60.06%) | 716
Smoking: Yes | 286 (39.94%) |
Alcohol: No | 715 (90.62%) | 789
Alcohol: Yes | 74 (9.38%) |
BMI: body mass index. uACR: urine albumin–creatinine ratio.
Table 4. The average performance of the MLR, RF, SGB, CART, and XGBoost methods.
Method | MAPE | SMAPE | RAE
MLR | 18.245 (4.79) | 1.545 (0.04) | 1.126 (0.17)
RF | 16.174 (4.82) | 1.266 (0.05) | 1.072 (0.19)
SGB | 14.850 (3.09) | 1.522 (0.07) | 1.040 (0.16)
CART | 9.528 (1.76) | 1.312 (0.06) | 0.841 (0.10)
XGBoost | 11.872 (2.80) | 1.274 (0.06) | 0.915 (0.11)
MLR: multiple linear regression; RF: random forest; SGB: stochastic gradient boosting; CART: classification and
regression tree; XGBoost: eXtreme gradient boosting; MAPE: mean absolute percentage error; SMAPE: symmetric
mean absolute percentage error; RAE: relative absolute error.
Table 5. Wilcoxon signed-rank test between four ML methods and MLR method.
 | RF | SGB | CART | XGBoost
MLR | 41.736 (0.001) ** | 20.814 (0.001) ** | 30.680 (0.001) ** | 44.489 (0.001) **
The numbers in parentheses are the corresponding p-value; **: p < 0.05.
Table 6 presents the average importance ranking of each factor generated by the RF,
SGB, CART, and XGBoost methods. The table shows that the different ML
methods generated different relative importance rankings for each factor. The darkness of
the blue shading indicates the importance of a risk factor: the darker the blue, the more
important the risk factor. For instance, in the RF method, the three most important factors
were baseline Cr, age, and baseline SBP. The most important factor in the SGB method was
baseline Cr, followed by baseline HDL-C and baseline DBP. To integrate
the importance rankings across all four ML methods, the average importance
ranking of each risk factor was obtained by averaging its ranking values
over the four methods.
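The averaging step described above amounts to a per-variable mean of the four rank values followed by a sort. The sketch below illustrates this with four rows of rank values taken from Table 6; the function names are hypothetical and the code is not the authors' implementation.

```python
def average_ranks(rank_table):
    """Map each variable to the mean of its rank values across methods."""
    return {name: sum(ranks.values()) / len(ranks)
            for name, ranks in rank_table.items()}


def order_by_importance(averages):
    """Return variable names sorted from most important (lowest average rank)."""
    return sorted(averages, key=averages.get)


# Rank values copied from four rows of Table 6.
ranks = {
    "Sex": {"RF": 11.3, "SGB": 14.9, "CART": 15.0, "XGBoost": 13.7},
    "Baseline high-density lipoprotein cholesterol":
        {"RF": 7.7, "SGB": 2.8, "CART": 5.8, "XGBoost": 6.8},
    "Baseline creatinine": {"RF": 1.3, "SGB": 1.1, "CART": 1.8, "XGBoost": 1.1},
    "Baseline systolic blood pressure":
        {"RF": 5.0, "SGB": 4.9, "CART": 4.3, "XGBoost": 3.9},
}
averages = average_ranks(ranks)
ordering = order_by_importance(averages)
```

Rounded to one decimal place, the averages for these rows agree with the Average column of Table 6.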
Table 6. Importance ranking of each risk factor using the four convincing methods.
Variables | RF | SGB | CART | XGBoost | Average
Sex | 11.3 | 14.9 | 15.0 | 13.7 | 13.7
Age | 4.8 | 9.0 | 9.5 | 5.4 | 7.2
Body mass index | 14.9 | 11.8 | 12.0 | 9.8 | 12.1
Duration of diabetes | 8.8 | 7.0 | 10.7 | 8.4 | 8.7
Smoking | 10.8 | 14.4 | 15.0 | 14.7 | 13.7
Alcohol | 11.6 | 13.6 | 15.0 | 14.6 | 13.7
Baseline fasting plasma glucose | 5.4 | 6.3 | 10.9 | 5.3 | 7.0
Baseline glycated hemoglobin | 5.8 | 5.0 | 10.3 | 6.1 | 6.8
Baseline triglyceride | 11.9 | 10.2 | 12.7 | 13.1 | 12.0
Baseline high-density lipoprotein cholesterol | 7.7 | 2.8 | 5.8 | 6.8 | 5.8
Baseline low-density lipoprotein cholesterol | 5.8 | 10.9 | 11.2 | 7.5 | 8.9
Baseline alanine aminotransferase | 9.6 | 8.3 | 12.4 | 12.6 | 10.7
Baseline creatinine | 1.3 | 1.1 | 1.8 | 1.1 | 1.3
Baseline systolic blood pressure | 5.0 | 4.9 | 4.3 | 3.9 | 4.5
Baseline diastolic blood pressure | 5.3 | 4.1 | 4.1 | 4.7 | 4.6
Rank value (shading legend): 1.0~1.4; 1.5~2.4; 2.5~3.4; 3.5~4.4; 4.5~5.4; 5.5~
Note: Different blue colors indicate different rank values of risk factors. The darker the blue color, the more
important the risk factor.
Figure 3 depicts the risk factors based on the increasing order of the averaged ranking
values. It can be noted from the figure that the first six important risk factors in predicting
diabetic uACR in a 4-year follow-up cohort are baseline Cr, baseline SBP, baseline DBP,
baseline HDL-C, baseline glycated hemoglobin, and baseline FPG.
[Figure 3 image: horizontal bar chart of the average rank (x-axis: Average Rank, 0.0 to 16.0) of each variable (y-axis: Variables), ordered from baseline creatinine (1.3) to sex (13.7); the values match the Average column of Table 6.]
Figure 3. Integrated importance ranking of all risk factors. Note: The darker color indicates the first
six important risk factors of this study.
4. Discussion
As mentioned in the Introduction, the present study has two goals. The first was to
compare the accuracy between ML methods and MLR, and the second was to identify
the rank of different risk factors for predicting uACR. Our study showed that all four ML
methods outperformed the MLR. We also found that baseline Cr, blood pressure, HDL-C,
glycated hemoglobin, and FPG were the most important factors.
Traditionally, MLR has been widely used in medical research to analyze
continuous variables. However, MLR has difficulty describing nonlinear data patterns,
and its effective use requires that its strong assumptions be satisfied during modeling.
Unlike MLR, ML does not require strong model assumptions and can capture the delicate
underlying nonlinear relationships contained in empirical data [19]. Our present data
showed that all four ML methods are superior to MLR because the MAPE, SMAPE, and RAE of the
ML methods were all lower (Table 4). Our results suggest that ML might have
great potential for medical studies and applications.
Because diabetic nephropathy causes a serious burden on individuals and consumes
a large portion of the government health budget, extensive studies have focused on this
topic [6,44–47]. From these previous studies, it could be concluded that sex, high blood
glucose and blood pressure, smoking, dyslipidemia, decreased glomerular filtration rate,
BMI, and uACR are common risk factors for future uACR. However, in the present study,
our data showed that baseline Cr, SBP, DBP, HDL-C, glycated hemoglobin, and FPG were
the most important risk factors. Additionally, the roles of diabetes duration,
BMI, triglyceride, sex, smoking, and alcohol use were less important.
Our data suggest that the most important predictor of albuminuria is baseline Cr.
This is not surprising because albuminuria occurs early in the course of diabetic nephropathy [48]. According to the majority of previous studies, a summary of this relationship
could be depicted as follows: diabetic patients with albuminuria are at a higher risk of
end-stage renal and cardiovascular diseases [49,50]. This suggests that albuminuria leads
to end-stage renal disease, which differs from the findings of the present study. Our
results show that an increase in serum Cr level could predict albuminuria four years later,
which is the reverse of the cause–effect direction reported in most other studies. However,
our finding is supported by the cornerstone study conducted by Gansevoort et al. [51].
This meta-analysis clearly showed that there are independent, continuous, and negative
associations between serum Cr and albuminuria. Thus, it could be postulated that each of
these factors could affect the other at the same time. Further research is required to explore
this area.
Systolic and diastolic blood pressures were identified as the second and third most
important factors for predicting albuminuria, respectively. Their relationships are well known and have
been extensively studied [52]. Similar to the role of increased serum Cr levels, kidney
disease causes an increase in BP, which could further deteriorate renal function. More
specifically, the change in BP is in concordance with and even precedes albuminuria [53]. By
controlling BP, the speed of end-stage renal disease progression can be slowed down [54].
Interestingly, HDL cholesterol level was the only lipid found to be correlated with
albuminuria. However, few studies have focused on this topic. Most previous studies
have demonstrated that different stages of diabetic kidney disease (DKD) have different
influences on blood lipid levels [55,56]. Other studies measured apolipoproteins and
the size of LDL-cholesterol, which all showed positive correlations with DKD, including
albuminuria [57]. To our knowledge, only two studies are relatively close to the present
findings. The first study was performed by Sacks et al. In a group of 2535 T2D patients, they
evaluated the impact of HDL-C levels on uACR. Furthermore, kidney disease was defined
as albuminuria, proteinuria, or decreased eGFR. The data showed that the odds ratio of
having kidney disease decreased by 0.86 (0.82–0.91) for every 0.2 mmol/L (approximately
1 quintile) increase in HDL-C [58]. The second study was conducted on a cohort of 524
Chinese patients. Using multiple logistic regression, after adjusting for the available
confounding factors, they suggested that subjects with the highest quartile HDL-C had a
lower odds ratio (OR = 0.17, 95% confidence interval 0.15–0.52) of having uACR than the
lowest quartile. However, a limitation of this study was that it was cross-sectional. Thus,
it was unable to infer the causation or directionality of this relationship [59]. This study
addresses this limitation through its longitudinal design. The causative influence of the HDL-C level
can be explained by several hypotheses. First, the glomeruli and renal tubules could be
injured by impaired HDL-C function, which hinders the reverse cholesterol transport
process [60]. Second, the antioxidative ability of HDL-C is reduced and oxidative stress
is increased, which further influences the immune-mediated diabetic nephropathy [61].
Finally, it is well known that low HDL-C levels are associated with insulin resistance,
hyperinsulinemia, and hyperglycemia. All these untoward derangements can damage
endothelial cells in the glomerulus [62,63].
The last two factors affecting albuminuria are glycated hemoglobin and FPG levels.
This finding is compatible with the results of the Diabetes Control and Complications Trial
(DCCT) [64]. The data showed positive relationships between glucose control and albuminuria. Moreover, after blood glucose levels were controlled, albuminuria also improved [65].
Because DCCT enrolled patients with type 1 diabetes, its pathophysiology is different
from that of the present study. Regarding T2D, few studies have been conducted in this
area. A comprehensive meta-analysis conducted by Lo et al. [66] showed that for intensive
control (glycated hemoglobin < 7% and FPG < 6.6 mmol/L), the relative risk of having
uACR was 0.59 (confidence interval: 0.38–0.93). As this study enrolled 11 studies (29,141
subjects) and follow-ups were conducted for an average of 56.7 months, their conclusion
is convincing. The underlying pathophysiology to support this result is that high blood
glucose concentration could involve mesangial cell damage in nephrons [67]. However,
it is worth noting that both A1c and FPG were classified as important predictors.
Because FPG is a single blood glucose measurement, whereas A1c reflects glucose levels
over approximately 90 days, FPG might be expected to be less accurate than A1c.
Nevertheless, our results show that they are ‘independent’ of each other.
Interestingly, in the present study, the duration of diabetes, body mass index, sex,
smoking, and alcohol use were less important. This finding could be attributed to the
nature of ML. ML methods are data-driven, non-parametric models. They can map
any nonlinear function without an a priori assumption about the properties of the data
and have the ability to capture subtle functional relationships among the empirical data,
even though the underlying relationships are unknown or difficult to describe [68–70].
These factors may contain richer linear pattern information and less important nonlinear
information than baseline creatinine, blood pressure, albuminuria level, and age. Thus,
they were ranked as less important risk factors using ML methods.
This study had some limitations. First, smoking and alcohol use need to be defined in more
detail because other reports have shown that they have an important impact on the
occurrence of diabetic nephropathy. Second, we did not collect information on the use of
angiotensin-converting enzyme inhibitors, angiotensin receptor blockers, sodium-glucose
cotransporter 2 inhibitors, and glucagon-like peptide-1 receptor agonists, all of which
would have beneficial effects on DKD. Third, some of the data, such as uACR and blood
pressure, were collected only once. Repeated measurements were available for some of the
participants, but because that subgroup was smaller than the present cohort, we chose
to enroll subjects with only one value. Despite these drawbacks, our large
sample size and the characteristics of ML (alleviating the effects of extreme values) could
at least partially compensate for them.
5. Conclusions
ML might be more accurate than traditional MLR in predicting uACR in T2D. Baseline
creatinine level was the most important predictor of uACR in this T2D
cohort, followed by systolic and diastolic blood pressure, glycated hemoglobin,
and fasting plasma glucose.
Author Contributions: Developed the theory and wrote the draft, L.-Y.H.; conceived and planned
the experiment, F.-Y.C.; performed the machine learning analysis, M.-J.J.; helped prepare the figures
and tables, C.-H.K.; supervised the project, C.-Z.W.; discussed the results and contributed to the final
manuscript, C.-H.L., Y.-L.C., and Y.-F.C.;
collected the medical records, D.P.; designed the data analysis scheme and wrote the draft, C.-J.L. All
authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.
Institutional Review Board Statement: The study was approved by the Research Ethics Review
Committee at the Cardinal Tien Hospital (IRB No. CTH-100-2-5-036).
Informed Consent Statement: This manuscript contains no person’s details, images, or videos.
Data Availability Statement: Data available on request due to privacy/ethical restrictions.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. International Diabetes Federation. IDF Diabetes Atlas, 10th ed.; International Diabetes Federation: Brussels, Belgium, 2021; Available online: http://www.diabetesatlas.org/ (accessed on 22 March 2022).
2. Sheen, Y.-J.; Hsu, C.-C.; Jiang, Y.-D.; Huang, C.-N.; Liu, J.-S.; Sheu, W.H.-H. Trends in prevalence and incidence of diabetes mellitus from 2005 to 2014 in Taiwan. J. Formos. Med. Assoc. 2019, 118, S66–S73. [CrossRef] [PubMed]
3. Tseng, C.H.; Chong, C.K.; Heng, L.T.; Tseng, C.P.; Tai, T.Y. The incidence of type 2 diabetes mellitus in Taiwan. Diabetes Res. Clin. Pract. 2000, 50, S61–S64. [CrossRef]
4. Chang, C.-J.; Lu, F.-H.; Yang, Y.-C.; Wu, J.-S.; Wu, T.-J.; Chen, M.-S.; Chuang, L.-M.; Tai, T.Y. Epidemiologic study of type 2 diabetes in Taiwan. Diabetes Res. Clin. Pract. 2000, 50, S49–S59. [CrossRef]
5. Chang, C.H.; Shau, W.Y.; Jiang, Y.D.; Li, H.Y.; Chang, T.J.; Sheu, W.H.; Kwok, C.F.; Ho, L.T.; Chuang, L.M. Type 2 diabetes prevalence and incidence among adults in Taiwan during 1999–2004: A national health insurance data set study. Diabet. Med. 2010, 27, 636–643. [CrossRef]
6. Alicic, R.Z.; Rooney, M.T.; Tuttle, K.R. Diabetic Kidney Disease: Challenges, Progress, and Possibilities. Clin. J. Am. Soc. Nephrol. 2017, 12, 2032–2045. [CrossRef]
7. United States Renal Data System. 2020 USRDS Annual Data Report: Epidemiology of Kidney Disease in the United States; National Institutes of Health; National Institute of Diabetes and Digestive and Kidney Diseases: Bethesda, MD, USA, 2020.
8. Chiang, J.K.; Chen, J.S.; Kao, Y.H. Comparison of medical outcomes and health care costs at the end of life between dialysis patients with and without cancer: A national population-based study. BMC Nephrol. 2019, 20, 265. [CrossRef]
9. Taiwan Society of Nephrology. National Health Research Institutes, Taiwan Annual Report on Kidney Disease in Taiwan. 2020. Available online: https://www.tsn.org.tw/UI/L/L002.aspx (accessed on 22 March 2022).
10. Nelson, R.G.; Bennett, P.H.; Beck, G.J.; Tan, M.; Knowler, W.C.; Mitch, W.E.; Hirschman, G.H.; Myers, B.D. Development and progression of renal disease in Pima Indians with non-insulin-dependent diabetes mellitus. Diabetic Renal Disease Study Group. N. Engl. J. Med. 1996, 335, 1636–1642. [CrossRef]
11. Anderson, S.; Meyer, T.W.; Rennke, H.G.; Brenner, B.M. Control of glomerular hypertension limits glomerular injury in rats with reduced renal mass. J. Clin. Investig. 1985, 76, 612–619. [CrossRef]
12. Anderson, S.; Rennke, H.G.; Brenner, B.M. Therapeutic advantage of converting enzyme inhibitors in arresting progressive renal disease associated with systemic hypertension in the rat. J. Clin. Investig. 1986, 77, 1993–2000. [CrossRef]
13. Zatz, R.; Dunn, B.R.; Meyer, T.W.; Anderson, S.; Rennke, H.G.; Brenner, B.M. Prevention of diabetic glomerulopathy by pharmacological amelioration of glomerular capillary hypertension. J. Clin. Investig. 1986, 77, 1925–1930. [CrossRef]
14. Marateb, H.R.; Mansourian, M.; Faghihimani, E.; Amini, M.; Farina, D. A hybrid intelligent system for diagnosing microalbuminuria in type 2 diabetes patients without having to measure urinary albumin. Comput. Biol. Med. 2014, 45, 34–42. [CrossRef] [PubMed]
15. Ye, Y.; Xiong, Y.; Zhou, Q.; Wu, J.; Li, X.; Xiao, X. Comparison of Machine Learning Methods and Conventional Logistic Regressions for Predicting Gestational Diabetes Using Routine Clinical Data: A Retrospective Cohort Study. J. Diabetes Res. 2020, 2020, 4168340. [CrossRef] [PubMed]
16. Nusinovici, S.; Tham, Y.C.; Yan, M.Y.C.; Ting, D.S.W.; Li, J.; Sabanayagam, C.; Wong, T.Y.; Cheng, C.Y. Logistic regression was as good as machine learning for predicting major chronic diseases. J. Clin. Epidemiol. 2020, 122, 56–69. [CrossRef] [PubMed]
17. Miller, D.D.; Brown, E.W. Artificial Intelligence in Medical Practice: The Question to the Answer? Am. J. Med. 2018, 131, 129–133. [CrossRef]
18. Lu, C.-H.; Pei, D.; Wu, C.-Z.; Kua, H.-C.; Liang, Y.-J.; Chen, Y.-L.; Lin, J.-D. Predictors of abnormality in thallium myocardial perfusion scans for type 2 diabetes. Heart Vessel. 2021, 36, 180–188. [CrossRef]
19. Tseng, C.-J.; Lu, C.-J.; Chang, C.-C.; Chen, G.-D.; Cheewakriangkrai, C. Integration of data mining classification techniques and ensemble learning to identify risk factors and diagnose ovarian cancer recurrence. Artif. Intell. Med. 2017, 78, 47–54. [CrossRef]
20. Ting, W.-C.; Chang, H.-R.; Chang, C.-C.; Lu, C.-J. Developing a Novel Machine Learning-Based Classification Scheme for Predicting SPCs in Colorectal Cancer Survivors. Appl. Sci. 2020, 10, 1355. [CrossRef]
21. Shih, C.-C.; Lu, C.-J.; Chen, G.-D.; Chang, C.-C. Risk Prediction for Early Chronic Kidney Disease: Results from an Adult Health Examination Program of 19,270 Individuals. Int. J. Environ. Res. Public Health 2020, 17, 4973. [CrossRef]
22. Lee, T.-S.; Chen, I.-F.; Chang, T.-J.; Lu, C.-J. Forecasting Weekly Influenza Outpatient Visits Using a Two-Dimensional Hierarchical Decision Tree Scheme. Int. J. Environ. Res. Public Health 2020, 17, 4743. [CrossRef]
23. Chang, C.-C.; Yeh, J.-H.; Chen, Y.-M.; Jhou, M.-J.; Lu, C.-J. Clinical Predictors of Prolonged Hospital Stay in Patients with Myasthenia Gravis: A Study Using Machine Learning Algorithms. J. Clin. Med. 2021, 10, 4393. [CrossRef]
24. Chang, C.-C.; Huang, T.-H.; Shueng, P.-W.; Chen, S.-H.; Chen, C.-C.; Lu, C.-J.; Tseng, Y.-J. Developing a Stacked Ensemble-Based Classification Scheme to Predict Second Primary Cancers in Head and Neck Cancer Survivors. Int. J. Environ. Res. Public Health 2021, 18, 12499. [CrossRef] [PubMed]
25. Chiu, Y.-L.; Jhou, M.-J.; Lee, T.-S.; Lu, C.-J.; Chen, M.-S. Health Data-Driven Machine Learning Algorithms Applied to Risk Indicators Assessment for Chronic Kidney Disease. Risk Manag. Healthc. Policy 2021, 14, 4401–4412. [CrossRef] [PubMed]
26. Wu, T.-E.; Chen, H.-A.; Jhou, M.-J.; Chen, Y.-N.; Chang, T.-J.; Lu, C.-J. Evaluating the Effect of Topical Atropine Use for Myopia Control on Intraocular Pressure by Using Machine Learning. J. Clin. Med. 2021, 10, 111. [CrossRef]
27. Wu, C.-W.; Shen, H.-L.; Lu, C.-J.; Chen, S.-H.; Chen, H.-Y. Comparison of Different Machine Learning Classifiers for Glaucoma Diagnosis Based on Spectralis OCT. Diagnostics 2021, 11, 1718. [CrossRef]
28. Chang, C.-C.; Yeh, J.-H.; Chiu, H.-C.; Chen, Y.-M.; Jhou, M.-J.; Liu, T.-C.; Lu, C.-J. Utilization of Decision Tree Algorithms for Supporting the Prediction of Intensive Care Unit Admission of Myasthenia Gravis: A Machine Learning-Based Approach. J. Pers. Med. 2022, 12, 32. [CrossRef] [PubMed]
29. Breiman, L.; Friedman, J.H.; Olshen, R.A.; Stone, C.J. Classification and Regression Trees. Biometrics 1984, 40, 874. [CrossRef]
30. Patel, N.; Upadhyay, S. Study of various decision tree pruning methods with their empirical comparison in WEKA. Int. J. Comput. Appl. 2012, 60, 20–25. [CrossRef]
31. Tierney, N.J.; Harden, F.A.; Harden, M.J.; Mengersen, K.L. Using decision trees to understand structure in missing data. BMJ Open 2015, 5, e007450. [CrossRef]
32. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32. [CrossRef]
33. Calle, M.; Urrea, V. Letter to the editor: Stability of random forest importance measures. Brief. Bioinform. 2011, 12, 86–89. [CrossRef]
34. Friedman, J. Greedy function approximation: A gradient boosting machine. Ann. Stat. 2001, 29, 1189–1232. [CrossRef]
35. Friedman, J. Stochastic gradient boosting. Comput. Stat. Data Anal. 2002, 38, 367–378. [CrossRef]
36. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
37. Torlay, L.; Perrone-Bertolotti, M.; Thomas, E.; Baciu, M. Machine learning–XGBoost analysis of language networks to classify patients with epilepsy. Brain Inform. 2017, 4, 159–169. [CrossRef]
38. Breiman, L.; Cutler, A.; Liaw, A.; Wiener, M. randomForest: Breiman and Cutler’s Random Forests for Classification and Regression. R Package Version, 4.6-14. 2022. Available online: https://CRAN.R-project.org/package=randomForest (accessed on 1 January 2022).
39. Greenwell, B.; Boehmke, B.; Cunningham, J. Gbm: Generalized Boosted Regression Models. R Package Version, 2.1.8. 2020. Available online: https://CRAN.R-project.org/package=gbm (accessed on 1 January 2022).
40. Therneau, T.; Atkinson, B. Rpart: Recursive Partitioning and Regression Trees. R Package Version, 4.1.15. 2022. Available online: https://CRAN.R-project.org/package=rpart (accessed on 1 January 2022).
41. Chen, T.; He, T.; Benesty, M.; Khotilovich, V.; Tang, Y.; Cho, H.; Chen, K.; Mitchell, R.; Cano, I.; Zhou, T.; et al. Xgboost: Extreme Gradient Boosting. R Package Version, 1.5.0.2. 2022. Available online: https://CRAN.R-project.org/package=xgboost (accessed on 1 January 2022).
42. Kuhn, M. Caret: Classification and Regression Training. R Package Version, 6.0-90. 2022. Available online: https://CRAN.R-project.org/package=caret (accessed on 1 January 2022).
43. Diebold, F.X.; Mariano, R.S. Comparing Predictive Accuracy. J. Bus. Econ. Stat. 1995, 20, 134–144. [CrossRef]
44. Gross, J.L.; De Azevedo, M.J.; Silveiro, S.P.; Canani, L.H.; Caramori, M.L.; Zelmanovitz, T. Diabetic nephropathy: Diagnosis, prevention, and treatment. Diabetes Care 2005, 28, 164–176. [CrossRef] [PubMed]
45. Harjutsalo, V.; Groop, P.-H. Epidemiology and risk factors for diabetic kidney disease. Adv. Chronic Kidney Dis. 2014, 21, 260–266. [CrossRef]
46. Duan, J.; Wang, C. Prevalence and risk factors of chronic kidney disease and diabetic kidney disease in Chinese rural residents: A cross-sectional survey. Sci. Rep. 2019, 9, 10408. [CrossRef]
47. Hussain, S.; Jamali, M.C.; Habib, A.; Hussain, M.S.; Akhtar, M.; Najmi, A.K. Diabetic kidney disease: An overview of prevalence, risk factors, and biomarkers. Clin. Epidemiol. Glob. Health 2021, 9, 2–6. [CrossRef]
48. Wu, X.Q.; Zhang, D.D.; Wang, Y.N.; Tan, Y.Q.; Yu, X.Y.; Zhao, Y.Y. AGE/RAGE in diabetic kidney disease and ageing kidney. Free Radic. Biol. Med. 2021, 171, 260–271. [CrossRef]
49. Newman, D.J.; Mattock, M.B.; Dawnay, A.B.; Kerry, S.; McGuire, A.; Yaqoob, M.; Hitman, G.A.; Hawke, C. Systematic review on urine albumin testing for early detection of diabetic complications. Health Technol. Assess. 2005, 9, 1–122. [CrossRef]
50. Hong, J.W.; Ku, C.R.; Noh, J.H.; Ko, K.S.; Rhee, B.D.; Kim, D.-J. Association between low-grade albuminuria and cardiovascular risk in Korean adults: The 2011–2012 Korea National Health and Nutrition Examination Survey. PLoS ONE 2015, 10, e0118866. [CrossRef] [PubMed]
51. Gansevoort, R.T.; Matsushita, K.; Van Der Velde, M.; Astor, B.C.; Woodward, M.; Levey, A.S.; De Jong, P.E.; Coresh, J. Lower estimated GFR and higher albuminuria are associated with adverse kidney outcomes. A collaborative meta-analysis of general and high-risk population cohorts. Kidney Int. 2011, 80, 93–104. [CrossRef] [PubMed]
52. Hsu, C.C.; Brancati, F.L.; Astor, B.C.; Kao, W.H.; Steffes, M.W.; Folsom, A.R.; Coresh, J. Blood pressure, atherosclerosis, and albuminuria in 10,113 participants in the atherosclerosis risk in communities study. J. Hypertens. 2009, 27, 397–409. [CrossRef] [PubMed]
53. Fagerudd, J.A.; Tarnow, L.; Jacobsen, P.; Stenman, S.; Nielsen, F.S.; Pettersson-Fernholm, K.J.; Grönhagen-Riska, C.; Parving, H.H.; Groop, P.H. Predisposition to essential hypertension and development of diabetic nephropathy in NIDDM. Diabetes 1998, 47, 439–444. [CrossRef]
54. Ruggenenti, P.; Fassi, A.; Ilieva, A.P.; Bruno, S.; Iliev, I.P.; Brusegan, V.; Rubis, N.; Gherardi, G.; Arnoldi, F.; Ganeva, M.; et al. Preventing microalbuminuria in type 2 diabetes. N. Engl. J. Med. 2004, 351, 1941–1951. [CrossRef]
55. Shoji, T.; Emoto, M.; Kawagishi, T.; Kimoto, E.; Yamada, A.; Tabata, T.; Ishimura, E.; Inaba, M.; Okuno, Y.; Nishizawa, Y. Atherogenic lipoprotein changes in diabetic nephropathy. Atherosclerosis 2001, 156, 425–433. [CrossRef]
56. Jenkins, A.J.; Lyons, T.J.; Zheng, D.; Otvos, J.D.; Lackland, D.T.; Mcgee, D.; Garvey, W.T.; Klein, R.L.; The DCCT/EDIC Research Group. Lipoproteins in the DCCT/EDIC cohort: Associations with diabetic nephropathy. Kidney Int. 2003, 64, 817–828. [CrossRef]
57. Tolonen, N.; Forsblom, C.; Thorn, L.; Wadén, J.; Rosengård-Bärlund, M.; Saraheimo, M.; Feodoroff, M.; Mäkinen, V.P.; Gordin, D.; Taskinen, M.R.; et al. Lipid abnormalities predict progression of renal disease in patients with type 1 diabetes. Diabetologia 2009, 52, 2522–2530. [CrossRef]
58. Sacks, F.M.; Hermans, M.P.; Fioretto, P.; Valensi, P.; Davis, T.; Horton, E.; Wanner, C.; Al-Rubeaan, K.; Aronson, R.; Barzon, I.; et al. Association between plasma triglycerides and high-density lipoprotein cholesterol and microvascular kidney disease and retinopathy in type 2 diabetes mellitus: A global case-control study in 13 countries. Circulation 2014, 129, 999–1008. [CrossRef]
59. Sun, X.; Xiao, Y.; Li, P.M.; Ma, X.Y.; Sun, X.J.; Lv, W.S.; Wu, Y.L.; Liu, P.; Wang, Y.G. Association of serum high-density lipoprotein cholesterol with microalbuminuria in type 2 diabetes patients. Lipids Health Dis. 2018, 17, 229. [CrossRef]
60. Vaziri, N.D. Lipotoxicity and impaired high density lipoprotein-mediated reverse cholesterol transport in chronic kidney disease. J. Ren. Nutr. 2010, 20, S35–S43. [CrossRef] [PubMed]
61. Li, C.; Gu, Q. Protective effect of paraoxonase 1 of high-density lipoprotein in type 2 diabetic patients with nephropathy. Nephrology 2009, 14, 514–520. [CrossRef] [PubMed]
62. Drew, B.G.; Duffy, S.J.; Formosa, M.F.; Natoli, A.K.; Henstridge, D.C.; Penfold, S.A.; Thomas, W.G.; Mukhamedova, N.; de Courten, B.; Forbes, J.M.; et al. High-density lipoprotein modulates glucose metabolism in patients with type 2 diabetes mellitus. Circulation 2009, 119, 2103–2111. [CrossRef] [PubMed]
63. Brunham, L.R.; Kruit, J.K.; Hayden, M.R.; Verchere, C.B. Cholesterol in β-cell dysfunction: The emerging connection between HDL cholesterol and Type 2 diabetes. Curr. Diabetes Rep. 2010, 10, 55–60. [CrossRef]
64. Bilous, R. Microvascular disease: What does the UKPDS tell us about diabetic nephropathy? Diabet. Med. 2003, 20, 25–29. [CrossRef]
65. The Diabetes Control and Complications (DCCT) Research Group. Effect of intensive therapy on the development and progression of diabetic nephropathy in the Diabetes Control and Complications Trial. Kidney Int. 1995, 47, 1703–1720. [CrossRef]
66. Lo, C.; Zoungas, S. Intensive glucose control in patients with diabetes prevents onset and progression of microalbuminuria, but effects on end-stage kidney disease are still uncertain. Evid. Based Med. 2017, 22, 219–220. [CrossRef]
67. Genuth, S.; Eastman, R.; Kahn, R.; Klein, R.; Lachin, J.; Lebovitz, H.; Nathan, D.; Vinicor, F.; American Diabetes Association. Implications of the United Kingdom prospective diabetes study. Diabetes Care 2003, 26, S28–S32. [CrossRef]
68. Chen, I.-F.; Lu, C.-J. Sales forecasting by combining clustering and machine-learning techniques for computer retailing. Neural Comput. Appl. 2017, 28, 2633–2647. [CrossRef]
69. Jiang, F.; Jiang, Y.; Zhi, H.; Dong, Y.; Li, H.; Ma, S.; Wang, Y.; Dong, Q.; Shen, H.; Wang, Y. Artificial intelligence in healthcare: Past, present and future. Stroke Vasc. Neurol. 2017, 2, 230. [CrossRef]
70. Koteluk, O.; Wartecki, A.; Mazurek, S.; Kołodziejczak, I.; Mackiewicz, A. How Do Machines Learn? Artificial Intelligence as a New Era in Medicine. J. Pers. Med. 2021, 11, 32. [CrossRef] [PubMed]