
Machine learning and chronic kidney disease risk prediction

2021, African Journal of Nephrology


African Journal of Nephrology | Volume 24, No 1, 2021, 58-67
Official publication of the African Association of Nephrology

REVIEW ARTICLE

Machine learning and chronic kidney disease risk prediction

Marina Wainstein1, Sally Shrapnel2, Chevon Clark3, Wendy Hoy4,5, Helen Healy4,5, Ivor Katz6

1University of Queensland, Faculty of Medicine, Brisbane, Australia. 2School of Mathematics and Physics, University of Queensland, Brisbane, Australia. 3National Renal Care, South Africa. 4CKD.QLD, Centre for Chronic Disease, University of Queensland, Brisbane, Australia. 5Kidney Health Service, Royal Brisbane and Women’s Hospital, Brisbane, Australia. 6St George Hospital, Renal Department, Kogarah, Faculty of Medicine, University of New South Wales, Sydney, Australia.

ABSTRACT

With a prevalence of approximately 10–15% in Africa and a close relationship with other non-communicable diseases, chronic kidney disease (CKD) can result in a significant comorbidity burden and impact on quality of life. The complex spectrum of precipitants and drivers of progression presents a challenge for early diagnosis and effective interventions. Predicting this progression can provide clinicians with guidance on the need for, and frequency of, monitoring in specialist clinics, the degree to which interventions such as kidney biopsies and aggressive risk factor modification may be of use, and the timely planning of the various elements of dialysis initiation and transplantation. For patients, such predictions have the potential to contextualise the recommended therapies and monitoring regimes prescribed, allowing them to engage better with decision making and planning if, and when, kidney replacement therapies are needed.
This paper explores the use of machine learning to facilitate such predictions and improve our understanding of CKD, as well as to provide a platform for future studies to examine their clinical utility and value to both clinicians and patients.

Keywords: machine learning; chronic kidney disease; predictions.

INTRODUCTION

Although infectious diseases are still the leading cause of death in Africa, non-communicable chronic diseases contribute a significant burden of morbidity and mortality [1]. The overall prevalence of chronic kidney disease (CKD) in Africa is approximately 10–15%, with roughly 4.6% being advanced stages 3–5 [2]. In South Africa in 2017, there were 10,744 people in a kidney replacement therapy (KRT) programme, representing a decline from 70 per million population (pmp) in 1994 to 66 pmp in 2017 [3]. Nephrologists and most kidney services in Africa spend much of their time and resources triaging and managing patients with CKD and planning the delivery of care for its terminal state, end-stage kidney failure (ESKF). Patients may live their entire lives unaware of having this silent disease, or they may be increasingly debilitated by its relentless progression and ultimate fate, connected to a dialysis chair or dying without it. The ability to predict such divergent outcomes is as difficult and time-consuming for most clinicians as is understanding the complexity of this disease and conceptualising the range of risk factors that contribute to its progression. Predicting disease trajectory and collateral events, especially when, as in CKD, progression is driven by a spectrum of modifiable risk factors, is enormously helpful and beneficial.
The capacity to distinguish patients who are likely to progress to kidney failure from those in whom the disease will linger without much consequence could allow us to selectively deliver specialised nephrology care without overwhelming our resources, steer the urgency and aggressiveness with which we target risk factor modification, and plan for dialysis and transplantation in a timely manner. On a population level, these predictions have the capacity to improve the implementation and detection rate of screening programmes and shape health policy to address at-risk populations. In research, prediction models can be used to guide entry into clinical trials and inform sample size calculations, resulting in studies with better and more pragmatic design. Ultimately, these predictions empower patients with the knowledge to plan and engage in decision making with regard to their future.

Although various risk prediction equations and scores exist today, their capacity to be adapted to wide and noisy data sets, such as electronic health records (EHR), or to model complex, non-linear relationships to predict CKD progression, is limited. Machine learning prediction models are rapidly proving their worth in many areas of medicine, from diagnosing skin lesions to directing drug trials for cancer patients [4]. These computer models, which can learn from data with minimal external programming and be deployed to find patterns in vast and complex data sets, present a unique opportunity to predict CKD trajectory and offer insights into its contributing factors.

Received: 30 September 2020; accepted 13 September 2021; published 28 October 2021. Correspondence: Marina Wainstein, marinawainstein@outlook.com. © The Author(s) 2021. Published under a Creative Commons Attribution 4.0 International License.
The aim of this review is, first, to look at existing algorithms that assess decline in CKD before focusing on machine learning models and their potential to predict CKD progression.

CKD CLASSIFICATION AND THE CENTRAL ROLE OF ESTIMATED GLOMERULAR FILTRATION RATE (EGFR) AND ALBUMINURIA

The recognition of CKD as a major health priority has generated enormous efforts to improve its detection and timely management. Inherent to this task has been the development of a language to describe and characterise it according to disease severity and to provide a framework for epidemiological CKD research. In 2002, the Kidney Disease Outcomes Quality Initiative (KDOQI) of the National Kidney Foundation (NKF) published a set of guidelines aimed at defining, staging and risk stratifying CKD. The threshold eGFR of less than 60 mL/min/1.73 m2 was chosen as it represented loss of half or more of the normal measured GFR in a young adult (120–130 mL/min/1.73 m2) [5], and the point at which the complications of kidney disease became apparent and significant. Five stages of CKD severity were developed, and proteinuria was identified as an important marker of kidney damage, but it was not formally incorporated into this earlier classification.

In 2009, Kidney Disease: Improving Global Outcomes (KDIGO) initiated a collaborative meta-analysis to further investigate the role of low eGFR and albuminuria on mortality and kidney outcomes to inform practice guidelines [6]. The meta-analysis confirmed an increased risk of all outcomes (all-cause mortality, cardiovascular mortality and kidney outcomes, which included kidney failure, acute kidney injury and progressive CKD) at an eGFR less than 60 mL/min/1.73 m2, as well as with a urine albumin to creatinine ratio (ACR) greater than 3 mg/mmol, independent of eGFR [6,7]. Additionally, the working group found a steep rise in risk of all outcomes below an eGFR threshold of 45 mL/min/1.73 m2, prompting a split of the stage 3 category.
It was noted that risks of kidney-specific outcomes, particularly progression to ESKF, increased exponentially in patients with lower eGFR and higher albuminuria levels [6]. The revisions were implemented in the latest KDIGO CKD guidelines to generate the familiar CKD heat map that is commonly used today, cementing eGFR and albuminuria as the guiding markers of overall prognosis [8] (Figure 1, from the KDIGO guidelines). They now feature heavily in most risk prediction scores for CKD progression.

Despite the value of formalising eGFR as a reliable estimate of kidney function, its routine reporting in a patient’s biochemical profile generated a form of universal screening which did not achieve the desired outcome. It triggered a sharp rise in referrals to specialist kidney services, leading to longer waiting times, misdiagnosis and over-investigation of low-risk patients [9,10], especially those with extremes of muscle mass and creatinine levels. This inaccuracy generated by using eGFR as the sole marker of kidney function added to a growing understanding of CKD as a complex and heterogeneous disease. As a result, clinicians and scientists have started to look beyond eGFR to consider other potential predictors of disease progression and to explore their integration into clinically useful prediction models.

PREDICTING CKD PROGRESSION

Clinicians have been slow to adopt risk scores in day-to-day practice, contrasting with their increasing appearance in the clinical literature over the past 20 years. Three systematic reviews have evaluated 112 risk prediction models and scores from 1980 to 2018 [11-13]. Risk prediction models have been developed in different incident populations, from general CKD to disease-specific, such as the IgA nephropathy population [14], to stage-specific, as in patients with eGFR less than 30 mL/min/1.73 m2 [15].
The risk predictions have been made for occurrence of kidney disease, for progression (defined either as a change in eGFR or as reaching ESKF), as well as for the risk of cardiovascular events, hospitalisations, acute kidney injury, and all-cause mortality (Figure 2) [13].

Figure 1. Prognosis of CKD by GFR and albuminuria category: KDIGO 2012 [6]. GFR categories (mL/min/1.73 m2): G1, normal or high, ≥ 90; G2, mildly decreased, 60–89; G3a, mildly to moderately decreased, 45–59; G3b, moderately to severely decreased, 30–44; G4, severely decreased, 15–29; G5, kidney failure, < 15. Persistent albuminuria categories: A1, normal to mildly increased, < 30 mg/g (< 3 mg/mmol); A2, moderately increased, 30–300 mg/g (3–30 mg/mmol); A3, severely increased, > 300 mg/g (> 30 mg/mmol). Green: low risk (if no other markers of kidney disease, no CKD); yellow: moderately increased risk; orange: high risk; red: very high risk. Reproduced with permission.

Figure 2. Studies and variables used in developing risk prediction scores for chronic kidney disease progression [13]. [Matrix of studies (Cheng, 2017; Schroeder, 2017; Hsu, 2016; Tangri, 2016 AJKD; Xie, 2016; Marks, 2015; Maziarz, 2015; Levin, 2014; Maziarz, 2014; Drawz, 2013; Smith, 2013; Tangri, 2011 4v and 8v models; Landray, 2010; Johnson, 2008; Johnson, 2007; Dimitrov, 2003) against patient characteristics (age, sex, ethnicity, blood pressure), comorbidities (diabetes, hypertension, other) and laboratory variables (eGFR, serum creatinine, proteinuria/albuminuria, serum albumin, phosphate, calcium, haemoglobin, histology on biopsy, other).]

Despite this heterogeneity in model composition, many important insights can be gleaned.
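The G and A category cut-points in the Figure 1 caption are simple thresholds, so the mapping can be sketched in a few lines of Python. This is an illustrative sketch only; the function name and returned tuple are not part of any published tooling.

```python
def kdigo_categories(egfr, acr_mg_mmol):
    """Map eGFR (mL/min/1.73 m2) and urine ACR (mg/mmol) to the
    KDIGO 2012 G and A categories described in Figure 1."""
    if egfr >= 90:
        g = "G1"        # normal or high
    elif egfr >= 60:
        g = "G2"        # mildly decreased
    elif egfr >= 45:
        g = "G3a"       # mildly to moderately decreased
    elif egfr >= 30:
        g = "G3b"       # moderately to severely decreased
    elif egfr >= 15:
        g = "G4"        # severely decreased
    else:
        g = "G5"        # kidney failure
    if acr_mg_mmol < 3:
        a = "A1"        # normal to mildly increased
    elif acr_mg_mmol <= 30:
        a = "A2"        # moderately increased
    else:
        a = "A3"        # severely increased
    return g, a

print(kdigo_categories(50, 10))   # ('G3a', 'A2'): high risk on the heat map
```

A routine laboratory feed could apply such a mapping to every reported eGFR/ACR pair, which is precisely the kind of automated flagging that drove the referral surge discussed above.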
These included: that serum creatinine can be used in place of eGFR as a marker of kidney function; that several biochemical markers (serum albumin, haemoglobin and C-reactive protein) should be examined as potential variables; that patient-specific comorbidities like diabetes and hypertension are important variables to consider, either as the primary kidney disease or as additional comorbidities; and that progression of CKD can be defined in multiple different ways.

THE KIDNEY FAILURE RISK EQUATION SCORE

In 2011, Tangri and colleagues published their 4- and 8-variable risk prediction models for ESKF, jointly known as the Kidney Failure Risk Equation (KFRE); these are generally accepted as the best prediction models for progression of kidney disease to date [16,17]. The models were built in a development cohort and validated in a second cohort of people with eGFR less than 60 mL/min/1.73 m2 (i.e., stage 3 and below). Multiple candidate variables at baseline were considered in the development of the models, including age, gender, weight, systolic and diastolic blood pressure, comorbidities (cardiovascular disease and diabetes), as well as laboratory variables such as eGFR, urine ACR and serum albumin, phosphate, bicarbonate and calcium. Kidney failure, defined as dialysis initiation or transplantation, was the outcome (dependent) variable. Cohort participants were followed for an extended period of seven years, enabling the prospective development of 1-, 3- and 5-year time horizons for risk predictions. The best-performing model in both cohorts included age, gender, eGFR, urine ACR, serum albumin, phosphate, and bicarbonate, with a C-statistic of 0.917 (95% confidence interval 0.901–0.933) in the development cohort and 0.841 (95% CI 0.825–0.857) in the validation cohort. The models identified that lower eGFR, higher urine ACR, younger age and male gender predicted faster progression to ESKF, in line with the RENAAL and Kaiser Permanente models [18,19].
The 8-variable KFRE risk prediction model for ESKF added lower serum albumin, calcium and bicarbonate, and higher phosphate, as further predictors of progression. The lack of improvement in model performance from the variables of diabetic status, weight and hypertension was thought to relate to their high prevalence in the CKD population and therefore their limited use as markers of disease severity.

Building on this, the strength of the KFRE today lies in its extensive external validation and proposed clinical utility [20]. It is the only prediction model with proposed actionable risk thresholds to guide decision making in clinical practice, including triaging of new referrals to specialist nephrology care and timing of pre-dialysis education and vascular access creation [21,22]. The KFRE 5-year 3% threshold was used in a population from Manitoba in 2013 to triage referrals in response to the overwhelming number of new referrals generated by the automatic reporting of eGFR, resulting in a reduction in the waiting time to see a nephrologist from a median of 280 days to 58 days [12]. This combination of good predictive performance, extensive external validation, and proven clinical utility is what makes the KFRE the current benchmark risk prediction equation in CKD.

However, several limitations and shortcomings have been specifically attributed to CKD risk prediction scores in systematic reviews. These include the inappropriate use of a heterogeneous CKD population for model development (in particular with regard to CKD aetiology), lack of specification as to when the models should be used, poor definition of prediction time frames, lack of uniformity in outcome definitions between models, and a paucity of external validation across different populations, as well as of longer-term impact studies, preferably in the form of a randomised controlled trial (RCT), to confirm clinical utility [11,23,24].
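The C-statistics quoted above measure discrimination: the probability that a randomly chosen patient who reached the outcome was assigned a higher predicted risk than one who did not (for a binary outcome this equals the ROC AUC). A minimal pairwise sketch of the calculation, using made-up risks and outcomes rather than KFRE output, might look like this:

```python
def c_statistic(scores, outcomes):
    """Concordance (C) statistic for a binary outcome: the fraction of
    event/non-event pairs in which the event case received the higher
    predicted risk; tied scores count as half-concordant."""
    events = [s for s, y in zip(scores, outcomes) if y == 1]
    non_events = [s for s, y in zip(scores, outcomes) if y == 0]
    concordant = ties = 0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1
            elif e == n:
                ties += 1
    return (concordant + 0.5 * ties) / (len(events) * len(non_events))

risks   = [0.9, 0.8, 0.3, 0.2, 0.1]   # hypothetical predicted risks
reached = [1,   0,   1,   0,   0]     # 1 = progressed to ESKF
print(c_statistic(risks, reached))    # 5 of 6 pairs concordant
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which is why the KFRE's validation C-statistic of 0.841 is regarded as strong.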
The transition to an era of Big Data and precision medicine poses new challenges and demands for the development and application of new risk prediction tools, which include the computational capacity to process and integrate a large number of real-time, longitudinal predictor variables, the statistical flexibility to accommodate non-linear relationships between variables and outcomes, and the ability to be incorporated into existing patient data platforms, such as EHRs [25].

MACHINE LEARNING

Under the broad umbrella of artificial intelligence – a science dedicated to creating intelligent computers able to perceive vision, language and sound – emerges the field of machine learning (ML), which explores the ability of machines and computer algorithms to learn from data to make accurate and generalisable predictions. The field has evolved significantly since Arthur Samuel, a pioneer in the area of computer gaming, first coined the term in 1959 while demonstrating that a machine could learn to play a game of checkers better and quicker than a human being [27].

At its most basic level, ML is the study of algorithms that allow computers to learn without being explicitly programmed to do so. A computer, programmed with a basic algorithmic architecture, can take a quantity of information or data (usually very large and noisy) and find generalisable predictive patterns.

Figure 3. Basic concept of machine learning [28]. Traditional programming: RULES (if eGFR < 30 & ACR > 100...) applied to EXAMPLES (patient 1: eGFR 25, urine ACR 150...) produce an ANSWER (ESKF or no ESKF). Machine learning: EXAMPLES (patient 1: eGFR 25, urine ACR 150...) together with ANSWERS (ESKF or no ESKF) produce a PREDICTIVE MODEL. Abbreviations: eGFR, estimated glomerular filtration rate; ACR, albumin creatinine ratio; ESKF, end-stage kidney failure.

Importantly, this basic algorithmic structure that primes the mathematical functions a model can
learn is not constrained by any assumptions between variables and outcome in the way that traditional statistical models have been. Until recently, the use of computers to help us make predictions relied on humans inputting a set of rules, usually based on study findings, such as "if eGFR lower than 60 mL/min/1.73 m2 and urine ACR greater than 30 mg/mmol, then risk of progression to ESKF equals X", and then showing the computer new real-life patient data and asking it to produce a risk score based on the programmed rules. With ML, the computer is instead shown the patient data together with the respective outcomes – that is, if and/or when each patient reached ESKF – and asked to find patterns that will form the architecture of a prediction model (Figure 3). This, coupled with the computational power to process enormous amounts of data, allows these models to consider and collate vast sources of health information to make accurate and individualised forecasts.

Broadly, ML can be divided into supervised and unsupervised learning. Supervised learning involves "training" an algorithm by exposing it to labelled data, so that it can find patterns between the features of the data (independent variables) and the label – that is, the dependent variable – attached to each data point, and correctly assign a label when it encounters those features again. Supervised ML algorithms can be divided into classification and regression types according to the nature of the outcome they try to predict – that is, whether the data belong to a certain category or take a numerical value. However, many of the most widely used ML algorithms, such as neural networks and random forest models, can be used for both classification and regression tasks, while others, such as linear regression and the Naïve Bayes classifier, are best suited to predicting a continuous or a discrete outcome, respectively.
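The contrast between programming rules and learning them can be made concrete with the simplest possible supervised learner: instead of hard-coding an eGFR cut-off, we let the algorithm choose the threshold that best separates labelled examples. The data, and the assumption that low eGFR maps to the ESKF label, are purely illustrative.

```python
def learn_stump(values, labels):
    """Learn a single decision threshold from labelled examples:
    try every candidate cut-point and keep the one with the best
    training accuracy (a one-split 'decision stump')."""
    best_t, best_acc = None, -1.0
    for t in sorted(set(values)):
        # illustrative assumption: low eGFR predicts ESKF (label 1)
        preds = [1 if v <= t else 0 for v in values]
        acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc

# hypothetical labelled examples: baseline eGFR and whether ESKF was reached
egfr = [10, 18, 25, 33, 48, 62, 75, 90]
eskf = [1,  1,  1,  0,  0,  0,  0,  0]
threshold, accuracy = learn_stump(egfr, eskf)
print(threshold, accuracy)   # learned rule: eGFR <= 25 -> predict ESKF
```

No clinician supplied the cut-point; it emerged from the examples and their answers, which is the essence of the right-hand path in Figure 3.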
Unsupervised learning is used to find patterns in an unlabelled data set, as is the case with data mining, where an ML algorithm is deployed to find any meaningful patterns or groupings in a data set without any directing rules. An example of unsupervised learning would involve an analysis of patients’ clinical letters to find new predictors of kidney disease progression. Such a study was in fact conducted in 2016 by Singh et al., and identified ascorbic acid level and fast-food consumption as additional predictors of CKD progression [29].

Supervised learning requires training a model with a range of inputs (or features) which are associated with an outcome (or label) [30]. An example of this in nephrology would be training a model to relate a patient’s comorbidity profile – for example, the presence of hypertension, diabetes, cardiovascular disease, etc. – with the existence of kidney disease as defined by low eGFR. Once the model has been trained with enough labelled examples, it can make predictions on new and unseen data. The final trained model can be thought of as a single mathematical function that maps each input (for example, eGFR, urine ACR, age, etc.) to an outcome (for instance, ESKF or no ESKF). In this review, we will be looking at artificial neural networks (ANN) and random forest (RF) models as two examples of supervised learning.

At the start of training, the model parameters are randomised, and training follows a path of iterative improvement, reducing the error between prediction and outcome by using an optimisation technique. In the case of ANNs, this is done by allowing information to be passed through layers composed of modules or "neurons", which change or "transform" this information according to a set of tuneable parameters (known as weights and biases) and a linear or non-linear activation function which determines if and how the information will be fed on to the next layer.
After passing through several hidden layers, information reaches the final output layer and generates a prediction, which is measured against the actual label using an error function. This error in prediction is then fed back through the layers (backpropagation) to allow the weights and biases to be adjusted accordingly. This process continues iteratively, at a certain controlled rate, until the prediction error is minimised [31]. In this way, a successfully trained model will have seen enough combinations of features and labels, and adjusted its internal parameters to match those labels, so that when exposed to a new combination of features during the testing phase it will correctly assign or predict the label. This pipeline of iterative training and testing forms the general process of supervised machine learning model development (Figure 4). In addition, a few settings or "hyperparameters" in the model architecture, such as the number of hidden layers, the number of neurons in the case of ANNs, the learning rate, and the optimisation function, can be adjusted to improve model performance.

This architecture of stacked transformations across potentially limitless hidden layers, which removes any parametric or frequency distribution assumptions, is what allows neural networks to model complex and powerful relationships among many variables. However, it is also what makes them less interpretable and more prone to a "black box" effect (to be discussed later).

An RF model can be used as a classification or regression algorithm and is derived from an ensemble of individual decision trees and their predictions [32,33]. In a decision tree, which is graphically represented by an upside-down tree, observations are passed down from the root through various nodes, which split the observations according to questions and subsequent decisions until a terminal node or leaf is reached (Figure 5).
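The ANN training loop described above (forward pass, error function, backpropagation, weight update) can be sketched end-to-end for a tiny one-hidden-layer network in plain Python. The data are made-up feature pairs scaled to [0, 1] (imagine, say, normalised eGFR and ACR), and every architectural choice here is illustrative, not a recipe for a clinical model.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# hypothetical training examples: two scaled features, label 1 = reached ESKF
data = [([0.1, 0.9], 1), ([0.2, 0.8], 1), ([0.9, 0.1], 0), ([0.8, 0.2], 0)]

random.seed(0)                                  # randomised starting parameters
w1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(2)]  # hidden weights
b1 = [0.0, 0.0]                                                     # hidden biases
w2 = [random.uniform(-1, 1) for _ in range(2)]                      # output weights
b2 = 0.0
lr = 1.0                                                            # learning rate

def forward(x):
    """Forward pass: inputs -> two hidden neurons -> single sigmoid output."""
    h = [sigmoid(w1[j][0] * x[0] + w1[j][1] * x[1] + b1[j]) for j in range(2)]
    return h, sigmoid(w2[0] * h[0] + w2[1] * h[1] + b2)

def total_error():
    """Squared-error function over the training set."""
    return sum((forward(x)[1] - y) ** 2 for x, y in data)

before = total_error()
for _ in range(500):                            # iterative training
    for x, y in data:
        h, out = forward(x)
        d_out = 2 * (out - y) * out * (1 - out)     # error signal at the output
        for j in range(2):                          # backpropagate to hidden layer
            d_h = d_out * w2[j] * h[j] * (1 - h[j])
            w2[j] -= lr * d_out * h[j]              # adjust weights and biases
            w1[j][0] -= lr * d_h * x[0]
            w1[j][1] -= lr * d_h * x[1]
            b1[j] -= lr * d_h
        b2 -= lr * d_out
after = total_error()
print(before, "->", after)                      # prediction error shrinks
```

Here the number of hidden neurons, the learning rate and the number of iterations are exactly the "hyperparameters" mentioned above; changing any of them changes how well and how quickly the error is minimised.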
The questions that determine each split are drawn from the available variables in the data and selected according to their ability to split the data homogeneously, a characteristic referred to as "goodness of split" and determined through a variable’s impurity index. The goal is for each case (or patient example) to travel down the tree and be directed to a classification category (or terminal node) based on its constitutive features. In an RF model, an ensemble of trees is developed from subsets of the data, which make simultaneous, individual predictions on each new case; these are then collated to form a final prediction "vote" [32]. This ensemble structure confers greater stability and generalisability on the prediction than that made by any individual tree. Performance of RFs can be improved through bagging or boosting, which allows individual trees to be developed on a randomly selected subset of the data (bagging) or iteratively using the entire training data set and adjusting the weighting of samples according to classification errors (boosting) [34]. These techniques have been shown to improve prediction accuracy, lessen the probability of model overfitting, and adapt better to smaller data sets by avoiding the need to split them into training and test sets [35]. Following training, the performance of ML models can be assessed using well-known metrics of calibration, discrimination, and reclassification [34]. Additionally, RFs have the capacity to select and rank variables according to their impact on the target class, conferring additional insight into, and understanding of, the role of individual variables in the model’s prediction.

Figure 4. A supervised learning pipeline. [Training: data (examples and answers) pass through the ML model, whose parameters are adjusted via an error function (backpropagation). Testing: test data (examples only) generate predictions, which are compared to the answers to measure performance.]

Figure 5. Random forest plots for prediction of end-stage kidney failure (ESKF). [Training data are split on questions such as eGFR > 30, uACR > 30 and age > 70 across multiple trees (tree #1, #2, #3), whose yes/no paths lead to an ESKF prediction.] Abbreviations: eGFR, estimated glomerular filtration rate (mL/min/1.73 m2); uACR, urine albumin to creatinine ratio (mg/mmol); age in years; ESKF, end-stage kidney failure.

Although we have already outlined several ways in which ML models differ from traditional statistical algorithms, on a conceptual level they can also be understood to have very different purposes. Statistical models are generally used to make inferences about relationships within a data sample and to give a quantifiable measure of confidence that the observed relationship describes a "true" phenomenon that is not the product of noise or chance [36]. On the other hand, ML models aim to predict a future event accurately without requiring an understanding of the mechanism that links the predictive variables to the outcome. It can be said that statistics looks at the "how" and ML at the "what". A natural conclusion stemming from this is that ML models are much less interpretable, or "auditable", than the traditional regression and classification approaches. As an example, using an ML model to predict the likelihood of requiring dialysis in a 5-year period, we may be able to predict this with great certainty and accuracy from a plethora of collected laboratory and demographic variables, but we may not be able to explain which variable or variables exerted the most impact on this prediction. Having said this, ML algorithms differ in their degree of interpretability, which is largely determined by their mathematical architecture, with ANNs being the most common example of a "black box" algorithm and RF models affording greater insight and transparency (Figure 6, from Sidey-Gibbons [30]).

Choosing the right algorithm for a particular task is therefore of great importance. Nevertheless, the interpretability and explainability of various ML models is a rapidly growing field of research.

Figure 6. The complexity/interpretability trade-off in machine learning tools [30]. Auditable algorithms: simpler models, including multiple regression and decision trees; linear relationships between predictors and outcomes facilitate interpretation; many commonalities with statistical techniques; computationally "cheap", can often be run on a consumer PC; better for interpretation. Black boxes: complex models, including neural networks and some support vector machines; non-linear relationships between predictors and outcomes make interpretation extremely difficult; share few commonalities with statistical techniques; computationally "expensive", may require days of processor time to build models; better for complex* data. *"Complex" data could refer to data which do not have a linear relationship with the outcome, such as a pixel in an image, the frequency of a wave in a sound bite, or movement data captured by a smart phone.

Newer advances suggest that the trade-offs depicted in Figure 6 are somewhat misleading: techniques exist to interpret multiple aspects of ANN models, and sophisticated algorithms exist to analyse large, complex ensembles of decision trees [37]. Such recent improvements in ML model explainability (XAI) aim to address key open problems with ML deployment, ensuring that models are fair, transparent, reliable and robust to data shift. While such goals also exist for traditional statistical techniques, the enormous numbers of parameters involved in deep learning and ensemble ML techniques have made XAI a fertile area of research. While it is beyond the scope of this review to provide a complete taxonomy, specific techniques exist to target training bias, treatment bias, ascertainment bias, missing data and data shift over time [38].
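The bagging-and-voting idea behind random forests can be made concrete with a minimal, self-contained sketch: each "tree" is reduced to a one-split stump fitted on a bootstrap sample, and the ensemble predicts by majority vote. The data (eGFR and serum albumin, both treated as "lower predicts ESKF") and all names are hypothetical; a real random forest also randomises the candidate features at each split and grows full trees.

```python
import random

def train_stump(sample):
    """Fit the best single (feature, threshold) split on one bootstrap
    sample, assuming a low feature value predicts ESKF (label 1)."""
    best_f, best_t, best_acc = 0, 0.0, -1.0
    for f in range(len(sample[0][0])):
        for t in sorted({x[f] for x, _ in sample}):
            acc = sum((1 if x[f] <= t else 0) == y for x, y in sample) / len(sample)
            if acc > best_acc:
                best_f, best_t, best_acc = f, t, acc
    return best_f, best_t

def forest_predict(trees, x):
    """Collate the individual stump predictions into a majority 'vote'."""
    votes = sum(1 if x[f] <= t else 0 for f, t in trees)
    return 1 if 2 * votes > len(trees) else 0

# hypothetical rows: ([eGFR, serum albumin], reached ESKF?)
rows = [([12, 28], 1), ([20, 30], 1), ([28, 25], 1),
        ([55, 40], 0), ([70, 44], 0), ([88, 42], 0)]

random.seed(1)
trees = [train_stump([random.choice(rows) for _ in rows])  # bootstrap resampling
         for _ in range(25)]
print(forest_predict(trees, [15, 20]), forest_predict(trees, [85, 45]))
```

Because each stump sees a slightly different resample of the data, individual trees disagree at the margins, but the collated vote is more stable than any single tree, which is the point of the ensemble.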
ML AND RISK PREDICTION IN CHRONIC KIDNEY DISEASE

The prediction of CKD progression with ML models has gained increasing attention recently, both for the reasons already mentioned and because CKD is defined by a laboratory variable (eGFR) and can therefore be easily identified in EHRs. Three studies are especially relevant in the context of this review.

In 2014, Rucci et al. used a classification tree analysis (CTA) to look for relevant variables associated with differential decline in eGFR [39]. The CTA used only 6 of the 17 potential predictor variables and found proteinuria to be the most discriminative variable with which to split the group (patients with proteinuria had a mean annual eGFR decline of –2.35 vs –0.80 mL/min/1.73 m2 in patients without proteinuria). Among the group with proteinuria, those with a baseline eGFR greater than 33 mL/min/1.73 m2 appeared to have faster progression than those with a baseline eGFR less than 33 mL/min/1.73 m2 (–3.77 vs –1.78 mL/min/1.73 m2). The authors concluded that proteinuria was a clear predictor of CKD progression, as was phosphate to a lesser but significant extent. They further commented that eGFR alone was not sufficient to predict risk, that CKD progression slowed with increasing age, and that diabetes was a risk factor for progression even in the absence of proteinuria.

COMMON PROBLEMS AND LIMITATIONS OF MACHINE LEARNING

A common problem arises when ML algorithms become so closely tuned to the particular combinations of features and outcomes seen in training that they cannot generalise, limiting their utility when presented with new data. Two ways of overcoming this problem, known as "overfitting", are penalising model complexity (for example, using dropout, batch normalisation, and specific hyperparameter choices) and external validation [40].
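Overfitting is easiest to see by comparing performance on the data a model was trained on with performance on data it has never seen. The sketch below uses an extreme "memoriser" model and made-up eGFR examples; the split is taken deterministically from the front of the list for brevity, whereas in practice it should be randomised.

```python
def evaluate_holdout(rows, fit, test_frac=0.3):
    """Hold out part of the data, fit on the rest, and report accuracy
    on both splits; a large train/test gap signals overfitting."""
    n_test = max(1, int(len(rows) * test_frac))
    test, train = rows[:n_test], rows[n_test:]
    model = fit(train)
    def acc(part):
        return sum(model(x) == y for x, y in part) / len(part)
    return acc(train), acc(test)

def fit_memoriser(train):
    """An overfitted 'model': a lookup table of the training examples.
    Perfect on the training data, it merely guesses 0 elsewhere."""
    table = {tuple(x): y for x, y in train}
    return lambda x: table.get(tuple(x), 0)

# hypothetical examples: ([eGFR], progressed to ESKF?)
rows = [([12], 1), ([70], 0), ([25], 1), ([88], 0), ([18], 1),
        ([55], 0), ([28], 1), ([95], 0), ([8], 1), ([60], 0)]
train_acc, test_acc = evaluate_holdout(rows, fit_memoriser)
print(train_acc, test_acc)   # perfect on training data, poor on unseen data
```

By contrast, a model that learned the simple regularity in these examples (low eGFR predicts the ESKF label) would score well on both splits; it is the gap between the two accuracies, not the training accuracy itself, that signals overfitting.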
Small data sets represent a challenge to the ML approach, particularly when the data set must be split into training and validation sets, further reducing the data available and impacting on the model’s performance and generalisability.

Despite the vast potential of ML models in the modern era of Big Data to generate individualised, real-time health predictions from routinely collected data, few, if any, have been incorporated into clinical practice, particularly in the field of nephrology. Many of the explanations for this have already been mentioned as proposed barriers to the adoption of risk scores; however, there may be other considerations, such as the current state of our healthcare data platforms, the fragmentation of data capture between public and private domains, and a range of ethical considerations surrounding patient privacy and data ownership, to name a few.

The application of ML in Africa, particularly in health care and nephrology, is inhibited by legacy systems and scarce, fragmented data that are often insufficient to train ML models to good performance. This paucity of data means that models applied in Africa are often developed on non-African populations, raising the potential for unintended algorithmic bias [4]. Furthermore, limited resources, high associated costs and the poor adoption of EHRs continue to contribute to a low level of digitalisation across Africa and limit the ability to integrate AI and ML technology [4]. Although computational capacity can be sourced via cloud platforms, reliable and affordable internet connectivity and electricity can be rate-limiting steps that hinder the data generation and analysis needed for advanced automation of patient care.
Going forward, building on existing systems rather than starting anew may help overcome many of the existing barriers to AI adoption and implementation in low-resource settings [4].

Disclosures
None

REFERENCES
1. Matsha TE, Erasmus RT. Chronic kidney disease in sub-Saharan Africa. Lancet Glob Health. 2019; 7:e1587-e1588.
2. George JA, Brandenburg JT, Fabian J, Crowther NJ, Agongo G, Alberts M, et al. Kidney damage and associated risk factors in rural and urban sub-Saharan Africa (AWI-Gen): a cross-sectional population study. Lancet Glob Health. 2019; 7:e1632-e1643.
3. Davids MR, Jardine T, Marais N, Jacobs JC, Sebastian S. South African Renal Registry Report 2017. Afr J Nephrol. 2019; 22:60-71.
4. Owoyemi A, Owoyemi J, Osiyemi A, Boyd A. Artificial Intelligence for Healthcare in Africa. Front Digit Health. 2020; 2.
5. Davies DF, Shock NW. Age changes in glomerular filtration rate, effective renal plasma flow, and tubular excretory capacity in adult males. J Clin Invest. 1950; 29:496-507.
6. Astor BC, Matsushita K, Gansevoort RT, van der Velde M, Woodward M, Levey AS, et al. Lower estimated glomerular filtration rate and higher albuminuria are associated with mortality and end-stage renal disease. A collaborative meta-analysis of kidney disease population cohorts. Kidney Int. 2011; 79:1331-1340.
7. Matsushita K, Van der Velde M, Astor B, Woodward M, Levey A, De Jong P, et al. Chronic Kidney Disease Prognosis Consortium: Association of estimated glomerular filtration rate and albuminuria with all-cause and cardiovascular mortality in general population cohorts: A collaborative meta-analysis. Lancet. 2010; 375:2073-2081.
8. KDIGO group. KDIGO 2012 Clinical Practice Guidelines for the Evaluation and Management of Chronic Kidney Disease. Kidney Int. 2013; 3:112-119.
9. Hemmelgarn BR, Zhang J, Manns BJ, James MT, Quinn RR, Ravani P, et al. Nephrology visits and health care resource use before and after reporting estimated glomerular filtration rate. JAMA. 2010; 303:1151-1158.
10. Glassock RJ, Winearls C. Screening for CKD with eGFR: doubts and dangers. Clin J Am Soc Nephrol. 2008; 3:1563-1568.
11. Echouffo-Tcheugui JB, Kengne AP. Risk models to predict chronic kidney disease and its progression: a systematic review. PLoS Med. 2012; 9:e1001344.
12. Hingwala J, Wojciechowski P, Hiebert B, Bueti J, Rigatto C, Komenda P, et al. Risk-Based Triage for Nephrology Referrals Using the Kidney Failure Risk Equation. Can J Kidney Health Dis. 2017; 4:1-9.
13. Ramspek CL, de Jong Y, Dekker FW, van Diepen M. Towards the best kidney failure prediction tool: a systematic review and selection aid. Nephrol Dial Transplant. 2020; 35:1527-1538.
14. Barbour SJ, Coppo R, Zhang H, Liu ZH, Suzuki Y, Matsuzaki K, et al. Evaluating a New International Risk-Prediction Tool in IgA Nephropathy. JAMA Intern Med. 2019; 179:942-952.
15. Grams ME, Sang Y, Ballew SH, Carrero JJ, Djurdjev O, Heerspink HJL, et al. Predicting timing of clinical outcomes in patients with chronic kidney disease and severely decreased glomerular filtration rate. Kidney Int. 2018; 93:1442-1451.
16. Tangri N, Stevens LA, Griffith J, Tighiouart H, Djurdjev O, Naimark D, et al. A predictive model for progression of chronic kidney disease to kidney failure. JAMA. 2011; 305:1553-1559.
17. Whitlock RH, Chartier M, Komenda P, Hingwala J, Rigatto C, Walld R, et al. Validation of the Kidney Failure Risk Equation in Manitoba. Can J Kidney Health Dis. 2017; 4:2054358117705372.
18. Keane WF, Zhang Z, Lyle PA, Cooper ME, de Zeeuw D, Grunfeld JP, et al. Risk scores for predicting outcomes in patients with type 2 diabetes and nephropathy: the RENAAL study. Clin J Am Soc Nephrol. 2006; 1:761-767.
19. Johnson ES, Thorp ML, Platt RW, Smith DH. Predicting the risk of dialysis and transplant among patients with CKD: a retrospective cohort study. Am J Kidney Dis. 2008; 52:653-660.
20. Tangri N, Grams ME, Levey AS, Coresh J, Appel LJ, Astor BC, et al. Multinational Assessment of Accuracy of Equations for Predicting Risk of Kidney Failure: A Meta-analysis. JAMA. 2016; 315:164-174.
21. Hingwala J, Bhangoo S, Hiebert B, Sood MM, Rigatto C, Tangri N, et al. Evaluating the implementation strategy for estimated glomerular filtration rate reporting in Manitoba: the effect on referral numbers, wait times, and appropriateness of consults. Can J Kidney Health Dis. 2014; 1:9.
22. Tangri N, Ferguson T, Komenda P. Pro: Risk scores for chronic kidney disease progression are robust, powerful and ready for implementation. Nephrol Dial Transplant. 2017; 32:748-751.
23. Ramspek CL, de Jong Y, Dekker FW, van Diepen M. Towards the best kidney failure prediction tool: a systematic review and selection aid. Nephrol Dial Transplant. 2020; 35:1527-1538.
24. Dekker FW, Ramspek CL, van Diepen M. Con: Most clinical risk scores are useless. Nephrol Dial Transplant. 2017; 32:752-755.
25. Saez-Rodriguez J, Rinschen MM, Floege J, Kramann R. Big science and big data in nephrology. Kidney Int. 2019; 95:1326-1337.
26. Topol E. Deep medicine: how artificial intelligence can make healthcare human again. New York: Hachette Book Group; 2019.
27. Samuel A. Some Studies in Machine Learning Using the Game of Checkers. IBM J Res Dev. 1959; 3:210-229.
28. Datalya. Machine Learning vs. Traditional Programming Paradigm. https://datalya.com/blog/machine-learning/machine-learning-vs-traditional-programming-paradigm. Accessed 12/07/2021.
29. Singh K, Betensky RA, Wright A, Curhan GC, Bates DW, Waikar SS. A Concept-Wide Association Study of Clinical Notes to Discover New Predictors of Kidney Failure. Clin J Am Soc Nephrol. 2016; 11:2150-2158.
30. Sidey-Gibbons JAM, Sidey-Gibbons CJ. Machine learning in medicine: a practical introduction. BMC Med Res Methodol. 2019; 19:64.
31. Wainberg M, Merico D, Delong A, Frey BJ. Deep learning in biomedicine. Nat Biotechnol. 2018; 36:829-838.
32. Breiman L. Random forests. Machine Learning. 2001; 45:5-32.
33. Banerjee M, Reynolds E, Andersson HB, Nallamothu BK. Tree-Based Analysis: A Practical Approach to Create Clinical Decision-Making Tools. Circ Cardiovasc Qual Outcomes. 2019; 12:e004879.
34. Breiman L. Bagging predictors. Machine Learning. 1996; 24:123-140.
35. Briem GJ, Benediktsson JA, Sveinsson JR. Multiple classifiers applied to multisource remote sensing data. IEEE Trans Geosci Remote Sens. 2002; 40:2291-2299.
36. Bzdok D, Altman N, Krzywinski M. Statistics versus machine learning. Nat Methods. 2018; 15:233-234.
37. Lundberg SM, Erion G, Chen H, DeGrave A, Prutkin JM, Nair B, et al. Explainable AI for trees: From local explanations to global understanding. arXiv preprint arXiv:1905.04610. 2019.
38. Barredo Arrieta A, Díaz-Rodríguez N, Del Ser J, Bennetot A, Tabik S, Barbado A, et al. Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Information Fusion. 2020; 58:82-115.
39. Rucci P, Mandreoli M, Gibertoni D, Zuccala A, Fantini MP, Lenzi J, et al. A clinical stratification tool for chronic kidney disease progression rate based on classification tree analysis. Nephrol Dial Transplant. 2014; 29:603-610.
40. Goodfellow I, Bengio Y, Courville A. Deep Learning. MIT Press; 2016.