Towards The Interpretability of Machine Learning Predictions For Medical Applications Targeting Personalised Therapies: A Cancer Case Survey

International Journal of
Molecular Sciences
Review
Antonio Jesús Banegas-Luna 1, *, Jorge Peña-García 1 , Adrian Iftene 2 , Fiorella Guadagni 3,4 , Patrizia Ferroni 3,4 ,
Noemi Scarpato 4 , Fabio Massimo Zanzotto 5 , Andrés Bueno-Crespo 1 and Horacio Pérez-Sánchez 1, *
1 Structural Bioinformatics and High-Performance Computing Research Group (BIO-HPC), Universidad
Católica de Murcia (UCAM), 30107 Murcia, Spain; jpena@ucam.edu (J.P.-G.); abueno@ucam.edu (A.B.-C.)
2 Faculty of Computer Science, Universitatea Alexandru Ioan Cuza (UAIC), 700505 Jashi, Romania;
adiftene@info.uaic.ro
3 Interinstitutional Multidisciplinary Biobank (BioBIM), IRCCS San Raffaele Roma, 00166 Rome, Italy;
fiorella.guadagni@sanraffaele.it (F.G.); patrizia.ferroni@sanraffaele.it (P.F.)
4 Department of Human Sciences and Promotion of the Quality of Life, San Raffaele Roma Open University,
00166 Rome, Italy; noemi.scarpato@uniroma5.it
5 Dipartimento di Ingegneria dell’Impresa “Mario Lucertini”, University of Rome Tor Vergata,
00133 Rome, Italy; fabio.massimo.zanzotto@uniroma2.it
* Correspondence: ajbanegas@ucam.edu (A.J.B.-L.); hperez@ucam.edu (H.P.-S.)

Abstract: Artificial Intelligence is providing astonishing results, with medicine being one of its
favourite playgrounds. Machine Learning and, in particular, Deep Neural Networks are behind
this revolution. Among the most challenging targets of interest in medicine are cancer diagnosis
and therapies but, to start this revolution, software tools need to be adapted to cover the new
requirements. In this sense, learning tools are becoming a commodity but, to be able to assist
doctors on a daily basis, it is essential to fully understand how models can be interpreted. In this
survey, we analyse current machine learning models and other in-silico tools as applied to medicine,
specifically to cancer research, and we discuss their interpretability, performance and the input
data they are fed with. Artificial neural networks (ANN), logistic regression (LR) and support vector
machines (SVM) have been observed to be the preferred models. In addition, convolutional neural
networks (CNNs), supported by the rapid development of graphic processing units (GPUs) and
high-performance computing (HPC) infrastructures, are gaining importance when image processing is
feasible. However, the interpretability of machine learning predictions so that doctors can understand
them, trust them and gain useful insights for the clinical practice is still rarely considered, which is a
factor that needs to be improved to enhance doctors’ predictive capacity and achieve individualised
therapies in the near future.

Keywords: drug repurposing; machine learning; personalised therapy; cancer treatment; deep
learning; high performance computing

Citation: Banegas-Luna, A.J.; Peña-García, J.; Iftene, A.; Guadagni, F.; Ferroni, P.; Scarpato, N.;
Zanzotto, F.M.; Bueno-Crespo, A.; Pérez-Sánchez, H. Towards the Interpretability of Machine Learning
Predictions for Medical Applications Targeting Personalised Therapies: A Cancer Case Survey.
Int. J. Mol. Sci. 2021, 22, 4394. https://doi.org/10.3390/ijms22094394

Academic Editor: Jung Hun Oh
Received: 30 March 2021
Accepted: 20 April 2021
Published: 22 April 2021

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and
institutional affiliations.

1. Introduction
Cancer has become one of the most common human diseases and causes of death [1–3].
Among other factors, its occurrence is mainly growing because of aging [4]. Even though
cancer is a disease that affects men as well as women, there seems to be a clear relationship
between gender and incidence. Thus, lung, prostate, colorectal, stomach and liver cancer
are predominant among men, while breast, colorectal, lung, cervical and thyroid are the
most common cancers in women (https://www.who.int/health-topics/cancer, accessed
on 29 March 2021). Figure 1 depicts the number of estimated deaths in 2020 by cancer type
collected from the Surveillance, Epidemiology and End Results (SEER) database.

Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution (CC BY)
license (https://creativecommons.org/licenses/by/4.0/).
Int. J. Mol. Sci. 2021, 22, 4394. https://doi.org/10.3390/ijms22094394 https://www.mdpi.com/journal/ijms

Figure 1. Estimated deaths in the USA in 2020 by cancer type and gender. Source: SEER database.
A diverse range of therapies, including chemotherapy, radiotherapy, surgery and
irradiation, is used in cancer patients depending on tumour type and stage. Unfortunately,
the success of these treatments is limited because they attack normal and tumoral cells
equally, which may result in toxicity and make the tumoral cells drug-resistant. In this
scenario, early detection is a crucial factor for the successful application of therapies, for
limiting associated side effects and, consequently, increasing the chance of survival [5,6].
For this reason, providing the physicians with appropriate tools for accurate diagnosis and
prognosis remains a major challenge in cancer research.
Colorectal cancer (CRC) is the third most common type of cancer worldwide, representing
10% of all diagnosed cases, and the fourth in the number of deaths it causes [7,8].
Furthermore, these figures are not very promising because the number of CRC cases is
expected to increase by around 60% in the forthcoming decade [9].
As regards the reasons for such disheartening data, bad dietary habits are suspected to
be behind the growing number of CRC cases reported in recent years but there are other
reasons, such as the lack of exercise, obesity and smoking that are suspected of causing
tumours [10]. Moreover, familial and hereditary antecedents have proved to influence
the incidence of this cancer [11]. In an attempt to identify reasons, beyond the biological,
for the evolution of CRC worldwide, Arnold [9] published a study correlating the human
development index with the incidence and high mortality of CRC, which resulted in the
classification of countries into three groups with well-defined characteristics. In short, a
number of factors in our daily lives promote the emergence of colorectal tumours and,
although there is no clear numerical estimation of how much these factors contribute to the
appearance of CRC, it seems to be in our hands to change the trend. From a more medical
point of view, the high morbidity and mortality rates could be explained by the fact that
malignant CRC tumours are considered to be especially complex biologically [12].
Much effort has been put into predicting CRC or, at least, into predicting the manner
in which the tumour is likely to progress. Genetic information plays a key role for detecting
tumoral cells and tissues that can help identify cancer disease at an early stage. The role
of genetic mutations in CRC has been extensively analysed and several publications are
available in the literature on this topic [13–16]. Other authors have focused on identifying
biomarkers with the aim of finding the subset with the highest predictive power [17–20].
Early identification could increase the likelihood of survival and dramatically reduce the
mortality rate. Unfortunately, a full understanding of cancer cell behaviour is still beyond
our grasp, making this a major challenge in medicine.
When prevention has failed, the application of individualised therapies is the ideal
scenario for the treatment of cancer patients. Personalising therapies implies finding the
most suitable set of drugs and their exact dose for a given patient, based on the available
input parameters, such as cancer type, tumour size and whether metastasis is present or
not. The idea behind this individualisation of therapies is to maximize the effect of drugs,
limit their side effects, shorten the time necessary to cure the disease and reduce costs. The
idea that individualised therapies are more cost-effective than generic ones seems credible
because the same treatment is obviously not suitable for every patient since not all cases
are similar. Several publications have discussed the direction that medicine is taking in this
respect [21–24] and its popularity has grown in recent years. Although all these authors
agree that personalised treatment will increase the effectiveness of existing drugs, to the
best of our knowledge, there has been no attempt to put it into practice in the case of cancer
treatment, making this goal a priority in cancer research.
In this move towards individual therapies, computing sciences have become a close
ally of health and life sciences and medicinal chemistry. High-performance computing (HPC)
platforms, such as parallel and distributed computing systems, have developed rapidly and
found a place in the field of chemical and biological problems. It is
well known that HPC infrastructures are extensively used to carry out complex scientific
calculations [25–27] and their computing power can drastically speed up the resolution of
a problem [28–30]. However, this is not enough: firstly, because the amount of medical and
pharmacological data available is overwhelming and huge computing power is needed to
analyse it all; and, secondly, the analysis methods necessary to transform such data into
real understandable knowledge are very challenging. While HPC can help overcome the
first difficulty, the application of artificial intelligence (AI), and more specifically machine
learning (ML), is necessary for the second. Only if HPC and ML work together will they be
capable of screening the vast chemical space and predicting the most cost-effective therapy for
individual patients [31,32].
Machine learning experts know that with the right data very efficient predictions can
be made, as has been demonstrated in several fields such as sports results, injuries, stock
market movements, text-based emotions, etc. The field of medicine has not been left behind
in this respect and such technology is already used to diagnose or predict diseases such
as cancer [33], making it clear that ML, complemented by HPC, represents the future of
anti-cancer medicine. Already, ML algorithms are very helpful in many cancer-related
tasks, such as the prediction and diagnosis of the disease, predicting its progression, the
search for new drug synergies, predicting therapy outcomes and estimating survivability.
It is the potential for analysing historical data, learning from the analysis and making
predictions for future cases that makes them suitable for application in cancer research. It
might even be claimed that ML is the aid that doctors need to increase the accuracy of their
predictions and decision making, due to its ability to extract knowledge from previous
cases. Evidently, the output of ML systems has to be transformed to make it understandable
by healthcare staff; otherwise, we would be wasting an important opportunity.
This critical review highlights the role of ML in each of the main steps of anti-cancer
medicine. Section 2 focuses on the needs of doctors, attempting to answer questions like
“What kind of ML do doctors need?” and “Does ML output need to be adapted to medical
doctors?”. Section 3 presents a review of the typical ML algorithms used in each stage,
each subsection describing the most frequently used approaches, which are condensed
into a table to facilitate their readability. The most relevant findings observed in Section 3
are discussed in Section 4. Finally, the main conclusions reached and the future of ML in
cancer research are summarized.
2. What Kind of ML Is Important in Medicine/Cancer Prediction and Treatment

In this section, we focus on the basic features of an ML system that medical doctors
and medical/biological researchers are seeking beyond the output that a trained ML system
already provides.
The advantages of ML systems stem from the fact that they can draw on thousands of features
to produce decisions in a very short time. It is important to note that the
training stage can be expensive in terms of computing power, while the prediction stage is
in comparison fast and computationally cheaper. The correlations that the algorithm finds
between the samples are similar to those found by experienced doctors, who have seen
hundreds of patients and begun to notice repetitive symptoms or similar values in their
detailed medical tests, which helps them to make decisions.
However, no matter how accurate ML systems are, no matter how many lives they can
save in principle and no matter whether they are based on the doctor’s entire medical knowledge,
they will not be a game changer in medicine if medical/biological researchers do not
understand the underlying models and their inferences. If ML systems cannot be explained,
medical/biological researchers will not use them to make everyday
decisions, condemning the whole approach to failure. To achieve any success, ML systems
need to gain the trust of medical/biological researchers.
Consequently, our aim was to define three factors that should contribute to the success
of ML systems in the medical domain: (i) output interpretability, (ii) linking the
predictions to the original cases used to produce outputs and (iii) low data hungriness. In
this survey, we analyse existing approaches with respect to these factors as, only if
substantial attention is paid to all of them will a novel ML approach or system be a game
changer in a specific clinical situation. Only if the answer to the question “Do doctors
need to know about and learn ML in the future?” is negative can ML add real value to
clinical practice.
2.1. Factor One: Output Interpretability

Interpretability in Machine Learning or, in AI in general (XAI), is a hot topic, especially
when it is applied to medicine. AI systems tend to return raw results that are hard to
understand, which complicates their interpretation by non-expert users, including doctors.
Thus, to make AI more attractive to healthcare professionals, we should answer the question
“What do doctors need to easily interpret AI predictions?” Interpretability often appears
as a desideratum, but it is poorly defined [34]. Hence, a clear understanding of the term
interpretability is essential in order to classify existing ML approaches. In general, there are
two approaches to interpretability: model interpretability and inference interpretability [35].
Model interpretability relates to understanding how a model behaves in general, whereas
inference interpretability aims to describe how systems decide on each instance. Hence,
these are two facets of the same problem. However, in both cases, interpretability may
be obtained by showing symbols (e.g., natural language or structured languages such as
logical forms) to explain models or inferences.
Since the first AI systems, authors have outlined their stages of inference. For example,
Swartout et al. [36] deal with explanations for expert systems, Johnson [37] presents agents
that learn to explain themselves and Lacave and Diez [38] discuss interpretation methods
for Bayesian networks (BN). In recent years, there has been a strong emphasis on revealing
what happens behind the black box that uses AI algorithms [39]. This is necessary if
doctors are to trust the results provided by these algorithms and so use them in their daily
activities (diagnosis, deciding on the most appropriate treatment, etc.). In comparison with
other domains, medicine deals with uncertain, probabilistic, unknown, incomplete,
imbalanced, heterogeneous, noisy, dirty, erroneous, inaccurate and missing data sets in
arbitrarily high-dimensional spaces [40,41].
Explainable artificial intelligence (XAI) has received much attention in recent years [42].
There are two aspects of unsupervised learning models relevant in the context of inter-
pretability [39]. First, the representations learned in these models may show similarities
between the data in a class. One such case is word embeddings, which can signal
semantic similarity between words [43]. Second, these models can generate instances that
allow us to study the differences between data within a class. This is useful in medicine,
where the discovery and analysis of disease-related abnormalities are important [44].
Trustworthiness in AI is the ability to evaluate the validity and reliability of an ML
system in many different input configuration and application environments. This factor is
very important in the medical environment, particularly in cancer prediction, where it is
necessary to be able to evaluate exactly the limitations of an ML system and, consequently,
accurately interpret and trustfully apply ML prediction system outputs.
Bærøe et al. [45] underline the growing importance of AI and the relative need for
trustworthiness in AI systems, especially in the medical environment. In the same work,
the authors analyse the report “Ethical guidelines for trustworthy artificial intelligence”
published by the European Commission in 2019
(https://ec.europa.eu/futurium/en/ai-alliance-consultation, accessed on 29 March 2021) and
highlight the need for “globalising” the guidelines at both European and international level.
2.2. Factor Two: Linking to Original Cases to Produce Outputs

AI systems often focus on the outputs but do not explain how much each input
participates in the result. In a medical context, this correlation between inputs and outputs
may be necessary to identify the reasons leading to a given decision.
Attribution methods try to link a certain output of the deep neural network with input
variables [39]. In another paper [46], the authors analyse how the output gradients change
with the input variables. In this way, the authors propose a result based
on the data that were used as the input of an algorithm and try to make a link between
these data and the result obtained. However, in the medical field, although we can
still explain the results obtained and see a link with similar cases that formed the basis of a
decision formulated by the AI algorithms, there will always be the possibility of making a
mistake and exposing the patient to certain risks [47].
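
The gradient-based linking of outputs to inputs described above can be illustrated with a minimal sketch. It is not the method of [39] or [46]: it assumes a hypothetical logistic risk model whose feature names, weights and patient values are invented for illustration, and it simply computes the partial derivative of the predicted risk with respect to each input, checking the analytic result against a finite-difference estimate.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, x):
    """Predicted risk of a logistic model for one patient vector x."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

def gradient_attribution(weights, bias, x):
    """Analytic d(risk)/d(x_i): p * (1 - p) * w_i for a logistic model."""
    p = predict(weights, bias, x)
    return [p * (1.0 - p) * w for w in weights]

def numerical_gradient(weights, bias, x, eps=1e-6):
    """Central finite differences, used only to check the analytic gradient."""
    grads = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += eps
        down[i] -= eps
        grads.append((predict(weights, bias, up) - predict(weights, bias, down)) / (2 * eps))
    return grads

# Hypothetical model and patient (assumptions, not taken from any cited study).
FEATURES = ["age", "bmi", "smoker"]
WEIGHTS = [0.8, 0.3, 1.5]
BIAS = -1.0
patient = [0.6, 0.4, 1.0]  # inputs already scaled to [0, 1]

analytic = gradient_attribution(WEIGHTS, BIAS, patient)
numeric = numerical_gradient(WEIGHTS, BIAS, patient)

# Rank inputs by the magnitude of their local influence on the prediction.
ranking = sorted(zip(FEATURES, analytic), key=lambda t: abs(t[1]), reverse=True)
for name, grad in ranking:
    print(f"{name}: {grad:+.4f}")
```

Such gradients describe only the local sensitivity of one prediction; they do not by themselves guarantee that the decision is clinically correct, which is the caveat raised in [47].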
2.3. Factor Three: Data Hungriness

With the widespread application of computer technology in the medical field, the
amount of medical data available has increased dramatically and analysis methods are
already in use for the intelligent assessment of medical health. In the coming years,
we expect the volume of medical data to increase even more, ranging from terabytes to
petabytes and even yottabytes [48–50].
However, due to the mixed format of medical data, incomplete records and the noise
present in them, it is still difficult to analyse large amounts of medical data [51]. Because
traditional ML methods cannot efficiently extract a rich body of information from large
medical databases, Deep Learning (DL) methods have arisen to build more complex models
based on an idea similar to the way that neurons set interconnections in the human brain.
Increasingly, DL models use large medical databases, from which they select and optimize
parameters and automatically learn the process of pathological analysis followed by doctors [52].
Based on these models, the disease in question is identified in an intelligent way and an
early diagnosis can be made. Thus, the pressure on the activities of doctors is considerably
reduced and the efficiency of their work can be improved.
3. Application of ML Approaches in Cancer Cases

In this section, a number of cases will be discussed to illustrate how ML can help
doctors in the different stages of cancer evolution, from its diagnosis to the prediction
of survival chances. Each subsection focuses on one of the main steps targeted by ML in
healthcare contexts. Tables 1–6 summarize a detailed collection of works related to the
topic of discussion. The datasets column describes the original source of data and may reference
a specific dataset, a full database, a citation, a project or the institution that collected the
samples. The column entitled “Exp?” indicates whether the interpretability of the results by
non-experts is considered in the paper or not. Other relevant information, such as the AI
approaches and the software tools used, are also reported. To facilitate the readability of
the examples, we present the works in a short table per section.
Table 1. Review of publications whose main topic is Machine Learning and cancer risk prediction.
Cancer Type | AI Approach | Datasets | Software | Training Data Set Size | Data Types | Exp? | Reference
Lung | CNN ¹ | BRFSS | Caffe | 235,673 | Text | Yes | [53]
Any | RF ² | COSMIC, dbSNP | R, HMMER, Dojo | 200, 800 | Text | No | [54]
Any | SVM ³ | Cosmic, SwissVar, Swiss-Prot | Libsvm | 6326 | Text | No | [55]
Breast, Thyroid, Kidney | RF | TCGA:BRCA, TCGA:THCA, TCGA:KIRP | Java, Weka, YARN, MLlib | 897, 571, 321 | Text | No | [56]
Breast, Thyroid, Kidney | DT ⁴ | TCGA:BRCA | unknown | 897 | Text | No | [56]
Breast, Thyroid, Kidney | SVM | TCGA:BRCA | unknown | 897 | Text | No | [56]
Breast, Thyroid, Kidney | BN | TCGA:BRCA | unknown | 897 | Text | No | [56]
CRC | BN | NSHDS | R, Visualizations with Cytoscape | 1676 | Text | Yes | [57]
Breast | ANN ⁵ | Private | Matlab | 62,219 | Images, Text | No | [58]
Breast | CNN, SVM | unknown | R | 500 | Images, Text | No | [59]
Breast | CNN, KNN ⁶ | unknown | R | 500 | Images, Text | No | [59]
Breast | GBM ⁷, SVM | KBCP, OBCS | XGBoost, Sklearn, esyN, Matplotlib, Python | 696, 923 | Text | Yes | [60]
Gastric | GBM | Private | XGBoost | 1431 | Text | No | [61]
Gastric | LR ⁸ | Private | unknown | 1431 | Text | No | [61]
Skin | ANN | NHIS | unknown | 462,630 | Text | No | [62]
Ovarian | KNN, LDA ⁹, SVM, ELM ¹⁰ | IOTA tumor images database | Matlab | 348 | Images | No | [63]
Cervical | CNN | Private | Caffe | 20,000 | Images | No | [64]
¹ CNN, Convolutional Neural Network; ² RF, Random Forest; ³ SVM, Support Vector Machines; ⁴ DT, Decision Tree;
⁵ ANN, Artificial Neural Network; ⁶ KNN, K-Nearest Neighbours; ⁷ GBM, Gradient Boosting Machines; ⁸ LR, Logistic
Regression; ⁹ LDA, Linear Discriminant Analysis; ¹⁰ ELM, Extreme Learning Machines.
Table 2. Summary of studies analysed in Section 3.2 about cancer recurrence.
Cancer Type | AI Approach | Datasets | Software | Training Data Set Size | Data Types | Exp? | Reference
CRC | KNN, SVM, GBM, ANN, DT, RF | GEO, ArrayExpress | R | 50 | Text | Yes | [65]
CRC | LR, DT, GBM | BioStudies database | Python, R | 800 | Text | Yes | [66]
Breast | SVM, ANN, Regression | unknown | SPSS, R | 733 | Text | No | [67]
Breast | SVM, ANN, DT | ICBC | Weka | 1189 | Text | No | [68]
Breast | SVM, RO ¹ | BioBIM | Java | 318 | Text | Yes | [69]
Breast, CRC | SSL ² | GEO, I2D | C++ | 194,988 | Text | Yes | [70]
Oral | BN, ANN, SVM, DT, RF | unknown | unknown | 86 | Text, Images | Yes | [71]
Cervical | SVM, DT, ELM | Chung Shan Medical University Hospital Tumor Registry | unknown | 168 | Text | Yes | [72]
¹ RO, Random Optimization; ² SSL, Semi-Supervised Learning.
Table 3. Works applying ML to forecast cancer progression.
Cancer Type | AI Approach | Datasets | Software | Training Dataset Size | Data Types | Exp? | Reference
Lung | RF | Multicenter Clinical Trials | Matlab2016, SPSS23 | 72, 32, 31 | Images | No | [73]
Lung, Breast, Renal, CRC | TL ¹ | TRACERx, [74,75] | ClonEvol | 768 | CCF, binary data | Yes | [76]
Lung, CRC | RNN | TCGA | Matlab | 506, 253 | Numbers | No | [77]
Breast | ANN | [78] | unknown | 16 | Numbers | No | [79]
Head and Neck | LR | GSE57441, GSE9844 | GraphPad Prism | 330 | Mass spectra | No | [80]
Skin | Weka-FCBF ², SVM, PCA ³, ExtraTrees, KNN, RF, LR, Ridge | TCGA | caret, scikit, OmicsMarkeR, Rtsne, scatterplot3d | 371, 354, 371 | Numbers | No | [81]
¹ TL, Transfer Learning; ² Weka-FCBF, Waikato Environment of Knowledge Analysis, Fast Correlation Based Filter;
³ PCA, Principal Component Analysis.
Table 4. Manuscripts applying ML to estimate drug doses or finding drug combinations for cancer therapies.
ANN UCSD #140520 study unknown 66 Text, Images unknown [82]
Prostate ANN UCSD #140520 study unknown 66 Text, Images No [83]
CNN unknown Keras, Tensorflow 72 Images No [84]
DB-stored
Breast DSS 1 Local database unknown unknown Yes [85]
medical records
AstraZeneca, DREAM
LR, SVM, RF, GBM sklearn, xgboost 2790 Numbers Yes [86]
consortium
R, Matplotlib,
MVA 2 on Undirected Graphs GDSC, CCLE, CTRP 700 CSV, Text Yes [87]
Graphviz
Any
Compounds,
ANN [88] TensorFlow 23,062 Yes [89]
Cell lines
RF Princess Margaret Cancer Centre unknown 383 Images No [90]
CNN PASCAL VOC 2012 TensorFlow 1464 Images No [91]
CNN PASCAL VOC 2012 Caffe, TensorFlow 1464 Images No [92]
ANN NCI database unknown 141 Text Yes [93]
1 DSS, Decision Support Systems; 2 MVA, Multivariate Analysis.
Table 5. List of works presented in Section 3.5 about the prediction of therapy outcome in cancer patients.
Akershus University Hospital,
Aker University Hospital,
CNN TensorFlow 12×106 Images No [94]
Gloucester Colorectal Cancer
Study, VICTOR trial
Teikyo University Hospital, Gifo Medical
RF unknown 54 No [95]
University Hospital Records
CRC caret, class, e1071,
gbm, tree,
RF, SVM, ANN, DT, KNN, GBM GSE19860, GSE28702, GSE72970 50 Raw data No [65]
randomForest,
RSNNS
LR, DT, GBM BioStudies database Scikit-learn, R 800 Excel No [66]
DB-stored
BN ACTUR database NCSS 5301 Yes [96]
medical records
Genomics of Drug Sensitivity in
RF, ANN Encog, randomForest 38,930 Raw data No [97]
Cancer portal
SVM GSE19860, GSE28702, GSE72970 e1071 144 Raw data No [98]
limma, glmnet,
RF GSE52735, GSE62080, GSE69657 Boruta, 58 Raw data No [99]
randomForest, pROC
SVM, LR unknown Orange 38 unknown No [100]
Val d’Aurelle Regional Cancer
SVM MAS 5.0 5 to 19 Numbers No [101]
Center
Nellie B. Connally Breast Center,
M.D. Anderson Cancer Center,
Diagonal LDA, KNN Instituto Nacional de dCHIP 133 Text, Numbers No [102]
Breast Enfermedades Neoplásicas de
Lima
SVM, Recursive Feature
University of Heidelberg e1071, ROC 52, 48 Numbers No [103]
Elimination
LR unknown unknown 84 Numbers No [104]
University of Southern
Bladder DT SPSS 948 Numbers No [105]
California
Blood LDA FRALLE93 protocol unknown 32 Numbers No [106]
Renal SVM National Wilms Tumor Study-5 e1071 250 Numbers No [107]
Duke University Medical Center,
Ovary Binary LR, Stochastic Regression H. Lee Moffitt Cancer Center Bioconductor 83 Numbers No [108]
and Research Institute
Esophageal SVM unknown unknown 46 Text, Numbers No [109]
Lung
Head and Neck [110–116], Morin (forthcoming), 156, 137, 363, 179, 327, 139,
DT, RF, ANN, SVM, LR, GBM caret Text Yes [121]
Meningioma [117–120] 922, 257, 548, 131, 149, 188
Laryngeal
Table 6. Summary of works about ML and the likelihood of survival.
SVM [122] unknown 295 Numbers No [123]
BN [124] unknown 97 Numbers Yes [125]
Breast DB-stored
SSL SEER database unknown 162,500 No [126]
medical records
DB-stored
SSL Co-training SEER database unknown 162,500 No [67]
medical records
DB-stored
ANN, LR, DT SEER database unknown 200,000 Yes [127]
medical records
Oral SVM unknown unknown 69 unknown No [128]
Any ANN unknown unknown 440 unknown No [129]
Linear Regression, DT, SVM, DB-stored
Lung SEER database R 7830 Yes [130]
GBM, Custom1 medical records
Helsinki University Central
CRC CNN, RNN Keras 420 Images Yes [131]
Hospital
TCGA, South Australian public
Brain CNN Keras, Tensorflow 679 Images Yes [132]
hospital system
Prostate DT, BN, Cox The Methodist Hospital S-PLUS 1050 Text Yes [133]
1 A custom ensemble of methods.
3.1. Predict the Possibility of Cancer

Currently, most of the studies performed for predicting the possibility of cancer are
based on the analysis of genetic data and mutations. Kaminker et al. [54] developed
CanPredict software to identify and predict whether certain mutations are associated
with tumours or not. The software combines the Sorting Intolerant From Tolerant (SIFT),
LogR.E-value score and Gene Ontology Similarity Score (GOSS) methods by applying
an advanced Random Forest (RF) classification scheme. Capriotti and Altman [55] used
support vector machines (SVM) to analyse different databases, each created with an equal
number of cancer driver Single Amino Acid Polymorphisms (SAPs) and neutral SAPs.
Using this technique, it is possible to predict whether a given missense SAP is neutral or is
involved in cancer appearance. In their study, the authors achieved an effectiveness greater
than or equal to 90% in the overall predictions.
Taninaga et al. [61] describe how a set of characteristics related to gastric cancer can be
processed using extreme gradient boosting (XGBoost) algorithms or logistic regression
(LR) methods to predict whether a patient is at risk of developing the disease over the
next 122 months. In this study, 10 models were developed. For the first five, the authors
used XGBoost: the first model only took into account Helicobacter infections, while to the
second they added data on chronic atrophic gastritis, in the third they included endoscopic
findings, in the fourth they added biological background factors and in the fifth they also
included blood tests. The other five models were identical but applied linear logistic regression
instead of XGBoost. The performance of each model was measured using the area under
the curve (AUC) value. As a result of the research, the most influential characteristics
in the development of gastric cancer were seen to be the mean corpuscular volume, the
proportion of lymphocytes, age, body mass index (BMI) and postgastrectomy. Finally, AUC
values of 0.899 and 0.874, respectively, were obtained with the 5th and 10th models, the
authors concluding that with these models it is possible to predict whether a patient might
suffer from cancer.
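
Since AUC is the metric used to compare these models, a minimal sketch of how it can be computed may be helpful. The sketch uses the rank interpretation of AUC (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half). The labels and the two sets of scores are invented for illustration and do not reproduce the data or models of [61].

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))

# Hypothetical outcomes (1 = developed the disease) and risk scores from two models.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
model_a = [0.9, 0.8, 0.4, 0.3, 0.2, 0.5, 0.1, 0.7]
model_b = [0.6, 0.4, 0.5, 0.5, 0.3, 0.7, 0.2, 0.6]

auc_a = auc(labels, model_a)
auc_b = auc(labels, model_b)
print(f"model A: {auc_a:.4f}, model B: {auc_b:.4f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes, which is the scale on which the values reported above should be read.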
According to the American Cancer Society, 3.3 million people are diagnosed with
skin cancer annually. A prediction of the risk of suffering Non-Melanoma Skin Cancer
(NMSC) was made [62] using 13 items of personal patient data that can easily be obtained from
an Electronic Medical Record (EMR): gender, age, BMI, diabetic status, smoking status,
emphysema, asthma, race, Hispanic ethnicity, hypertension, heart diseases, vigorous
exercise habits and history of stroke. These input parameters were first normalized to
values between 0 and 1 and an artificial neural network (ANN) model was developed based
on one input layer with 13 nodes, two hidden layers with 13 nodes and one output node.
The authors used 462,630 cases, taking 70% of the cases for training and the remaining 30%
for validation and obtained an AUC value of 0.81. The study concluded that by including
the two most important factors that should be taken into account in skin cancer, i.e.,
radiation and personal history, risk predictions of the model could very likely be improved.
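
The preprocessing and network size described above can be sketched as follows. This is an illustration under stated assumptions rather than the implementation of [62]: it assumes min-max scaling for the normalisation to [0, 1], a random 70/30 split, and a fully connected network with bias terms in which each of the two hidden layers has 13 nodes. All function names and the toy values are hypothetical.

```python
import random

def min_max_normalise(column):
    """Scale a list of numbers linearly to the interval [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]  # constant column carries no information
    return [(v - lo) / (hi - lo) for v in column]

def train_validation_split(rows, train_fraction=0.7, seed=0):
    """Shuffle the cases and split them into training and validation sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

def count_parameters(layer_sizes):
    """Weights plus biases of a fully connected network (assumed architecture)."""
    return sum((n_in + 1) * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# Toy ages standing in for one of the 13 input variables.
ages = [25, 40, 55, 70, 85]
scaled_ages = min_max_normalise(ages)

# 1000 placeholder case identifiers instead of the 462,630 real cases.
train, validation = train_validation_split(range(1000))

# 13 inputs, two hidden layers of 13 nodes (assumed), one output node.
n_params = count_parameters([13, 13, 13, 1])
print(scaled_ages, len(train), len(validation), n_params)
```

Under these assumptions the network has only a few hundred trainable parameters, which is small relative to the number of training cases.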
Martínez-Más et al. [63] combined different ML techniques with features obtained
by Fourier transform (FT) to classify ovarian tumours as benign or malignant, using
ultrasound images. The 187 features extracted from the ultrasound images using FT
were used as inputs for k-nearest neighbours (KNN), linear discriminant
analysis (LDA), SVM and extreme learning machine (ELM) classifiers. Different kernels were
analysed to obtain the optimal configuration and it was seen that the combinations of FT
with LDA, SVM or ELM are good classifiers for biomedical images, providing an accuracy
of more than 85%.
Breast cancer (BC) is one of the most common types of cancer in women. For predic-
tion purposes, a regular analysis of mammographic images is required. To estimate the
probability of malignancy of the tumour, there are three categories: prognostic models,
computer-aided detection and computer-aided diagnosis. Ayer [58] proposed a method for
accurately predicting BC using ANNs, with particular emphasis on calibration made by
means of the Hosmer–Lemeshow goodness-of-fit test. The network topology has
three layers: an input layer with 36 nodes (mammographic descriptors, demographic
factors and BI-RADS), a hidden layer with 1000 nodes and an output layer with 1
node. The network was trained using cross-validation on 62,219 records.
The authors then compared the results of their model with the predictions
of eight radiologists. The ANN obtained an AUC value of 0.965 and the
radiologists a value of 0.939, which demonstrates the good predictive capabilities of ANNs; they
can, therefore, be considered a reliable support tool.
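To make the calibration check mentioned above more concrete, the following sketch computes the Hosmer–Lemeshow statistic from predicted probabilities and observed outcomes. It is an illustrative reading of the standard test, not the code used in [58]; the equal-sized grouping, the function name and the toy data are assumptions.

```python
def hosmer_lemeshow_statistic(probabilities, outcomes, groups=10):
    """Hosmer-Lemeshow chi-square statistic over equal-sized risk groups."""
    pairs = sorted(zip(probabilities, outcomes))  # order cases by predicted risk
    n = len(pairs)
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        expected = sum(p for p, _ in chunk)   # expected number of events
        observed = sum(y for _, y in chunk)   # observed number of events
        variance = expected * (1.0 - expected / len(chunk))
        if variance > 0:
            statistic += (observed - expected) ** 2 / variance
    return statistic


# Hypothetical predictions for eight cases, split into two risk groups.
predicted = [0.25] * 4 + [0.75] * 4
well_calibrated = [1, 0, 0, 0, 1, 1, 1, 0]   # 1 of 4 and 3 of 4 events
poorly_calibrated = [0, 0, 0, 0, 1, 1, 1, 0]  # 0 of 4 and 3 of 4 events

hl_good = hosmer_lemeshow_statistic(predicted, well_calibrated, groups=2)
hl_poor = hosmer_lemeshow_statistic(predicted, poorly_calibrated, groups=2)
```

A small statistic indicates that predicted and observed event counts agree within each risk group. In practice the statistic is compared against a chi-square distribution with the number of groups minus two degrees of freedom.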
Predictions of the risk of developing BC in the short term can be made by comparing
the distribution of volumetric breast density of both breasts based on mammographic
image analysis [59]. The authors proposed a model based on a Convolutional Neural
Network (CNN), which converts an image into a characteristics vector, then applied a
Locality Preserving Projection (LPP) algorithm to reduce the features obtained by the
network, finally obtaining a vector with 44 characteristics. Classification was then carried
out, comparing two classification methods, SVM and KNN. The model was trained through
a cross-validation using 500 mammographic images, which provided an AUC value of 0.62
for SVM and 0.60 for KNN. In order to further optimize the accuracy of the model, the
AUC values were calculated for each of the 44 characteristics and then sorted according to
these values. Subsequently, the least relevant characteristics were eliminated, by testing
the model based on a range of 2 to 10 characteristics. With 10 features and using KNN, an
AUC value equivalent to 0.64 was obtained, which was better than when using 44 features.
The best configuration was achieved using LPP-KNN, reducing the regenerated features to
four. This gave an AUC value of 0.68 for the short-term prediction of BC (less than 5 years).
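The feature-ranking step described above (compute an AUC for each characteristic, sort, keep the best few) can be sketched as below. The source does not state how the single-feature AUCs were computed, so the pairwise AUC estimate, the function names and the toy data are assumptions made for illustration.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def top_k_features(feature_columns, labels, k):
    """Rank features by single-feature AUC and keep the k best names."""
    ranked = sorted(feature_columns,
                    key=lambda name: auc(feature_columns[name], labels),
                    reverse=True)
    return ranked[:k]


# Hypothetical toy data: three features measured on six cases.
labels = [0, 0, 0, 1, 1, 1]
features = {
    "f_strong": [0.1, 0.2, 0.3, 0.7, 0.8, 0.9],  # separates the classes
    "f_weak":   [0.1, 0.6, 0.3, 0.2, 0.8, 0.9],  # partially informative
    "f_noise":  [0.5, 0.5, 0.5, 0.5, 0.5, 0.5],  # uninformative
}
selected = top_k_features(features, labels, k=2)
```

Note that this simple ranking treats a feature with an AUC well below 0.5 as poor, although such a feature would be informative with its sign reversed; a fuller implementation might rank by the distance from 0.5 instead.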
The risk of developing BC can be predicted through the identification of Single Nu-
cleotide Polymorphisms (SNPs) in DNA that contribute most to its development [60]. To
identify them, a three-stage protocol is implemented: (i) the SNPs are selected using a
gradient boosting classification technique: XGBoost; (ii) based on the XGBoost output data,
an adaptive iterative search for SNPs is made, sorting the results downwards according
to their scores; the M best-scored results and the M worst-scored ones are selected and
are separately ordered from lowest to highest; this process is repeated, increasing the size
of M until both lists overlap; (iii) the top SNPs are chosen and classified with SVM,
representing an optimal group that can potentially predict the risk of BC. The protocol
is implemented in Python with the sklearn and xgboost libraries, among others, and can be
downloaded from GitHub.
DNA methylation is known to play a major role in tumorigenesis. BIGGIOCL [56]
is a tool that can analyse hundreds of thousands of individual data records in a few
hours. Although it was designed to analyse DNA and CpG islands, the authors state
that it could be adapted to other fields. The tool, developed in Java and based on the MLlib
learning library, allows work to be parallelized across multiple machines. One of the
reasons for choosing RF when developing the software was its suitability for parallelization:
each tree of the forest can be executed on a separate node and its results sent to the master
node. As it is based on MLlib, it can be used in a Yet Another Resource Negotiator (YARN)
environment. In the publication, the authors analysed data from HumanMethylation450
to check its relationship with BC and obtained a direct relationship with the genes TP53,
PIK3CA, BRCA1, BRCA2 and BDNF, results that match those previously published by
other authors.
Another type of cancer that is frequent in both men and women is CRC. Myte et al. [57]
carried out the first study relating a one-carbon metabolism (1CM) pathway to cancer risk
in humans by applying a BN. The observed relationship between compounds of 1CM
and CRC and the lack of empirical studies proving the impact of 1CM and SNPs on CRC
motivated this work. The study collects data from blood samples, one per patient, and
uses a BN to relate population-based data, SNPs and the metabolic pathways involved in
1CM. The authors suggested that the most important factors in colorectal tumorigenesis
are the associations between folate, vitamin B6 and vitamin B2 and concluded that these
compounds should be taken into account in future studies of 1CM and the development
of CRC.
Lifestyle is important for disease prevention. In the case of lung cancer particularly,
there are certain habits or external factors that can increase the risk of contracting the
disease. In the study of Chen and Wu [53] a set of data concerning demographics, disease,
radiation, behaviour, environment and smoking was analysed in a group of adult patients.
The authors used a CNN to identify which of these factors are the most important in the
development of this type of cancer. The study divided the samples into four groups: (i)
men over 64 years, (ii) women over 64 years, (iii) all those over 64 years and (iv) all those
over 17 years. The four sets of data were then converted into Hierarchical Data Format
5 (HDF5), which is designed to store and organize large amounts of data and is used by
Caffe, a Deep learning framework, to import the data into their CNNs. After training the
model with a cross-validation, it achieved an AUC prediction value of 0.913 and, of all
the risk factors for lung cancer examined in those over 64 years of age, smoking was the
most important.
In Martínez-Mas et al. [64], the authors propose a novel method for the early detection
of cervical cancer, which has a high mortality rate in women. Frequently, automatic
classification of medical images relies on images pre-cleaned to remove overlaps,
which does not reflect the reality of the images obtained directly from the medical samples.
To overcome this issue, the authors implemented an artificial cell merger approach to
improve the efficiency and realism of the classification model using CNN and without
ruling out blurred, overlapping cells, etc. This approach showed a classification accuracy
of 88.8%, obtaining a sensitivity and specificity of 0.92 and 0.83, respectively.
3.2. Predicting Cancer Recurrence

Once the cancer is diagnosed, one of the main concerns is the possibility of recurrence
or metastasis. In this line, Exarchos et al. [71] used a data set comprising clinical, image and
genomic data to provide a multiparametric system to detect recurrence in squamous cell
carcinoma using BN, ANN, SVM, decision trees (DT) and RF classifier algorithms and ROC
curve assessments. The best results in terms of accuracy were obtained for the BN classifier
(78.6% accuracy for clinical data, 82.8% for images and 91.7% for genomic data). Kim
et al. [134] studied the recurrence of BC over 5 years using SVMs, ANNs and regression
analysis; in this case, the SVM model gave the best results in terms of accuracy (89%). In
the same study, it should be noted that selection of the characteristics of the models was
based on the mutual information provided by the input characteristics. In the same line
of detecting recurrent BC, Park et al. [70] used genetic information to create a graphical
model based on semi-supervised learning (SSL) through gene pairs that indicate strong
biological interactions, in this case for both breast and colon cancer. This graphic model
proved to be quite accurate in predicting the recurrence of breast and colon cancer (80.7%
and 76.7%, respectively). This SSL technique proved to be particularly useful when very few
labelled samples are available, which is a fairly common problem for this type of data set.
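The mutual-information criterion mentioned above for selecting model characteristics can be illustrated with a short sketch. It assumes discrete (categorical) features and labels; the function name and the toy variables are hypothetical, and the cited study may have used a different estimator.

```python
import math
from collections import Counter


def mutual_information(feature, labels):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(feature)
    joint = Counter(zip(feature, labels))
    count_x = Counter(feature)
    count_y = Counter(labels)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p_xy / (p_x * p_y), written with raw counts to limit rounding error
        mi += p_xy * math.log2(p_xy * n * n / (count_x[x] * count_y[y]))
    return mi


# Hypothetical toy data: recurrence labels and two categorical features.
recurrence = [0, 0, 1, 1]
informative = ["low", "low", "high", "high"]    # mirrors the label exactly
uninformative = ["low", "high", "low", "high"]  # independent of the label

mi_informative = mutual_information(informative, recurrence)
mi_uninformative = mutual_information(uninformative, recurrence)
```

Features would then be ranked by this score and only the highest-scoring ones passed to the classifier.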
In Ahmad et al. [68], three ML methods (DT, ANN and SVM) were compared for
predicting BC recurrence by analysing sensitivity, specificity and accuracy. The C4.5
algorithm was used for the DT. Accuracies of 0.936, 0.947 and 0.957, respectively, were obtained.
This work showed that SVM had the lowest error rate and the highest accuracy for pre-
dicting the recurrence of BC. In Tseng et al. [72], SVM, DT and ELM are used to predict
the recurrence of cervical cancer. Of these three methods, DT obtained the best results,
especially when using the C5.0 algorithm (92.44% accuracy). The following were analysed
in the study: Pathologic Stage, Pathologic T, Cell Type and RT Target Summary.
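The three measures used to compare classifiers in the studies above (sensitivity, specificity and accuracy) follow directly from the confusion matrix. The sketch below shows the standard definitions; the labels and predictions are invented for illustration and do not come from any of the cited works.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = recurrence)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def sensitivity(tp, fn):
    """Fraction of true recurrences that the model detects."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of non-recurrences that the model correctly rules out."""
    return tn / (tn + fp)


def accuracy(tp, tn, fp, fn):
    """Fraction of all cases classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical outcome labels and model predictions for ten patients.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
```

Reporting all three values matters because accuracy alone can look high on imbalanced data even when few recurrences are detected.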
Another way of approaching cancer prediction is through making individual predic-
tions for each patient. Ferroni et al. [69] studied this approach using SVM and Random
Optimization (RO) to predict BC in individual patients. In addition to prediction, the
model allowed patients with low and high risk of cancer progression to be differentiated.
The authors concluded that the use of ML algorithms (specifically SVM) with RO allows
the creation of an efficient model for personalising the prediction of BC and its recurrence.
Two studies by Lu et al. and Xu et al. [65,66] worked on the early identification of
CRC recurrence. In the first paper, several treatments were analysed and good results were
observed in patients who are sensitive to FOLFOX (5-FU, leucovorin and oxaliplatin). The
authors used ML algorithms (more specifically KNN, SVM, gradient boosting machines
(GBM), ANN, DT and RF) to identify the differences in genes between patients who respond
to FOLFOX and those who do not respond in cases of CRC recurrence. They concluded
that SVM and RF are the most effective ML methods for predicting FOLFOX response. In
the second paper, ML techniques (LR, DT, Light GBM, GBM) were likewise used to study the
impact of treatments once CRC had been detected. Light GBM and GBM were found to
be the most efficient for detecting the reappearance of CRC, and the factors that most
influenced the reappearance of tumours were chemotherapy, age, carcinoembryonic antigen
and anaesthesia time.
3.3. Predicting Cancer Progression

Tumours can change over time, getting bigger, becoming malignant or undergoing
metastasis [135] in an evolutive process that involves cancerous cells [136]. Tumours
evolve in different ways in different patients. The REVOLVER (Repeated EVOLution in
cancER) method [76] applies the so-called Transfer Learning (TL) approach to forecasting
cancer progression. While the standard procedure infers uncorrelated models for each
individual patient depicted by phylogenetic trees containing noisy data, REVOLVER uses
TL to correlate models obtained from different patients and identify similarities in those
tumours that evolve in a similar manner. The idea behind TL is to store the knowledge
obtained while solving one problem and to apply this knowledge, when possible, in the
resolution of a similar task. Thus, the knowledge extracted from one sample is transferred
to another. As input, REVOLVER uses a set of Cancer Cell Fractions (CCF) or any other
genetic alteration that can be represented in binary format. It then follows a two-step
process: (i) it calculates a set of correlated evolutionary trees, which are numerically scored,
describing the evolution of each patient’s tumour; and (ii) it computes the evolutionary
trajectories for each group of input alterations depicted in a tree that shows the number
of times an alteration occurs among other values. This method was used to analyse a
collection of datasets for lung, breast, renal and colorectal cancer based on 768 samples and
identified interesting genomic trajectories that were judged to merit further study (e.g.,
CDKNA → TP53 → TERT, TP53 → PIK3CA → −8p → +8q).
An alternative to TL for studying mutation timelines is the Long Short-Term Memory
(LSTM) network, a type of recurrent neural network (RNN) with the ability to
learn long-term dependencies from a sequence of events. LSTM takes advantage of the
temporal nature of mutation trajectories. With this type of algorithm, mutations can be
sorted by occurrence time to provide an explanation of tumour evolution [77]. The authors
trained an LSTM with 5 hidden layers aiming to predict the number of mutations present
in each tumour, the so-called mutational load. The model was trained on two datasets
containing CRC and lung cancer samples. In fewer than 100 epochs it reached an AUC of
0.95. It is also possible to predict the genes that are present in such mutations and identify
a set present in both types of cancer (e.g., titin, mucin-16, nesprin-1). Finally, the authors
reported that the last 20 mutations are highly correlated with the mutational load. To
validate their model, they implemented an SVM model that exhibited lower performance
than the LSTM, probably because the relationship between mutations is non-linear.
The state of a BC usually depends on several factors, such as tumour size and
cellularity. The presence of tumoral cells in the lymph nodes is the most reliable marker,
and the expression of the S100A4 and nm23 genes is the most effective predictor of lymph node status.
In order to investigate the predictive power of these genes and of tumour size and grade, a set
of 15 ANNs was trained on 16 BC samples and tested against another 16 [79]. The results
confirmed the expression of the S100A4 and nm23 genes as the most effective predictor and
suggested that the inclusion of other markers (e.g., ER/PgR expression) could improve the accuracy.
Simpler ML approaches, such as LR, can also help in predicting cancer progression [80].
The method works in the knowledge that Transforming Growth Factor beta (TGF-β) is
involved in the acquisition of heterogeneity by tumours [137]. This fact means that TGF-β
is responsible for promoting tumour evolution, thus, complicating cancer prognosis. The
activation of TGF-β signalling contributes to the acquisition of malignant properties by
head and neck squamous cell carcinoma (HNSCC). However, the effects of TGF-β on lipid
metabolism remain unclear. In this context, the authors aimed to develop an ML-based
algorithm to detect intratumoural TGF-β-stimulated areas in clinical HNSCC tissue without
recourse to a conventional immunohistological examination. For this purpose, Logistic
Regression of the mass spectra of HNSCC-stimulated and non-stimulated human cells was
carried out on the public datasets GSE57441 and GSE9844. The LR algorithm accurately
segregated stimulated and non-stimulated cells reaching a classification accuracy of up to
98%. This finding demonstrates that simple ML approaches, despite their limitations, can
also be helpful in predicting cancer progression.
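As an illustration of how simple such a classifier is, the following sketch fits a logistic regression by batch gradient descent using only the standard library. It is not the pipeline of [80]: the two-value "spectra", the learning rate and the number of epochs are hypothetical choices made so that the example is small and self-contained.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic_regression(rows, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = sigmoid(sum(w * v for w, v in zip(weights, x)) + bias) - y
            for j, v in enumerate(x):
                grad_w[j] += error * v
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, x):
    """Return 1 (stimulated) if the predicted probability is at least 0.5."""
    z = sum(w * v for w, v in zip(weights, x)) + bias
    return 1 if sigmoid(z) >= 0.5 else 0


# Hypothetical normalized intensities of two spectral peaks per sample.
spectra = [[0.1, 0.9], [0.2, 0.8], [0.3, 0.7],
           [0.8, 0.2], [0.9, 0.1], [0.7, 0.3]]
stimulated = [0, 0, 0, 1, 1, 1]  # 1 = TGF-beta-stimulated (toy labels)

weights, bias = fit_logistic_regression(spectra, stimulated)
predictions = [predict(weights, bias, x) for x in spectra]
```

The fitted weights can be read directly as the contribution of each input to the decision, which is one reason linear models of this kind are attractive when interpretability matters.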
Metastatic Skin Cutaneous Melanoma (SKCM) has been demonstrated to arise from
factors such as the expression of mRNAs and miRNAs and aberrations in methylation
patterns [138,139]. To understand how skin melanoma progresses a combination of feature
selection methods and ML classifiers has been used [81]. The data, including mRNA,
miRNA and methylation expressions from The Cancer Genome Atlas (TCGA) database,
were split into 80% for training and 20% for testing, giving training datasets of 371, 354 and
371 samples respectively. First, three feature selection methods, namely Weka-FCBF, SVM
with L1 regularization (SVM-L1) and Principal Component Analysis (PCA), were applied
to reduce the number of input features so that subsequent analysis could focus on the most
discriminative characteristics. In this step, SVM-L1 outperformed the other methods by
selecting the 17 features that were used in the next stage. The Jaccard index was calculated
to select the best method. Secondly, six classification models were developed, of which support
vector classification with weight (SVC-W) performed best, obtaining 0.95 AUC and 89.4%
accuracy in an external validation test. The other classifiers were ExtraTrees, KNN, RF, LR
and Ridge classifier. The models were assessed using different metrics, including AUC,
the Matthews coefficient, sensitivity, specificity and accuracy. As a conclusion, the authors
reported a collection of genes that could be considered relevant markers of cutaneous
melanoma metastasis (e.g., ESM1, NFATC3, C7orf4).
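Two of the measures named above, the Jaccard index and the Matthews coefficient, are less familiar than AUC or accuracy, so a minimal sketch of their standard definitions is given here. The source does not detail how the Jaccard index was applied; comparing the feature sets chosen by two selection methods is shown only as one plausible use, and the gene names and counts are invented.

```python
import math


def jaccard_index(set_a, set_b):
    """Size of the intersection divided by the size of the union."""
    a, b = set(set_a), set(set_b)
    if not a and not b:
        return 1.0  # two empty sets are treated as identical
    return len(a & b) / len(a | b)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # conventional value when a row or column is empty
    return (tp * tn - fp * fn) / denom


# Hypothetical feature sets returned by two selection methods.
selected_by_method_1 = {"g1", "g2", "g3", "g4"}
selected_by_method_2 = {"g3", "g4", "g5"}
overlap = jaccard_index(selected_by_method_1, selected_by_method_2)

# Hypothetical confusion-matrix counts for one classifier.
mcc = matthews_corrcoef(tp=3, tn=5, fp=1, fn=1)
```

Unlike accuracy, the Matthews coefficient stays informative when the two classes have very different sizes, which is common in metastasis datasets.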
3.4. Calculating Drug Doses or Drug Combinations

It is commonly accepted that the administration of drug combinations rather
than monotherapy can increase treatment efficacy [140]. This approach is nowadays
limited by the huge size of the chemical space that makes the identification of novel
drugs very difficult and, consequently, complicates the choice of effective drug combina-
tions. In order to perform a cost-effective screening of this chemical space, DL methods are
gaining in importance. For example, the DeepSynergy tool [89] aims to predict the most
efficacious anti-cancer multi-drug treatments by means of DL. DeepSynergy provides an
ANN, which is implemented with the modern TensorFlow framework and outperformed
other ML methods, such as GBM, RF, SVM and Elastic Nets, in a benchmark on the largest
synergy dataset. However, the performance of all these methods decreased when exploring
new datasets of different sizes and data distributions, a typical problem
of ML approaches that remains a challenge today. In the same line, Celebi [86] published
a study to identify functional anti-cancer dual therapies, an approach whereby two
single-target drugs work in synergy to cure a disease. The authors evaluated five
ML methods (LR, Lasso, SVM, RF and GBM) implemented with the sklearn and xgboost
Python libraries. All the models were trained on a novel dataset released by AstraZeneca
and the Dialogue for Reverse Engineering Assessments and Methods consortium [141]. The
assessment showed that GBM outperformed the other methods in synergy identification. It
is interesting to mention that the study included a variant of LR, the so-called Lasso [142],
which is a regularized version of LR that reduces overfitting in the model.
In addition to deciding on the drug combination to be administered, identifying the exact
dose is crucial for creating personalized cancer therapies. However, despite the importance
of these points, research into them lags behind work on estimating cancer risk or predicting therapy
outcome. EON software [85], a component-based decision support system (DSS) that was
developed to build healthcare protocols at a high level of abstraction, represented a first
attempt to use AI to build reusable software capable of helping doctors. Its modular design
makes it easy to add and replace components and the graphical interface means that it is
accessible to any user, even those lacking advanced computer skills. A major advantage
of EON is that, once designed, the protocols can be reused for any disease with minimal
adaptations; for example, different types of cancer or AIDS might share the same protocol.
With regard to drug dose estimation and the optimal application time, EON includes the
Chronus temporal query system, which implements a specific algebra for writing temporal
queries and can be extended with the Catenation operator. This operator is able to identify
adjacent periods and merge them into a single one, making it possible to know when and
for how long a patient was given a certain drug combination. This information, along with
the therapy outcomes for the same periods, can help analyse the effectiveness of a drug
synergy, providing useful information for future cases.
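The behaviour attributed to the Catenation operator above, merging adjacent periods into a single one, can be sketched as follows. This is not Chronus syntax or code: the function name, the representation of periods as inclusive day numbers and the decision to merge overlapping periods as well as adjacent ones are assumptions.

```python
def catenate(periods):
    """Merge periods that touch or overlap into single continuous periods.

    Each period is a (start_day, end_day) pair with inclusive integer days.
    """
    merged = []
    for start, end in sorted(periods):
        if merged and start <= merged[-1][1] + 1:
            # Adjacent to (or overlapping) the previous period: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical administration records for one drug combination.
records = [(1, 7), (8, 14), (20, 25), (15, 16)]
continuous = catenate(records)
total_days = sum(end - start + 1 for start, end in continuous)
```

With the merged periods available, the duration of each uninterrupted course can be set against the therapy outcome recorded for the same interval.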
A recently published work [143] summarizes the main advances of AI for treating
head and neck cancer patients. A key factor when planning treatments for this cancer is
the intensity modulated radiotherapy (IMRT) dose prediction. The manuscript describes
the way ANN [82,83], CNN [84,91,92] and tree-based methods [90] are currently applied to
resolve classification problems from a collection of images. The aim of this sort of protocol
is to identify the most effective dose for each patient. Tree-based methods try to mimic the
thinking of an expert clinician who, looking at a set of images of a new patient, identifies a
past patient with the most similar images and maps the dose distribution administered
to that past patient onto the new one in order to assess the optimal treatment.
To do this, a collection of features is extracted from the images to build a dataset
of structured data that can be handled by most ML algorithms. This approach reached
78.68% and 86.83% accuracy in breast and prostate cancer, respectively, when the Gamma
metric was used. The main drawback of tree-based algorithms that work in this way is that
their accuracy is closely coupled to their core steps: extracting descriptive features from
the source images, identifying a similar patient on the basis of such descriptive features
and adapting the past dose to the new patient. The alternatives to the tree-based methods
used in the above work are fully connected ANNs with two layers, which are easy to train
but which do not conserve memory and may suffer overfitting. Whatever the case, the
prediction error reported was lower than 10% [83]. Fortunately, CNNs are very good for
predicting volumetric information, the most suitable types being Tiramisu and Dilated
CNNs (DCNN). Tiramisu models work in two steps: (i) encoding the input image to extract
the most descriptive features; and (ii) decoding the information to restore it to the initial
size. When the dose volumes are consistent with respect to the anatomy (e.g., in prostate
cancer), Tiramisu models are the preferred option [84], otherwise (e.g., head and neck
cancer), DCNNs are preferable.
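The three core steps of the similarity-based approach described above (extract features, find a similar past patient, adapt the past dose) can be made concrete with a deliberately simplified sketch. A plain nearest-neighbour lookup with Euclidean distance stands in here for the tree-based similarity search of the cited work, and the patient records, feature values, dose values and scaling-based adaptation are all hypothetical.

```python
import math


def nearest_patient(new_features, past_patients):
    """Return the past patient whose image features are closest (Euclidean)."""
    return min(past_patients,
               key=lambda patient: math.dist(new_features, patient["features"]))


def adapt_dose(past_dose, scale=1.0):
    """Placeholder for the adaptation step: rescale each dose value."""
    return [value * scale for value in past_dose]


# Hypothetical database: features already extracted from the images (step 1).
past_patients = [
    {"id": "P1", "features": [0.20, 0.40], "dose": [60.0, 45.0]},
    {"id": "P2", "features": [0.80, 0.90], "dose": [70.0, 50.0]},
]
new_patient_features = [0.25, 0.50]

match = nearest_patient(new_patient_features, past_patients)  # step 2
proposed_dose = adapt_dose(match["dose"], scale=1.0)           # step 3
```

The sketch also makes the drawback noted in the text visible: an error in any one step (poor features, a wrong match or a crude adaptation) propagates directly to the proposed dose.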
Frequently, gene mutations are detected in cancer patients and discovering the re-
lationship between these genetic variations and drug responses has led to the ability to
identify which patients might profit most from certain drug synergies. However, the results
of clinical trials in their advanced stages must exhibit a significant improvement over stan-
dard therapy. Thus, clearly defining groups of patients in which a novel drug may be more
effective than the existing ones could help lead to individualised therapies and, as a conse-
quence, this has become a target of ML. An unsupervised learning approach based on mul-
tivariate analysis (MVA) of undirected graphs [87] was performed to classify patients into
well-defined subpopulations. The statistical methods were implemented with R packages
and the input datasets were collected from the GDSC (https://www.cancerrxgene.org/,
accessed on 29 March 2021), CCLE (https://portals.broadinstitute.org/ccle, accessed on 29
March 2021) and CTRP (https://portals.broadinstitute.org/ctrp/, accessed on 29 March
2021) databases. As a result of this work, the SEABED (Segmentation and Biomarker Enrichment
of Differential Treatment Response) platform was developed and used in several
examples, in one of which the authors aimed to assess the response to a combination of
drugs, namely A and B. To accomplish this, they segmented patients into subpopulations
depending on their response to the therapies, considering AUC and IC50 as metrics. They
also provided a graphical representation of the results in a tree whereby the identified
subpopulations were coloured depending on whether they were sensitive to both drugs, to A only, to B only or to
neither, which is important for facilitating interpretation of the results. Then, the authors
chose a BRAF and a MEK inhibitor and discovered that the subpopulation sensitive to
A was enriched for BRAF mutations and the one sensitive to B was enriched for MEK
mutations. This approach is generic enough to be used for the analysis of any type of
cancer sample, independently of its particular characteristics and can also be of great use
for predicting tumour progress.
As can be inferred from Table 4, image processing is a key procedure when estimating
drug doses and finding effective drug combinations. To satisfy the need for powerful image
processing algorithms, CNNs have shown themselves to be a powerful alternative to traditional ANNs.
In parallel, new frameworks (e.g., TensorFlow, PyTorch) have been developed to exploit all
the computing power of graphical processing units (GPUs) and accelerate image analysis.
When there are no images available or their inspection is not suitable, other statistical
methods and classifiers (e.g., LR, RF, MVA) can be fed with a diverse collection of data
types. Regarding interpretability of the results, this is not the main concern of scientists
according to Table 4. Very few of the works try to adapt the output of their models to make
it understandable by doctors or use easily interpretable models (e.g., DT, BN). Whatever
the case, the extensive use of image processing with CNN makes some models easier to
understand than raw numerical results.
3.5. Predicting Treatment Outcome

In the move towards personalized therapies, the prediction of therapy outcome is
essential. Although several works have used AI to estimate a tumour’s
evolution after therapy for colorectal [95,101], breast [102–104], blood [106], renal [107],
ovarian [108] or oesophageal [109] cancer, this topic remains a major challenge for scientists.
Classification, regression and clustering algorithms have frequently been used to
resolve this sort of issue. As an example of classification, a DT was implemented
to diagnose and predict therapy outcome for bladder cancer patients using the SPSS
statistical package [105]. The work showed how nearly 950 patients could be classified
into three groups with different recurrence-free and overall survival probabilities. DTs
have the advantage of being very intuitive and easy to interpret by medical doctors,
which is one of the main aims of health-related ML. A similar statistical analysis for
classification purposes was carried out with BN implemented with Number Cruncher
Statistical Systems (NCSS) on a dataset of CRC patients [96]. In this case, the positive
prediction rate ranged from 78 to 84 per cent when estimating recurrence for the training
dataset extracted from the ACTUR database. The main limitation of this work is data
reliability and consistency, owing to the military nature of some institutions feeding the data
source, which lack approved programs for cancer treatments. RF is another widely used
classification algorithm that has already been used to predict the response to FOLFOX
(5-FU, leucovorin and oxaliplatin) therapy [95]. The model was able to correctly predict 69.2%
of cases in the test set. The relationship between genomic alterations and drug responses is a
factor that could lead to enhanced individual therapies. Although both genomic features
and chemical properties have been computationally analysed, there is still a lack of works
studying both factors together. To shed some light on this topic, ANNs and RF were
used to predict therapy outcomes [97]. The core of this work was the implementation of
a three-layer ANN. The inputs were 608 cell lines and 111 drugs, between 1
and 30 hidden nodes were tested to find the best-performing architecture, and the predicted IC50
value was the only output. Note that the IC50 value is normalized in the range
[0,1] by the sigmoid function added in the output layer. Based on the R² performance
metric, the model obtained 0.64 on the test dataset extracted from the GDSC portal and 0.61
on an external validation dataset. Then, an RF implemented in R was developed to ascertain
whether the ANN model could be improved, but it resulted in an R² of 0.59 on the blind test
dataset, which is a slightly lower value than that achieved by the ANN model. Although
the results look promising, the model has some limitations that could be overcome by
adding more cell lines, epigenetics data and gene expression data as inputs. Classification
algorithms could also help in identifying potential biomarkers, another topic
that has received increasing attention in recent years. An R-implemented RF [99] for this
task achieved 81% accuracy in the validation dataset. A feature selection step was carried
out in this study before the classification. Reducing the dimension of the input makes the
classifier faster and facilitates interpretation of the results by clinicians.
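The sigmoid output and the coefficient of determination used above to evaluate the IC50 predictions can be written down in a few lines. The sketch uses the standard definitions only; the normalized IC50 values are invented and are not taken from [97].

```python
import math


def sigmoid(z):
    """Squash any real-valued network output into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical normalized IC50 values and model predictions.
measured = [0.2, 0.4, 0.6, 0.8]
predicted = [0.25, 0.35, 0.65, 0.75]
score = r_squared(measured, predicted)
```

A value of 1 would mean that the predictions reproduce the measured values exactly, whereas a value of 0 corresponds to always predicting the mean of the measured values.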
The diversity of classification and regression algorithms makes scientists wonder
about the best choice to build new models and benchmark their own. To fairly assess
some of the most typical classifiers, an extensive study was carried out [121] with a set of
algorithms. Six classifiers were evaluated on twelve datasets related to different cancers
(lung, head, neck, meningioma and laryngeal) using the AUC to estimate how well each
would perform on future data. Although none of the algorithms stood out over the others,
RF and Elastic Net Logistic Regression (ENLR) exhibited a higher discriminative power in
chemo and radiotherapy outcome. Therefore, it is suggested that they might be the first
choice when building classification models. The authors also claim that RF and ENLR
should be the preferred option against which custom models should be compared.
Many other supervised learning approaches can be found in the literature. Most of
the cases exploit datasets from the National Center for Biotechnology Information (NCBI)
or collected from local institutions. SVMs represent a method that is commonly adopted to
predict tumour progress after therapy and is especially helpful when predicting FOLFOX
therapy results in CRC patients because this type of algorithm usually works with images.
When working alone, SVM reached a positive prediction rate of 85.4% [98], which is similar
to that obtained by RF. However, SVM can also be combined with LR to provide a novel
scoring method to measure the tumour size response to therapy, as it outperforms the
traditional WHO and RECIST measurements [100].
Recent studies assessed a variety of ML methods in CRC prediction scenarios. Lu [65]
compared six models implemented with R packages in a FOLFOX response prediction task.
The models represented the following approaches: RF, SVM, ANN, DT, KNN and GBM.
The experimental tests showed that RF and SVM were the most accurate methods when
predicting FOLFOX outcome. Unfortunately, their performance fell off when predicting
other therapies such as FOLFIRI (5-FU, leucovorin and irinotecan), therefore, their appli-
cation to future patients is limited. The reason for this reduction in performance when
using alternative therapies seems to be related to the aforementioned use of unexplored
datasets with different characteristics, which would indicate a close relationship between
the model and the training data. The third best-ranked classifier was the ANN model,
whose accuracy was close to that of RF and SVM but was more consistent when confronted
with other therapies. This result demonstrates that ANNs constitute a powerful predictive
tool for future CRC studies. In another work [66] the authors assessed four ML methods
(LR, DT, GBM and Light GBM) and found GBM and Light GBM to be more accurate than
the others. This evidence suggests that GBM will probably gain in importance in the
near future. Finally, the rapid development of ANN and its variants (e.g., recurrent neural
networks, convolutional neural networks, adversarial neural networks) has encouraged
scientists to develop enhanced and more powerful networks capable of profiting from HPC
architectures. As a result of that evolution, several libraries (e.g., TensorFlow) are widely
used nowadays. Tensor-based networks are especially useful for image processing due to
their ability to exploit all the computing power of GPUs to analyse images in a parallel
manner. This novel ML paradigm has been used to build a CNN model that anticipates the
outcome after resection based on a dataset of 12 million images [94].
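Comparisons of this kind are usually run with k-fold cross-validation, in which every model is trained and tested on the same data partitions. The following Python sketch illustrates the idea under strong assumptions: the cohort is synthetic, and the two classifiers (nearest centroid and 1-nearest neighbour) are simple stand-ins chosen for brevity, not the models evaluated in [65] or [66].

```python
import random

def make_cohort(n, rng):
    """Synthetic two-feature cohort (illustrative only); label 1 = responder."""
    cohort = []
    for _ in range(n):
        y = rng.randint(0, 1)
        centre = 1.0 if y else -1.0
        cohort.append(((rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)), y))
    return cohort

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def nearest_centroid(train):
    """Stand-in classifier 1: predict the class whose mean vector is closest."""
    centroids = {}
    for label in {y for _, y in train}:
        pts = [x for x, y in train if y == label]
        centroids[label] = tuple(sum(c) / len(pts) for c in zip(*pts))
    return lambda x: min(centroids, key=lambda lab: sq_dist(x, centroids[lab]))

def one_nn(train):
    """Stand-in classifier 2: predict the label of the nearest training sample."""
    return lambda x: min(train, key=lambda s: sq_dist(x, s[0]))[1]

def cross_val_accuracy(fit, data, k=5):
    """Mean accuracy over k folds; each fold is held out once for testing."""
    scores = []
    for i in range(k):
        test = data[i::k]
        train = [s for j, s in enumerate(data) if j % k != i]
        predict = fit(train)
        scores.append(sum(predict(x) == y for x, y in test) / len(test))
    return sum(scores) / k

rng = random.Random(42)  # fixed seed so that the run is reproducible
data = make_cohort(200, rng)
results = {name: cross_val_accuracy(fit, data)
           for name, fit in (("nearest_centroid", nearest_centroid), ("1-NN", one_nn))}
```

Cross-validation only estimates performance on data drawn from the same distribution as the training set. Checking whether a model transfers to a different therapy, as discussed above for FOLFIRI, would additionally require an independent external cohort.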
Int. J. Mol. Sci. 2021, 22, 4394 19 of 31
The poor interpretability of the results is a challenge that needs to be faced. Raw
estimations or complicated charts might be unintelligible to doctors and may render any
ML algorithm worthless for practical reasons. The data types feeding ML systems intended
to predict therapy outcomes are very different, ranging from binary data to well-structured
records (e.g., Excel, CSV, database records). In this step, the application of image processing
through CNN is not so frequent, as explained in the previous section, but still constitutes
the preferred approach when manipulating images, as can be seen in Table 5.
3.6. Predicting Survival Likelihood

Once cancer has been diagnosed, classified and treated, the next questions are how
the tumour will evolve and how likely the patient is to survive. The former was already
answered in Section 3.3, so this section will focus on the available ML methods for the latter.
Note that the works introduced in Section 3.5 do not necessarily predict the survival chances
in months, for example, but are more likely to focus on how the treatment will reduce the
tumour size. The prognostication of a patient’s survivability is not easy and depends on
many factors, such as the type of cancer and the stage. Fortunately, ML can help doctors
evaluate survival chances by analysing several biomarkers in a systematic manner. With
the aim of answering this question, Zhu [144] summarizes an extensive collection of works
concerning the use of DL in cancer prognosis, including some that estimate the survival
likelihood and even the survival time.
According to a recent review [33], SVMs provide the most accurate predictions of
cancer survival. Although all the analysed studies are trained on small datasets, they are
able to reach up to 98% and 97% accuracy in oral [128] and breast [123] cancer, respectively.
Other approaches such as ANNs [72] and BNs [125] are showing good results as well,
attaining more than 83% accuracy and both are expected to gain in importance in coming
years. On the other hand, SSL, which requires only a few labelled samples, has emerged
as a feasible alternative to the classic supervised and unsupervised learning paradigms but,
as its results show (71% and 76% accuracy reported by [126] and [67]), the predictive capacity
of this approach still has to be improved. Nevertheless, another study on lung cancer that
used similar ML techniques yielded different results [130]. The authors evaluated linear
regression, DT, SVM, GBM and a custom ensemble, finding that GBM was the most accurate
model in terms of root mean square error (RMSE). All the models were implemented in
the R language and trained on the SEER database. In recent decades, cancer has been one of the
preferred fields for the assessment of ML models to predict survival likelihood. An analysis
of survivability in prostate cancer patients [133] was carried out using three non-linear
statistical methods: DT, BN and Cox [145]. This work represents a case study that aims to
demonstrate that ML classifiers are useful for estimating a patient’s survival chances, a
process that is receiving increasing attention from ML experts. The authors conclude that
ML statistical models could be helpful in the near future for predicting survival and other
issues such as the probability of recurrence in cancer patients.
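For reference, the RMSE mentioned above is the square root of the mean squared difference between the predicted and the observed values, so lower values indicate better predictions. A minimal Python sketch follows; the survival times are made-up placeholder numbers and do not come from [130] or any other cited study.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical survival times in months (illustrative values only).
observed_months = [12.0, 30.0, 8.0, 24.0]
predicted_months = [10.0, 33.0, 8.0, 20.0]
error = rmse(observed_months, predicted_months)  # sqrt((4 + 9 + 0 + 16) / 4)
```

Because the errors are squared before averaging, a single large miss weighs more heavily than several small ones, which should be kept in mind when models are ranked by this metric.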
The new wave of ML is dominated by ANNs and their subtypes such as convolutional,
recurrent or adversarial neural networks, among others. CRC can also profit from ANNs to
predict survival chances, especially when the input datasets are image collections and the
use of CNNs is advantageous. A recent work [131] described the training of a DL system,
built on convolutional and recurrent neural networks, to classify tumour images. Such
classifications of tumour images are a frequent way of predicting tumour evolution and,
consequently, evaluating survival chances. It is worth mentioning that the classifier used
by these authors ran on a GPU to accelerate the processing and deliver the results in a short
time. GPUs can speed up CNN calculations dramatically, which is a huge advantage due
to the large number of samples that CNNs usually deal with and the high number of layers
they have. Other cancer types also take advantage of CNNs and exploit GPU computing
power. Such is the case with brain cancer, for which patient survival can be
estimated by means of the recently published classifier DeepSurvNet [132]. DeepSurvNet
builds CNN models implemented with the Keras and TensorFlow libraries, which are trained
with a dataset from the TCGA Program [146]. The models classify the patients into four
groups, each with an estimated overall survival.
The use of ML approaches whose output can be graphically represented, such as BN,
DT and CNN, facilitates the interpretation of survival chances by healthcare professionals.
The easy interpretation of results should always be taken as a requirement when ML is
to be applied in a context outside computer sciences. It is also worth noting that medical
records extracted from public databases are a common input [147,148] when evaluating
survivability, which indicates that long-term well-structured data are the most useful data
source to predict survival chances.
4. Software and Datasets

In this section, we will summarize the most relevant technical details extracted from
the above-mentioned works, such as the software tools created, the availability of the
source code, the use of HPC platforms and the main features of the datasets. Figure 2
summarizes the approaches applied at every stage. It can be observed that ANN, LR and
SVM are the most common methods in cancer research. RF, BN, DT, KNN, GBM and CNN
are also used frequently but are not reported in all the tasks.
Figure 2. Graphical summary of ML methods being applied in cancer research tasks. Super indices in the central figure
represent the number of steps in which that approach is reported. No index means that the approach is reported in all
the tasks.
4.1. Software Tools

In Tables 1–6 we have enumerated a number of libraries and frameworks frequently
used for developing ML models. There is a clear trend to implement models in R and
Python languages. R is a good choice for rapid development due to the diverse collection
of packages it provides (e.g., caret, e1071, Bioconductor) and the many possibilities it
offers to create different models, including SVM, RF and DT, among others. Its
simplicity and flexibility therefore make it an attractive alternative for many scientists. The other
preferred option is Python and, in particular, frameworks like TensorFlow and PyTorch.
Tensor-based frameworks have gained in importance in recent years, supported by the
rapid development of GPUs, which are a very suitable hardware solution for tensor
calculations. While R implementations have been reported for many years,
publications describing work with Python-based frameworks tend to date from 2017 onwards. This
supports the intuition that the development of GPUs and, more generally, HPC will be
closely connected with the advances achieved in the performance of ML algorithms in the
near future.
Several statistical tools are used less frequently. This group is composed of tools
such as Matlab, SPSS, Caffe and Weka. Although they are not as powerful as
programming languages, they offer many statistical features that allow
the rapid development of models, including LR, SVM, ANN and BN. Furthermore, these
tools are well established in the academic world, so many scientists are
familiar with them and their reliability has been extensively proved.
Despite being a well-known and very stable language, Java is barely used in this
context. Only the Encog and MLlib libraries are reported in the works. There may be many
reasons to explain this, but the main ones are probably that Java is usually considered
slower than other languages and that the users do not have the programming skills required
by this language.
Few authors share the source code of their models with the community (see Table 7).
Sometimes they prefer to develop and release a novel tool providing the obtained models
through a web interface [81,89]. While this is an understandable decision, it hinders the
understanding of the models by external users. However, other researchers freely share their
code, usually on GitHub, and allow others to study and analyse how the models were developed.
From an objective point of view, this is the preferred solution because it allows existing
code to be better understood, improved and optimized, and new models to be developed
from a solid base.
Table 7. List of code repositories or servers listed in the manuscript.
Task                      Ref.    Code Availability 1
Predict cancer risk       [60]    https://github.com/hambeh/breast-cancer-risk-prediction
Predict cancer risk       [56]    https://github.com/fcproj/BIGBIOCL
Predict progression       [77]    https://github.com/noamaus/LSTM-Mutational-series
Predict progression       [81]    https://webs.iiitd.edu.in/raghava/cancerspp/
Predict recurrence        [134]   http://ami.ajou.ac.kr/bcr/
Estimate drug synergy     [85]    https://protege.stanford.edu/
Estimate drug synergy     [86]    https://github.com/rcelebi/dream-drugcombo; https://www.synapse.org/#!Synapse:syn5605365/wiki/394725
Estimate drug synergy     [87]    https://github.com/szen95/SEABED
Estimate drug synergy     [89]    http://www.bioinf.jku.at/software/DeepSynergy/
Estimate drug synergy     [91]    https://github.com/tensorflow/models/tree/master/research/deeplab
Estimate drug synergy     [92]    http://liangchiehchen.com/projects/DeepLab.html
Predict therapy outcome   [121]   https://github.com/timodeist/classifier_selection_code
Predict survival          [126]   http://embio.yonsei.ac.kr/Park/ssl.php
1 Access date: 29 March 2021.
4.2. HPC Infrastructures

While HPC platforms are rarely reported in the analyzed papers, the use of GPUs
has increasingly been mentioned in recent years (e.g., Bychkov et al., 2018; Zadeh Shirazi
et al., 2020). The recent development of tensor-based frameworks and libraries for ML,
e.g., TensorFlow, Keras, PyTorch, has promoted the use of GPUs for programming ML
algorithms [66,84,89,91,94,129,131,132]. The rapid integration of GPU computing in ML
strongly suggests that faster ML algorithms will emerge in coming years, resulting in the
ability to handle even larger training datasets.
Please note that, although references to other HPC paradigms have not been found
in this review, it is quite possible that other authors have leveraged HPC platforms (e.g.,
parallel computing) in their works.
4.3. Datasets
We can broadly classify the input datasets into two major groups: (i) those obtained
from publicly available databases; and (ii) those collected from institutions (e.g., hospitals
or universities). Although both online and custom approaches are valid, public datasets
facilitate the reproducibility of the experiments. SEER and TCGA databases are typically
used in cancer research.
Leaving their source aside, we have focused on two properties of the datasets: the
data types they contain and the size of the training dataset. The data types vary widely
between works, including text, images, medical records and binary data.
Numerical values are the preferred option for feeding ML algorithms because they mostly
work on numerical calculations. As can be observed in Tables 1–6, when public or private
institutions are responsible for collecting data, they usually work with numerical data. In
addition, text inputs are widely used, probably because they can be easily translated into
numerical values. Images are typically used to feed CNNs due to the ability of this type of
network to apply sequential filters on images and extract patterns. This is also a frequent
option because many hospitals and universities have easy access to historical images from
scans, tomography or mammography.
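As an illustration of how text inputs can be translated into numerical values, the following Python sketch applies one-hot encoding, a common choice for categorical fields. The record layout, field names and values are hypothetical and are not taken from any of the reviewed datasets.

```python
def one_hot_encode(records, field):
    """Replace a categorical text field by one 0/1 column per category."""
    categories = sorted({rec[field] for rec in records})
    encoded = []
    for rec in records:
        # Copy every other field unchanged.
        row = {k: v for k, v in rec.items() if k != field}
        # Add one indicator column per observed category.
        for cat in categories:
            row[f"{field}={cat}"] = 1 if rec[field] == cat else 0
        encoded.append(row)
    return encoded

# Hypothetical patient records (field names and values are illustrative).
records = [
    {"age": 61, "stage": "II"},
    {"age": 54, "stage": "III"},
    {"age": 70, "stage": "II"},
]
encoded = one_hot_encode(records, "stage")
# encoded[0] is {"age": 61, "stage=II": 1, "stage=III": 0}
```

One-hot encoding avoids imposing an artificial order on the categories, at the cost of adding one column per distinct value. Free text, as opposed to categorical fields, generally needs richer representations.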
Structured information is more suitable for ML than unstructured. Usually, ML
algorithms receive a set of well-defined inputs, which they evaluate and weight to make
predictions. Therefore, when the inputs are clearly defined, the models can be easily
developed (e.g., BN, ANN, LR, DT). This is the case for databases such as ACTUR, SEER
and TCGA, and for other datasets in which the fields are clearly separated.
The second key feature of the analysed datasets is their size, which ranges from tens
to millions of samples. In general terms, neural networks (ANN, CNN and RNN) handle
the largest datasets (e.g., 200,000, 463,080, 235,673 and 12 × 10^6 samples). Although a large
number of samples may seem an advantage, their sheer numbers can slow the system
down during training. Thus, finding a good balance between dataset size and learning
capability is required. By contrast, the simplest approaches seem to need less data to
learn, as can be inferred from the fact that the smallest datasets (fewer than 100 training
samples) are used by traditional methods such as RF and SVM. Figure 3 shows the reported
dataset size used in ML algorithms.
Figure 3. Reported dataset size by algorithm.
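The balance between dataset size and learning capability can be explored with a learning curve, in which the same model is trained on increasingly large subsets and both its accuracy and its training cost are recorded. The Python sketch below is only an illustration under stated assumptions: the data are synthetic, the model is a one-feature logistic regression trained by batch gradient descent, and the number of per-sample gradient evaluations is used as a deterministic proxy for training time.

```python
import math
import random

def make_data(n, rng):
    """Synthetic two-class data with one noisy feature (illustrative only)."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        x = rng.gauss(1.0 if label else -1.0, 1.0)
        data.append((x, label))
    return data

def train_logistic(data, epochs=50, lr=0.5):
    """Batch gradient descent; returns (weight, bias, gradient evaluations)."""
    w, b, work = 0.0, 0.0, 0
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))  # predicted probability
            grad_w += (p - y) * x
            grad_b += p - y
            work += 1  # one per-sample gradient evaluation
        w -= lr * grad_w / len(data)
        b -= lr * grad_b / len(data)
    return w, b, work

def accuracy(w, b, data):
    """Fraction of samples whose predicted class matches the label."""
    hits = sum((w * x + b > 0) == bool(y) for x, y in data)
    return hits / len(data)

rng = random.Random(0)  # fixed seed so that the run is reproducible
test_set = make_data(500, rng)
curve = []  # (training size, training cost, test accuracy)
for n in (10, 100, 1000):
    w, b, work = train_logistic(make_data(n, rng))
    curve.append((n, work, accuracy(w, b, test_set)))
```

In this setting the training cost grows linearly with the number of samples, whereas the accuracy is bounded by the overlap between the two classes. Plotting both columns of `curve` against the training size is one way of locating the point beyond which additional samples mainly add cost.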
5. Conclusions and Outlook

Decision-making is one of the main challenges in modern medicine and is exercised
at every stage of a disease’s lifecycle, from diagnosis to the prediction of recurrence.
Traditionally, doctors have trusted their experience to choose the best option for individual
patients. However, they cannot be expected to recall all the details of all the patients
they have treated in the past, which clouds their ability to recognize patterns in similar
situations. This is where computational help is required.
In recent years, AI and, more specifically, machine and deep learning, have been
applied to medical decision-making. In this context, anti-cancer medicine has been found to be a
favourite playground due to the high mortality rate of the disease, the increasing number of
cases expected in the forthcoming years and the vast amount of data available in databases
of hospitals, universities and research centres. The diversity of existing cancer types
encourages experiments with different ML algorithms aimed at the same target. In this
review, we have analysed the general use of ML in cancer research, always bearing
CRC in mind.
CRC is the fourth leading cause of cancer mortality worldwide and the number of cases
expected to appear in the next decade is not encouraging, making it a suitable target
for ML. Any ML algorithm can be applied to CRC research, ranging from the simplest (e.g.,
LR, SVM, KNN) to the most complex ones (e.g., CNN, DNN), but it has been observed that
ANN, LR and SVM are frequently reported in all the tasks related to decision-making (risk
prediction, recurrence prediction, tumour progression, estimation of drug synergy, therapy
outcomes and survival time estimates). Moreover, RF, GBM, BN, DT, KNN and CNN are
applied in many cases.
There is no clear relationship between the selected approach and the type of data
feeding the system. However, CNN is clearly the preferred option when manipulating
medical images. It is clear that well-structured records with text or numerical fields are
the simplest and favourite options when available. The dataset size is another key factor
when training ML or DL systems. If the dataset is too small, the ML system will face
difficulties related to learning and generalizing, whereas excessively large datasets may
slow down the training phase. Thus, finding the optimal dataset size remains a challenge.
As regards performance in terms of computing time, this key concern for scientists has
resulted in the emergence of libraries and frameworks specifically focused on profiting
from HPC facilities, such as GPUs. GPUs are the preferred architecture for running CNN
calculations, and NVIDIA has placed its bet on this technology, becoming the world’s
leading manufacturer.
Interpretability has been identified as the third key point to worry about, although it
is no less important than the typical accuracy and performance metrics. The importance
of interpretability stems from the fact that ML is increasingly used in a medical context,
where users are often inexperienced in interpreting AI metrics and results. Consequently,
output must be translated into a language that physicians can understand. We found
that interpretability is still barely considered in most of the works analysed,
suggesting that it is a factor that can be improved in order to “democratize” AI in many
other areas. To improve the interpretability of systems, feature selection methods are
sometimes applied before classification. This technique helps to reduce the input size
leading to faster classification and providing a more interpretable output. Some ML
algorithms such as BN and DT are especially appropriate for this purpose because they
return labelled directed graphs which are very easy to read and interpret.
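The combination of feature selection and a readable tree-like model can be sketched in a few lines of Python. This is a toy illustration and not the method of any reviewed work: the data and feature names are hypothetical, the filter score (gap between class means scaled by the feature range) is a deliberately simple stand-in for the feature selection methods used in practice, and the model is a one-level decision tree (a decision stump).

```python
def filter_score(rows, labels, j):
    """Filter criterion: gap between class means of feature j, scaled by its range."""
    column = [r[j] for r in rows]
    pos = [v for v, y in zip(column, labels) if y == 1]
    neg = [v for v, y in zip(column, labels) if y == 0]
    spread = (max(column) - min(column)) or 1.0  # avoid division by zero
    return abs(sum(pos) / len(pos) - sum(neg) / len(neg)) / spread

def fit_stump(rows, labels, j):
    """One-level decision tree on feature j: pick the threshold with fewest errors."""
    values = sorted({r[j] for r in rows})
    if len(values) < 2:
        raise ValueError("feature is constant; no split is possible")
    best = None
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2  # candidate threshold between two observed values
        errors = sum((r[j] > t) != bool(y) for r, y in zip(rows, labels))
        # Consider both orientations of the rule.
        for above_is_positive, err in ((True, errors), (False, len(rows) - errors)):
            if best is None or err < best[0]:
                best = (err, t, above_is_positive)
    return best[1], best[2]

# Hypothetical toy data: columns follow feature_names; label 1 = responder.
feature_names = ["biomarker_a", "biomarker_b", "age"]
rows = [[0.2, 5.1, 60], [0.3, 4.8, 71], [0.9, 5.0, 65],
        [1.1, 5.2, 58], [1.0, 4.9, 69], [0.1, 5.0, 63]]
labels = [0, 0, 1, 1, 1, 0]

# Step 1: feature selection keeps the single best-scoring feature.
best_feature = max(range(len(feature_names)),
                   key=lambda j: filter_score(rows, labels, j))
# Step 2: fit the stump on that feature and express it as a readable rule.
threshold, above_is_positive = fit_stump(rows, labels, best_feature)
rule = (f"IF {feature_names[best_feature]} > {threshold:.2f} "
        f"THEN {'responder' if above_is_positive else 'non-responder'} "
        f"ELSE {'non-responder' if above_is_positive else 'responder'}")
```

On this toy data the resulting rule reads "IF biomarker_a > 0.60 THEN responder ELSE non-responder". A statement of this form can be checked by a physician against clinical knowledge, which is the property that makes DT-like models attractive in this context.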
In short, we predict a bright future for ML and DL in medical decision-making, but
the results must be more explainable in this or any other context. Identifying the optimal
training dataset size is another factor that deserves further study. Fortunately, the rapid
development of HPC will make ML systems more efficient and enable them to transform
the overwhelming quantity of historical data stored in public and private databases into
real, reliable and valuable knowledge.
Author Contributions: All the authors contributed equally to this work. All authors have read and
agreed to the published version of the manuscript.
Funding: This work has been funded by grants from the European Project Horizon 2020 SC1-BHC-
02-2019 [REVERT, ID:848098]; Fundación Séneca del Centro de Coordinación de la Investigación de
la Región de Murcia [Project 20988/PI/18]; and Spanish Ministry of Economy and Competitiveness
[CTQ2017-87974-R].
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: No new data were created or analyzed in this study. Data sharing is
not applicable to this article.
Acknowledgments: This work has been funded by grants from the European Project Horizon
2020 SC1-BHC-02-2019 [REVERT, ID:848098]; Fundación Séneca del Centro de Coordinación de la
Investigación de la Región de Murcia [Project 20988/PI/18]; and Spanish Ministry of Economy and
Competitiveness [CTQ2017-87974-R]. Powered@NLHPC: This research was partially supported by
the supercomputing infrastructure of the NLHPC (ECM-02).
Conflicts of Interest: The authors declare no conflict of interest.
Int. J. Mol. Sci. 2021, 22, 4394 25 of 31
Abbreviations
1CM One-carbon metabolism

AI Artificial Intelligence
ANN Artificial Neural Network
AUC Area Under the Curve
BC Breast Cancer
BioBIM InterInstitutional Multidisciplinary Biobank
BMI Body Mass Index
BN Bayesian Network
CCF Cancer Cell Fraction
CNN Convolutional Neural Network
CRC Colorectal Cancer
DCNN Dilated Convolutional Neural Network
DL Deep Learning
DSS Decision Support System
DT Decision Tree
ELM Extreme Learning Machine
EMR Electronic Medical Record
ENLR Elastic Net Logistic Regression
FOLFIRI 5-FU leucovorin and irinotecan
FOLFOX 5-FU leucovorin and oxaliplatin
FT Fourier Transform
GBM Gradient Boosting Machine
GEO Gene Expression Omnibus
GOSS Genetic Ontology Similarity Score
GPU Graphics Processing Unit
HDF5 Hierarchical Data Format 5
HNSCC Head and Neck Squamous Cell Carcinoma
HPC High Performance Computing
ICBC Iranian Centre for Breast Cancer
IMRT Intensity Modulated Radiotherapy
KNN K-Nearest Neighbours
LDA Linear Discriminant Analysis
LPP Locality Preserving Projection
LR Logistic Regression
LSTM Long Short-Term Memory
ML Machine Learning
MVA Multivariate analysis
NCBI National Center for Biotechnology Information
NCSS Number Cruncher Statistical Systems
NMSC Non-Melanoma Skin Cancer
PCA Principal Component Analysis
RECIST Response Evaluation Criteria In Solid Tumors
REVOLVER Repeated EVOLution in cancER
RF Random Forest
RMSE Root Mean Square Error
RNN Recurrent Neural Network
RO Random Optimization
ROC Receiver Operating Characteristic
SAP Single Amino Acid Polymorphism
SEABED Segmentation and Biomarker Enrichment of Differential Treatment Response
SEER Surveillance Epidemiology and End Results
SIFT Sorting Intolerant From Tolerant
SKCM Skin Cutaneous Melanoma
SNP Single Nucleotide Polymorphism
SSL Semi-Supervised Learning

SVC-W Support Vector Classification with Weight
SVM Support Vector Machine
SVM-L1 Support Vector Machine with L1 Regularization
TCGA The Cancer Genome Atlas
TGF-β Transforming Growth Factor beta
TL Transfer Learning
WEKA-FCBF Waikato Environment of Knowledge Analysis—Fast Correlation Based Filter
WHO World Health Organization
XAI Explainable Artificial Intelligence
YARN Yet Another Resource Negotiator
References
1. Cronin, K.A.; Lake, A.J.; Scott, S.; Sherman, R.L.; Noone, A.M.; Howlader, N.; Henley, S.J.; Anderson, R.N.; Firth, A.U.; Ma, J.;
et al. Annual report to the nation on the status of cancer, part I: National cancer statistics. Cancer 2018, 124, 2785–2800. [CrossRef]
2. Culp, M.B.B.; Soerjomataram, I.; Efstathiou, J.A.; Bray, F.; Jemal, A. Recent global patterns in prostate cancer incidence and
mortality rates. Eur. Urol. 2020, 77, 38–52. [CrossRef]
3. Ferlay, J.; Soerjomataram, I.; Dikshit, R.; Eser, S.; Mathers, C.; Rebelo, M.; Parkin, D.M.; Forman, D.; Bray, F. Cancer incidence and
mortality worldwide: Sources, methods and major patterns in GLOBOCAN 2012. Int. J. Cancer 2015, 136, E359–E386. [CrossRef]
4. Chiavenna, S.; Jaworski, J.P.; Vendrell, A. State of the art in anti-cancer mAbs. J. Biomed. Sci. 2017, 24. [CrossRef]
5. Loud, J.T.; Murphy, J. Cancer screening and early detection in the 21st century. Semin. Oncol. Nurs. 2017, 33, 121–128. [CrossRef]
6. Coleman, C. Early detection and screening for breast cancer. Semin. Oncol. Nurs. 2017, 33, 141–155. [CrossRef]
7. Araghi, M.; Soerjomataram, I.; Jenkins, M.; Brierley, J.; Morris, E.; Bray, F.; Arnold, M. Global trends in colorectal cancer mortality:
Projections to the year 2035. Int. J. Cancer 2019, 144, 2992–3000. [CrossRef] [PubMed]
8. Dekker, E.; Tanis, P.J.; Vleugels, J.L.A.; Kasi, P.M.; Wallace, M.B. Colorectal cancer. Lancet 2019, 394, 1467–1480. [CrossRef]
9. Arnold, M.; Sierra, M.S.; Laversanne, M.; Soerjomataram, I.; Jemal, A.; Bray, F. Global patterns and trends in colorectal cancer
incidence and mortality. Gut 2017, 66, 683–691. [CrossRef] [PubMed]
10. Kuipers, E.J.; Grady, W.M.; Lieberman, D.; Seufferlein, T.; Sung, J.J.; Boelens, P.G.; Van De Velde, C.J.H.; Watanabe, T. Colorectal
cancer. Nat. Rev. Dis. Prim. 2015, 1. [CrossRef] [PubMed]
11. Weinberg, B.A.; Marshall, J.L.; Salem, M.E. The growing challenge of young adults with colorectal cancer. Oncology 2017, 31,
381–389.
12. García-Figueiras, R.; Baleato-González, S.; Padhani, A.R.; Luna-Alcalá, A.; Marhuenda, A.; Vilanova, J.C.; Osorio-Vázquez, I.;
Martínez-de-Alegría, A.; Gómez-Caamaño, A. Advanced imaging techniques in evaluation of colorectal cancer. Radiographics
2018, 38, 740–765. [CrossRef] [PubMed]
13. Valle, L.; Vilar, E.; Tavtigian, S.V.; Stoffel, E.M. Genetic predisposition to colorectal cancer: Syndromes, genes, classification of
genetic variants and implications for precision medicine. J. Pathol. 2019, 247, 574–588. [CrossRef]
14. Huang, D.; Sun, W.; Zhou, Y.; Li, P.; Chen, F.; Chen, H.; Xia, D.; Xu, E.; Lai, M.; Wu, Y.; et al. Mutations of key driver genes in
colorectal cancer progression and metastasis. Cancer Metastasis Rev. 2018, 37, 173–187. [CrossRef] [PubMed]
15. Oh, M.; McBride, A.; Yun, S.; Bhattacharjee, S.; Slack, M.; Martin, J.R.; Jeter, J.; Abraham, I. BRCA1 and BRCA2 gene mutations
and colorectal cancer risk: Systematic review and meta-analysis. J. Natl. Cancer Inst. 2018, 110, 1178–1189. [CrossRef]
16. Ruiz-López, L.; Blancas, I.; Garrido, J.M.; Mut-Salud, N.; Moya-Jódar, M.; Osuna, A.; Rodríguez-Serrano, F. The role of exosomes
on colorectal cancer: A review. J. Gastroenterol. Hepatol. 2018, 33, 792–799. [CrossRef]
17. Yiu, A.J.; Yiu, C.Y. Biomarkers in colorectal cancer. Anticancer Res. 2016, 36, 1093–1102.
18. Lech, G.; Słotwiński, R.; Słodkowski, M.; Krasnodębski, I.W. Colorectal cancer tumour markers and biomarkers: Recent
therapeutic advances. World J. Gastroenterol. 2016, 22, 1745–1755. [CrossRef] [PubMed]
19. Ding, D.; Han, S.; Zhang, H.; He, Y.; Li, Y. Predictive biomarkers of colorectal cancer. Comput. Biol. Chem. 2019, 83. [CrossRef]
[PubMed]
20. Kather, J.N.; Halama, N.; Jaeger, D. Genomics and emerging biomarkers for immunotherapy of colorectal cancer. Semin. Cancer
Biol. 2018, 52, 189–197. [CrossRef] [PubMed]
21. Jain, K.K. Personalised medicine for cancer: From drug development into clinical practice. Expert Opin. Pharmacother. 2005, 6,
1463–1476. [CrossRef]
22. Jackson, S.E.; Chester, J.D. Personalised cancer medicine. Int. J. Cancer 2015, 137, 262–266. [CrossRef]
23. Usher-Smith, J.A.; Silarova, B.; Lophatananon, A.; Duschinsky, R.; Campbell, J.; Warcaba, J.; Muir, K. Responses to provision of
personalised cancer risk information: A qualitative interview study with members of the public. BMC Public Health 2017, 17.
[CrossRef]
24. Olin, R.L. Delivering intensive therapies to older adults with hematologic malignancies: Strategies to personalize care. Blood 2019,
134, 2013–2021. [CrossRef] [PubMed]
25. Upton, A.; Trelles, O.; Cornejo-García, J.A.; Perkins, J.R. Review: High-performance computing to detect epistasis in genome
scale data sets. Brief. Bioinform. 2016, 17, 368–379. [CrossRef] [PubMed]
26. Schmidt, B.; Hildebrandt, A. Next-generation sequencing: Big data meets high performance computing. Drug Discov. Today 2017,
27. Chen, S.; He, Z.; Han, X.; He, X.; Li, R.; Zhu, H.; Zhao, D.; Dai, C.; Zhang, Y.; Lu, Z.; et al. How big data and high-performance
computing drive brain science. Genom. Proteom. Bioinf. 2019, 17, 381–392. [CrossRef]
28. Wang, H.; Ma, Y.; Pratx, G.; Xing, L. Toward real-time Monte Carlo simulation using a commercial cloud computing infrastructure.
Phys. Med. Biol. 2011, 56, N175–N181. [CrossRef]
29. Garg, V.; Arora, S.; Gupta, C. Cloud computing approaches to accelerate drug discovery value chain. Comb. Chem. High Throughput
Screen. 2011, 14, 861–871. [CrossRef]
30. Nobile, M.S.; Cazzaniga, P.; Tangherloni, A.; Besozzi, D. Graphics processing units in bioinformatics, computational biology and
systems biology. Brief. Bioinf. 2017, 18, 870–885. [CrossRef]
31. Dilsizian, S.E.; Siegel, E.L. Artificial intelligence in medicine and cardiac imaging: Harnessing big data and advanced computing
to provide personalized medical diagnosis and treatment. Curr. Cardiol. Rep. 2014, 16. [CrossRef]
32. Pérez-Sianes, J.; Pérez-Sánchez, H.; Díaz, F. Virtual Screening Meets Deep Learning. Curr. Comput. Aided. Drug Des. 2019, 15, 6–28.
[CrossRef]
33. Kourou, K.; Exarchos, T.P.; Exarchos, K.P.; Karamouzis, M.V.; Fotiadis, D.I. Machine learning applications in cancer prognosis and
prediction. Comput. Struct. Biotechnol. J. 2015, 13, 8–17. [CrossRef]
34. Lipton, Z.C. The Mythos of Model Interpretability. arXiv 2018, arXiv:1606.03490. [CrossRef]
35. Jacovi, A.; Sar Shalom, O.; Goldberg, Y. Understanding convolutional neural networks for text classification. In Proceedings
of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP; Association for Computational
Linguistics: Brussels, Belgium, 2018; pp. 56–65.
36. Swartout, W.; Paris, C.; Moore, J. Explanations in knowledge systems: Design for explainable expert systems. IEEE Exp. 1991, 6,
54–58. [CrossRef]
37. Johnson, W.L. Agents that learn to explain themselves. In Proceedings of the 12th National Conference on Artificial Intelligence
(AAAI’ 94), Seattle, WA, USA, 31 July– 4 August 1994; pp. 1257–1263.
38. Lacave, C.; Díez, F.J. A review of explanation methods for Bayesian networks. Knowl. Eng. Rev. 2002, 17, 107–127. [CrossRef]
39. Holzinger, A.; Langs, G.; Denk, H.; Zatloukal, K.; Müller, H. Causability and explainability of artificial intelligence in medicine.
Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2019, 9, e1312. [CrossRef] [PubMed]
40. Holzinger, A.; Dehmer, M.; Jurisica, I. Knowledge discovery and interactive data mining in bioinformatics—State-of-the-art,
future challenges and research directions. BMC Bioinf. 2014, 15, I1. [CrossRef] [PubMed]
41. Lee, S.; Holzinger, A. Knowledge discovery from complex high dimensional data. In Solving Large Scale Learning Tasks. Challenges
and Algorithms, Lecture Notes in Artificial Intelligence; Michaelis, S., Piatkowski, N., Eds.; Springer: Heidelberg, Germany, 2016;
pp. 148–167.
42. Gunning, D. Explainable artificial intelligence (XAI). AI Mag. 2019, 40, 44–58. [CrossRef]
43. Mikolov, T.; Chen, K.; Corrado, G.; Dean, J. Efficient estimation of word representations in vector space. arXiv 2013,
arXiv:1301.3781.
44. Schlegl, T.; Seeböck, P.; Waldstein, S.M.; Schmidt-Erfurth, U.; Langs, G. Unsupervised anomaly detection with generative
adversarial networks to guide marker discovery. In Proceedings of the International Conference on Information Processing in
Medical Imaging, Boone, NC, USA, 25–30 June 2017; pp. 146–157.
45. Bærøe, K.; Miyata-Sturm, A.; Henden, E. How to achieve trustworthy artificial intelligence for health. Bull. World Health Organ.
2020, 98, 257–262. [CrossRef]
46. Sundararajan, M.; Taly, A.; Yan, Q. Axiomatic attribution for deep networks. In Proceedings of the 34th International Conference
on Machine Learning, Sydney, Australia, 6–11 August 2017; pp. 3319–3328.
47. Oakden-Rayner, L.; Palmer, L.J. Artificial intelligence in medicine: Validation and study design: Opportunities, applications and
risks. In Artificial Intelligence in Medical Imaging; Springer: Cham, Switzerland, 2019; pp. 83–104.
48. Hermon, R.; Williams, P.A. Big data in healthcare: What is it used for? In Proceedings of the Australian Ehealth Informatics and
Security Conference, Perth, Australia, 1–3 December 2014; pp. 40–49.
49. Archenaa, J.; Anita, E.M. A survey of big data analytics in healthcare and government. Procedia Comput. Sci. 2015, 50, 408–413.
[CrossRef]
50. Ristevski, B.; Chen, M. Big Data Analytics in Medicine and Healthcare. J. Integr. Bioinform. 2018, 15, 20170030. [CrossRef]
[PubMed]
51. Sun, H.; Liu, Z.; Wang, G.; Lian, W.; Ma, J. Intelligent analysis of medical big data based on deep learning. IEEE Access 2019, 7,
142022–142037. [CrossRef]
52. Hassan, A.K.; Hassan, Y.F.; Kholief, M.H. A deep classification system for medical big data analysis. J. Med. Imag. Health Inf. 2018,
8, 250–256.
53. Chen, S.; Wu, S. Identifying lung cancer risk factors in the elderly using deep neural networks: Quantitative analysis of web-based
survey data. J. Med. Internet Res. 2020, 22, e17695. [CrossRef]
54. Kaminker, J.S.; Zhang, Y.; Watanabe, C.; Zhang, Z. CanPredict: A computational tool for predicting cancer-associated missense
mutations. Nucleic Acids Res. 2007, 35. [CrossRef] [PubMed]
55. Capriotti, E.; Altman, R.B. A new disease-specific machine learning approach for the prediction of cancer-causing missense
variants. Genomics 2011, 98, 310–317. [CrossRef]
56. Celli, F.; Cumbo, F.; Weitschek, E. Classification of large DNA methylation datasets for identifying cancer drivers. Big Data Res.
2018, 13, 21–28. [CrossRef]
57. Myte, R.; Gylling, B.; Häggström, J.; Schneede, J.; Magne Ueland, P.; Hallmans, G.; Johansson, I.; Palmqvist, R.; Van Guelpen, B.
Untangling the role of one-carbon metabolism in colorectal cancer risk: A comprehensive Bayesian network analysis. Sci. Rep.
2017, 7. [CrossRef]
58. Ayer, T.; Alagoz, O.; Chhatwal, J.; Shavlik, J.W.; Kahn, C.E.; Burnside, E.S. Breast cancer risk estimation with artificial neural
networks revisited: Discrimination and calibration. Cancer 2010, 116, 3310–3321. [CrossRef]
59. Heidari, M.; Khuzani, A.Z.; Hollingsworth, A.B.; Danala, G.; Mirniaharikandehei, S.; Qiu, Y.; Liu, H.; Zheng, B. Prediction of
breast cancer risk using a machine learning approach embedded with a locality preserving projection algorithm. Phys. Med. Biol.
2018, 63. [CrossRef]
60. Behravan, H.; Hartikainen, J.M.; Tengström, M.; Pylkäs, K.; Winqvist, R.; Kosma, V.-M.; Mannermaa, A. Machine learning
identifies interacting genetic variants contributing to breast cancer risk: A case study in Finnish cases and controls. Sci. Rep. 2018,
8. [CrossRef] [PubMed]
61. Taninaga, J.; Nishiyama, Y.; Fujibayashi, K.; Gunji, T.; Sasabe, N.; Iijima, K.; Naito, T. Prediction of future gastric cancer risk using
a machine learning algorithm and comprehensive medical check-up data: A case-control study. Sci. Rep. 2019, 9. [CrossRef]
[PubMed]
62. Roffman, D.; Hart, G.; Girardi, M.; Ko, C.J.; Deng, J. Predicting non-melanoma skin cancer via a multi-parameterized artificial
neural network. Sci. Rep. 2018, 8. [CrossRef] [PubMed]
63. Martínez-Más, J.; Bueno-Crespo, A.; Khazendar, S.; Remezal-Solano, M.; Martínez-Cendán, J.P.; Jassim, S.; Du, H.; Assam, H.A.;
Bourne, T.; Timmerman, D. Evaluation of machine learning methods with Fourier Transform features for classifying ovarian
tumors based on ultrasound images. PLoS ONE 2019, 14, e0219388. [CrossRef]
64. Martínez-Más, J.; Bueno-Crespo, A.; Martínez-España, R.; Remezal-Solano, M.; Ortiz-González, A.; Ortiz-Reina, S.;
Martínez-Cendán, J.P. Classifying Papanicolaou cervical smears through a cell merger approach by deep learning technique.
Expert Syst. Appl. 2020, 160, 113707. [CrossRef]
65. Lu, W.; Fu, D.; Kong, X.; Huang, Z.; Hwang, M.; Zhu, Y.; Chen, L.; Jiang, K.; Li, X.; Wu, Y.; et al. FOLFOX treatment response
prediction in metastatic or recurrent colorectal cancer patients via machine learning algorithms. Cancer Med. 2020, 9, 1419–1429.
[CrossRef]
66. Xu, Y.; Ju, L.; Tong, J.; Zhou, C.M.; Yang, J.J. Machine Learning Algorithms for Predicting the Recurrence of Stage IV Colorectal
Cancer After Tumor Resection. Sci. Rep. 2020, 10, 1–9. [CrossRef]
67. Kim, J.; Shin, H. Breast cancer survivability prediction using labeled, unlabeled, and pseudo-labeled patient data. J. Am. Med. Inf.
Assoc. 2013, 20, 613–618. [CrossRef]
68. Ahmad, L.; Eshlaghy, A.; Poorebrahimi, A.; Ebrahimi, M.; AR, R. Using three machine learning techniques for predicting breast
cancer recurrence. J. Health Med. Inf. 2013, 4. [CrossRef]
69. Ferroni, P.; Zanzotto, F.M.; Riondino, S.; Scarpato, N.; Guadagni, F.; Roselli, M. Breast cancer prognosis using a machine learning
approach. Cancers 2019, 11, 328. [CrossRef]
70. Park, C.; Ahn, J.; Kim, H.; Park, S. Integrative gene network construction to analyze cancer recurrence using semi-supervised
learning. PLoS ONE 2014, 9. [CrossRef] [PubMed]
71. Exarchos, K.P.; Goletsis, Y.; Fotiadis, D.I. Multiparametric decision support system for the prediction of oral cancer reoccurrence.
IEEE Trans. Inf. Technol. Biomed. 2012, 16, 1127–1134. [CrossRef] [PubMed]
72. Tseng, C.J.; Lu, C.J.; Chang, C.C.; Chen, G.-D. Application of machine learning to predict the recurrence-proneness for cervical
cancer. Neural Comput. Appl. 2014, 24, 1311–1316. [CrossRef]
73. Dercle, L.; Lu, L.; Schwartz, L.H.; Qian, M.; Tejpar, S.; Eggleton, P.; Zhao, B.; Piessevaux, H. Radiomics response signature for
identification of metastatic colorectal cancer sensitive to therapies targeting EGFR pathway. J. Natl. Cancer Inst. 2020, 112, 902–912.
[CrossRef] [PubMed]
74. Yates, L.R.; Gerstung, M.; Knappskog, S.; Desmedt, C.; Gundem, G.; Van Loo, P.; Aas, T.; Alexandrov, L.B.; Larsimont, D. Subclonal
diversification of primary breast cancer revealed by multiregion sequencing. Nat. Med. 2015, 21, 751–759. [CrossRef]
75. Gerlinger, M.; Horswell, S.; Larkin, J.; Rowan, A.J.; Salm, M.P.; Varela, I.; Fisher, R.; McGranahan, N.; Matthews, N.; Santos, C.R.;
et al. Genomic architecture and evolution of clear cell renal cell carcinomas defined by multiregion sequencing. Nat. Genet. 2014,
46, 225–233. [CrossRef]
76. Caravagna, G.; Giarratano, Y.; Ramazzotti, D.; Tomlinson, I.; Graham, T.A.; Sanguinetti, G.; Sottoriva, A. Detecting repeated
cancer evolution from multi-region tumor sequencing data. Nat. Methods 2018, 15, 707–714. [CrossRef]
77. Auslander, N.; Wolf, Y.I.; Koonin, E.V. In silico learning of tumor evolution through mutational time series. Proc. Natl. Acad. Sci.
USA 2019, 116, 9501–9510. [CrossRef] [PubMed]
78. Albertazzi, E.; Cajone, F.; Leone, B.E.; Naguib, R.N.; Lakshmi, M.S.; Sherbet, G.V. Expression of metastasis-associated genes
h-mts1 (S100A4) and nm23 in carcinoma of breast is related to disease progression. DNA Cell Biol. 1998, 17, 335–342. [CrossRef]
79. Grey, S.R.; Dlay, S.S.; Leone, B.E.; Cajone, F.; Sherbet, G.V. Prediction of nodal spread of breast cancer by using artificial neural
network-based analyses of S100A4, nm23 and steroid receptor expression. Clin. Exp. Metastasis 2003, 20, 507–514. [CrossRef]
[PubMed]
80. Ishii, H.; Saitoh, M.; Sakamoto, K.; Sakamoto, K.; Saigusa, D.; Kasai, H.; Ashizawa, K.; Miyazawa, K.; Takeda, S.; Masuyama, K.;
et al. Lipidome-based rapid diagnosis with machine learning for detection of TGF-β signalling activated area in head and neck
cancer. Br. J. Cancer 2020, 122, 995–1004. [CrossRef] [PubMed]
81. Bhalla, S.; Kaur, H.; Dhall, A.; Raghava, G.P.S. Prediction and analysis of skin cancer progression using genomics profiles of
patients. Sci. Rep. 2019, 9. [CrossRef]
82. Shiraishi, S.; Tan, J.; Olsen, L.A.; Moore, K.L. Knowledge-based prediction of plan quality metrics in intracranial stereotactic
radiosurgery. Med. Phys. 2015, 42, 908. [CrossRef] [PubMed]
83. Shiraishi, S.; Moore, K.L. Knowledge-based prediction of three-dimensional dose distributions for external beam radiotherapy.
Med. Phys. 2016, 43, 378–387. [CrossRef]
84. Nguyen, D.; Long, T.; Jia, X.; Lu, W.; Gu, X.; Iqbal, Z.; Jiang, S. A feasibility study for predicting optimal radiation therapy dose
distributions of prostate cancer patients from patient anatomy using deep learning. Sci. Rep. 2019, 9. [CrossRef]
85. Musen, M.A.; Tu, S.W.; Das, A.K.; Shahar, Y. EON: A component-based approach to automation of protocol-directed therapy.
J. Am. Med. Inf. Assoc. 1996, 3, 367–388. [CrossRef]
86. Celebi, R.; Movva, R.; Alpsoy, S.; Dumontier, M. In-silico prediction of synergistic anti-cancer drug combinations using
multi-omics data. Sci. Rep. 2019, 9. [CrossRef]
87. Keshava, N.; Toh, T.S.; Yuan, H.; Yang, B.; Menden, M.P.; Wang, D. Defining subpopulations of differential drug response to
reveal novel target populations. NPJ Syst. Biol. Appl. 2019, 5. [CrossRef] [PubMed]
88. O’Neil, J.; Benita, Y.; Feldman, I.; Chenard, M.; Roberts, B.; Liu, Y.; Li, J.; Kral, A.; Lejnine, S.; Loboda, A.; et al. An unbiased
oncology compound screen to identify novel combination strategies. Mol. Cancer Ther. 2016, 15, 1155–1162. [CrossRef] [PubMed]
89. Preuer, K.; Lewis, R.P.I.; Hochreiter, S.; Bender, A.; Bulusu, K.C.; Klambauer, G. DeepSynergy: Predicting anti-cancer drug synergy
with Deep Learning. Bioinformatics 2018, 34, 1538–1546. [CrossRef] [PubMed]
90. McIntosh, C.; Purdie, T.G. Contextual atlas regression forests: Multiple-atlas-based automated dose prediction in radiation
therapy. IEEE Trans. Med. Imag. 2016, 35, 1000–1012. [CrossRef]
91. Chen, L.-C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. DeepLab: Semantic image segmentation with deep
convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 40, 834–848.
[CrossRef]
92. Chen, L.-C.; Papandreou, G.; Schroff, F.; Adam, H. Rethinking atrous convolution for semantic image segmentation. arXiv 2017,
arXiv:1706.05587.
93. Weinstein, J.N.; Kohn, K.W.; Grever, M.R.; Viswanadhan, V.N.; Rubinstein, L.V.; Monks, A.P.; Scudiero, D.A. Neural computing in
cancer drug development: Predicting mechanism of action. Science 1992, 258, 447–451. [CrossRef]
94. Skrede, O.J.; De Raedt, S.; Kleppe, A.; Hveem, T.S.; Liestøl, K.; Maddison, J.; Askautrud, H.A.; Pradhan, M.; Nesheim, J.A.;
Albregtsen, F.; et al. Deep learning for prediction of colorectal cancer outcome: A discovery and validation study. Lancet 2020,
395, 350–360. [CrossRef]
95. Tsuji, S.; Midorikawa, Y.; Takahashi, T.; Yagi, K.; Takayama, T.; Yoshida, K.; Sugiyama, Y.; Aburatani, H. Potential responders to
FOLFOX therapy for colorectal cancer by Random Forests analysis. Br. J. Cancer 2012, 106, 126–132. [CrossRef]
96. Steele, S.R.; Bilchik, A.; Johnson, E.K.; Nissan, A.; Peoples, G.E.; Berhardt, J.S.; Kalina, P.; Petersen, B.; Brücher, B.; Protic, M.; et al.
Time-dependent estimates of recurrence and survival in colon cancer: Clinical decision support system tool development for
adjuvant therapy and oncological outcome assessment. Am. Surg. 2014, 80, 441–453. [CrossRef] [PubMed]
97. Menden, M.P.; Iorio, F.; Garnett, M.; McDermott, U.; Benes, C.H.; Ballester, P.J.; Saez-Rodriguez, J. Machine learning prediction of
cancer cell sensitivity to drugs based on genomic and chemical properties. PLoS ONE 2013, 8. [CrossRef]
98. Lin, H.; Qiu, X.; Zhang, B.; Zhang, J. Identification of the predictive genes for the response of colorectal cancer patients to FOLFOX
therapy. Oncol. Targets Ther. 2018, 11, 5943–5955. [CrossRef]
99. Gan, Z.; Zou, Q.; Lin, Y.; Xu, Z.; Huang, Z.; Chen, Z.; Lv, Y. Identification of a 13-gene-based classifier as a potential biomarker to
predict the effects of fluorouracil-based chemotherapy in colorectal cancer. Oncol. Lett. 2019, 17, 5057–5063. [CrossRef] [PubMed]
100. Land, W.H.; Margolis, D.; Gottlieb, R.; Yang, J.Y.; Krupinski, E.A. Improving CT prediction of treatment response in patients with
metastatic colorectal carcinoma using statistical learning. Int. J. Comput. Biol. Drug Des. 2010, 3, 15–18. [CrossRef] [PubMed]
101. Del Rio, M.; Molina, F.; Bascoul-Mollevi, C.; Copois, V.; Bibeau, F.; Chalbos, P.; Bareil, C.; Kramar, A.; Salvetat, N.; Fraslon, C.; et al.
Gene expression signature in advanced colorectal cancer patients select drugs and response for the use of leucovorin, fluorouracil,
and irinotecan. J. Clin. Oncol. 2007, 25, 773–780. [CrossRef] [PubMed]
102. Hess, K.R.; Anderson, K.; Symmans, W.F.; Valero, V.; Ibrahim, N.; Mejia, J.A.; Booser, D.; Theriault, R.L.; Buzdar, A.U.; Dempsey,
P.J.; et al. Pharmacogenomic predictor of sensitivity to preoperative chemotherapy with paclitaxel and fluorouracil, doxorubicin,
and cyclophosphamide in breast cancer. J. Clin. Oncol. 2006, 24, 4236–4244. [CrossRef] [PubMed]
103. Thuerigen, O.; Schneeweiss, A.; Toedt, G.; Warnat, P.; Hahn, M.; Kramer, H.; Brors, B.; Rudlowski, C.; Benner, A.; Schuetz, F.;
et al. Gene expression signature predicting pathologic complete response with gemcitabine, epirubicin, and docetaxel in primary
breast cancer. J. Clin. Oncol. 2006, 24, 1839–1845. [CrossRef] [PubMed]
104. Harris, L.N.; You, F.; Schnitt, S.J.; Witkiewicz, A.; Lu, X.; Sgroi, D.; Ryan, P.D.; Come, S.E.; Burstein, H.J.; Lesnikoski, B.A.; et al.
Predictors of resistance to preoperative trastuzumab and vinorelbine for HER2-positive early breast cancer. Clin. Cancer Res. 2007,
105. Mitra, A.P.; Skinner, E.C.; Miranda, G.; Daneshmand, S. A precystectomy decision model to predict pathological upstaging and
oncological outcomes in clinical stage T2 bladder cancer. BJU Int. 2013, 111, 240–248. [CrossRef]
106. Talby, L.; Chambost, H.; Roubaud, M.-C.; N’Guyen, C.; Milili, M.; Loriod, B.; Fossat, C.; Picard, C.; Gabert, J.; Chiappetta, P.;
et al. The chemosensitivity to therapy of childhood early B acute lymphoblastic leukemia could be determined by the combined
expression of CD34, SPI-B and BCR genes. Leuk. Res. 2006, 30, 665–676. [CrossRef]
107. Huang, C.C.; Gadd, S.; Breslow, N.; Cutcliffe, C.; Sredni, S.T.; Helenowski, I.B.; Dome, J.S.; Grundy, P.E.; Green, D.M.; Fritsch,
M.K.; et al. Predicting relapse in favorable histology Wilms tumor using gene expression analysis: A report from the Renal Tumor
Committee of the Children’s Oncology Group. Clin. Cancer Res. 2009, 15, 1770–1778. [CrossRef]
108. Dressman, H.K.; Berchuck, A.; Chan, G.; Zhai, J.; Bild, A.; Sayer, R.; Cragun, J.; Clarke, J.; Whitaker, R.S.; Li, L.H.; et al. An
integrated genomic-based approach to individualized treatment of patients with advanced-stage ovarian cancer. J. Clin. Oncol.
2007, 25, 517–525. [CrossRef]
109. Duong, C.; Greenawalt, D.M.; Kowalczyk, A.; Ciavarella, M.L.; Raskutti, G.; Murray, W.K.; Phillips, W.A.; Thomas, R.J.S.
Pretreatment gene expression profiles can be used to predict response to neoadjuvant chemoradiotherapy in esophageal cancer.
Ann. Surg. Oncol. 2007, 14, 3602–3609. [CrossRef] [PubMed]
110. Belderbos, J.; Heemsbergen, W.; Hoogeman, M.; Pengel, K.; Rossi, M.; Lebesque, J. Acute esophageal toxicity in non-small cell
lung cancer patients after high dose conformal radiotherapy. Radiother. Oncol. 2005, 75, 157–164. [CrossRef] [PubMed]
111. Bots, W.T.C.; van den Bosch, S.; Zwijnenburg, E.M.; Dijkema, T.; van den Broek, G.B.; Weijs, W.L.J.; Verhoef, L.C.G.; Kaanders,
J.H.A.M. Reirradiation of head and neck cancer: Long-term disease control and toxicity. Head Neck 2017, 39, 1122–1130. [CrossRef]
[PubMed]
112. Carvalho, S.; Troost, E.G.C.; Bons, J.; Menheere, P.; Lambin, P.; Oberije, C. Prognostic value of blood-biomarkers related to
hypoxia, inflammation, immune response and tumour load in non-small cell lung cancer: A survival model with external
validation. Radiother. Oncol. 2016, 119, 487–494. [CrossRef]
113. Janssens, G.O.; Rademakers, S.E.; Terhaard, C.H.; Doornaert, P.A.; Bijl, H.P.; Van Ende, P.D.; Chin, A.; Marres, H.A.; De Bree, R.;
Van Der Kogel, A.J.; et al. Accelerated radiotherapy with carbogen and nicotinamide for laryngeal cancer: Results of a phase III
randomized trial. J. Clin. Oncol. 2012, 30, 1777–1783. [CrossRef] [PubMed]
114. Jochems, A.; Deist, T.M.; El Naqa, I.; Kessler, M.; Mayo, C.; Reeves, J.; Jolly, S.; Matuszak, M.; Ten Haken, R.; van Soest, J.; et al.
Developing and validating a survival prediction model for NSCLC patients through distributed learning across 3 countries. Int. J.
Radiat. Oncol. Biol. Phys. 2017, 99, 344–352. [CrossRef] [PubMed]
115. Kwint, M.; Uijterlinde, W.I.; Nijkamp, J.; Chen, C.; de Bois, J.; Sonke, J.J.; van Herk, M.; van den Heuvel, M.M.; Belderbos, J. Acute
esophagus toxicity in lung cancer patients after Intensity Modulated Radiotherapy and concurrent chemotherapy. Int. J. Radiat.
Oncol. Biol. Phys. 2012, 84, 223–228. [CrossRef] [PubMed]
116. Lustberg, T.; Bailey, M.; Thwaites, D.I.; Miller, A.; Carolan, M.; Holloway, L.; Velazquez, E.R.; Hoebers, F.; Dekker, A.
Implementation of a rapid learning platform: Predicting 2-year survival in laryngeal carcinoma patients in a clinical setting. Oncotarget 2016,
117. Oberije, C.; De Ruysscher, D.; Houben, R.; Van De Heuvel, M.; Uyterlinde, W.; Deasy, J.O.; Belderbos, J.; Dingemans, A.M.C.;
Rimner, A.; Din, S.; et al. A validated prediction model for overall survival from stage III non-small cell lung cancer: Toward
survival prediction for individual patients. Int. J. Radiat. Oncol. Biol. Phys. 2015, 92, 935–944. [CrossRef]
118. Olling, K.; Nyeng, D.W.; Wee, L. Predicting acute odynophagia during lung cancer radiotherapy using observations derived from
patient-centred nursing care. Tech. Innov. Patient Support Radiat. Oncol. 2018, 5, 16–20. [CrossRef]
119. Wijsman, R.; Dankers, F.J.W.M.; Troost, E.G.C.; Hoffmann, A.L.; van der Heijden, E.H.F.M.; de Geus-Oei, L.-F.; Bussink, J.
Multivariable normal-tissue complication modeling of acute esophageal toxicity in advanced stage non-small cell lung cancer
patients treated with intensity-modulated (chemo-)radiotherapy. Radiother. Oncol. 2015, 117, 49–54. [CrossRef] [PubMed]
120. Wijsman, R.; Dankers, F.J.W.M.; Troost, E.G.C.; Hoffmann, A.L.; van der Heijden, E.H.F.M.; de Geus-Oei, L.F.; Bussink, J. Inclusion
of incidental radiation dose to the cardiac atria and ventricles does not improve the prediction of radiation pneumonitis in
advanced-stage non-small cell lung cancer patients treated with intensity modulated radiation therapy. Int. J. Radiat. Oncol. Biol.
Phys. 2017, 99, 434–441. [CrossRef]
121. Deist, T.M.; Dankers, F.J.W.M.; Valdes, G.; Wijsman, R.; Hsu, I.C.; Oberije, C.; Lustberg, T.; van Soest, J.; Hoebers, F.; Jochems, A.;
et al. Machine learning algorithms for outcome prediction in (chemo)radiotherapy: An empirical comparison of classifiers. Med.
Phys. 2018, 45, 3449–3459. [CrossRef] [PubMed]
122. Van de Vijver, M.J.; He, Y.D.; van’t Veer, L.J.; Dai, H.; Hart, A.A.M.; Voskuil, D.W.; Schreiber, G.J.; Peterse, J.L.; Roberts, C. A
gene-expression signature as a predictor of survival in breast cancer. N. Engl. J. Med. 2002, 347, 1999–2009. [CrossRef] [PubMed]
123. Xu, X.; Zhang, Y.; Zou, L.; Wang, M.; Li, A. A gene signature for breast cancer prognosis using support vector machine. In
Proceedings of the 5th International Conference on Biomedical Engineering and Informatics—BMEI 2012, Chongqing, China,
16–18 October 2012; pp. 928–931.
124. Van’t Veer, L.J.; Dai, H.; Van de Vijver, M.J.; He, Y.D.; Hart, A.A.M.; Mao, M.; Peterse, H.L.; Van Der Kooy, K.; Marton, M.J.;
Witteveen, A.T.; et al. Gene expression profiling predicts clinical outcome of breast cancer. Nature 2002, 415, 530–536. [CrossRef]
125. Gevaert, O.; De Smet, F.; Timmerman, D.; Moreau, Y.; De Moor, B. Predicting the prognosis of breast cancer by integrating clinical
and microarray data with Bayesian networks. Bioinformatics 2006, 22, e184–e190. [CrossRef]
126. Park, K.; Ali, A.; Kim, D.; An, Y.; Kim, M.; Shin, H. Robust predictive model for evaluating breast cancer survivability. Eng. Appl.
Artif. Intell. 2013, 26, 2194–2205. [CrossRef]
127. Delen, D.; Walker, G.; Kadam, A. Predicting breast cancer survivability: A comparison of three data mining methods. Artif. Intell.
Med. 2005, 34, 113–127. [CrossRef]
128. Rosado, P.; Lequerica-Fernandez, P.; Villallain, L.; Peña, I.; Sánchez-Lasheras, F.; De Vicente, J.C. Survival model in oral squamous
cell carcinoma based on clinicopathological parameters, molecular markers and support vector machines. Expert Syst. Appl. 2013,
40, 4770–4776. [CrossRef]
129. Chen, Y.-C.; Ke, W.-C.; Chiu, H.-W. Risk classification of cancer survival using ANN with gene expression data from multiple
laboratories. Comput. Biol. Med. 2014, 48, 1–7. [CrossRef] [PubMed]
130. Lynch, C.M.; Abdollahi, B.; Fuqua, J.D.; de Carlo, A.R.; Bartholomai, J.A.; Balgemann, R.N.; van Berkel, V.H.; Frieboes, H.B.
Prediction of lung cancer patient survival via supervised machine learning classification techniques. Int. J. Med. Inform. 2017, 108,
1–8. [CrossRef] [PubMed]
131. Bychkov, D.; Linder, N.; Turkki, R.; Nordling, S.; Kovanen, P.E.; Verrill, C.; Walliander, M.; Lundin, M.; Haglund, C.; Lundin, J.
Deep learning based tissue analysis predicts outcome in colorectal cancer. Sci. Rep. 2018, 8, 1–11. [CrossRef] [PubMed]
132. Zadeh Shirazi, A.; Fornaciari, E.; Bagherian, N.S.; Ebert, L.M.; Koszyca, B.; Gomez, G.A. DeepSurvNet: Deep survival
convolutional network for brain cancer survival rate classification based on histopathological images. Med. Biol. Eng. Comput. 2020.
[CrossRef] [PubMed]
133. Zupan, B.; Demšar, J.; Kattan, M.W.; Beck, J.R.; Bratko, I. Machine learning for survival analysis: A case study on recurrence of
prostate cancer. Artif. Intell. Med. 2000, 20, 59–75. [CrossRef]
134. Kim, W.; Kim, K.S.; Lee, J.E.; Noh, D.Y.; Kim, S.W.; Jung, Y.S.; Park, M.Y.; Park, R.W. Development of novel breast cancer
recurrence prediction model using support vector machine. J. Breast Cancer 2012, 15, 230–238. [CrossRef]
135. McGranahan, N.; Swanton, C. Clonal heterogeneity and tumor evolution: Past, present, and the future. Cell 2017, 168, 613–628.
[CrossRef]
136. Greaves, M.; Maley, C.C. Clonal evolution in cancer. Nature 2012, 481, 306–313. [CrossRef]
137. Hall, A.; Massagué, J. Cell regulation. Curr. Opin. Cell Biol. 2008, 20, 117–118. [CrossRef]
138. Greenberg, E.S.; Chong, K.K.; Huynh, K.T.; Tanaka, R.; Hoon, D.S.B. Epigenetic biomarkers in skin cancer. Cancer Lett. 2012, 342,
170–177. [CrossRef]
139. Mazar, J.; Khaitan, D.; DeBlasio, D.; Zhong, C.; Govindarajan, S.S.; Kopanathi, S.; Zhang, S.; Ray, A.; Perera, R.J. Epigenetic
regulation of MicroRNA genes and the role of miR-34b in cell invasion and motility in human melanoma. PLoS ONE 2011, 6,
e24922. [CrossRef]
140. Mokhtari, R.B.; Homayouni, T.S.; Baluch, N.; Morgatskaya, E.; Kumar, S.; Das, B.; Yeger, H. Combination therapy in combating
cancer. Oncotarget 2017, 8, 38022–38043. [CrossRef] [PubMed]
141. Menden, M.P.; Wang, D.; Guan, Y.; Mason, M.J.; Szalai, B.; Bulusu, K.C.; Yu, T.; Kang, J.; Jeon, M.; Wolfinger, R.; et al. A cancer
pharmacogenomic screen powering crowd-sourced advancement of drug combination prediction. bioRxiv 2017. [CrossRef]
142. Tibshirani, R. Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B 1996, 58, 267–288. [CrossRef]
143. Kearney, V.; Chan, J.W.; Valdes, G.; Solberg, T.D.; Yom, S.S. The application of artificial intelligence in the IMRT planning process
for head and neck cancer. Oral Oncol. 2018, 87, 111–116. [CrossRef]
144. Zhu, W.; Xie, L.; Han, J.; Guo, X. The application of deep learning in cancer prognosis prediction. Cancers 2020, 12, 603. [CrossRef]
[PubMed]
145. Cox, D.R. Regression models and life-tables. J. R. Stat. Soc. Ser. B 1972, 34, 187–220. [CrossRef]
146. Weinstein, J.N.; Collison, E.A.; Mills, G.B.; Shaw, K.R.M.; Ozenberger, B.A.; Ellrott, K.; Shmulevich, I.; Sander, C.; Stuart, J.M. The
cancer genome atlas pan-cancer analysis project. Nat. Genet. 2013, 45, 1113–1120. [CrossRef]
147. SEER. SEER Research Data 1975–2017—Surveillance, Epidemiology, and End Results (SEER) Program. 2019. Available online:
www.seer.cancer.gov (accessed on 29 March 2021).
148. Hutter, C.; Zenklusen, J.C. The Cancer Genome Atlas: Creating lasting value beyond its data. Cell 2018, 173, 283–285. [CrossRef]

Int. J. Mol. Sci. 2021, 22, 4394. https://doi.org/10.3390/ijms22094394 https://www.mdpi.com/journal/ijms