Saditya Published Paper
https://doi.org/10.1007/s12351-024-00864-3
ORIGINAL PAPER
Abstract
Credit scoring is a mathematical and statistical tool that aids financial institutions in
deciding suitable candidates for the issuance of loans, based on the analysis of the
borrower’s financial history. Distinct groups of borrowers have unique characteris-
tics that must be identified and trained on to increase the accuracy of classification
models for all credit borrowers that financial institutions serve. Numerous studies
have shown that models based on diverse base-classifier models outperform other
statistical and AI-based techniques for related classification problems. This paper
proposes a novel multi-layer clustering and soft-voting-based ensemble classifica-
tion model, aptly named Self Organizing Map Clustering with Metaheuristic Voting
Ensembles (SCMVE) which uses a self-organizing map for clustering the data into
distinct clusters with their unique characteristics and then trains a sailfish optimizer
powered ensemble of SVM-KNN base classifiers for classification of each distinct
identified cluster. We train and evaluate our model on the standard public credit
scoring datasets, namely the German, Australian and Taiwan datasets, and use multiple
evaluation scores such as precision, recall and F1 score to compare the results of
our model with other prominent works in the field. On evaluation, SCMVE shows
outstanding results (95% accuracy on standard datasets) when compared with popu-
lar works in the field of credit scoring.
1 Introduction
making extensive use of machine learning algorithms such as KNN, ANN (Artificial
Neural Network), SVM, artificial immune systems have been introduced.
In most studies, single classifier-based models are constructed for credit
scoring. Experiments have shown that single classifiers are ineffective in capturing
the patterns of individual customers. Ensemble classifiers have therefore been
employed, and they have proved effective in improving on the accuracy and stability
of single classifiers. Ensemble-based classifiers combine several base classifiers
for better predictive performance (Singh 2017).
Thus, research on ensemble methods has grown popular in the last few years. The
main idea behind ensemble classification is to overcome the weakness of individual
classifiers by combining their results. For effective ensemble classification, the base
classifiers must be accurate, diverse, and sensitive. The construction of ensemble
models usually consists of three steps—pool generation, selection of the classifica-
tion and related models, and the combination of the delivered results. In the first
step, base models are generated to create a model pool using various classification
algorithms or data. The second step is necessary to enhance the model performance.
The pool models are selected based on certain rules to maintain an accurate model.
In the last step, the selected models are combined based on some heuristics and
rules.
This paper proposes a multi-layer clustering and classification model consisting
of primary classifiers, a SOM and a metaheuristic algorithm. In the first step,
instead of a traditional artificial neural network algorithm, we use self-organizing
maps (SOM) for clustering our dataset. SOMs are unsupervised ANN-based
clustering techniques based on competitive learning, in which the nodes and their
associated weights compete with each other to win an input pattern. A SOM maps
the high-dimensional input space onto a low-dimensional space. Once our dataset
is divided into clusters, so that each data point is associated only with data points
within the same cluster, we apply a classification model to each cluster individually.
We use an ensemble classifier and a metaheuristic algorithm to increase the overall
efficiency of our model. Our ensemble model is formed using six KNN and six SVM
classifiers. The classifiers used have different prediction capabilities. We use SVM
for its proven results in classifying non-linearly separable data and KNN for its
nearest-distance-based classification.
Application of metaheuristics to classification problems has been observed in
the literature as early as 2007. For instance, Ant Colony Optimization was used in
standard classification problems (Martens et al. 2007). In our approach, a recent
metaheuristic, the sailfish optimizer (SFO), is employed to generate the weights of
our ensemble model. The relative performance of each classifier is taken into account:
a classifier that performs well on the cluster dataset is assigned a higher weight.
Based on the behavior of sailfish hunting for sardines, the optimizer has shown
excellent results when tested against various optimizers. SFO provides advantages in
terms of its exploration and exploitation phases, fast convergence towards the
global optimum, and better efficiency.
In previous research, the model is typically trained using the entire dataset. Our
model departs from existing models by first clustering similar data points and then
fitting a classification model to each cluster. The use of diverse base classifiers
in our ensemble model also yields excellent results. Another advantage of our proposed
model is the use of SOMs to find better sets of clusters in the dataset. Together,
these choices yield a highly efficient model.
Thus, SCMVE offers the following advantages and novelties compared to existing
work in the same field:
• Clustering through Self Organising Maps, which identifies groups with distinct
characteristics on which the classification models can then be trained
• Use of diverse base classifiers in a soft-voting ensemble model, an approach that
outperforms singular complex models such as artificial neural networks
• Use of the metaheuristic sailfish optimizer for weight optimization of the
ensemble model, leading to enhanced performance on various classification metrics
• Application of a unique training strategy, where classifier models are fitted
locally to the formed clusters (as opposed to being trained on the whole dataset),
hence utilising the distinct characteristics of each cluster found in the SOM
clustering phase and leading to outstanding classification results.
2 Related work
Lee et al. (2002) proposed a hybrid neural discriminant model which aimed to
simplify the neural networks by incorporation of the traditional discriminant anal-
ysis. They employed LDA to first model the credit scoring problem and identify
the significant input variables, which are then passed on to the Artificial Neural
Network to perform the prediction. Hence, the LDA acted as a tool to design the
topology of the network.
Onan (2019b) proposed a two-stage neural network framework for topic
extraction, using advanced embeddings (word2vec, POS2vec, word-position2vec,
LDA2vec) and an ensemble clustering approach.
Lappas and Yannacopoulos (2021) combine expert knowledge and genetic algorithms
into a machine learning approach that supplements supervised credit scoring with
expert input, increasing the overall efficiency. Safi et al. (2022) combine a neural
network with metaheuristic optimizers and five cost sensitivity fitness functions as
the base learners in a majority voting ensemble learning paradigm.
Onan (2023a) proposed a novel GTR-GA approach, which integrates graph-based
neural networks with genetic algorithms for text augmentation, effectively
enhancing data diversity and improving model performance.
As an extension to the artificial neural network models used for the classification
step, self organising maps, which are a widely employed neural network model,
have been used instead of the traditional ANN models. Unlike the ANN, the SOM
models are unsupervised and produce a topology preserving mapping between
the input parameter set (which is a higher dimensional map space) to the final
credit score (a lower dimensional map space). Lau (2006) has explored different
SOM models that can be employed for classification problems.
Hsieh (2005) proposed a hybrid mining approach which used a self organising
map for deciding the parameters of the K Means clustering algorithm followed by
an Artificial Neural Network for classifying the samples. Suleiman et al. (2021)
also used an SOM to improve the ability of pattern recognition in the data and
utilise that to increase the efficiency of K-Means Classifier and neural networks.
AghaeiRad et al. (2017) developed a hybrid model for credit scoring using
SOM clusters and feedforward neural networks. They used the knowledge
obtained from SOM clusters and passed it as information to be trained on by the
neural network classifier. This gave better results compared to a standalone FNN
due to the increased information present in the input.
Onan (2019a) proposed a consensus clustering-based undersampling method
to tackle class imbalance in machine learning, undersampling the majority
class with various clustering algorithms and evaluating performance using
different supervised and ensemble learning methods.
Onan (2021b) introduces sentiment analysis on product reviews based on
weighted word embeddings and deep neural networks. The architecture includes
a weighted embedding layer, convolutional layers (1-g, 2-g, 3-g), max-pooling,
LSTM, and a dense layer.
Genetic algorithms have been used widely and in different capacities in the credit
scoring problem.
Onan et al. (2016b) employed a multiobjective differential evolution algorithm to
optimize classifier and class weights within a static classifier selection framework
for sentiment analysis.
Onan (2018a) optimized the Latent Dirichlet Allocation (LDA) parameters by
employing a swarm intelligence approach. They utilized metaheuristic algorithms
like Particle Swarm Optimization (PSO) to estimate the number of topics and
other key parameters in LDA. This complements the work of He et al. (2018) on
adaptive models.
Pławiak et al. (2019) published a work constituting an application of genetic cas-
cading of ensemble classifiers. They combined the benefits of evolutionary algo-
rithms and ensemble classifiers, alongside using deep learning to develop a complex
credit scoring model. They used genetic algorithms in three different instances: first
for feature selection on the Australian dataset, second for hyperparameter optimiza-
tion, and a third for deriving a training technique for selecting the classifiers for the
final trained model.
Kozodoi and Lessmann (2020) improved on GA-based feature selection by modelling
it as a multiobjective task. They then used a Particle Swarm Optimization (PSO)
algorithm and evaluated it using three different fitness functions based on the
number of features, the relative acquisition costs of the features, and the AUC-ROC
fitness score of the trained model.
Tripathi et al. (2020) proposed a novel Binary BAT algorithm for the credit scoring
dataset. They combined it with a radial basis function neural network (RBFN) to form
a hybrid credit scoring model. They used the metaheuristic algorithm for feature
selection on the dataset and trained the neural network on the selected features in
a supervised classification task.
Simumba et al. (2022) incorporated stakeholder requirements into their model
during feature selection and used that to train their supervised classification model.
They compared the results of two modified metaheuristic algorithms: a Grasshop-
per algorithm integrated with non-dominated sorting and genetic algorithm, and
a genetic algorithm integrating different selection, crossover, and mutation strate-
gies. They evaluated their results with empirical data collected from farmers in
Cambodia.
Şen et al. (2020) proposed multilevel metaheuristic algorithms for credit scoring.
They used an SVM classifier, combined with a Genetic Algorithm in a two-level
feeding mechanism for increased model accuracy. They used the GA first to find the
optimized parameters of the SVM model, and then used it for feature selection on
the dataset that increases the classification accuracy.
Onan (2023b) proposed SRL-ACO, a text augmentation framework that uses
Semantic Role Labeling (SRL) and Ant Colony Optimization (ACO) to generate
additional training data for NLP models. SRL-ACO enhances data quality by pre-
serving semantic roles in new sentences. Experimental results demonstrate improved
performance in sentiment analysis, toxic text detection, and sarcasm identification
tasks.
Ensemble models and models based on genetic algorithms combine the power of
machine learning models to generate highly accurate credit risk assessment models
and assist us in classifying previously unrepresentative samples in the dataset.
Van Gestel et al. (2003) proposed Least Squares Support Vector Machine (LS-SVM)
classifiers within the Bayesian evidence framework to predict the tendency of
an entity to default on its credit given a set of parameters. Once the default
probabilities of an entity are generated, a sensitivity analysis with respect to the
input set is carried out, which provides insight into the parameters that affect
the creditworthiness of the entity.
The work of Onan et al. (2017) on hybrid ensemble pruning enhances classifier
diversity and performance, aligning with the approach of improving sentiment
classification accuracy through advanced ensemble methods.
The work of Onan (2021a) on hybrid ensemble pruning likewise improves classifier
diversity and performance, enhancing sentiment classification accuracy through
advanced ensemble methods.
Onan (2018b) integrated a Random Subspace ensemble of Random Forest classifiers
with four types of features: authorship attribution, character n-grams, part of
speech n-grams, and discriminative word frequency.
Hsieh and Hung (2010) investigated the approach involving the proper preproc-
essing of the dataset into homogenised clusters followed by the classification of
the samples into the preprocessed categories. The ensemble model hence proposed
resulted in an efficient ensemble classifier.
Onan et al. (2016a) combined five statistical keyword extraction methods with
various ensemble techniques for text classification, including Naïve Bayes, SVM,
logistic regression, and Random Forest, akin to how Zhang et al. (2021) utilized
ensemble techniques for improving credit risk models. This integration of feature
extraction with ensemble methods aligns with the advancements in robust predictive
modeling.
He et al. (2018) improved the construction of ensemble models by making them
adaptable to different imbalance ratios: the dataset is undersampled in a supervised
manner (based on an estimate of the data imbalance ratio), and tree-based base
classifiers then classify the samples in the respective data subsets. Finally, a
particle swarm optimization algorithm was applied to the base classifiers to obtain
the final ensemble model.
Zhang et al. (2021) developed a novel ensemble model that used a local outlier
factor algorithm combined with a bagging strategy to construct a trained model
that works on the outlier adaptability of base classifiers. They combined novel
methods of feature reduction and ensemble learning methods for parameter optimization,
and finally trained a stacking-based ensemble model on the dataset.
Nalič et al. (2020) used ensemble techniques for feature selection on datasets and
proposed the if_any voting method that was able to outperform other standard voting
procedures. They combined linear models, SVMs, naive Bayes and decision tree classi-
fiers into a soft voting ensemble model.
Xia et al. (2020) proposed a new tree-based overfitting-cautious heterogeneous
ensemble model for credit scoring. The suggested method could dynamically give
weights to base models based on the overfitting metric during ensemble selection.
To improve the prediction performance of credit scoring, Tripathi et al. (2019) cre-
ated a hybrid model that combines feature selection and a multilayer ensemble clas-
sifier architecture. The first phase is preprocessing, which sets ranks and weights of
classifiers, followed by the ensemble feature selection approach, and finally, the data-
set with the selected features is used in a multilayer ensemble classifier architecture.
In addition, since classifier placement influences the ensemble framework’s predictive
performance, a classifier placement algorithm based on the Choquet integral value was
devised.
Xia et al. (2018) introduced a new heterogeneous ensemble credit model that com-
bined the bagging and stacking algorithms. In three ways, the proposed model varied
from the existing ensemble credit models: pool creation, base learner selection, and
trainable fuser.
To increase the prediction performance, Guo et al. (2019) proposed a novel
multistage self-adaptive classifier ensemble model based on statistical approaches
and machine learning techniques. First, the original data was processed into a
standardized, representative sample using a multistep data preparation method.
Second, based on their performance, base classifiers were self-adaptively picked
from the candidate classifier repository, and their parameters were adjusted using
the Bayesian optimization algorithm. Finally, the ensemble model was integrated
using these optimized base classifiers, with multilayer stacking used to generate
new features and particle swarm optimization used to obtain the classifier
weights in the ensemble model.
3 Proposed methodology
This section gives a detailed description of SCMVE which is the framework used
to determine if a data point belongs to good credit or risky credit. SCMVE uses
the concept of multi-layer clustering using an artificial neural network-based tech-
nique (Self Organising Map) and then uses a soft voting-based ensemble classi-
fier to predict the final class of the test data point.
The weights for our ensemble model are generated using a metaheuristic opti-
mization technique—Sailfish Optimizer.
The overall architecture of SCMVE is divided into 2 phases:
• Clustering Phase: The entire dataset is divided into clusters whose data points
have similar feature values. The idea behind clustering is that the different sets
of controlling attributes which affect the final classification are better
identified and trained on when each cluster is dealt with separately.
• Classification Phase: Once data is divided into cluster regions, the classifica-
tion models predict the final class of data points. To achieve classification, we
use an ensemble classifier based on Support Vector Machines and K Nearest
Neighbour Classifiers. A metaheuristic algorithm is further used to optimise
the ensemble model.
The first step of SCMVE is to cluster the entire dataset based on the similarity of
attributes between data points. The rationale behind this step is to be able to pre-
dict and identify unique characteristics distinct to each cluster.
For this step, we use Self Organising Maps (SOM). SOM can find better sets of
clusters on the given dataset (Gholamian et al. 2013).
ML algorithms can be of two types: Supervised and Unsupervised learning. In
supervised learning, we predict or categorize the outcome based on the input, and
in unsupervised learning, we describe patterns in input data without having the
| Approach | Description | Advantages | Disadvantages |
|---|---|---|---|
| Singular model classification | Involves conducting supervised classification and training on a singular classification model, such as a neural network or a K-means classifier | Reduced training time. Simple classification models | Low accuracy compared to more complex models |
| SOM model for optimization | Involves the usage of a Self Organizing Map for clustering the data points and using a supervised classifier on the optimized data set. SOM may be used in coordination with other optimisation techniques | The clusters found by SOMs are able to generate higher accuracy | Performance decreases with increase in dataset size |
| Metaheuristic models for optimization | This method involves using metaheuristic optimization models (such as sailfish optimizer or the genetic algorithm) for tasks such as feature selection, weights optimization etc. to improve the efficiency of supervised classifiers | Reduced training time. Reduced computational complexity | Performance not consistent: dependent on the task, efficiency metric and other factors |
| Ensemble classifiers | This technique involves using multiple base classifiers to produce the output. These classifiers may be optimized using various techniques. The output of these base classifiers is combined using a weighted voting approach | Higher predictive accuracy | High cost of training the models |
I. Singh et al.
Fig. 1 SOM clustering and metaheuristic voting ensemble classifiers for credit scoring
Definition 1 (Neuron) Self Organizing Maps consist of a grid of nodes. Each node is
referred to as a neuron. Each neuron is assigned a weight vector of the dimension of
its input vector $X_0$.
$$d_{X \to n} = \lVert X_0 - n \rVert_2 \quad \forall n \in X_1, \tag{1}$$
where $X_1$ represents all the layers excluding the input layer. The neuron with the
minimum distance is called the best matching neuron $\xi$, and its weight vector is
updated along with those of its neighbors to move it closer to the input vector.
$$\xi = \arg\min_{n \in X_1} d_{X \to n} \tag{2}$$
Definition 3 (Neighborhood Radius Function, $\eta$) The Neighborhood Radius Function
is a function used to compute the radius for defining the neighbors of a given
neuron. It is a decreasing function of time, such that the number of neighbors
reduces during the training phase. The neighbourhood radius function is represented as:
$$\eta(u, v, S, i) = \exp\left(-\frac{d_{u,v}^{2}}{2\sigma(i)^{2}}\right), \quad i = 0, 1, 2, \ldots \tag{3}$$
where u, v represent two vectors of the same dimension, S represents the input space,
and i represents the ith iteration.
Definition 4 (Neighbor Neuron, NN) The neurons close to the best matching neuron
that lie within a certain minimum neighborhood radius $NR_{min}$ are termed
Neighbour Neurons NN, which is mathematically represented as
$$NN = \{\, n \mid \eta(\xi, n, S, i) < NR_{min},\ \forall n \in S \,\} \tag{4}$$
1. Each neuron is initiated with a random weight vector, with the same dimensional-
ity as the input vector. Neurons are adjusted during the training period based on
competitive learning.
2. In the SOM grid, every input data is related to every output neuron on the grid.
In every iteration i, the neuron with the shortest distance to the input vector is
denoted as 𝜉 (or Best Matching Neuron) and the representative vector of 𝜉 is
updated so that it moves closer to the input vector.
3. Then, the weights of $\xi$'s neighborhood neurons are updated so that they are
symmetrically adapted towards the training input vector. The neuron weights are
updated according to the following equation:
$$W_{i+1} = W_i + \eta_{BMU} \cdot \lambda \left(X_i - W_i\right) \tag{5}$$
where $X_i$ is the input vector, $W_i$ is the neuron weight vector, $\lambda$ denotes
the learning rate, and $\eta_{BMU}$ represents the neighborhood function value for the
best matching unit.
4. This training cycle is repeated for each input vector. After finishing the training,
the final grid represents the topology-preserving mapping of input data where
related neurons are closer to each other.
Note that the update rate for the best matching neuron, the neighbor neuron,
and the farther neighbor neuron will differ. The neighborhood radius is a pre-
defined parameter that gradually decreases during the entire training phase and
is finally set to 1. A summary of the different parameters involved in the self-
organising map is provided in Table 2.
Finally, clusters of neurons are formed, each representing a different class of
data that can be appropriately labeled.
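The training loop described in steps 1 to 4 can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the implementation used for SCMVE: the grid size, the exponential decay schedules for the learning rate and the neighbourhood width, the use of grid distance between neurons inside the neighbourhood function, and all function names (`train_som`, `best_matching_neuron`, `sq_dist`) are assumptions made for the example.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def best_matching_neuron(weights, x):
    """Grid coordinate of the neuron whose weight vector is closest to x."""
    return min(weights, key=lambda n: sq_dist(weights[n], x))


def train_som(data, rows=3, cols=3, epochs=30, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small SOM; returns {grid coordinate: weight vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Step 1: random initial weights with the same dimension as the inputs.
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data:  # Step 4: repeat the cycle for each input vector.
            # Assumed schedules: both quantities decay exponentially.
            lr = lr0 * math.exp(-step / total_steps)
            sigma = sigma0 * math.exp(-step / total_steps)
            # Step 2: find the best matching neuron for this input.
            bmu = best_matching_neuron(weights, x)
            # Step 3: move each neuron towards x, scaled by a Gaussian
            # neighbourhood value (Eq. 3 form) times the learning rate (Eq. 5).
            for n in list(weights):
                grid_d2 = (n[0] - bmu[0]) ** 2 + (n[1] - bmu[1]) ** 2
                eta = math.exp(-grid_d2 / (2 * sigma ** 2))
                weights[n] = [w + eta * lr * (xj - w)
                              for w, xj in zip(weights[n], x)]
            step += 1
    return weights
```

After training, `best_matching_neuron(weights, x)` gives the neuron (and hence a candidate cluster label) for a data point `x`.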
In this phase of SCMVE, the clusters formed in the first stage are individually used
to train and test our ensemble model. Data points in each cluster are split into train-
ing and testing data.
The conventional approach is to use a single classifier or a simple combination of
classifiers, whose predictions are often less accurate. One way to improve on the
performance of individual classifiers is to combine them.
The technique we use is called ensemble learning. Various techniques like vot-
ing can be used to combine these classifiers used in the ensemble. Ensemble models
gather the decisions of several base classifiers trained on the same dataset to get a
more effective and accurate decision. The aim is to compensate for the error of a
single classifier by other classifiers of the ensemble, increasing the accuracy of the
ensemble model as compared to a single classifier model.
For credit scoring, diverse base classifiers could improve the model’s accuracy
and also provide insightful information about applicants even if they belong to
the same category. The ensemble model used in our paper consists of 12 base
classifiers: six SVMs and six KNNs. To ensure maximum utilization and greater diversity
of base classifiers, we have used different hyperparameters for each classifier. The
classifiers used in the model are explained below:
• SVM: an effective classifier model that maps the data into high dimensional
spaces and finds an optimal hyperplane to maximize the distance between dif-
ferent classes. A hyperplane is a surface that partitions data points into classes,
where points from different classes lie on different sides of the hyperplane. Ker-
nel functions such as linear, polynomial, RBF, and sigmoid are used to map the
data into high dimensional spaces.
• KNN: one of the simplest yet most effective non-parametric classification
models. It is an instance-based learning technique and does not require a learning
phase. To classify a data point n, its neighborhood, i.e. the k nearest neighbors
of n, is retrieved. Selecting an appropriate value of k is necessary, as the success
of the classifier depends on k.
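To make the KNN description concrete, here is a minimal sketch of nearest-neighbour classification in plain Python. The function names `knn_predict` and `knn_proba` and the choice of squared Euclidean distance are assumptions for illustration; the hyperparameters of the six KNN classifiers in the ensemble are not specified in this section.

```python
from collections import Counter


def _k_nearest_labels(train_X, train_y, x, k):
    """Labels of the k training points closest to x (squared Euclidean)."""
    ranked = sorted(
        zip(train_X, train_y),
        key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], x)),
    )
    return [label for _, label in ranked[:k]]


def knn_predict(train_X, train_y, x, k=3):
    """Majority label among the k nearest neighbours of x."""
    votes = Counter(_k_nearest_labels(train_X, train_y, x, k))
    return votes.most_common(1)[0][0]


def knn_proba(train_X, train_y, x, k=3, label=0):
    """Fraction of the k nearest neighbours carrying the given label.

    This fraction can serve as the class probability a KNN classifier
    contributes to a soft-voting ensemble.
    """
    labels = _k_nearest_labels(train_X, train_y, x, k)
    return labels.count(label) / k
```

Changing `k` changes both the predicted label and the probability estimate, which is one simple way of obtaining diverse KNN base classifiers.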
Votes from each classifier can be combined using several techniques, the majority
vote ensemble being the most common. In majority voting, each classifier gives a
binary vote of 0 or 1. The majority voting ensemble comes in two variants: hard
voting and soft voting.
Hard voting makes the final prediction using the majority of votes from the
classifiers, whereas in soft voting each classifier outputs a class probability, and
the class with the highest combined probability is predicted as the final output.
In our model, we use the soft voting approach. The following equations describe the
soft voting procedure.
Among the n classifiers, each classifier $C_i$ gives the probability $P_i$ that the
predicted label $y_i = 0$ for the corresponding input X.
$$P_i = P\left(C_i[X] = 0\right) \tag{6}$$
For each classifier, we assign a weight $w_i$, which determines its relative importance
among the others in the ensemble and is assigned based on the performance of
the individual classifier (further optimised using the sailfish algorithm, as explained
later). The final output label y is determined as follows
$$y = \begin{cases} 1 & \text{if } \dfrac{\sum_{i=1}^{n} w_i \cdot P_i}{\sum_{i=1}^{n} w_i} \le 0.5 \\ 0 & \text{otherwise} \end{cases} \tag{7}$$
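The weighted soft-voting rule can be written as a short function. The sketch below follows the rule as stated (each classifier reports the probability of label 0, and the weighted average is thresholded at 0.5); the function name `soft_vote` and the example numbers are illustrative assumptions.

```python
def soft_vote(p_zero, weights):
    """Weighted soft vote over base classifiers.

    p_zero[i] is P_i, the probability classifier i assigns to label 0.
    weights[i] is w_i, the relative importance of classifier i.
    Returns 1 if the weighted average of P_i is at most 0.5, else 0.
    """
    if len(p_zero) != len(weights):
        raise ValueError("need one weight per classifier")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    score = sum(w * p for w, p in zip(weights, p_zero)) / total
    return 1 if score <= 0.5 else 0
```

For example, with probabilities `[0.9, 0.2, 0.1]` and equal weights the weighted average is 0.4, so the ensemble outputs label 1; giving the first classifier a weight of 10 raises the average to 0.775 and flips the output to label 0.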
– Because the sailfish is one of the fastest marine creatures in the ocean, it is
almost impossible for sardines to avoid the attack.
– The injured sardines will eventually be separated from the rest of the prey school
and will be captured by the hunters (sailfish).
– Sailfish attacks do not kill the sardines outright, but frequent attacks increase
the number of injured sardines.
– Sardines are another important inspiration for the SFO algorithm. It imitates
their ability to change position to escape from an attack.
When tested against other metaheuristic algorithms, namely Particle Swarm Optimization
(PSO), Grey Wolf Optimization (GWO), Genetic Algorithm (GA), Ant Lion
Optimizer (ALO), Salp Swarm Algorithm (SSA), and Satin Bowerbird Optimizer
(SBO), on a set of unimodal, multimodal and fixed-dimension multimodal
benchmark functions, SFO showed competitive results in terms of its exploration
and exploitation phases (Nassef et al. 2021).
Moreover, SFO shows high-speed convergence on multimodal functions in
reaching the global optimum while avoiding local optima.
Definition 7 (Search Space, S) It is the set of all possible solutions that the input
can take. It contains a set of points (each representing a possible solution), among
which lies the optimal solution. Optimization aims to find that point (solution).
Definition 8 (Fitness Function, 𝜓(x)) It is an objective function that defines the opti-
mality of a solution to a given target problem. It defines how close a given solution
is to achieve the desired output. Any mathematical operation that is able to assign
computable scores to the states of the matrices can be used for the approach.
The functions that have been experimented with in this paper are listed in
Table 3. For this paper, the mean squared error function is used to compute the
fitness, as per Algorithm 3.
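One way a mean squared error fitness could score a candidate vector of ensemble weights is sketched below. This is an assumption made for illustration, since Algorithm 3 is not reproduced in this section: the sketch compares the weighted-average probability of label 1 against the true labels, and the function name `mse_fitness` and its argument layout are hypothetical. Lower values mean a better weight vector.

```python
def mse_fitness(weights, p_zero_matrix, y_true):
    """Mean squared error of a weighted soft vote (assumed fitness form).

    weights[i]          candidate weight of base classifier i
    p_zero_matrix[s][i] probability of label 0 from classifier i on sample s
    y_true[s]           true label (0 or 1) of sample s
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    squared_error = 0.0
    for probs, y in zip(p_zero_matrix, y_true):
        avg_p_zero = sum(w * p for w, p in zip(weights, probs)) / total
        p_one = 1.0 - avg_p_zero  # ensemble probability of label 1
        squared_error += (p_one - y) ** 2
    return squared_error / len(y_true)
```

An optimizer such as SFO would then search for the weight vector that minimises this score on the training portion of a cluster.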
Definition 9 (Prey Density, PD) The value indicating the number of prey (sardines)
present in each iteration; it decreases in every iteration as the number of prey
decreases during hunting. The value PD can be calculated as follows
$$PD = 1 - \left(\frac{N_{SF}}{N_{SF} + N_S}\right) \tag{8}$$
where $N_{SF}$ and $N_S$ are the numbers of sailfish and sardines in each cycle, respectively.
Definition 10 (Attack Power, AP) Represents the sailfish’s attack power at each
iteration. It helps us to calculate the number of sardines that update their position.
The attack power is calculated as
$$AP = A \times (1 - (2 \times i \times \epsilon)) \tag{9}$$
where A represents the coefficient of the sailfish’s decrementing attack power, $\epsilon$ is
the learning rate and i denotes the ith iteration.
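The two quantities defined above are simple to evaluate, and a small worked example shows how they behave as the hunt proceeds. The function names and the numeric values used in the note below are illustrative assumptions, not parameter settings reported for SCMVE.

```python
def prey_density(n_sailfish, n_sardines):
    """Prey density PD = 1 - N_SF / (N_SF + N_S)."""
    return 1 - n_sailfish / (n_sailfish + n_sardines)


def attack_power(a, i, eps):
    """Attack power AP = A * (1 - 2 * i * eps) at iteration i."""
    return a * (1 - 2 * i * eps)
```

With 10 sailfish, the prey density is 0.9 when 90 sardines remain and drops to 0.8 once only 40 remain. With A = 4 and a rate of 0.001, the attack power starts at 4 at iteration 0 and has decayed to about 3.2 by iteration 100.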
The following matrices give the fitness value for all positions of sailfish and sardines
where n is the number of sailfish, d is the number of variables and $P^{i}_{sf}[x][,]$ is
the position of sailfish x
where m is the number of sardines, d is the number of variables and
$P^{i}_{sar}[y][,]$ is the position of sardine y.
The sailfish with the best fitness values are considered elite and are best suited
for attacking the prey. These sailfish do not change their position in upcoming
iterations, so that the optimal solution is not lost. Moreover, sardines with
the best fitness values are considered the best targets by the sailfish and are the
most exposed to injury.
5. As the hunting continues through each iteration, the power of the sailfish’s attack
also reduces as given in Eq. 9. Further, sardines will have less energy and will
not be able to properly detect the sailfish position, hence their propensity to get
injured increases. The sardine positions are therefore updated as follows
$$P_{sar} = r \cdot \left(SF_{best} - P_{sar} + AP\right), \quad r \sim N(0, 1) \tag{14}$$
9. Finally, the attack is initiated, and the injured sardines are captured. A sardine
is assumed to be attacked if its fitness value is better than that of the sailfish.
10. The space of the captured sardine is taken over by the sailfish to increase the
chances of hunting. The conditional substitution is represented by the following
expression
where $P^{i}_{sar}$ is the position of the sardine at the ith iteration and $P^{i}_{sf}$
gives the position of the sailfish at the ith iteration.
11. This cycle of updating the positions of sardines and sailfish continues until the
end criterion is fulfilled.
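The sardine update and the substitution step can be sketched as follows. This is a rough illustration under several assumptions that the text does not settle: one random draw is shared by all coordinates of a sardine, lower fitness values are treated as better (as with a mean squared error), each sailfish is compared with the currently fittest sardine, and a captured sardine is removed from the population. The function names are hypothetical.

```python
import random


def update_sardines(sardines, best_sailfish, ap, rng):
    """Move each sardine: P_sar <- r * (SF_best - P_sar + AP), r ~ N(0, 1)."""
    updated = []
    for pos in sardines:
        r = rng.gauss(0.0, 1.0)  # assumed: one draw per sardine
        updated.append([r * (b - p + ap) for b, p in zip(best_sailfish, pos)])
    return updated


def substitute(sailfish, sardines, fitness):
    """Let sailfish take the place of fitter sardines.

    Lower fitness is assumed to be better. Returns new lists; the
    inputs are left unchanged.
    """
    sailfish = [list(p) for p in sailfish]
    sardines = [list(p) for p in sardines]
    for idx in range(len(sailfish)):
        if not sardines:
            break
        best = min(range(len(sardines)), key=lambda j: fitness(sardines[j]))
        if fitness(sardines[best]) < fitness(sailfish[idx]):
            # The sailfish moves to the sardine's position; the sardine is captured.
            sailfish[idx] = sardines.pop(best)
    return sailfish, sardines
```

In a weight-optimization setting, each position would be a candidate vector of ensemble weights and `fitness` would score that vector on the cluster's training data.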
The last step is to predict the final accuracy of SCMVE, which is calculated
using the weighted average of the accuracy of individual classifiers. We use a
weighted voting technique and a sailfish optimizer for weight generation for the
following reasons:
• Weighting the base classifiers relative to their performance lets us favour the
classifiers that perform better on a certain cluster and discount those that perform
worse, which increases our overall performance. In comparison, if all base classifiers
were weighted equally, a low-performing base classifier would reduce the overall
accuracy of the model.
• Generating the weights using a metaheuristic optimizer converts this into an
optimization problem. This not only reduces the training time, since we can traverse
the search space faster given the nature of our optimizer and its exploration
capabilities, but also helps us reach a global maximum of the model efficiency;
otherwise, models can get stuck on local maxima due to inefficient search space
exploration strategies.
where Ci denotes the ith cluster, X denotes the input dataset and P denotes the performance metric, for instance, accuracy.
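To make the weighted soft-voting idea concrete, the following minimal sketch blends the class-probability outputs of several base classifiers with a weight vector. It assumes the probabilities have already been produced (for example by an SVM and a KNN model trained on one cluster); the function name `weighted_soft_vote` and the toy numbers are hypothetical, and the weights would in practice come from the optimizer rather than being set by hand.

```python
import numpy as np

def weighted_soft_vote(probas, weights):
    """Blend per-classifier class probabilities with normalised weights.

    probas:  array-like of shape (n_classifiers, n_samples, n_classes)
    weights: array-like of shape (n_classifiers,), non-negative, not all zero
    Returns the predicted labels and the blended probabilities.
    """
    probas = np.asarray(probas, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                            # weights sum to 1
    blended = np.tensordot(w, probas, axes=1)  # shape (n_samples, n_classes)
    return blended.argmax(axis=1), blended

if __name__ == "__main__":
    # Hypothetical probability outputs for two samples and two classes.
    svm_proba = [[0.6, 0.4], [0.3, 0.7]]
    knn_proba = [[0.1, 0.9], [0.4, 0.6]]
    labels, blended = weighted_soft_vote([svm_proba, knn_proba], [0.9, 0.1])
    print(labels, blended)
```

With equal weights the first sample would be assigned to the second class, whereas a weight vector that trusts the first classifier more flips it to the first class, which is the effect the weighting is meant to exploit.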
4 Experimental setup
In this section, we detail the performance evaluation measures used to evaluate our proposed model and to compare its performance with that of other well-known classifiers.
Five evaluation metrics are used to assess the overall classification performance of our proposed model: accuracy, precision, recall, F1 score and AUC. The most commonly used evaluation metric is accuracy, which is defined as the ratio of correctly identified instances to the total number of instances in the dataset. However, in the event of a dataset with a class imbalance, accuracy does not reflect the model’s true performance. Here TP, TN, FP and FN denote the numbers of true positives, true negatives, false positives and false negatives, respectively.
$$\text{Accuracy} = \frac{TP + TN}{TP + FN + FP + TN} \qquad (20)$$
Another widely used metric in credit scoring studies is the AUC, which is based on the receiver operating characteristic (ROC) curve. The ROC curve is a visualisation of the true positive rate (TPR) against the false positive rate (FPR) in binary classification tasks. AUC, which is defined as the area under the ROC curve, is used to compare different classifiers. TPR and FPR are defined as follows:
$$\text{TPR} = \frac{TP}{TP + FN} \qquad (21)$$
$$\text{FPR} = \frac{FP}{FP + TN} \qquad (22)$$
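As a small illustration of the AUC, the sketch below uses the pairwise-ranking formulation, which is equivalent to the area under the ROC curve: the AUC is the fraction of (positive, negative) pairs in which the positive instance receives the higher score, with ties counted as one half. The function name `auc_pairwise` and the toy labels and scores are hypothetical and are not taken from the experiments.

```python
def auc_pairwise(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative instance")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))

if __name__ == "__main__":
    y_true = [1, 1, 0, 0]
    scores = [0.9, 0.4, 0.5, 0.1]
    print(auc_pairwise(y_true, scores))
```

The quadratic loop is fine for a worked example; for datasets of the size used in credit scoring, a library routine based on sorting would normally be preferred.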
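The following are the definitions for the last three metrics: precision, recall, and F1 score:
$$\text{Precision} = \frac{TP}{TP + FP} \qquad (23)$$
$$\text{Recall} = \frac{TP}{TP + FN} \qquad (24)$$
$$\text{F1 score} = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}} \qquad (25)$$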
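The count-based metrics of this section can be computed directly from the four confusion-matrix counts, using the standard definitions of accuracy, TPR, FPR, precision, recall and F1 score. The sketch below is a minimal illustration; the function name `classification_metrics` and the example counts are hypothetical and do not correspond to any result reported in this paper.

```python
def classification_metrics(tp, fn, fp, tn):
    """Compute the count-based metrics from confusion-matrix entries.

    Assumes every denominator is non-zero (no handling of empty classes).
    """
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    tpr = tp / (tp + fn)          # true positive rate, identical to recall
    fpr = fp / (fp + tn)          # false positive rate
    precision = tp / (tp + fp)
    recall = tpr
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "tpr": tpr,
        "fpr": fpr,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

if __name__ == "__main__":
    # Hypothetical counts for 100 test instances.
    for name, value in classification_metrics(tp=40, fn=10, fp=5, tn=45).items():
        print(f"{name}: {value:.4f}")
```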
5 Experimental results
We present the obtained results to support the claim that the ensemble model formulated in this study outperforms the other benchmark models on the key metrics considered in this paper. The experiments in this study were conducted with Python 3.7 on Google Colab. The results of various classifiers (Table 9) are compared with our proposed technique in Figs. 4, 5, 6, 7 and 8 in terms of accuracy, precision, recall, F1 score and AUC. In Sect. 5.1, we present a comparative analysis of the results of numerous individual classifiers. Section 5.2 compares our method with other classical ensemble methods. In Sect. 5.3, the result of SCMVE is highlighted, and we compare it with various individual and ensemble classifiers.
For the Taiwan dataset, the Gradient Boosting model outperforms the others on the precision metric, while the Naive Bayes model performs best on the recall metric for both the Australian and German datasets.
Individual classifiers yield good results on the Australian, German and Taiwan datasets, but most researchers have used an ensemble of individual classifiers. Using a mixture of multiple base learners can help reduce variance and bias, and hence increase the accuracy of predictions. In this research, we examine the performance of various proposed ensemble approaches, such as Tree-Based Dynamic Ensemble (Xia et al. 2020), B-Stacking approach (Xia et al. 2018), Semi-Supervised Selective Ensemble (Xiao et al. 2020) and Enhanced Outlier adaptation (Zhang et al. 2021), to show that ensemble classifiers outperform solo classifiers on credit scoring datasets. Tables 10 and 12 show the results of some popular ensemble approaches on the Australian, German and Taiwan datasets. The data shows that the ensemble classifiers outperform most individual classifiers on most metrics across the datasets. The proposed model was the best performer overall.
The results in Tables 10 and 12 show that the classification method described in this research performs significantly better than the other singular or hybrid ensemble classifiers on the Australian, German and Taiwan datasets. Our model proposes a soft-voting metaheuristic approach: the base models are trained on individual clusters and their performance on each cluster is examined, which produces significantly enhanced results.
While the proposed model outperforms all other classifiers across the evaluation metrics by a large margin on the Australian dataset (Fig. 9), the significant achievement of this paper and the proposed model is the improvement in performance on the German dataset (Fig. 10) and the Taiwan dataset, where our model demonstrates an overall improvement in results achieved through our cluster-based metaheuristic approach.
Further, on experimentation with the selection of the various fitness functions
as referenced in Table 3, we observe a variation in the performance in accordance
with the usage of each fitness function as shown in Figs. 11 and 12. This approach
of finding the most suitable fitness function has been highlighted in Huang et al.
Fig. 11 Variation of precision and recall with the fitness function used on the Australian dataset
Fig. 12 Variation of accuracy, F1 Score, and AUC Score with the fitness function used on the Australian
dataset
Xia et al. (2018) | The model proposed differs from the existing ensemble credit models in three aspects: pool generation, selection of base learners, and trainable fuser | 0.8828 | 0.7866
Tripathi et al. (2019) | This model conducts preprocessing and assigns weights to classifiers. Then feature selection using an ensemble model is applied to a classifier framework | 0.9155 | 0.8268
Guo et al. (2019) | Introduces a novel multi-stage self-adaptive classifier ensemble model based on the statistical techniques and the machine learning techniques to improve the prediction performance | 0.8740 | 0.7830
Xia et al. (2020) | This work develops a novel tree-based overfitting-cautious heterogeneous ensemble model (i.e., OCHE) for credit scoring | 0.8689 | 0.7772
Xiao et al. (2020) | Proposes a cost-sensitive semi-supervised selective ensemble model based on group method of data handling | 0.8689 | 0.7376
Kuppili et al. (2020) | A novel spike-generating function is proposed in Leaky Nonlinear Integrate and Fire Model (LNIF). Its interspike period is computed and utilized in the extreme learning machine (ELM) for classification | 0.9558 | 0.7589
Tripathi et al. (2020) | A novel parametrized algebraic activation function is proposed for extreme learning machine (ELM) | 0.8992 | 0.8118
Liu et al. (2022) | This work develops a multi-grained and multi-layered gradient boosting decision tree for credit scoring | 0.8826 | 0.7653
Zhang et al. (2021) | A local outlier factor algorithm is enhanced with bagging strategy to identify outliers and boost them back into the training set to construct an outlier-adapted training set | 0.9236 | 0.7950
Zhang et al. (2021) | A voting-based outlier detection method is proposed to enhance the outlier detection algorithms with the weighted voting mechanism and boost the outlier scores into the training set to form an outlier-adapted training set | 0.9058 | 0.7950
Runchi et al. (2023) | The Logistic-BWE model combines logistic regression with heterogeneous balancing and dynamic weighting. Sub-models are trained | 0.865 | 0.757
Table 14 Comparison of the performance achieved by various ensemble approaches of credit scoring over Taiwan dataset
Authors Approach Accuracy: Taiwan dataset
of each cluster to generate an output, thus not ignoring any feature that might be distinctly representative in a particular cluster. Credit scoring datasets have outlier issues, and most works have to engineer outlier detection algorithms to overcome them. Examples of this are present in the works of Zhang et al. (2021a, b). Our usage of an ensemble classifier, combined with the clusters formed through the SOM, helps us overcome outliers in the sample space, since the combined result of multiple base classifiers handles outliers better than singular classification models.
Funding The authors declare that no funds, grants, or other support were received during the preparation
of this manuscript. This research did not receive any specific grant from funding agencies in the public,
commercial, or not-for-profit sectors.
Data Availability The datasets generated and/or analysed during the current study are available in https://archive.ics.uci.edu/dataset/143/statlog+australian+credit+approval, https://archive.ics.uci.edu/dataset/144/statlog+german+credit+data and https://archive.ics.uci.edu/dataset/350/default+of+credit+card+clients.
Declarations
Conflict of interest The authors declare that they have no known competing financial interests or personal
relationships that could have appeared to influence the work reported in this paper.
Ethical approval This article does not contain any studies with human participants performed by any of
the authors.
Informed consent This article does not contain any studies with human participants or animals performed
by any of the authors.
References
AghaeiRad A, Chen N, Ribeiro B (2017) Improve credit scoring using transfer of learned knowledge
from self-organizing map. Neural Comput Appl 28(6):1329–1342
Bumacov V, Ashta A, Singh P (2014) The use of credit scoring in microfinance institutions and their out-
reach. Strateg Chang 23(7–8):401–413
Dastile X, Celik T, Potsane M (2020) Statistical and machine learning models in credit scoring: a system-
atic literature survey. Appl Soft Comput 91:106263
Dua D, Graff C (2017) UCI machine learning repository
Fisher RA (1936) The use of multiple measurements in taxonomic problems. Ann Eugen 7(2):179–188
Gholamian M, Jahanpour S, Sadatrasoul S (2013) A new method for clustering in credit scoring prob-
lems. J Math Comput Sci 6:97–106
Guo S, He H, Huang X (2019) A multi-stage self-adaptive classifier ensemble model with application in
credit scoring. IEEE Access 7:78549–78559
Hand DJ, Kelly MG (2002) Superscorecards. IMA J Manag Math 13(4):273–281
He H, Zhang W, Zhang S (2018) A novel ensemble method for credit scoring: adaption of different
imbalance ratios. Expert Syst Appl 98:105–117
Henley WEM, Hand DJ (1996) A k-nearest-neighbour classifier for assessing consumer credit risk. J Roy Stat Soc Ser D (Stat) 45(1):77–95
Hsieh N-C (2005) Hybrid mining approach in the design of credit scoring models. Expert Syst Appl
28(4):655–665
Hsieh N-C, Hung L-P (2010) A data-driven ensemble classifier for credit scoring analysis. Expert Syst
Appl 37(1):534–545
Huang C, Li Y, Yao X (2019) A survey of automatic parameter tuning methods for metaheuristics. IEEE
Trans Evol Comput 24(2):201–216
Kohonen T (1998) The self-organizing map. Neurocomputing 21(1–3):1–6
Kozodoi N, Lessmann S (2020) Multi-objective particle swarm optimization for feature selection in credit
scoring. In: Workshop on mining data for financial applications. Springer, pp 68–76
Kuppili V, Tripathi D, Edla DR (2020) Credit score classification using spiking extreme learning
machine. Comput Intell 36(2):402–426
Lappas PZ, Yannacopoulos AN (2021) A machine learning approach combining expert knowledge
with genetic algorithms in feature selection for credit risk assessment. Appl Soft Comput
107:107391
Lau KW, Hujun Y, Simon H (2006) Kernel self-organising maps for classification. Neurocomputing
69(16–18):2033–2040
Lee T-S, Chiu C-C, Lu C-J, Chen I-F (2002) Credit scoring using the hybrid neural discriminant tech-
nique. Expert Syst Appl 23(3):245–254
Li X, Ying W, Tuo J, Li B, Liu W (2004) Applications of classification trees to consumer credit scor-
ing methods in commercial banks. In: 2004 IEEE international conference on systems, man and
cybernetics (IEEE Cat. No. 04CH37583), vol 5. IEEE, pp 4112–4117
Li S-T, Shiue W, Huang M-H (2006) The evaluation of consumer loans using support vector machines.
Expert Syst Appl 30(4):772–782
Liu W, Fan H, Xia M (2022) Multi-grained and multi-layered gradient boosting decision tree for
credit scoring
Martens D, De Backer M, Haesen R, Vanthienen J, Snoeck M, Baesens B (2007) Classification with
ant colony optimization. IEEE Trans Evol Comput 11(5):651–665
Nalič J, Martinovič G, Žagar D (2020) New hybrid data mining model for credit scoring based on fea-
ture selection algorithm and ensemble classifiers. Adv Eng Inform 45:101130
Nassef MGA, Hussein TM, Mokhiamar O (2021) An adaptive variational mode decomposition based
on sailfish optimization algorithm and gini index for fault identification in rolling bearings.
Measurement 173:108514
Onan A (2018a) Biomedical text categorization based on ensemble pruning and optimized topic mod-
elling. Comput Math Methods Med 2018(1):2497471
Onan A (2018) An ensemble scheme based on language function analysis and feature engineering for
text genre classification. J Inf Sci 44(1):28–47
Onan A (2019a) Consensus clustering-based undersampling approach to imbalanced learning. Sci
Program 2019(1):5901087
Onan A (2019b) Two-stage topic extraction model for bibliometric data analysis based on word
embeddings and clustering. IEEE Access 7:145614–145633
Onan A (2021a) Sentiment analysis on massive open online course evaluations: a text mining and
deep learning approach. Comput Appl Eng Educ 29(3):572–589
Onan A (2021b) Sentiment analysis on product reviews based on weighted word embeddings and deep
neural networks. Concurr Comput Pract Exp 33(23):e5909
Onan A (2022) Bidirectional convolutional recurrent neural network architecture with group-wise
enhancement mechanism for text sentiment classification. J King Saud Univ Comput Inf Sci
34(5):2098–2117
Onan A (2023a) Gtr-ga: harnessing the power of graph-based neural networks and genetic algorithms
for text augmentation. Expert Syst Appl 232:120908
Onan A (2023b) Srl-aco: a text augmentation framework based on semantic role labeling and ant
colony optimization. J King Saud Univ Comput Inf Sci 35(7):101611
Onan A, Korukoǧlu S, Bulut H (2016a) Ensemble of keyword extraction methods and classifiers in
text classification. Expert Syst Appl 57:232–247
Onan A, Korukoǧlu S, Bulut H (2016b) A multiobjective weighted voting ensemble classifier based
on differential evolution algorithm for text sentiment classification. Expert Syst Appl 62:1–16
Onan A, Korukoǧlu S, Bulut H (2017) A hybrid ensemble pruning approach based on consensus clus-
tering and multi-objective evolutionary algorithm for sentiment classification. Inf Process Manag
53(4):814–833
Pławiak P, Abdar M, Acharya RU (2019) Application of new deep genetic cascade ensemble of svm
classifiers to predict the Australian credit scoring. Appl Soft Comput 84:105740
Reichert AK, Cho C-C, Wagner GM (1983) An examination of the conceptual issues involved in
developing credit-scoring models. J Bus Econ Stat 1(2):101–114
Runchi Z, Liguo X, Qin W (2023) An ensemble credit scoring model based on logistic regression
with heterogeneous balancing and weighting effects. Expert Syst Appl 212:118732
Safi SA-D, Castillo PA, Faris H (2022) Cost-sensitive metaheuristic optimization-based neural net-
work with ensemble learning for financial distress prediction. Appl Sci 12(14):6918
Şen D, Dönmez CÇ, Yıldırım UM (2020) A hybrid bi-level metaheuristic for credit scoring. Inf Syst
Front 22(5):1009–1019
Shadravan S, Naji HR, Bardsiri VK (2019) The sailfish optimizer: a novel nature-inspired metaheuris-
tic algorithm for solving constrained engineering optimization problems. Eng Appl Artif Intell
80:20–34
Simumba N, Okami S, Kodaka A, Kohtake N (2022) Multiple objective metaheuristics for feature selec-
tion based on stakeholder requirements in credit scoring. Decis Support Syst 155:113714
Singh P (2017) Comparative study of individual and ensemble methods of classification for credit scor-
ing. In: 2017 International conference on inventive computing and informatics (ICICI). IEEE, pp
968–972
Suleiman S, Ibrahim A, Usman D, Isah BY, Usman HM (2021) Improving credit scoring classification
performance using self organizing map-based machine learning techniques. Eur J Adv Eng Technol
8(10):28–35
Transpire Online sailfish optimizer (sfo) (2019) A novel method motivated from the behavior of sailfish
for optimal solution
Tripathi D, Edla DR, Cheruku R, Kuppili V (2019) A novel hybrid credit scoring model based on ensem-
ble feature selection and multilayer ensemble classification. Comput Intell 35(2):371–394
Tripathi D, Edla DR, Kuppili V, Bablani A (2020a) Evolutionary extreme learning machine with novel
activation function for credit scoring. Eng Appl Artif Intell 96:103980
Tripathi D, Edla DR, Kuppili V, Dharavath R (2020b) Binary bat algorithm and rbfn based hybrid credit
scoring model. Multimedia Tools Appl 79(43):31889–31912
Van Gestel IT, Baesens B, Garcia IJ, Van Dijcke P (2003) A support vector machine approach to credit
scoring. In: Forum Financier-Revue Bancaire et Financiaire Bank en Financiewezen, pp 73–82
West D (2000) Neural network credit scoring models. Comput Oper Res 27(11–12):1131–1152
Xia Y, Liu C, Da B, Xie F (2018) A novel heterogeneous ensemble credit scoring model based on bstack-
ing approach. Expert Syst Appl 93:182–199
Xia Y, Zhao J, He L, Li Y, Niu M (2020) A novel tree-based dynamic heterogeneous ensemble method
for credit scoring. Expert Syst Appl 159:113615
Xiao XJ, Zhong ZY, Xie L, Xin G, Liu D (2020) Cost-sensitive semi-supervised selective ensemble
model for customer credit scoring. Knowl-Based Syst 189:105118
Zhang W, Yang D, Zhang S (2021a) A new hybrid ensemble model with voting-based outlier detection
and balanced sampling for credit scoring. Expert Syst Appl 174:114744
Zhang W, Yang D, Zhang S, Ablanedo-Rosas JH, Xin W, Lou Yu (2021b) A novel multi-stage ensemble
model with enhanced outlier adaptation for credit scoring. Expert Syst Appl 165:113872
Zhou Y, Shen L, Ballester L (2023) A two-stage credit scoring model based on random forest: evidence
from Chinese small firms. Int Rev Financ Anal 89:102755
Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps
and institutional affiliations.
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under
a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted
manuscript version of this article is solely governed by the terms of such publishing agreement and
applicable law.
* Indu Singh
indusingh@dtu.ac.in
D. P. Kothari
dpkvits@gmail.com
S. Aditya
s.aditya.me@gmail.com
Mihir Rajora
mihee20@gmail.com
Charu Agarwal
acharu848@gmail.com
Vibhor Gautam
vibhorgautam907@gmail.com
1
Department of Computer Science and Engineering, Delhi Technological University,
Delhi 110042, India
2
Department of Electrical Engineering, Visvesvaraya National Institute of Technology, Nagpur,
Maharashtra 440010, India
3
Department of Electronics and Communication Engineering, National Institute of Technology,
Hamirpur, Himachal Pradesh 177005, India
4
Department of Electronics and communication Engineering, Delhi Technological University,
Delhi 110042, India