


DOI 10.29042/2018-4358-4363

Helix Vol. 8(6): 4358- 4363

A Survey on Machine Learning Techniques for Insurance Fraud Prediction

*1Komal S. Patil, 2Prof. Anand Godbole
Department of Computer Engineering, Sardar Patel Institute of Technology, Mumbai, India
Email: pkomals94@gmail.com, anand_godbole@spit.ac.in

Received: 20th September 2018, Accepted: 11th October 2018, Published: 31st October 2018

Abstract
Fraudulent activities in the insurance sector are increasing day by day as technology advances, and these fraud cases have a damaging impact on the socio-economic system. This paper presents a detailed survey of machine learning techniques used for insurance fraud prediction. It covers traditional machine learning techniques, namely supervised and unsupervised learning, as well as contemporary methods such as hybrid and ensemble learning. Since the approach to the problem changes with the dataset, this paper aims to provide an organized overview of fraud prediction techniques based on the type of training data provided to the machine learning model.

Keywords
Insurance Fraud; Supervised Learning; Unsupervised Learning; Hybrid Classifiers; Ensemble Classifiers; Bagging; Boosting;
Stacking.

Introduction
‘Insurer’ and ‘Insured’ are the two pillars of the insurance industry, and the whole business runs on the utmost good faith between them. Insurance fraud occurs when either the insurer or the insured performs an action intended to gain an advantage to which they are not legally entitled, or when one party deliberately denies the other a benefit to which it is legally entitled. The main intention behind initiating an insurance fraud is "to appear as conventional and to proceed and get recompense in a routine manner".
A survey by the Times of India (TOI) reveals that one in every ten insurance claims is found to be fraudulent, which means around 10% of total insurance claims are fraud. According to a study by KPMG India Financial Services, insurance is more vulnerable to fraud than other financial services; the study estimates the loss caused by insurance fraud at over Rs. 30,000 crore (approximately $4.5 billion), which is about 9% of the total value of the insurance industry. A financial survey by Ernst & Young identifies premium, claims and third-party frauds as the three main fraud risks in the insurance sector, of which fraudulent claims alone contribute around 50% of total fraud.
Nowadays almost every organization and agency stores its data in databases, and this data can be used for detecting and analyzing fraudulent activities. Hidden knowledge and patterns can be discovered using various machine learning techniques. Machine learning provides a wide range of methods and algorithms to handle different types of problems, depending on the needs of an organization and the type of data, and it is therefore one of the most popular approaches for classifying and predicting fraud. Machine learning models are trained on historic data and make predictions based on the knowledge extracted from it; the model keeps updating with new patterns as new data arrives. Machine learning is a trending framework because of its high reliability and compatibility. A single classifier, a multi-classifier, or a hybrid model can be used according to the needs of the problem. Many researchers have used different techniques to deal with fraud cases. This paper gives a generalized overview of the techniques used for insurance fraud prediction and detection. The techniques are categorized based on the type of the data and on the type of problem being solved by the model. Some hybrid and ensemble techniques are also covered, as these are gaining popularity because of their compatibility with other algorithms and have proved more efficient than single-classifier models.

Materials and Methods


1. Traditional Machine Learning Methods
Traditionally, machine learning techniques are classified on the basis of the type of data provided to the model. On that basis there are three types of machine learning methods: supervised learning, unsupervised learning and semi-supervised learning. Supervised and unsupervised methods are widely used by researchers for fraud detection, while a few have also used semi-supervised methods for fraud prediction.

4358 Copyright © 2018 Helix E-ISSN: 2319-5592; P-ISSN: 2277-3495



1.1. Supervised Learning


Supervised learning techniques require a labeled dataset. The labels are the target variables, which disclose whether a particular claim is fraudulent or not. This means that to use supervised techniques, the data should contain previously, correctly identified claims; from this data the algorithm generalizes the fraud instances and makes predictions on new data instances.
Formally, let X be a set of n instances, each represented by a feature vector x = (x_1, ..., x_D), where D is the dimension of x. A training dataset is the collection of all such instances {x_i}_{i=1}^n = {x_1, ..., x_n}, which is given as input to the learning model. Let Y be the set of labels, i.e., the distinct values of the different classes, with y ∈ {1, ..., C}, where C is the number of classes. Given P(X_i, Y_i), the probability of instance i having a particular class label, supervised learning trains a function f such that f(x) predicts whether Y_i is the correct label for a given instance X_i.
However, the supervised learning method has a few drawbacks. Labeled data is not always available: an organization needs to maintain the data labels from the beginning, and if the data is unlabeled it is quite difficult and expensive to label such a huge volume retrospectively. For such cases, unsupervised learning methods are used. K-Nearest Neighbor (KNN), Naive Bayes, Decision Trees, Support Vector Machine (SVM), Neural Networks and Regression are some popular supervised algorithms.
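As a minimal sketch, the supervised workflow described above (train on labeled claims, predict on new ones) can be illustrated with KNN, one of the algorithms listed. The data here is synthetic and only simulates the roughly 10% fraud rate; it is not drawn from any real insurance dataset.

```python
# Illustrative sketch of supervised fraud classification on SYNTHETIC data.
# A real insurer would use historical claims with verified fraud labels.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# 1000 labeled "claims", 10 features, ~10% in the fraud class
X, y = make_classification(n_samples=1000, n_features=10,
                           weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Train on the labeled set, then predict labels for unseen claims
clf = KNeighborsClassifier(n_neighbors=5)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
print(round(accuracy, 2))
```

Note that with a 90/10 class imbalance, raw accuracy is a weak measure: a classifier that labels every claim as legitimate already scores about 0.9, which is why the imbalance-handling work surveyed later ([2], [11]) matters.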

1.2. Unsupervised Learning


Unsupervised learning methods deal with data where the target variable, i.e., the data label, is not available. Unsupervised learning finds specific patterns within the data; this method of discovering particular structures within the regularities of the data is called density estimation. Clustering is a commonly used method for density estimation. In clustering, claims that possess similar characteristics are grouped together, under the assumption that the majority of instances are non-fraudulent.
For the given training set {x_i}_{i=1}^n, the aim of unsupervised learning is to separate the n instances into k clusters in such a way that instances within the same cluster share characteristics while instances of different clusters differ. The number of clusters can be predefined, or the algorithm itself can partition the data into clusters based on its characteristics. The clusters formed are not necessarily well discriminated: they may overlap or be poorly separated, so that the boundary between clusters is very thin or absent altogether. This is a limitation of unsupervised learning, due to which a new claim may not be classified properly. Clustering, association rules and Principal Component Analysis (PCA) are commonly used in unsupervised learning.
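The clustering approach described above can be sketched as follows. The data is synthetic, and the "smallest cluster is suspicious" rule at the end is only an illustrative heuristic based on the assumption, stated above, that most instances are non-fraudulent; it is not a method taken from the surveyed papers.

```python
# Illustrative sketch: unlabeled claim vectors grouped by k-means.
# Under the assumption that most claims are legitimate, the rarest
# pattern (smallest cluster) is a candidate for manual fraud review.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# 500 synthetic, UNLABELED claim vectors (no fraud labels available)
X, _ = make_blobs(n_samples=500, centers=3, n_features=4, random_state=0)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
sizes = np.bincount(kmeans.labels_)          # how many claims per cluster
suspect_cluster = int(np.argmin(sizes))      # heuristic: flag the smallest
print(sizes, suspect_cluster)
```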

2. Hybrid Methods
Every individual learning method has its own benefits and drawbacks, so some researchers have started using hybrid approaches for fraud detection. A hybrid learner uses two different learners together so that the flaws of one learner can be compensated by the other. Hybrid methods are designed to perform specific tasks by combining two or more algorithms, which can be supervised, unsupervised or both; most hybrid methods use a combination of supervised and unsupervised methods. Vipula Rawte et al. [4] used an evolving clustering method to first detect the cluster of the disease and then applied SVM to detect whether a particular claim is legitimate or fraudulent. In most hybrid methods, clustering is used to identify the position of an instance, and different classifiers are then used to assign that instance to a specific class.
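The cluster-then-classify pattern can be sketched schematically as below. This is not the implementation of Rawte et al. [4]: K-means stands in for their evolving clustering method, the data is synthetic, and passing the cluster id to the classifier as an extra feature is just one simple way of combining the two stages.

```python
# Schematic sketch of a hybrid (unsupervised + supervised) pipeline.
# Stage 1 locates each instance with clustering; stage 2 classifies it.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=600, n_features=6, random_state=1)

# Stage 1 (unsupervised): assign each claim to a cluster
cluster_ids = KMeans(n_clusters=2, n_init=10, random_state=1).fit_predict(X)

# Stage 2 (supervised): the cluster id becomes an extra feature for the SVM
X_aug = np.column_stack([X, cluster_ids])
svm = SVC().fit(X_aug, y)
print(round(svm.score(X_aug, y), 2))
```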

3. Ensemble Learners
An ensemble is a framework that integrates several homogeneous or heterogeneous learners so that the combined model outperforms any single classifier. The main idea behind constructing an ensemble is to improve prediction performance. A typical ensemble learner contains the following elements:
1. Training Set: The training dataset for ensemble models always needs to be labeled. The training data can be considered as attribute-value vectors, written X = {x_1, x_2, ..., x_n}, where n is the number of attributes in the training set and Y is the set of target variables.
2. Base Classifiers: The base classifiers are the classification algorithms trained on the training set. Each base classifier performs independently of the others, solves the same problem, and makes its own predictions. The predictions made by different classifiers may or may not agree, and this diversity gives the ensemble more generalized predictions.
3. Combiner: As the name suggests, the combiner merges the output of each base classifier and produces a new prediction. The combiner can be a function or another classifier (a meta-classifier).
Some predefined ensemble models are given below:


3.1 Bagging
Bagging is the most commonly used independent ensemble model. It trains classifiers on small subsets of instances and then combines all their predictions by averaging or voting. In bagging, the training dataset is divided into subsets, and each subset is given to an individual base classifier. These base classifiers make their predictions, and the combiner function produces the final prediction by taking the mean, average or vote of every base classifier. The base classifiers used in bagging may or may not be of the same type; using diverse classifiers produces diverse outputs, and hence a more generalized final prediction.

Figure 1: Bagging Model

Wagging is a variant of bagging; the only difference is that in wagging all the base classifiers are trained on the entire training dataset and each base classifier carries a particular weight for its prediction. With this approach, higher weights can be assigned to the classifiers that give more accurate predictions than the other base classifiers, so that in the final prediction the classifiers with high weight have more influence. Random forest is another variant of bagging. As the name suggests, a random forest is a forest of decision trees: the decision trees are built as base classifiers and the final prediction is formed by combining the outputs of all the base trees.
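A minimal sketch of bagging and its random-forest variant on synthetic data (scikit-learn's BaggingClassifier uses a decision tree as its default base classifier, so both ensembles here aggregate trees):

```python
# Illustrative sketch: bagging vs. its random-forest variant.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=800, n_features=8, random_state=2)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=2)

# Plain bagging: each base tree sees a bootstrap sample of the training
# set; the combiner takes a majority vote over the trees.
bag = BaggingClassifier(n_estimators=25, random_state=2).fit(X_tr, y_tr)

# Random forest: bagging plus a random subset of features at each split,
# which makes the base trees more diverse.
rf = RandomForestClassifier(n_estimators=25, random_state=2).fit(X_tr, y_tr)

print(round(bag.score(X_te, y_te), 2), round(rf.score(X_te, y_te), 2))
```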

3.2 Boosting
Boosting is an iterative ensemble model, also called a dependent ensemble model. It is an iterative technique that adapts the weight of each observation based on the most recently applied classifier: it increases the weight of observations that were classified incorrectly by the previous classifier. In the first iteration, boosting performs normal classification on the given data and makes predictions. The instances misclassified in the first iteration are emphasized in the next iteration in order to improve the accuracy of the weak classifiers. How the iterations work depends on the type of boosting; the different types are given below:

Ada-Boosting: This adaptive boosting technique uses a weighted approach for misclassified instances. In the very first iteration, all instances are assigned equal weights and predictions are made; after each iteration, the misclassified instances are assigned higher weights than the other instances, in order to improve the predictability of the model. This process continues until the most accurate prediction model is built.
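A minimal sketch of Ada-Boosting on synthetic data. scikit-learn's AdaBoostClassifier performs the iterative re-weighting of misclassified instances internally, training a sequence of shallow "weak" trees:

```python
# Illustrative sketch of AdaBoost: a sequence of weak learners, with
# misclassified instances up-weighted before each new round.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=800, n_features=8, random_state=3)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=3)

# Up to 50 boosting rounds; the default weak learner is a depth-1 tree
ada = AdaBoostClassifier(n_estimators=50, random_state=3).fit(X_tr, y_tr)
print(round(ada.score(X_te, y_te), 2))
```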

Gradient Boosting: Gradient boosting runs on the same principle as traditional boosting. In the first iteration, a simple classification model is built to predict outcomes over the given data, and a loss function is then evaluated on the misclassified instances. The original predictor and the error model are combined to build a new predictor that yields higher accuracy than the base predictor, and this process of fitting to the error continues until the model reaches a minimum of the loss. Because the error is reduced sequentially after every iteration, gradient boosting produces a strong prediction model.


Figure 2: Boosting Model
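The stage-by-stage error reduction of gradient boosting can be observed directly: scikit-learn records the training loss after each boosting stage in the fitted model's train_score_ attribute, which should decrease overall. A sketch on synthetic data:

```python
# Illustrative sketch: gradient boosting fits each new tree to the errors
# of the current ensemble, so the training loss shrinks stage by stage.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=800, n_features=8, random_state=4)

gb = GradientBoostingClassifier(n_estimators=100, random_state=4).fit(X, y)

# train_score_[i] is the training loss after stage i
losses = gb.train_score_
print(losses[0] > losses[-1])   # loss at the end is below loss at the start
```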

3.3 Stacking
Stacking is similar to bagging; the basic framework of the two models is the same, the only difference being that in stacking a meta-classifier (meta-learner) is used for the final prediction. The stacking model is also called a super learner, and its generalization ability is far greater than that of the other models. The first layer of stacking is the same as in bagging: the base classifiers make predictions on the dataset provided to them, and the outputs of all the base classifiers are given as input to the meta-classifier. The meta-classifier produces its output using these new attributes, i.e., the predictions made by the base classifiers, while the target variable remains the same as in the original dataset. During testing, test instances are first classified by the base classifiers, their predictions are added as new attributes to the dataset, and this augmented data is fed to the meta-classifier, which combines the different base-classifier outputs into the final prediction.

Figure 3: Stacking Model
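The two-layer structure described above can be sketched with scikit-learn's StackingClassifier on synthetic data, with a decision tree and an SVM as base classifiers and logistic regression as the meta-classifier; the choice of these particular classifiers is illustrative only.

```python
# Illustrative sketch of stacking: the base classifiers' predictions
# become the input features of a meta-classifier.
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=800, n_features=8, random_state=5)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=5)

stack = StackingClassifier(
    estimators=[('tree', DecisionTreeClassifier(random_state=5)),
                ('svm', SVC(random_state=5))],
    final_estimator=LogisticRegression())   # the meta-classifier (combiner)
stack.fit(X_tr, y_tr)
print(round(stack.score(X_te, y_te), 2))
```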


Results and Discussion
The articles studied and analyzed are listed in Table 1, categorized by the methods and algorithms used in the research.

Sr no | Title of Article | Authors | Type | Algorithms used
1 | Modeling Insurance Fraud Detection Using Ensemble Combining Classification [1] | Amira Kamil et al. | Ensemble | SVM, ANN, DT
2 | A novel hybrid undersampling method for mining unbalanced datasets in banking and insurance [2] | G. Ganesh Sundarkumar et al. | Supervised | DT, SVM, LR, PNN, MLP, GMDH, k-Reverse Nearest Neighbor
3 | Research and Application of Random Forest Model in Mining Automobile Insurance Fraud [3] | Yaqi Li et al. | Supervised | Random Forest
4 | Fraud Detection in Health Insurance using Data Mining Techniques [4] | Vipula Rawte et al. | Hybrid | SVM, K-means Clustering
5 | Credit Card Fraud Detection: A Hybrid Approach Using Fuzzy Clustering & Neural Network [24] | Tanmay Kumar Behera et al. | Hybrid | Fuzzy Clustering, Neural Network
6 | A Hybrid Outlier Detection Algorithm Based On Partitioning Clustering And Density Measures [25] | Hamada Rizk et al. | Hybrid | K-Medoids
7 | A Principle Component Analysis-based Random Forest with the Potential Nearest Neighbor Method for Automobile Insurance Fraud Identification [6] | Yaqi Li et al. | Ensemble | Random Forest, Principal Component Analysis, Potential Nearest Neighbor
8 | A Case Study of Applying Boosting Naive Bayes to Claim Fraud Diagnosis [9] | Stijn Viaene et al. | Ensemble | Naive Bayes
9 | Pattern Discovery on Australian Medical Claims Data: A Systematic Approach [10] | Ah Chung Tsoi et al. | Unsupervised | Clustering, Hidden Markov Model
10 | Combining Re-sampling with Twin Support Vector Machine for Imbalanced Data Classification [11] | Lu Cao et al. | Supervised | Twin SVM
11 | Fraud Detection and Frequent Pattern Matching in Insurance claims using Data Mining Techniques [12] | Aayushi Verma et al. | Unsupervised | K-Means Clustering
12 | Random Rough Subspace based Neural Network Ensemble for Insurance Fraud Detection [14] | Wei Xu et al. | Ensemble | Neural Network, OSS, Resilient backpropagation
13 | Framework for the Identification of Fraudulent Health Insurance Claims using Association Rule Mining [18] | Saba Kareem et al. | Unsupervised, Supervised | Clustering, Apriori Algorithm, SVM

Table 1: Review Articles


Conclusion
This survey has explored the machine learning techniques used in insurance fraud prediction. Machine learning provides a vast range of methods and algorithms for fraud prediction. Supervised and unsupervised learning methods are widely used in combination with other methods to improve the prediction accuracy of the model. Hybrid learning methods give the user flexibility by blending different algorithms together, and these techniques have outperformed traditional learning methods. Ensemble learning has recently gained importance due to its reliability and its flexibility with different approaches. A few recent studies reveal that ensembles not only improve prediction accuracy but also address some chronic machine learning problems such as over-fitting, class imbalance and concept drift. Ensemble models and their applications are appealing because of their generalization ability. Ensembles are expensive to build in terms of both time and resources, but this can be seen as a one-time investment, because once the ensemble is assembled it produces highly efficient results.

References
[1] Amira Kamil, Ibrahim Hassan and Ajith Abraham. "Modeling Insurance Fraud Detection Using Ensemble Combining Classification". International Journal of Computer Information Systems and Industrial Management Applications, ISSN 2150-7988, Volume 8, 2016.
[2] G. Ganesh Sundarkumar, Vadlamani Ravi. "A novel hybrid undersampling method for mining unbalanced datasets in banking and insurance". Engineering Applications of Artificial Intelligence 37, 2015.
[3] Yaqi Li, Chun Yan, Wei Liu, Maozhen Li. "Research and Application of Random Forest Model in Mining Automobile Insurance Fraud". International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD), 2016.
[4] Vipula Rawte, G. Anuradha. "Fraud Detection in Health Insurance using Data Mining Techniques". 2015 International Conference on Communication, Information and Computing Technology (ICCICT), Jan. 16-17, 2015.
[5] Siddhartha Bhattacharyya, Sanjeev Jha, Kurian Tharakunnel, J. Christopher Westland. "Data mining for credit card fraud: A Comparative Study". Decision Support Systems 50, 2011.
[6] Yaqi Li, Chun Yan, Wei Liu, Maozhen Li. "A Principle Component Analysis-based Random Forest with the Potential Nearest Neighbor Method for Automobile Insurance Fraud Identification". Applied Soft Computing Journal.
[7] Stijn Viaene, Mercedes Ayuso, Montserrat Guillen, Dirk Van Gheel, Guido Dedene. "Strategies for detecting fraudulent claims in the automobile insurance industry". European Journal of Operational Research 176, 2007.
[8] Michal Wozniak, Manuel Grana, Emilio Corchado. "A Survey of Multiple Classifier Systems as Hybrid Systems". Information Fusion, 2013.
[9] Stijn Viaene, Richard A. Derrig, Guido Dedene. "A Case Study of Applying Boosting Naive Bayes to Claim Fraud Diagnosis". IEEE Transactions on Knowledge and Data Engineering, Vol. 16, No. 5, May 2004.
[10] Ah Chung Tsoi, Shu Zhang, Markus Hagenbuchner. "Pattern Discovery on Australian Medical Claims Data: A Systematic Approach". IEEE Transactions on Knowledge and Data Engineering, Vol. 17, No. 10, October 2005.
[11] Lu Cao, Hong Shen. "Combining Re-sampling with Twin Support Vector Machine for Imbalanced Data Classification". 17th International Conference on Parallel and Distributed Computing, Applications and Technologies, 2016.
[12] Aayushi Verma, Anu Taneja, Anuja Arora. "Fraud Detection and Frequent Pattern Matching in Insurance claims using Data Mining Techniques". Proceedings of the 2017 Tenth International Conference on Contemporary Computing (IC3), 10-12 August 2017, Noida, India.
[13] Lior Rokach. "Ensemble-based classifiers". Springer Science+Business Media B.V., 2009.
[14] Wei Xu, Shengnan Wang, Dailing Zhang, Bo Yang. "Random Rough Subspace based Neural Network Ensemble for Insurance Fraud Detection". Fourth International Joint Conference on Computational Sciences and Optimization, 2011.
[15] Yi Peng, Gang Kou, Alan Sabatka, Zhengxin Chen, Deepak Khazanchi, Yong Shi. "Application of Clustering Methods to Health Insurance Fraud Detection". IEEE, 2006.
[16] M. S. Anbarasi, S. Dhivya. "Fraud Detection Using Outlier Predictor in Health Insurance Data". International Conference on Information, Communication & Embedded Systems (ICICES), 2017.
[17] Riya Roy, Thomas George K. "Detecting Insurance Claims Fraud Using Machine Learning Techniques". International Conference on Circuits, Power and Computing Technologies (ICCPCT), 2017.
[18] Saba Kareem, Rohiza Binti Ahmad, Aliza Binti Sarlan. "Framework for the Identification of Fraudulent Health Insurance Claims using Association Rule Mining". IEEE Conference on Big Data and Analytics (ICBDA), 2017.
[19] Chun Yan, Yaqi Li. "The Identification Algorithm and Model Construction of Automobile Insurance Fraud Based on Data Mining". Fifth International Conference on Instrumentation and Measurement, Computer, Communication and Control, 2015.
[20] Stijn Viaene, Richard A. Derrig, Bart Baesens, Guido Dedene. "A Comparison of State-of-the-Art Classification Techniques for Expert Automobile Insurance Claim Fraud Detection". The Journal of Risk and Insurance, Vol. 69, No. 3, pp. 373-421, 2002.
[21] E. W. T. Ngai, Yong Hu, Y. H. Wong, Yijun Chen, Xin Sun. "The application of data mining techniques in financial fraud detection: A classification framework and an academic review of literature". Decision Support Systems 50, pp. 559-569, 2011.
[22] S. Viaene, G. Dedene, R. A. Derrig. "Auto claim fraud detection using Bayesian learning neural networks". Expert Systems with Applications 29, pp. 653-666, 2005.
[23] Richard A. Bauder, Taghi M. Khoshgoftaar. "Medicare Fraud Detection using Machine Learning Methods". 16th IEEE International Conference on Machine Learning and Applications, 2017.
[24] Tanmay Kumar Behera, Suvasini Panigrahi. "Credit Card Fraud Detection: A Hybrid Approach Using Fuzzy Clustering & Neural Network". Second International Conference on Advances in Computing and Communication Engineering, 2015.
[25] Hamada Rizk, Sherin Elgokhy, Amany Sarhan. "A Hybrid Outlier Detection Algorithm Based on Partitioning Clustering and Density Measures". IEEE, 2015.


