KBC NORTH MAHARASHTRA UNIVERSITY, JALGAON
Z.B. PATIL COLLEGE, DHULE
Department of STATISTICS
“AUTO INSURANCE FRAUD DETECTION”
BY: DYNANESHWAR KOTE, AJAY PATIL, SACHIN WANKHEDE
Guided By: MS. PRATIKSHA RUIKAR
INTRODUCTION
⚫ Fraud is an improper activity committed by individuals in order to gain a benefit.
⚫ There are various types of fraud, viz. health care fraud and agricultural fraud, but we focus on auto insurance fraud detection.
What is Auto Insurance Fraud Detection?
⚫ The insurance industry is concerned with detecting fraudulent claims made against insurance companies in connection with vehicles. The number of automobile claims involving some kind of suspicious circumstance is high and has become a subject of major interest for companies. Auto insurance fraud can be detected by building a classification model.
Need for Auto Insurance Fraud Detection
⚫ India is one of the biggest markets for the insurance industry in the world, yet it is not free from risks.
⚫ The Indian insurance industry loses around $6 billion every year to insurance fraud.
⚫ Hence there is an urgent need to develop a capability that helps companies identify, quickly and with a high degree of accuracy, whether a given insurance claim is fraudulent or genuine.
⚫ This will also help maintain customer satisfaction and trust in the insurance company.
OBJECTIVE
⚫ To minimize the number of fraudulent claims.
⚫ To build a classification methodology to determine whether a customer is placing a fraudulent insurance claim or not.
⚫ To make the claims process fast and highly accurate.
⚫ To reduce the company's financial loss due to such illegal frauds.
Methodology
⚫ We use machine learning algorithms implemented in Python.
⚫ The data used for this study is secondary data, extracted and compiled from the Kaggle website.
⚫ The data is preprocessed, a model is trained on it using the XGBoost classifier, and we then predict whether a given claim is fraudulent.
Importing Libraries
⚫ In this step we import all the libraries required in our project.
Load the Dataset
⚫ After importing the required libraries, we load the dataset.
⚫ The dataset used in this project is a publicly available dataset taken from Kaggle.
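A minimal sketch of the loading step, assuming the Kaggle file has been saved locally as a CSV. The file name `insurance_claims.csv` and the column names are hypothetical; a small in-memory sample stands in for the real file so that the snippet runs on its own.

```python
import io

import pandas as pd

# Hypothetical stand-in for the downloaded Kaggle CSV. With the real file,
# pass its path instead, e.g. pd.read_csv("insurance_claims.csv").
sample_csv = io.StringIO(
    "age,policy_state,total_claim_amount,fraud_reported\n"
    "48,OH,71610,Y\n"
    "42,IN,5070,Y\n"
    "29,OH,34650,N\n"
)

df = pd.read_csv(sample_csv)
print(df.shape)  # (number of rows, number of columns)
```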
Basic Operations on Dataset
⚫ Here we perform some basic operations on our dataset to check whether it is working properly.
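The checks in this step might look like the sketch below. The miniature table and its column names are assumptions made for illustration, not the actual Kaggle schema.

```python
import pandas as pd

# Hypothetical miniature of the claims table; column names are assumptions.
df = pd.DataFrame({
    "age": [48, 42, 29, 41],
    "policy_state": ["OH", "IN", "OH", "IL"],
    "total_claim_amount": [71610, 5070, 34650, 63400],
    "fraud_reported": ["Y", "Y", "N", "Y"],
})

print(df.head())      # first rows, to confirm the file parsed as expected
print(df.shape)       # number of rows and columns
print(df.dtypes)      # which columns are numeric and which are text
print(df.describe())  # summary statistics for the numeric columns

# Class balance of the target column.
class_counts = df["fraud_reported"].value_counts()
print(class_counts)
```

Looking at the class balance early is useful because fraud datasets are often imbalanced, which affects how accuracy should be read later.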
Checking the Null values
Data cleaning
⚫ Cleaning missing values using a categorical imputer
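A rough sketch of the null check and the cleaning step. The slides do not show which imputer class was used, so a plain pandas fill with the most frequent value of each column stands in for the categorical imputer here; the column names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical columns with a few missing entries.
df = pd.DataFrame({
    "collision_type": ["Side", np.nan, "Rear", "Rear"],
    "police_report": ["YES", "NO", np.nan, "NO"],
    "age": [48, 42, 29, 41],
})

# Count missing values per column.
null_counts = df.isnull().sum()
print(null_counts)

# Assumed stand-in for the categorical imputer: replace each missing
# category with the most frequent value (mode) of its column.
cat_cols = ["collision_type", "police_report"]
for col in cat_cols:
    df[col] = df[col].fillna(df[col].mode()[0])

print(df.isnull().sum())
```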
Extracting categorical data
Encoding
⚫ In this step we perform label encoding on the categorical variables in the dataset.
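Label encoding can be sketched with scikit-learn's `LabelEncoder`, which maps each distinct category to an integer in sorted order. The column names and values below are hypothetical, and the original project may have encoded its columns differently.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical categorical columns extracted from the dataset.
cat_df = pd.DataFrame({
    "policy_state": ["OH", "IN", "OH", "IL"],
    "fraud_reported": ["Y", "Y", "N", "Y"],
})

# Fit one encoder per column and keep it, so that the integer codes
# can be mapped back to the original labels later.
encoders = {}
for col in list(cat_df.columns):
    encoder = LabelEncoder()
    cat_df[col] = encoder.fit_transform(cat_df[col])
    encoders[col] = encoder

print(cat_df)
print(list(encoders["fraud_reported"].classes_))  # position in list = code
```

With this encoding "N" becomes 0 and "Y" becomes 1, matching the convention used in the output slide (0 = no fraud, 1 = fraud).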
Categorical Data after Encoding
Combining categorical & numerical data
Separating Feature Columns and Target Column
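Combining the two parts and separating the target could be done roughly as follows. The frames `num_df` and `cat_df` and the target name `fraud_reported` are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical numerical part and already-encoded categorical part.
num_df = pd.DataFrame({
    "age": [48, 42, 29],
    "total_claim_amount": [71610, 5070, 34650],
})
cat_df = pd.DataFrame({
    "policy_state": [2, 1, 2],
    "fraud_reported": [1, 1, 0],
})

# Combine the two parts column-wise into one table.
data = pd.concat([num_df, cat_df], axis=1)

# Separate the feature columns (X) from the target column (y).
X = data.drop(columns=["fraud_reported"])
y = data["fraud_reported"]

print(X.columns.tolist())
print(y.tolist())
```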
Checking Multicollinearity using Heatmap
⚫ Here we plot a heatmap showing the correlation between the variables.
Removing highly correlated columns
⚫ Here we remove the age column and the total claim amount column.
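One way to find such columns is sketched below: compute the absolute correlation matrix, look at each pair once, and drop the later column of any pair above a cut-off. The 0.9 threshold, the toy values, and the column names `months_as_customer` and `vehicle_claim` are assumptions; the slides only state which two columns were removed.

```python
import numpy as np
import pandas as pd

# Hypothetical numeric features; values are made up for illustration.
X = pd.DataFrame({
    "months_as_customer": [10, 50, 120, 200, 300, 410],
    "age": [21, 25, 31, 38, 46, 55],
    "vehicle_claim": [5000, 42000, 23000, 50000, 4000, 36000],
    "total_claim_amount": [7000, 60000, 33000, 70000, 6000, 52000],
})

# Absolute pairwise correlations; this matrix is what a heatmap displays.
corr = X.corr().abs()

# Keep only the upper triangle so that each pair is inspected once.
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))

threshold = 0.9  # assumed cut-off, not taken from the slides
to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]

X_reduced = X.drop(columns=to_drop)
print(to_drop)
print(X_reduced.columns.tolist())
```

The same `corr` matrix can be passed to a plotting function such as seaborn's `heatmap` to reproduce the visual check.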
Normalization
⚫ Here our data is normally
distributed
Standardisation
⚫ Standardisation brings all variables to a common scale.
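As an illustration, scikit-learn's `StandardScaler` rescales each column to mean 0 and standard deviation 1 by computing z = (x - mean) / std. The toy feature values are assumptions, and the original project may have used a different scaler.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Two hypothetical features on very different scales.
X = np.array([
    [10.0, 5000.0],
    [50.0, 42000.0],
    [120.0, 23000.0],
    [200.0, 50000.0],
])

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

print(X_scaled.mean(axis=0))  # approximately 0 for every column
print(X_scaled.std(axis=0))   # approximately 1 for every column
```

In a full pipeline the scaler is usually fitted on the training portion only and then applied to the test portion, so that no information from the test set leaks into training.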
Training & testing of model
⚫ Here we split our dataset into a training set and a test set: 75% of the dataset is used for training and 25% for testing.
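The 75/25 split can be sketched with scikit-learn's `train_test_split`. The toy arrays, the `random_state` value, and the use of `stratify` (which keeps the fraud ratio equal in both parts) are assumptions not stated in the slides.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 100 claims, 2 features, 20 of them fraudulent.
X = np.arange(200).reshape(100, 2)
y = np.array([0] * 80 + [1] * 20)

# test_size=0.25 keeps 75% for training and 25% for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)
```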
Using XGBoost algorithm
⚫ Here we use the XGBoost algorithm and train the model on 75% of the dataset.
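A minimal training sketch on synthetic data, since the real dataset is not reproduced here. It uses `XGBClassifier` from the `xgboost` package when that package is installed and otherwise falls back to scikit-learn's `GradientBoostingClassifier`, which is a different implementation of gradient boosting. The hyperparameter values are assumptions, not the settings used in the project.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

try:
    from xgboost import XGBClassifier
    # Assumed hyperparameters; the slides do not list the ones used.
    model = XGBClassifier(n_estimators=100, max_depth=3, learning_rate=0.1)
except ImportError:
    # Fallback so the sketch still runs without xgboost installed.
    from sklearn.ensemble import GradientBoostingClassifier
    model = GradientBoostingClassifier(
        n_estimators=100, max_depth=3, learning_rate=0.1
    )

# Synthetic stand-in for the preprocessed claims data (about 25% "fraud").
X, y = make_classification(
    n_samples=400, n_features=8, weights=[0.75, 0.25], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Train on the 75% portion, then predict the held-out 25%.
model.fit(X_train, y_train)
predictions = model.predict(X_test)
print(predictions[:10])
```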
Output
⚫ Here we test the model on the remaining 25% of the dataset, where 0 means that no fraud occurred and 1 means that fraud occurred.
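The test predictions can be scored as in the sketch below. The label arrays are made up for illustration and are not the project's results; they are chosen so that the accuracy works out to 6 correct out of 8.

```python
from sklearn.metrics import accuracy_score, confusion_matrix, recall_score

# Hypothetical true labels and model predictions (0 = no fraud, 1 = fraud).
y_true = [0, 0, 1, 1, 0, 1, 0, 0]
y_pred = [0, 0, 1, 0, 0, 1, 1, 0]

accuracy = accuracy_score(y_true, y_pred)   # share of correct predictions
cm = confusion_matrix(y_true, y_pred)       # rows = true class, cols = predicted
fraud_recall = recall_score(y_true, y_pred) # share of fraud cases caught

print(accuracy)      # 0.75
print(cm.tolist())   # [[4, 1], [1, 2]]
print(fraud_recall)
```

Reporting recall for the fraud class alongside accuracy is a common choice, because a model can reach high accuracy on an imbalanced dataset while still missing many fraudulent claims.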
Conclusion
⚫ We compared several suitable algorithms and found that XGBoost had the highest accuracy among them, at 75%, so we used the XGBoost algorithm for further predictions.
Future scope
⚫ In this project, we learned how machine learning can be applied to decide which claims are genuine and which are fraudulent. In the future, this can save time and money in dealing with fraudulent claims.
THANK YOU!