An Automated System for Detecting Property Insurance Fraud Using Machine Learning

Author(s)

Kazi Md. Tawsif Rahman 1 Chowdhury Mahfuzul Hoq 1,*

1. Department of CSE, Chittagong University of Engineering and Technology, Chittagong-4349, Bangladesh

* Corresponding author.

DOI: https://doi.org/10.5815/ijmsc.2024.03.02

Received: 20 Jun. 2024 / Revised: 24 Jul. 2024 / Accepted: 16 Aug. 2024 / Published: 8 Sep. 2024

Index Terms

Insurance Fraud, Machine Learning, Web Application, Ensemble Technique, Stacking, SMOTE

Abstract

Detecting property insurance fraud is critical for reducing financial losses and ensuring fair claim processing. Earlier approaches to insurance fraud detection had several drawbacks, including the absence of a feature selection process, no hyperparameter tuning, lower accuracy, and unaddressed class imbalance. To address these shortcomings, this paper examines machine learning (ML) techniques for accurately detecting property insurance fraud. To determine the best model for predicting fraudulent activity, several ML models were tested, including Gradient Boosting, classical ML classifiers, and Stacking Ensemble methods. To handle class imbalance and improve performance, the selected model incorporates feature selection, hyperparameter tuning, and the Synthetic Minority Over-sampling Technique (SMOTE). The Stacking Ensemble method outperformed the other ML models, achieving an accuracy of 96% and a recall of 94%. The experimental results show that the proposed stacking ensemble-based prediction scheme improves accuracy by 3.4% and recall by 2.7% over previous works. The paper also presents a web application that supports property insurance fraud detection, offering ML-based fraud prediction, question submission, answer checking, and blog post access. According to the findings, more than 54% of users expressed satisfaction with the web application's usefulness for detecting property fraud.
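
The abstract names SMOTE as the remedy for class imbalance but does not show how it works. The following is a minimal sketch of the core idea only, assuming numeric feature vectors and Euclidean distance: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its k nearest minority-class neighbours. The function name `smote`, its parameters, and the toy data are hypothetical illustrations, not the authors' implementation.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Illustrative sketch only: names and defaults are assumptions,
    not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the remaining minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        # Choose one of the k nearest neighbours at random.
        neighbour = rng.choice(others[:k])
        # Interpolate a random fraction of the way from base to neighbour.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


if __name__ == "__main__":
    # Toy "fraud" class with two numeric features (made-up values).
    fraud = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
    for point in smote(fraud, n_new=6, k=3):
        print(point)
```

In a pipeline like the one the abstract describes, oversampling of this kind would normally be applied to the training split only, so that synthetic points do not leak into the data used to report accuracy and recall.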

Cite This Paper

Kazi Md. Tawsif Rahman, Chowdhury Mahfuzul Hoq, "An Automated System for Detecting Property Insurance Fraud Using Machine Learning", International Journal of Mathematical Sciences and Computing (IJMSC), Vol.10, No.3, pp. 8-25, 2024. DOI: 10.5815/ijmsc.2024.03.02
