Automatic Bank Fraud Detection Using Support Vector Machines



Revision as of 14:01, 11 November 2021

Presented by

Kanika Chopra, Yush Rajcoomar

Introduction

Automatic Bank Fraud Detection Using Support Vector Machines is a paper written by Djeffal Abdelhamid, Soltani Khaoula, and Ouassaf Atika in 2014. The paper proposes data mining methods for extracting information relevant to various fraudulent activities, and a hybridization of supervised and unsupervised algorithms to improve fraud detection. Fraud detection is very important for financial companies, and fraud itself has become increasingly common with the growth of e-commerce. Traditional institutional safeguards such as PINs, passwords and identification systems are no longer sufficient, so a more data-driven approach is required. The paper discusses credit card fraud, money laundering and mortgage fraud with regard to data mining, and proposes a support vector machine (SVM) variant for each. The data is first used to distinguish between normal and fraudulent activities using a binary SVM (supervised learning). Where data identifying fraudulent transactions is not available, a single-class SVM model is used to obtain the decision boundary distinguishing the two. Transactions classified as normal are then examined further for unusual trends (unsupervised learning), as shown in Figure 1. The methods were tested on several bank databases to determine how effective they were in detecting fraud.

Figure 1: Method using Supervised and Unsupervised Learning

Previous Work

Support vector machines have been a robust and reliable approach to statistical learning for several years. As financial institutions have gained access to more data, they have been able to use it to identify patterns of behaviour that have a higher probability of being fraudulent. Several methods have been applied to this problem, including Bayesian Networks [1], K Nearest Neighbours [2] and Artificial Neural Networks [3]. Throughout the literature, there are two approaches to fraud detection:
1. Supervised Learning
2. Unsupervised Learning
With supervised learning, models are constructed from both legitimate and fraudulent transactions. Examples of supervised approaches are Bayesian Networks and Support Vector Machines. However, because fraudsters are able to bypass security and prevention methods, unsupervised methods are used to pick out abnormal behaviour and unusual transactions from among the normal transactions. Examples of unsupervised methods are k-nearest neighbours and self-organizing maps [4].


Motivation

Identifying fixed behavioural patterns is neither an efficient nor an effective approach to fraud detection, because fraud techniques evolve rapidly and make current models obsolete.

To address this, single-class SVM [5], an unsupervised algorithm, is used to learn a decision function for novelty detection. It classifies new data as either similar to the training set or anomalous. Binary SVM, a supervised algorithm, is used to find a separating hyperplane between the fraudulent and non-fraudulent classes. The proposed method combines supervised and unsupervised learning to flag fraudulent behaviour: the supervised component learns from previous transactions, and the unsupervised component detects strange behaviour.
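The combination described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn's `SVC` and `OneClassSVM`, uses synthetic data with two made-up features (amount and hour of day), and the helper `flag_transactions`, the kernel settings, the value of `nu` and the rule that a binary-SVM hit takes priority over a novelty flag are all assumptions made for illustration.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC, OneClassSVM

rng = np.random.default_rng(0)

# Synthetic stand-in for transaction data (hypothetical features:
# amount, hour of day). Real bank data would replace these arrays.
normal = rng.normal(loc=[50.0, 12.0], scale=[10.0, 3.0], size=(200, 2))
fraud = rng.normal(loc=[400.0, 3.0], scale=[50.0, 1.0], size=(20, 2))

X_labelled = np.vstack([normal, fraud])
y_labelled = np.array([0] * len(normal) + [1] * len(fraud))

# Supervised component: binary SVM trained on labelled normal/fraud data.
binary_svm = make_pipeline(
    StandardScaler(),
    SVC(kernel="rbf", gamma="scale", class_weight="balanced"),
)
binary_svm.fit(X_labelled, y_labelled)

# Unsupervised component: single-class SVM trained on normal data only.
# nu is an assumed upper bound on the fraction of training outliers.
one_class_svm = make_pipeline(
    StandardScaler(),
    OneClassSVM(kernel="rbf", gamma="scale", nu=0.05),
)
one_class_svm.fit(normal)


def flag_transactions(X):
    """Label each row 'fraud', 'unusual' or 'normal' (illustrative rule)."""
    known_fraud = binary_svm.predict(X) == 1   # matches a learned fraud pattern
    novel = one_class_svm.predict(X) == -1     # falls outside the normal region
    return np.where(known_fraud, "fraud", np.where(novel, "unusual", "normal"))


if __name__ == "__main__":
    new_transactions = np.array([[50.0, 12.0], [400.0, 3.0], [1000.0, 12.0]])
    print(flag_transactions(new_transactions))
```

In this sketch the binary SVM covers fraud patterns already seen in the labelled data, while the single-class SVM flags transactions that look unlike the normal training set even when no matching fraud example exists.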

Model Architecture

Data

Results

Conclusion

Critiques

References