Automatic Bank Fraud Detection Using Support Vector Machines

Presented by

Kanika Chopra, Yush Rajcoomar

Introduction

Automatic Bank Fraud Detection Using Support Vector Machines is a paper written by Djeffal Abdelhamid, Soltani Khaoula, and Ouassaf Atika in 2014. The paper proposes hybridizing supervised and unsupervised algorithms to improve fraud detection. Fraud detection is very important for financial companies, and fraud has become increasingly common with the growth of e-commerce. Earlier institutional safeguards such as PINs, passwords and identification systems are no longer sufficient, so a more data-driven approach is required. The paper discusses credit card fraud, money laundering and mortgage fraud from a data mining perspective and explains how detection of each would be improved by the hybrid method. Where labelled data is available, a binary SVM (supervised learning) is first used to distinguish normal from fraudulent activity. Where data identifying fraudulent transactions is not available, a single-class SVM is used to obtain the decision boundary distinguishing the two. Transactions classified as normal are then investigated further for strange trends (unsupervised learning), as shown in Figure 1. These methods were then tested on various bank databases to determine how effective they were in detecting fraud.

Figure 1: Hybridization of supervised and unsupervised learning for fraud detection [6]

Previous Work

Support vector machines have been a robust and reliable approach to statistical learning for several years. As financial institutions have gained access to more data, they have been able to use it to predict patterns of behaviour that have a higher probability of being fraudulent. Different methods have been applied to this problem, such as Bayesian networks [1], k-nearest neighbours [2] and artificial neural networks [3]. Throughout the literature, there are two approaches to fraud detection:

1. Supervised Learning

2. Unsupervised Learning

With supervised learning, models are constructed from both legitimate and fraudulent transactions. Examples of supervised approaches are Bayesian networks and support vector machines. However, fraudsters are able to bypass security and prevention methods; hence, unsupervised methods are used to identify abnormal behaviour and unusual transactions among the normal transactions. Examples of unsupervised methods are k-nearest neighbours and self-organizing maps [4].

Motivation

Identifying fixed behavioural patterns is not an efficient or effective approach to fraud detection, because fraud techniques change rapidly and make current models obsolete.

For this reason, single-class SVM [5], an unsupervised algorithm, is used to learn a decision function for novelty detection: it classifies new data as either similar to the training set or anomalous. Binary SVM, a supervised algorithm, is used to find a separating hyperplane between the fraudulent and non-fraudulent classes. The proposed method combines the two to flag fraudulent behaviour: the supervised component learns from previous transactions and the unsupervised component detects strange behaviour.
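For reference, the two models described above are usually written as the optimization problems below. These are the standard textbook forms (the single-class one follows [5]); the summary does not state which kernel or parameter values the authors used, so the feature map \Phi, the penalty C and the parameter \nu are left generic here.

```latex
% Binary soft-margin SVM: labels y_i \in \{-1, +1\} (fraudulent vs. normal)
\min_{w,\,b,\,\xi}\ \tfrac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} \xi_i
\quad \text{s.t.} \quad y_i \,\bigl(w \cdot \Phi(x_i) + b\bigr) \ge 1 - \xi_i,
\qquad \xi_i \ge 0
% Decision function
f(x) = \operatorname{sgn}\bigl(w \cdot \Phi(x) + b\bigr)

% Single-class SVM: unlabelled training points x_1, \dots, x_n
\min_{w,\,\xi,\,\rho}\ \tfrac{1}{2}\|w\|^2 + \frac{1}{\nu n} \sum_{i=1}^{n} \xi_i - \rho
\quad \text{s.t.} \quad w \cdot \Phi(x_i) \ge \rho - \xi_i,
\qquad \xi_i \ge 0
% Decision function: +1 for points similar to the training set, -1 for anomalies
f(x) = \operatorname{sgn}\bigl(w \cdot \Phi(x) - \rho\bigr)
```

In the single-class form, \nu \in (0, 1] is an upper bound on the fraction of training points treated as outliers and a lower bound on the fraction of support vectors.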

Model Architecture

As mentioned above, the authors select a hybrid approach, combining supervised and unsupervised learning. The first pass uses a binary SVM to draw a separating hyperplane between normal and fraudulent transactions based on the training data provided. However, the challenge in detecting fraud is adapting to novel approaches to fraud. Fraudsters study the triggers of the security system, and then promptly change their approach to bypass it. To address this problem, the authors suggest using a single-class SVM on the data that the binary SVM predicts to be normal.

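The hybrid scheme in Figure 1 is intended as a solution to the bias introduced by the binary SVM, which learns only from the fraud patterns present in its training data. To realize it, the authors need a system that supports both supervised and unsupervised learning simultaneously.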

Figure 3: Proposed Architecture [6]

The authors plan to use a database of normal and fraudulent transactions that is fed to the two models in parallel. The supervised learning model is trained on the whole database so that it learns the decision boundary between the two classes. Meanwhile, only the normal transactions are passed to the unsupervised model, which learns to detect abnormal behaviour.
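The data routing described above can be sketched in a few lines of plain Python. This is only an illustration: the function name `build_training_sets`, the tuple layout and the label encoding (0 for normal, 1 for fraudulent) are assumptions made here, not details taken from the paper.

```python
from typing import Iterable, List, Sequence, Tuple

Features = Sequence[float]
NORMAL, FRAUD = 0, 1  # hypothetical label encoding


def build_training_sets(
    labelled: Iterable[Tuple[Features, int]],
) -> Tuple[Tuple[List[Features], List[int]], List[Features]]:
    """Split one labelled database into the inputs for the two models.

    Returns ((X, y) for the binary model, X_normal for the single-class model).
    """
    rows = list(labelled)
    # The supervised model sees every transaction together with its label.
    binary_x = [features for features, _ in rows]
    binary_y = [label for _, label in rows]
    # The unsupervised model sees only the normal transactions, without labels.
    normal_x = [features for features, label in rows if label == NORMAL]
    return (binary_x, binary_y), normal_x


database = [((12.0, 1.0), NORMAL), ((950.0, 7.0), FRAUD), ((20.5, 2.0), NORMAL)]
(binary_x, binary_y), normal_x = build_training_sets(database)
print(len(binary_x), "rows for the binary model;", len(normal_x), "for the single-class model")
```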

When a new transaction enters the system, it is passed through the supervised learning model; if it is classified as a normal transaction, it then passes through the unsupervised learning model. If it is classified again as a normal transaction, it is executed. In the event that either of the two models flags a transaction as fraudulent, the authors suggest a layer of manual verification to assess the transaction. If the transaction is deemed fraudulent, it is rejected and investigated by the bank. Otherwise, it is a normal transaction and is executed. In both cases, the transaction is added back to the dataset with the correct label to provide additional training data.
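The decision flow for a new transaction can be sketched as follows. The class name `HybridScreen` and the three callables are hypothetical stand-ins for the trained binary SVM, the trained single-class SVM and the manual verification step; the toy threshold rules exist only to make the example runnable. The sketch also assumes that only manually reviewed transactions are fed back into the training data, since the text does not say what happens to transactions that pass both models.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence, Tuple

Transaction = Sequence[float]
NORMAL, FRAUD = 0, 1  # hypothetical label encoding


@dataclass
class HybridScreen:
    binary_predict: Callable[[Transaction], int]   # stand-in for the binary SVM
    novelty_predict: Callable[[Transaction], int]  # stand-in for the single-class SVM
    manual_review: Callable[[Transaction], int]    # stand-in for human verification
    training_data: List[Tuple[Transaction, int]] = field(default_factory=list)

    def process(self, tx: Transaction) -> str:
        # `or` short-circuits, so the single-class model is consulted only
        # when the binary model has already classified the transaction as normal.
        flagged = (
            self.binary_predict(tx) == FRAUD or self.novelty_predict(tx) == FRAUD
        )
        if not flagged:
            return "executed"
        # Flagged transactions go to manual verification, and the verified
        # label is added back to the dataset (assumption: reviewed cases only).
        label = self.manual_review(tx)
        self.training_data.append((tuple(tx), label))
        return "rejected" if label == FRAUD else "executed"


# Toy rules on (amount, transactions_today); not the paper's models.
screen = HybridScreen(
    binary_predict=lambda tx: FRAUD if tx[0] > 1000 else NORMAL,
    novelty_predict=lambda tx: FRAUD if tx[1] > 5 else NORMAL,
    manual_review=lambda tx: FRAUD if tx[0] > 5000 else NORMAL,
)
for tx in [(50.0, 1.0), (9000.0, 1.0), (50.0, 9.0)]:
    print(tx, screen.process(tx))
```

Chaining the models this way means the single-class SVM only ever scores transactions the binary SVM has accepted, which is the narrowing effect the hybrid design relies on.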

Data

This paper focuses on three different types of fraud, each with its own set of data for identifying anomalies. Credit card fraud is detected using historical transaction data, from which features are extracted. Some features of interest are frequency of use, the remaining unpaid balance of each cycle, the frequency of the uncovered, the maximum number of late days, etc. [7]. Money laundering, by comparison, occurs more rarely. Features of interest for this type of fraud include the amount of the transaction, billing, the source and date of transfer, etc. [7]. Lastly, for mortgage fraud, the data includes both personal and professional information about the customers, used to verify their reliability, as well as the mortgage that has been presented.

The results are based on three databases of normal transactions: General Ledger, Payables Data and Revenue Data, corresponding to credit card fraud, money laundering and mortgage fraud. These datasets do not indicate whether transactions or users are fraudulent. In addition, the German and Australian credit card databases were used to obtain data for the binary SVM; these datasets include fraudulent transactions.

Results

The data for the models were split 70%/30% between training and validation. When detecting bank fraud, the single-class SVM model obtained a higher precision than Bayesian networks coupled with neural networks (70%) [1] and hidden Markov models (80%) [7]. These results were obtained by training the model on the General Ledger, Payables Data and Revenue Data datasets. The single-class SVM model detected abnormal behaviour in the datasets and appeared to be more accurate than the other models in classifying it as fraudulent. Within the hybrid approach, the single-class model is primarily used to recognize unusual trends in the data, which can then be investigated further. The results using the single-class model from the paper are summarized below:

Metric	General Ledger Data	Payables Data	Revenue Data
Precision	94%	100%	85%

Table 1: Results obtained by single class SVM model [6]
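The 70%/30% training-validation split mentioned above could be reproduced along the following lines. The summary does not say whether the authors shuffled or stratified the data, so the seeded shuffle and the helper name `train_validation_split` are assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def train_validation_split(
    rows: Sequence[T], train_fraction: float = 0.7, seed: int = 0
) -> Tuple[List[T], List[T]]:
    """Shuffle a copy of `rows` and cut it into training and validation parts."""
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


train, validation = train_validation_split(list(range(100)))
print(len(train), len(validation))
```

With imbalanced fraud data, a stratified split that preserves the fraud rate in both parts would often be a safer choice than the plain shuffle shown here.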

Hence, the single-class SVM on its own is able to extract strange behaviours from a dataset fairly accurately and outperforms the other techniques mentioned above. The proposed method combines the single-class and binary SVMs: the former extracts these trends when no label is present so that abnormal data can be investigated further, and the latter classifies labelled data as fraudulent or non-fraudulent. The results for the combined method are based on training the model on the Australian and German databases. In this case, the data indicates fraudulent and non-fraudulent cases, so the hybridization method can be applied. The hybrid method shows a slight improvement over the binary method on the Australian data, matches it on the German data, and is superior to the single-class SVM on both datasets. It is important to note that this evaluation is based on the customers' credit scoring history rather than on customer behaviour. The results from these observations in the paper are summarized in Table 2.

Dataset	Binary SVM	Single class SVM	Hybrid SVM
Australian	83.56%	54.67%	83.85%
German	72.4%	67.2%	72.4%

Table 2: Results obtained by the proposed method [6]

Conclusion

From the results above, it is observed that the hybrid method is preferred over the single-class SVM; however, it shows only a slight improvement over the binary SVM on the Australian data and has the same precision on the German data. In comparison to other methods, such as Bayesian networks with neural networks and hidden Markov models, the single-class SVM has a higher precision; hence, it is hoped that the hybridization method is preferable to these methods as well.

However, data that identifies fraudulent users is not easily available, and thus it is more difficult to measure precisely how accurate this method would be. The results could be further improved by applying the method to larger datasets and by fine-tuning the parameters of the SVM methods. Despite these difficulties, the hybridization method has promising results and is worth investigating further to determine its efficacy.

Critiques

This method results in a higher precision than previous works, and using a hybridization method narrows down the data on which the unsupervised model predicts. The hybridization technique can also be adapted to various types of fraud, such as credit card fraud and mortgage fraud, in which abnormal behaviour presents itself differently. However, it is important to note that little data with fraud indicators is available. In addition, the paper gives no clear indication of the distribution of fraudulent vs. non-fraudulent data. It is likely that these datasets are imbalanced, with the majority of the data being non-fraudulent. Hence, the reported precision may be inflated, since a model that mostly predicts the more common non-fraudulent class will be right most of the time. Furthermore, the hybridization method was evaluated on customers' credit scoring history, whereas customer behaviour would give a more accurate measure of its efficacy. The paper mentions the importance of proper data mining to obtain relevant data for more accurate predictions; hence, access to more real databases would be beneficial in determining how successful the proposed method is.
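The concern about imbalanced data can be made concrete with a small, invented example. The 990/10 class split below is hypothetical and is not taken from the paper, and the paper does not state exactly how its "precision" figures were computed; the sketch only shows how a model that never predicts fraud can still score well on some metrics.

```python
from typing import Optional, Sequence, Tuple

NORMAL, FRAUD = 0, 1  # hypothetical label encoding


def confusion_counts(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int
) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn), treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def precision(y_true, y_pred, positive) -> Optional[float]:
    tp, fp, _, _ = confusion_counts(y_true, y_pred, positive)
    return tp / (tp + fp) if (tp + fp) else None  # undefined with no positive predictions


def recall(y_true, y_pred, positive) -> Optional[float]:
    tp, _, fn, _ = confusion_counts(y_true, y_pred, positive)
    return tp / (tp + fn) if (tp + fn) else None


# Invented data: 990 normal and 10 fraudulent transactions,
# scored by a degenerate model that labels everything as normal.
y_true = [NORMAL] * 990 + [FRAUD] * 10
y_pred = [NORMAL] * 1000

print("accuracy:", accuracy(y_true, y_pred))
print("precision (normal class):", precision(y_true, y_pred, NORMAL))
print("precision (fraud class):", precision(y_true, y_pred, FRAUD))
print("recall (fraud class):", recall(y_true, y_pred, FRAUD))
```

Reporting recall and precision for the fraud class, together with the class distribution, would make results such as those in Tables 1 and 2 easier to interpret.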

References

[1] Maes, Sam & Tuyls, Karl & Vanschoenwinkel, Bram & Manderick, Bernard. (2002). Credit Card Fraud Detection Using Bayesian and Neural Networks.

[2] Nami, Sanaz, and Mehdi Shajari. “Cost-Sensitive Payment Card Fraud Detection Based on Dynamic Random Forest and k -Nearest Neighbors.” Expert Systems with Applications, vol. 110, 2018, pp. 381–92. Crossref, doi:10.1016/j.eswa.2018.06.011.

[3] Patidar, R. D. and Lokesh Sharma. “Credit Card Fraud Detection Using Neural Network.” (2011).

[4] Olszewski, Dominik. “Fraud Detection Using Self-Organizing Map Visualizing the User Profiles.” Knowledge-Based Systems, vol. 70, 2014, pp. 324–34. Crossref, doi:10.1016/j.knosys.2014.07.008.

[5] Schölkopf, Bernhard & Williamson, Robert & Smola, Alex & Shawe-Taylor, John & Platt, John. (1999). Support Vector Method for Novelty Detection. NIPS. 12. 582-588.

[6] Abdelhamid, Djeffal, Soltani Khaoula, and Ouassaf Atika. "Automatic bank fraud detection using support vector machines." The International Conference on Computing Technology and Information Management (ICCTIM). Society of Digital Information and Wireless Communication, 2014.

[7] Srivastava, Abhinav, Amlan Kundu, Shamik Sural, and Arun K. Majumdar. “Credit Card Fraud Detection Using Hidden Markov Model.” IEEE Transactions on Dependable and Secure Computing, vol. 5, no. 1, 2008, pp. 37–48.
