THE ROLE OF RANDOM FOREST IN CREDIT CARD FRAUD ANALYSIS

Sudeep Dogga
Data Science and Artificial Intelligence
BOURNEMOUTH UNIVERSITY
Bournemouth, England
[email protected]

Abstract

This paper focuses on the analysis of credit card fraud. The tremendous increase in credit card transactions in recent years has led to a significant increase in fraud. The main drawback of credit card usage is that it does not require the cardholder to authorize the transaction, so it is hard to determine whether a transaction is genuine. Many machine learning algorithms can be used to analyze credit card fraud; this paper focuses mainly on the Random Forest algorithm because of advantages such as its ability to handle high-dimensional data and its accuracy. It can solve both classification and regression problems.

Keywords: Random forest, Decision Tree, Credit card fraud analysis.

I. INTRODUCTION

Random Forest is one of the many machine learning algorithms used for credit card fraud analysis. Many other methods based on AI and data mining have been used for years. Credit card fraud can happen in different ways: a lost card misused by an unknown person, card details observed by a bystander in a public place, fake calls that convince individuals to disclose their confidential card details, and, with evolving technology, hacking of bank accounts. Credit card fraud is a commonly practiced fraud that affects the financial sector, with billions in losses globally. One in every thousand credit card transactions is declared fraudulent. The two main challenges in credit card fraud analysis are handling the huge amount of imbalanced data that is continuously generated by everyday transactions, and the limited availability of data because of banks' privacy policies for their customers.
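To make the imbalance challenge concrete, the following minimal sketch uses the one-in-a-thousand fraud rate quoted above. It shows that a trivial model labelling every transaction as genuine would still score 99.9% accuracy while detecting no fraud at all. The function name `accuracy_and_recall` and the toy labels are illustrative assumptions, and recall is used here only as an example metric; the paper does not state which metrics it reports.

```python
def accuracy_and_recall(y_true, y_pred):
    """Return (accuracy, recall) for boolean labels, where True means fraud."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_positives = sum(t and p for t, p in zip(y_true, y_pred))
    actual_positives = sum(y_true)
    # Recall is the share of actual fraud cases that were flagged.
    recall = true_positives / actual_positives if actual_positives else 0.0
    return correct / len(y_true), recall


# Hypothetical toy data: 1 fraudulent transaction among 1,000.
y_true = [True] + [False] * 999
# A trivial "model" that predicts every transaction is genuine.
y_pred = [False] * 1000

accuracy, recall = accuracy_and_recall(y_true, y_pred)
print(f"accuracy={accuracy:.3f}, recall={recall:.3f}")
```

On data this skewed, accuracy alone may say little about how well fraud is detected, which is one reason imbalanced data needs special handling.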
Whenever an online transaction takes place, there is a chance that a hacker is looking to steal the card details and take advantage of that data. To understand the attributes that indicate a fraudulent transaction, we need to collect and analyze the data. Credit card fraud is hard to detect because fraudsters come up with new methods each time. According to [ CITATION Lyl20 \l en-US ], in the USA the Federal Trade Commission (FTC) identified more than 3.2 million cases of fraud in 2019; identity theft was the most common type of fraud, occurring in 20.33% of cases. According to [ CITATION DCl20 \l en-US ], UK statistics state that, of the total annual fraud losses on UK-issued debit and credit cards, 76 percent is card-not-present fraud and 15 percent is card fraud. Machine learning is considered a productive approach for identifying the few fraudulent transactions among millions of genuine ones by analyzing data from previous transactions. This historical dataset has to be divided into two parts: one part to train the models and another part to test the trained models.
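The train/test division described above can be sketched as follows. This is a minimal illustration, not the paper's method: it assumes a stratified split, so that the rare fraud cases appear in both parts. The 80/20 ratio, the function name `stratified_split`, and the toy records with an `is_fraud` field are all hypothetical choices for the example.

```python
import random
from collections import defaultdict


def stratified_split(records, label_of, test_fraction=0.2, seed=0):
    """Split records into (train, test), keeping each class's share similar in both."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for record in records:
        by_label[label_of(record)].append(record)

    train, test = [], []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        # Reserve at least one record per class for testing.
        n_test = max(1, round(len(group) * test_fraction))
        test.extend(group[:n_test])
        train.extend(group[n_test:])

    # Shuffle again so the classes are not grouped together.
    rng.shuffle(train)
    rng.shuffle(test)
    return train, test


# Hypothetical toy data: 10,000 transactions, 10 of them fraudulent (1 in 1,000).
transactions = [{"id": i, "is_fraud": i < 10} for i in range(10_000)]
train, test = stratified_split(transactions, lambda t: t["is_fraud"])

print(len(train), len(test))
print(sum(t["is_fraud"] for t in train), sum(t["is_fraud"] for t in test))
```

A purely random split of data this skewed could, by chance, leave the test part with no fraud cases at all; splitting each class separately avoids that. In practice a library routine would normally be used instead of a hand-written helper.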
