Anonymized credit card transactions labeled as fraudulent or genuine
Credit card companies should be able to identify fraudulent transactions so that customers are not charged for items they did not purchase. With the cost of fraud rising and cardholder trust declining, financial institutions need to take steps to protect their business and their cardholders. Credit card companies need anonymized credit card transactions labeled as fraudulent or genuine to strengthen security and avoid losses.
The provided dataset was highly unbalanced: the positive class (frauds) accounts for only 0.172% of all transactions. Due to non-disclosure and confidentiality issues, the company cannot provide the original features or additional background information. Hence a PCA (Principal Component Analysis) transformation was chosen to represent the features.
The iVentura Machine Learning Platform was used to build the solution. iVentura provides a complete ecosystem for data scientists to build models without worrying about the underlying infrastructure and security.
The feature ‘Amount’ is the transaction amount. This feature can be used for example-dependent cost-sensitive learning. The feature ‘Class’ is the response variable and takes the value 1 in case of fraud and 0 otherwise.
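As a rough illustration of how ‘Amount’ could drive example-dependent cost-sensitive learning, the sketch below scores a set of predictions by monetary cost instead of by error count. The cost scheme is an assumption and is not stated in the source: a missed fraud costs the transaction amount, and every flagged transaction costs a fixed review fee. The function name total_cost and the admin_cost value are hypothetical.

```python
def total_cost(y_true, y_pred, amounts, admin_cost=5.0):
    """Monetary cost of a set of predictions under an assumed cost scheme.

    y_true, y_pred: 1 for fraud, 0 for genuine (same coding as 'Class').
    amounts: the 'Amount' value of each transaction.
    admin_cost: hypothetical fixed fee for reviewing a flagged transaction.
    """
    cost = 0.0
    for truth, pred, amount in zip(y_true, y_pred, amounts):
        if pred == 1:
            # Every flagged transaction is reviewed, fraud or not.
            cost += admin_cost
        elif truth == 1:
            # A fraud that was not flagged loses the full amount.
            cost += amount
        # Genuine transactions that were not flagged cost nothing.
    return cost


# Made-up toy data: four transactions, two of them fraudulent.
labels = [0, 1, 1, 0]
predictions = [1, 1, 0, 0]
amounts = [10.0, 200.0, 50.0, 30.0]
print(total_cost(labels, predictions, amounts))
```

Under a scheme like this, missing a large fraudulent transaction is penalised more heavily than missing a small one, which a plain error count cannot express.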
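Given the class imbalance ratio, we recommend measuring performance using the Area Under the Precision-Recall Curve (AUPRC). Accuracy computed from the confusion matrix is not meaningful for unbalanced classification, since a model that labels every transaction as genuine would already be about 99.8% accurate.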
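To make the point about accuracy concrete, here is a minimal sketch in plain Python. It uses made-up data (1,000 transactions with 2 frauds, close to but not exactly the ratio quoted above) and computes AUPRC as step-wise average precision, which is one common way to summarise the precision-recall curve. The function names are illustrative and are not part of the original solution.

```python
from itertools import groupby


def accuracy(y_true, y_pred):
    """Fraction of transactions whose predicted label matches the true one."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def average_precision(y_true, scores):
    """Step-wise area under the precision-recall curve.

    Transactions are ranked by score; each distinct score is one threshold,
    so tied scores are handled together.
    """
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("at least one positive (fraud) label is required")
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    tp = fp = 0
    prev_recall = 0.0
    ap = 0.0
    for _, group in groupby(ranked, key=lambda pair: pair[0]):
        for _, label in group:
            if label == 1:
                tp += 1
            else:
                fp += 1
        precision = tp / (tp + fp)
        recall = tp / total_pos
        # Add the area of the step between the previous and current recall.
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


# Made-up data: 2 frauds among 1,000 transactions.
y_true = [1, 1] + [0] * 998

# A "model" that calls everything genuine and gives every row the same score.
always_genuine_labels = [0] * 1000
always_genuine_scores = [0.0] * 1000

print("accuracy:", accuracy(y_true, always_genuine_labels))
print("AUPRC:   ", average_precision(y_true, always_genuine_scores))
```

The useless classifier scores highly on accuracy, while its AUPRC collapses to the fraud rate, which is why AUPRC is the more informative measure here.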
1) The dataset poses an unbalanced classification problem.
2) SMOTE is therefore used to handle the imbalanced labelled data.
3) Exploratory data analysis is carried out on the input data.
4) The ‘Class’ label contains two classes: fraudulent and not fraudulent.
5) Logistic regression with balanced class weights is used to predict fraudulent transactions.
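The two rebalancing ideas in the steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the code used on the platform: smote_like_oversample is a simplified stand-in for SMOTE (it interpolates between a minority sample and one of its nearest minority neighbours), and balanced_class_weights uses the common n_samples / (n_classes * class_count) heuristic, which is assumed here to be what "balanced" means. Both function names and all data are hypothetical.

```python
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority points by interpolation.

    Each new point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base_idx = rng.randrange(len(minority))
        base = minority[base_idx]
        # Rank the other minority samples by squared distance to the base.
        others = [p for i, p in enumerate(minority) if i != base_idx]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency."""
    classes = sorted(set(labels))
    n_samples = len(labels)
    return {c: n_samples / (len(classes) * labels.count(c)) for c in classes}


# Made-up minority (fraud) samples with two features each.
frauds = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_like_oversample(frauds, n_new=10, k=2, seed=42)
print(len(new_points), "synthetic fraud samples")

# Made-up labels: 2 frauds among 1,000 transactions.
print(balanced_class_weights([0] * 998 + [1] * 2))
```

Oversampling changes the training data, while class weights change how much each error counts in the loss. In practice, oversampling would normally be applied to the training split only, so that the evaluation data keeps the original class ratio.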