Figure 5: Decision Tree Classification Performance
The ROC Curve
The ROC curve is a commonly used metric for evaluating the performance of classification models, particularly when dealing with imbalanced datasets49. As seen in Figure 6, the ROC curve plots the true positive rate (TPR) against the false positive rate (FPR) at various threshold values, and the area under it (AUROC) summarizes how well the model separates the two classes. The decision tree model was the best-performing classifier (AUC = 0.95), followed by the logistic regression model (AUC = 0.93). The random forest and gradient descent classifiers tied for the lowest AUC score of 0.92. These findings suggest that, while all four models are effective at predicting laundered transactions, the decision tree model is the most accurate. The logistic regression model is a close second, making it a good choice for businesses that require a more interpretable model. Finally, the random forest and gradient descent models are still effective predictors of fraud, but they are not as accurate as the other two models.
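The sketch below illustrates how ROC curves and AUROC scores of this kind can be produced with scikit-learn. It is not the study's actual pipeline: the synthetic imbalanced dataset from make_classification, the model hyperparameters, and the use of SGDClassifier as a stand-in for the gradient descent classifier are all assumptions made for illustration.

```python
# Hedged sketch: ROC curves and AUROC scores for four classifiers.
# The data and model settings are placeholders, not the study's configuration.
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic imbalanced dataset standing in for the transaction features/labels.
X, y = make_classification(
    n_samples=20000, n_features=20, weights=[0.95, 0.05], random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

models = {
    "Decision tree": DecisionTreeClassifier(max_depth=8, random_state=42),
    "Logistic regression": LogisticRegression(max_iter=1000),
    "Random forest": RandomForestClassifier(random_state=42),
    "Gradient descent (SGD)": SGDClassifier(loss="log_loss", random_state=42),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    # Score the positive class so the threshold can be swept across its range.
    if hasattr(model, "predict_proba"):
        scores = model.predict_proba(X_test)[:, 1]
    else:
        scores = model.decision_function(X_test)
    fpr, tpr, _ = roc_curve(y_test, scores)      # FPR and TPR at each threshold
    auc = roc_auc_score(y_test, scores)          # area under the ROC curve
    plt.plot(fpr, tpr, label=f"{name} (AUC = {auc:.2f})")

plt.plot([0, 1], [0, 1], linestyle="--", label="Chance")
plt.xlabel("False positive rate")
plt.ylabel("True positive rate")
plt.legend()
plt.show()
```

Plotting all four curves on one set of axes, as in Figure 6, makes the comparison direct: the curve that bows closest to the top-left corner (and has the largest area beneath it) belongs to the classifier that best separates laundered from legitimate transactions.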