Figure 5: Decision Tree Classification Performance
The ROC Curve
The ROC curve is a commonly used metric for evaluating the performance
of classification models, particularly when dealing with imbalanced datasets [49]. As seen in Figure 6, the ROC curve plots the
true positive rate (TPR) against the false positive rate (FPR) at
various threshold values, and the area under the curve (AUROC) summarizes
the model's ability to separate the two classes. The decision tree model was the best-performing
classifier (AUROC = .95), followed by the logistic regression model (.93). The
random forest and gradient descent classifiers had the lowest AUROC score
of .92. These findings suggest that, while all four models are
effective at predicting laundered transactions, the decision tree model
is the most accurate. Furthermore, the logistic regression model is a
close second, making it a good choice for businesses that require a more
interpretable model. Finally, the random forest and gradient descent
models are still effective predictors of fraud, but they are not as
accurate as the other two models.
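To illustrate how such a comparison is produced, the sketch below computes and plots ROC curves and AUROC scores for four classifiers with scikit-learn. It is a minimal sketch, not the study's actual pipeline: the synthetic dataset, the specific model settings, and the use of an SGD classifier as the "gradient descent" model are assumptions made for the example.

# Sketch: ROC curves and AUROC scores for several classifiers on an
# imbalanced binary problem. The dataset and model settings are
# illustrative placeholders, not the paper's actual data or configuration.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
import matplotlib.pyplot as plt

# Synthetic imbalanced data standing in for the transaction dataset
# (roughly 5% positive, i.e. "laundered", cases).
X, y = make_classification(n_samples=20_000, n_features=20,
                           weights=[0.95, 0.05], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, test_size=0.3, random_state=0)

models = {
    "Decision tree": DecisionTreeClassifier(max_depth=8, random_state=0),
    "Logistic regression": LogisticRegression(max_iter=1000),
    "Random forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "Gradient descent (SGD)": SGDClassifier(loss="log_loss", random_state=0),
}

plt.figure()
for name, model in models.items():
    model.fit(X_train, y_train)
    # Use predicted probabilities where available, otherwise decision scores.
    if hasattr(model, "predict_proba"):
        scores = model.predict_proba(X_test)[:, 1]
    else:
        scores = model.decision_function(X_test)
    fpr, tpr, _ = roc_curve(y_test, scores)          # points of the ROC curve
    auc = roc_auc_score(y_test, scores)              # area under that curve
    plt.plot(fpr, tpr, label=f"{name} (AUROC = {auc:.2f})")

plt.plot([0, 1], [0, 1], linestyle="--", label="Chance")  # random-guess baseline
plt.xlabel("False positive rate")
plt.ylabel("True positive rate")
plt.legend()
plt.show()

Each curve traces the TPR/FPR trade-off as the decision threshold varies, and the AUROC printed in the legend is the single-number summary used to rank the models in the comparison above.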