A Comparative Study of Resampling, Cost-Sensitive, and Ensemble Techniques for Handling Class Imbalance in Indonesian Financial Data
- 1 Department of Computer Science, Bina Nusantara Graduate Program, Master of Computer Science, Bina Nusantara University, Jakarta, Indonesia
Abstract
Handling class imbalance is a critical challenge in machine learning applications, particularly in financial domains where minority instances often represent significant anomalies such as fraud or audit risks. This study compares several oversampling and undersampling methods, cost-sensitive adjustments, and ensemble models (Random Forest, AdaBoost, Gradient Boosting, and XGBoost) on Indonesian financial data. The evaluation, based on 10-fold stratified cross-validation and on metrics such as F1-score, ROC-AUC, and the confusion matrix, shows that a hybrid approach combining Borderline SMOTE with XGBoost performs best. This configuration achieved near-perfect performance, with F1-scores of 0.99 for both classes, indicating excellent discrimination and minimal error rates. The findings underscore the importance of combining methods in imbalanced-data scenarios and offer practical guidance for model selection in real-world financial risk modeling.
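As a minimal illustration of the evaluation protocol named in the abstract, the following Python sketch builds stratified folds and computes per-class F1 on hypothetical toy labels. It is not the authors' code: the function names `stratified_folds` and `per_class_f1`, the 950/50 class split, and the always-majority baseline are assumptions made for illustration only, and no resampling or classifier is involved.

```python
from collections import defaultdict
import random


def stratified_folds(labels, k=10, seed=0):
    """Split indices into k folds that each keep roughly the overall class ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    for y in sorted(by_class):
        indices = by_class[y]
        rng.shuffle(indices)
        # Deal each class's indices round-robin so every fold gets its share.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def per_class_f1(y_true, y_pred, positive):
    """F1-score treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy labels: 950 majority (0) and 50 minority (1) instances.
labels = [0] * 950 + [1] * 50
folds = stratified_folds(labels, k=10)

# A trivial baseline that always predicts the majority class.
always_majority = [0] * len(labels)
accuracy = sum(t == p for t, p in zip(labels, always_majority)) / len(labels)
f1_majority = per_class_f1(labels, always_majority, positive=0)
f1_minority = per_class_f1(labels, always_majority, positive=1)
```

On this toy data the baseline reaches 95% accuracy yet scores an F1 of 0 on the minority class, which is why per-class F1 is more informative than accuracy when classes are imbalanced.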
DOI: https://doi.org/10.3844/jcssp.2026.605.617
Copyright: © 2026 Gunawan Kurnia and Ditdit Nugeraha Utama. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Keywords
- Imbalanced Data
- Predictive Modeling
- Resampling
- Cost-Sensitive Learning
- Ensemble Learning