Machine Learning Approach for Credit Score Predictions
DOI:
https://doi.org/10.51519/journalisi.v5i2.487
Keywords:
Credit Score, Machine Learning, Class Imbalance, SMOTE, Ensemble, XGBoost
Abstract
Banking and financial institutions face a significant rise in requests for credit products, and this paper addresses the problem of managing that demand. The aim is to propose an adaptive, dynamic, heterogeneous ensemble credit model that integrates the XGBoost and Support Vector Machine models to improve the accuracy and reliability of credit scoring models used for risk assessment. The method employs machine learning techniques to recognise patterns and trends in past data and so anticipate future occurrences. To validate its efficacy, the proposed approach is compared with existing credit score models using five popular evaluation metrics: Accuracy, ROC AUC, Precision, Recall and F1 score. The paper highlights the challenges that credit scoring models face, such as class imbalance, verification latency and concept drift. The results show that the proposed approach outperforms the existing models on the evaluation metrics while balancing predictive accuracy against computational cost. The conclusion emphasises the significance of the proposed approach for the banking and financial sector in developing robust and reliable credit scoring models to evaluate the creditworthiness of clients.
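As a reading aid, the five evaluation metrics named in the abstract can be sketched in plain Python. This is not the authors' code: the labels, scores, the 0.5 decision threshold and every function name below are hypothetical, chosen only to show why accuracy alone can mislead on class-imbalanced credit data.

```python
from typing import Dict, Sequence


def threshold_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, Precision, Recall and F1 for binary labels (1 = default)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    # Undefined ratios (zero denominator) are reported as 0.0 here.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a random positive outscores a
    random negative, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 8 good clients (0) and 2 defaulters (1).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.15, 0.3, 0.05, 0.6, 0.25, 0.35, 0.7, 0.4]

# Assumed decision threshold of 0.5 on the predicted default score.
y_pred = [1 if s >= 0.5 else 0 for s in scores]
model_metrics = threshold_metrics(y_true, y_pred)
model_auc = roc_auc(y_true, scores)

# A trivial baseline that labels every client as good.
baseline_metrics = threshold_metrics(y_true, [0] * len(y_true))

print(model_metrics, model_auc)
print(baseline_metrics)
```

On this toy set the all-good baseline matches the scored model on accuracy while catching no defaulters, which is the kind of gap that Recall, Precision, F1 and ROC AUC are meant to expose.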