Publication

Towards Explainable Machine Learning for Bank Churn Prediction Using Data Balancing and Ensemble-Based Methods

dc.contributor.authorKOUMETIO TEKOUABOU, Stéphane Cédric
dc.contributor.authorGherghina, Ştefan Cristian
dc.contributor.authorTOULNI, Hamza
dc.contributor.authorMata, Pedro
dc.contributor.authorMoleiro Martins, José
dc.date.accessioned2022-07-12T10:07:39Z
dc.date.available2022-07-12T10:07:39Z
dc.date.issued2022-07-06
dc.descriptionArticle published in an international scientific journalpt_PT
dc.description.abstractThe diversity of data collected on both social networks and digital interfaces has increased sharply, raising the problem of heterogeneous variables that are rarely favourable to classification algorithms. Despite significant improvements in the efficiency of machine learning (ML) and predictive analysis for classification in customer relationship management (CRM) systems, their performance remains severely limited by heterogeneous data processing, class imbalance, and feature scales. This impact turned out to be greater for simple ML methods, which in addition often suffer from over-fitting. This paper proposes a succinct and detailed ML model-building process that includes cross-validation of the combination of SMOTE to balance data and ensemble methods for modelling. In the conducted experiments, the random forest (RF) model yielded the best performance, 0.86 in terms of both accuracy and F1-score, using balanced data. This confirms the literature summary on this topic, which shows that RF was among the most effective algorithms for customer predictive classification problems. The constructed and optimized models were interpreted using Shapley values and feature importance analysis, which show that the “age” feature was the most significant while “HasCrCard” was the least. This process has proven effective in bridging previously reported research gaps, and the resulting model should be used to support bank customer loyalty decision-making.pt_PT
dc.description.versioninfo:eu-repo/semantics/publishedVersionpt_PT
dc.identifier.citationTékouabou, S. C. K., Gherghina, Ștefan C., Toulni, H., Mata, P., & Martins, J. M. (2022). Towards Explainable Machine Learning for Bank Churn Prediction Using Data Balancing and Ensemble-Based Methods. Mathematics, 10(14), 2379. https://doi.org/10.3390/math10142379pt_PT
dc.identifier.doihttps://doi.org/10.3390/math10142379pt_PT
dc.identifier.urihttp://hdl.handle.net/10400.21/14825
dc.language.isoengpt_PT
dc.peerreviewedyespt_PT
dc.publisherMDPIpt_PT
dc.relation.ispartofseries;14
dc.relation.publisherversionhttps://www.mdpi.com/2227-7390/10/14/2379/htmpt_PT
dc.subjectSMOTEpt_PT
dc.subjectHeterogeneous datapt_PT
dc.subjectImbalance datapt_PT
dc.subjectMachine learningpt_PT
dc.subjectShapley valuespt_PT
dc.subjectEnsemble methodspt_PT
dc.subjectBank churn modellingpt_PT
dc.subjectFeature importancept_PT
dc.titleTowards Explainable Machine Learning for Bank Churn Prediction Using Data Balancing and Ensemble-Based Methodspt_PT
dc.typejournal article
dspace.entity.typePublication
oaire.citation.endPage16pt_PT
oaire.citation.startPage1pt_PT
oaire.citation.titleMathematicspt_PT
oaire.citation.volume10pt_PT
person.familyNameKOUMETIO TEKOUABOU
person.familyNameGherghina
person.familyNameTOULNI
person.familyNameMata
person.familyNameMoleiro Martins
person.givenNameStéphane Cédric
person.givenNameŞtefan Cristian
person.givenNameHamza
person.givenNamePedro
person.givenNameJosé
person.identifier987275
person.identifier1485846
person.identifier.ciencia-id5F1F-3C6C-A7BD
person.identifier.orcid0000-0003-3627-5746
person.identifier.orcid0000-0003-2911-6480
person.identifier.orcid0000-0002-6598-6267
person.identifier.orcid0000-0001-8465-9539
person.identifier.orcid0000-0001-6853-2917
person.identifier.ridJ-3339-2012
person.identifier.scopus-author-id57215085805
person.identifier.scopus-author-id56046530600
person.identifier.scopus-author-id55842206000
person.identifier.scopus-author-id36008956400
rcaap.rightsopenAccesspt_PT
rcaap.typearticlept_PT
relation.isAuthorOfPublication280624e0-01d6-439e-bc53-c776ce388ab2
relation.isAuthorOfPublicationc56545f9-e7b7-44ac-a275-4a05010903c1
relation.isAuthorOfPublicationb951e79d-9c2a-40da-8775-5a36c105fa3f
relation.isAuthorOfPublicationd297cc6d-ae10-4764-ac8b-5913bda0a3c4
relation.isAuthorOfPublication2ee2e92e-ca22-467a-8f6d-824b362aebd5
relation.isAuthorOfPublication.latestForDiscoveryd297cc6d-ae10-4764-ac8b-5913bda0a3c4

Files

Original bundle
Name:
mathematics-10-02379-v2.pdf
Size:
1.03 MB
Format:
Adobe Portable Document Format
License bundle
Name:
license.txt
Size:
1.71 KB
Format:
Item-specific license agreed upon to submission

Collections