A Comparative Analysis of Imbalanced Learning Techniques for Optimizing Credit Card Fraud Detection

Silva, Rita Ávila da

Please use this identifier to cite or link to this record: http://hdl.handle.net/10362/184631

Title:	A Comparative Analysis of Imbalanced Learning Techniques for Optimizing Credit Card Fraud Detection
Author:	Silva, Rita Ávila da
Advisor:	Henriques, Roberto André Pereira
Keywords:	Classification; Imbalanced Learning; Credit Card Fraud Detection; Sampling Techniques; Evaluation Metrics; SDG 8 - Decent work and economic growth; SDG 12 - Responsible production and consumption; SDG 16 - Peace, justice and strong institutions
Defense Date:	23-Jun-2025
Abstract:	Credit card fraud is a growing concern for financial institutions and consumers, leading to significant financial losses and increased security risks. One of the main challenges in fraud detection is extreme class imbalance: fraudulent transactions make up only a tiny fraction of all transactions. This imbalance makes it difficult for machine learning models to identify fraud correctly, as they tend to be biased toward the majority class. This paper explores and compares the implementation of various imbalanced learning techniques, including SMOTE, ROS, Borderline-SMOTE, ADASYN, K-means SMOTE, SMOTE-ENN, SMOTETomek, CT-GAN, and CT-GAN Synthesizer. The goal is to assist in the selection of high-performance imbalanced learning techniques for fraud detection, ensuring their applicability and robustness across imbalanced fraud datasets. Empirical results from extensive experiments with five datasets show that traditional oversampling methods such as ROS and SMOTE variants consistently improved model performance when combined with strong classifiers such as Random Forest and XGBoost. These methods not only increased recall, ensuring a higher detection rate of fraudulent transactions, but also maintained a favorable balance with precision, reducing the risk of flagging legitimate transactions as fraudulent. In contrast, more advanced techniques, including GANs and K-means SMOTE, did not demonstrate the expected improvements. Instead, these methods occasionally introduced variability that did not translate into overall performance gains compared with the traditional oversampling strategies.
Description:	Dissertation presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Business Intelligence
URI:	http://hdl.handle.net/10362/184631
Degree:	Master's in Information Management, specialization in Business Intelligence
Appears in collections:	NIMS - Dissertações de Mestrado em Gestão da Informação (Information Management)
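
The abstract's central point, that class imbalance biases a model toward the majority class and that random oversampling (ROS) rebalances the training data, can be illustrated with a minimal sketch. This is not the dissertation's code: the function names `random_oversample` and `precision_recall`, the toy data, and the 2% fraud rate are all assumptions made for illustration, and the sketch covers only the binary case using the Python standard library.

```python
import random
from collections import Counter


def random_oversample(X, y, minority_label=1, seed=0):
    """Duplicate randomly chosen minority rows until both classes have equal size.

    Hypothetical helper for illustration; assumes a binary problem.
    """
    rng = random.Random(seed)
    minority = [i for i, label in enumerate(y) if label == minority_label]
    majority = [i for i, label in enumerate(y) if label != minority_label]
    if not minority or len(minority) >= len(majority):
        return list(X), list(y)  # nothing to balance
    # Sample minority indices with replacement to close the gap.
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    idx = list(range(len(y))) + extra
    return [X[i] for i in idx], [y[i] for i in idx]


def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for the positive (fraud) class; 0.0 when undefined."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy data (assumed): 100 transactions, the last 2 labelled as fraud (1).
X = [[float(i)] for i in range(100)]
y = [1 if i >= 98 else 0 for i in range(100)]

# A model that always predicts "legitimate" is 98% accurate but finds no fraud.
always_legit = [0] * len(y)
accuracy = sum(t == p for t, p in zip(y, always_legit)) / len(y)
precision, recall = precision_recall(y, always_legit)

# After ROS, both classes are the same size.
X_res, y_res = random_oversample(X, y)
counts = Counter(y_res)
```

In practice, resampling of this kind is normally applied to the training split only, so that evaluation metrics such as precision and recall are measured on data with the original class distribution.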

Files in this record:

File	Description	Size	Format
TGI4486.pdf		2.02 MB	Adobe PDF
