Henriques, Roberto André Pereira
Correia, Maria Ana Mendes
2024-11-06
2024-11-06
2024-10-28
http://hdl.handle.net/10362/174689
Dissertation presented as the partial requirement for obtaining a Master's degree in Statistics and Information Management, specialization in Information Analysis and Management
This dissertation evaluates the importance of using data mining techniques to prevent and detect financial fraud. The most common types of financial fraud are money laundering, credit card fraud, financial statement fraud, insurance fraud, and securities and commodities fraud. A business must prevent and detect fraudulent behaviour in real time to avoid monetary losses, fines from the regulator, and exposure to financial and operational risk. Banks and insurers in particular need data mining techniques to detect and prevent such behaviour. The main objective of this study is to build a predictive model, using a data mining approach and machine learning, that predicts money laundering in the banking sector from transaction data. The supervised learning algorithms applied to predict money laundering transactions are Logistic Regression, Neural Networks, Decision Trees, Random Forests, Light Gradient Boost and Ensemble. The dataset used in this study was highly imbalanced, so it was necessary to apply an oversampling technique that combines K-means clustering with SMOTE. The empirical results show that Light Gradient Boost is the best-performing model, with strong discriminatory power (AUC = 99.9% and Gini = 0.998), high precision (98.4%) and high recall (96.4%). It also achieved the highest F1-score (97.4%), showing that the model correctly identifies a high number of fraudulent transactions while minimizing false positives and false negatives. This study shows that, by monitoring and analyzing transaction data, fraudulent transactions can be predicted with a high level of success.
It also presents strong evidence that data mining techniques can be used continuously to detect fraud behaviour, especially financial fraud in the banking sector.
eng
Fraud Behaviour
Financial Fraud
Money Laundering
Supervised Learning
Imbalanced data
Oversampling
Data Mining
SDG 8 - Decent work and economic growth
SDG 16 - Peace, justice and strong institutions
SDG 17 - Partnerships for the goals
Predicting Fraud Behaviour: A Data Mining Approach for Anti-Money Laundering
master thesis
203775929
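
The metrics reported in the abstract can be cross-checked against each other. The sketch below is not taken from the dissertation: it assumes the usual relation Gini = 2 × AUC − 1 and the standard definition of the F1-score as the harmonic mean of precision and recall. The function names are hypothetical, and the inputs are the rounded values quoted in the abstract.

```python
def gini_from_auc(auc: float) -> float:
    """Gini coefficient under the assumed relation Gini = 2 * AUC - 1."""
    return 2.0 * auc - 1.0


def f1_from_precision_recall(precision: float, recall: float) -> float:
    """F1-score as the harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


# Rounded values quoted in the abstract for the Light Gradient Boost model.
auc = 0.999
precision = 0.984
recall = 0.964

gini = gini_from_auc(auc)
f1 = f1_from_precision_recall(precision, recall)

print(f"Gini = {gini:.3f}")
print(f"F1 = {f1:.1%}")
```

Under these assumptions, the recomputed values agree with the reported Gini of 0.998 and F1-score of 97.4% to the quoted precision.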