| Name | Description | Size | Format |
|---|---|---|---|
| | | 2.46 MB | Adobe PDF |
Authors
Advisor(s)
Abstract(s)
In this study, we investigate whether a compact set of behavioural covariates, embedded in a continuous-time multi-state Markov model (MSM), can reliably forecast credit-card fraud risk, and whether
combining MSM-derived transition probabilities with a simple machine-learning classifier enhances
early fraud detection. We apply stepwise, Akaike Information Criterion (AIC)-guided selection on
MSMs fitted to the original, undersampled, Synthetic Minority Over-sampling Technique (SMOTE)-augmented,
and Generative Adversarial Network (GAN)-augmented datasets to isolate core predictors:
time since last transaction, daily transaction count and amount, transaction amount, and age. Across
all datasets, these lean MSMs achieved near-optimal AIC and log-likelihood values with consistently
stable risk ratios, yet their standalone predictive accuracy remained modest (~0.50) and prone to high
false-positive rates. By feeding MSM transition probabilities into a basic Random Forest (RF), without
hyperparameter tuning, we increase accuracy to 0.81 on the original data (versus 0.72 with SMOTE and
0.57 with GAN), underscoring the importance of hybridisation in highly imbalanced settings. The
MSM-RF hybrid approach also yielded strong class-specific performance: for the fraud-network class, precision
reached 0.95 and recall 0.92; for normal behaviour, both precision and recall hovered around 0.80.
Crucially, the early detection recall jumped from 0.42 with pure MSM to 0.66 under SMOTE (and 0.16
with GAN), illustrating the trade-off between sensitivity and precision in highly imbalanced settings.
We further show that synthetic augmentation preserves core temporal signals but can introduce
parameter instability. Overall, our findings demonstrate that parsimony and temporal interpretability
matter more than model complexity in sequential fraud-risk modelling, and they provide a practical
blueprint for real-time deployment.
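
As a rough illustration of what "MSM-derived transition probabilities" can look like, the sketch below evaluates the closed-form transition matrix of a two-state continuous-time Markov chain (0 = normal, 1 = fraud). It is a simplified assumption, not the dissertation's model: the fitted MSMs include covariates and may have more states, and the function name and intensity values here are hypothetical. The resulting probabilities are the kind of quantity that could be passed as features to a downstream classifier such as a Random Forest.

```python
import math


def two_state_transition_matrix(rate_to_fraud, rate_to_normal, t):
    """Return P(t) for a two-state continuous-time Markov chain.

    States: 0 = normal, 1 = fraud. The generator is
    Q = [[-a, a], [b, -b]] with a = rate_to_fraud, b = rate_to_normal,
    for which P(t) = exp(Q t) has the closed form used below.
    """
    total = rate_to_fraud + rate_to_normal
    decay = math.exp(-total * t)
    p01 = rate_to_fraud / total * (1.0 - decay)   # normal -> fraud within t
    p10 = rate_to_normal / total * (1.0 - decay)  # fraud -> normal within t
    return [[1.0 - p01, p01], [p10, 1.0 - p10]]


# Hypothetical intensities (per day); NOT values from the dissertation.
P = two_state_transition_matrix(rate_to_fraud=0.02, rate_to_normal=0.5, t=1.0)

# Candidate features for a downstream classifier (illustrative only).
features = [P[0][1], P[1][1]]
```

With more than two states there is generally no such simple closed form, and the matrix exponential of the generator would be computed numerically instead.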
Description
Dissertation presented as a partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science
Keywords
Multi-State Markov Models; Fraud Detection; Class imbalance; Behavioural modelling
