Henriques, Roberto André Pereira
Mubashar, Sabeen
2025-04-11
2025-04-11
2025-04-08
http://hdl.handle.net/10362/182170
Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science
Peer-to-peer (P2P) lending platforms enable direct lending between borrowers and lenders, bypassing the traditional intermediaries of the financial system. This exposes them to additional business risk and to challenges in assessing loan risk, which can result in financial loss. This study introduces a machine learning framework designed to enhance the accuracy and interpretability of default predictions in P2P lending. Based on the publicly available Bondora dataset (2009–2023), this research identifies outliers and key factors influencing loan defaults and explores strategies for handling data imbalance. Traditional, ensemble, and neural network models are rigorously compared. AutoGluon and XGBoost emerged as the top-performing models. The findings highlight that the proposed approach is reliable for loan default prediction at the pre-approval stage, offering moderate accuracy while protecting investments and promoting platform stability. This paper therefore presents a comprehensive framework for tackling loan default risk in peer-to-peer lending by integrating predictive modelling and Explainable AI.
eng
Peer-to-Peer lending
Loan Default Prediction
Machine Learning
Explainable AI (XAI)
Imbalanced Data
SDG 8 - Decent work and economic growth
SDG 9 - Industry, innovation and infrastructure
Peer-to-Peer Loan Default Detection using Machine Learning
master thesis
203940377