| Name | Description | Size | Format |
|---|---|---|---|
| | | 2.9 MB | Adobe PDF |
Advisor(s)
Abstract(s)
Anomalies are everywhere, and the business context is no exception. From intrusion detection in computer networks to fraud detection and credit risk analysis, abnormalities are an unavoidable component of practically every known system. Insurance companies have registered significant growth over the last few years with the support of machine learning techniques and technological advancements. Several studies have discussed the best unsupervised anomaly detection algorithm for each business problem and domain, most often by enhancing existing algorithms or proposing novel models. Fewer studies, however, have addressed the identification of abnormal behaviour in client profiles and vehicle characteristics that may influence the two main measures in non-life insurance: frequency and severity. This project aims to respond to this need by experimenting with different clustering techniques, such as DBSCAN and OPTICS, and distinct unsupervised anomaly detection models, such as Isolation Forest, Extended Isolation Forest and Local Outlier Factor, on a real-world dataset provided by an insurance company that operates in Portugal. The resulting impact on pricing and underwriting rules allows the attribution of a tariff that is equitable for both the insurance entity and its customers. The Isolation Forest algorithm applied to the whole dataset outperforms the remaining models, achieving an AUC score of approximately 0.86. Besides supporting the decision-making process for identifying unwanted clients in the insurance context, this project also contributes to broadening the knowledge of existing state-of-the-art anomaly detection algorithms and their performance.
Description
Internship report presented as a partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, with a specialization in Business Analytics
Keywords
Anomaly Detection; Unsupervised Learning; Clustering; Non-life Insurance; Isolation Forest; Area Under the Curve
