| Name | Description | Size | Format |
|---|---|---|---|
| | | 1.25 MB | Adobe PDF |
Authors
Advisor(s)
Abstract(s)
Cardiovascular disease (CVD) is the leading cause of death globally, significantly affecting
the mortality and morbidity of individuals across different demographics. The aim of this study is to
leverage attention-based Natural Language Processing (NLP) models to predict severe forms of
CVD from unstructured clinical notes, using discharge summaries of patients in the MIMIC-IV
dataset. Through a comparative analysis of various models that included LSTM, BERT,
clinicalBERT and Clinical LongFormer, as well as modified versions of BERT and clinicalBERT,
this research finds that attention-based models outperform traditional deep learning models
in handling long and complex unstructured clinical notes, and therefore make better
predictions. The best-performing model identified in this study is BERT (sliding window), as
this model was the most accurate (Accuracy: 0.73), well balanced in its predictions (F1-Micro: 0.80),
and excelled at correctly predicting specific CVDs (AUC: 0.83). Although there are some
limitations, this study demonstrates the predictive power of advanced attention-based
models in healthcare, which would enable better disease predictions and timely interventions
to reduce mortality and morbidity due to CVD.
Description
Dissertation presented as a partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science
Keywords
Electronic Health Records (EMRs); Clinical Notes; Natural Language Processing; Transformer-based Methods; Cardiovascular Diseases; SDG 3 - Good health and well-being
