| Name | Description | Size | Format |
|---|---|---|---|
| | | 1.23 MB | Adobe PDF |
Authors
Advisor(s)
Abstract
The increasing use of artificial intelligence in healthcare offers significant opportunities to
improve patient outcomes and optimize clinical workflows. Among its many applications,
machine learning continues to demonstrate its potential as an auxiliary diagnostic tool.
In neurodegenerative disease research, it is used for tasks such as speech and
handwriting analysis, gait impairment studies using sensor data, and the discovery of new potential
pharmacotherapies. Parkinson’s disease and atypical parkinsonism are two classes of
neurodegenerative pathologies that present significant challenges in differential diagnosis
and require distinct management strategies. Distinguishing them early allows for a better
prognosis by targeting the specific characteristics of each disease. Despite the advancements
in machine learning research applied to Parkinson’s disease, few studies specifically
address the binary classification between the two classes, which is the focus of this thesis. This
research can serve as a starting point for the development of a complementary tool to assist
practitioners with early differential diagnosis, using data from standardised clinical
assessments. A comprehensive experimental design was implemented, using six machine
learning classifiers. One of the tested implementations incorporates strategies to address class
imbalance and small sample size. The objectives were to analyse and identify the best-performing model, and to determine which features impact classification the most. In terms of
macro F1-score, the results indicate that extreme gradient boosting and
random forest achieved the highest scores. Statistical testing revealed no significant
difference in performance between these two models. When balanced accuracy was
considered, logistic regression, support vector classifier and extreme gradient boosting
achieved the highest scores and also presented no statistically significant differences in
performance. The performance metrics were lower than expected: the best
models achieved scores ranging from 0.55 to 0.62 for both macro F1-score and balanced
accuracy. This research lays a foundation for the task of classifying the two diseases.
Additionally, model interpretability was explored using Shapley additive explanations. The
feature “mds_updrs_pIII” was found to be the most influential in distinguishing between the
two conditions, followed by “Age”. These insights can inform future scientific research.
Despite limitations in data quality and sample size, this work provides a stable methodology
for future studies with richer datasets, more features, and alternative modelling approaches.
Ultimately, this research emphasises the value of interdisciplinary collaboration and how both
positive and negative results contribute to scientific progress.
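
The abstract reports macro F1-score and balanced accuracy, two metrics suited to imbalanced binary problems. As a minimal illustration of how they are defined (not the thesis code), the sketch below computes both from scratch. The class labels "PD" and "AP" and the toy predictions are hypothetical and are not taken from the thesis data.

```python
def per_class_counts(y_true, y_pred, label):
    """Return (true positives, false positives, false negatives) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    return tp, fp, fn


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp, fp, fn = per_class_counts(y_true, y_pred, label)
        denom = 2 * tp + fp + fn
        # A class that is never predicted and never correct gets an F1 of 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of the per-class recalls."""
    labels = sorted(set(y_true))
    recalls = []
    for label in labels:
        tp, _, fn = per_class_counts(y_true, y_pred, label)
        recalls.append(tp / (tp + fn))
    return sum(recalls) / len(recalls)


# Hypothetical toy data: 8 majority-class cases ("PD"), 2 minority-class cases ("AP"),
# and a trivial classifier that always predicts the majority class.
y_true = ["PD"] * 8 + ["AP"] * 2
y_pred = ["PD"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy          = {accuracy:.3f}")
print(f"balanced accuracy = {balanced_accuracy(y_true, y_pred):.3f}")
print(f"macro F1-score    = {macro_f1(y_true, y_pred):.3f}")
```

In this toy case the always-majority classifier reaches 0.8 plain accuracy but only 0.5 balanced accuracy, which is why scores should be read against a chance level of roughly 0.5 and not against 1.0 alone.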
Description
Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science
Keywords
Machine Learning; Artificial Intelligence; Parkinson’s Disease; Atypical Parkinsonian Disorders; Atypical Parkinsonism; Healthcare; Rehabilitation; SDG 3 - Good health and well-being
