Please use this identifier to cite or link to this item:
http://hdl.handle.net/10362/159733
Title: | The Role of Synthetic Data in Improving Supervised Learning Methods: The Case of Land Use/Land Cover Classification |
Author: | Fonseca, João Pedro Martins Ribeiro da |
Advisor: | Bação, Fernando José Ferreira Lucas |
Keywords: | LULC classification; Active Learning; Imbalanced Learning; Synthetic Data; Oversampling; SDG 13 - Climate action; SDG 15 - Life on land |
Defense Date: | 12-Oct-2023 |
Abstract: | In remote sensing, Land Use/Land Cover (LULC) maps are important assets for various applications, promoting environmental sustainability and good resource management. However, their production remains a challenging task. Various factors make it difficult to generate accurate, timely updated LULC maps, whether via automatic or photo-interpreted LULC mapping. Data preprocessing, a crucial step for any Machine Learning (ML) task, is particularly important in the remote sensing domain due to the overwhelming amount of raw, unlabeled data continuously gathered from multiple remote sensing missions. However, a significant part of the state of the art focuses on scenarios with full access to labeled training data with relatively balanced class distributions. This thesis focuses on the challenges found in automatic LULC classification tasks, specifically in data preprocessing. We focus on the development of novel Active Learning (AL) and imbalanced learning techniques to improve ML performance in situations with limited training data and/or rare classes. We also show that many of the contributions presented are successful not only in remote sensing problems, but also in various other multidisciplinary classification problems. The work presented in this thesis used open access datasets to test the contributions made in imbalanced learning and AL. All the data pulling, preprocessing and experiments are made available at https://github.com/joaopfonseca/publications. The algorithmic implementations are made available in the Python package ml-research at https://github.com/joaopfonseca/ml-research. |
Description: | A thesis submitted in partial fulfillment of the requirements for the degree of Doctor in Information Management |
URI: | http://hdl.handle.net/10362/159733 |
Degree: | Doctorate in Information Management (Doutoramento em Gestão da Informação) |
Appears in Collections: | NIMS - Teses de Doutoramento (Doctoral Theses) |
All items in the repository are protected by copyright law, with all rights reserved.