| Name: | Description: | Size: | Format: |
|---|---|---|---|
| | | 4.59 MB | Adobe PDF |
Authors
Advisor(s)
Abstract(s)
Offline Reinforcement Learning (RL) learns policies solely from fixed pre-collected datasets,
making it applicable to use-cases where data collection is expensive or risky. Consequently,
the performance of these offline learners is highly dependent on the dataset used. Still, the
questions of how this data is collected and which dataset characteristics are needed have not been
thoroughly investigated. Simultaneously, evolutionary methods have reemerged as a
promising alternative to classic RL, leading to the field of evolutionary RL (EvoRL), which combines
the two learning paradigms to exploit their complementary attributes. This study aims to join
these research directions and examine the effects of Genetic Programming (GP) on dataset
characteristics in RL and its potential to enhance the performance of offline RL algorithms. A
comparative approach was employed: Deep Q-Networks (DQN) and GP were compared for data
collection across multiple environments and collection modes. The exploration and
exploitation capabilities of these methods were quantified, and the effects of Semantic Genetic
Operators (GOs) and bloat control on these metrics were assessed. Lastly, a comparative
analysis was conducted to determine whether data collected through GP led to superior
performance in multiple offline learners. The findings indicate that GP demonstrates strong
and stable performance in generating high-quality experiences with competitive exploration.
GP exhibited lower uncertainty in experience generation than DQN and produced
high-quality trajectory datasets across all environments. More offline algorithms showed
statistically significant performance gains when trained on GP-collected data than on DQN-collected trajectories. Furthermore, their performance was less dependent on the
environment, as GP consistently generated high-quality datasets. This study showcases
the effective combination of GP's properties with offline learners, suggesting a promising
avenue for future research in optimizing data collection for RL.
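The data-collection setup the abstract describes can be pictured as a simple loop: a fixed behavior policy (whether a trained DQN or an evolved GP program) interacts with an environment, and the resulting transitions are stored as a static dataset that an offline learner later consumes without any further environment interaction. The following is a minimal sketch of that idea; the `CorridorEnv` toy environment, the random behavior policy, and all names are hypothetical illustrations, not the thesis's actual code.

```python
import random

class CorridorEnv:
    """Toy 1-D corridor: move left/right; reward only on reaching the end."""
    def __init__(self, length=5):
        self.length = length
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):  # action: 0 = left, 1 = right
        self.pos = max(0, min(self.length, self.pos + (1 if action == 1 else -1)))
        done = self.pos == self.length
        reward = 1.0 if done else 0.0
        return self.pos, reward, done

def collect_dataset(env, policy, episodes=10, max_steps=50):
    """Roll out a fixed behavior policy and store (s, a, r, s', done) tuples.

    Any behavior policy can fill this buffer -- the comparison in the study
    amounts to swapping `policy` (DQN action selection vs. an evolved GP
    expression) and handing the resulting dataset to an offline learner.
    """
    dataset = []
    for _ in range(episodes):
        state = env.reset()
        for _ in range(max_steps):
            action = policy(state)
            next_state, reward, done = env.step(action)
            dataset.append((state, action, reward, next_state, done))
            state = next_state
            if done:
                break
    return dataset

random.seed(0)
data = collect_dataset(CorridorEnv(), policy=lambda s: random.choice([0, 1]))
```

The offline learner never calls `env.step` itself; it sees only `data`, which is why the behavior policy's exploration and trajectory quality dominate final performance.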
Description
Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science
The dataset for this thesis can be accessed directly on GitHub through the following link: https://github.com/dropthedave/offlineRL_thesis. A record of the dataset is also available on the NOVA Research Portal: https://novaresearch.unl.pt/en/datasets/software-using-genetic-programming-to-improve-data-collection-for
Keywords
Offline Reinforcement Learning; Genetic Programming; Evolutionary Reinforcement Learning; Evolutionary Algorithms; Data Efficiency; SDG 9 - Industry, Innovation and Infrastructure
