| Name | Description | Size | Format |
|---|---|---|---|
| | | 7.87 MB | Adobe PDF |
Authors
Advisor(s)
Abstract(s)
This project addresses a database optimization problem: improving the performance of a data
replication process between two database systems (OLTP and OLAP). A DBMS exposes hundreds of
configuration knobs that engineers typically tune manually. The configuration of these parameters
influences the performance of both the data replication process and the system as a whole. The goal
of this project is to minimize latency, defined as the time it takes for data to be replicated from
the source database to the target database. Latency must be kept as low as possible to avoid long
delays in the replication process, which eventually lead to outdated analytics for customers. To
approach this problem, a simulation environment that captures the state of the replication process
between the two databases was designed to collect data. Next, the incoming workload for this case
study had to be represented numerically. Lastly, two machine learning approaches were implemented
to automate the configuration of the parameters. The first is a reinforcement learning agent
formulated as a Markov decision process; the second combines a predictive model with Bayesian
optimization search. Initial experimental results show improvements in the performance measure
compared to the traditional approach.
Description
Internship Report presented as a partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics
Keywords
Databases; Machine Learning; Reinforcement Learning; Python; Auto-tuning
