Evolving Synthetic Data Generating Programs for Text Recognition Tasks - Evolving a Diverse Population of Synthetic Data Generators using Genetic Programming with Novelty Search for Robust Text Recognition Models

Cooper, Daniel

http://hdl.handle.net/10362/114827

Use this identifier to cite this record.

Name:	Description:	Size:	Format:
TAA0083.pdf		1.88 MB	Adobe PDF

Authors

Cooper, Daniel

Advisor(s)

Vanneschi, Leonardo

Abstract(s)

The quality of models produced by supervised machine learning depends on both the learning algorithm and the training data available to it. This work focuses on optimizing the training data directly: it compares methods for generating synthetic training data while holding the learning algorithm constant. The author proposes a new algorithm that uses genetic programming to evolve a diverse population of data-generating programs, from which training data for the given task is sampled. The approach is applied to building a robust text recognition model that can be integrated into a broader document processing software solution supporting multiple domains.
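The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea of a novelty-driven evolutionary loop. It is not the author's method. As a simplification it evolves fixed-length parameter dictionaries rather than program trees, so it is not genetic programming proper. All names here (`RANGES`, `random_genome`, `behavior`, `novelty`, `mutate`, `evolve`, `sample_training_specs`) and the three rendering parameters are hypothetical and chosen for illustration.

```python
import random

# Hypothetical rendering parameters for a synthetic text image (assumed, not from the thesis).
RANGES = {"font_size": (8.0, 48.0), "rotation": (-15.0, 15.0), "noise": (0.0, 1.0)}


def random_genome(rng):
    """Draw one generator description uniformly from the allowed ranges."""
    return {key: rng.uniform(lo, hi) for key, (lo, hi) in RANGES.items()}


def behavior(genome):
    """Describe a generator by its parameters, each normalised to [0, 1]."""
    return [(genome[key] - lo) / (hi - lo) for key, (lo, hi) in RANGES.items()]


def distance(a, b):
    """Euclidean distance between two behavior vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def novelty(candidate, others, k=3):
    """Mean distance from the candidate to its k nearest neighbours."""
    dists = sorted(
        distance(behavior(candidate), behavior(other))
        for other in others
        if other is not candidate
    )
    nearest = dists[:k]
    return sum(nearest) / len(nearest) if nearest else 0.0


def mutate(genome, rng, scale=0.1):
    """Add Gaussian noise to every parameter and clamp it to its range."""
    child = {}
    for key, (lo, hi) in RANGES.items():
        value = genome[key] + rng.gauss(0.0, scale * (hi - lo))
        child[key] = min(hi, max(lo, value))
    return child


def evolve(pop_size=20, generations=10, seed=0):
    """Keep the most novel generators each generation; selection ignores task fitness."""
    rng = random.Random(seed)
    population = [random_genome(rng) for _ in range(pop_size)]
    archive = []  # most novel individual of each past generation
    for _ in range(generations):
        children = [mutate(rng.choice(population), rng) for _ in range(pop_size)]
        pool = population + children
        reference = pool + archive
        ranked = sorted(pool, key=lambda g: novelty(g, reference), reverse=True)
        population = ranked[:pop_size]
        archive.append(ranked[0])
    return population


def sample_training_specs(population, n, seed=0):
    """Sample generator descriptions that would each render one training example."""
    rng = random.Random(seed)
    return [rng.choice(population) for _ in range(n)]
```

In this sketch, `sample_training_specs(evolve(), 100)` returns 100 parameter dictionaries. A real pipeline would pass each one to a text renderer to produce a labelled image, a step left out here.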

Description

Dissertation presented as a partial requirement for obtaining the Master’s degree in Data Science and Advanced Analytics

Keywords

Deep Learning; Novelty Search; Genetic Programming; Synthetic Data Algorithms; Computer Vision; Document Processing; Document Understanding

URI

http://hdl.handle.net/10362/114827

Collections

NIMS - Master's Dissertations in Data Science and Advanced Analytics (Dissertações de Mestrado em Ciência de Dados e Métodos Analíticos Avançados)

CC License

CC BY (Creative Commons Attribution)
