Context-driven Semantic Parsing to expand cross-domain Text-to-SQL

Nascimento, Inês Daniela Cardoso

http://hdl.handle.net/10362/178601

Use this identifier to cite this record.

Name:	Description:	Size:	Format:
TCDMAA3631.pdf		4.17 MB	Adobe PDF


Authors

Nascimento, Inês Daniela Cardoso

Advisor(s)

Castelli, Mauro

Peres, Fernando Augusto Junqueira

Abstract(s)

This study applies sequence-to-sequence models to conversational text-to-SQL and compares different methodologies. It proposes pre-training the T5-base model on WikiSQL data and then fine-tuning it on SParC data, with taxonomy (also known as schema linking) and tree dependency parsing integrated. Through an ablation study, this model was compared with a pre-trained T5-base model fine-tuned on SParC data, keeping the whole procedure the same except for the differences in the model development itself. The impact of taxonomy and dependency parsing was assessed through the model results. The methodologies were tested on four samples defined in advance, drawn from different database domains, so that the entire benchmark was used for training and testing. The metrics were execution with values and exact set match without values, which evaluate, respectively, whether a query can access the database and return a value, and whether it builds the expected query structure. Computational runtime and the machines used were also described in order to evaluate their impact on the final result. The computational power challenges encountered suggest that further work is needed to develop this alternative approach.
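
To make the two metrics named in the abstract concrete, the following is a minimal Python sketch. It is neither the official SParC evaluation script nor code from the dissertation: the `singer` table, its rows, and the helper names `execution_match`, `clause_sets`, and `exact_set_match` are all assumptions made for illustration, and the clause splitting is a deliberate simplification that ignores nested queries and joins.

```python
import re
import sqlite3

# Hypothetical toy schema and rows, invented only for this illustration.
SETUP = """
CREATE TABLE singer (id INTEGER, name TEXT, age INTEGER);
INSERT INTO singer VALUES (1, 'Ana', 30), (2, 'Rui', 45), (3, 'Eva', 45);
"""

CLAUSES = ["select", "from", "where", "group by", "order by", "limit"]


def execution_match(pred: str, gold: str, setup: str = SETUP) -> bool:
    """Execution with values: run both queries and compare the returned rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(setup)
        try:
            pred_rows = conn.execute(pred).fetchall()
        except sqlite3.Error:
            return False  # a predicted query that cannot run counts as a miss
        gold_rows = conn.execute(gold).fetchall()
        # Order-insensitive comparison; a simplification for this sketch.
        return sorted(map(repr, pred_rows)) == sorted(map(repr, gold_rows))
    finally:
        conn.close()


def clause_sets(sql: str) -> dict:
    """Mask literal values, then split the query into per-clause token sets."""
    s = sql.strip().rstrip(";").lower()
    s = re.sub(r"'[^']*'|\"[^\"]*\"", "value", s)  # quoted literals
    s = re.sub(r"\b\d+(?:\.\d+)?\b", "value", s)   # numeric literals
    s = " ".join(s.split())                        # normalize whitespace
    parts = re.split(r"\b(" + "|".join(CLAUSES) + r")\b", s)
    # parts has the shape [prefix, keyword, body, keyword, body, ...]
    sets = {}
    for keyword, body in zip(parts[1::2], parts[2::2]):
        tokens = re.findall(r"[\w.*]+|[<>=!]+", body)
        sets[keyword] = frozenset(t for t in tokens if t != "and")
    return sets


def exact_set_match(pred: str, gold: str) -> bool:
    """Exact set match without values: same clause contents, values ignored."""
    return clause_sets(pred) == clause_sets(gold)
```

Under these assumptions, a prediction such as `SELECT name FROM singer WHERE age = 30` matches the gold query `SELECT name FROM singer WHERE age = 45` on structure but not on execution, while `SELECT name FROM singer WHERE age > 40` returns the same rows as the gold query yet has a different structure. This is one way to see why the two metrics are reported side by side.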

Description

Dissertation presented as a partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science

Keywords

Natural Language Processing; Conversational text-to-SQL; Schema linking; Dependency parsing

URI

http://hdl.handle.net/10362/178601

Collections

NIMS - Master's Dissertations in Data Science and Advanced Analytics (Dissertações de Mestrado em Ciência de Dados e Métodos Analíticos Avançados)

CC License

CC BY
