| Name | Description | Size | Format |
|---|---|---|---|
| | | 1.35 MB | Adobe PDF |
Authors
Advisor(s)
Abstract(s)
This dissertation presents the development and evaluation of a chatbot designed to support access to information related to PT2030 incentives and IAPMEI procedures. The proposed system is based on a Retrieval-Augmented Generation (RAG) architecture that combines a Large Language Model (LLM) with a hybrid retrieval strategy integrating dense semantic search and sparse lexical matching. The knowledge base was constructed by web scraping official institutional sources, followed by data cleaning, preprocessing, chunking, and vectorization. To optimize system performance, multiple configurations were systematically tested, including variations in embedding models, chunk size and overlap, dense-sparse retrieval weighting, number of retrieved context chunks, and the inclusion of reranking mechanisms. Performance evaluation was conducted using the RAGAS framework, assessing Answer Relevancy, Faithfulness, Context Precision, and Context Recall, and was complemented by a questionnaire-based stakeholder evaluation to capture perceptions of usability, response quality, and overall satisfaction. The results indicate that hybrid retrieval strategies with balanced or slightly sparse-weighted configurations achieved the most robust performance, while appropriately tuned chunking strategies contributed to stronger answer relevancy and faithfulness across the tested scenarios. Reranking mechanisms, by contrast, introduced latency without providing consistent performance improvements. Stakeholder feedback further suggested that the chatbot was perceived as intuitive, useful, and valuable as an institutional support tool. The findings demonstrate the practical viability of RAG-based conversational systems for domain-specific public sector information retrieval, while highlighting key considerations for optimizing retrieval architectures in institutional chatbot deployments.
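The dense-sparse retrieval weighting mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the two score sets were fused, so the min-max normalization, the linear weighting, and every name below (`min_max_normalize`, `hybrid_rank`, `dense_weight`, the chunk identifiers) are assumptions made for illustration only, not the dissertation's implementation.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; if all scores are equal, map them to 0.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {chunk_id: 0.0 for chunk_id in scores}
    return {chunk_id: (s - lo) / (hi - lo) for chunk_id, s in scores.items()}


def hybrid_rank(
    dense: Dict[str, float],
    sparse: Dict[str, float],
    dense_weight: float = 0.5,
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Fuse dense and sparse scores linearly (assumed scheme) and return the top_k chunks."""
    if not 0.0 <= dense_weight <= 1.0:
        raise ValueError("dense_weight must lie in [0, 1]")
    dense_n = min_max_normalize(dense)
    sparse_n = min_max_normalize(sparse)
    fused = {
        # A chunk missing from one retriever contributes 0.0 for that retriever.
        chunk_id: dense_weight * dense_n.get(chunk_id, 0.0)
        + (1.0 - dense_weight) * sparse_n.get(chunk_id, 0.0)
        for chunk_id in set(dense_n) | set(sparse_n)
    }
    # Sort by descending fused score, breaking ties by chunk id for determinism.
    ranked = sorted(fused.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


if __name__ == "__main__":
    # Hypothetical scores: cosine similarities (dense) and lexical scores (sparse).
    dense_scores = {"chunk_a": 0.9, "chunk_b": 0.5, "chunk_c": 0.1}
    sparse_scores = {"chunk_b": 12.0, "chunk_c": 3.0, "chunk_d": 6.0}
    # dense_weight=0.4 mimics a "slightly sparse-weighted" configuration.
    for chunk_id, score in hybrid_rank(dense_scores, sparse_scores, dense_weight=0.4):
        print(chunk_id, round(score, 3))
```

Under this assumed scheme, a `dense_weight` of 0.5 corresponds to a balanced configuration, and values below 0.5 shift the ranking toward lexical matches. The `top_k` parameter plays the role of the number of retrieved context chunks passed to the LLM.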
Description
Dissertation presented as a partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, with a specialization in Business Analytics
Keywords
Intelligent Chatbot; Retrieval-Augmented Generation; Large Language Models; Public Funding Incentives; Small and Medium-Sized Enterprises; Digital Government
