Neto, Miguel de Castro Simões Ferreira
Jardim, João Bruno Morais de Sousa
Cruz, Miguel Almeida Coutinho Teixeira da
2024-11-04
2024-11-04
2024-10-30
http://hdl.handle.net/10362/174529
Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Business Analytics
As Lisbon continues to attract a growing number of visitors, a chatbot tailored to tourists’ needs becomes increasingly valuable. In this thesis, we develop an engaging general-purpose chatbot that can fulfill the unique needs of Lisbon’s tourists. Drawing on a web-scraped knowledge base of over 2000 web pages, the chatbot recommends tourist routes, events, and places to visit, and answers queries about Lisbon. Two evaluation datasets, one for question answering and the other for recommendations, were created from synthetic data. Various experiments, including data preprocessing, exploration of different ChatGPT models, and improvements to the retrieval-augmented generation pipeline, were conducted to improve the chatbot. This thesis contributes to the literature on chatbot development, emphasizing the benefits of more advanced machine learning models in the tourism industry. It also demonstrates the potential of iterative optimization of large language models and of evaluation based on synthetic data for downstream tasks.
eng
chatbot
transformer
chatgpt
tourism
natural language processing
SDG 11 - Sustainable cities and communities
Lisa: A touristic chatbot for Lisbon
master thesis
203777190