FCT: DI - Master's Dissertations

Recent submissions

Showing 1 - 10 of 1037
  • Large Language Model for Querying Databases in Portuguese
    Publication . Figueiredo, Lourenço Maria Morais da Silva Pinto de; Marques, Nuno; Cavique, Luís
    This dissertation presents a system designed to generate SQL queries from natural language in Portuguese, using Cedis’ sports facility management application ESPORT.IA as a case study. The system’s performance was validated with real-world client queries, focusing on its ability to interpret synonyms and handle complex query requirements. Experiments highlighted challenges such as model hallucination and time interval processing, emphasizing the need for precise context in improving query accuracy. A custom benchmark, ESPORTNL2SQL, based on the Spider SQL Hardness Criteria, was developed to evaluate the system across varying query difficulties. On the benchmark’s first version, GPT-4 Turbo and GPT-4o scored 88.79% and 80.17%, respectively. On the second version, they scored 75.00% and 80.17%, respectively. The results provide evidence that adding contextual information improves accuracy. While the system shows potential in simplifying database interactions through natural language, limitations exist, particularly in managing larger contexts without degrading performance. This research offers insights into the practical application of NL2SQL systems in enterprise environments and their potential for broader deployment.
  • Automated Data Extraction from Documents Using AI: Response Automation and Integration in Document Management Systems
    Publication . Silva, Afonso Pimenta de Oliveira e; Silva, Joaquim; Campos, Adriano; Roque, Miguel
    Organizations are increasingly challenged by the need to process large volumes of unstructured documents such as invoices, receipts, and contracts. Traditional rule-based approaches often fail to capture the variability and complexity of such data, leading to inefficiencies and high error rates. This thesis investigates two AI-driven strategies for automated information extraction from unstructured documents. The first combines Optical Character Recognition (OCR) through Tesseract with a Large Language Model (LLaMA) to identify and extract key fields from recognized text. The second employs Donut, a transformer-based vision-language model designed for end-to-end document understanding, which directly maps document images to structured outputs without an intermediate OCR step. A comparative evaluation is conducted across multiple measures, such as character and word error rates, extraction accuracy, latency, and scalability, to assess the trade-offs between modular and end-to-end pipelines. The results highlight the strengths and limitations of each method, providing insights into the practical deployment of intelligent document processing systems. This work contributes to advancing the field by offering a systematic comparison of OCR-LLM pipelines versus transformer-based end-to-end models for real-world unstructured document processing.
  • Tool for Converting Natural Language into Actions
    Publication . Lopes, Rita Gomes; Carvalho, Alberto; Medeiros, Pedro
    The rapid digitalization of services has increased the need for tools capable of understanding natural language instructions and acting on them. However, traditional solutions based on rigid rules prove insufficient for complex interactions, compromising the effectiveness of customer support and the user experience. In this context, this dissertation presents an advanced conversational system focused on appointment management in digital consular services and integrable into web applications, namely Opensoft's solutions. The solution comprises three main components: a conversational agent, a server, and a user interface. The agent was developed using a Large Language Model (LLM) and the LangChain framework. The server, built with FastAPI, handles communication with the user interface and is also responsible for invoking the LLM. The interface, developed in React, provides features such as internationalization, dynamic theme management, and feedback mechanisms, offering a modern user experience. The solution's robustness and reliability were also evaluated through manual, unit, integration, and end-to-end tests, carried out with tools such as Pytest and RAGAs. These tests ensured the agent's correct use of tools, the consistency of the responses produced, and compliance with the defined security requirements. The results show that integrating LLMs into conversational agents with access to tools overcomes the limitations of traditional chatbots, enabling better interpretation of natural language, reliable execution of actions, and context-appropriate responses. In addition, the modular and scalable architecture developed provides a solid foundation for adaptation to different domains, contributing to the modernization of digital customer service and an improved user experience.
  • Autoformalization: Generative Artificial Intelligence in Theorem Formalization
    Publication . Atalaia, Bernardo João Ferreira; Marques, Nuno; Janota, Mikoláš; Araujo, João
    Autoformalization is the task of automatically translating mathematical content written in natural language into expressions in a formal language. This process has gained momentum with recent advances in Large Language Models (LLMs), which show growing proficiency in both natural and formal languages. However, LLMs alone are not yet able to provide autoformalization consistently and reliably, especially in specialized mathematical domains. This work proposes an integrated framework that combines LLMs and the ProverX tool to automate the generation of datasets for autoformalization in Prover9/Mace4. Starting from formal data, the tool covers the stages of informalization, formalization, and validation, in order to generate the corresponding natural language representations of the initial data. In tests on a dataset of 438 mathematical classes, adjustments to the hyperparameters and to the LLM input significantly improved the results, reaching 346 valid informalizations and up to 108 valid classes in a single run. The main contribution is to demonstrate the practical feasibility of LLM-assisted automatic dataset generation, providing a functional and extensible framework for creating pairs of theorems and mathematical classes in formal and informal representations. This approach lays a foundation for future research on large-scale autoformalization, taking advantage of the integration with the ProverX tool.
  • A Textbook of Verified OCaml Programs. A deductive study on algorithms and data structures
    Publication . Gasparinho, Pedro Ramos; Pereira, Mário
    We live in a society that has an ever-increasing dependency on critical software. Even minor programming oversights can have disastrous consequences, ranging from endangering human and animal lives to economic loss. Thus, it is crucial to verify the correctness of such programs, which can be achieved with deductive verification techniques. However, the teaching of such techniques is hampered by the small number of bibliographical works in this field. Our goal with this work is ambitious: to write a textbook on verified algorithms and data structures, with proofs that are fully automated or as close to it as possible. We believe that OCaml is the perfect programming language choice for this endeavour, due to its multi-paradigm nature and its status as a general-purpose language with use cases in both academia and industry. Most importantly, it has good support for automated deductive verification with Cameleer and GOSPEL. The current version of the textbook includes more than 1000 lines of verified OCaml code spread across 6 chapters and more than 40 programs. Moreover, we offer a blend of functional and imperative case studies and, overall, a substantial subset of the constructs found in OCaml, including its module system. In this document, we discuss our thought process and the design decisions behind the creation of a textbook on verified OCaml algorithms. These include identifying which classes of algorithms to tackle, which algorithms to choose within those classes, and how to organize the contents of the textbook, to name a few topics. This discussion is accompanied by an overview of theoretical and practical concepts concerning programming languages, algorithm analysis, deductive verification, and the selected tools.
  • A Model-Driven Approach to the Generation of Front-Ends
    Publication . Nascimento, Everton; Cunha, Jácome; Amaral, Vasco
    Today, most companies have many ways to connect with and gather data from their clients. Some companies provide spreadsheets to be filled in and submitted, while others have websites with forms to collect information, or even an Android or iOS app. Multiple types of front-ends are now the norm. This brings advantages in connectivity and interactivity, and the data gathering possibilities are great. But in a fast-evolving world, time is important. Therefore, a faster and more efficient way to generate and update front-ends, and to guarantee coherence between them, would be very relevant. In this work, we tackle the problem of coherently generating multiple types of front-ends from their common component, the database schema. Our aim is to create a solution that automatically generates front-ends from a database schema definition in SQL code. Our solution is grounded in foundations and tooling found in model-driven development and DSLs. From a database schema specification in SQL, we want to automatically generate spreadsheets and web pages (and in the future possibly any desired template) that already contain the necessary restrictions and verifications. By doing this, we can decrease the likelihood of errors and guarantee the efficiency and correctness of all elements of the problem. As a case study, we aim to solve a specific real-world case at the Portuguese General Inspection of Finance (IGF). At IGF, we have identified the need for synchronization and consistency between the database, spreadsheets, and web pages. By addressing this need, we aim to create a robust solution that solves various problems without requiring users to learn new techniques or technology with a significant learning curve. Our solution was also tested with multiple schemas, and we are able to generate web pages from the schema definitions of multiple SQL dialects.
  • Stakeholder-Driven Web Platform for Collaborative Development of Sustainable Tourism Products
    Publication . Cordeiro, André Miguel Mangericão; Madeira, Rui; Romão, Teresa
    This dissertation presents the design and development process of a web platform prototype designed within the scope of the Sustainability-oriented, Highly-interactive, and Innovation-based Framework for Tourism Marketing (SHIFT) project. The platform is designed to promote the active involvement of all stakeholders in the co-creation process of tourism products, thereby strengthening this vulnerable sector by enabling the development of more sustainable and well-adapted solutions. The platform integrates a wide range of features to support collaboration, including idea submission and management, feedback mechanisms, discussion spaces, and gamified elements to encourage sustained engagement. Its development followed an iterative and incremental process, with requirements gathering, conceptual modeling, and interface design refined through usability testing and team feedback. A user study to assess the solution was conducted with nine participants, combining practical testing with questionnaires. Results showed that most participants found the concept original and valuable, highlighting the importance of giving end users and residents a stronger voice in tourism innovation. From a usability standpoint, the platform was considered intuitive and straightforward to use, with positive ratings on the System Usability Scale. Gamified features such as levels and daily tasks were particularly well received, while digital rewards were viewed more critically by the participants. Overall, the findings suggest that the SHIFT platform represents a promising and innovative contribution to collaborative tourism design. This work has already led to the publication of a poster at MUM 2024 (International Conference on Mobile and Ubiquitous Multimedia), and a full paper is in preparation for submission to a top-tier conference after completing a broader evaluation study with a more diverse group of participants.
    Future work will focus on refining gamification strategies, improving interface responsiveness, and scaling the evaluation to validate the platform’s impact across different types of tourism stakeholders.
  • From Augmented to Virtual Reality: Exploring the Virtuality Continuum in Parkinson’s Disease Therapy
    Publication . Viana, João Miguel Duarte; Santos, Pedro; Madeira, Rui; Macedo, Patrícia; Nóbrega, Rui
    Parkinson’s Disease (PD) is a progressive neurological disorder that affects movement through symptoms such as tremors, rigidity, and slowness, with a strong impact on patients’ quality of life. Since current treatments are mainly symptomatic, new complementary strategies are needed. One of the main challenges is the lack of motivation in patients, which affects their adherence to exercise plans. In this context, interactive tools that can engage patients play an important role. This dissertation explores the use of immersive technologies for PD therapy, focusing on how different levels of immersion influence the user experience. A serious game, Fruit Picker XR, was developed for the Meta Quest platform, supporting a progression from Augmented Reality (AR) with fewer virtual elements to fully immersive Virtual Reality (VR). This design makes it possible to test multiple steps along the virtuality continuum, instead of just comparing AR and VR as two extremes. Two user studies were carried out. The first compared AR and VR versions with healthy participants and people with PD. The second extended this work by testing three different levels of immersion in a refined version of the game integrated with a therapeutic platform. Results from both studies showed that PD patients found AR easier and less demanding, while VR offered stronger immersion but a higher workload. The intermediate level in the second study suggested a possible balance, combining accessibility with a greater sense of presence. Usability scores were positive overall, and eye-tracking data provided additional insights into how attention shifted across the different immersion levels. The study confirms the potential of immersive technologies for rehabilitation, while also pointing out their limitations.
    Future work should include tests with larger groups of PD patients, improvements to the questionnaires to make them more suitable for clinical populations, and longitudinal studies to assess therapeutic effectiveness. This work has already been disseminated in two co-authored scientific publications, and provides a basis for further research on the use of immersive technologies in therapy.
  • Paprika: A Popularity, Affinity and Capacity-Aware Replica Placement Algorithm
    Publication . Pacheco, Xavier Pinheiro; Paulino, Hervé
    The placement of data replicas on nodes at the network edge is a key strategy to maximize user experience, as it minimizes data access latency. However, care must be taken in data placement, as edge servers do not have the same storage capacity as a cloud server. In this dissertation, we present Paprika, an algorithm for replica selection and placement adapted to the requirements of execution and storage environments at the network periphery, which makes decisions based on factors such as data popularity, affinity between data from different regions, and the storage capacity of the servers supporting these regions. Given that the replica placement problem is NP-Complete, Paprika relies on a heuristic that operates in a decentralized structure, allowing for quick convergence while calculating relevant solutions for each region. In this work, we implement Paprika in the Gardenbed framework and create a simulation platform capable of simulating all necessary aspects of Gardenbed. Additionally, we tested and compared various heuristic algorithms for the data placement section of Paprika. The performance evaluation reported in this dissertation shows that not only does Paprika consistently give better solutions, containing a fairer representation of the user’s interests, but it also allows niche data in the origin region to be disseminated to other regions where interest in that data is higher.
  • WFA Sim — a Wi-Fi Aware Simulator
    Publication . Tomás, João Lopes; Teófilo, António; Paulino, Hervé; Lourenço, João
    Wi-Fi Aware is a new wireless technology that eases device-to-device communication without requiring traditional infrastructure, unlike conventional Wi-Fi. This technology was designed with key features such as power efficiency and dynamic connectivity. Wi-Fi Aware operates through a system where devices announce a service (Publishers) and others utilize that service (Subscribers). However, Wi-Fi Aware faces a significant challenge: the absence of straightforward testing tools. The current way of testing Wi-Fi Aware applications requires the use of physical Android devices that support the technology. This introduces several issues, such as the fact that these devices are expensive, have limited battery life, and must be physically moved to test different scenarios, making it very hard to test complex networks that could happen in the real world. This work proposes a Wi-Fi Aware simulator that provides an interface as close as possible to the one available through Android while respecting the Wi-Fi Aware protocol defined by Wi-Fi Alliance. This simulator incorporates real device limitations, such as battery life and other Wi-Fi Aware related characteristics. It also enables testing of various scenarios, such as the number of nodes, their movement speeds and their dynamic joining and leaving of the network. Finally, this simulator also provides a visualization tool that allows for the easy understanding of the current state of the network, where characteristics such as the different services, different device roles and the current communication links can be seen. By simplifying the Wi-Fi Aware testing process, this dissertation aims to increase the adoption of the technology, making it more accessible to developers and encouraging its integration into the broader ecosystem.