FCT: DI - Master's Dissertations (Dissertações de Mestrado)
Recent entries
- Improving user experience in Apps generated on a low-code/no-code platform with table actions
Publication. Fresco, Jorge Neves; Aveiro, David; Goulão, Miguel
Low-code and no-code platforms have revolutionized software development by enabling individuals with minimal programming expertise, also known as citizen developers, to create functional applications through intuitive graphical interfaces and pre-built components. Despite their benefits, many of the applications generated by these platforms suffer from usability and inclusiveness challenges due to standardized, rigid interface components that do not adequately accommodate diverse cognitive and interactional needs. This research investigates ways to improve the usability and inclusiveness of applications generated by the Dynamic Information System Modeler and Executor (DISME), an open-source low-code platform (LCP) developed by the Enterprise Engineering Lab, a research group at the Madeira Branch of the NOVA Laboratory for Computer Science and Informatics (NOVA LINCS). The study evaluated the current state of the art in the usability and inclusiveness of applications produced by low-code platforms, and carried out a comparative analysis between DISME and leading market alternatives, such as OutSystems, PowerApps, and Mendix, to identify potential areas for improvement. Several functionalities and differences were identified, but the focus ultimately converged on a more specific objective: the implementation of Table Actions in applications generated by DISME. This feature, which allows interactive actions to be embedded directly into table rows, was identified as a crucial addition for improving the usability of DISME-generated applications. Implementing table actions required substantial architectural changes, including the development of a new Action Items Management component within DISME.
Due to time constraints, it was not possible to conduct usability testing with real DISME users. Instead, a comparison was carried out through a Google Forms questionnaire that presented participants with two versions of a hearing-management page from an application generated by DISME, one using the traditional execution method and one using table actions, alongside a step-by-step demonstration of how to reschedule a hearing requested by a citizen. The results indicated that participants found executing tasks through table actions more intuitive, more efficient, less error-prone, and overall more satisfying than the original method. This comparison highlights the potential of table actions to reduce interaction complexity, improve task execution efficiency, and move DISME toward a more user-centered design. In addition to this technical contribution, we identified a set of promising improvements for both DISME and the current implementation of the Municipality Hearing Process application generated by DISME.
- RedShell: A Generative AI-Based Approach to Ethical Hacking
Publication. Bessa, Ricardo Jorge Matos; Claro, Rui; Lourenço, João; Trindade, João
The application of machine learning techniques to code generation is now common practice for most developers. Tools such as ChatGPT from OpenAI leverage the natural language processing capabilities of Large Language Models to generate machine code from natural language descriptions. In the cybersecurity field, red teams can also take advantage of generative models to build malicious code generators, bringing more automation to pentest audits. However, the application of Large Language Models to malicious code generation remains challenging due to the lack of data to train and evaluate offensive code generators. In this work, we propose RedShell, a tool that allows ethical hackers to generate malicious PowerShell code. We also introduce a ground-truth dataset, combining publicly available code samples, to fine-tune models for malicious PowerShell generation. Our experiments demonstrate the strong capabilities of RedShell in generating syntactically valid PowerShell, with over 90% of the generated samples successfully parsed without errors. Furthermore, our specialized model produced samples that were semantically consistent with reference snippets, achieving competitive performance on standard output similarity metrics such as edit distance and METEOR, with similarity scores exceeding 50% and 40%, respectively. We also conducted a functional evaluation of the snippets generated by our tool, emphasizing their strong effectiveness in a wide range of offensive cybersecurity operations. This work sheds light on state-of-the-art research in Generative AI applied to pentesting and serves as a stepping stone for future advancements, highlighting the potential benefits these models hold within such controlled environments.
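The abstract above scores generated code against references with edit distance. As a hedged illustration (not RedShell's actual evaluation code; the function names `levenshtein` and `edit_similarity` are ours), a normalized edit-distance similarity can be computed like this:

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer string, keep rows short
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def edit_similarity(generated: str, reference: str) -> float:
    """Similarity in [0, 1]: 1.0 means identical, 0.0 maximally different."""
    if not generated and not reference:
        return 1.0
    dist = levenshtein(generated, reference)
    return 1.0 - dist / max(len(generated), len(reference))
```

A score above 0.5 for `edit_similarity("Get-Process", "Get-Process -Name pwsh")` would count toward the ">50%" edit-distance figure reported in the abstract, under this normalization.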
- Data-driven system identification of reconfigurable physical systems
Publication. Łaszkiewicz, Piotr Marcin; Soares, Cláudia; Lourenço, Pedro
This thesis presents a benchmark of Linear Time Varying (LTV) system identification methods with a focus on reconfigurable spacecraft. For this purpose, we developed simulations of a spring-mass-damper system in five challenging scenarios, including nonlinear damping, strong perturbations, and reconfiguration. To tackle the problem of system identification, we propose two approaches: (1) an integration of the nonlinear autoregressive model with exogenous input (NLARX) with different deep learning architectures, including the multilayer perceptron (MLP), residual network (ResNet), recurrent neural network (RNN), and long short-term memory network (LSTM), as well as regression algorithms such as XGBoost and regression forests; and (2) an application of Physics-Informed Neural Networks (PINNs) to model the state-space representation of the system. In the benchmark, we enhance this class of machine-learning-based algorithms with fast closed-form identification from large-scale data for LTV systems (COSMIC), and compare them with methods originating from the systems theory field, i.e., the time-varying eigensystem realization algorithm (TVERA), time-varying observer/Kalman filter identification (TVOKID), and identification of LTV dynamical models with smooth or discontinuous evolution by means of convex optimization (LTVModels). We found that the machine-learning-based methods dominate the benchmark in terms of state propagation accuracy; however, it is not possible to select one best approach for the discussed problem.
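As an illustrative aside, the base spring-mass-damper dynamics m·x″ + c·x′ + k·x = 0 that such benchmark scenarios build on can be simulated in a few lines. This is a sketch under our own assumptions (parameter values and the explicit-Euler integrator are ours, not the thesis setup):

```python
def simulate(m=1.0, c=0.5, k=4.0, x0=1.0, v0=0.0, dt=1e-3, steps=5000):
    """Integrate m*x'' + c*x' + k*x = 0 with explicit Euler.
    Returns a list of (time, position) samples."""
    x, v = x0, v0
    traj = [(0.0, x)]
    for n in range(1, steps + 1):
        a = (-c * v - k * x) / m          # acceleration from Newton's 2nd law
        x, v = x + dt * v, v + dt * a     # one explicit-Euler step
        traj.append((n * dt, x))
    return traj

# With c^2 < 4*m*k the system is underdamped: a decaying oscillation.
traj = simulate()
```

A reconfiguration scenario would correspond to changing m, c, or k mid-trajectory, which is what makes the identification problem time-varying.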
- On Prognosis Modeling
Publication. Gonçalves, Simão Guimarães Ferreira Novais; Fonseca, Miguel; Martins, Flávio
With the rising adoption of Electronic Health Record (EHR) systems in healthcare facilities, we are presented with an unprecedented amount of rich information regarding patient care and the evolution of health conditions. This has given rise to research opportunities to model hospital data for several patient trajectory tasks, specifically predicting future diseases. Prognosis modeling is the task of predicting the diagnoses a patient will be assigned in the future, given their medical history. If a model can learn the temporal development of diagnosed health conditions, hospitals are better prepared to provide early recognition and treatment of patients. In this dissertation, we explore different deep learning and machine learning models for the task of prognosis modeling. We make a thorough analysis and discussion of how to build a prognosis dataset from an EHR dataset, and work with two datasets: the publicly available Medical Information Mart for Intensive Care (MIMIC-III) EHR dataset of around 50 thousand patients, and a private dataset from Hospital da Luz (HDL) with almost half a million patients. On both datasets, we conclude that neither the sequence model (RNN) nor the basic Multi-Layered Perceptron (MLP) model can achieve higher performance than the Logistic Regression model, and that all three models mostly learn the identity function for chronic conditions as well as some weak co-occurrence functions, while performing very poorly on predicting the onset of new conditions for the patient. We also concluded that our models were well calibrated, and leveraged this finding both to provide a baseline abstention pipeline for uncertain predictions and to make the case for the importance of well-calibrated models in healthcare.
Lastly, we developed Monte Carlo Dropout models to use the posterior predictive distribution as a proxy for the uncertainty of predictions, but did not achieve significant improvements.
- Metrics and real-time monitoring in distributed systems
Publication. Paulico, Diogo Ramos; Leitão, João
With the increasing popularity and diversity of Internet services and the affordability of devices that allow access to the Internet, services have increasingly evolved to adopt a distributed model in order to serve a large number of geographically distributed users. This has led to a growing need for tools that make the tasks of prototyping, implementing, and evaluating distributed applications and protocols more manageable. While many tools facilitate the implementation and evaluation of distributed applications and protocols, there is a lack of solutions for monitoring and assessing the performance of distributed systems. As a result, the developers of those applications must implement their own monitoring solutions, typically in an ad hoc fashion, which is both time-consuming and error-prone. In this work, we address this shortcoming by designing a solution that allows developers of distributed applications and protocols to obtain information about hardware resource utilization, as well as user-defined metrics for their solutions, without making significant modifications to their code. To this end, our solution supplies developers with a library they can use to capture and expose relevant metrics for their distributed applications and protocols. Moreover, the solution provides a monitoring stack that collects those metrics and displays them to the developer using relevant visualizations. The evaluation of our proposed solution demonstrates that it is easy to use, requiring minimal code changes to existing protocols and applications, with negligible complexity increase. Furthermore, the performance overhead introduced by our solution is minimal, making it suitable for use in a wide range of distributed systems.
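As a hedged illustration of the library-based approach the abstract describes, a minimal metrics registry might look like the following. The names `incr`, `gauge`, and `snapshot` are our assumptions for the sketch, not the actual library's API:

```python
import time
from collections import defaultdict

class Metrics:
    """Tiny registry a protocol can call with one line per event."""
    def __init__(self):
        self._counters = defaultdict(int)
        self._gauges = {}

    def incr(self, name, amount=1):
        """User-defined counter, e.g. messages sent by a protocol."""
        self._counters[name] += amount

    def gauge(self, name, value):
        """Point-in-time value, e.g. current number of connected peers."""
        self._gauges[name] = value

    def snapshot(self):
        """What a monitoring stack would periodically scrape and visualize."""
        return {"ts": time.time(),
                "counters": dict(self._counters),
                "gauges": dict(self._gauges)}

# Minimal instrumentation: one call per event, no protocol logic changed.
metrics = Metrics()
metrics.incr("messages_sent")
metrics.incr("messages_sent")
metrics.gauge("active_peers", 7)
```

The point of the design is the low intrusion cost: each instrumentation site is a single call, matching the abstract's claim of minimal code changes.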
- Facilitating the usage of ASP
Publication. Martins, Rafael Lopes; Knorr, Jörg Matthias; Gonçalves, Ricardo
In today's world, many search and optimization problems require analyzing millions of possibilities, making it infeasible for humans to evaluate all options efficiently. Real-world challenges, such as scheduling (e.g., university course timetables), logistics (e.g., warehouse management), and governmental planning (e.g., determining optimal locations for infrastructure), must be solved in optimized ways to benefit society. To address these challenges, Answer Set Programming (ASP) was developed as a powerful tool to model and solve complex real-world problems through solvers that take advantage of modern computational power. However, this form of declarative programming is challenging for newcomers, who may struggle with modeling problems using ASP techniques. The Easy Answer Set Programming (Easy ASP) methodology was developed to simplify the learning process associated with ASP by introducing structured guidelines for program use and modeling. The Visual Studio Code extension EZASP implements that methodology in a practical manner. This tool supports both newcomers and experts by enforcing the guidelines of the Easy ASP approach. While EZASP achieves its primary objective, significant limitations remain, such as undifferentiated syntax error messages and the absence of tools for code improvement. This thesis addresses these gaps by enhancing EZASP with dynamic functionalities, including advanced error detection with detailed messages and tools for automatically reorganizing code. The primary goal is to provide newcomers with a more intuitive, supportive introduction to ASP problem modeling, helping them overcome initial challenges. Additionally, these improvements aim to benefit experienced ASP users through quality-of-life (QoL) features that enhance existing ASP tools.
- Water simulation and optimization for biomimetic propulsion in underwater vehicles using HLSL
Publication. Lobo, Lucas Correia; Nóbrega, Rui; Lobo, Victor
Despite the great importance of the oceans and other bodies of water to our lives, they remain largely unexplored. Current solutions to explore these environments include Biomimetic Underwater Vehicles (BUVs), unmanned vehicles inspired by biological life, usually in their locomotion, which can be used to navigate otherwise complicated terrain and can be stealthier than screw-like propellers in underwater vehicles, among other benefits. Despite these benefits, their development is still hampered by one common factor: the control mechanism. The European SABUVIS project, in which this thesis is included, aims to tackle this problem by training a control mechanism through a reinforcement learning algorithm. This thesis presents the development and evaluation of a real-time fluid simulation tool designed to support the SABUVIS project. Built using the Unity3D game engine and accelerated through the High Level Shader Language (HLSL), the simulation models fluid dynamics using an Eulerian (grid-based) approach. Key features include customizable resolution, multiple rendering modes, and post-processing effects to enhance visualization and user interaction. The simulation was tested for correctness and performance under varying conditions. Results confirmed that the system accurately replicates expected fluid behavior, including interactions with static obstacles, and performs efficiently enough to support future integration with reinforcement learning algorithms. Performance benchmarks evaluated the impact of resolution, overrelaxation, and visualization features on execution time.
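The overrelaxation the benchmarks mention can be illustrated with a small CPU sketch of the kind of grid solve an Eulerian simulator repeats each frame. This is not the HLSL implementation: successive over-relaxation (SOR) on a Laplace problem stands in for the iterative pressure solve, and the grid size and relaxation factor are our illustrative assumptions.

```python
def sor_laplace(grid, omega=1.7, iters=200):
    """Relax each interior cell toward the average of its 4 neighbours.
    omega = 1 is plain Gauss-Seidel; 1 < omega < 2 over-relaxes, taking a
    larger step per sweep so the solve converges in fewer iterations."""
    n = len(grid)
    for _ in range(iters):
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                avg = 0.25 * (grid[i - 1][j] + grid[i + 1][j] +
                              grid[i][j - 1] + grid[i][j + 1])
                grid[i][j] += omega * (avg - grid[i][j])
    return grid

# Fixed "hot" top boundary, "cold" elsewhere: the interior settles into a
# smooth field between the two extremes.
n = 16
field = [[0.0] * n for _ in range(n)]
field[0] = [1.0] * n
field = sor_laplace(field)
```

On a GPU this sweep maps naturally to a compute-shader dispatch over the grid, which is where the HLSL acceleration and the benchmarked cost of extra iterations come in.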
- Towards Inclusive Communication: Applying AR Glasses and VLMs for Real-World and Context-Aware Sign Language Translation
Publication. Arruda, Pedro Guilherme Moreira; Birra, Fernando; Sousa, Joana
More than 70 million Deaf or Hard-of-Hearing (DHH) individuals use sign language as a means of communication. Still, existing Sign Language Translation (SLT) systems struggle with real-world applicability due to data scarcity, lack of context-awareness, and limited integration with portable, hands-free technologies. This work proposes a novel, gloss-free and context-aware SLT system tailored for real-world scenarios, combining a Vision-Language Model (VLM) and Augmented Reality (AR) glasses to provide accurate translations in everyday situations while also focusing on being efficient and sustainable. In contrast to most recent systems, which rely on large-scale resources, this work was developed under strict hardware limits, which motivated the design of a resource-optimized fine-tuning strategy that reduced training costs by about 40%. This also led to a focus on targeted and lightweight architectural changes, resulting in the MotionCLSAdapter, a module that greatly improved temporal motion modeling. Conversational context was incorporated through the creation of a small synthetic dataset and prompting techniques, while cloud-based model deployment and the AR glasses enabled hands-free and interactive use of the application. The results show that these optimizations led to stronger translation metrics, improved clustering of signs, and more coherent dialogue-level translations, while latency remained within the threshold defined for natural interaction. Despite challenges in robustness under real-world capture conditions and error propagation in extended contexts, the prototype demonstrates the feasibility of delivering context-aware SLT through wearable technology. Importantly, it also represents the first work to integrate SLT with AR glasses.
Overall, this dissertation provides a solid step towards more inclusive communication, demonstrates that meaningful progress can be achieved even under strict compute limits, and lays a foundation for future systems that support natural communication between signers and non-signers in everyday environments.
- Open Data Repository for a Fuel Break Monitoring System
Publication. Barros, Francisco Martins de; Damásio, Carlos; Pires, João
The prevention of forest fires is one of the greatest challenges faced by Portugal, as failures in this prevention lead to severe economic, environmental, and human consequences. One of the methods implemented to reduce these consequences is the creation and maintenance of fuel breaks, strategically cleared areas that act as a barrier to slow or contain the spread of wildfires. To support the monitoring of fuel breaks, an information system is in place to aid in fire prevention and mitigation. However, the inspection and maintenance of the breaks are complex processes due to their vast extent and the need for regular inspections to ensure that vegetation levels remain within the established limits. Management involves various entities, ranging from national institutes to local municipalities, and the absence of a centralized and publicly accessible data repository for these entities complicates the management of these zones. To address this challenge, an open data repository was developed to manage the spatiotemporal data of the Floresta Limpa project. This repository is made available through multiple APIs and modules to facilitate the collection and dissemination of data for the entities responsible for managing the fuel breaks. The platform supports the dissemination of spatiotemporal data in various open data formats and specifications, enabling seamless integration with other platforms and responsible entities. Additionally, the platform includes a data provenance system to track the origin and transformations of the collected data. This ensures that all modifications are documented and verifiable, reinforcing trust in the data used by the responsible entities.
With this capability, it is possible to validate the integrity of the information, facilitating analysis and informed decision-making related to the maintenance of the zones, especially with the use of machine learning models.
- Missing Data in Machine Learning: Impact of Replacement Techniques for Missing Data in Some Supervised Machine Learning Models
Publication. Domingos, Euclides de Almeida; Amaral, Paula
Supervised models are widely used for classification and regression tasks, as they aim to extract information from raw data and make inferences and predictions on new, unseen datasets, discovering patterns or explaining relations within the data. This goal can be compromised for various reasons, one of which is the existence of missing data in the training set. The likelihood of missing data occurring in datasets is high, particularly in areas such as healthcare and medical records, surveys and social science data, and finance and banking, among others. This dissertation aims to study and compare different missing data imputation techniques and evaluate their impact on the performance of supervised classification models. Simple techniques such as mean, median, and mode imputation were applied to datasets with different levels of missingness (3%, 10%, 20%, and 30%) and different sizes, evaluating the impact on models such as KNN, Logistic Regression, SVM, Decision Tree, Random Forest, and Neural Networks. Model performance was evaluated using accuracy, precision, recall, and F1-score metrics. The results show that the choice of imputation technique significantly influences the quality of the prediction, confirming the importance of properly handling missing data.
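The simple imputers compared above can be sketched in a few lines. This is a minimal pure-Python illustration under our own assumptions (the dissertation's actual pipeline is not specified here); missing entries are represented as `None`:

```python
import statistics

def impute_column(values, strategy="mean"):
    """Fill None entries in one column using the chosen statistic of the
    observed values: mean/median for numeric data, mode for categorical."""
    observed = [v for v in values if v is not None]
    fill = {"mean": statistics.mean,
            "median": statistics.median,
            "mode": statistics.mode}[strategy](observed)
    return [fill if v is None else v for v in values]
```

In an experiment like the one described, each strategy would be applied per column to copies of the same dataset at each missingness level (3% to 30%) before training the downstream classifiers.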
