Research project
LASIGE - Extreme Computing
Funder
Authors
Publications
DGH-GO
Publication . Asif, Muhammad; Martiniano, Hugo F. M. C.; Lamúrias, André; Kausar, Samina; Couto, Francisco M.; NOVALincs; BioMed Central (BMC)
Background: Complex diseases such as neurodevelopmental disorders (NDDs) exhibit multiple etiologies. The multi-etiological nature of complex diseases emerges from distinct but functionally similar groups of genes. Different diseases sharing genes of such groups show related clinical outcomes that further restrict our understanding of disease mechanisms, thus limiting the applications of personalized medicine approaches to complex genetic disorders.
Results: Here, we present an interactive and user-friendly application called DGH-GO. DGH-GO allows biologists to dissect the genetic heterogeneity of complex diseases by stratifying the putative disease-causing genes into clusters that may contribute to the development of distinct disease outcomes. It can also be used to study the shared etiology of complex diseases. DGH-GO creates a semantic similarity matrix for the input genes by using Gene Ontology (GO). The resultant matrix can be visualized in 2D plots using different dimension reduction methods (t-SNE, principal component analysis, UMAP and principal coordinate analysis). In the next step, clusters of functionally similar genes are identified from the genes' functional similarities assessed through GO. This is achieved by employing four different clustering methods (K-means, hierarchical, fuzzy and PAM). The user may change the clustering parameters and explore their effect on stratification immediately. DGH-GO was applied to genes disrupted by rare genetic variants in Autism Spectrum Disorder (ASD) patients. The analysis confirmed the multi-etiological nature of ASD by identifying four clusters of genes that were enriched for distinct biological mechanisms and clinical outcomes. In the second case study, the analysis of genes shared by different NDDs showed that genes causing multiple disorders tend to aggregate in similar clusters, indicating a possible shared etiology.
Conclusion: DGH-GO is a user-friendly application that allows biologists to study the multi-etiological nature of complex diseases by dissecting their genetic heterogeneity. In summary, functional similarities, dimension reduction and clustering methods, coupled with interactive visualization and control over analysis, allow biologists to explore and analyze their datasets without requiring expert knowledge of these methods. The source code of the proposed application is available at https://github.com/Muh-Asif/DGH-GO
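The clustering step described in this abstract can be illustrated with a minimal sketch. It is not DGH-GO's code: the gene names and similarity values are made up, the conversion `distance = 1 - similarity` is an assumption, and the simple alternating k-medoids loop below stands in for PAM, which uses a more thorough swap search.

```python
# Toy semantic-similarity matrix for six hypothetical genes (values are invented).
GENES = ["g1", "g2", "g3", "g4", "g5", "g6"]
SIM = [
    [1.0, 0.9, 0.8, 0.2, 0.1, 0.2],
    [0.9, 1.0, 0.7, 0.1, 0.2, 0.1],
    [0.8, 0.7, 1.0, 0.2, 0.1, 0.3],
    [0.2, 0.1, 0.2, 1.0, 0.9, 0.8],
    [0.1, 0.2, 0.1, 0.9, 1.0, 0.7],
    [0.2, 0.1, 0.3, 0.8, 0.7, 1.0],
]


def to_distance(sim):
    """Turn similarities in [0, 1] into distances (assumed: d = 1 - s)."""
    return [[1.0 - s for s in row] for row in sim]


def assign(dist, medoids):
    """Label each item with the position of its nearest medoid."""
    return [
        min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
        for i in range(len(dist))
    ]


def k_medoids(dist, k, max_iter=100):
    """Alternate between assigning items and re-picking each cluster's medoid."""
    medoids = list(range(k))  # deterministic start: the first k items
    for _ in range(max_iter):
        labels = assign(dist, medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:  # keep the old medoid if a cluster empties out
                new_medoids.append(medoids[c])
                continue
            # The medoid is the member with the smallest total distance to the rest.
            new_medoids.append(min(members, key=lambda m: sum(dist[m][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(dist, medoids)


medoids, labels = k_medoids(to_distance(SIM), k=2)
for gene, label in zip(GENES, labels):
    print(gene, "-> cluster", label)
```

Working on the distance matrix directly, rather than on coordinates, is what makes a medoid-based method a natural fit when only pairwise semantic similarities are available.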
Recommending Words Using a Bayesian Network
Publication . Santos, Pedro; Pato, Matilde; Datia, Nuno; Sobral, José; Leitão, Noel; Ramos Ferreira, Manuel; Gomes, Nuno; NOVALincs; MDPI - Multidisciplinary Digital Publishing Institute
Asset management involves the coordinated activities of an organisation to derive value from assets, which may include physical assets. It encompasses activities related to design, construction, installation, operation, maintenance, renewal, and asset disposal. Asset management ensures the coordination of all activities, resources, and data related to physical assets. Recording and monitoring all maintenance activities is a key part of asset management, often done using work orders (WOs). Technicians typically create WOs using “free text”, which can result in missing or ungrammatical words, making it difficult to identify trends and analyse information. To standardise the terminology used for the same asset maintenance operation, this paper proposes a method that suggests words to technicians as they complete WOs. The word suggestion algorithm is based on past maintenance records, and a Bayesian network-based recommender system adapts to present needs verified by technicians using implicit user feedback. Implementing this system aims to normalise the terms used by technicians when filling in a WO. The corpus for this work comes from asset management records collected in a health facility in Portugal operated by a private company.
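As a rough illustration of suggesting words from past records and adapting to implicit feedback, the sketch below uses the smallest possible Bayesian network, a single edge from the previous word to the next word, with Laplace-smoothed conditional probabilities. The network structure, the class name `WordSuggester`, the smoothing constant and the sample work orders are all assumptions for illustration; the abstract does not specify the paper's actual model.

```python
from collections import defaultdict


class WordSuggester:
    """Rank candidate next words by a smoothed estimate of P(next | previous)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant (assumed value)
        self.counts = defaultdict(lambda: defaultdict(float))
        self.vocab = set()

    def train(self, work_orders):
        """Count word-to-next-word transitions in past work-order texts."""
        for text in work_orders:
            words = text.lower().split()
            self.vocab.update(words)
            for prev, nxt in zip(words, words[1:]):
                self.counts[prev][nxt] += 1.0

    def probability(self, prev, nxt):
        """Smoothed conditional probability of nxt following prev."""
        row = self.counts.get(prev, {})
        total = sum(row.values())
        return (row.get(nxt, 0.0) + self.alpha) / (total + self.alpha * len(self.vocab))

    def suggest(self, prev, k=3):
        """Return the k most probable next words (ties broken alphabetically)."""
        ranked = sorted(self.vocab, key=lambda w: (-self.probability(prev, w), w))
        return ranked[:k]

    def feedback(self, prev, chosen, weight=1.0):
        """Implicit feedback: the technician accepted or typed `chosen` after `prev`."""
        self.vocab.add(chosen)
        self.counts[prev][chosen] += weight


# Hypothetical past work orders.
records = ["replace air filter", "replace air filter", "replace light bulb", "clean air duct"]
suggester = WordSuggester()
suggester.train(records)
print(suggester.suggest("replace", k=2))
```

Each accepted suggestion simply increases the corresponding count, so the ranking drifts toward the terms technicians actually confirm.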
Slim
Publication . Rosenfeld, Liah; Farinati, Davide; Rasteiro, Diogo; Pietropolli, Gloria; Rebuli, Karina Brotto; Silva, Sara; Vanneschi, Leonardo; NOVA Information Management School (NOVA IMS); Information Management Research Center (MagIC) - NOVA Information Management School
This poster presents Slim: an open-source Python library that provides the first-ever framework for the Semantic Learning algorithm based on Inflate and deflate Mutation (SLIM-GSGP). Proposed by Vanneschi in 2024, SLIM-GSGP is a promising non-bloating variant of Geometric Semantic Genetic Programming (GSGP). The Slim library includes all existing SLIM-GSGP variants, as well as traditional GSGP and standard Genetic Programming (GP), facilitating comparative analysis and benchmarking. Additionally, Slim’s semi-modular architecture and parallel computation render it not only fast but also user-friendly and easily extensible, thereby fostering progress in this emerging area of research.
An empirical study on the application of KANs for classification
Publication . Costa, Samuel Sampaio; Pato, Matilde; Datia, Nuno; NOVALincs
Kolmogorov-Arnold Networks (KANs) represent a breakthrough in deep learning, diverging from Multi-Layer Perceptrons (MLPs) by generalizing the Kolmogorov-Arnold representation theorem (KAT) to networks of arbitrary depth and width. This theorem facilitates the decomposition of multivariate functions into constituent one-dimensional elements, with learnable activation functions on weights and the sum operator on nodes. KANs have been shown to exhibit robust performance in function approximation, validated across mathematical, physical, and practical domains such as traffic prediction and medical diagnostics. Our study investigates KANs’ efficacy through comprehensive evaluations on OpenML, Kaggle and UCI datasets, with a focus on enhancing Human Activity Recognition systems. In these evaluations, KANs demonstrate high classification performance compared to conventional machine learning approaches and MLPs. These findings underscore KANs’ potential as scalable, interpretable tools in modern machine learning applications given their favorable neural scaling laws.
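For reference, the decomposition mentioned in this abstract is the standard statement of the Kolmogorov-Arnold representation theorem for a continuous function $f$ on $[0,1]^n$. The symbols $\Phi_q$ and $\varphi_{q,p}$ are notation chosen here, not taken from the abstract:

```latex
f(x_1, \dots, x_n) \;=\; \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \varphi_{q,p}(x_p) \right)
```

Each $\varphi_{q,p}$ and $\Phi_q$ is a continuous function of a single variable, so the only genuinely multivariate operation is addition. A KAN makes these univariate functions learnable and stacks such layers to arbitrary depth and width.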
Geometric Semantic Genetic Programming with Normalized and Standardized Random Programs
Publication . Bakurov, Illya; Muñoz Contreras, José Manuel; Castelli, Mauro; Rodrigues, Nuno Miguel Duarte; Silva, Sara; Trujillo, Leonardo; Vanneschi, Leonardo; Information Management Research Center (MagIC) - NOVA Information Management School; NOVA Information Management School (NOVA IMS); Springer Science Business Media
Geometric semantic genetic programming (GSGP) represents one of the most promising developments in the area of evolutionary computation (EC) in the last decade. The results achieved by incorporating semantic awareness in the evolutionary process demonstrate the impact that geometric semantic operators have brought to the field of EC. An improvement to the geometric semantic mutation (GSM) operator is proposed, inspired by the results achieved by batch normalization in deep learning. While, in one of its most used versions, GSM relies on the use of the sigmoid function to constrain the semantics of two random programs responsible for perturbing the parent’s semantics, here a different approach is followed, which allows reducing the size of the resulting programs and overcoming the issues associated with the use of the sigmoid function, as commonly done in deep learning. The idea is to consider a single random program and use it to perturb the parent’s semantics only after standardization or normalization. The experimental results demonstrate the suitability of the proposed approach: despite its simplicity, the presented GSM variants outperform standard GSGP on the studied benchmarks, with a difference in terms of performance that is statistically significant. Furthermore, the individuals generated by the new GSM variants are easier to simplify, allowing us to create accurate but significantly smaller solutions.
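The contrast between the two mutation styles can be sketched on semantics vectors, that is, the outputs of each program on the training cases. This is a hedged illustration only: the function names are hypothetical, the use of the population standard deviation and of a min-max range of [0, 1] are assumptions, and real GSGP systems build new program trees rather than only combining output vectors.

```python
import math
import statistics


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def gsm_sigmoid(parent, r1, r2, ms):
    """Commonly used GSM: perturb by the difference of two sigmoid-bounded random programs."""
    return [p + ms * (sigmoid(a) - sigmoid(b)) for p, a, b in zip(parent, r1, r2)]


def standardize(values):
    """Z-score the semantics (population standard deviation assumed)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0] * len(values)
    return [(v - mean) / std for v in values]


def normalize(values):
    """Min-max scale the semantics to [0, 1] (target range assumed)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def gsm_single(parent, r, ms, scale=standardize):
    """Sketch of the single-random-program idea: rescale one program, then perturb."""
    return [p + ms * s for p, s in zip(parent, scale(r))]


# Toy semantics on three training cases (invented values).
parent = [1.0, 2.0, 3.0]
random_program = [10.0, 20.0, 30.0]
child_std = gsm_single(parent, random_program, ms=0.1)
child_norm = gsm_single(parent, random_program, ms=0.1, scale=normalize)
print(child_std)
print(child_norm)
```

Using one random program instead of two is what shrinks the offspring in this sketch's terms: each mutation adds a single subtree rather than a pair.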
Organizational units
Description
Keywords
Contributors
Funders
Funding entity
Fundação para a Ciência e a Tecnologia
Funding programme
6817 - DCRRNI ID
Award number
UIDB/00408/2020
