| Name | Description | Size | Format |
|---|---|---|---|
| | | 674.71 KB | Adobe PDF |
Authors
Advisor(s)
Abstract(s)
Imbalanced datasets pose a significant and longstanding challenge to machine learning algorithms, particularly in binary classification tasks. Over the past few years, various solutions have emerged, with a substantial focus on the automated generation of synthetic observations for the minority class, a technique known as oversampling. Among the various oversampling approaches, the Synthetic Minority Oversampling Technique (SMOTE) has garnered considerable attention as a highly promising method. SMOTE generates new observations by creating points along the line segment connecting two existing minority class observations. Nevertheless, the performance of SMOTE frequently hinges on which observation pairs are selected for resampling. This research introduces Genetic Methods for OverSampling (GM4OS), a novel oversampling technique that addresses this challenge. In GM4OS, individuals are represented as pairs of objects. The first object is a GP-like function that operates on vectors, while the second is a GA-like genome containing pairs of minority class observations. By co-evolving these two elements, GM4OS searches simultaneously for the most suitable resampling pairs and the most effective oversampling function. Experimental results on ten imbalanced binary classification problems show that GM4OS consistently outperforms, or is at least comparable to, linear regression alone and linear regression combined with SMOTE.
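The interpolation step described in the abstract can be sketched as follows. This is a minimal illustration assuming NumPy, not the authors' implementation: the names `interpolate`, `oversample`, `pairs`, and `combine` are hypothetical, and the toy data is made up. Standard SMOTE usually picks each partner among the nearest minority neighbours, while here the pairs are passed in explicitly to show why their selection matters. The `combine` argument is a fixed linear interpolation in this sketch; per the abstract, GM4OS would instead evolve both that function and the list of pairs.

```python
import numpy as np


def interpolate(a, b, gap):
    """Return the point a + gap * (b - a) on the segment from a to b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return a + gap * (b - a)


def oversample(minority, pairs, combine, rng):
    """Create one synthetic observation per index pair (i, j).

    `combine` is any function (a, b, gap) -> vector; `gap` is drawn
    uniformly from [0, 1) for every pair.
    """
    return np.array(
        [combine(minority[i], minority[j], rng.random()) for i, j in pairs]
    )


# Toy minority class with three 2-D observations (illustrative data only).
rng = np.random.default_rng(0)
minority = np.array([[0.0, 0.0], [2.0, 4.0], [4.0, 0.0]])
pairs = [(0, 1), (1, 2), (0, 2)]  # hypothetical choice of resampling pairs

synthetic = oversample(minority, pairs, interpolate, rng)
print(synthetic.shape)
```

Each row of `synthetic` lies on the segment between the two observations named by the corresponding pair, so a different `pairs` list yields synthetic points in different regions of the feature space.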
Description
Farinati, D., & Vanneschi, L. (2024). GM4OS: An Evolutionary Oversampling Approach for Imbalanced Binary Classification Tasks. In S. Smith, J. Correia, & C. Cintrano (Eds.), Applications of Evolutionary Computation: 27th European Conference, EvoApplications 2024, Held as Part of EvoStar 2024, Aberystwyth, UK, April 3–5, 2024, Proceedings, Part I (Vol. 1, pp. 68-82). (Lecture Notes in Computer Science; Vol. 14634). Springer Nature Switzerland AG. https://doi.org/10.1007/978-3-031-56852-7_5
This work was supported by national funds through FCT (Fundação para a Ciência e a Tecnologia), under the project UIDB/04152/2020, Centro de Investigação em Gestão de Informação (MagIC)/NOVA IMS.
Keywords
Oversampling; Imbalanced Data; Binary Classification; Genetic Programming; Genetic Algorithms; Theoretical Computer Science; General Computer Science
Educational Context
Citation
Publisher
Springer Nature Switzerland AG
