Publication

Anchor-Based Density Undersampling with Swarm Stabilization

datacite.subject.fos: Natural Sciences::Computer and Information Sciences
datacite.subject.sdg: 09: Industry, Innovation and Infrastructure
dc.contributor.advisor: Damásio, Bruno Miguel Pinto
dc.contributor.author: Páris, Pedro Maria Queiroz Pereira Rocha
dc.date.accessioned: 2026-04-20T15:08:11Z
dc.date.available: 2026-04-20T15:08:11Z
dc.date.issued: 2026-04-13
dc.description: Dissertation presented as a partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science
dc.description.abstract: Class imbalance is a pervasive problem in machine learning: one class, often the class of interest, is underrepresented relative to the others. This imbalance can severely compromise the performance of standard classifiers, which tend to favor the majority class. A well-known example is a dataset with a 99.9% majority class and a 0.1% minority class, on which a classifier that always predicts the majority class achieves 99.9% accuracy while being practically useless. This thesis introduces Anchor-Based Density Undersampling with Swarm Stabilization (ABDUSS), a novel resampling method that uses a swarm-inspired undersampling heuristic to selectively remove majority-class instances based on data density. The method (i) estimates minority-class density via feature-wise smoothing, (ii) selects a high-density minority anchor, (iii) applies a lightweight continuous swarm update toward this anchor to stabilize the search space, and (iv) removes majority samples within a density-scaled radius using a KD-tree range query. Unlike optimization-based particle swarm optimization (PSO) mask methods, ABDUSS avoids inner-loop validation optimization and operates instead as a fast density-guided heuristic with a single scaling parameter. Experimental evaluation on three imbalanced datasets (credit card fraud, telco churn, and customer satisfaction) using XGBoost shows that ABDUSS reduces overlap near minority clusters and achieves competitive F1-score and AUC, with improved minority recall in several scenarios. These results indicate that ABDUSS provides a simple, computationally efficient, and reproducible baseline for density-aware undersampling in imbalanced classification tasks.
dc.identifier.tid: 204297354
dc.identifier.uri: http://hdl.handle.net/10362/202384
dc.language.iso: eng
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/
dc.subject: Class imbalance
dc.subject: Density-based undersampling
dc.subject: Instance selection
dc.subject: Data-level preprocessing
dc.subject: Imbalanced classification
dc.subject: Swarm-inspired methods
dc.title: Anchor-Based Density Undersampling with Swarm Stabilization
dc.type: master thesis
dspace.entity.type: Publication
thesis.degree.name: Master's in Data Science and Advanced Analytics, specialization in Data Science
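
The four steps listed in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation, which this record does not include. The function names (`feature_wise_density`, `abduss_sketch`), the Gaussian kernel and its `bandwidth`, the particle count, step count and `pull` rate, and in particular the radius formula are all assumptions made for illustration. Only `scale` is meant to correspond to the single scaling parameter the abstract mentions; the other knobs are fixed illustrative constants.

```python
import numpy as np
from scipy.spatial import cKDTree


def feature_wise_density(X_min, bandwidth=0.5):
    """Assumed density score: mean of per-feature 1-D Gaussian kernels.

    Builds an (n, n, d) array, so it only suits small minority sets.
    """
    diffs = X_min[:, None, :] - X_min[None, :, :]
    kernels = np.exp(-0.5 * (diffs / bandwidth) ** 2)
    per_feature = kernels.mean(axis=1)   # shape (n_minority, n_features)
    return per_feature.mean(axis=1)      # shape (n_minority,)


def abduss_sketch(X, y, minority_label=1, scale=1.0,
                  n_particles=10, n_steps=20, pull=0.3, seed=0):
    """Hypothetical sketch of density-guided undersampling around one anchor."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    min_mask = y == minority_label
    X_min, X_maj = X[min_mask], X[~min_mask]

    # (i) estimate minority-class density, (ii) pick the densest point as anchor
    density = feature_wise_density(X_min)
    anchor = X_min[np.argmax(density)]

    # (iii) particles start at random minority points and drift toward the anchor
    particles = X_min[rng.integers(0, len(X_min), size=n_particles)].copy()
    for _ in range(n_steps):
        particles += pull * (anchor - particles)
    center = particles.mean(axis=0)

    # (iv) assumed radius: minority spread around the center, scaled by peak density
    spread = np.linalg.norm(X_min - center, axis=1).mean()
    radius = scale * density.max() * spread

    # KD-tree range query: indices of majority points within `radius` of the center
    tree = cKDTree(X_maj)
    drop_idx = np.asarray(tree.query_ball_point(center, r=radius), dtype=int)
    keep = np.ones(len(X_maj), dtype=bool)
    keep[drop_idx] = False

    # Minority samples are never removed; only nearby majority samples are dropped
    X_res = np.vstack([X_maj[keep], X_min])
    y_res = np.concatenate([y[~min_mask][keep], y[min_mask]])
    return X_res, y_res
```

Calling `abduss_sketch(X, y, minority_label=1, scale=2.0)` returns the resampled arrays, which could then be passed to any classifier. A larger `scale` widens the removal radius and drops more majority samples near the anchor.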

Files

Main
Name: TCDMAA4993.pdf
Size: 1.97 MB
Format: Adobe Portable Document Format

License
Name: license.txt
Size: 348 B
Format: Item-specific license agreed upon to submission