
A performance comparison of oversampling methods for data generation in imbalanced learning tasks

Name: TEGI0396.pdf
Size: 1.3 MB
Format: Adobe PDF

Abstract

The class imbalance problem is one of the most fundamental challenges faced by the machine learning community. Imbalance arises when the class of interest has relatively few instances compared with the rest of the data. Sampling is a common technique for dealing with this problem, and a number of oversampling approaches have been applied in an attempt to balance the classes. This study provides an overview of class imbalance and examines some common oversampling approaches for addressing it. To illustrate the differences, an experiment on multiple simulated data sets compares the performance of these oversampling methods across different classifiers, using various evaluation criteria. The effect of parameters such as the number of features and the imbalance ratio on classifier performance is also evaluated.
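As a rough illustration of the two ideas the abstract relies on, the imbalance ratio and oversampling, the sketch below applies plain random oversampling to a small simulated data set. It is not taken from the dissertation: the abstract does not name the methods compared, so random oversampling is chosen here only because it is the simplest one, and the function names, the 90/10 class split, and the Gaussian toy features are all assumptions made for the example.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class
    until every class is as large as the largest one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Rows belonging to this class in the original data.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy data: 90 majority (0) and 10 minority (1) instances,
# each with 2 Gaussian features. These values are illustrative only.
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(90)] + \
    [[data_rng.gauss(2, 1), data_rng.gauss(2, 1)] for _ in range(10)]
y = [0] * 90 + [1] * 10

X_res, y_res = random_oversample(X, y)
print(imbalance_ratio(y), imbalance_ratio(y_res))  # prints 9.0 1.0
```

In a comparison like the one described, resampling would normally be applied to the training split only, so that the evaluation criteria are computed on data with the original class distribution.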

Description

Dissertation presented as a partial requirement for obtaining a Master's degree in Statistics and Information Management, with a specialization in Marketing Research and CRM

Keywords

Imbalanced learning; Oversampling methods; Evaluation metrics; Classifier performance
