Research project

MapIntel - Interactive Visual Analytics Platform for Competitive Intelligence

Authors

Publications

Topic Modeling
Publication . Amaro, Ana; Bação, Fernando; NOVA Information Management School (NOVA IMS); Information Management Research Center (MagIC) - NOVA Information Management School; Societa Italiana di Istochimica / PAGEPress Publications
In recent years, the field of Topic Modeling (TM) has grown in importance due to the increasing availability of digital text data. TM is an unsupervised learning technique that uncovers latent semantic structures in large sets of documents, making it a valuable tool for finding relevant patterns. However, evaluating the performance of TM algorithms can be challenging because different metrics and datasets are often used, leading to inconsistent results. In addition, many current surveys of TM algorithms focus on a limited number of models and exclude state-of-the-art approaches. This paper addresses these issues by presenting a comprehensive comparative study of five TM algorithms across three benchmark datasets using five metrics. We offer an updated survey of the latest TM approaches and evaluation metrics, providing a consistent framework for comparing algorithms while introducing state-of-the-art approaches that have been disregarded in the literature. The experiments, which primarily use Context Vectors (CV) Topic Coherence as an evaluation metric, show that Top2Vec is the best-performing model across all datasets, disrupting the tendency for Latent Dirichlet Allocation to be the best performer.

Organizational units

Description

Keywords

Contributors

Funders

Funding entity

Fundação para a Ciência e a Tecnologia

Funding program

Concurso de Projetos de Investigação Científica e Desenvolvimento Tecnológico em Ciência dos dados e inteligência artificial na Administração Pública - 2019

Award number

DSAIPA/DS/0116/2019

ID