Publication

Big Data Visualization: Tableplot for Python

Name: TDDM5000.pdf
Size: 1.67 MB
Format: Adobe PDF

Abstract

The rapid growth of data generation across sectors such as government, social networks, and mobile applications, driven by the Internet of Things (IoT) and cloud computing technologies, has created an urgent need to process and visualize massive amounts of data. In this context, data-intensive environments now depend heavily on Big Data analytics and, consequently, on efficient and scalable visualization methods. This work aimed to identify the types of data visualization commonly used in Big Data, and found that the tableplot chart has proven effective for summarizing large datasets. Tableplot was originally an R package that was discontinued, and it had not been properly translated to Python, with the existing translation retaining only a subset of its functionalities. Having identified this gap and the opportunity to create a tool useful to both the academic community and data analysis professionals, the present work employed agile methodologies and programming language translation dictionaries to port the package to Python, subsequently publishing it on PyPI and GitHub. The Python package reproduces every essential functionality of its R counterpart and adds features that improve readability, which the authors of the original R package had identified as opportunities for improvement. The research provides academic value and innovation through the development and publication of the tableplot package, since tableplot had not yet been translated to Python with all of its original functionalities preserved, and the work can serve as a reference for future package translations between programming languages. In addition, the work can serve as a source for a better understanding of Big Data in general and of the tableplot visualization in particular, which is cited less frequently in academic works than the other visualization methods identified.
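
As a rough illustration of why a tableplot can summarize a large dataset, the sketch below shows the aggregation step that such a chart is built on: rows are sorted by one column, cut into a fixed number of row bins of near-equal size, and each numeric column is averaged within each bin, so that millions of rows reduce to a few hundred values to draw. This is a minimal sketch using only the Python standard library. The function name `tableplot_bins`, its parameters, and the descending sort order are assumptions made for illustration and are not the API of the published package.

```python
from statistics import mean


def tableplot_bins(rows, sort_key, columns, n_bins):
    """Summarize rows the way a tableplot does (hypothetical helper).

    rows     -- list of dicts holding numeric values
    sort_key -- column used to order the rows
    columns  -- columns to average within each row bin
    n_bins   -- requested number of row bins
    """
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")

    # Sort descending by the chosen column (sort direction is an assumption).
    ordered = sorted(rows, key=lambda r: r[sort_key], reverse=True)
    n = len(ordered)
    n_bins = min(n_bins, n)  # never create more bins than rows

    summary = []
    for b in range(n_bins):
        # Integer arithmetic yields bins whose sizes differ by at most one row.
        start = b * n // n_bins
        stop = (b + 1) * n // n_bins
        chunk = ordered[start:stop]
        summary.append({c: mean(r[c] for r in chunk) for c in columns})
    return summary


if __name__ == "__main__":
    data = [{"income": i, "age": 20 + i % 5} for i in range(10)]
    for bin_means in tableplot_bins(data, "income", ["income", "age"], 2):
        print(bin_means)
```

Each dictionary in the result corresponds to one horizontal band of the chart. Categorical columns, which a tableplot typically shows as stacked category frequencies per bin, are left out of this sketch.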

Description

Dissertation presented as a partial requirement for obtaining a Master's degree in Data Driven Marketing, with a specialization in Data Science for Marketing

Keywords

Big Data; Data Visualization; Tableplot; Python; Exploratory Analysis; SDG 4 - Quality education; SDG 7 - Affordable and clean energy
