Data Profiling in Cloud Migration: Data Quality Measures while Migrating Data from a Data Warehouse to the Google Cloud Platform

dc.contributor.advisor: Pinheiro, Flávio Luís Portas
dc.contributor.advisor: Figueira, Pedro Santos
dc.contributor.author: Cabral, Andreia Filipa Gonçalves
dc.date.accessioned: 2021-05-13T16:44:16Z
dc.date.available: 2021-05-13T16:44:16Z
dc.date.issued: 2021-05-06
dc.description: Internship Report presented as a partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics [pt_PT]
dc.description.abstract: Corporations have developed a strong interest in data. Increasingly, companies have realized that mining data is key to improving their efficiency and effectiveness and to better understanding their customers' needs and preferences. However, as the amount of data grows, so do companies' needs for storage capacity and for data quality that supports more accurate insights. New data storage methods must therefore be considered, evolving from older ones while preserving data integrity. Migrating a company's data from an older method, such as a Data Warehouse, to a newer one, such as the Google Cloud Platform, is an elaborate task, even more so when data quality must be assured and sensitive data, such as Personal Identifiable Information, must be anonymized in a Cloud computing environment. To address these points, profiling data before or after it is migrated adds significant value: it builds a profile of the data available in each data source (e.g., databases, files, and others) based on statistics, metadata, and pattern rules. This helps ensure, through statistical metrics, that data quality is within reasonable standards and that all Personal Identifiable Information is identified and anonymized accordingly. This work describes the process by which profiling Data Warehouse data can improve data quality ahead of migration to the Cloud. [pt_PT]
dc.identifier.tid: 202726967 [pt_PT]
dc.identifier.uri: http://hdl.handle.net/10362/117609
dc.language.iso: eng [pt_PT]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/ [pt_PT]
dc.subject: Data Quality [pt_PT]
dc.subject: Data Profile [pt_PT]
dc.subject: Database [pt_PT]
dc.subject: Data Warehouse [pt_PT]
dc.subject: Cloud [pt_PT]
dc.subject: Data Migration [pt_PT]
dc.subject: Pandas Profiling [pt_PT]
dc.subject: Personal Identifiable Information [pt_PT]
dc.title: Data Profiling in Cloud Migration: Data Quality Measures while Migrating Data from a Data Warehouse to the Google Cloud Platform [pt_PT]
dc.type: master thesis
dspace.entity.type: Publication
rcaap.rights: openAccess [pt_PT]
rcaap.type: masterThesis [pt_PT]
thesis.degree.name: Mestrado em Métodos Analíticos Avançados [pt_PT]

Files

Name: TAA0085.pdf
Size: 2.1 MB
Format: Adobe Portable Document Format