Designing and Implementing a Metadata-driven Modern Data Warehouse: Automating ELT Processes through Metadata-Driven Pipelines

Name: TGI5011.pdf
Size: 2.59 MB
Format: Adobe PDF

Abstract

The increasing volume, variety, and velocity of data have exposed the limitations of traditional data warehouses and driven a shift towards modern, cloud-based solutions. Modern data warehouses (MDW) offer scalability, flexibility, and advanced capabilities that help organisations meet evolving business needs. This project work implements a metadata-driven approach to automate and optimise ELT (Extract, Load, Transform) processes within a modern data warehouse architecture. The project uses Microsoft Azure Synapse Analytics to minimise manual intervention and enhance scalability. A metadata repository serves as the foundation for generic, reusable data pipeline templates that handle diverse scenarios based on metadata parameters. This standardisation enables efficient, automated data ingestion, transformation, and modelling while reducing development and maintenance time. The solution supports both full and incremental data ingestion into a data lake and implements modelling techniques such as slowly changing dimensions (SCD) Types 1 and 2 and fact tables. To improve accessibility, a Power App interface was developed that simplifies parameter management and lets non-technical users interact with the system. To ensure operational reliability, a monitoring framework was designed and implemented to oversee the solution. The whole solution was tested and deployed in a medium-sized health insurance company, where it demonstrated its effectiveness in improving efficiency, scalability, and data quality. The work thus shows the practical benefits of automating and standardising data pipelines through metadata in a modern data warehouse, and it sets a foundation for further research on metadata-driven applications in cloud environments.
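
As a rough illustration of the idea summarised in the abstract, the sketch below shows how one generic routine could serve both full and incremental ingestion by reading its behaviour from metadata rows. It is not taken from the dissertation: the `SourceEntry` fields, the `build_extract_query` helper, the watermark-based incremental strategy, and the table names are all hypothetical assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SourceEntry:
    """One row of a hypothetical metadata repository describing a source table."""
    schema: str
    table: str
    load_type: str                          # "full" or "incremental"
    watermark_column: Optional[str] = None  # assumed change-tracking column
    last_watermark: Optional[str] = None    # highest value already ingested


def build_extract_query(entry: SourceEntry) -> str:
    """Build the extraction query a generic pipeline template could run.

    Values are interpolated directly for readability only; a real pipeline
    should pass them as query parameters instead.
    """
    base = f"SELECT * FROM {entry.schema}.{entry.table}"
    if entry.load_type == "full":
        return base
    if entry.load_type == "incremental":
        if not entry.watermark_column:
            raise ValueError(f"{entry.table}: incremental load needs a watermark column")
        if entry.last_watermark is None:
            # Nothing ingested yet, so the first run falls back to a full extract.
            return base
        return f"{base} WHERE {entry.watermark_column} > '{entry.last_watermark}'"
    raise ValueError(f"{entry.table}: unknown load_type {entry.load_type!r}")


# Hypothetical metadata rows; adding a source means adding a row, not a pipeline.
METADATA = [
    SourceEntry("dbo", "Customer", "full"),
    SourceEntry("dbo", "Claim", "incremental", "ModifiedAt", "2024-01-31T00:00:00"),
]

QUERIES = [build_extract_query(entry) for entry in METADATA]
```

In this sketch, onboarding a new source or switching it from full to incremental loading only changes a metadata row, which is the kind of reduction in development and maintenance effort the abstract attributes to the approach.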

Description

Dissertation presented as a partial requirement for obtaining a Master's degree in Information Management, with a specialization in Knowledge Management and Business Intelligence

Keywords

Metadata Driven Approach; ELT; Modern Data Warehouse; User Interface; Data Lake; Cloud Architecture; SDG 8 - Decent work and economic growth; SDG 9 - Industry, innovation and infrastructure
