| Name | Description | Size | Format |
|---|---|---|---|
| | | 2.59 MB | Adobe PDF |
Advisor(s)
Abstract(s)
The increasing volume, variety, and velocity of data have exposed the limitations of traditional
data warehouses and sparked a transformative shift towards modern, cloud-based solutions.
Modern data warehouses (MDWs) offer scalability, flexibility, and advanced capabilities, and
they change how organisations meet evolving business needs. This project work implements a
metadata-driven approach to automate and optimise ELT (Extract, Load, Transform) processes
within a modern data warehouse architecture.
The project leverages Microsoft Azure Synapse Analytics to minimise manual intervention and
enhance scalability. A metadata repository serves as the foundation for developing generic,
reusable data pipeline templates capable of handling diverse scenarios based on metadata
parameters. This standardisation enables efficient and automated data ingestion,
transformation, and modelling while reducing development and maintenance time. The
solution supports both full and incremental data ingestion into a data lake and implements
modelling techniques, such as slowly changing dimensions (SCD) Types 1 and 2 and fact tables.
To enhance user accessibility, a Power App interface was developed, simplifying parameter
management and enabling non-technical users to interact with the system seamlessly.
Additionally, to ensure operational reliability, a monitoring framework was designed and
implemented to provide oversight of the solution. The whole solution
was tested and deployed in a medium-sized health insurance company, demonstrating its
effectiveness in improving efficiency, scalability, and data quality.
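
As an illustration of the idea described above, the following is a minimal Python sketch of a generic ingestion step whose behaviour is decided entirely by a metadata record. It is not the dissertation's Azure Synapse implementation: the `PipelineConfig` fields, the `ingest` function, the in-memory `lake` dictionary, and the `claims` sample rows are all hypothetical names chosen for the example. It only shows how a single reusable routine could support both full and incremental loads based on metadata parameters.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PipelineConfig:
    """Hypothetical metadata record: one entry per source table."""
    source_table: str
    load_type: str                          # "full" or "incremental"
    watermark_column: Optional[str] = None  # required for incremental loads
    last_watermark: Optional[int] = None    # highest value already ingested


def ingest(config: PipelineConfig, source_rows: list, lake: dict) -> int:
    """Generic ingestion step driven only by metadata.

    `lake` is an in-memory stand-in for a data lake (table name -> rows).
    Returns the number of rows written.
    """
    if config.load_type == "full":
        # Full load: overwrite the target with everything from the source.
        selected = list(source_rows)
        lake[config.source_table] = selected
    elif config.load_type == "incremental":
        column = config.watermark_column
        if column is None:
            raise ValueError("incremental loads need a watermark_column")
        floor = config.last_watermark
        # Incremental load: keep only rows newer than the stored watermark.
        selected = [
            row for row in source_rows
            if floor is None or row[column] > floor
        ]
        lake.setdefault(config.source_table, []).extend(selected)
        if selected:
            # Advance the watermark so the next run skips these rows.
            config.last_watermark = max(row[column] for row in selected)
    else:
        raise ValueError(f"unknown load_type: {config.load_type}")
    return len(selected)


# Usage: the same function serves both scenarios; only the metadata differs.
lake = {}
claims = [{"id": 1, "updated": 10}, {"id": 2, "updated": 20}]
cfg = PipelineConfig("claims", "incremental", watermark_column="updated")

first = ingest(cfg, claims, lake)    # first run picks up every row
claims.append({"id": 3, "updated": 30})
second = ingest(cfg, claims, lake)   # second run picks up only the new row

full_cfg = PipelineConfig("claims_snapshot", "full")
full_count = ingest(full_cfg, claims, lake)
```

In this sketch, adding a new source would mean adding a metadata record, not writing a new pipeline, which is the kind of reuse the abstract attributes to its pipeline templates.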
This project work demonstrates the practical benefits of applying a metadata-driven approach
to a modern data warehouse, where metadata is used to automate and standardise data pipelines.
It lays a foundation for further research on metadata-driven applications in cloud
environments.
Description
Dissertation presented as a partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence
Keywords
Metadata Driven Approach; ELT; Modern Data Warehouse; User Interface; Data Lake; Cloud Architecture; SDG 8 - Decent work and economic growth; SDG 9 - Industry, innovation and infrastructure
