NIMS - Master's Dissertations in Data Science and Advanced Analytics

Permanent URI for this collection:

http://hdl.handle.net/10362/19563

Formerly: Master's Dissertations in Advanced Analytics

Showing 1 - 10 of 781

Hybrid Search in the context of second-hand online marketplaces
Publication . Soveral, Inês Slotboom de Portocarrero e; Pinheiro, Flávio Luís Portas
Search and discovery systems play a central role in user experience on e-commerce platforms, yet traditional keyword-based retrieval often struggles to capture semantic relationships between queries and product listings. These limitations are particularly pronounced in second-hand online marketplaces, where listing content is user-generated, heterogeneous, and frequently inconsistent. While semantic and hybrid retrieval approaches have been widely studied and applied in general web search and large-scale e-commerce systems, their application to second-hand marketplaces remains relatively underexplored. To address this gap, this work investigates the impact of semantic and Hybrid Search approaches on retrieval performance in second-hand e-commerce platforms. Using search interaction data from OLX, a large European classifieds platform, two retrieval strategies were evaluated. Semantic search based on vector embeddings was introduced as a fallback mechanism to recover results when keyword-based retrieval fails, while a Hybrid Search architecture combining keyword-based and vector-based retrieval was implemented within the main search pipeline and evaluated using real user interactions. Results show that, compared to traditional keyword-based retrieval, semantic retrieval significantly improves recall when exact keyword matching fails, while Hybrid Search further enhances performance by preserving lexical precision and improving coverage. These findings demonstrate that Hybrid Search provides a more robust and effective solution, offering empirical evidence of its suitability for second-hand e-commerce and helping to bridge this important gap in the literature.
2026-07-03 · Master's dissertation · Open access
Resident Attitudes Toward Tourism: A Longitudinal Analysis of User-Generated Content
Publication . Rodrigues, Joana Ramos; António, Nuno Miguel da Conceição; Guerreiro, Sérgio Miguel Pratas
Tourism's accelerating growth is increasingly affecting host communities’ quality of life, demanding closer monitoring by destination managers and policymakers. A transferable monitoring framework for resident attitudes is developed and empirically validated using Lisbon, one of Europe's fastest-growing tourism destinations, as a case study. A Natural Language Processing pipeline is applied to Tripadvisor and social media post-pandemic data (2023-2025), integrating resident filtering, Aspect-Based Sentiment Analysis and topic modelling. Results reveal a clear deterioration in resident sentiment, with housing emerging as the dominant theme and public space degradation as the strongest driver of negativity. By addressing the temporal limitations of survey-based approaches, the proposed framework extends the concept of Social Carrying Capacity into a longitudinal monitoring instrument with change-point detection and offers destination managers a near-real-time early-warning system, complemented by a no-intervention forecast as a counterfactual baseline for policy evaluation.
2026-06-26 · Master's dissertation · Embargoed access
Transfer Learning for National Energy Consumption and CO2 emissions Forecasting: A cross-country Approach Applied to Portugal
Publication . Henriques, Rodrigo Sebastião Farinha Rodrigues Capinha; Scott, Ian James
As the world pushes toward cleaner energy systems, the ability to accurately forecast national energy consumption and CO2 emissions is increasingly important. Yet many countries simply do not have enough historical data to train the kind of deep learning models that work well elsewhere. This thesis explores whether transfer learning can offer a way out of that bind, by pre-training Long Short-Term Memory (LSTM) networks on data from six European countries and then fine-tuning them with Portugal as the target. Annual data from 1990 to 2022 were drawn from the International Energy Agency and World Bank, and a cross-country feature selection procedure was applied before pre-training to identify which input variables, spanning economic, demographic, and energy indicators, carried consistent predictive value across different national contexts. The results told two different stories depending on the forecasting task. For energy consumption, transfer learning worked well: the TL-LSTM achieved the lowest average test error of 907.78 GWh at the one-step horizon, outperforming both the standard LSTM and the best classical model by meaningful margins. For CO2 emissions, things were less clear-cut. Transfer learning also improved CO2 forecasting, with TL-LSTM achieving the best one-step result of 2.61 Mt CO2, while at the multi-step horizon the best result came from combining transfer learning with feature selection (TL-LSTM-FS, 3.09 Mt CO2). Feature selection, meanwhile, did not consistently help: the cross-country voting rule kept features that mattered little for Portugal while discarding ones the fine-tuned model genuinely relied on. Taken together, these findings suggest that transfer learning holds real promise for national-scale energy forecasting, but its success depends heavily on how similar the source countries really are to the target.
2026-06-26 · Master's dissertation · Open access
Public Procurement in Portugal: a detailed analysis
Publication . Roque, Lucas Costa; Damásio, Bruno Miguel Pinto; Pinheiro, Flávio Luís Portas
Public procurement, both in Portugal and globally, is a crucial economic engine, but one that can fall short in mitigating territorial inequalities and asymmetries. This study analyses public procurement in Portugal from 2009 to 2023, focusing on concentration patterns and spatial dependencies. Data were collected via web scraping from Portal BASE, cleaned and pre-processed, and then analysed with various econometric and geostatistical tools, namely the Gini index, LISA, cluster analysis, and spatial econometric models such as SAR and SEM. The modelling revealed extreme concentration of procurement, combined with a significant bias towards scale over dynamism. This is explained by the division of the country between a continuous procurement engine in the littoral, with Metropolitan Areas as main drivers, and interior regions, which are a procurement desert attenuated only by some regional hubs. An SEM model further confirmed that these patterns are driven and accentuated by massive use of Direct Award contracts and by external regional factors. However, when analysed from a per-capita perspective, this trend is reversed: interior and autonomous regions become the leaders in investment, which indicates an effort by the state to mitigate the broader trend of over-investment in the littoral areas. The conclusion is that the procurement system acts as a mirror and driver of ever-increasing asymmetries between littoral and interior rather than as a cohesion tool. To address these problems, deeper CIM decentralization and the development of a real-time monitoring tool, using dashboards and AI as components, are recommended.
2026-06-26 · Master's dissertation · Open access
Predicting Career Transitions: Semantic Retrieval and Ontology-Based Signals for Career Transition Prediction
Publication . Deschanel, Chloé; Pinheiro, Flávio Luís Portas
Career transitions are a central feature of labor market dynamics, yet predicting plausible occupational moves remains challenging. Advances in natural language processing have made it possible to compare career histories with occupation descriptions at scale. However, purely semantic approaches often struggle to capture structured labor market knowledge and provide limited transparency regarding why certain occupations are recommended. This thesis investigates whether ontology-based skill signals can enhance semantic retrieval for career transition prediction within large occupational taxonomies. The task is formulated as an occupation-ranking problem and implemented using a two-stage retrieval–reranking framework. Experiments are conducted on a large-scale dataset of career trajectories aligned with the ESCO taxonomy, comprising thousands of occupations and skill annotations. A transformer-based bi-encoder first retrieves candidate occupations by embedding CV histories and occupation descriptions into a shared semantic space. The resulting candidate set is then reranked using features derived from the ESCO ontology that capture different aspects of skill compatibility. Both hybrid scoring strategies and lightweight learned rerankers are evaluated. The results indicate that ESCO-based skill signals consistently improve ranking performance over semantic retrieval alone, particularly when feature interactions are modeled. The best-performing model, based on XGBoost, achieves approximately a 58% relative improvement in Recall@10 on the test set compared to the semantic baseline. While semantic similarity effectively identifies conceptually related occupations, skill-based features provide stronger discrimination within the candidate set. Overall, combining semantic retrieval with structured skill knowledge provides a practical and interpretable approach for improving career transition prediction in large occupational taxonomies.
2026-06-26 · Master's dissertation · Open access
Antecedents and outcomes of participation in local energy communities: A mixed-methods longitudinal study
Publication . Caridade, Daniel Filipe Bento; Oliveira, Tiago André Gonçalves Félix de; Neves, Catarina Paisana Pires Costa das
Local energy communities (LECs) have gained substantial momentum across Europe and worldwide due to their integration of sustainable technologies that enable decentralised renewable energy production and foster collaboration among community actors. Despite this rapid growth, a holistic examination of both the antecedents and outcomes of participation in LECs remains limited. To address this gap, this thesis adopts a mixed-methods approach comprising three complementary studies. Study 1 employs a qualitative design, based on interviews with community members to identify key outcomes of participation in LECs. Study 2 uses a longitudinal quantitative approach to develop and test a Belief-Action-Outcome (BAO) model through partial least squares structural equation modelling (PLS-SEM), drawing on data collected across two time periods (2023 for antecedents; 2025 for outcomes) from individuals in five European countries. Study 3 provides a corroborative qualitative analysis that triangulates findings from the first two studies, enabling an assessment of convergence and divergence from the two previous studies. The results reveal that empowerment and frequency of sustainable technology use significantly promote participation in LECs, and that this participation, in turn, positively influences community environmental performance, perceived well-being, and energy poverty alleviation. A cross-country analysis shows that while the main findings hold across different cultural contexts, Türkiye exhibits a distinct positive cultural effect on energy poverty alleviation. Taken together, the corroborative qualitative findings show strong overall convergence with the quantitative results, while also explaining context-dependent differences, particularly regarding diversity of use. This research offers theoretical contributions to the understanding of digital and community-based sustainability transitions and provides actionable implications for policymakers and LEC practitioners.
2026-06-25 · Master's dissertation · Embargoed access
Predicting the Portuguese public procurement markets with graph neural networks
Publication . Semedo, Luís Proença; Pinheiro, Flávio Luís Portas; Damásio, Bruno Miguel Pinto
Public procurement represents a substantial share of public expenditure and plays a central role in markets, public services, and economic policy. Yet, repeated interactions among firms and public entities raise concerns about concentration, competition, and market organization. This thesis studies Portuguese public procurement as an evolving network of firm–firm relationships across two regimes: open tenders and direct awards. Using Portal BASE data (2009-2024), two statistically validated networks are constructed. The open tenders network captures direct competition through co-bidding, while the direct awards network reflects weaker competitive proximity through shared activity in product and public-entity segments. The analysis first compares their structural properties, such as connectivity, clustering, community structure, and overlap, and then frames market evolution as a link prediction problem using heuristics, embeddings, and graph neural networks. Results show clear structural differences: open tenders are more clustered and modular, while direct awards are larger and more diffuse. Link prediction results demonstrate that future relationships can be predicted with significant precision, with BUDDY achieving the best overall performance. An exploratory perturbation analysis based on BUDDY scores further shows that procurement networks are structurally flexible under targeted link changes. Overall, this thesis shows that Portuguese public procurement can be effectively analyzed as an evolving system. By combining network analysis, link prediction, and network perturbations, it provides a framework to describe, predict, and stress-test procurement market structure, contributing to the development of forward-looking tools for monitoring, competition analysis, and risk screening.
2026-06-25 · Master's dissertation · Open access
Water Management Practices in Portugal and Mozambique: A Comparative Analysis of Behavioral Drivers Under Scarcity Conditions
Publication . Custódio, Gonçalo Quirino; Neves, Catarina Paisana Pires Costa das
The evolution of societies has intensified pressure on water resources, with scenarios indicating a deterioration in security unless consumption patterns change. Water-saving practices have the potential to reduce the detrimental impact of this scarcity. This research aims to analyze the key determinants that drive individuals in Portugal and Mozambique to adopt these practices. To achieve this, a theoretical model was developed and evaluated using structural equation modeling. The study explores how adoption intention and actual behavior are influenced by coping appraisals, threat perceptions, and intrinsic ethical motivations across different contexts. These findings are valuable in guiding fruitful communication strategies to encourage societies to adopt sustainable water management practices. Based on 171 responses, findings reveal that the intention to conserve water is primarily driven by personal moral norms and coping mechanisms, specifically self-efficacy and response efficacy, while threat perceptions showed no significant impact on intent. Furthermore, multi-group analysis highlights distinct cross-cultural pathways: Portuguese residents rely heavily on intention-driven planning, whereas Mozambican residents demonstrate a more pragmatic, direct transition from intrinsic morality to tangible action. Ultimately, these insights guide policymakers in tailoring regional strategies, moving beyond fear-based appeals to focus on behavioral empowerment and direct enablement in diverse socioeconomic contexts.
2026-06-25 · Master's dissertation · Open access
How well do professional color trend forecasts align with consumer adoption? An empirical evaluation using EU sportswear sales data
Publication . Carvalho, Diogo Nunes; Pinheiro, Flávio Luís Portas
Professional color trend forecasts are a routine input into assortment planning in the fashion industry, yet whether the color directions they identify correspond to the colors consumers actually demand has never been empirically tested against transaction-level data. This study addresses that gap by evaluating the predictive alignment between seasonal color forecast directions from two professional forecasting agencies and realized consumer demand in the EU sportswear market during FY2024. The analysis draws on approximately 40 million transaction records aggregated into a monthly segment-color panel and applies a two-step prediction-residual framework: a Ridge regression baseline model is estimated on pre-window data to generate expected demand, and weighted least squares with HC3-robust standard errors is then used to test whether forecasted colors are associated with systematically higher residual demand. Subgroup heterogeneity is assessed across product categories, subcategories, and geographic markets using the Benjamini-Hochberg false discovery rate correction, and a dedicated interaction model tests for delayed alignment by comparing residual lift estimates between the forecast year and the same seasonal window one year later. Across all twelve season-color experiments - spanning two forecasters, three seasonal windows, and both primary and secondary color designations - the estimated residual lift is negative and statistically significant in every case: forecasted colors are consistently associated with below-expectation demand after conditioning on structural market characteristics and demand persistence. This negative pattern is broadly robust across product and geographic subgroups, with the only credible positive alignment signal confined to a single color-season combination in a subset of Northern European markets.
Five of twelve experiments show a statistically significant reduction in the magnitude of underperformance in the subsequent year, suggesting partial delayed alignment rather than an absence of any directional signal in the forecasts, though all 2025 lift estimates remain negative. The study provides the first transaction-level, demand-based assessment of color forecast alignment in sportswear and demonstrates that a regression-based residual lift framework offers a replicable methodology for empirically evaluating qualitative forecast signals at scale.
2026-06-24 · Master's dissertation · Embargoed access
Simulator of a Multi-objective Optimization Model for Document Translation Services
Publication . Gomes, Sofia Santos; Vanneschi, Leonardo
This thesis explores the assignment problem in document translation services from a multi-objective perspective. In this context, assignment decisions consider more than just cost. They also impact expected quality, request coverage, and how work is divided among translators. To examine this issue, a simulation framework was created to compare three assignment methods: a baseline strategy, a greedy heuristic, and the NSGA-II algorithm. The results indicate that each method has different trade-offs. The baseline method guarantees full coverage but performs worse in cost and quality. The greedy heuristic achieves the highest average quality, but it has lower coverage and a more concentrated workload. NSGA-II offers the best overall performance by maintaining full coverage while improving cost management, quality, and workload distribution. These findings suggest that assignments in document translation services should be viewed as a multi-objective decision problem, not just a simple matching task. Although this study relies on a simplified simulator and uses proxy measures for quality and time, it highlights the importance of considering multiple objectives when designing assignment strategies for translation services.
2026-06-26 · Master's dissertation · Embargoed access