Caldeira, João Carlos Palmela Pinheiro
Alves, David Forjaz Jorge Alexandre
2025-11-10
2025-10-28
http://hdl.handle.net/10362/190389
Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Business Analytics
TikTok has rapidly emerged as one of the most influential social media platforms worldwide, driven by its dynamic, personalized recommendation algorithm (the “For You Page”) and its ability to turn ordinary videos into viral successes almost instantaneously. Despite the platform's popularity, predicting a video’s virality prior to publication remains a complex challenge, particularly given the platform’s fast-paced nature and the cultural and linguistic specificity of its user communities. This research addresses this gap by developing a predictive framework designed to estimate a video’s virality potential prior to upload, with a particular focus on the Portuguese-speaking TikTok community. To achieve this, a dataset of TikTok videos was collected directly from the platform and carefully preprocessed to ensure data quality and representativeness. Pre-upload textual features, such as hashtags, descriptions, and voice-to-text content, are used to engineer informative variables for training multiple predictive models. The study also investigates the dynamics of virality and user engagement by analyzing the impact of the Voice-to-Text feature and the defining characteristics of viral content. Among the models developed, LightGBM achieves the strongest performance, successfully identifying most of the viral and non-viral videos in the test set.
The findings offer valuable insights for content creators and marketers seeking to optimize visibility and engagement, while also contributing to academic understanding of virality in culturally specific digital contexts.
eng
TikTok Virality Prediction
Machine Learning
Text Mining
Voice-to-Text (VTT)
Social Media Analytics
SDG 4 - Quality education
SDG 8 - Decent work and economic growth
SDG 9 - Industry, innovation and infrastructure
SDG 17 - Partnerships for the goals
Forecasting TikTok Virality: A Predictive Modeling Approach: Leveraging Text-Based Features to Anticipate Pre-Upload Performance
master thesis
204072069