Publication

Navigating the Risks and Safeguards of Large Language Models (LLMs): Addressing Data Privacy, Security, and Ethical Concerns

datacite.subject.fos: Natural Sciences::Computer and Information Sciences [pt_PT]
dc.contributor.advisor: Rio, José Américo Alves Sustelo
dc.contributor.author: Jarząbkowski, Mikołaj
dc.date.accessioned: 2025-11-12T16:03:47Z
dc.date.embargo: 2026-10-29
dc.date.issued: 2025-10-29
dc.description: Dissertation presented as a partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science [pt_PT]
dc.description.abstract: This thesis explores the privacy, security, and ethical risks of large language models (LLMs) and proposes practical defenses to reduce these risks when using LLMs in real-world settings. After a review of existing research, three representative vulnerabilities (cognitive prompt injection via persona roleplay, data extraction through repeated token sequences, and fine-tuning backdoors activated by rare triggers) were examined through experiments on a state-of-the-art LLM. For each scenario, specific defenses were designed and tested: an extra output-scanning step to catch policy violations, an inference-time filter to block excessive token repetitions, and a rare-token filter combined with a prompt-classification step. The results showed that these defenses significantly lowered the success rate of attacks, although none offered complete protection. Expert consultations with security and quality-assurance professionals supported these findings and highlighted ongoing challenges in stress-testing, red-teaming, and maintaining usability while ensuring strong safeguards. Based on both experimental results and expert feedback, the thesis provides practical recommendations for safely integrating LLMs. These include prompt engineering guidelines, layered output checks, input sanitization, access controls, privacy-focused training methods, ongoing red-teaming, and governance measures that comply with GDPR, CCPA, and other upcoming AI regulations. [pt_PT]
dc.identifier.tid: 204075360
dc.identifier.uri: http://hdl.handle.net/10362/190605
dc.language.iso: eng [pt_PT]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/ [pt_PT]
dc.subject: Large Language Models [pt_PT]
dc.subject: LLM Security [pt_PT]
dc.subject: Prompt Injection [pt_PT]
dc.subject: Data Extraction [pt_PT]
dc.subject: Backdoor Attacks [pt_PT]
dc.subject: Defense Mechanisms [pt_PT]
dc.subject: Output Scanning [pt_PT]
dc.subject: Input Validation [pt_PT]
dc.subject: Differential Privacy [pt_PT]
dc.subject: Ethical AI [pt_PT]
dc.subject: AI Governance [pt_PT]
dc.subject: Design Science Research [pt_PT]
dc.subject: Empirical Evaluation [pt_PT]
dc.subject: Red-Teaming [pt_PT]
dc.subject: Regulatory Compliance [pt_PT]
dc.subject: SDG 9 - Industry, innovation and infrastructure [pt_PT]
dc.subject: SDG 16 - Peace, justice and strong institutions [pt_PT]
dc.subject: SDG 17 - Partnerships for the goals [pt_PT]
dc.title: Navigating the Risks and Safeguards of Large Language Models (LLMs): Addressing Data Privacy, Security, and Ethical Concerns [pt_PT]
dc.type: master thesis
dspace.entity.type: Publication
rcaap.embargofct: Without loss of any copyright regarding my dissertation and the right to use it in future works (such as articles or books), I declare that: I grant NOVA University Lisbon and its agents, through its institutional repository, a non-exclusive license to archive and make my dissertation accessible under the conditions stated below. I authorize NOVA University Lisbon to archive, without further content changes, and make any file conversions necessary for long-term preservation and access. My dissertation can be made available on NOVA's Institutional Repository in the following way: [pt_PT]
rcaap.rights: embargoedAccess [pt_PT]
rcaap.type: masterThesis [pt_PT]
thesis.degree.name: Master's in Data Science and Advanced Analytics, specialization in Data Science [pt_PT]

Files

Original bundle
Name: TCDMAA4500.pdf
Size: 1.36 MB
Format: Adobe Portable Document Format

License bundle
Name: license.txt
Size: 348 B
Format: Item-specific license agreed upon to submission
Description: