Please use this identifier to cite or link to this record: http://hdl.handle.net/10362/187321
Full record
DC Field: Value [Language]
dc.contributor.author: Bonin, Lorenzo
dc.contributor.author: Cusin, Lorenzo
dc.contributor.author: De Lorenzo, Andrea
dc.contributor.author: Castelli, Mauro
dc.contributor.author: Manzoni, Luca
dc.date.accessioned: 2025-09-01T21:11:54Z
dc.date.available: 2025-09-01T21:11:54Z
dc.date.issued: 2025-08-11
dc.identifier.isbn: 979-8-4007-1464-1
dc.identifier.other: PURE: 128435852
dc.identifier.other: PURE UUID: 556556f1-ebdf-4a7b-8482-ff59210dd80d
dc.identifier.other: Scopus: 105014587226
dc.identifier.other: ORCID: /0000-0002-8793-1451/work/190962198
dc.identifier.uri: http://hdl.handle.net/10362/187321
dc.description: Bonin, L., Cusin, L., De Lorenzo, A., Castelli, M., & Manzoni, L. (2025). A Genetic Algorithm Framework for Jailbreaking Large Language Models [poster]. In G. Ochoa (Ed.), GECCO '25 Companion: Proceedings of the Genetic and Evolutionary Computation Conference Companion (pp. 779-782). ACM - Association for Computing Machinery. https://doi.org/10.1145/3712255.3726687 --- This work was supported by national funds through FCT (Fundação para a Ciência e a Tecnologia), under the project UIDB/04152/2020 (DOI: 10.54499/UIDB/04152/2020) - Centro de Investigação em Gestão de Informação (MagIC)/NOVA IMS - and the project 2024.07277.IACDC (Lexa).
dc.description.abstract: Despite their capabilities to generate human-like text and aid in various tasks, Large Language Models (LLMs) are susceptible to misuse. To mitigate this risk, many LLMs undergo safety alignment or refusal training to allow them to refuse unsafe or unethical requests. Despite these measures, LLMs remain exposed to jailbreak attacks, i.e., adversarial techniques that manipulate the models to generate unsafe outputs. Jailbreaking typically involves crafting specific prompts or adversarial inputs that bypass the models' safety mechanisms. This paper examines the robustness of safety-aligned LLMs against adaptive jailbreak attacks, focusing on a genetic algorithm-based approach. [en]
dc.format.extent: 4
dc.language.iso: eng
dc.publisher: ACM - Association for Computing Machinery
dc.relation: https://doi.org/10.54499/UIDB/04152/2020
dc.relation: https://doi.org/10.54499/2024.07277.IACDC
dc.rights: openAccess
dc.subject: Genetic Algorithm
dc.subject: Large Language Model
dc.subject: Jailbreak
dc.subject: Adversarial Attack
dc.subject: Adaptive Attack
dc.subject: Artificial Intelligence
dc.subject: Software
dc.subject: Control and Optimization
dc.subject: Discrete Mathematics and Combinatorics
dc.subject: Logic
dc.subject: SDG 9 - Industry, Innovation, and Infrastructure
dc.title: A Genetic Algorithm Framework for Jailbreaking Large Language Models [poster]
dc.type: conferenceObject
degois.publication.firstPage: 779
degois.publication.lastPage: 782
degois.publication.title: GECCO '25 Companion
degois.publication.title: Genetic and Evolutionary Computation Conference
dc.peerreviewed: yes
dc.identifier.doi: https://doi.org/10.1145/3712255.3726687
dc.description.version: publishersversion
dc.description.version: published
dc.contributor.institution: Information Management Research Center (MagIC) - NOVA Information Management School
dc.contributor.institution: NOVA Information Management School (NOVA IMS)
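
Note: the abstract in this record describes a genetic algorithm-based approach to crafting jailbreak prompts. The paper itself is not reproduced here, so the Python fragment below is only a minimal sketch of the generic idea of evolving prompt candidates against a fitness score. The mutate_prompt, crossover and score_response helpers, the filler-token operators, and all population parameters are hypothetical stand-ins and are not taken from the authors' framework; in a real attack the fitness would come from querying the target LLM and rating how far its reply is from a refusal.

import random

# Toy stand-ins for variation operators; illustrative only, NOT the paper's operators.
FILLER_TOKENS = ["please", "kindly", "hypothetically", "as a thought experiment"]

def mutate_prompt(prompt: str) -> str:
    """Randomly drop one token or insert a filler token (toy mutation)."""
    tokens = prompt.split()
    if tokens and random.random() < 0.5:
        tokens.pop(random.randrange(len(tokens)))
    else:
        tokens.insert(random.randrange(len(tokens) + 1), random.choice(FILLER_TOKENS))
    return " ".join(tokens)

def crossover(a: str, b: str) -> str:
    """One-point crossover on whitespace-separated tokens."""
    ta, tb = a.split(), b.split()
    cut = random.randint(0, min(len(ta), len(tb)))
    return " ".join(ta[:cut] + tb[cut:])

def score_response(prompt: str) -> float:
    """Placeholder fitness. A real attack would query the target LLM here and
    rate how far its response is from a refusal; this stub returns noise."""
    return random.random()

def evolve(seed_prompt: str, population_size: int = 20, generations: int = 10) -> str:
    """Generic GA loop: keep the fittest half, recombine, mutate, repeat."""
    population = [mutate_prompt(seed_prompt) for _ in range(population_size)]
    for _ in range(generations):
        parents = sorted(population, key=score_response, reverse=True)[:population_size // 2]
        children = [crossover(random.choice(parents), random.choice(parents))
                    for _ in range(population_size - len(parents))]
        population = parents + [mutate_prompt(child) for child in children]
    return max(population, key=score_response)

if __name__ == "__main__":
    print(evolve("Describe how the assistant decides to refuse a request"))

The loop uses truncation selection for simplicity; any selection, mutation, or scoring scheme could be substituted without changing the overall structure.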
Appears in collections: NIMS: MagIC - Documentos de conferências internacionais

Files in this record:
File | Description | Size | Format
Genetic_Algorithm_Framework_for_Jailbreaking_LLM.pdf | - | 1,77 MB | Adobe PDF



All records in the repository are protected by copyright, with all rights reserved.