Graph-based reasoning for retrieval-augmented generation: a study in the Portuguese legal domain

Hermenegildo, Maria Leonor Trindade

http://hdl.handle.net/10362/181479

Use this identifier to cite this record.

Name:	Description:	Size:	Format:
2023_24_Fall_47285.pdf		2.84 MB	Adobe PDF

Authors

Hermenegildo, Maria Leonor Trindade

Advisor(s)

Han, Qiwei

Abstract(s)

This study explores RAG systems tailored to the Portuguese legal domain, highlighting challenges in underrepresented languages. Fixed-size chunking strategies, particularly TokenTextSplitter, were found to be the most effective, while more advanced techniques such as Recursive and Semantic splitting showed little benefit. Larger chunk sizes improved retrieval accuracy and answer quality, though the impact of chunk overlap remains inconclusive. Self-reflection techniques show promising results, particularly for weaker LLMs and when different techniques are paired. However, the increased computational cost must be taken into account.

Keywords

Retrieval-Augmented Generation; RAG; Large Language Models; LLM; Artificial Intelligence; AI; Hallucination; Question answering; RAG evaluation; Vector store; Chunking; Legal AI; Graph-based reasoning; Self-assessment; Self-reflection; Multi-Agent Systems; MAS

URI

http://hdl.handle.net/10362/181479

Collections

NSBE - MA Dissertations
