| Name | Description | Size | Format |
|---|---|---|---|
| | | 2.11 MB | Adobe PDF |
Authors
Advisor(s)
Abstract(s)
This study explores Retrieval-Augmented Generation (RAG) systems tailored to the Portuguese legal domain, highlighting
challenges in underrepresented languages. Fixed-size chunking strategies, particularly
TokenTextSplitter, were found to be the most effective, while more advanced techniques such as
Recursive and Semantic splitting showed little benefit. Larger chunk sizes improved retrieval
accuracy and answer quality, though the impact of chunk overlap remains inconclusive.
Although previous research has shown that reranking techniques improve retrieval, this
may hold only for large and diverse datasets.
Description
Keywords
Retrieval-Augmented Generation; RAG; Large Language Models; LLM; Artificial Intelligence; AI; Hallucination; Question answering; RAG evaluation; Vector store; Chunking; Legal AI; Document reranking; Relevance ranking; Legal information retrieval; Portuguese legal retrieval
