| Name | Description | Size | Format |
|---|---|---|---|
| | | 2.05 MB | Adobe PDF |
Authors
Advisor(s)
Abstract(s)
This study explores Retrieval-Augmented Generation (RAG) systems tailored to the Portuguese
legal domain, highlighting the challenges posed by underrepresented languages. Fixed-size
chunking strategies, particularly TokenTextSplitter, were found to be the most effective, while
more advanced techniques such as Recursive and Semantic splitting showed little benefit. Larger
chunk sizes improved retrieval accuracy and answer quality, though the impact of chunk overlap
remains inconclusive. One limitation of vector databases is their lack of explainability and
their weak grasp of complex relationships. This work also analyses GraphRAG, an advanced RAG
technique that leverages the strengths of knowledge graphs. It shows promise, returning results
faster than traditional RAG approaches and performing better on questions that require
understanding relationships.
Description
Keywords
Retrieval-Augmented Generation (RAG); Large Language Models (LLM); Artificial Intelligence (AI); Hallucination; Question answering; RAG evaluation; Vector store; Chunking; Legal AI; Knowledge graph; GraphRAG; RDF; Legal information retrieval; Portuguese legal retrieval; Natural language processing; Chain-of-Thought (CoT)
