Please use this identifier to cite or link to this item: http://hdl.handle.net/10362/2051

Title: Parallel texts alignment
Author: Gomes, Luís Manuel dos Santos
Advisor: Lopes, José Gabriel
Keywords: Parallel texts alignment
Parallel corpora
Extraction of translation equivalents
Issue Date: 2009
Publisher: FCT - UNL
Abstract: Alignment of parallel texts (texts that are translations of each other) is a required step for many applications that use parallel texts, including statistical machine translation, automatic extraction of translation equivalents, and automatic creation of concordances. This dissertation presents a new methodology for parallel texts alignment that departs from previous work in several ways. One important departure is a shift of goals concerning the use of lexicons for obtaining correspondences between the texts. Previous methods try to infer a bilingual lexicon as part of the alignment process and use it to obtain correspondences between the texts. Some of those methods can use external lexicons to complement the inferred one, but they tend to treat them as secondary. This dissertation presents several arguments supporting the thesis that lexicon inference should not be embedded in the alignment process. The method described follows this principle and relies exclusively on externally managed lexicons to obtain correspondences. Moreover, the algorithms presented can handle very large lexicons containing terms of arbitrary length. Besides the exclusive use of external lexicons, this dissertation presents a new method for obtaining correspondences between translation equivalents found in the texts. It uses a decision criterion based on features that have been overlooked by prior work. The proposed method is iterative and refines the alignment at each iteration. It uses the alignment obtained in one iteration as a guide for obtaining new correspondences in the next iteration, which in turn are used to compute a finer alignment. This iterative scheme allows the method to correct correspondence errors from previous iterations in the face of new information.
Description: Work presented as part of the Master's programme in Informatics Engineering (Mestrado em Engenharia Informática), as a partial requirement for obtaining the degree of Master in Informatics Engineering
URI: http://hdl.handle.net/10362/2051
Appears in Collections: FCT: DI - Dissertações de Mestrado

Files in This Item:

File: Gomes_2009.pdf
Size: 1.18 MB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

 
