|Title: ||Parallel texts alignment|
|Authors: ||Gomes, Luís Manuel dos Santos|
|Advisor: ||Lopes, José Gabriel|
|Keywords: ||Parallel texts alignment|
Extraction of translation equivalents
|Issue Date: ||2009|
|Publisher: ||FCT - UNL|
|Abstract: ||Alignment of parallel texts (texts that are translations of each other) is a required step for many applications that use parallel texts, including statistical machine translation, automatic extraction of translation equivalents, and automatic creation of concordances.
This dissertation presents a new methodology for parallel texts alignment that departs from previous work in several ways. One important departure is a shift of goals concerning the use of lexicons for obtaining correspondences between the texts. Previous methods try to infer a bilingual lexicon as part of the alignment process and use it to obtain correspondences between the texts. Some of those methods can use external lexicons to complement the inferred one, but they tend to treat them as secondary. This dissertation presents several arguments supporting the thesis that lexicon inference should not be embedded in the alignment process. The method described follows this principle and relies exclusively on externally managed lexicons to obtain correspondences. Moreover, the algorithms presented can handle very large lexicons containing terms of arbitrary length.
Besides the exclusive use of external lexicons, this dissertation presents a new method for obtaining correspondences between translation equivalents found in the texts. It uses a decision criterion based on features that have been overlooked by prior work.
The proposed method is iterative and refines the alignment at each iteration. It uses the alignment obtained in one iteration as a guide to obtaining new correspondences in the next iteration, which in turn are used to compute a finer alignment. This iterative scheme allows the method to correct correspondence errors from previous iterations in the face of new information.|
|Description: ||Work presented within the Master's programme in Informatics Engineering (Mestrado em Engenharia Informática), as a partial requirement for obtaining the degree of Master in Informatics Engineering|
|Appears in Collections:||FCT: DI - MA Dissertations|