| Name | Description | Size | Format |
|---|---|---|---|
| | | 6.22 MB | Adobe PDF |
Advisor(s)
Abstract(s)
A presente dissertação tem como objetivo principal a elaboração de um
vocabulário fundamental do português atualizado, digital, anotado e codificado segundo
a especificação TEI Lex-0 (Tasovac et al., 2018). Diferenciando-se das abordagens
tradicionais, geralmente orientadas para o ensino de português como língua não materna,
propõe-se aqui um vocabulário destinado a falantes nativos, com organização por níveis
de complexidade e com tratamento semântico explícito das unidades lexicais, em vez de
uma lista de lemas não desambiguados ou listagens de vocabulário/unidades lexicais
orientadas para o ensino de língua não materna.
Este trabalho faz uso dos léxicos organizados por nível de complexidade,
desenvolvidos no âmbito do projeto iRead4Skills – Intelligent Reading Improvement
System for Fundamental and Transversal Skills Development. Esses léxicos foram
extraídos de um corpus diversificado e atual, composto por textos provenientes de
diferentes domínios e contextos comunicativos. A integração num recurso lexicográfico
de referência, como o Dicionário da Língua Portuguesa (DLP) da Academia das Ciências
de Lisboa, assegura uma maior acessibilidade e usabilidade da ferramenta proposta.
Nesse sentido, foi desenvolvida uma metodologia específica que permite
identificar as unidades lexicais relevantes e definir unidades lexicais representativas do
vocabulário, procedendo ao seu alinhamento com os sentidos correspondentes no DLP e
à criação de metadados estruturados visando à codificação e interoperabilidade.
O fluxo de trabalho seguiu quatro etapas principais: (i) a criação e organização
dos subcorpora por nível de complexidade; (ii) identificação das unidades lexicais
relevantes e sua definição; (iii) alinhamento de unidades lexicais definidas e selecionadas
com os sentidos correspondentes no DLP; e (iv) criação dos metadados necessários para
codificação das unidades lexicais.
Este Vocabulário Fundamental em formato digital promove uma descrição mais
precisa e atualizada da língua considerando diferentes níveis de complexidade, ajustada
às necessidades de falantes nativos e tendo aplicações diretas em contextos educativos,
tecnológicos e sociopolíticos, o que abre novas possibilidades de investigação em
linguística e ensino de línguas.
The main objective of this dissertation is to develop an updated, digital fundamental vocabulary of Portuguese, annotated and encoded according to the TEI Lex-0 specification (Tasovac et al., 2018). Traditional approaches are generally oriented toward teaching Portuguese as a non-native language and take the form of lists of non-disambiguated lemmas. This study instead proposes a vocabulary intended for native speakers, organized by levels of complexity and with explicit semantic treatment of lexical units.

The study draws on the lexicons organized by complexity level that were developed within the iRead4Skills project (Intelligent Reading Improvement System for Fundamental and Transversal Skills Development). These lexicons were extracted from a diverse and up-to-date corpus composed of texts from various domains and communicative contexts. Their integration into a reference lexicographic resource, such as the Dicionário da Língua Portuguesa (DLP) of the Academia das Ciências de Lisboa, ensures greater accessibility and usability of the proposed tool.

To this end, a specific methodology was developed to identify the relevant lexical units and define those representative of the vocabulary, align them with their corresponding senses in the DLP, and create structured metadata for encoding and interoperability. The workflow followed four main steps: (i) creation and organization of subcorpora by complexity level; (ii) identification of the relevant lexical units and their definition; (iii) alignment of the selected and defined lexical units with their corresponding senses in the DLP; and (iv) creation of the metadata needed to encode the lexical units.

This digital Fundamental Vocabulary provides a more precise and up-to-date description of the Portuguese language that takes different complexity levels into account and is tailored to the needs of native speakers. It has direct applications in educational, technological, and sociopolitical contexts, and opens new possibilities for research in linguistics and language teaching.
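As an illustration of steps (iii) and (iv), the following minimal sketch shows how a single lexical unit, once aligned with a dictionary sense, might be serialized as a TEI-style `<entry>`. It is not the dissertation's actual encoding: the function name `build_entry`, the identifiers, the `complexityLevel` note type, the level label, and the placeholder definition are all assumptions made for this example. Only the general element layout (`entry`, `form`, `orth`, `gramGrp`, `gram`, `sense`, `def`) follows common TEI dictionary conventions.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Serialize TEI elements without a prefix (default namespace).
ET.register_namespace("", TEI_NS)


def tei(tag: str) -> str:
    """Return the namespace-qualified name of a TEI element."""
    return f"{{{TEI_NS}}}{tag}"


def build_entry(entry_id, lemma, pos, sense_id, definition, level):
    """Build one entry for a lexical unit aligned with one dictionary sense.

    The complexity level is stored in a <note type="complexityLevel">,
    which is an assumption of this sketch, not a documented convention.
    """
    entry = ET.Element(tei("entry"), {
        f"{{{XML_NS}}}id": entry_id,
        f"{{{XML_NS}}}lang": "pt",
    })
    form = ET.SubElement(entry, tei("form"), {"type": "lemma"})
    ET.SubElement(form, tei("orth")).text = lemma

    gram_grp = ET.SubElement(entry, tei("gramGrp"))
    ET.SubElement(gram_grp, tei("gram"), {"type": "pos"}).text = pos

    # The sense identifier stands in for the aligned DLP sense (hypothetical).
    sense = ET.SubElement(entry, tei("sense"), {f"{{{XML_NS}}}id": sense_id})
    ET.SubElement(sense, tei("def")).text = definition
    ET.SubElement(sense, tei("note"), {"type": "complexityLevel"}).text = level
    return entry


# Hypothetical values, for illustration only.
example = build_entry(
    entry_id="vf.casa",
    lemma="casa",
    pos="noun",
    sense_id="vf.casa.1",
    definition="(definition text of the aligned dictionary sense)",
    level="level-1",
)
print(ET.tostring(example, encoding="unicode"))
```

Keeping one `<sense>` per aligned dictionary sense, each with its own identifier, is what would let the vocabulary list disambiguated lexical units rather than bare lemmas.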
Description
Keywords
Lexicology; Fundamental vocabulary; Corpus; Word sense alignment
