Use this identifier to cite or link to this record:
http://hdl.handle.net/10362/165994
Title: | Map Text Extraction and Parsing using Optical Character Recognition (OCR) for Facilitating Map Reproducibility Assessment |
Author: | Mulaw, Yohannes Abrha |
Advisor: | Kray, Christian; Romero, José Francisco Ramos; Koukouraki, Eftychia |
Keywords: | reproducibility; map reproducibility; map reproducibility assessment; optical character recognition (OCR); text analysis; fuzzy matching |
Defense Date: | 2-Feb-2024 |
Abstract: | Reproducibility stands as a fundamental element in promoting transparency and openness in scientific publications and in geoscientific research as well. Figures, particularly maps, integrated into geoscientific research play a significant role in visualizing and representing crucial scientific results; thus, they should be reproducible. However, the assessment of map reproducibility for determining the success of map reproduction is limited due to the absence of standard metrics, criteria, and tools. In this study, a novel web-based application is developed to facilitate the map reproducibility assessment process based on textual elements of the map. The tool integrates an open source optical character recognition (OCR) technology for text extraction from maps and proposes a comprehensive comparative analysis workflow consisting of assessment criteria such as the text similarity between the extracted texts using fuzzy string matching techniques, the overlap ratio between the bounding boxes associated with the texts using the Jaccard index (intersection over union), and the Euclidean distance between the bounding boxes for effective map reproducibility assessment. The tool is validated and evaluated using real-world datasets and reveals its effectiveness compared to the existing map comparison methods in terms of accessibility, interoperability, and flexibility to accommodate diverse file sizes, image resolutions, and file types. As a result, the tool was found to be usable with a SUS score of 69.33 and useful for researchers and GIS professionals to extract and assess textual elements from maps. In addition, the study demonstrates promising results in the effective utilization of OCR technology for accurate text extraction from maps, even with the lowest map image resolution (60 dpi) and smallest font sizes (7 pt). |
Description: | Dissertation submitted in partial fulfilment of the requirements for the Degree of Master of Science in Geospatial Technologies |
URI: | http://hdl.handle.net/10362/165994 |
Degree: | Master's in Geospatial Technologies |
Appears in Collections: | NIMS - MSc Dissertations Geospatial Technologies (Erasmus-Mundus) |
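The three comparison criteria named in the abstract (fuzzy text similarity, Jaccard overlap of bounding boxes, and Euclidean distance between boxes) can be illustrated with a minimal Python sketch. This is not the dissertation's implementation. The function names, the `(x_min, y_min, x_max, y_max)` box layout, the use of `difflib.SequenceMatcher` as a stand-in for the fuzzy matcher, and the choice to measure distance between box centroids are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from math import hypot

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixel coordinates.


def text_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio in [0, 1].

    SequenceMatcher is a standard-library stand-in for a fuzzy string matcher.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def jaccard_index(box_a, box_b) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def centroid_distance(box_a, box_b) -> float:
    """Euclidean distance between box centroids (assumed reference points)."""
    acx, acy = (box_a[0] + box_a[2]) / 2, (box_a[1] + box_a[3]) / 2
    bcx, bcy = (box_b[0] + box_b[2]) / 2, (box_b[1] + box_b[3]) / 2
    return hypot(acx - bcx, acy - bcy)


# Hypothetical OCR results for the same label on an original and a reproduced map.
original = ("Population density", (10, 10, 110, 30))
reproduced = ("Population densty", (12, 11, 112, 31))

print(text_similarity(original[0], reproduced[0]))
print(jaccard_index(original[1], reproduced[1]))
print(centroid_distance(original[1], reproduced[1]))
```

A high text similarity combined with a high overlap ratio and a small distance would suggest that the label was reproduced in roughly the same place. Any thresholds for such a decision would have to come from the dissertation itself.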
Files in this record:
File | Description | Size | Format | |
---|---|---|---|---|
TGEO297_O.pdf | | 1.31 MB | Adobe PDF | View/Open |
All records in the repository are protected by copyright law, with all rights reserved.