Use this identifier to cite or link to this record: http://hdl.handle.net/10362/165994
Title: Map Text Extraction and Parsing using Optical Character Recognition (OCR) for Facilitating Map Reproducibility Assessment
Author: Mulaw, Yohannes Abrha
Advisor: Kray, Christian
Romero, José Francisco Ramos
Koukouraki, Eftychia
Keywords: reproducibility
map reproducibility
map reproducibility assessment
optical character recognition (OCR)
text analysis
fuzzy matching
Defense Date: 2-Feb-2024
Abstract: Reproducibility is fundamental to transparency and openness in scientific publications, including geoscientific research. Figures, and maps in particular, play a significant role in visualizing and communicating scientific results in geoscientific research, so they should be reproducible. However, assessing whether a map has been successfully reproduced is hampered by the absence of standard metrics, criteria, and tools. This study develops a novel web-based application that facilitates map reproducibility assessment based on the textual elements of a map. The tool integrates open-source optical character recognition (OCR) technology to extract text from maps and proposes a comparative analysis workflow built on assessment criteria such as the similarity between the extracted texts, measured with fuzzy string matching techniques; the overlap ratio between the bounding boxes associated with the texts, measured with the Jaccard index (intersection over union); and the Euclidean distance between the bounding boxes. Validation and evaluation on real-world datasets show the tool to be effective compared with existing map comparison methods in terms of accessibility, interoperability, and flexibility in accommodating diverse file sizes, image resolutions, and file types. The tool was found to be usable, with a System Usability Scale (SUS) score of 69.33, and useful for researchers and GIS professionals who need to extract and assess textual elements from maps. The study also reports promising results for accurate OCR-based text extraction from maps, even at the lowest map image resolution (60 dpi) and the smallest font size (7 pt) considered.
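The three assessment criteria named in the abstract can be illustrated with a minimal sketch. This is not the dissertation's implementation: the function names, the `(x_min, y_min, x_max, y_max)` box layout, the use of Python's standard-library `difflib` as the fuzzy matcher, and the choice to measure Euclidean distance between box centers are all assumptions made for illustration only.

```python
from difflib import SequenceMatcher
from math import hypot

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = tuple[float, float, float, float]


def text_similarity(a: str, b: str) -> float:
    """Case-insensitive fuzzy similarity in [0, 1] (difflib stands in for the tool's matcher)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def iou(a: Box, b: Box) -> float:
    """Jaccard index (intersection over union) of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def center_distance(a: Box, b: Box) -> float:
    """Euclidean distance between box centers (an assumed reading of 'distance between boxes')."""
    ax, ay = (a[0] + a[2]) / 2, (a[1] + a[3]) / 2
    bx, by = (b[0] + b[2]) / 2, (b[1] + b[3]) / 2
    return hypot(ax - bx, ay - by)


if __name__ == "__main__":
    # Hypothetical label extracted from an original map and from its reproduction.
    original = ("Population density", (10.0, 10.0, 110.0, 30.0))
    reproduced = ("Populaton density", (12.0, 11.0, 112.0, 31.0))
    print(text_similarity(original[0], reproduced[0]))
    print(iou(original[1], reproduced[1]))
    print(center_distance(original[1], reproduced[1]))
```

Under these assumptions, a text similarity and overlap ratio close to 1 together with a small center distance would suggest that a label was reproduced faithfully in both content and placement.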
Description: Dissertation submitted in partial fulfilment of the requirements for the Degree of Master of Science in Geospatial Technologies
URI: http://hdl.handle.net/10362/165994
Designation: Mestrado em Tecnologias Geoespaciais (Master's in Geospatial Technologies)
Appears in collections: NIMS - MSc Dissertations Geospatial Technologies (Erasmus-Mundus)

Files in this record:
File: TGEO297_O.pdf
Size: 1.31 MB
Format: Adobe PDF



All records in the repository are protected by copyright law, with all rights reserved.