| Name | Description | Size | Format |
|---|---|---|---|
| | | 1.19 MB | Adobe PDF |
Authors
Advisor(s)
Abstract
Datasets produced or collected by governments are being made publicly available
for reuse. Open government data portals help realize such reuse by providing lists
of datasets and links to access those datasets. This ensures that users can search,
inspect and use the data easily.
With the rapidly increasing size of datasets in open government data portals,
just as is the case with the web, finding relevant datasets with a query of a few
keywords is a challenge. Furthermore, those data portals consist not only of textual
information but also of georeferenced data that needs to be searched properly. Currently,
the most popular open government data portals, such as data.gov.uk and data.gov.ie, lack
support for simultaneous thematic and spatial search. Moreover, the use of query
expansion has not yet been studied for open government datasets.
In this study we assessed the performance of different spatial search strategies and
query expansions and their impact on user relevance judgment. To evaluate those
strategies we harvested machine-readable spatial datasets and their metadata from
three English-language open government data portals, performed metadata enhancement,
developed a prototype and performed theoretical and user evaluations.
According to the results of the evaluations, the keyword-based search strategy returned
a limited number of results but had the highest relevance rating. On the other hand,
the aggregated spatial and thematic search returned more results than the baseline
keyword-based strategy, with a 1 second increase in response time but a decreased
relevance rating. Moreover, in three out of the four query terms, strategies based on
WordNet Synonyms query expansion had higher relevance ratings for the first seven
results than all other strategies except the keyword-based baseline strategy.
Regarding the use of Hausdorff distance and area of overlap, since documents
were returned as results only if they overlapped with the query, the number of results
returned was the same for both spatial similarity measures. However, strategies using
Hausdorff distance had a higher relevance rating and average mean than strategies based
on area of overlap in three of the four queries.
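As an illustration only, and not the code of the prototype, the two spatial similarity measures can be sketched for axis-aligned bounding boxes. The box layout `(min_x, min_y, max_x, max_y)`, the function names and the sample extents are all assumptions made for this sketch; it also assumes filled rectangles, for which the Hausdorff distance is reached at a corner. It shows why both measures return the same set of results (the overlap filter is shared) while ordering them differently.

```python
import math

# Assumption: every spatial extent is a box (min_x, min_y, max_x, max_y).

def overlap_area(a, b):
    """Area shared by two boxes, or 0.0 if they do not overlap."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return width * height if width > 0 and height > 0 else 0.0

def point_to_box(x, y, box):
    """Euclidean distance from a point to a filled box (0 if inside)."""
    dx = max(box[0] - x, 0.0, x - box[2])
    dy = max(box[1] - y, 0.0, y - box[3])
    return math.hypot(dx, dy)

def hausdorff(a, b):
    """Hausdorff distance between two filled boxes.

    For convex shapes the farthest point from the other shape is a
    vertex, so checking the four corners in each direction suffices.
    """
    def directed(src, dst):
        corners = [(src[0], src[1]), (src[0], src[3]),
                   (src[2], src[1]), (src[2], src[3])]
        return max(point_to_box(x, y, dst) for x, y in corners)
    return max(directed(a, b), directed(b, a))

def rank(query, docs, measure):
    """Keep only documents overlapping the query, then order them."""
    hits = {name: box for name, box in docs.items()
            if overlap_area(query, box) > 0}
    if measure == "hausdorff":   # smaller distance ranks first
        return sorted(hits, key=lambda n: hausdorff(query, hits[n]))
    if measure == "overlap":     # larger shared area ranks first
        return sorted(hits, key=lambda n: overlap_area(query, hits[n]),
                      reverse=True)
    raise ValueError("unknown measure: " + measure)

# Hypothetical extents, chosen only to make the difference visible.
QUERY = (0, 0, 4, 4)
DOCS = {
    "inside": (1, 1, 3, 3),
    "corner": (3, 3, 10, 10),
    "large": (-10, -10, 10, 10),
    "far": (5, 5, 6, 6),
}
```

In this sketch the very large extent covers the whole query and therefore ranks first by area of overlap, whereas Hausdorff distance penalizes its poor fit to the query extent and ranks it last.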
In conclusion, while the spatial search strategies assessed in this study can be
used to improve the existing keyword-based OGD search approaches, we recommend
that OGD developers also consider using WordNet Synonyms based query expansion
and Hausdorff distance as a way of improving relevant spatial data discovery in open
government datasets using few keywords and with a tolerable response time.
Description
Dissertation submitted in partial fulfilment of the requirements for the Degree of Master of Science in Geospatial Technologies
Keywords
Open Government Data; Spatial Search; Relevance Judgment; Geographic Information Retrieval; Query Expansion
