Using provenance in the search for relevant spatial data

Spatial Knowledge Infrastructure (SKI) expands the current spatial data infrastructure model into an autonomous network of data, analytics, expertise and policies that helps end users integrate spatial knowledge in real time. In such an SKI, data and knowledge must be exposed to the semantic web so that the data and processes become discoverable by search engines and can be queried by dedicated machines.

An example use of the next-generation SKI is spatial search, which, although it seems trivial, is currently not fully functional: current search engines allow only limited search against bespoke datasets by their location, expressed by geographic coordinates or an address, but do not allow search by any other (spatial) characteristics defined for the object. Current trends in spatial search are towards extending the ‘spatial understanding’ of search engines by enabling natural language queries and adding dedicated semantic reasoning capabilities. As a result, such search engines can assist with deciding on the search results’ fitness for use.

RDA Group: 
Provenance Patterns WG
Contributor: 
Ivana Ivanova
Actors: 
Spatial data search engine
Goal: 
• To be able to use provenance to evaluate a dataset’s fitness for use
Summary: 
This use case demonstrates how to use provenance information in the search for relevant datasets.
Preconditions: 
Data is identifiable and discoverable.
Provenance is documented (e.g. using ISO 19115) in a machine-readable format.
The search engine is equipped with a semantic reasoner.
Postconditions: 
Results are provided as a ranked list of relevant datasets.
Steps: 
A registered user provides a natural language search string.
Search engine (SE) translates the search string to a structured set of requirements on provenance.
SE compares the search criteria with the provenance values of resources.
SE identifies candidate spatial resources.
SE returns the result to the user as a list of spatial datasets ranked by relevance.
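
The steps above can be illustrated with a minimal Python sketch. It is not a description of any existing search engine: the provenance fields (source, year, process_steps), the keyword rules and the scoring by fraction of satisfied requirements are all assumptions made for illustration, and simple keyword matching stands in for the natural language processing and semantic reasoning the use case calls for.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """A spatial resource with a few hypothetical provenance fields."""
    title: str
    source: str          # e.g. "survey", "satellite", "crowdsourced"
    year: int            # year of the last processing step
    process_steps: int   # number of documented lineage steps


# Hypothetical keyword rules standing in for natural language processing
# and semantic reasoning: each keyword maps to one provenance requirement.
KEYWORD_RULES = {
    "surveyed": ("source", lambda d: d.source == "survey"),
    "satellite": ("source", lambda d: d.source == "satellite"),
    "recent": ("year", lambda d: d.year >= 2020),
    "documented": ("process_steps", lambda d: d.process_steps > 0),
}


def translate(query):
    """Step 2: turn the search string into a list of provenance requirements."""
    words = query.lower().split()
    return [rule for key, rule in KEYWORD_RULES.items() if key in words]


def score(dataset, requirements):
    """Step 3: fraction of requirements the dataset's provenance satisfies."""
    if not requirements:
        return 0.0
    met = sum(1 for _, check in requirements if check(dataset))
    return met / len(requirements)


def search(query, catalogue):
    """Steps 4 and 5: keep candidates with a non-zero score, ranked by relevance."""
    requirements = translate(query)
    scored = [(score(d, requirements), d) for d in catalogue]
    candidates = [(s, d) for s, d in scored if s > 0]
    candidates.sort(key=lambda pair: pair[0], reverse=True)
    return [(d.title, s) for s, d in candidates]


# Made-up catalogue entries, used only to exercise the sketch.
catalogue = [
    Dataset("Road network A", "survey", 2022, 3),
    Dataset("Road network B", "crowdsourced", 2021, 0),
    Dataset("Land cover C", "satellite", 2015, 2),
]

results = search("recent surveyed roads", catalogue)
for title, relevance in results:
    print(title, relevance)
```

In a real SKI the requirements would be derived by a semantic reasoner from provenance documented in a machine-readable form (for example ISO 19115 lineage), and the relevance measure would likely weight requirements rather than count them equally.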