Organisations wishing to associate items in an ISO19115-1-conformant catalogue with graph-based provenance delivered by a provenance query service (likely a SPARQL service) may wish to make the link to that service as visible as possible to people used to the ISO19115-1 norms. Since ISO19115-1 typically uses the Lineage field for structured or unstructured lineage information (provenance), this pattern suggests placing the link to the provenance query service in that field.
PROV DM in OWL ontology form (PROV-O) and SPARQL services require the identification of provenance objects using HTTP URIs. For this reason, any object catalogued according to ISO19115-1 that is to be linked to a provenance query service as described here must be able to be referred to by URI. Some ISO19115-1-compliant catalogues use URIs but some only use UUIDs and do not provide persistent URIs based on those UUIDs. An effort must be made to establish persistent URIs using those UUIDs before integration with a provenance query service.
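As an illustration of the last point, one possible way to derive an HTTP URI from an existing catalogue UUID is sketched below in Python. The base URI and the function name are assumptions made for this example only; a real catalogue would choose its own persistent namespace and would also need to make the resulting URIs resolvable.

```python
import uuid

# Hypothetical base URI, chosen only for illustration.
BASE_URI = "http://example.com/dataset/"


def uri_for_uuid(record_uuid: str, base_uri: str = BASE_URI) -> str:
    """Return an HTTP URI for a catalogue record identified by a UUID."""
    # uuid.UUID rejects malformed input (ValueError) and normalises the
    # value to the lowercase, hyphenated form, so one record maps to one URI.
    canonical = str(uuid.UUID(record_uuid))
    return base_uri + canonical
```

Normalising the UUID before building the URI is a design choice that avoids minting two different URIs for the same record when the UUID is stored with different casing.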
This pattern is a specialisation of Pattern 12.1. It suggests placing a has_query_service link, as PROV-AQ calls it, in the LI_Lineage field in a manner conformant with ISO19115-1.
Using this method, we can work out that the query service itself is a PROV Entity (when described) and an Agent (when performing actions like causing Entities to be generated) but we can't say anything, in PROV terms, about the item linking to the query service. It could be any class of PROV object since any object may have provenance about it made available by a query service. Having said that, it is likely to be an Entity as ISO19115-1 catalogues tend to store information about sorts of Entities such as Datasets or Images.
The suggested structure is for the resourceLineage field of a metadata document to contain a LI_Lineage element containing a LI_Source which, in turn, cites a CI_OnlineResource which is described by a name, function, linkage, protocol and protocolRequest. Note that ISO19115-1 documents may contain multiple LI_Lineage objects within a resourceLineage field so that a reference to a provenance query service may be made alongside written statements of provenance, which are the current usage norm, or other structured lineage.
The specific elements that should be used to communicate that the metadata document is linking to a query service are described here:
For the protocolRequest element, the basic SPARQL query is
DESCRIBE <resource>;
and if the resource were identifiable via a URI, perhaps http://example.com/dataset/x, then it would be DESCRIBE <http://example.com/dataset/x>;
The full value for this field would then be that query expressed as a provenance service request, which is just an HTTP GET link. See the example below.
<mdb:resourceLineage>
  <mrl:LI_Lineage>
    <mrl:LI_Source>
      <mrl:sourceCitation>
        <cit:CI_Citation>
          <cit:title>
            <gco:CharacterString>{SERVICE_NAME}</gco:CharacterString>
          </cit:title>
          <cit:onlineResource>
            <mcc:CI_OnlineResource>
              <cit:name>
                <gco:CharacterString>{SERVICE_NAME}</gco:CharacterString>
              </cit:name>
              <cit:function>
                <cit:CI_OnLineFunctionCode codeList="codeListLocation#CI_OnLineFunctionCode" codeListValue="provenanceQueryService"/>
              </cit:function>
              <cit:linkage>
                <gco:CharacterString>{ENDPOINT_URI}</gco:CharacterString>
              </cit:linkage>
              <cit:protocol>
                <gco:CharacterString>{PROTOCOL}</gco:CharacterString>
              </cit:protocol>
              <cit:protocolRequest>
                <gco:CharacterString>{REQUEST_EXAMPLE}</gco:CharacterString>
              </cit:protocolRequest>
            </mcc:CI_OnlineResource>
          </cit:onlineResource>
        </cit:CI_Citation>
      </mrl:sourceCitation>
    </mrl:LI_Source>
  </mrl:LI_Lineage>
</mdb:resourceLineage>
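The template above could be filled in programmatically. The following Python sketch is one way to do so under stated assumptions: the function name and the lowercase placeholder names are invented for this example, the element names simply mirror the template, and namespace declarations are assumed to be supplied by the enclosing metadata document. It escapes each value because a request URL with more than one query parameter contains an ampersand, which must not appear unescaped in XML.

```python
from xml.sax.saxutils import escape

# Mirrors the template in this section; placeholders use str.format syntax.
LINEAGE_TEMPLATE = """<mdb:resourceLineage>
  <mrl:LI_Lineage>
    <mrl:LI_Source>
      <mrl:sourceCitation>
        <cit:CI_Citation>
          <cit:title>
            <gco:CharacterString>{service_name}</gco:CharacterString>
          </cit:title>
          <cit:onlineResource>
            <mcc:CI_OnlineResource>
              <cit:name>
                <gco:CharacterString>{service_name}</gco:CharacterString>
              </cit:name>
              <cit:function>
                <cit:CI_OnLineFunctionCode codeList="codeListLocation#CI_OnLineFunctionCode" codeListValue="provenanceQueryService"/>
              </cit:function>
              <cit:linkage>
                <gco:CharacterString>{endpoint_uri}</gco:CharacterString>
              </cit:linkage>
              <cit:protocol>
                <gco:CharacterString>{protocol}</gco:CharacterString>
              </cit:protocol>
              <cit:protocolRequest>
                <gco:CharacterString>{request_example}</gco:CharacterString>
              </cit:protocolRequest>
            </mcc:CI_OnlineResource>
          </cit:onlineResource>
        </cit:CI_Citation>
      </mrl:sourceCitation>
    </mrl:LI_Source>
  </mrl:LI_Lineage>
</mdb:resourceLineage>"""


def lineage_fragment(service_name: str, endpoint_uri: str,
                     protocol: str, request_example: str) -> str:
    """Fill the lineage template, escaping &, < and > in every value."""
    return LINEAGE_TEMPLATE.format(
        service_name=escape(service_name),
        endpoint_uri=escape(endpoint_uri),
        protocol=escape(protocol),
        request_example=escape(request_example),
    )
```

The result is a fragment, not a complete document: it only becomes parseable XML once it sits inside a metadata record that declares the mdb, mrl, cit, gco and mcc prefixes.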
For a Dataset X with a SPARQL query service at http://location.com/sparql, the following XML might be used:
<mdb:resourceLineage>
  <mrl:LI_Lineage>
    <mrl:LI_Source>
      <mrl:sourceCitation>
        <cit:CI_Citation>
          <cit:title>
            <gco:CharacterString>Provenance Query Service</gco:CharacterString>
          </cit:title>
          <cit:onlineResource>
            <mcc:CI_OnlineResource>
              <cit:name>
                <gco:CharacterString>Corporation A's Provenance Finder</gco:CharacterString>
              </cit:name>
              <cit:function>
                <cit:CI_OnLineFunctionCode codeList="codeListLocation#CI_OnLineFunctionCode" codeListValue="provenanceQueryService"/>
              </cit:function>
              <cit:linkage>
                <gco:CharacterString>http://location.com/sparql</gco:CharacterString>
              </cit:linkage>
              <cit:protocol>
                <gco:CharacterString>HTTP-SPARQL</gco:CharacterString>
              </cit:protocol>
              <cit:protocolRequest>
                <gco:CharacterString>http://location.com/sparql?query=DESCRIBE%20%3Chttp%3A%2F%2Fexample.com%2Fdataset%2Fx%3E</gco:CharacterString>
              </cit:protocolRequest>
            </mcc:CI_OnlineResource>
          </cit:onlineResource>
        </cit:CI_Citation>
      </mrl:sourceCitation>
    </mrl:LI_Source>
  </mrl:LI_Lineage>
</mdb:resourceLineage>
The CI_OnlineResource.protocolRequest value used in this example is the basic SPARQL DESCRIBE query for the resource, sent to the SPARQL service in a GET request. It is therefore a single link consisting of:
http://location.com/sparql
?query=
DESCRIBE%20%3Chttp%3A%2F%2Fexample.com%2Fdataset%2Fx%3E
for the query DESCRIBE <http://example.com/dataset/x>;
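The three parts listed above can be assembled with a few lines of Python. This is a minimal sketch: the function name is an assumption made for this example, and only the query parameter shown in this section is used.

```python
from urllib.parse import quote


def describe_request(endpoint: str, resource_uri: str) -> str:
    """Build the HTTP GET link for a SPARQL DESCRIBE query on one resource."""
    query = f"DESCRIBE <{resource_uri}>"
    # safe="" percent-encodes every reserved character, including ':' and '/',
    # which matches the encoded form used in the example.
    return f"{endpoint}?query={quote(query, safe='')}"


print(describe_request("http://location.com/sparql",
                       "http://example.com/dataset/x"))
# prints http://location.com/sparql?query=DESCRIBE%20%3Chttp%3A%2F%2Fexample.com%2Fdataset%2Fx%3E
```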