Geoscience Australia's metadata catalogue

Geoscience Australia (GA) is a government organisation in charge of geophysical and spatial data. It maintains a large online catalogue of public scientific datasets at http://ecat.ga.gov.au/geonetwork/. The catalogue is implemented using GeoNetwork, an online document database system. All records are stored according to the ISO 19115-1:2014 metadata standard. In addition to using that standard, GA has extended it with its own profile which, among other things, implements code list extensions. The code list relevant to this implementation is DS_AssociationTypeCode, which defines how a catalogued resource is related to another catalogued resource or to an external resource indicated by a hyperlink. The GA extensions include types taken directly from PROV, such as wasDerivedFrom, hadDerivation, wasGeneratedBy, generated, wasInformedBy and informed. An association between two resources that uses one of these codes can therefore be interpreted as a PROV statement, for example:

Dataset 82033 was derived from Dataset 70908.

In the catalogue's XML record for Dataset 82033, which follows the Geoscience Australia profile of ISO 19115-1:2014, this is expressed using the code list extension for wasDerivedFrom as:

<mri:associatedResource>
  <mri:MD_AssociatedResource>
    <mri:name>
      <cit:CI_Citation>
        <cit:title>
          <gco:CharacterString>International Best Track Archive for Climate Stewardship</gco:CharacterString>
        </cit:title>
        <cit:identifier>
          <mcc:MD_Identifier>
            <mcc:code>
              <gco:CharacterString>70908</gco:CharacterString>
            </mcc:code>
            <mcc:description>
              <gco:CharacterString>eCat Identifier</gco:CharacterString>
            </mcc:description>
          </mcc:MD_Identifier>
        </cit:identifier>
        <cit:onlineResource>
          <cit:CI_OnlineResource>
            <cit:linkage>
              <gco:CharacterString>http://pid.geoscience.gov.au/dataset/70908</gco:CharacterString>
            </cit:linkage>
            <cit:protocol>
              <gco:CharacterString>WWW:LINK-1.0-http--link</gco:CharacterString>
            </cit:protocol>
            <cit:description>
              <gco:CharacterString>Link to eCat metadata record landing page</gco:CharacterString>
            </cit:description>
            <cit:function>
              <cit:CI_OnLineFunctionCode codeList="codeListLocation#CI_OnLineFunctionCode" codeListValue="completeMetadata" />
            </cit:function>
          </cit:CI_OnlineResource>
        </cit:onlineResource>
      </cit:CI_Citation>
    </mri:name>
    <mri:associationType>
      <mri:DS_AssociationTypeCode codeList="codeListLocation#DS_AssociationTypeCode" codeListValue="wasDerivedFrom" />
    </mri:associationType>
  </mri:MD_AssociatedResource>
</mri:associatedResource>

Using the persistent URIs for Datasets 82033 and 70908, which are http://pid.geoscience.gov.au/dataset/ga/82033 and http://pid.geoscience.gov.au/dataset/ga/70908 respectively, this can be interpreted using the PROV ontology as the RDF triple:

<http://pid.geoscience.gov.au/dataset/ga/82033> prov:wasDerivedFrom <http://pid.geoscience.gov.au/dataset/ga/70908> .
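
The mapping from an associated-resource element to a PROV triple can be sketched in Python as follows. This is a minimal illustration and not GA's implementation. The namespace URIs are assumed to be the ISO 19115-3 XML schema namespaces (the excerpt above does not declare them), the function name associations_to_triples is hypothetical, and the dataset URI pattern is copied from the example triple.

```python
import xml.etree.ElementTree as ET

# Assumption: ISO 19115-3 XML namespaces. The excerpt in the text does not
# declare its prefixes, so these URIs may differ in the real records.
NS = {
    "mri": "http://standards.iso.org/iso/19115/-3/mri/1.0",
    "cit": "http://standards.iso.org/iso/19115/-3/cit/1.0",
    "mcc": "http://standards.iso.org/iso/19115/-3/mcc/1.0",
    "gco": "http://standards.iso.org/iso/19115/-3/gco/1.0",
}

# Association type codes the text lists as taken directly from PROV.
PROV_CODES = {
    "wasDerivedFrom", "hadDerivation", "wasGeneratedBy",
    "generated", "wasInformedBy", "informed",
}

# URI pattern copied from the example triple in the text.
DATASET_URI = "http://pid.geoscience.gov.au/dataset/ga/{}"

# Path from MD_AssociatedResource to the associated record's identifier.
ID_PATH = ("mri:name/cit:CI_Citation/cit:identifier/"
           "mcc:MD_Identifier/mcc:code/gco:CharacterString")
CODE_PATH = "mri:associationType/mri:DS_AssociationTypeCode"


def associations_to_triples(record_id, xml_text):
    """Return one prov: triple string per PROV-typed associated resource."""
    root = ET.fromstring(xml_text)
    triples = []
    for assoc in root.iter("{%s}MD_AssociatedResource" % NS["mri"]):
        code_el = assoc.find(CODE_PATH, NS)
        id_el = assoc.find(ID_PATH, NS)
        if code_el is None or id_el is None or not id_el.text:
            continue  # incomplete association, nothing to map
        code = code_el.get("codeListValue")
        if code not in PROV_CODES:
            continue  # not one of the PROV-derived association types
        triples.append("<%s> prov:%s <%s> ." % (
            DATASET_URI.format(record_id),
            code,
            DATASET_URI.format(id_el.text.strip()),
        ))
    return triples


# Trimmed version of the excerpt above, with the assumed namespaces declared.
SAMPLE = """
<mri:associatedResource
    xmlns:mri="http://standards.iso.org/iso/19115/-3/mri/1.0"
    xmlns:cit="http://standards.iso.org/iso/19115/-3/cit/1.0"
    xmlns:mcc="http://standards.iso.org/iso/19115/-3/mcc/1.0"
    xmlns:gco="http://standards.iso.org/iso/19115/-3/gco/1.0">
  <mri:MD_AssociatedResource>
    <mri:name>
      <cit:CI_Citation>
        <cit:identifier>
          <mcc:MD_Identifier>
            <mcc:code>
              <gco:CharacterString>70908</gco:CharacterString>
            </mcc:code>
          </mcc:MD_Identifier>
        </cit:identifier>
      </cit:CI_Citation>
    </mri:name>
    <mri:associationType>
      <mri:DS_AssociationTypeCode codeList="codeListLocation#DS_AssociationTypeCode" codeListValue="wasDerivedFrom" />
    </mri:associationType>
  </mri:MD_AssociatedResource>
</mri:associatedResource>
"""

for triple in associations_to_triples("82033", SAMPLE):
    print(triple)
```

Associations whose code is not in the PROV-derived set are skipped, so only the association types named above produce triples.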

Links to the online implementation of this for Dataset 82033 and another dataset, 113621, are given below.