Generating citations from provenance - INCOMPLETE

Contributor: 
Nicholas Car

Introduction

Citing documents, datasets or other things is an inherently provenance-related task, given that the information in a reference, such as an item's creator or date of publication, often consists of provenance facts about the item. Additionally, provenance graphs may contain information valuable for citation that is associated not directly with the object cited but with objects related to it, or perhaps with its use.

This pattern describes a methodology for determining how to cite objects and then a methodology for generating reference text from provenance information.

Class-based determination of citation requirements

The first step in generating a sensible citation for an object is to determine the object's class, because the class determines which citation elements are needed for the citation to be useful. For example, a static object such as a book or a dataset can be cited in the conventional way, but an object that is subject to change, such as an internet article, needs an access date to indicate the state of the object at the time it was cited. Consider a more specialised object, a software code repository. Citing it requires both a citation time indicator (or a software "commit" reference, which identifies a particular repository state) and a reference to the "branch" of the repository used. Branches are an element specific to the version control systems used in software repositories.
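The class-based check described above can be sketched as a lookup table. This is a minimal illustration only: the class names, the element names and the `missing_requirements` helper are all assumptions made for this example, not part of PROV-O or of any citation standard. Each requirement is a tuple of acceptable alternatives, so that a repository may be pinned by either a commit or an access date.

```python
# Hypothetical class and element names, chosen for illustration only.
# Each requirement is a tuple of alternatives: any one of them satisfies it.
REQUIREMENTS = {
    "Book": [("author",), ("title",), ("year",)],
    "WebArticle": [("author",), ("title",), ("access_date",)],
    "CodeRepository": [
        ("author",),
        ("title",),
        ("branch",),
        ("commit", "access_date"),  # either one pins the repository state
    ],
}


def missing_requirements(object_class, available):
    """Return the requirements that none of the available elements satisfy."""
    return [
        alternatives
        for alternatives in REQUIREMENTS[object_class]
        if not any(element in available for element in alternatives)
    ]


# A repository citation that names a branch but no commit or access date
# is still missing its state indicator.
print(missing_requirements("CodeRepository", {"author", "title", "branch"}))
```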

If information about the use of an object is recorded in a provenance graph, then, regardless of type, the URI of the Named Individual (the object) in question can be used to refer to it. Details from the provenance graph can then supply the particular citation elements.
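As a rough sketch of this lookup, the provenance graph below is represented as plain (subject, predicate, object) tuples rather than with an RDF library. The `ex:` URIs and the timestamp are made-up illustrative data. Using `prov:startedAtTime` of the using activity as a stand-in for the access date is an assumption of this example; a qualified usage with `prov:atTime` would be more precise.

```python
# Made-up provenance graph: an analysis activity used an internet article.
TRIPLES = [
    ("ex:article1", "rdf:type", "prov:Entity"),
    ("ex:article1", "prov:wasAttributedTo", "ex:nick"),
    ("ex:analysis1", "rdf:type", "prov:Activity"),
    ("ex:analysis1", "prov:used", "ex:article1"),
    ("ex:analysis1", "prov:startedAtTime", "2016-05-04T10:00:00"),
]


def objects(triples, subject, predicate):
    """All objects of triples matching the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def access_times(triples, entity_uri):
    """Start times of every activity recorded as having used the entity."""
    times = []
    for s, p, o in triples:
        if p == "prov:used" and o == entity_uri:
            times.extend(objects(triples, s, "prov:startedAtTime"))
    return times


# The entity's URI is the only handle needed to collect both kinds of fact.
print(objects(TRIPLES, "ex:article1", "prov:wasAttributedTo"))
print(access_times(TRIPLES, "ex:article1"))
```

The first lookup reads a fact attached directly to the object; the second reads a fact attached to its use, which is where an access date would normally come from.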

Citation/reference format background

Academic literature-style citations usually have a link within text to a formatted reference, e.g.:

“… according to Car (2016).”

Car, N.J., Leveraging provenance in citation. In J.Something, 18 (2) 152-167, DOI:10.123/abcdef987

There are many data models for references (with resulting templates and formatting) but, for the purposes of this pattern, we will just use a simplified AAAS citation template, used by the example above, which, for a journal article citation, we give as:

{AUTHOR}[, {AUTHOR}], {TITLE}. {JOURNAL} {JOURNAL_NUMBER}[, ({JOURNAL_VOLUME})][, {PAGE_RANGE}] ({YEAR}).
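One way to read the journal template is as a small formatting function in which the bracketed parts are optional arguments. The function name and parameter names below are assumptions for this sketch; the field values are taken from the Car example above, with the year taken from its in-text citation.

```python
def format_journal_reference(authors, title, journal, number, year,
                             volume=None, page_range=None):
    """Render the simplified journal template; volume and pages are optional."""
    reference = f"{', '.join(authors)}, {title}. {journal} {number}"
    if volume is not None:
        reference += f", ({volume})"
    if page_range is not None:
        reference += f", {page_range}"
    return f"{reference} ({year})."


print(format_journal_reference(
    authors=["N.J. Car"],
    title="Leveraging provenance in citation",
    journal="J.Something",
    number=18,
    year=2016,
    volume=2,
    page_range="152-167",
))
# prints: N.J. Car, Leveraging provenance in citation. J.Something 18, (2), 152-167 (2016).
```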

For a book, we give the following simplified AAAS template:

{AUTHOR}[, {AUTHOR}], {TITLE} ({PUBLISHER_NAME}, {PUBLISHER_LOCATION}[, ed. {EDITION_NUMBER}], {YEAR})[, pp. {PAGE_RANGE}].

This renders examples such as:

M. Lister, Fundamentals of Operating Systems (Springer-Verlag, New York, ed. 3, 1984), pp. 7-11.
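The book template can be sketched in the same spirit. The function and parameter names are assumptions for this example, and it assumes that a comma separates the publisher location from the year when no edition is given.

```python
def format_book_reference(authors, title, publisher, location, year,
                          edition=None, page_range=None):
    """Render the simplified book template; edition and pages are optional."""
    imprint = f"{publisher}, {location}"
    if edition is not None:
        imprint += f", ed. {edition}"
    reference = f"{', '.join(authors)}, {title} ({imprint}, {year})"
    if page_range is not None:
        reference += f", pp. {page_range}"
    return reference + "."


print(format_book_reference(
    authors=["M. Lister"],
    title="Fundamentals of Operating Systems",
    publisher="Springer-Verlag",
    location="New York",
    year=1984,
    edition=3,
    page_range="7-11",
))
# prints: M. Lister, Fundamentals of Operating Systems (Springer-Verlag, New York, ed. 3, 1984), pp. 7-11.
```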

All of the elements in a formatted reference, regardless of the cited object type, are elements that can be described in the provenance of the cited object and, specifically, in PROV-O. Therefore, we can generate formatted reference text from provenance.
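To make the step from provenance to reference elements concrete, here is a minimal sketch that pulls template fields out of a graph held as plain tuples. It rests on several assumptions: the `ex:` URIs and the timestamp are made-up data, the title and agent name are expressed with `dcterms:title` and `foaf:name` (companion vocabularies, not PROV-O terms), and the publication year is read from `prov:generatedAtTime`.

```python
# Made-up provenance for the journal article used in the earlier example.
TRIPLES = [
    ("ex:doc1", "rdf:type", "prov:Entity"),
    ("ex:doc1", "dcterms:title", "Leveraging provenance in citation"),
    ("ex:doc1", "prov:wasAttributedTo", "ex:car"),
    ("ex:doc1", "prov:generatedAtTime", "2016-01-01T00:00:00"),
    ("ex:car", "rdf:type", "prov:Agent"),
    ("ex:car", "foaf:name", "N.J. Car"),
]


def citation_fields(triples, entity_uri):
    """Collect AUTHOR, TITLE and YEAR template fields for one entity."""

    def objects(subject, predicate):
        return [o for s, p, o in triples if s == subject and p == predicate]

    agents = objects(entity_uri, "prov:wasAttributedTo")
    return {
        "AUTHOR": [name for agent in agents for name in objects(agent, "foaf:name")],
        "TITLE": objects(entity_uri, "dcterms:title")[0],
        # The year is the first four characters of the ISO 8601 timestamp.
        "YEAR": objects(entity_uri, "prov:generatedAtTime")[0][:4],
    }


print(citation_fields(TRIPLES, "ex:doc1"))
```

The resulting dictionary holds the values that a reference template's placeholders would be filled with; fields such as journal name or page range would need further predicates that this sketch does not model.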

To use provenance effectively in citations, one must take cognisance of both the (PROV-O) class of the object cited and the purpose of the citation.

Provenance citation: see Samples API generation template.

Reverse case, citation provenance: what can we infer?

ANSI/NISO Z39.29 – bibliographic references standard