VocPub Profile - Specification
- URI
https://w3id.org/profile/vocpub/spec
- Title
- VocPub Profile - Specification Document
- Definition
- This document specifies the VocPub Profile. It is to be used to inform people about the requirements that need to be met by data claiming to conform to the profile.
- Created
- 2020-06-14
- Modified
- 2024-06-13
- Version IRI
- https://w3id.org/profile/vocpub/spec/4.10
- Version Information
-
4.10: Moved all warnings from PropertyShapes to NodeShapes
4.9: Fixed RDF syntax errors
4.8: Fixed Warning/Violation error for PropertyShapes
4.7: Fixed modified date alternate path listing bug
4.6: Fixed sdo:historyNote->skos:historyNote bug
4.5: Added suggested predicates of license & copyrightHolder
4.4: Fixed versions across multiple Resources
4.3: Improved validator error messages by using more named Property Shapes
4.2: Included CONSTRUCT-based pre-validation inference in validator. First Git tagged version
4.1: Added Requirements 2.1.10, 2.1.11 & 2.1.12 and example RDF
4.0: Added a SPARQL function to allow for the inferencing of skos:inScheme predicates, skos:broader / skos:narrower and skos:topConceptOf / skos:hasTopConcept pairs of inverse predicates
3.3: Converted validator metadata to schema.org, enabled bibliographic references for Concepts, enabled DCTERMS or schema.org for many ConceptScheme predicates; simplified 2.1.6 from two Requirements to one; included skos:topConceptOf in 2.1.8 for Concepts at the top of the hierarchy; collapsed title & definition requirement pairs to single requirements
3.2: Allowed dcterms:provenance and skos:historyNote; removed max restriction on dcterms:source & prov:wasDerivedFrom
3.1: Changed dcterms:provenance to skos:historyNote
3.0: Removed Requirement-2.3.5 (identifiers) as these are auto-generated in systems like VocPrez; Added Requirement-2.1.10 & 2.1.11 and sub parts to test for qualifiedDerivation and status of a ConceptScheme
- Creator
- Nicholas J. Car
- Publisher
- Australian Government Linked Data Working Group
- Further metadata
- This specification is part of the VocPub Profile. See that profile's main document for License & Rights information and other metadata not given here.
- Profile URI
https://w3id.org/profile/vocpub
- License
- CC-BY 4.0
Abstract
This is the specification document of the VocPub profile of SKOS. It defines the requirements that data must satisfy to be considered conformant with this profile.
This specification document cannot be used for testing conformance of RDF resources to this profile: that role belongs to the validation resource within this profile:
For the list of all resources within this profile, see the profile definition:
Namespaces
This document refers to elements of various ontologies by short codes using namespace prefixes. The prefixes and their corresponding namespaces' URIs are:
- dcterms
http://purl.org/dc/terms/
- isorole
http://def.isotc211.org/iso19115/-1/2018/CitationAndResponsiblePartyInformation/code/CI_RoleCode/
- prof
http://www.w3.org/ns/dx/prof/
- prov
http://www.w3.org/ns/prov#
- reg
http://purl.org/linked-data/registry#
- sdo
https://schema.org/
- skos
http://www.w3.org/2004/02/skos/core#
- rdfs
http://www.w3.org/2000/01/rdf-schema#
1. Introduction
Many organisations use the Simple Knowledge Organization System Reference (SKOS) [SKOS] to represent vocabularies in a form that can be read by humans and consumed by machines; that is, in Semantic Web form [Semantic Web].
This profile defines a vocabulary as a controlled collection of defined terms - Concepts - that may or may not contain relationships between Concepts and relationships to Concepts in other vocabularies.
This document specifies a profile of SKOS. For the term profile, the definition from the Profiles Vocabulary [PROF] is used. A profile is:
A specification that constrains, extends, combines, or provides guidance or explanation about the usage of other specifications.
Here, the other specification being profiled is SKOS.
In the next section, this document describes how SKOS's elements must be presented - in certain arrangements with respect to one another and with certain predicates to indicate properties - to make a vocabulary that conforms to this profile.
This specification's rules/requirements are numbered and indicated in red text.
1.1 Data Expansion
Some SKOS elements - classes and predicates - can be inferred based on rules present in the SKOS model. For example, skos:broader and skos:narrower are inverse predicates, so from <A> skos:broader <B> one can infer <B> skos:narrower <A>.
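The inference in this example can be sketched in a few lines of Python. This is only an illustration, assuming triples are held as plain (subject, predicate, object) tuples of strings; the name `add_inverses` is hypothetical and is not part of this profile's tooling.

```python
# Pairs of inverse SKOS predicates (each predicate maps to its inverse).
INVERSES = {
    "skos:broader": "skos:narrower",
    "skos:narrower": "skos:broader",
    "skos:topConceptOf": "skos:hasTopConcept",
    "skos:hasTopConcept": "skos:topConceptOf",
}


def add_inverses(triples):
    """Return the input triples plus the inverse of every triple whose predicate has one."""
    expanded = set(triples)
    for s, p, o in triples:
        if p in INVERSES:
            # Swap subject and object and use the inverse predicate.
            expanded.add((o, INVERSES[p], s))
    return expanded


data = {("<A>", "skos:broader", "<B>")}
expanded = add_inverses(data)
```

After the call, `expanded` holds both the supplied skos:broader triple and the inferred skos:narrower triple.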
This profile allows data to be supplied in a minimalist form that is not conformant to this specification until a series of calculations, based on certain SKOS rules, are carried out on it through a process known as data expansion.
The particular rules that may be applied to data before validation with this profile's validator are given in the table below. These rules are enacted by the application of a series of SPARQL [SPARQL] queries to the data, all of which are packaged up inside a SHACL [SHACL] file in this profile's repository.
Rule | Description | SPARQL Query |
---|---|---|
hasTopConcept | Calculate skos:hasTopConcept as the inverse to skos:topConceptOf | CONSTRUCT { $this skos:hasTopConcept ?c . } WHERE { ?c skos:topConceptOf $this . } |
topConceptOf | Calculate skos:topConceptOf as the inverse to skos:hasTopConcept | CONSTRUCT { $this skos:topConceptOf ?cs . } WHERE { ?cs skos:hasTopConcept $this . } |
broader | Calculate skos:broader as the inverse to skos:narrower | CONSTRUCT { $this skos:broader ?n . } WHERE { ?n skos:narrower $this . } |
narrower | Calculate skos:narrower as the inverse to skos:broader | CONSTRUCT { $this skos:narrower ?b . } WHERE { ?b skos:broader $this . } |
inScheme | Calculate skos:inScheme for all Concepts, based on their linking to a Concept Scheme via the skos:broader*/skos:topConceptOf property path | CONSTRUCT { $this skos:inScheme ?cs . } WHERE { $this skos:broader*/skos:topConceptOf ?cs . } |
Concept provenance | Infer provenance predicates for a Concept when it doesn't have its own but its containing Concept Scheme does | CONSTRUCT { $this ?p ?o3 } WHERE { $this skos:inScheme ?cs . VALUES ?p { prov:wasDerivedFrom skos:historyNote sdo:citation dcterms:source dcterms:provenance } ?cs ?p ?o . OPTIONAL { $this ?p ?o2 . } BIND (COALESCE(?o2, ?o) AS ?o3) } |
Application of these rules will allow the provenance of a Concept supplied without any provenance predicates to be calculated from its containing Concept Scheme, thus satisfying Requirement 2.3.4.
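As a rough illustration of the inScheme and Concept provenance rules, the following Python sketch mimics their effect on triples held as (subject, predicate, object) tuples. It is not the profile's expansion script: the function names and the sample data are hypothetical, and it assumes skos:broader triples are already present (for example, after the inverse rules have run).

```python
PROVENANCE_PREDICATES = (
    "prov:wasDerivedFrom",
    "skos:historyNote",
    "sdo:citation",
    "dcterms:source",
    "dcterms:provenance",
)


def infer_in_scheme(triples):
    """Add skos:inScheme for Concepts reaching a scheme via skos:broader*/skos:topConceptOf."""
    broader, top_of = {}, {}
    for s, p, o in triples:
        if p == "skos:broader":
            broader.setdefault(s, set()).add(o)
        elif p == "skos:topConceptOf":
            top_of.setdefault(s, set()).add(o)

    inferred = set(triples)
    for concept in set(broader) | set(top_of):
        seen, stack = set(), [concept]
        while stack:  # walk upwards through zero or more skos:broader links
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            for scheme in top_of.get(node, ()):
                inferred.add((concept, "skos:inScheme", scheme))
            stack.extend(broader.get(node, ()))
    return inferred


def inherit_provenance(triples):
    """Copy a scheme's provenance values to member Concepts lacking that predicate."""
    result = set(triples)
    for concept, p, scheme in triples:
        if p != "skos:inScheme":
            continue
        for pred in PROVENANCE_PREDICATES:
            if any(s == concept and q == pred for s, q, _ in triples):
                continue  # the Concept's own value takes precedence
            for s, q, o in triples:
                if s == scheme and q == pred:
                    result.add((concept, pred, o))
    return result


# Hypothetical sample data: one scheme, one top concept, one child concept.
data = {
    ("<cs>", "skos:historyNote", "Derived from a paper glossary"),
    ("<top>", "skos:topConceptOf", "<cs>"),
    ("<child>", "skos:broader", "<top>"),
}
expanded = inherit_provenance(infer_in_scheme(data))
```

Here `<child>` gains both a skos:inScheme link to `<cs>` and the scheme's skos:historyNote, because it supplied neither itself.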
The SHACL file containing all the expansion queries is available at:
A Python script able to execute the expansion rules on RDF data before validation is available at:
1.2 Dependencies
To characterise vocabularies according to the mandatory and suggested Requirements of this Profile, several other vocabularies will need to be used. This profile is therefore dependent on those vocabularies. For this reason, copies of these vocabularies are maintained within the repository of this profile and are accessible as per the details table below.
Vocabulary | Description | Where used | Local Copy |
---|---|---|---|
IDN Role Codes | The Indigenous Data Network's vocabulary of the types of roles Agents - People and Organisations - play in relation to data | For the predicate prov:hadRole, applied to an Attribution indicated by a prov:qualifiedAttribution predicate for a Concept Scheme | https://data.idnau.org/pid/vocab/idn-role-codes |
Registry Statuses | The registration statuses of items in a Register, as per ISO19135 | For the suggested vocabulary predicate reg:status, as per Requirement 2.1.10 | https://linked.data.gov.au/def/reg-statuses |
Vocab Derivation Modes | The modes by which one vocabulary may derive from another | For the suggested vocabulary predicate prov:qualifiedDerivation, as per Requirement 2.1.11 | https://linked.data.gov.au/def/vocdermods |
2. Elements & Requirements
2.1 Vocabulary
This profile identifies Semantic Web objects with URI-based persistent identifiers. For this reason:
§ 2.1.1 Each vocabulary MUST be identified by an IRI
As per the SKOS Primer [SKOS Primer], a document guiding the use of SKOS:
concepts usually come in carefully compiled vocabularies, such as thesauri or classification schemes. SKOS offers the means of representing such KOSs using the skos:ConceptScheme class.
For this reason, this profile requires that:
§ 2.1.2 Each vocabulary MUST be presented as a single Concept Scheme object
For ease of data management:
§ 2.1.3 Each vocabulary MUST be presented in a single RDF file which does not contain information other than that which is directly part of the vocabulary
To ensure vocabularies can be catalogued effectively and governed:
§ 2.1.4 Each vocabulary MUST have exactly one title and at least one definition indicated using the skos:prefLabel and the skos:definition predicates respectively that must give textual literal values. Only one definition per language is allowed
NOTE: Unlike the general directions for the use of SKOS (the SKOS "Primer"), labels in multiple languages should be indicated with skos:altLabel predicates, not all with skos:prefLabel, i.e. there should only ever be one skos:prefLabel value. If multiple definitions are given, the one in the language of the label is considered primary.
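The counting rules in Requirement 2.1.4 (which Requirements 2.2.1 and 2.3.1 repeat for Collections and Concepts) can be illustrated with a small check. This is a sketch only: it assumes literals are available as (text, language tag) tuples, and `check_labels` is a hypothetical helper, not the profile's validator.

```python
from collections import Counter


def check_labels(pref_labels, definitions):
    """Return a list of problems for one resource.

    pref_labels and definitions are lists of (text, language_tag) tuples;
    language_tag may be None for a literal without a language.
    """
    problems = []
    if len(pref_labels) != 1:
        problems.append(f"expected exactly one skos:prefLabel, found {len(pref_labels)}")
    if not definitions:
        problems.append("expected at least one skos:definition")
    # Only one definition is allowed per language tag.
    per_language = Counter(lang for _, lang in definitions)
    for lang, count in per_language.items():
        if count > 1:
            problems.append(f"{count} definitions share the language tag {lang!r}")
    return problems


ok = check_labels(
    [("Rock types", "en")],
    [("Kinds of rock", "en"), ("Types de roche", "fr")],
)
```

A resource with one label and one definition per language yields an empty problem list; two labels, or two English definitions, would each be reported.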
§ 2.1.5 Each vocabulary MUST have exactly one created date and exactly one modified date indicated using the sdo:dateCreated and sdo:dateModified or dcterms:created and dcterms:modified predicates respectively that must be either an xsd:date, xsd:dateTime or xsd:dateTimeStamp literal value
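A minimal sketch of how Requirement 2.1.5 might be checked is given below. It assumes each property of the Concept Scheme is available as a (predicate, lexical value, datatype) tuple, reads "exactly one" as counting across the schema.org and DCTERMS alternatives together, and uses Python's ISO 8601 parsers as a rough stand-in for the XSD lexical rules. The names `parses` and `check_dates` are hypothetical.

```python
from datetime import date, datetime

CREATED = {"sdo:dateCreated", "dcterms:created"}
MODIFIED = {"sdo:dateModified", "dcterms:modified"}
DATE_TYPES = {"xsd:date", "xsd:dateTime", "xsd:dateTimeStamp"}


def parses(lexical, datatype):
    """Rough lexical check using Python's ISO 8601 parsers (not the full XSD grammar)."""
    try:
        if datatype == "xsd:date":
            date.fromisoformat(lexical)
            return True
        if "T" not in lexical:
            return False  # date-time values need a time part
        parsed = datetime.fromisoformat(lexical)
        # xsd:dateTimeStamp additionally requires a timezone offset.
        return not (datatype == "xsd:dateTimeStamp" and parsed.tzinfo is None)
    except ValueError:
        return False


def check_dates(properties):
    """properties: list of (predicate, lexical_value, datatype) for one Concept Scheme."""
    problems = []
    for label, predicates in (("created", CREATED), ("modified", MODIFIED)):
        values = [(v, dt) for p, v, dt in properties if p in predicates]
        if len(values) != 1:
            problems.append(f"expected exactly one {label} date, found {len(values)}")
        for v, dt in values:
            if dt not in DATE_TYPES or not parses(v, dt):
                problems.append(f"{label} date {v!r} typed {dt} is not an accepted date literal")
    return problems


props = [
    ("sdo:dateCreated", "2020-06-14", "xsd:date"),
    ("dcterms:modified", "2024-06-13T09:00:00+10:00", "xsd:dateTimeStamp"),
]
problems = check_dates(props)
```

Mixing schema.org and DCTERMS predicates, as in the sample, is accepted by this sketch; supplying a second created date through either predicate is reported.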
§ 2.1.6 Each vocabulary MUST have at least one creator, indicated using the sdo:creator or dcterms:creator predicate, and exactly one publisher, indicated using the sdo:publisher or dcterms:publisher predicate, all of which MUST be IRIs indicating instances of sdo:Person or sdo:Organization. A prov:qualifiedAttribution predicate indicating an Agent with the prov:hadRole predicate indicating the value isorole:originator or isorole:publisher may be used instead of sdo:creator & sdo:publisher, respectively
To be able to link SKOS vocabularies to their non-vocabulary source information:
§ 2.1.7 The origins of a Concept Scheme MUST be indicated by at least one of the following predicates: skos:historyNote, sdo:citation, prov:wasDerivedFrom. dcterms:source MAY be used instead of sdo:citation and dcterms:provenance MAY be used instead of skos:historyNote but the schema.org and SKOS predicates are preferred.
If a vocabulary is based on another Semantic Web resource, such as an ontology or another vocabulary, prov:wasDerivedFrom should be used to indicate that resource's IRI. If the vocabulary is based on a resource that is identified by an IRI but which is not a Semantic Web resource, sdo:citation should be used to indicate the resource's IRI with the xsd:anyURI datatype. If the vocabulary is based on something which cannot be identified by IRI, a statement about the vocabulary's origins should be given in a literal value indicated with the skos:historyNote predicate. If the vocabulary is not based on any other resource or source of information, i.e. this vocabulary is its only expression, this should be communicated by use of the skos:historyNote predicate indicating the phrase "This vocabulary is expressed for the first time here".
The use of dcterms:source & dcterms:provenance is to maintain compatibility with previous versions of VocPub only and may eventually be disallowed.
To ensure that all the terms within a vocabulary are linked to the main vocabulary object, the Concept Scheme:
§ 2.1.8 All Concept instances within a Concept Scheme MUST be contained in a single term hierarchy using skos:hasTopConcept / skos:topConceptOf predicates indicating the broadest Concepts in the vocabulary and then skos:broader and/or skos:narrower predicates for all non-broadest Concepts in a hierarchy that contains no cycles.
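The "no cycles" part of Requirement 2.1.8 amounts to cycle detection in the graph formed by skos:broader links. The sketch below uses a depth-first search with three node states; it assumes the links have been collected into a dictionary, and `find_cycle` is a hypothetical name rather than anything taken from the validator. Being recursive, it suits hierarchies of modest depth.

```python
def find_cycle(broader):
    """Return one cycle in the skos:broader graph as a list of nodes, or None.

    broader maps each Concept to the set of Concepts it declares as broader.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {}

    def visit(node, path):
        colour[node] = GREY
        path.append(node)
        for parent in broader.get(node, ()):
            state = colour.get(parent, WHITE)
            if state == GREY:
                # parent is already on the current path: a cycle is closed
                return path[path.index(parent):] + [parent]
            if state == WHITE:
                cycle = visit(parent, path)
                if cycle:
                    return cycle
        path.pop()
        colour[node] = BLACK
        return None

    for concept in list(broader):
        if colour.get(concept, WHITE) == WHITE:
            cycle = visit(concept, [])
            if cycle:
                return cycle
    return None


acyclic = find_cycle({"<c>": {"<b>"}, "<b>": {"<a>"}})
cyclic = find_cycle({"<a>": {"<b>"}, "<b>": {"<a>"}})
```

A returned list starts and ends with the same Concept, which makes it easy to report the offending chain to a vocabulary editor.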
To unambiguously link the term hierarchy within a vocabulary to the vocabulary itself:
§ 2.1.9 Each vocabulary's Concept Scheme MUST link to at least one Concept within the vocabulary with the skos:hasTopConcept predicate
To communicate the Registry Status of the vocabulary:
§ 2.1.10 The status of the vocabulary as a whole, according to the Registry Status standard [ISO 19135-1:2015], SHOULD be given with the predicate reg:status indicating a Concept from the Registry Statuses vocabulary (https://linked.data.gov.au/def/reg-statuses).
To indicate whether and if so how this vocabulary has been derived from another vocabulary:
§ 2.1.11 The derivation status of the vocabulary SHOULD be given with the predicate prov:qualifiedDerivation indicating a Blank Node that contains the predicate prov:entity, to indicate the vocabulary derived from, and prov:hadRole, to indicate the mode of derivation, which SHOULD be taken from the Vocabulary Derivation Modes vocabulary (https://linked.data.gov.au/def/vocdermods).
Example data for a vocabulary indicating that it is an extension to another vocabulary using the mechanism defined in Requirement 2.1.11 is:
# Vocab X is derived from Vocab Y and is an extension of it
<http://example.com/vocab/x>
    a skos:ConceptScheme ;
    skos:prefLabel "Vocab X"@en ;
    ...
    prov:qualifiedDerivation [
        prov:entity <http://example.com/vocab/y> ;  # Vocab Y
        prov:hadRole <https://linked.data.gov.au/def/vocdermods/extension> ;
    ] ;
.
To indicate the high-level theming of a vocabulary:
§ 2.1.12 High-level theming of a vocabulary SHOULD be given using the sdo:keywords predicate indicating Concepts from another vocabulary. Alternatively, dcat:theme MAY be used. Text literal values for either predicate SHOULD NOT be used.
To indicate license and copyright:
§ 2.1.13 Any licence pertaining to the reuse of a vocabulary's content SHOULD be given using the sdo:license predicate preferentially indicating the IRI of a license if in RDF form or a literal URL (datatype xsd:anyURI) if online but not in RDF form. If the licence is expressed in text, a literal text field may be indicated.
§ 2.1.14 The copyright holder for the vocabulary SHOULD be given using the sdo:copyrightHolder predicate preferentially indicating the IRI of an Agent or a Blank Node instance of an Agent containing details as per Agent requirements. A prov:qualifiedAttribution predicate indicating an Agent with the prov:hadRole predicate indicating the value isorole:rightsHolder may be used instead of sdo:copyrightHolder.
2.2 Collection
From the SKOS Primer [SKOS Primer]:
SKOS makes it possible to define meaningful groupings or "collections" of concepts. Collections may contain Concepts defined in any vocabulary, not just the one the Collection itself is defined in.
To ensure that Collection instances are identifiable and their meaning isn't obscure or lost:
§ 2.2.1 Each Collection MUST have exactly one title and at least one definition indicated using the skos:prefLabel and the skos:definition predicates respectively that must give textual literal values. Only one definition per language is allowed
NOTE: Unlike the general directions for the use of SKOS (the SKOS "Primer"), labels in multiple languages should be indicated with skos:altLabel predicates, not all with skos:prefLabel, i.e. there should only ever be one skos:prefLabel value. If multiple definitions are given, the one in the language of the label is considered primary.
If a Collection's grouping of Concepts is derived from an existing resource that is different from the ConceptScheme it is defined within:
§ 2.2.2 The origins of a Collection, if different from its containing Concept Scheme, SHOULD be indicated by at least one of the following predicates: skos:historyNote, sdo:citation, prov:wasDerivedFrom. dcterms:source MAY be used instead of sdo:citation and dcterms:provenance MAY be used instead of skos:historyNote but the schema.org and SKOS predicates are preferred.
For compatibility with previous versions of this Specification, dcterms:provenance MAY be used instead of skos:historyNote but the latter is the preferred predicate.
To help list Collections within vocabularies:
§ 2.2.3 A Collection that exists within a vocabulary SHOULD indicate that it is within the vocabulary by use of the skos:inScheme predicate. If it is defined for the first time in the vocabulary, it should also indicate this with the rdfs:isDefinedBy predicate
To ensure that a Collection isn't empty:
§ 2.2.4 A Collection MUST indicate at least one Concept instance that is within the collection with use of the skos:member predicate. The Concept need not be defined by the Concept Scheme that defines the Collection
2.3 Concept
From the SKOS Primer [SKOS Primer]:
The fundamental element of the SKOS vocabulary is the concept. Concepts are the units of thought — ideas, meanings, or (categories of) objects and events—which underlie many knowledge organization systems
Vocabularies conforming to this profile must present at least one Concept within the vocabulary file and, as per requirements in Section 2.1, at least one Concept must be indicated as the top concept of the vocabulary.
To ensure that Concept instances are identifiable and their meaning isn't obscure or lost:
§ 2.3.1 Each Concept MUST have exactly one title and at least one definition indicated using the skos:prefLabel and the skos:definition predicates respectively that must give textual literal values. Only one definition per language is allowed
NOTE: Unlike the general directions for the use of SKOS (the SKOS "Primer"), labels in multiple languages should be indicated with skos:altLabel predicates, not all with skos:prefLabel, i.e. there should only ever be one skos:prefLabel value. If multiple definitions are given, the one in the language of the label is considered primary.
To ensure that every Concept is linked to the vocabulary that defines it:
§ 2.3.2 Each Concept in a vocabulary MAY indicate the vocabulary that defines it by use of the rdfs:isDefinedBy predicate indicating a Concept Scheme instance. If no such predicate is given, the Concept Scheme in the file that a Concept is provided in is understood to be the defining Concept Scheme
Note that the vocabulary that defines a Concept does not have to be the vocabulary in the file being validated. This is to allow for Concept instance reuse across multiple vocabularies.
Since a Concept may be used in more than one vocabulary:
§ 2.3.3 Each Concept in a vocabulary MUST indicate that it appears within that vocabulary's hierarchy of Concepts either directly by use of the skos:topConceptOf predicate indicating the vocabulary or indirectly by use of one or more skos:broader / skos:narrower predicates placing the Concept within a chain of other Concepts, the top concept of which uses the skos:topConceptOf predicate to indicate the vocabulary.
If a Concept is derived from an existing resource and that derivation is not already covered by source information for the vocabulary that it is within:
§ 2.3.4 The origins of a Concept, if different from its containing Concept Scheme, SHOULD be indicated by at least one of the following predicates: skos:historyNote, sdo:citation, dcterms:source, prov:wasDerivedFrom or dcterms:provenance.
If a Concept is based on another Semantic Web resource, such as another Concept or other defined object, prov:wasDerivedFrom should be used to indicate that resource's IRI. If the Concept is based on a resource that is identified by an IRI but which is not a Semantic Web resource, dcterms:source should be used to indicate the resource's IRI. If the vocabulary is based on something which cannot be identified by IRI, a statement about the vocabulary's origins should be given in a literal value indicated with the skos:historyNote predicate. If the vocabulary is not based on any other resource or source of information, i.e. this vocabulary is its only expression, this should be communicated by use of the skos:historyNote predicate indicating the phrase "This vocabulary is expressed for the first time here".
2.4 Agent
To be consistent with other Semantic Web representations of Agents, the Agents associated with a vocabulary, such as its creator & publisher, must be typed RDF values:
§ 2.4.1 Each Agent associated with a vocabulary MUST be typed as an sdo:Person or sdo:Organization
To ensure human readability and association of Agents with their non-Semantic Web (real world) form:
§ 2.4.2 Each Agent MUST give exactly one name with the sdo:name predicate indicating a literal text value
To ensure that Agents are linked to non-Semantic Web forms of identification:
§ 2.4.3 Each Agent MUST indicate either an sdo:url (for organizations) or an sdo:email (for people) predicate with a URL or email value
To link to Agent registers using non-Semantic Web identifiers for Agents:
§ 2.4.4 Each Agent SHOULD indicate any non-Semantic Web identifiers for Agents with the sdo:identifier predicate with literal identifier values, preferentially with custom data types that define the form of the identifier.
NOTE: This method of providing identifiers with specialised datatypes is the same as that specified for skos:notation values in the SKOS Primer.
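The mandatory Agent requirements (2.4.1 to 2.4.3) can be sketched as a single check. The example assumes an Agent's properties have been gathered into a dictionary from predicate names to lists of values; `check_agent` and the sample organisation are hypothetical, and the sketch does not verify that the URL or email values are well formed.

```python
AGENT_TYPES = {"sdo:Person", "sdo:Organization"}


def check_agent(agent):
    """agent: dict mapping predicate names to lists of values for one Agent."""
    problems = []
    # 2.4.1: typed as a Person or an Organization
    if not AGENT_TYPES & set(agent.get("rdf:type", [])):
        problems.append("Agent is not typed as sdo:Person or sdo:Organization")
    # 2.4.2: exactly one name
    names = agent.get("sdo:name", [])
    if len(names) != 1:
        problems.append(f"expected exactly one sdo:name, found {len(names)}")
    # 2.4.3: a URL or an email address
    if not agent.get("sdo:url") and not agent.get("sdo:email"):
        problems.append("expected an sdo:url or an sdo:email")
    return problems


org = {
    "rdf:type": ["sdo:Organization"],
    "sdo:name": ["Example Org"],
    "sdo:url": ["https://example.com"],
}
org_problems = check_agent(org)
```

The sample organisation passes all three checks; an Agent with no type, two names and no contact predicate would produce three problems.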
3. References
- PROF
- Rob Atkinson; Nicholas J. Car (eds.). The Profiles Vocabulary. 18 December 2019. W3C Working Group Note. URL: https://www.w3.org/TR/dx-prof/
- ISO 19135-1:2015
- International Organization for Standardization. ISO 19135-1:2015 - Geographic information - Procedures for item registration - Part 1: Fundamentals. 2015. ISO Standard. URL: https://www.iso.org/standard/54721.html
- OWL
- W3C OWL Working Group (eds.). OWL 2 Web Ontology Language Document Overview (Second Edition). 11 December 2012. W3C Recommendation. URL: https://www.w3.org/TR/owl2-overview/
- SHACL
- World Wide Web Consortium. Shapes Constraint Language (SHACL). 20 July 2017. W3C Recommendation. URL: https://www.w3.org/TR/shacl/
- SKOS
- Alistair Miles; Sean Bechhofer (eds.). SKOS Simple Knowledge Organization System Reference. 18 August 2009. W3C Recommendation. URL: https://www.w3.org/TR/skos-reference/
- SKOS Primer
- Antoine Isaac; Ed Summers (eds.). SKOS Simple Knowledge Organization System Primer. 18 August 2009. W3C Note. URL: https://www.w3.org/TR/skos-primer/
- Semantic Web
- World Wide Web Consortium. Semantic Web. 2015. Web Page. URL: https://www.w3.org/standards/semanticweb/, accessed 2020-06-14
- SPARQL
- World Wide Web Consortium. SPARQL 1.2 Query Language. 29 September 2023. W3C Working Draft. URL: https://www.w3.org/TR/sparql12-query/