Australian Government Linked Data Working Group

Showcase

The AGLDWG aims to communicate the benefits and technical aspects of Linked Data use in government. Here are Linked Data systems and datasets already implemented by Australian government agencies, as well as presentations by group members.

Linked Data Catalogues

Linked Data resources for which this group has allocated persistent identifiers are:

This catalogue lists only operational Linked Data resources, not those that have identifiers allocated as placeholders only. The catalogue itself is a Linked Data system and can be navigated via Linked Data link hopping.

Example Resources

Below are listed some:

Vocabularies

Much of Linked Data relies on definitions. Indeed, the Semantic Web, which Linked Data is helping to build, is predicated on strong definitions for web resources. Vocabularies, standardised in their structure and delivery according to Linked Data and Semantic Web principles, provide online, look-up-able definitions for things, which can be used much more easily and powerfully than older vocabulary tools such as (paper) dictionaries, tables on web pages or XML code lists.

Here are Linked Data vocabularies and also an Australian national catalogue of research vocabularies:

Ontologies

Ontologies are data models that express knowledge within a domain and are often more complex than vocabularies, although vocabularies themselves are a form of ontology.

A great number of foundational, or fundamental, ontologies have been produced to cater for such broadly required concepts as time (TIME ontology), simple authoring information (Dublin Core) and tracing changes to things over time (PROV-O, the provenance ontology). This Linked Data WG has produced an ontology to define properties for datasets within the data.gov.au catalogue.

Datasets

Here are some examples of Linked Data datasets. Yes, the list is small now but we will be adding to it very soon!

Systems & Tools

There are many different systems that can claim to be "Linked Data Systems": really anything that helps supply Linked Data. Some of them are dedicated to Linked Data, such as RDF triplestores and Linked Data APIs. Others facilitate Linked Data along with other functions, such as general website content management.

Below are a few examples of Linked Data systems currently in operation within Australian government.

Australian Governments' Interactive Functions Thesaurus (AGIFT)

AGIFT website screenshot

AGIFT is a vocabulary delivered by the National Archives of Australia that lists functions performed by government. The web page delivering AGIFT is a system that allows for both human-readable and machine-readable versions of the vocabulary, formalised using the SKOS ontology.

The system used is the commercial PoolParty product.

Australian National Data Service's Research Vocabularies Australia

The ANDS provides a vocabulary hosting service for Australian government and academic users.

ANDS' Research Vocabularies Australia portal

A search for the word 'rock' yields both individual terms ("Concepts") within vocabularies about rocks and whole vocabularies about them.

AGLDWG PID URI Service

One of the core tasks of Linked Data is to uniquely and usefully identify resources - information items - on the web. This is usually done with URIs, which are just a slight extension of web page URLs that allows non-web-page things to be linked to, e.g. vocabulary terms in machine-to-machine data formats.

The Linked Data WG previously used an advanced web proxy, the PID Service, but has recently migrated its URL redirection to URL rewrite rules in Apache. This migration lowers complexity and maintenance overhead.

The PID URI Service is used within the data.gov.au domain to manage PIDs made with a series of subdomains, such as environment.data.gov.au, reference.data.gov.au and others that accord with the AGLDWG's URI Guidelines, which indicate how to supply PIDs for use across government (tip: use PIDs associated with government functions, not organisations, as functions don't change, organisations do).

Apache Rewrite Rules Example

Agencies with datasets in the environment domain of government functions, such as the Bureau of Meteorology, can put Linked Data datasets online and use the proxying functions of this PID URI Service to create persistent identifiers for the datasets and their subcomponents which resolve to them, regardless of where and how they are implemented under the hood.

The PID URI http://environment.data.gov.au/def/op resolves to CSIRO's "Observable Properties" ontology, which is a Linked Data resource about environmental properties. It's hosted on a CSIRO system but the PID makes it accessible via a nice, ordered URI that won't change, even if CSIRO changes things (it can be redirected).
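The redirection idea described above can be sketched in a few lines of Python. This is only an illustration of prefix-based PID resolution, not the Apache rewrite configuration the WG actually uses. The `REDIRECTS` table and the `resolve_pid` function are hypothetical, and the target address is a placeholder rather than the real CSIRO location.

```python
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical redirect table: (PID host, persistent path prefix) -> current location.
# The target below is a placeholder, not the real CSIRO hosting address.
REDIRECTS = {
    ("environment.data.gov.au", "/def/op"): "https://example.org/csiro/observable-properties",
}


def resolve_pid(pid_uri: str) -> Optional[str]:
    """Return the current location for a PID URI, or None if no rule matches."""
    parts = urlsplit(pid_uri)
    path = parts.path.rstrip("/")
    # Try the longest prefix first so that more specific rules win.
    rules = sorted(REDIRECTS.items(), key=lambda rule: -len(rule[0][1]))
    for (host, prefix), target in rules:
        if parts.netloc != host:
            continue
        if path == prefix or path.startswith(prefix + "/"):
            # Carry any sub-path over to the target so subcomponents resolve too.
            return target + path[len(prefix):]
    return None
```

If the hosting location changes, only the right-hand side of the table needs updating; the PID URI that users cite stays the same.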

Geoscience Australia's Samples Register

Geoscience Australia's Samples Register delivers metadata for physical samples stored in its repositories - internal databases. Multiple 'views' and 'formats' of samples' metadata are available, including general-purpose Dublin Core metadata represented in RDF and more specialised metadata according to sample-specific schemas, such as the W3C Spatial Data on the Web SOSA ontology.

The full catalogue (register) of all samples is available at http://pid.geoscience.gov.au/sample/ and W3C Data on the Web Best Practices are followed to allow for navigating the 2M samples.

GA's Samples Register

The GeoSPARQL Extensions Ontology

The GeoSPARQL Ontology, which is widely used for spatial data on the Web and which powers GeoSPARQL queries, has been extended by members of this WG to include some properties and other elements found to be needed for Australian spatial Linked Data projects, particularly the Location Index.

Simple properties for basic values of area as well as complete property chain axioms have been added.

http://linked.data.gov.au/def/geox

GeoSPARQL Extensions ont overview image

The Australian Linked Data Cache


A cache of "Australian" Linked Data (i.e. LD attempted to be sourced from Australia only, but this is hard) is being worked on by University of Canberra students and Geoscience Australia.

This dataset will be presented here in August, 2017.

ACORN-SAT

"Experimental Environmental Linked-data published by the Bureau of Meteorology"

ACORN-SAT homepage

The Bureau of Meteorology (BOM), in collaboration with the National Plan for Environmental Information Initiative, the Australian Government Information Management Office (AGIMO) and the Information Engineering Laboratory of the CSIRO, is providing experimental resources for Linked Data under lab.environment.data.gov.au. The resources published under this domain make data available in a Linked Data fashion and illustrate some of the capabilities that can be developed with Linked Data.

Currently, the following environmental data sets are available as Linked Data:

See http://lab.environment.data.gov.au for more info.

The Plot Ontology documentation

TERN's Plot ontology

The Terrestrial Ecosystems Research Network (TERN) commissioned an OWL ontology to "provide a set of classes to support capture of plot- and site-based ecological survey data", with the result being the Plot Ontology.

The ontology is an extension to the W3C SSN/SOSA vocabulary and thus all data characterised using the Plot Ontology is compatible, at least in structure, with other SSN/SOSA data.

GA's Public Data Model Ontology

The GA PDM

Geoscience Australia is moving to present all of its public online resources in accordance with a single ontology: the GA Public Data Model Ontology.

The ontology describes how GA's datasets, services, vocabularies, vocab terms, licenses, samples and all other resources online represented in Linked Data are linked. Using the ontology you can see that a Service operatesOn Dataset(s) and that the cardinality is 1+, i.e. every GA Service will indicate at least one Dataset.
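As a rough illustration of what the 1+ cardinality means in practice, the sketch below checks a small set of triples for Services that do not indicate any Dataset. The triples, the `ex:` identifiers and the `services_missing_dataset` function are hypothetical; only the `operatesOn` relationship between Service and Dataset comes from the ontology as described above.

```python
from collections import defaultdict

# Hypothetical triples (subject, predicate, object); the ex: identifiers are placeholders.
TRIPLES = [
    ("ex:service-1", "rdf:type", "Service"),
    ("ex:service-1", "operatesOn", "ex:dataset-a"),
    ("ex:service-2", "rdf:type", "Service"),
    ("ex:dataset-a", "rdf:type", "Dataset"),
]


def services_missing_dataset(triples):
    """Return the Services that violate the 'at least one Dataset' cardinality."""
    services = {s for s, p, o in triples if p == "rdf:type" and o == "Service"}
    datasets_by_service = defaultdict(set)
    for s, p, o in triples:
        if p == "operatesOn":
            datasets_by_service[s].add(o)
    # A Service conforms only if it operates on one or more Datasets.
    return sorted(s for s in services if not datasets_by_service.get(s))


print(services_missing_dataset(TRIPLES))
```

Here `ex:service-2` is reported because it has no `operatesOn` link, while `ex:service-1` conforms.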

The ontology provides semantics beyond those that can be provided by a single legacy catalogue tool.

Machine Readable Australian Curriculum

"On behalf of ACARA, Education Services Australia publishes a machine readable version of the Australian Curriculum. The Australian Curriculum is published in machine readable form, using the Resource Description Framework (RDF). This uses Semantic Web technologies for an extensible encoding of metadata about the curriculum, expressed through relations between URIs."

The Australian Government Records Interoperability Framework (AGRIF) ontology

The Australian Government Records Interoperability Framework (AGRIF) is a system of related semantic ontologies that describe the structure, functions, and activities of the Australian Government, providing sufficient context for the effective use – including but not limited to management – of Australian Government information assets. It complies with the World Wide Web Consortium’s Web Ontology Language (OWL2) Recommendation and makes reference to other Recommendations and existing domain ontologies for archival and preservation processes.

This ontology is expected to form one of the pillars of Semantic Web interoperability between Australian government organisations.

The AGRIF Ontology's Record class

Python's Live OWL Documentation Environment

pyLODE is a development of the well-known LODE ontology documentation tool that's not available online any more.

pyLODE is written in Python and uses templating to deliver either HTML or Markdown documentation for ontologies by interpreting the ontology and its metadata according to a series of display rules.

Many of the ontologies published by the AGLDWG are documented using pyLODE.

See the code repository at https://github.com/rdflib/pyLODE/

Note that it is available for use either as a Python package or as a command line application.

Registry Status Ontology

This vocabulary is a re-published version of the Registry Ontology's Status vocabulary (online in RDF). This re-publication was performed to allow for the URIs of each vocab term (skos:Concept) to resolve to both human-readable and machine-readable forms (HTML and RDF, respectively) using HTTP content negotiation.
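A minimal sketch of the content negotiation step follows, assuming a server that can return either HTML or one RDF serialisation. The `choose_media_type` function is hypothetical, the use of `text/turtle` as the RDF format is an assumption, and the parsing is simplified (it handles `q` values and `*/*` but not partial wildcards such as `text/*`).

```python
def choose_media_type(accept_header, available=("text/html", "text/turtle")):
    """Pick the best available media type for an HTTP Accept header value.

    Falls back to the first available type when nothing acceptable is found.
    """
    preferences = []
    for item in accept_header.split(","):
        fields = [field.strip() for field in item.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        quality = 1.0  # default weight when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        preferences.append((quality, media_type))
    # Highest weight first; the sort is stable, so header order breaks ties.
    preferences.sort(key=lambda pref: -pref[0])
    for quality, media_type in preferences:
        if quality <= 0:
            continue
        if media_type in available:
            return media_type
        if media_type == "*/*":
            return available[0]
    return available[0]
```

A browser sending `text/html,application/xhtml+xml;q=0.9,*/*;q=0.8` would receive the HTML form, while a client sending `text/turtle` would receive RDF from the same URI.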

Note that just like the original form of this vocabulary, while it is a SKOS vocabulary implemented as a single Concept Scheme, it is also an OWL Ontology and that each Status is both a skos:Concept and a reg:Status.

This vocabulary was the first vocab published using the AGLDWG's PID URI domain of linked.data.gov.au.

http://linked.data.gov.au/def/reg-status

pyLDAPI

A very small Python code module to add Linked Data API functionality to a Python Flask installation.

This module contains only a single Python file with a few static methods and classes that are intended to be added to a Flask API in order to add a series of extra functions to endpoints that the API delivers. It will also require the addition of one API endpoint - a ‘Register of Registers’ - if it is not already present.

An API using this module will get:

  • an alternates view for each Register and Object that the API delivers
    • if the API declares the appropriate model views for each item
  • a Register of Registers
    • a start-up function that auto-generates a Register of Registers is run when the API is launched
  • a basic, over-writeable, template for Registers’ HTML & RDF

pyLDAPI on PyPI

Linked Data GNAF

A Linked Data (OWL/RDF + HTML) version of the Geocoded National Address File (G-NAF) which is a database of all Australian street addresses and their "geo-codes" (coordinate location points).

The dataset is delivered according to Linked Data principles by use of the pyLDAPI tool, which ensures it also conforms to Content Negotiation by Profile, an emerging Linked Data standard that allows for standard ways to request different profiles of data. This dataset supports a couple of different profiles.
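The sketch below shows one way a client's profile request might be resolved on the server side. It is an illustration only: the `_profile` query parameter name, the profile tokens `gnaf` and `dcat`, and the `requested_profile` function are all assumptions made for this example, not a description of how this dataset's API is implemented.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical profile tokens offered for a resource; the first is the default.
SUPPORTED_PROFILES = ("gnaf", "dcat")


def requested_profile(url, supported=SUPPORTED_PROFILES, param="_profile"):
    """Return the first supported profile token requested in the URL's query string.

    Falls back to the default (first supported) profile when none is requested
    or when none of the requested tokens is supported.
    """
    query = parse_qs(urlsplit(url).query)
    for value in query.get(param, []):
        # Allow a comma-separated preference list, most preferred first.
        for token in value.split(","):
            token = token.strip()
            if token in supported:
                return token
    return supported[0]
```

With these assumptions, a request for `http://example.org/address/1?_profile=dcat` would be served the `dcat` profile, and a request with no `_profile` argument would get the default.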

Metadata for the dataset as a whole is also available according to the recently updated DCAT vocabulary for describing datasets.

GNAF API screenshot