Alistair Miles

Category: thesauri

Semantic Web Deployment Final Face-to-Face

The W3C Semantic Web Deployment Working Group is kicking off its final face-to-face meeting at the Library of Congress in Washington, D.C. The main purpose of the meeting is to resolve outstanding issues for the Simple Knowledge Organization System (SKOS), which are summarised on the meeting agenda.

As an aside, I heard recently about the deployment of the Library of Congress Subject Headings (LCSH) as linked data in the Web, using SKOS. This nice work provides a great backdrop to our meeting.

Request for Comments — SKOS Reference — W3C Working Draft 25 January 2008

The W3C Semantic Web Deployment Working Group has announced the publication of the SKOS Reference as a W3C First Public Working Draft:

This is a substantial update to and replacement for the previous SKOS Core Vocabulary Specification W3C Working Draft dated 2 November 2005. The publication has been announced in the W3C news, and a request for comments has been sent to various mailing lists.

The abstract from this new specification:

This document defines the Simple Knowledge Organization System (SKOS), a common data model for sharing and linking knowledge organization systems via the Semantic Web.

Many knowledge organization systems, such as thesauri, taxonomies, classification schemes and subject heading systems, share a similar structure, and are used in similar applications. SKOS captures much of this similarity and makes it explicit, to enable data and technology sharing across diverse applications.

The SKOS data model provides a standard, low-cost migration path for porting existing knowledge organization systems to the Semantic Web. SKOS also provides a light weight, intuitive language for developing and sharing new knowledge organization systems. It may be used on its own, or in combination with formal knowledge representation languages such as the Web Ontology language (OWL).

This document is the normative specification of the Simple Knowledge Organization System. It is intended for readers who are involved in the design and implementation of information systems, and who already have a good understanding of Semantic Web technology, especially RDF and OWL.

For an informative guide to using SKOS, see the upcoming SKOS Primer.


Using SKOS, conceptual resources can be identified using URIs, labeled with lexical strings in one or more natural languages, documented with various types of note, linked to each other and organized into informal hierarchies and association networks, aggregated into concept schemes, and mapped to conceptual resources in other schemes. In addition, labels can be related to each other, and conceptual resources can be grouped into labeled and/or ordered collections.
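To make the list above a little more concrete, here is a minimal sketch in Python (standard library only) that writes a tiny concept scheme as N-Triples. The property names come from the published SKOS vocabulary; the `example.org` and `example.com` URIs, the concepts and the helper functions are all made up for illustration, and the mapping property in particular is one of the areas still under discussion in the working group.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/animals/"      # hypothetical scheme namespace
OTHER = "http://example.com/zoology/"   # hypothetical second scheme


def uri(value):
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(text, lang):
    """Build a language-tagged literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}'


TRIPLES = [
    # Concepts are identified by URIs and aggregated into a concept scheme.
    (uri(EX + "scheme"), uri(RDF_TYPE), uri(SKOS + "ConceptScheme")),
    (uri(EX + "mammals"), uri(RDF_TYPE), uri(SKOS + "Concept")),
    (uri(EX + "mammals"), uri(SKOS + "inScheme"), uri(EX + "scheme")),
    (uri(EX + "cats"), uri(RDF_TYPE), uri(SKOS + "Concept")),
    (uri(EX + "cats"), uri(SKOS + "inScheme"), uri(EX + "scheme")),
    # Labels in more than one natural language.
    (uri(EX + "cats"), uri(SKOS + "prefLabel"), literal("cats", "en")),
    (uri(EX + "cats"), uri(SKOS + "prefLabel"), literal("chats", "fr")),
    (uri(EX + "cats"), uri(SKOS + "altLabel"), literal("house cats", "en")),
    # Documentation in the form of a note.
    (uri(EX + "cats"), uri(SKOS + "scopeNote"),
     literal("Domestic cats only.", "en")),
    # An informal hierarchy link.
    (uri(EX + "cats"), uri(SKOS + "broader"), uri(EX + "mammals")),
    # A mapping to a concept in another (hypothetical) scheme.
    (uri(EX + "cats"), uri(SKOS + "exactMatch"), uri(OTHER + "FelisCatus")),
]


def to_ntriples(triples):
    """Serialise (subject, predicate, object) strings, one triple per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


if __name__ == "__main__":
    print(to_ntriples(TRIPLES))
```

In practice you would probably use an RDF toolkit rather than string formatting; the point here is only to show how little is needed to state a concept, its labels and its links.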

SKOS and RDFa in e-Learning

The W3C’s Semantic Web Deployment Working Group is developing two new technologies which may be relevant to e-learning. These are the Simple Knowledge Organization System (SKOS), and RDFa.

SKOS is a lightweight language for representing intuitive, semi-formal conceptual structures. So, for example, the figure below (taken from the SKOS Core Guide) depicts concepts with intuitive hierarchical and associative relationships to other concepts, and with preferred and alternative labels in one (or more) languages — these are the kinds of structures that can be expressed using SKOS. Once expressed in this form, conceptual structures can easily be published on the Web, shared between applications, linked/mapped to other conceptual structures and so on. Typically, these conceptual structures are used as tools for navigating around complex or unfamiliar subject areas, for retrieving information across languages, and for bringing together related information from different sources.
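As a rough illustration of the navigation and cross-language retrieval uses mentioned above, the sketch below keeps a few concepts in a plain Python dictionary, finds a concept by any of its labels, follows the hierarchy downwards, and returns the preferred labels in another language. The concepts, labels and function names are invented for this example, and the dictionary layout is just a convenient stand-in for SKOS data, not a SKOS API.

```python
from collections import defaultdict

# Hypothetical concepts: preferred labels per language, alternative labels
# per language, and links to broader concepts.
CONCEPTS = {
    "animals": {"pref": {"en": "animals", "fr": "animaux"},
                "alt": {"en": ["fauna"]}, "broader": []},
    "mammals": {"pref": {"en": "mammals", "fr": "mammifères"},
                "alt": {}, "broader": ["animals"]},
    "cats": {"pref": {"en": "cats", "fr": "chats"},
             "alt": {"en": ["house cats"]}, "broader": ["mammals"]},
}


def find_concept(label, lang):
    """Return the id of the concept carrying this label, or None."""
    for concept_id, concept in CONCEPTS.items():
        labels = [concept["pref"].get(lang)] + concept["alt"].get(lang, [])
        if label in labels:
            return concept_id
    return None


def narrower_closure(concept_id):
    """Return every concept reachable by following broader links backwards."""
    children = defaultdict(list)
    for child_id, concept in CONCEPTS.items():
        for parent_id in concept["broader"]:
            children[parent_id].append(child_id)
    found, stack, seen = [], [concept_id], {concept_id}
    while stack:
        current = stack.pop()
        for child_id in children[current]:
            if child_id not in seen:
                seen.add(child_id)
                found.append(child_id)
                stack.append(child_id)
    return found


def expand_query(label, lang, target_lang):
    """Expand a label to its concept plus narrower concepts, relabelled."""
    concept_id = find_concept(label, lang)
    if concept_id is None:
        return []
    ids = [concept_id] + narrower_closure(concept_id)
    return [CONCEPTS[i]["pref"][target_lang]
            for i in ids if target_lang in CONCEPTS[i]["pref"]]


if __name__ == "__main__":
    print(expand_query("fauna", "en", "fr"))
```

Here a search on the English alternative label "fauna" is resolved to a concept, widened to everything beneath it, and answered with French preferred labels, which is the kind of behaviour a retrieval tool can build on top of these structures.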

RDFa is a language for embedding richly structured data and metadata within Web pages. This allows a Web page to expose much of its underlying meaning to applications, enabling a range of new functions within Web clients and the exchange of data between Web sites, services, and users’ desktop applications. For example, a Web page about a new music album can use RDFa to embed structured data expressing facts about that album, such as the track listing, the artist, and links to sample media files. A Web browser with a suitable plugin or extension can use this data to offer new functions to the user, such as “download the track listing with available samples to my music library” or “compare prices from online vendors”.
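The album example can be sketched roughly as follows. The first function renders a page fragment carrying RDFa `about` and `property` attributes with Dublin Core terms; the small parser then plays the role of the browser extension and reads the facts back out. The album URI, the choice of Dublin Core properties and all names in the code are assumptions for illustration, and the extractor handles only these two attributes on well-nested markup, so it is in no way a conforming RDFa processor.

```python
from html import escape
from html.parser import HTMLParser

DC = "http://purl.org/dc/elements/1.1/"  # Dublin Core element namespace


def render_album(album_uri, title, artist, tracks):
    """Render an album description as an XHTML fragment with RDFa attributes."""
    rows = "\n".join(
        f'    <li about="{escape(album_uri)}#track{n}" '
        f'property="dc:title">{escape(name)}</li>'
        for n, name in enumerate(tracks, start=1)
    )
    return (
        f'<div xmlns:dc="{DC}" about="{escape(album_uri)}">\n'
        f'  <h1 property="dc:title">{escape(title)}</h1>\n'
        f'  <p>by <span property="dc:creator">{escape(artist)}</span></p>\n'
        f'  <ol>\n{rows}\n  </ol>\n'
        f'</div>'
    )


class SimpleRdfaExtractor(HTMLParser):
    """Collect (subject, property, text) facts from about/property attributes.

    Simplification: every start tag must have a matching end tag, so void
    elements such as <br> without a closing slash are not supported.
    """

    def __init__(self):
        super().__init__()
        self.subjects = [None]  # current subject, inherited by child elements
        self.pending = []       # one entry per open element
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        self.subjects.append(attrs.get("about", self.subjects[-1]))
        prop = attrs.get("property")
        self.pending.append([prop, []] if prop else None)

    def handle_data(self, data):
        # Text belongs to every open element that declared a property.
        for entry in self.pending:
            if entry is not None:
                entry[1].append(data)

    def handle_endtag(self, tag):
        entry = self.pending.pop()
        subject = self.subjects.pop()
        if entry is not None:
            self.triples.append((subject, entry[0], "".join(entry[1]).strip()))


def extract(markup):
    """Parse a fragment and return the facts found in it."""
    parser = SimpleRdfaExtractor()
    parser.feed(markup)
    parser.close()
    return parser.triples


if __name__ == "__main__":
    page = render_album("http://example.org/albums/42", "Example Album",
                        "Example Artist", ["First Song", "Second Song"])
    print(page)
    for triple in extract(page):
        print(triple)
```

The human-readable page and the machine-readable facts live in the same markup, so a client that understands the attributes can build a track listing for a music library without screen-scraping.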

Both of these technologies are on the W3C Recommendation track, and are scheduled for completion in April 2008.


The Value Grid for Semantic Technologies

I’ve submitted a paper entitled “The Value Grid for Semantic Technologies” to the workshop on Issues in Ontology Development and Use to be held as part of the UK e-Science All Hands Meeting later this year. The paper is available for download from the following URL:


This paper situates formal ontologies as one of many products in a multi-tier value grid of semantic technologies. Incremental strategies for the exploitation of intermediate products in the value grid are discussed, as a possible step towards cost-effective, low-risk and scalable business models for the exploitation of semantic technologies. A case study is presented, illustrating a hypothetical value grid for the management of scientific data from a large-scale experimental facility. Suggestions are made for the design of predictable, repeatable collaborative processes for adding value in semantic technology value grids.

A Thesaurus Data Model for British Standard 8723 (Part 2)

Continuing on from my initial exploration of using UML to capture the monolingual thesaurus data model described in BS 8723 part 2 (written up here), below is an alternative UML model attempting to represent the underlying conceptual structure of a monolingual thesaurus. This model is more complicated, so I’ve broken it into separate class diagrams for easier viewing …


A Thesaurus Data Model for British Standard 8723

The working group producing the new BS 8723 standard for thesauri (structured vocabularies) is currently focusing on the issue of standard formats for interchange of thesaurus data. At a recent meeting it was concluded that a (semi-)formal data model for thesaurus data, using some sort of established modeling language, would be a good starting point.

Here is my first attempt to use UML to capture the data model expressed informally as prose in BS 8723 part 2 (monolingual thesauri). The UML was generated using StarUML, which is free, and I read this tutorial on UML along the way. I’ve tried to be as faithful to BS 8723 part 2 as possible, capturing no more than what is expressed therein and adding no interpretation of my own …
