SKOS and RDFa in e-Learning
by Alistair Miles
SKOS is a lightweight language for representing intuitive, semi-formal conceptual structures. So, for example, the figure below (taken from the SKOS Core Guide) depicts concepts with intuitive hierarchical and associative relationships to other concepts, and with preferred and alternative labels in one (or more) languages — these are the kinds of structures that can be expressed using SKOS. Once expressed in this form, conceptual structures can easily be published on the Web, shared between applications, linked/mapped to other conceptual structures and so on. Typically, these conceptual structures are used as tools for navigating around complex or unfamiliar subject areas, for retrieving information across languages, and for bringing together related information from different sources.
RDFa is a language for embedding richly structured data and metadata within Web pages. This allows a Web page to expose much of its underlying meaning to applications, enabling a range of new functionality within Web clients and allowing data to be exchanged between Web sites, services, and users’ desktop applications. For example, a Web page about a new music album can use RDFa to embed structured data expressing facts about that album, such as the track listing, artist, links to sample media files etc. A Web browser with a suitable plugin or extension can use this data to offer new functions to the user, such as “download the track listing with available samples to my music library” or “compare prices from online vendors”.
Both of these technologies are on the W3C Recommendation track, and are scheduled for completion in April 2008.
SKOS Foundations and Motivations
SKOS inherits much from the development of knowledge organisation systems (KOS) within the library and information sciences. Thesauri, classification schemes, subject heading systems and taxonomies are all examples of KOS widely used in information systems today.
The original motivation for SKOS was to provide a standard, low-cost way of migrating or “porting” existing KOS, especially thesauri, to the Semantic Web, so that they could be used as-is for the development of lightweight Semantic Web applications such as search/browse Web portals. This remains one of the central requirements for current development of SKOS. However, it’s worth noting that SKOS is also increasingly seen as a “bridging” technology, providing the missing link between the rigorous logical formalism of ontology languages such as OWL and the chaotic, informal and weakly-structured world of social approaches to information management, as exemplified by social tagging applications.
As such, SKOS is a very interesting technology to work on, because in the same thought you have to consider the model-theoretic semantics of RDF and OWL (and their consequences for formal reasoning), the ways in which people naturally express and organise their own conceptualisations (especially when working in collaboration), and the potential for computational processes to “spot” and analyse emergent patterns in networks of unstructured information. The future for knowledge organisation is undoubtedly in highly collaborative, intuitive, and computer-aided environments, where people interact in a natural but structured way, being guided (perhaps unwittingly) towards the creation of emergent structures, supported by and feeding back into a range of analytic systems working behind the scenes to mine, discover and exploit patterns in information. SKOS is a small but important part of this bigger picture, as much for the work it will lead to in the future as for the applications it can enable today.
The basic building block in SKOS is the notion of a conceptual resource, or often simply “concept”. Concepts can be labelled in one or more languages, can be annotated with various types of documentation, can be arranged into intuitive hierarchies and association networks, and can be aggregated into concept schemes and linked/mapped to concepts in other schemes.
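To make these building blocks concrete, here is a minimal Python sketch of a concept with labels, a hierarchical link and an associative link, flattened into subject/predicate/object triples. The `Concept` class, the `example.org` URIs and the sample labels are hypothetical and chosen only for illustration; the property names (`prefLabel`, `altLabel`, `broader`, `related`) are taken from the SKOS vocabulary. It is a toy model, not a replacement for a real RDF library.

```python
from dataclasses import dataclass, field

# Namespace of the SKOS vocabulary.
SKOS = "http://www.w3.org/2004/02/skos/core#"


@dataclass
class Concept:
    """Hypothetical in-memory stand-in for a skos:Concept."""
    uri: str
    pref_labels: dict = field(default_factory=dict)  # language tag -> label
    alt_labels: dict = field(default_factory=dict)   # language tag -> list of labels
    broader: list = field(default_factory=list)      # URIs of broader concepts
    related: list = field(default_factory=list)      # URIs of associated concepts

    def triples(self):
        """Yield (subject, predicate, object) tuples describing this concept."""
        yield (self.uri, "rdf:type", SKOS + "Concept")
        for lang, label in sorted(self.pref_labels.items()):
            yield (self.uri, SKOS + "prefLabel", f'"{label}"@{lang}')
        for lang, labels in sorted(self.alt_labels.items()):
            for label in labels:
                yield (self.uri, SKOS + "altLabel", f'"{label}"@{lang}')
        for target in self.broader:
            yield (self.uri, SKOS + "broader", target)
        for target in self.related:
            yield (self.uri, SKOS + "related", target)


# Illustrative data only: two made-up concepts in a made-up scheme.
animals = Concept(
    "http://example.org/concepts/animals",
    pref_labels={"en": "animals", "fr": "animaux"},
)
cats = Concept(
    "http://example.org/concepts/cats",
    pref_labels={"en": "cats"},
    alt_labels={"en": ["domestic cats"]},
    broader=[animals.uri],
)

concept_triples = list(cats.triples())
for triple in concept_triples:
    print(triple)
```

In practice one would probably hand these statements to an RDF toolkit rather than build strings by hand; the point here is only that a SKOS concept reduces to a small set of simple statements.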
All of these SKOS primitive features can be extended or refined to support more detailed, fine-grained conceptual models. SKOS can also be used in part or as a whole in a “mix-and-match” with other RDF vocabularies and OWL ontologies. A concrete example of this is the Semantically Interlinked Online Communities (SIOC) ontology, where SKOS can be “plugged in” to describe the topics or tags defined on a community Web site.
SKOS is built on the Resource Description Framework (RDF) and the Web Ontology Language (OWL). However, SKOS deliberately hides much of the complexity of these two languages. It provides an interface between the formal underpinnings of the Semantic Web, and the more informal, intuitive ways in which people naturally express and organise knowledge. Thus, informal and semi-formal conceptual structures or knowledge organisation systems can be expressed directly in SKOS and used immediately in Semantic Web applications, without requiring any formal re-engineering.
While these topics are still under debate, consensus is emerging that the formal semantics of SKOS are, by design, very limited. Therefore, only a small number of logical consequences follow from using SKOS alone, compared with the larger number that follow from using RDFS or, especially, OWL directly. In some situations a powerful set of logical entailments is very valuable; in others it can be inappropriate and/or unnecessary. This is typically the case where a formal language such as OWL is misused to express what is at best a semi-formal conceptualisation (e.g. a “concept map” or thesaurus), and a number of surprising and inappropriate inferences then follow. SKOS provides the option to model at a simpler, less formal level, which can then serve as a starting point for further formalisation as required.
SKOS is itself an RDF vocabulary (i.e. a set of URIs), whose semantics is defined using the RDF Vocabulary Description Language (RDF Schema) and OWL. SKOS can therefore be used as a lightweight conceptual modeling language in its own right, or can be used as an adjunct to the primitives provided by RDF, RDFS and OWL in a “mixed mode” modeling environment.
An example of where SKOS, RDFS and OWL are used in “mixed mode” is the Semantic Web Environmental Directory (SWED), a prototype Web portal with “faceted browsing” functionality for finding UK organisations and projects in the environment sector. Here, RDFS and OWL are used to model projects, organisations and their properties, such as topic of interest, geographical coverage and so on. SKOS is then used to model the semi-formal taxonomies which provide the descriptive vocabulary for these properties, e.g. animal welfare, welfare of captive animals, biodiversity etc. Of course, this is a Semantic Web application, so information can be drawn together and integrated from many different sources.
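As a rough illustration of how a taxonomy can drive faceted browsing, the Python sketch below selects records tagged with a chosen topic or with any topic beneath it in the hierarchy. This is not how SWED is implemented. The organisation names, the top-level “environment” concept and the arrangement of the topics into a hierarchy are all assumptions made for the example, as is the decision to follow hierarchical links transitively, which is a choice made by this application rather than something the sketch claims SKOS itself requires.

```python
# Hypothetical taxonomy: narrower concept -> broader concept (cf. skos:broader).
BROADER = {
    "welfare of captive animals": "animal welfare",
    "animal welfare": "environment",
    "biodiversity": "environment",
}

# Hypothetical records, each tagged with a single topic of interest.
ORGANISATIONS = {
    "Zoo Watch": "welfare of captive animals",
    "Farm Care": "animal welfare",
    "Wild Meadows Trust": "biodiversity",
}


def narrower_closure(concept, broader=BROADER):
    """Return the concept plus every concept reachable below it."""
    selected = {concept}
    changed = True
    while changed:
        changed = False
        for child, parent in broader.items():
            if parent in selected and child not in selected:
                selected.add(child)
                changed = True
    return selected


def browse_facet(concept):
    """List organisations whose topic is at or below the chosen facet value."""
    topics = narrower_closure(concept)
    return sorted(name for name, topic in ORGANISATIONS.items() if topic in topics)


print(browse_facet("animal welfare"))
print(browse_facet("environment"))
```

Selecting “animal welfare” here also returns the organisation tagged with “welfare of captive animals”, which is the kind of behaviour a user browsing by topic would usually expect.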
SKOS is formally developed and maintained by the W3C Semantic Web Deployment Working Group (SWDWG). It is a work item on the W3C Recommendation track, which means it is subject to the full W3C Web standardisation process. However, whilst formal responsibility for SKOS rests with the SWDWG, the working group carries out all development in an open, consensus-led environment, and is ably supported by an extended community of interest. Informal participation in the SKOS development process is warmly welcomed — to join in, subscribe to the email@example.com mailing list (you can also browse the mailing lists’ online archives). To participate formally in the development of SKOS, contact your W3C Advisory Committee representative about joining the SWDWG.
From the abstract to the latest version of the RDFa Primer:
Current Web pages, written in XHTML, contain inherent structured data: calendar events, contact information, photo captions, song titles, copyright licensing information, etc. When authors and publishers can express this data precisely, and when tools can read it robustly, a new world of user functionality becomes available, letting users transfer structured data between applications and Web sites. An event on a Web page can be directly imported into a desktop calendar. A license on a document can be detected to inform the user of his rights automatically. A photo’s creator, camera setting information, resolution, and topic can be published as easily as the original photo itself.
RDFa is simply a collection of XML attributes that can be used within an XHTML document to embed structured data within that document. This data can then be extracted by an RDFa parser as a set of RDF triples.
So, for example, the following snippet of XHTML has embedded data describing a person’s contact information. The data has been embedded using the special RDFa attributes, here about, property and rel:
<p class="contactinfo" about="http://example.org/staff/jo">
  <span property="contact:fn">Jo Smith</span>.
  <span property="contact:title">Web hacker</span>
  at
  <a rel="contact:org" href="http://example.org">
    Example.org
  </a>.
  You can contact me
  <a rel="contact:email" href="mailto:firstname.lastname@example.org">
    via email
  </a>.
</p>
This snippet, when parsed, yields the following RDF triples:
<http://example.org/staff/jo>
    contact:fn "Jo Smith";
    contact:title "Web hacker";
    contact:org <http://example.org>;
    contact:email <mailto:firstname.lastname@example.org> .
Given this data, a Web client could, for example, offer functions to import this contact information into a desktop contact management system.
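To give a feel for what such a client has to do, here is a deliberately tiny Python sketch that pulls triples out of the contact snippet using only the standard library's `html.parser`. The `TinyRDFaExtractor` class is hypothetical and is not a conforming RDFa parser: it understands only the about, property, rel and href attributes as used above, never resets the subject when an element closes, does not expand prefixes such as `contact:`, and assumes that elements carrying property contain plain text only.

```python
from html.parser import HTMLParser


class TinyRDFaExtractor(HTMLParser):
    """Illustrative only: handles about/property/rel/href and nothing else."""

    def __init__(self):
        super().__init__()
        self.triples = []      # collected (subject, predicate, object) tuples
        self.subject = None    # value of the most recent about attribute
        self._pending = None   # predicate still waiting for its text content
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "about" in attrs:
            self.subject = attrs["about"]
        if "rel" in attrs and "href" in attrs:
            # rel + href: the object is a resource identified by a URI.
            self.triples.append((self.subject, attrs["rel"], "<%s>" % attrs["href"]))
        elif "property" in attrs:
            # property: the object is the element's text content (a literal).
            self._pending = attrs["property"]
            self._text = []

    def handle_data(self, data):
        if self._pending is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._pending is not None:
            # Normalise whitespace, then emit the literal.
            literal = " ".join("".join(self._text).split())
            self.triples.append((self.subject, self._pending, '"%s"' % literal))
            self._pending = None


snippet = """
<p class="contactinfo" about="http://example.org/staff/jo">
  <span property="contact:fn">Jo Smith</span>.
  <span property="contact:title">Web hacker</span> at
  <a rel="contact:org" href="http://example.org">Example.org</a>.
  You can contact me
  <a rel="contact:email" href="mailto:firstname.lastname@example.org">via email</a>.
</p>
"""

parser = TinyRDFaExtractor()
parser.feed(snippet)
parser.close()
triples = parser.triples
for triple in triples:
    print(triple)
```

A real client would rely on a full RDFa implementation; the sketch only shows that the attributes carry enough information to rebuild the four statements about Jo Smith without any knowledge of the page's visual layout.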
This is a very simple example, but hopefully it illustrates the general principle that, once data is available in Web pages, new functionality becomes possible. In the e-learning sector, we might imagine Web pages which not only describe historical events, but encode data about the time, place and people involved in those events. A history student might then “cut and paste” these data from many different Web pages into their own virtual learning space, allowing them to discover and explore the many-dimensional relationships between people, places and events and build up their own structured “mini-history” which is specific to a particular learning objective or research question.
RDFa and Microformats
There’s a lively debate ongoing on the Web today about the relationship between RDFa and an analogous technology called “microformats”. Microformats have the same objective as RDFa, of embedding data within Web pages. While I’m not in a position to comment in any depth on the comparisons and relative merits of these two approaches, RDFa developers argue that the RDFa approach provides a more scalable, general purpose approach, which requires only a single implementation, and which allows different types of data to “play well” with each other. On the other hand, a bespoke transformation is required for each different microformat, and microformats could easily “clash” with each other under certain circumstances. Having said that, microformats are used on the Web today, whereas RDFa has only a number of prototype implementations. A blog post by Evan Prodromou discusses the issue in more detail, although some of the information there may be out of date (see e.g. the comments at the end).
SKOS provides a lightweight technology for overlaying distributed learning content with intuitive conceptual structures, which could for example be used to aid discovery and navigation of learning resources. Conceptual structures are also themselves learning resources in their own right, and although SKOS is oriented towards information retrieval applications, the use of SKOS to express, evolve, exchange and publish “knowledge” as part of a learning process remains an intriguing avenue for exploration.
RDFa provides a general purpose technology for embedding data in Web pages. This in turn can enable a richer experience when interacting with the Web, moving beyond Web pages as unbreakable chunks of information towards Web pages as highly flexible containers of data which can be extracted, adapted, re-used and re-purposed. This technology has the potential to completely change our interaction with the Web as a learning experience.
Both technologies would greatly benefit from active participation from the e-learning community, to ensure that the requirements of near-future learning technologies are met by SKOS and RDFa within the timeline set for standardisation.