Re: structured bibliographic info for BioMed Central articles now available as RDF from Danny Ayers on 2005-09-15 (stdin)

From: Danny Ayers <danny.ayers_at_gmail.com>
Date: Thu, 15 Sep 2005 11:11:00 +0200

On 9/14/05, Leigh Dodds <ldodds_at_ingenta.com> wrote:
> Matthew Cockerill wrote:
>
> > And embedding the RDF as a comment is already in very common use for
> > CC metadata - so shouldn't Piggy Bank really support the
> > identification of islands of RDF in comments like this?
>
> It's a hack though, as you basically have to regex through the comments
> to attempt to find some content.

True, comments aren't designed for machine-processing.
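
To make the "regex through the comments" hack concrete, here is a rough Python sketch. The page fragment, the example.org URI and the function name are made up for illustration, and the pattern assumes the island uses the literal rdf: prefix, which is exactly the kind of guesswork that makes this approach fragile.

```python
import re

# Hypothetical page fragment: RDF/XML hidden in an HTML comment, CC-style.
PAGE = """<html><body>
<!-- just a note -->
<!--
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://example.org/article/1">
    <dc:title>An Example Article</dc:title>
  </rdf:Description>
</rdf:RDF>
-->
</body></html>"""

# Any HTML comment, then anything inside it that looks like an RDF island.
COMMENT = re.compile(r"<!--(.*?)-->", re.DOTALL)
RDF_ISLAND = re.compile(r"<rdf:RDF\b.*?</rdf:RDF>", re.DOTALL)


def rdf_islands(html):
    """Return every RDF/XML block found inside an HTML comment."""
    islands = []
    for comment in COMMENT.findall(html):
        islands.extend(RDF_ISLAND.findall(comment))
    return islands


islands = rdf_islands(PAGE)
```

The extracted strings would still need to go through a real RDF/XML parser; the regex only guesses where the island starts and ends.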

> Ideally if the pages were XHTML you'd
> just embed the triples, but then browsers would fall over...

Not necessarily - don't forget GRDDL [1]. Triples can be expressed in
ordinary XHTML, and clients can then use XSLT to extract RDF/XML.
For example, there's already a bit of CSS markup identifying the
title:
<font class="xpapertitle"...

The GRDDL approach is better than comment-embedding and scraping in
general as the doc can be associated with a profile (a URI-valued
attribute of the <head> element) which identifies the specific
processing needed to get the RDF out of the XHTML - it's
deterministic.
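
As a sketch of the client side of that, assuming a well-formed XHTML page: the client reads the profile attribute from <head> and then knows which transformation to fetch, with no guessing. The document and profile URI below are placeholders, and head_profiles is a name invented for this example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical document; the profile URI is a placeholder.
DOC = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head profile="http://example.org/profiles/biblio">
    <title>An Example Article</title>
  </head>
  <body><p>...</p></body>
</html>"""


def head_profiles(xhtml):
    """Return the profile URIs declared on <head> (a space-separated list)."""
    root = ET.fromstring(xhtml)
    head = root.find(f"{{{XHTML_NS}}}head")
    if head is None:
        return []
    return head.get("profile", "").split()


profiles = head_profiles(DOC)
```

Dereferencing the profile and applying the transformation it points to is left out here; the point is only that the lookup is a plain attribute read rather than a scrape.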

There are a few stylistic alternatives for expressing explicit data in
XHTML - for relatively flat stuff the doc's <meta> elements can be
used, there's the microformats [2] approach of including the data
inline, and there's the Structured Blogging approach [3], which uses
arbitrary XML embedded in a <script> element.
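
For the flat <meta> case, a minimal Python sketch of what a consumer might do, assuming well-formed XHTML. The DC.* names and values are illustrative only, not taken from the BioMed Central pages.

```python
import xml.etree.ElementTree as ET

NS = {"h": "http://www.w3.org/1999/xhtml"}

# Hypothetical page head with Dublin Core style <meta> elements.
DOC = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>An Example Article</title>
    <meta name="DC.title" content="An Example Article"/>
    <meta name="DC.creator" content="A. N. Author"/>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
  </head>
  <body/>
</html>"""


def meta_pairs(xhtml):
    """Collect (name, content) pairs from <meta> elements in <head>."""
    root = ET.fromstring(xhtml)
    pairs = []
    for meta in root.findall("h:head/h:meta", NS):
        name, content = meta.get("name"), meta.get("content")
        # Skip http-equiv style entries that carry no name attribute.
        if name is not None and content is not None:
            pairs.append((name, content))
    return pairs


pairs = meta_pairs(DOC)
```

Each pair maps naturally onto one property/value statement about the page, which is why this only suits relatively flat data.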

A quick way of going from what you've got to GRDDL would be to
identify a profile that has XSLT something like:

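<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="text()"/> <!-- suppress ordinary page text -->
<xsl:template match="comment()"> <!-- emit comment content unescaped -->
<xsl:value-of select="." disable-output-escaping="yes"/>
</xsl:template>
</xsl:stylesheet>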

But note there would be no guarantee that this would work for every
processor, as XML parsers don't have to pass comment data to the app.
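
That caveat is easy to see with Python's standard library, as one example of a processor: by default xml.etree.ElementTree discards comments, and they only reach the application if the tree builder is asked to keep them (the insert_comments option, Python 3.8 or later). The document string is a made-up fragment.

```python
import xml.etree.ElementTree as ET

DOC = "<html><body><!-- hidden data --><p>text</p></body></html>"


def count_comments(root):
    """Count comment nodes that survived parsing."""
    return sum(1 for el in root.iter() if el.tag is ET.Comment)


# Default parser: comments are dropped before the application sees them.
default_root = ET.fromstring(DOC)

# Opt-in parser: comments are kept as nodes in the tree.
keeping_parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
keeping_root = ET.fromstring(DOC, parser=keeping_parser)

dropped = count_comments(default_root)
kept = count_comments(keeping_root)
```

So a comment-based transform can silently produce nothing, depending on how the toolchain in front of it is configured.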

> Linking to the content would be much more flexible and easier to process.

Generally, yup.

Cheers,
Danny.

[1] http://www.w3.org/2004/01/rdxh/spec
[2] http://www.microformats.org/wiki/microformats
[3] http://structuredblogging.org/

-- 
http://dannyayers.com

Received on Thu Sep 15 2005 - 09:06:33 EDT