RE: structured bibliographic info for BioMed Central articles now available as RDF from Matthew Cockerill on 2005-09-15

From: Matthew Cockerill <matt_at_biomedcentral.com>
Date: Thu, 15 Sep 2005 10:17:14 +0100

> -----Original Message-----
> From: Danny Ayers [mailto:danny.ayers_at_gmail.com]
> Sent: 15 September 2005 10:11
> To: general_at_simile.mit.edu
> Subject: Re: structured bibliographic info for BioMed Central articles
> now available as RDF
>
>
> On 9/14/05, Leigh Dodds <ldodds_at_ingenta.com> wrote:
> > Matthew Cockerill wrote:
> >
> > > And embedding the RDF as a comment is already in very common use
> > > for CC metadata - so shouldn't Piggy Bank really support the
> > > identification of islands of RDF in comments like this?
> >
> > It's a hack though, as you basically have to regex through the
> > comments to attempt to find some content.
>
> True, comments aren't designed for machine-processing.
>
> > Ideally if the pages were XHTML you'd
> > just embed the triples, but then browsers would fall over...
>
> Not necessarily - don't forget GRDDL [1]. Triples can be embedded as
> pieces of XHTML, and clients can then use XSLT to extract RDF/XML.
> For example, there's already a bit of CSS markup identifying the
> title:
> <font class="xpapertitle"...
>
> The GRDDL approach is better than comment-embedding and scraping in
> general as the doc can be associated with a profile (a URI-valued
> attribute of the <head> element) which identifies the specific
> processing needed to get the RDF out of the XHTML - it's
> deterministic.
>
> There are a few stylistic alternatives for expressing explicit data in
> XHTML - for relatively flat stuff the doc's <meta> elements can be
> used, there's the microformats [2] approach of including the data
> inline, and there's the Structured Blogging approach [3], which uses
> arbitrary XML embedded in a <script> element.
>
> A quick way of going from what you've got to GRDDL would be to
> identify a profile that has XSLT something like:
>
> <xsl:template match="comment()">
> <xsl:value-of select="."/>
> </xsl:template>
>
> But note there would be no guarantee that this would work for every
> processor, as XML parsers don't have to pass comment data to the app.

That was the problem: when I looked at non-comment ways of embedding RDF info into HTML, there was an embarrassment of riches in the number of ways you could possibly do it. But the only one to have significant uptake so far is the CC comment approach.
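
For what it's worth, here is a rough sketch of what a client has to do to find such islands in comments. It assumes Python and a made-up sample page; the names find_rdf_islands and CommentRDFFinder and the example.org URL are mine, purely for illustration, and this is not a description of how Piggy Bank or the CC tools actually work. It also assumes the island uses the conventional "rdf" prefix, which is exactly the kind of guesswork that makes this a hack.

```python
import re
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Non-greedy match for one RDF/XML island.
# Assumption: the island uses the conventional "rdf" namespace prefix.
RDF_ISLAND = re.compile(r"<rdf:RDF\b.*?</rdf:RDF>", re.DOTALL)


class CommentRDFFinder(HTMLParser):
    """Collects RDF/XML islands found inside HTML comments."""

    def __init__(self):
        super().__init__()
        self.islands = []

    def handle_comment(self, data):
        # Called once per <!-- ... --> with the text between the markers.
        self.islands.extend(RDF_ISLAND.findall(data))


def find_rdf_islands(html):
    """Return the RDF/XML strings embedded in the page's comments."""
    finder = CommentRDFFinder()
    finder.feed(html)
    finder.close()
    return finder.islands


# Hypothetical page, invented for this sketch.
SAMPLE_PAGE = """<html><head><title>Example article</title></head>
<body>
<!--
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://example.org/article/1">
    <dc:title>Example article</dc:title>
  </rdf:Description>
</rdf:RDF>
-->
<p>Body text</p>
</body></html>"""

islands = find_rdf_islands(SAMPLE_PAGE)

# Parsing checks that each island is well-formed XML; it does not validate the RDF.
roots = [ET.fromstring(island) for island in islands]
```

Note that the extraction only works because the HTML parser hands comment text to the application, and even then the regex has to guess at the prefix.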

Maybe if CC could be persuaded of the merits of a more machine-parseable approach for embedding RDF, and recommended that rather than comment embedding, one of these other options might gain some traction.

I'm cc'ing John Wilbanks, who may have some thoughts on this.

> > Linking to the content would be much more flexible and
> > easier to process.
>
> Generally, yup.

I agree, but there are still quite a few use cases (for example, search engine harvesting) where embedding is desirable.

Matt

>
> Cheers,
> Danny.
>
> [1] http://www.w3.org/2004/01/rdxh/spec
> [2] http://www.microformats.org/wiki/microformats
> [3] http://structuredblogging.org/
>
>
> --
>
> http://dannyayers.com
>

Received on Thu Sep 15 2005 - 09:14:10 EDT
