Re: RDF 101 [was Re: introduction and questions]

From: Erik Hatcher <esh6h_at_virginia.edu>
Date: Mon, 18 Apr 2005 16:41:18 -0400

On Apr 15, 2005, at 10:22 AM, Stefano Mazzocchi wrote:
>> Now on to my questions....
>> First, I'm utterly clueless about RDF.
>
> That's totally fine. We do not expect our users to know RDF inside
> out, and we are willing to help to get them up to speed.

*whew*

> the second means that you need a 'rdf:type' statement, or, using
> RDF/XML, you need to say something like
>
> <blah:Blah rdf:about="http://your.host.com/uri/3809480">
> ...
>
> instead of
>
> <rdf:Description rdf:about="http://your.host.com/uri/3809480">
> ...

I've now converted to using your recommended syntax. And I'm also
adding a <dc:type> element that points to some different metadata that
we have. Longwell2 is showing both together as "type" - is that
correct? Is the implicit rdf:type somehow connected to dc:type?

> Note that the RDF/XML syntax is rather weird as it has special
> meanings, for example
>
> <blah:Blah
> xmlns:blah="http://blah.com/ns/blah#"
> rdf:about="http://your.host.com/uri/3809480">
> ...
>
> is completely equivalent to
>
> <rdf:Description rdf:about="http://your.host.com/uri/3809480">
> <rdf:type rdf:resource="http://blah.com/ns/blah#Blah"/>
> ...
>
> [this creates all sorts of problems in RDF canonicalization, and some
> people hate it and some love it, but hey, RDF/XML is even older than
> the XML namespaces spec and it feels kinda prehistoric to me at
> times, but it grows on you after a few months]

*head spinning* - cool... glad to have some mentoring on this stuff, as
I'd be stumbling in the dark for ages on things like this.
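
A minimal sketch of the equivalence quoted above, using only Python's
standard xml.etree module. The type_triples helper is hypothetical and
handles only top-level node elements with rdf:about; it is not a full
RDF/XML parser, just enough to show that both spellings yield the same
rdf:type statement.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The "typed node" spelling.
TYPED = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:blah="http://blah.com/ns/blah#">
  <blah:Blah rdf:about="http://your.host.com/uri/3809480"/>
</rdf:RDF>"""

# The explicit rdf:Description + rdf:type spelling.
EXPLICIT = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://your.host.com/uri/3809480">
    <rdf:type rdf:resource="http://blah.com/ns/blah#Blah"/>
  </rdf:Description>
</rdf:RDF>"""


def type_triples(rdfxml):
    """Collect (subject, rdf:type, object) triples from a tiny RDF/XML doc."""
    triples = set()
    for node in ET.fromstring(rdfxml):
        subject = node.get("{%s}about" % RDF_NS)
        # ElementTree writes a qualified name as "{namespace}local".
        ns, local = node.tag[1:].split("}")
        if ns + local != RDF_NS + "Description":
            # Typed node: the element name itself names the type.
            triples.add((subject, RDF_NS + "type", ns + local))
        for child in node:
            if child.tag == "{%s}type" % RDF_NS:
                obj = child.get("{%s}resource" % RDF_NS)
                triples.add((subject, RDF_NS + "type", obj))
    return triples


print(type_triples(TYPED) == type_triples(EXPLICIT))
```

Both documents reduce to a single triple whose object is
http://blah.com/ns/blah#Blah, which is why a tool can treat them alike.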

> I'm sure you've seen my "No-nonsense Guide to Semantic Web Specs for
> XML People"

Yes, and have also re-read them thanks to your pointers.

> part 1 -> http://www.betaversion.org/~stefano/linotype/news/57/

One thing that hasn't become clear yet is the use of the "#" at the end
of the namespace URIs - you mention it'll become clear, but thus far
it hasn't for me.
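
One possible way to see what the trailing "#" buys, sketched in Python:
RDF/XML builds a term's URI by plainly concatenating the namespace URI
and the local name, so the "#" is what keeps the two apart and turns the
local name into a fragment identifier. The expand helper is hypothetical
and only illustrates that concatenation.

```python
from urllib.parse import urldefrag


def expand(namespace_uri, local_name):
    # A qualified name becomes a URI by plain string concatenation.
    return namespace_uri + local_name


with_hash = expand("http://blah.com/ns/blah#", "Blah")
without_hash = expand("http://blah.com/ns/blah", "Blah")

print(with_hash)                      # http://blah.com/ns/blah#Blah
print(without_hash)                   # http://blah.com/ns/blahBlah
print(urldefrag(with_hash).fragment)  # Blah
```

Without the "#" (or a trailing "/"), the namespace and the term name run
together into one opaque string.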

> If you want to get a little deeper, it's probably easier to keep
> asking specific questions here as soon as you encounter a roadblock.

With that said, I've gotten a bit further. I've RDF'd the Rossetti
Archive files. I've done a 1-for-1 transformation from our XML files
into RDF, though I'm sure this is not granular enough. I'm tossing
these details out in case someone is interested in pointing me in the
right direction - in other words, help if you want, and it'll be
appreciated, but no worries if not. I've zipped all the RDF files here:

        http://www.rossettiarchive.org/docs/rossetti_rdf.zip (2MB currently)

and a specific example here:
        
        http://www.rossettiarchive.org/docs/1-1847.s244.raw.rdf

This is just the beginning, and there is much more metadata (rhyme,
meter, genre, etc.) available once I find the best buckets to
put it in within RDF. And there are quite a number of connections
between the objects in our archive as well.

How do I represent these connections? For example, we have "workcodes"
on various objects (down to the <div> level within actual manuscripts)
that connect things back to a formal work. You can see this
aggregation in our collection view like this:

        http://www.rossettiarchive.org/docs/1-1847.s244.rawcollection.html

Hyperlinks with #anchors are down to the <div> level.

I've distilled lots of our gory XML metadata out into fields I indexed
with Lucene. Queries connecting workcodes can be made like this:

        http://www.rossettiarchive.org/rose/?query=workcode%3A1-1847.s244
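
A small sketch of how such a query URL can be assembled, assuming only
Python's standard urllib. The workcode_query_url helper is hypothetical;
the point is that the ":" between the field name and the value is
percent-encoded as %3A.

```python
from urllib.parse import urlencode

BASE = "http://www.rossettiarchive.org/rose/"


def workcode_query_url(workcode):
    # Lucene field syntax is "field:value"; urlencode escapes the colon.
    return BASE + "?" + urlencode({"query": "workcode:" + workcode})


print(workcode_query_url("1-1847.s244"))
# http://www.rossettiarchive.org/rose/?query=workcode%3A1-1847.s244
```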

Putting these connections into RDF, of course, is the next goal.
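
One way this might look, as a rough sketch only: each object (or <div>
anchor) gets a statement pointing at the work it belongs to. Everything
named here is an assumption for illustration: the partOfWork property,
the example.org namespace, the work URI, and the "#div1" anchor are all
made up, and the output is plain N-Triples rather than RDF/XML.

```python
# Hypothetical vocabulary and work URI, for illustration only.
PART_OF_WORK = "http://example.org/rossetti/ns#partOfWork"
WORK = "http://example.org/rossetti/work/1-1847.s244"


def ntriple(subject, predicate, obj):
    # One N-Triples statement whose object is a resource (not a literal).
    return "<%s> <%s> <%s> ." % (subject, predicate, obj)


# Objects carrying the same workcode; the "#div1" anchor is made up.
objects = [
    "http://www.rossettiarchive.org/docs/1-1847.s244.raw.rdf",
    "http://www.rossettiarchive.org/docs/1-1847.s244.rawcollection.html#div1",
]

for uri in objects:
    print(ntriple(uri, PART_OF_WORK, WORK))
```

Because every statement shares the same object URI, a graph tool can
follow the link from any object back to the work and on to its siblings.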

> Welkin and Longwell2 do not need that finetuning, you can throw
> whatever RDF at them and they will adjust to the data.

I have now started using Longwell2, and it has been working nicely to
see how I progress with the RDFization of the Rossetti Archive.

>> - Longwell2 - How do I get it to work with a sample dataset? I
>> tried pointing longwell.properties at the data directory of my
>> Longwell TRUNK area, but it did not work.
>
> you have to run it like
>
> ./longwell.sh longwell.properties datadir

Just a minor correction... _run_ is needed:

        ./longwell.sh run longwell.properties datadir

> and it will load all the *.rdf, *.n3, *.rdfs, *.owl files found
> recursively in the datadir.

Sure enough it did!
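
For a quick check of which files would be picked up from a data
directory, here is a minimal Python sketch of that recursive scan. It is
not Longwell's actual loader; the find_rdf_files helper is hypothetical
and simply matches the four extensions mentioned above.

```python
import os

# The extensions mentioned in the thread.
RDF_EXTENSIONS = (".rdf", ".n3", ".rdfs", ".owl")


def find_rdf_files(datadir):
    """Return a sorted list of RDF-looking files under datadir, recursively."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(datadir):
        for name in filenames:
            # str.endswith accepts a tuple of suffixes.
            if name.endswith(RDF_EXTENSIONS):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```

Calling find_rdf_files("datadir") lists the candidate files, which helps
confirm nothing was skipped because of an unexpected extension.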

>> - Welkin - well done! It'll make more sense to me when I
>> understand RDF a bit more, but it's a nice visualization.
>
> Did you try it from the trunk or from the webstart release?

At the time of writing, I was running it from WebStart. I've now been
using trunk. The sliders (which I don't understand yet) get cropped a
bit on Mac OS X. It'll be even cooler when I get the connections between
documents in there :)

How can I open up a directory full of RDF files in Welkin? Or will it
accept a .zip file of .rdf files?

>> - Charon - this looks like something we could really leverage
>> with Collex - allowing folks that have legacy low-tech archives to
>> be "collectable" somehow. This may be a place of collaboration for
>> us.
>
> Awesome! Charon is based on cocoon, so 99.9% of the complexity is
> already dealt with for you. All you need to do is to write a few XSLT
> stylesheets, take a look at
>
> http://simile.mit.edu/repository/charon/trunk/stylesheets/rdfize.xslt
>
> to see the 'core' action. This is the rdfizer targeted at a dspace
> site. Charon was built with dspace in mind, but it's relatively easy
> to modify it to be able to support multiple sites at the same time,
> should that need emerge. I'd be happy to help out directly with that,
> also because I would love Charon and Piggy-Bank to share XSLT RDFizers
> (a-la GRDDL)
>
> http://www.w3.org/2004/01/rdxh/spec

Excellent. The proxy idea will be one of the last things we do with
Collex since we're aiming for the rich archives that have the technical
wherewithal to supply their information as RDF. But I suspect it will
come to having to do some proxying like this sometime down the road.

Piggy-Bank - this is right up the "collection" alley we're looking for,
but we'll want to be able to collect objects in non-Firefox browsers
also somehow. One idea I have is to use a bookmarklet to "Collex It!"
which will somehow send information to our system. We could have our
archives marked with embedded RDF which is what Piggy-Bank leverages.
And we could, perhaps in addition, have a <meta> tag that pointed to
the RDF. When our system receives a URL to collect, it'd parse out the
RDF and allow the user to pick the specific objects desired. Is this a
reasonable approach? What other suggestions do folks have in this
regard?
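
A rough sketch of the receiving side of that idea, assuming only
Python's standard html.parser: given a page's HTML, find where its RDF
lives. It looks for a <link rel="alternate" type="application/rdf+xml">
element and also for a <meta name="rdf" content="..."> tag; that meta
name is purely hypothetical, as are the class and function names.

```python
from html.parser import HTMLParser


class RdfLinkFinder(HTMLParser):
    """Collect URLs of RDF documents advertised in a page's markup."""

    def __init__(self):
        super().__init__()
        self.rdf_urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if (tag == "link" and attrs.get("rel") == "alternate"
                and attrs.get("type") == "application/rdf+xml"
                and attrs.get("href")):
            self.rdf_urls.append(attrs["href"])
        elif tag == "meta" and attrs.get("name") == "rdf" and attrs.get("content"):
            # Hypothetical meta form; the name "rdf" is an assumption.
            self.rdf_urls.append(attrs["content"])


def find_rdf_urls(html):
    finder = RdfLinkFinder()
    finder.feed(html)
    return finder.rdf_urls


PAGE = """<html><head>
<link rel="alternate" type="application/rdf+xml"
      href="http://www.rossettiarchive.org/docs/1-1847.s244.raw.rdf">
<link rel="stylesheet" href="style.css">
</head><body></body></html>"""

print(find_rdf_urls(PAGE))
```

The collecting system would then fetch each URL found, parse the RDF,
and offer the described objects to the user to pick from.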

Many thanks,
        Erik




Received on Mon Apr 18 2005 - 20:40:18 EDT