Re: access to rdf while scraping?

From: Ryan Lee <ryanlee_at_w3.org>
Date: Fri, 10 Jun 2005 15:15:48 -0400

Megan Ristau wrote:
> Hi,
>
> Tom and I here at Science Commons are creating scrapers for major
> medical DB sites. As we're scraping various sites containing data that
> should be in some type of nested rdf structure, it would really help to
> see the rdf we're exposing instead of blindly sending data to Piggy
> Bank. Is there any way to retrieve rdf output from "My Piggy Bank" or
> elsewhere?
>
> Thanks much--
>
> Megan

Hi Megan,

One way to do it is to change the URL - where it says 'command=browse'
or 'command=focus' while you navigate results, you can replace it with
'command=export' to get an N3 representation of all the results you're
looking at.
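
As a rough sketch of that URL swap: the helper name toExportUrl and the
example address below are made up for illustration, and the only thing
taken from the description above is the 'command' parameter and its
three values.

```javascript
// Hypothetical helper: replace command=browse or command=focus in a
// query string with command=export, leaving everything else untouched.
function toExportUrl(url) {
  return url.replace(/([?&]command=)(?:browse|focus)(?=&|#|$)/, "$1export");
}

// Placeholder address for illustration only, not a real Piggy Bank URL.
var example = "http://localhost/piggy-bank?command=focus&id=123";
console.log(toExportUrl(example));
```

A URL that carries some other command, or none at all, comes back
unchanged, so it should be safe to run on whatever is in the address bar.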

However, I don't know whether an export will give you anything beyond
what you already see - that is, I don't think the export command
explores any more of the tree than what gets displayed.

If your JavaScript screen scrapers are executing model.addStatements,
then you should have a fairly good idea of what's supposed to be going
into your local model. You could do a utilities.debugPrint() on
statement components to watch in the JavaScript console, perhaps?
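
Something along these lines might work for the tracing idea. It is only
a sketch: formatStatement and makeTracer are hypothetical names, the
four-argument statement shape (subject, predicate, object, isLiteral) is
an assumption, and so is the idea that utilities.debugPrint accepts a
single string. The print and store functions are passed in, so the
helper itself does not depend on either Piggy Bank object.

```javascript
// Hypothetical helper: format one statement as a rough N3-style line.
// Literal escaping via JSON.stringify is only an approximation of N3.
function formatStatement(subject, predicate, object, isLiteral) {
  var obj = isLiteral ? JSON.stringify(String(object)) : "<" + object + ">";
  return "<" + subject + "> <" + predicate + "> " + obj + " .";
}

// Hypothetical wrapper: echo each statement through `print`, then hand
// it to `store` (if one is given) so the model still receives it.
function makeTracer(print, store) {
  return function (subject, predicate, object, isLiteral) {
    print(formatStatement(subject, predicate, object, isLiteral));
    if (store) {
      store(subject, predicate, object, isLiteral);
    }
  };
}

// Standalone demonstration: collect the printed lines in an array.
var lines = [];
var trace = makeTracer(function (msg) { lines.push(msg); });
trace("urn:example:record1", "urn:example:title", "Aspirin", true);
trace("urn:example:record1", "urn:example:seeAlso", "urn:example:record2", false);
console.log(lines.join("\n"));
```

In a scraper you would pass a function that calls utilities.debugPrint
as `print`, and one that forwards to your model call as `store`,
adjusting the arguments to whatever those calls actually take.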

-- 
Ryan Lee                 ryanlee_at_w3.org
W3C Research Engineer    +1.617.253.5327
http://simile.mit.edu/
Received on Fri Jun 10 2005 - 19:13:55 EDT
