Re: scraper evolution

From: Ben Hyde <bhyde_at_pobox.com>
Date: Fri, 27 Jan 2006 12:26:35 -0500

Piggybank central probably needs to provide a registry that can be
quickly polled to see if updates are available to the 3rd party
scripts in use. It's an interesting part of the design space.
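One way such a registry might work is a lightweight name-to-revision map that clients can poll cheaply and compare against their installed scrapers. A minimal sketch (the registry format, function, and scraper names here are hypothetical, not part of Piggy Bank):

```python
# Hypothetical sketch: compare locally installed scraper revisions
# against a central registry and report which scripts are stale.
# Both arguments map scraper names to integer revision numbers.

def stale_scrapers(local, registry):
    """Return sorted names of scrapers whose registry revision is newer."""
    return sorted(
        name for name, rev in local.items()
        if registry.get(name, rev) > rev
    )

# Example: the Open WorldCat scraper is one revision behind.
local = {"openworldcat": 1, "amazon": 3}
registry = {"openworldcat": 2, "amazon": 3}
print(stale_scrapers(local, registry))  # → ['openworldcat']
```

In practice the registry would be fetched over HTTP; keeping it to a small flat map is what makes frequent polling cheap.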

On Jan 27, 2006, at 12:05 PM, Eric Miller wrote:

> The HTML pages harvested using an Open Worldcat scraper [1] changed,
> and as a consequence the scraper broke. To be clear, the scraper
> when invoked didn't stop working per se, but rather it didn't glean
> all of the relevant RDF that it did originally. I've updated the
> scraper accordingly, but it's unclear to me the best way to
> propagate these changes to others who might be using the scraper.
>
> I can think of several possible options, all of which have various
> pros / cons:
>
> 1) do nothing ... if folks realize it's broken, they'll look for an
> update
> 2) real-time auto-update ... every time the scraper is invoked, it
> checks to see if a new version is available
> 3) periodic update ... check for updates nightly, monthly, etc.,
> and then offer the user some sort of notification to update
>
> I'm inclined to suggest 3, but curious about the thoughts of others
> who might have been able to spend more time thinking about this than
> I have :)
>
> [1] http://potlach.org/2005/10/scrapers/
>
> --
> eric miller http://www.w3.org/people/em/
> semantic web activity lead http://www.w3.org/2001/sw/
> w3c world wide web consortium http://www.w3.org/
>
>
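Option 3 above could be throttled with a simple last-checked timestamp, so the scraper only hits the registry once per interval no matter how often it runs. A minimal sketch, assuming the host can persist the time of the last successful check (the function and default interval are my own, not anything Piggy Bank provides):

```python
# Hypothetical sketch of a periodic update check: decide whether
# enough time has passed since the last check before polling again.
import time

CHECK_INTERVAL = 24 * 60 * 60  # nightly, in seconds

def should_check(last_checked, now=None, interval=CHECK_INTERVAL):
    """True if at least `interval` seconds have elapsed since last_checked."""
    if now is None:
        now = time.time()
    return now - last_checked >= interval

# A scraper invoked many times per day would poll the registry at most
# once per interval, then notify the user only if an update was found.
```

The same timestamp could also drive the user-facing notification, deferring the actual upgrade until the user agrees.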
Received on Fri Jan 27 2006 - 17:26:01 EST

This archive was generated by hypermail 2.3.0 : Thu Aug 09 2012 - 16:39:18 EDT