scraper evolution

From: Eric Miller <em_at_w3.org>
Date: Fri, 27 Jan 2006 12:05:17 -0500

The HTML pages harvested using an Open Worldcat scraper [1] changed,
and as a consequence the scraper broke. To be clear, the scraper
didn't stop working per se when invoked; rather, it no longer gleans
all of the relevant RDF that it did originally. I've updated the
scraper accordingly, but it's unclear to me how best to propagate
these changes to others who might be using the scraper.

I can think of several possible options, all of which have various
pros / cons:

1) do nothing ... if folks realize it's broken they'll look for an update
2) real-time auto-update ... every time the scraper is invoked it checks
to see if a new version is available
3) periodic update ... check for updates nightly, monthly, etc.
and then offer the user some sort of notification to update

I'm inclined to suggest 3, but curious as to the thoughts of others who
might have been able to spend more time thinking about this than I
have :)
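
A rough sketch of what option 3 might look like, in Python purely for
illustration (nothing here is taken from the scraper itself). The state
file, the dotted version strings, and the fetch_latest callable are all
assumptions; how the latest version would actually be published and
retrieved is left open.

```python
import json
import time
from pathlib import Path

# Hypothetical default: check at most once a day ("nightly").
CHECK_INTERVAL_SECONDS = 24 * 60 * 60


def check_is_due(state_file, now=None, interval=CHECK_INTERVAL_SECONDS):
    """True if no check is recorded, or the last one is at least `interval` old."""
    now = time.time() if now is None else now
    try:
        last = json.loads(Path(state_file).read_text())["last_check"]
    except (FileNotFoundError, KeyError, TypeError, ValueError):
        # Missing or unreadable state: treat it as "never checked".
        return True
    return now - last >= interval


def record_check(state_file, now=None):
    """Remember when the last update check happened."""
    now = time.time() if now is None else now
    Path(state_file).write_text(json.dumps({"last_check": now}))


def update_notice(installed, latest):
    """Return a message if `latest` is newer than `installed`, else None.

    Assumes purely numeric dotted versions such as "1.0" (hypothetical).
    """
    def parse(version):
        return tuple(int(part) for part in version.split("."))

    if parse(latest) > parse(installed):
        return f"Scraper {latest} is available (installed: {installed})."
    return None


def maybe_notify(state_file, installed, fetch_latest, now=None,
                 interval=CHECK_INTERVAL_SECONDS):
    """Run the periodic check; `fetch_latest` is a caller-supplied callable."""
    if not check_is_due(state_file, now, interval):
        return None
    latest = fetch_latest()
    record_check(state_file, now)
    return update_notice(installed, latest)
```

With interval=0 the check runs on every invocation, which is closer to
option 2, except that the user is still only notified rather than
updated automatically.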

[1] http://potlach.org/2005/10/scrapers/

--
eric miller                              http://www.w3.org/people/em/
semantic web activity lead               http://www.w3.org/2001/sw/
w3c world wide web consortium            http://www.w3.org/

Received on Fri Jan 27 2006 - 17:04:38 EST
