Re: scraper evolution

From: David Huynh <dfhuynh_at_csail.mit.edu>
Date: Fri, 27 Jan 2006 17:44:41 -0500

Eric Miller wrote:

>
> On Jan 27, 2006, at 12:38 PM, David Huynh wrote:
>
>> I'm working on making scrapers more "declarative", thus easier to
>> write, easier to update and adapt, easier to find errors in them.
>> Update errors might thus be detectable automatically. And the users
>> themselves (not the original scraper authors) can try to update the
>> scrapers.
>
>
> Interesting! :) But I'm not quite sure how this would help the use
> case exactly. In the case below the scraper didn't die per se - minor
> HTML tweaks simply caused the scraper not to collect all of the RDF
> data that it was originally able to gather. In this case, the user
> might not know there was an error, and thus would not know to correct
> the scraper.

The description of the scraper might also indicate which fields are
optional and which are mandatory. We can also enforce some type checking
on the values extracted.
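
To make that concrete, here is a rough sketch in Python of what such a
description plus check could look like. This is only an illustration under
my own assumptions, not the actual scraper format: the names FieldSpec,
check_record and SPEC, and the example fields (title, year, price), are all
hypothetical. The idea is that a record with a missing mandatory field or a
value of the wrong type gets reported, even though the scraper itself did
not fail.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class FieldSpec:
    """Declarative description of one scraped field (hypothetical format)."""
    name: str
    required: bool
    parse: Callable[[str], object]  # expected to raise ValueError on a bad value


def check_record(spec: List[FieldSpec], raw: Dict[str, str]) -> List[str]:
    """Return a list of problems found in one scraped record."""
    problems = []
    for field in spec:
        value = raw.get(field.name)
        if value is None or value.strip() == "":
            # Only mandatory fields are reported when absent.
            if field.required:
                problems.append(f"missing mandatory field: {field.name}")
            continue
        try:
            field.parse(value)  # type check on the extracted value
        except ValueError:
            problems.append(f"bad value for {field.name}: {value!r}")
    return problems


# Example description: two mandatory fields, one optional (all made up).
SPEC = [
    FieldSpec("title", required=True, parse=str),
    FieldSpec("year", required=True, parse=int),
    FieldSpec("price", required=False, parse=float),
]

if __name__ == "__main__":
    # A site tweak made the scraper silently lose the "year" field.
    print(check_record(SPEC, {"title": "Some page"}))
```

An empty list from check_record means the record matches the description;
anything else is a hint that the page layout changed and the scraper needs
an update.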

David
Received on Fri Jan 27 2006 - 22:44:03 EST
