Re: scraper evolution


From: Eric Miller <em_at_w3.org>
Date: Mon, 30 Jan 2006 11:56:12 -0500

On Jan 27, 2006, at 5:44 PM, David Huynh wrote:

> Eric Miller wrote:
>
>>
>> On Jan 27, 2006, at 12:38 PM, David Huynh wrote:
>>
>>> I'm working on making scrapers more "declarative", thus easier
>>> to write, easier to update and adapt, easier to find errors in
>>> them. Update errors might thus be detectable automatically. And
>>> the users themselves (not the original scraper authors) can try
>>> to update the scrapers.
>>
>>
>> Interesting! :) But I'm not quite sure how this would help the
>> use case exactly. In the case below the scraper didn't die per se
>> - minor HTML tweaks simply caused the scraper not to collect all
>> of the RDF data that it was originally able to gather. In this case,
>> the user might not know there was an error and thus would not know
>> to correct the scrapers.
>
> The description of the scraper might also indicate which fields are
> optional and which are mandatory. We can also enforce some type
> checking on the values extracted.

I think this is a very good idea.
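
A rough sketch of how such a description might be checked, assuming a
hypothetical field-declaration format. FieldSpec, check_record and the
field names are invented for illustration and do not come from David's
scraper work:

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class FieldSpec:
    """Hypothetical declaration of one field a scraper should extract."""
    name: str
    required: bool
    parse: Callable[[str], Any]  # raises ValueError on a badly typed value


def check_record(specs: List[FieldSpec], record: Dict[str, str]) -> List[str]:
    """Return a list of problems found in one scraped record."""
    problems = []
    for spec in specs:
        raw = record.get(spec.name)
        if raw is None or raw.strip() == "":
            # Only mandatory fields are reported when absent or empty.
            if spec.required:
                problems.append(f"missing mandatory field: {spec.name}")
            continue
        try:
            spec.parse(raw)  # type check on the extracted value
        except ValueError:
            problems.append(f"bad value for {spec.name}: {raw!r}")
    return problems


# Illustrative spec for a made-up page; the field names are assumptions.
SPECS = [
    FieldSpec("title", required=True, parse=str),
    FieldSpec("year", required=True, parse=int),
    FieldSpec("pages", required=False, parse=int),
]
```

With something like this, an HTML tweak that silently drops "year"
would show up as a reported problem instead of a quietly smaller set
of RDF data.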

--eric

>
> David
>
Received on Mon Jan 30 2006 - 16:55:25 EST
