Re: Perl Scraper

From: ilango gurusamy <ilangojava_at_yahoo.com>
Date: Wed, 4 Jan 2006 17:25:39 -0800 (PST)

David
  Thank you. Does the Simile architecture include a crawler (perhaps based on Lucene or Nutch)?
  
  ilango

David Huynh <dfhuynh_at_csail.mit.edu> wrote:

Ilango,

No, Piggy Bank does not support Perl scrapers at the moment. The reason
we support JavaScript scrapers is that it is quite trivial to run
JavaScript code against web page DOMs that have already been parsed by
Firefox.

You can still make your Perl scraper generate RDF/XML or N3 files and
then point Firefox to those files. Piggy Bank should then let you import
those files.
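
For illustration only, here is a minimal sketch of that route. It is
written in Python rather than Perl purely to keep it short; the same
idea should carry over to a Perl scraper. The record fields, the
example.org URLs, and the choice of Dublin Core properties are all
assumptions made up for this sketch, not anything Piggy Bank requires.

```python
# Sketch: serialize already-scraped records as N3 text.
# The record layout and the example.org URLs are hypothetical.

DC = "http://purl.org/dc/elements/1.1/"  # Dublin Core element namespace


def escape_literal(text):
    """Escape characters that cannot appear raw in a quoted N3 string."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def to_n3(records):
    """Return an N3 document with a dc:title and dc:description per record.

    Assumes each record's "url" is already a clean absolute URL.
    """
    lines = ["@prefix dc: <%s> ." % DC, ""]
    for rec in records:
        lines.append("<%s>" % rec["url"])
        lines.append('    dc:title "%s" ;' % escape_literal(rec["title"]))
        lines.append('    dc:description "%s" .'
                     % escape_literal(rec["description"]))
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical scraper output; a real scraper would fill this list.
    listings = [
        {"url": "http://example.org/listing/1",
         "title": "Used bicycle",
         "description": "Road bike, good condition"},
    ]
    print(to_n3(listings))
```

Saving the printed text to a file with an .n3 extension and opening that
file in Firefox is the step described above; whether the import then
succeeds depends on Piggy Bank, which this sketch does not test.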

David


ilango gurusamy wrote:

> David
> Is it possible to write a Perl scraper for Simile? I recently wrote a
> scraper for Craigslist in Perl.
>
> ilango
>
> David Huynh wrote:
>
> By the way, if you have written scrapers, please please please share
> some with us
> - publish them to http://simile.mit.edu/bank/
> - or tell us here on this mailing list or email us individually and we
>   can link to your scrapers from our list
>   http://simile.mit.edu/piggy-bank/scraper-list.html
>
> This is so that we can soon tell other people, look, it's not just us
> Similers who do the scraping... and so that I can tell my thesis
> advisors, see, I told you it's gonna work!
>
> (Don't worry too much about quality. We need quantity and diversity at
> this moment. Then we can fine-tune the scripts later on.)
>
> Thanks!
>
> David
>





                        
Received on Thu Jan 05 2006 - 01:25:36 EST

This archive was generated by hypermail 2.3.0 : Thu Aug 09 2012 - 16:39:18 EDT