Re: Personal Publication Reader and Lixto Suite from Stefano Mazzocchi on 2005-07-08 (stdin)

From: Stefano Mazzocchi <stefanom_at_mit.edu>
Date: Fri, 08 Jul 2005 12:31:25 -0400

Chris Bizer wrote:
> Hi,
>
> did you guys notice the "Personal Publication Reader" project (part of the
> EC Network of Excellence REWERSE)?
>
> They have a nice scraper architecture including a visual scraper
> editor and a tool for defining and scheduling data integration workflows
> which they use for extracting information about publications and turning
> them into RDF.

The visual scraper is actually a (patented) commercial product from the
company http://www.lixto.com/ and it generates XML.

> See: http://www.kbs.uni-hannover.de/rewerse/ppr/rewerse-demonstrator1.html
>
> Some of these ideas might play well together with content syndication using
> Semantic Bank.

The Haystack team has done research on the topic of allowing people to
generate RDF from pages just by selecting parts of them. David spent
some time working on something like that, but we then focused on Piggy-Bank.

I agree we need a way to make it easier for people to write and maintain
scrapers.

And I'm glad that other groups are creating RDF anyway :-)

-- 
Stefano Mazzocchi
Research Scientist                 Digital Libraries Research Group
Massachusetts Institute of Technology            location: E25-131C
77 Massachusetts Ave                   telephone: +1 (617) 253-1096
Cambridge, MA  02139-4307              email: stefanom at mit . edu
-------------------------------------------------------------------

Received on Fri Jul 08 2005 - 16:28:47 EDT
