Chris Bizer wrote:
> Hi,
>
> did you guys notice the "Personal Publication Reader" project (part of the
> EC Network of Excellence REWERSE)?
>
> They have a nice scraper architecture including a visual scraper
> editor and a tool for defining and scheduling data integration workflows
> which they use for extracting information about publications and turning
> them into RDF.
The visual scraper is actually a (patented) commercial product from the
company Lixto (http://www.lixto.com/), and it generates XML.
> See: http://www.kbs.uni-hannover.de/rewerse/ppr/rewerse-demonstrator1.html
>
> Some of these ideas might play well together with content syndication using
> Semantic Bank.
The Haystack team has done research on the topic of allowing people to
generate RDF from pages just by selecting parts of them. David spent
some time working on something like that but we then focused on Piggy-Bank.
I agree we need a way to make it easier for people to write and maintain
scrapers.
And I'm glad that other groups are creating RDF anyway :-)
--
Stefano Mazzocchi
Research Scientist Digital Libraries Research Group
Massachusetts Institute of Technology location: E25-131C
77 Massachusetts Ave telephone: +1 (617) 253-1096
Cambridge, MA 02139-4307 email: stefanom at mit . edu
-------------------------------------------------------------------
Received on Fri Jul 08 2005 - 16:28:47 EDT