Re: Solvent/piggy-bank weirdness

From: David Huynh <dfhuynh_at_csail.mit.edu>
Date: Thu, 01 Dec 2005 06:46:34 -0500

Arvind Venkataramani wrote:

>I got the same. It appears that piggy bank is combining (where available)
>the RSS feed from the html HEAD with the data gathered through the scraper.
>In which case, why's all the additional metadata (prism:publicationName etc)
>that is visible in the RSS feeds for all these pages not turning up in the
>collected data? Am I doing something wrong?
>
>
Piggy Bank uses Informa 0.6.0 to parse all RSS feeds, whether or not
they are RDF/XML. Perhaps Informa skips over anything outside the core
RSS vocabulary, such as those prism: predicates.

Piggy Bank also invokes any applicable scrapers and then throws all the
results together into the same RDF model. So the results we both got were
expected, although not desirable. Could you log a bug in our issue
tracking system about this? Thanks!
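
To make the behaviour described above concrete, here is a rough sketch in
Python. It is not Piggy Bank or Informa code: plain tuples stand in for RDF
triples, and every name in it (parse_feed, run_scrapers, collect,
CORE_RSS_FIELDS) is hypothetical. It also assumes the guess above is right,
i.e. that the feed parser keeps only core RSS fields.

```python
# Hypothetical illustration only. Tuples of (subject, predicate, object)
# stand in for RDF triples; none of these names exist in Piggy Bank or Informa.

# Assumption: the feed parser only understands these predicates.
CORE_RSS_FIELDS = {"rss:title", "rss:link", "rss:description"}


def parse_feed(feed_triples):
    """Stand-in for the feed parser: drop predicates it does not know."""
    return {t for t in feed_triples if t[1] in CORE_RSS_FIELDS}


def run_scrapers(page_url):
    """Stand-in for a scraper: return triples scraped from the page."""
    return {(page_url, "dc:title", "Example article (scraped)")}


def collect(page_url, feed_triples):
    """Put feed results and scraper results into one shared model."""
    model = set()
    model |= parse_feed(feed_triples)   # prism: triples are lost here
    model |= run_scrapers(page_url)     # scraper output is simply added
    return model


page = "http://example.org/article"
feed = {
    (page, "rss:title", "Example article"),
    (page, "rss:link", page),
    (page, "prism:publicationName", "Example Journal"),
}

model = collect(page, feed)
for triple in sorted(model):
    print(triple)
```

Under those assumptions the combined model holds both the feed title and
the scraped title for the same page, while the prism:publicationName
triple never makes it in.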

David
Received on Thu Dec 01 2005 - 11:42:08 EST