Re: Piggy Bank blocks find as you type (+ comments)

From: Michael McDougall
Date: Thu, 21 Jul 2005 10:19:01 -0400

David Huynh wrote:

> - Saving data automatically into "My Piggy Bank" can be
> undesirable--you might want to select which items to save; and you
> might also want to tag them as you save them. And we don't want rogue
> sites to pollute "My Piggy Bank" just because the user accidentally
> visits them.

You may have been through all these ideas before, but I've been thinking
about this a bit. You could try an "opt-out" approach where all sites
are initially trusted but the user can mark a site as 'bad', which will
remove that site's data from Piggy Bank (and stop new data from being
scraped on later visits). Or you could have two data banks: one with
data from all sites (perhaps with the opt-out option) and one with data
from sites that were explicitly marked as trusted. That way I can browse
my trusted data, and when something's not there I can (hold my nose and)
browse a giant pile of data from all the sites I visit.
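
To make the two ideas above a bit more concrete, here is a rough sketch
in Python. It is only an illustration of the opt-out and two-bank
policies as I described them; the class and method names (DataBanks,
visit, mark_bad, mark_trusted) are made up for this example and are not
Piggy Bank's actual API or storage model.

```python
from dataclasses import dataclass, field


@dataclass
class DataBanks:
    """Hypothetical model of the opt-out / two-bank idea (not Piggy Bank's API)."""
    blocked: set = field(default_factory=set)   # sites the user marked 'bad'
    trusted: set = field(default_factory=set)   # sites explicitly marked trusted
    items: dict = field(default_factory=dict)   # site -> list of saved items

    def visit(self, site, scraped):
        # Opt-out: keep data from any site unless the user has blocked it.
        if site not in self.blocked:
            self.items.setdefault(site, []).extend(scraped)

    def mark_bad(self, site):
        # Remove the site's existing data and refuse data from later visits.
        self.blocked.add(site)
        self.trusted.discard(site)
        self.items.pop(site, None)

    def mark_trusted(self, site):
        self.trusted.add(site)
        self.blocked.discard(site)

    def everything(self):
        # The "giant pile": data from every site that is not blocked.
        return [item for saved in self.items.values() for item in saved]

    def trusted_only(self):
        # The second bank: data from explicitly trusted sites only.
        return [item
                for site, saved in self.items.items() if site in self.trusted
                for item in saved]
```

Browsing would then start with trusted_only() and fall back to
everything() when something is missing.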

It might also be worth treating data from scrapers differently from data
from plain RDF. If I install the ACM Portal scraper, I think it's safe to
assume that I trust everything on the ACM Portal site.

Have you actually been encountering many 'rogue sites' that pollute
Piggy Bank? I'm doing some research on semantic web security, so I'd like
to learn more about real-world issues like this.

