Re: Piggy Bank and rules from David Huynh on 2005-04-08 (stdin)

From: David Huynh <dfhuynh_at_csail.mit.edu>
Date: Fri, 08 Apr 2005 23:33:58 -0400

Hi Phil,

I'm personally very excited at the prospect that Piggy Bank's usefulness
extends beyond what it was designed for. Though I don't personally have
the wisdom to contribute to this topic of censorship, nor the time to
help you out with the programming, I'd be happy to answer questions on
how our software works so that you or your colleagues can make the
necessary adaptation.

With regard to whether there is wider value for rules in PB, I'd say,
just implement it, and if other people want it, they'll tell you to
generalize it. :-)

David

Phil Archer wrote:

> Hi all,
>
> I sent Ryan an e-mail the other day and he suggested I shared this
> with the full list so, after a bit of a delay, here goes.
>
> I wanted to let you know first of all how incredibly useful Piggy Bank
> is being for me in talking to people about the virtues of the Semantic Web.
> My organisation, ICRA [1], currently uses the old PICS standard to add
> labels to content that describe whether it contains sex, nudity,
> violence etc. Filters, ideally ones built onto browsers, can then
> allow or block access to content based on those labels. The best known
> example of this is Content Advisor in Internet Explorer that comes
> with an old rating system called RSACi. That organisation/rating
> system led to ICRA. We're now working to move labelling from PICS to
> RDF.
>
> OK, introductory history lesson over.
>
> Our use case involves content providers linking a small number of
> descriptions, what we call content labels, to any number of resources.
> For example, "there is no sex or nudity on icra.org". That's more than
> one URI we're trying to describe and to make RDF work, we need a way
> to encode that. Further, we need to be able to say "everything at
> www.example.org/artistic_nudes has description A while everything else
> on the example.org domain has description B."
>
> This has led to the development of a simple rule set that is
> predicated on matching the URL of a resource for which we want a
> description against a sequence of one or more Perl5 regular
> expressions. The first match then leads to a description - what we
> call a content label.
>
> Use cases and test data at [2], schema description at [3].
>
> And so to my question - do you see any wider value in Piggy Bank (or
> other SW helper applications) working with the kind of rule set ideas
> we're now using in our own use case? Let me expand a little further.
>
> The content label testing tool I've hacked together on our site [4]
> visits a target URL and looks for RDF data, then narrows in to look
> specifically for ICRA labels (my plan is to expand this in the near
> future but I'm in concept-proving mode still). There's a small chunk
> of RDF on my personal site at www.archersenglish.co.uk/labels.rdf.
> There are links to this same file in both the homepage and a dummy
> page set up at www.archersenglish.co.uk/589/. Links [5] and [6] below
> take the label tester off to those 2 pages respectively. It grabs the
> RDF instance and then works out which ICRA label applies to the URL in
> question - needless to say you get a different result for each URL.
>
> This is due to the simple rules encoded in the RDF instance - any URL
> on a given list of hosts gets "label 1", but if the URL contains "589"
> it gets label 2. Piggy Bank knows nothing about these rules of course
> so it shows all the RDF classes (in my terms, both possible labels)
> and the rule set itself.
>
> If Piggy Bank were to gain a deep and meaningful understanding of the
> rule set [3] (i.e. had some code added to support the functionality!)
> it would demonstrate the enormous potential of all this to the
> internet safety community. Yes, the labels might be used for filtering
> but they can equally be used to show through the kind of visualisation
> exemplified by Piggy Bank that a site is a good resource for homework,
> contains medical information that can be trusted and so on.
>
> Enough for one e-mail. I'm naturally keen to know what you think.
>
> Regards
>
> Phil
>
> [1] http://www.icra.org
> [2] http://www.w3.org/2004/12/q/doc/rdf-contentlabels.html
> [3] http://www.w3.org/2004/12/q/doc/content-labels-schema.htm
> [4] http://www.icra.org/RDF/label/tester/
> [5] http://www.icra.org/cgi-bin/rdf/labelTester.cgi?lang=en&url=http%3A%2F%2Fwww.archersenglish.co.uk%2F&ignorePICS=on
> [6] http://www.icra.org/cgi-bin/rdf/labelTester.cgi?lang=en&url=http%3A%2F%2Fwww.archersenglish.co.uk%2F589%2F&ignorePICS=on
>
> Phil Archer
> Chief Technical Officer
> Internet Content Rating Association
> Label your site today at http://www.icra.org
>
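
As a rough illustration of the first-match scheme described in the quoted
message (a URL is tested against an ordered sequence of regular expressions
and the first match selects a content label), something like the following
might serve as a starting point. The patterns, the label strings, and the
names RULES and label_for are hypothetical, chosen only to mirror the
archersenglish.co.uk example; they are not taken from the ICRA schema [3],
and Python's re module stands in here for Perl5 regular expressions.

```python
import re

# Hypothetical rule set: (pattern, label) pairs, tried in order.
# These patterns are illustrative assumptions, not the real ICRA rules.
RULES = [
    # More specific rule first: URLs on the host whose path contains "589".
    (re.compile(r"^https?://(?:www\.)?archersenglish\.co\.uk/.*589"), "label 2"),
    # General rule: any other URL on the same host.
    (re.compile(r"^https?://(?:www\.)?archersenglish\.co\.uk(?:/|$)"), "label 1"),
]


def label_for(url, rules=RULES):
    """Return the label of the first rule whose pattern matches url, else None."""
    for pattern, label in rules:
        if pattern.search(url):
            return label
    return None


if __name__ == "__main__":
    for url in (
        "http://www.archersenglish.co.uk/",
        "http://www.archersenglish.co.uk/589/",
        "http://www.example.org/",
    ):
        print(url, "->", label_for(url))
```

Ordering matters in this sketch: the more specific "589" rule has to come
before the general host rule, otherwise every URL on the host would receive
label 1.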
Received on Sat Apr 09 2005 - 03:35:59 EDT
