Piggy Bank and rules from Phil Archer on 2005-04-07 (stdin)

From: Phil Archer <phil.archer_at_icra.org>
Date: Thu, 7 Apr 2005 12:37:37 +0100

Hi all,

I sent Ryan an e-mail the other day and he suggested I share this with the
full list so, after a bit of a delay, here goes.

I wanted to let you know first of all how incredibly useful Piggy Bank is
being for me in talking about the virtues of the Semantic Web. My
organisation, ICRA [1], currently uses the old PICS standard to add labels
to content that describe whether it contains sex, nudity, violence etc.
Filters, ideally ones built into browsers, can then allow or block access to
content based on those labels. The best known example of this is Content
Advisor in Internet Explorer that comes with an old rating system called
RSACi. That organisation/rating system led to ICRA. We're now working to
move labelling from PICS to RDF.

OK, introductory history lesson over.

Our use case involves content providers linking a small number of
descriptions, what we call content labels, to any number of resources. For
example, "there is no sex or nudity on icra.org". That's more than one URI
we're trying to describe and to make RDF work, we need a way to encode that.
Further, we need to be able to say "everything at
www.example.org/artistic_nudes has description A while everything else on
the example.org domain has description B."

This has led to the development of a simple rule set that is predicated on
matching the URL of a resource for which we want a description against a
sequence of one or more Perl5 regular expressions. The first match then
leads to a description - what we call a content label.
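
As a rough illustration only (this is not ICRA's code, and the patterns and
label names are invented for the example.org case above), the first-match
idea might be sketched in Python as follows. It assumes Python's re syntax
is close enough to Perl5 for patterns this simple.

```python
import re

# Hypothetical ordered rule list of (pattern, label) pairs.
# Patterns and label names are illustrative, not taken from the schema at [3].
RULES = [
    (r"^https?://www\.example\.org/artistic_nudes", "description A"),
    (r"^https?://([^/]+\.)?example\.org(/|$)", "description B"),
]

def label_for(url, rules=RULES):
    """Return the label of the first rule whose pattern matches, else None."""
    for pattern, label in rules:
        if re.search(pattern, url):
            return label
    return None
```

Order matters here: the more specific artistic_nudes pattern has to come
before the catch-all pattern for the domain, otherwise it would never fire.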

Use cases and test data at [2], schema description at [3].

And so to my question - do you see any wider value in Piggy Bank (or other
SW helper applications) working with the kind of rule set ideas we're now
using in our own use case? Let me expand a little further.

The content label testing tool I've hacked together on our site [4] visits a
target URL and looks for RDF data, then narrows in to look specifically for
ICRA labels (my plan is to expand this in the near future but I'm in
concept-proving mode still). There's a small chunk of RDF on my personal
site at www.archersenglish.co.uk/labels.rdf. There are links to this same
file in both the homepage and a dummy page set up at
www.archersenglish.co.uk/589/. Links [5] and [6] below take the label tester
off to those 2 pages respectively; it grabs the RDF instance and then works
out which ICRA label applies to the URL in question. Needless to say, you
get a different result for each URL.

This is due to the simple rules encoded in the RDF instance - any URL on a
given list of hosts gets "label 1", but if the URL contains "589" it gets
label 2. Piggy Bank knows nothing about these rules of course so it shows
all the RDF classes (in my terms, both possible labels) and the rule set
itself.
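
A minimal sketch of what a rule-aware client would have to do with these two
rules, written in Python. The host list below is an assumption (the real one
lives in labels.rdf and is not reproduced here), and I am also assuming that
the "589" rule is checked first and only applies to URLs on the listed hosts.

```python
from urllib.parse import urlparse

# Assumed host list, for illustration only; not copied from labels.rdf.
HOSTS = {"www.archersenglish.co.uk", "archersenglish.co.uk"}

def pick_label(url):
    """Pick one label for a URL instead of showing every label in the file."""
    if urlparse(url).hostname not in HOSTS:
        return None          # the rule set says nothing about this URL
    if "589" in url:
        return "label 2"     # more specific rule, checked first (assumed)
    return "label 1"         # default for any URL on the listed hosts
```

With those assumptions, the homepage URL would come out as "label 1" and the
/589/ page as "label 2", which is the distinction the label tester draws and
Piggy Bank currently does not.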

If Piggy Bank were to gain a deep and meaningful understanding of the rule
set [3] (i.e. had some code added to support the functionality!) it would
demonstrate the enormous potential of all this to the internet safety
community. Yes, the labels might be used for filtering but they can equally
be used to show through the kind of visualisation exemplified by Piggy Bank
that a site is a good resource for homework, contains medical information
that can be trusted and so on.

Enough for one e-mail. I'm naturally keen to know what you think.

Regards

Phil

[1] http://www.icra.org
[2] http://www.w3.org/2004/12/q/doc/rdf-contentlabels.html
[3] http://www.w3.org/2004/12/q/doc/content-labels-schema.htm
[4] http://www.icra.org/RDF/label/tester/
[5]
http://www.icra.org/cgi-bin/rdf/labelTester.cgi?lang=en&url=http%3A%2F%2Fwww.archersenglish.co.uk%2F&ignorePICS=on
[6]
http://www.icra.org/cgi-bin/rdf/labelTester.cgi?lang=en&url=http%3A%2F%2Fwww.archersenglish.co.uk%2F589%2F&ignorePICS=on

Phil Archer
Chief Technical Officer
Internet Content Rating Association
Label your site today at http://www.icra.org
Received on Thu Apr 07 2005 - 11:38:56 EDT
