Re: Comments on 'procedural approaches' from Stefano Mazzocchi on 2005-04-26 (stdin)

From: Stefano Mazzocchi <stefanom_at_mit.edu>
Date: Tue, 26 Apr 2005 10:54:08 -0700

Emmanuel Pietriga wrote:

>> Stefano Mazzocchi wrote:
>>
>
>>>> These are random comments on Section 2.1 of the current paper's
>>>> working draft, sent here for archival and for triggering further
>>>> discussion.
>
>>
>>
>> Also archiving my reaction to these comments.
>>
>>
>
>>>> - XSLT over RDF/XML is always possible, it's just unbearably hard and
>>>> error prone, due to the XSLT-unfriendly nature of RDF/XML.
>>>>
>>>> - "conceptually wrong" is a little bit too strong, use "conceptually
>>>> defective"? A great example is the fact that rdf:type="blah" can
>>>> become the element name; modelling this in XPath is extremely hard.
>
>>
>>
>> I really feel strongly against this RDF/XML+XSLT as a general approach
>> for transforming RDF. That's why I put "wrong". But you're right,
>> that's probably too strong... The example is a good one and we should
>> probably cite it. But it is more an illustration of the point above
>> (unbearably hard) than of this one (conceptually defective).

Do you have an example of what you mean by "conceptually defective"?
(not trying to be dense, just curious)

>>>> - the other problem with XSLT is the notion that it acts as a
>>>> transformation filter on a given infoset. In theory, this infoset
>>>> could be as big as the whole triple store model itself; in practice,
>>>> however, due to the intrinsic nature of XSLT recursivity, it is ill
>>>> suited for 'selection by filtering'.
>
>>
>> Agreed.
>>
>
>>>> - mention XQuery as a potential alternative to the above, but note
>>>> that it suffers from the same problems as XSLT.
>>>>
>>>> - "potential irregularity, openness and use of different vocabularies"
>>>> this is not true. XSLT can cope with that too.
>
>>
>>
>> Yes it can. But at what cost? The complexity is so huge... Again,
>> that's why I believe the RDF/XML+XSLT approach to be conceptually wrong.

Well, honestly, I've been dealing with multi-namespaced XML in Cocoon
for years and it's true, the complexity of the system grows very fast
with the amount of heterogeneous, irregular data mixed in, but it's not
the end of the world... actually sometimes XML+XSLT is a lot better than
all the other existing technologies (except RDF) at coping with
un-normalized semi-structures.

My point is that we should not say XSLT can't do something when it can;
we should just point out that with RDF it is easier (and provide examples
of why that is, otherwise XML folks won't buy it just because you say so).

>> Addressing RDF at the lower level of abstraction that is its RDF/XML
>> representation makes it so difficult. Yes, there are a lot of useful
>> tools for processing XML, but it does not mean that they are suited to
>> processing RDF models.

A lot of people don't care about elegance, they care about getting their
job done. If it's *easier* for them to use Fresnel to achieve what they
want, they would use it, if not, they won't, it's as simple as that.

Now, showing how complex an XPath selector on RDF/XML is compared to a
Fresnel selector on an RDF infoset that achieves the exact same thing,
that's a way to show what you mean, without necessarily forcing us to
pass judgment on the technologies, since users will judge that for
themselves.
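
To make the comparison above concrete, here is a minimal sketch in Python
(standard library only). It is an illustration, not taken from the paper
draft: the example.org subject, the FOAF vocabulary, and the helper names
persons_via_tree_paths and persons_via_graph are all assumptions made for
the example. It does not show a Fresnel selector; the graph-level side is
represented by a plain match over triples. The same single triple is
serialized in two valid RDF/XML shapes, and the tree-level selection has
to enumerate both, while the graph-level selection is one pattern.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF = "http://xmlns.com/foaf/0.1/"

# Two RDF/XML serializations of the same triple:
#   <http://example.org/alice> rdf:type foaf:Person
TYPED_NODE = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:foaf="{FOAF}">
  <foaf:Person rdf:about="http://example.org/alice"/>
</rdf:RDF>"""

DESCRIPTION = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:foaf="{FOAF}">
  <rdf:Description rdf:about="http://example.org/alice">
    <rdf:type rdf:resource="{FOAF}Person"/>
  </rdf:Description>
</rdf:RDF>"""

# The same information at the graph level, written by hand here
# (an assumption: in practice an RDF parser would produce it).
TRIPLES = [("http://example.org/alice", RDF + "type", FOAF + "Person")]


def persons_via_tree_paths(xml_text):
    """Select subjects typed foaf:Person by matching each tree shape."""
    root = ET.fromstring(xml_text)
    found = set()
    # Shape 1: the type is used as the element name.
    for node in root.findall(f"{{{FOAF}}}Person"):
        found.add(node.get(f"{{{RDF}}}about"))
    # Shape 2: the type is an rdf:type child of rdf:Description.
    for node in root.findall(f"{{{RDF}}}Description"):
        for type_el in node.findall(f"{{{RDF}}}type"):
            if type_el.get(f"{{{RDF}}}resource") == FOAF + "Person":
                found.add(node.get(f"{{{RDF}}}about"))
    return found


def persons_via_graph(triples):
    """Select the same subjects with a single triple pattern."""
    return {s for s, p, o in triples
            if p == RDF + "type" and o == FOAF + "Person"}
```

Even this tree-level version is incomplete: it ignores rdf:type written
as an attribute, nodes nested inside property elements, and rdf:ID or
rdf:nodeID subjects, each of which would need yet another path.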

We should try to avoid being dismissive of technologies that have
worked so far, or people will feel defensive and stop reading with an
open mind.

>>>> The problem is not that; it is the fact that we have graphs instead
>>>> of trees, and all existing technologies work with trees, not graphs.
>
>>
>>
>> Yes. But it is not just that. It is the fact that the RDF/XML tree is
>> intended for the serialization of the RDF model. It is just one
>> possible projection of the graph, and it thus loses the generality
>> found at the RDF graph level.

This is incorrect. The RDF/XML tree is a loss-less serialization.
Therefore, by design and definition, it contains *all* the information
of an RDF infoset.

Now, we all agree that the nature of this serialization is fairly low
level (with some syntax sugar that was supposed to make embedding
easier... but IMO, made it just weirder for XML folks) therefore
ill-suited for any kind of XML-oriented postprocessing or pipelining.

[if this weren't the case, there would be no need for a 'graph->tree'
bridge in the first place and we wouldn't be talking at all!]

At the same time, there is a difference between 'impossible' and
'cumbersome and error prone'. I think XSLT postprocessing of an RDF/XML
model (even if canonicalized) falls into the second category, not the
first one.

>> My point is that it is not just a question of processing graphs or
>> trees with the right tools, it is also a question of how the data is
>> structured and represented. When presenting people with the RDF/XML
>> tree representation of a model, you force them to build a mental
>> representation of the RDF/XML tree and to map/convert it to what it
>> actually "means" at the more abstract (purely) RDF level. This is a
>> hard mental operation which requires a lot of *unnecessary* cognitive
>> effort. Here, it is more the HCI research scientist part of myself who
>> talks, but I believe this to be something that should be taken into
>> account.

This is a very good point and one I completely agree with: the RDF/XML
syntax sugar looks 'weird' to XML people, but we also have to keep in
mind that RDF/XML canonicalization would make it a lot easier (for
example, no attributes other than rdf:about and rdf:resource, and no
rdf:Description/rdf:type, but always the type as the element name, if
present).
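
One of the canonicalization rules mentioned above (always use the type
as the element name) can be sketched as follows. This is a hedged
illustration in Python using only the standard library, not an existing
canonicalizer: the helper name use_type_as_element_name and the sample
document are hypothetical, only top-level rdf:Description nodes are
handled, and the type URI is assumed to split into a namespace and a
valid XML local name at its last '#' or '/'.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF = "http://xmlns.com/foaf/0.1/"

SAMPLE = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:foaf="{FOAF}">
  <rdf:Description rdf:about="http://example.org/alice">
    <rdf:type rdf:resource="{FOAF}Person"/>
    <foaf:name>Alice</foaf:name>
  </rdf:Description>
</rdf:RDF>"""


def use_type_as_element_name(root):
    """Rewrite rdf:Description + rdf:type child into a typed element.

    Only the first rdf:type child is promoted to the element name;
    any further rdf:type children are left in place.
    """
    for node in root.findall(f"{{{RDF}}}Description"):
        type_el = node.find(f"{{{RDF}}}type")
        if type_el is None:
            continue  # untyped node: leave it as rdf:Description
        type_uri = type_el.get(f"{{{RDF}}}resource")
        if not type_uri:
            continue
        # Split the URI into namespace and local name (assumption:
        # the last '#' or '/' marks the boundary).
        cut = max(type_uri.rfind("#"), type_uri.rfind("/")) + 1
        namespace, local = type_uri[:cut], type_uri[cut:]
        if not local:
            continue
        node.tag = f"{{{namespace}}}{local}"
        node.remove(type_el)
    return root


canonical = use_type_as_element_name(ET.fromstring(SAMPLE))
```

After this rewrite, a single path on the element name is enough to find
typed nodes in this restricted case, which is the simplification the
canonical form is meant to buy.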

-- 
Stefano Mazzocchi
Research Scientist                 Digital Libraries Research Group
Massachusetts Institute of Technology            location: E25-131C
77 Massachusetts Ave                   telephone: +1 (617) 253-1096
Cambridge, MA  02139-4307              email: stefanom at mit . edu
-------------------------------------------------------------------

Received on Tue Apr 26 2005 - 17:53:20 EDT
