Re: infoURI standard officially blessed

From: Ray Denenberg, Library of Congress <rden_at_loc.gov>
Date: Fri, 18 Nov 2005 13:40:16 -0500

Hello, may I join this discussion and illustrate an application where
'info' is used successfully as a pure identifier?

With the SRU protocol (currently, http://www.loc.gov/srw/ but very soon to
be relocated to http://www.loc.gov/standards/sru/) a client requests that a
server send a record according to a particular schema - MODS, for instance.

So the client says "please search the database for records with 'cat' in the
title, and send back the first 5 records, in MODS".

The request is sent via URL, which includes a variety of parameters, one of
which identifies the MODS schema. That identifier is:

                    info:srw/schema/1/mods-v3.0

(See: http://www.loc.gov/srw/infoURI.html)
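To make the request concrete, here is a rough sketch of how a client might assemble such a URL. The endpoint (sru.example.org), the CQL query string, and the version value are assumptions made for illustration; only the schema identifier comes from the text above. The parameter names follow the usual SRU searchRetrieve conventions.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical SRU endpoint, used only for illustration.
BASE_URL = "http://sru.example.org/catalog"

# The schema identifier quoted in the message above.
MODS_SCHEMA_ID = "info:srw/schema/1/mods-v3.0"


def build_search_url(base_url, cql_query, max_records, schema_id):
    """Assemble a searchRetrieve URL; the schema travels as a plain string."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",              # assumed protocol version
        "query": cql_query,            # assumed CQL for "'cat' in the title"
        "maximumRecords": str(max_records),
        "recordSchema": schema_id,     # the 'info' URI, never dereferenced
    }
    return base_url + "?" + urlencode(params)


url = build_search_url(BASE_URL, 'dc.title = "cat"', 5, MODS_SCHEMA_ID)
print(url)

# What the server sees after decoding the query string.
received = parse_qs(urlsplit(url).query)
```

The identifier is percent-encoded on the way out and decoded on arrival, so the server ends up comparing exactly the string the client chose.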

That 'info' URI does not get dereferenced. The server receiving it either
recognizes it or not. If it recognizes it, it either supports it or not. So
one of the following:
(a) the server recognizes it and supports it -- it sends back MODS records.
(b) the server recognizes it but doesn't support it -- it sends a diagnostic,
"we don't send MODS records".
(c) the server doesn't recognize it -- it sends a diagnostic, "schema id not
recognized".
In any case there is no scenario by which the server would dereference the
URI, go get the schema, and then formulate and send MODS records. Most
importantly, though (b) and (c) are failed transactions, most likely neither
failed because of a misunderstanding (i.e. a misinterpreted URI). In other
words we think that this use of identifiers presents the greatest chance of
interoperability ("interoperability" not necessarily "success").
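The three outcomes (a), (b) and (c) amount to plain string comparison on the server side. A minimal sketch, assuming a server that keeps two sets of identifier strings; the function name, the set names, and the return shape are all hypothetical, and the diagnostic texts are just the informal wording used above:

```python
MODS = "info:srw/schema/1/mods-v3.0"
# An 'http' identifier that the SRU schema table also uses as a pure identifier.
ZEEREX = "http://explain.z3950.org/dtd/2.0/"


def handle_record_schema(schema_id, recognized, supported):
    """Decide how to answer a request for records in the given schema.

    The identifier is only ever compared as a string; nothing is fetched.
    """
    if schema_id not in recognized:
        return ("diagnostic", "schema id not recognized")      # case (c)
    if schema_id not in supported:
        return ("diagnostic", "we don't send MODS records")    # case (b)
    return ("records", schema_id)                              # case (a)


recognized = {MODS, ZEEREX}

case_a = handle_record_schema(MODS, recognized, supported={MODS, ZEEREX})
case_b = handle_record_schema(MODS, recognized, supported={ZEEREX})
case_c = handle_record_schema("info:srw/schema/1/unknown", recognized, {ZEEREX})
```

Note that there is no network call anywhere in the function, which is the point: the outcome depends only on whether the server already knows the string.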

Now you can, if you like, argue that this URI is indeed dereferenceable:
There is a table of these schema identifiers at
http://www.loc.gov/srw/record-schemas.html (and note that these URIs used to
identify schemas are not all 'info' URIs, some are 'http', but even in those
cases these are "pure identifiers", for example try to click on
http://explain.z3950.org/dtd/2.0/) and for each identifier there is listed
an associated URL that gets you the schema. So a developer who wants to
program support for, say, Dublin Core would look in the table to find the
URI identifying the Dublin Core schema as used by SRU, and the schema that
URI identifies. So you could argue that's resolution: go to the SRU table
and look up the URL corresponding to the identifier. But the point is that
these URIs are created primarily for the proper operation of the protocol,
and this "resolution" does not occur as part of that.

And why not just use the schema URL? The reason is not persistence but
uniqueness: a schema may have several locations, and if the client uses one,
the server might recognize it by another, resulting in a failed transaction
even though the server was capable of fulfilling the request. So we want to
try to have a unique identifier for any given schema.
(Not always achieved, but this approach would seem to have a much better
chance than using URLs.)
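A small sketch of the uniqueness problem, assuming two copies of the same schema file at different locations. Both URLs are made up for illustration, and the exact-match comparison is an assumption about how a server would check the parameter:

```python
MODS_ID = "info:srw/schema/1/mods-v3.0"

# Hypothetical locations of one and the same schema file.
LOCATION_A = "http://schemas.example.org/mods/mods-3-0.xsd"
LOCATION_B = "http://mirror.example.net/xsd/mods-3-0.xsd"


def server_accepts(requested, known_by_server):
    """Exact string match, as a server comparing identifiers would do."""
    return requested == known_by_server


# Client and server each picked a different location for the same schema.
by_location = server_accepts(LOCATION_A, LOCATION_B)

# Client and server both use the single agreed identifier.
by_identifier = server_accepts(MODS_ID, MODS_ID)

print(by_location)    # False
print(by_identifier)  # True
```

The first request fails although the server could have sent MODS records; the second cannot fail for that reason.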

Schemas are just one of several objects that need to be identified in the
SRU protocol. Diagnostics, context sets, and protocol extensions are others.

The question was raised in this thread, who assigns these URIs, one central
authority? No, there's a hierarchical authority. Note the form:
info:srw/<object type>/<authority>/<identifier>
So for:
info:srw/schema/1/mods-v3.0
the "authority" is "1". There is a table of authorities at
http://www.loc.gov/srw/infoURI.html, where "1" is the Library of Congress.
LC is also the maintenance agency and assigns the other authority strings to
anyone who requests one; they can then assign their own identifiers.
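For illustration, the hierarchical form can be split into its three parts with a few lines of code. This is only a sketch: the function and type names are hypothetical, the authority table contains just the one entry named above, and nothing in the protocol requires a server to parse the identifier rather than compare it whole.

```python
from typing import NamedTuple


class SrwInfoUri(NamedTuple):
    object_type: str
    authority: str
    identifier: str


# Only the entry mentioned in this message; the real table has more.
AUTHORITIES = {"1": "Library of Congress"}


def parse_srw_info_uri(uri):
    """Split info:srw/<object type>/<authority>/<identifier> into parts."""
    prefix = "info:srw/"
    if not uri.startswith(prefix):
        raise ValueError("not an info:srw URI: " + uri)
    # Split at most twice so the identifier may itself contain slashes.
    parts = uri[len(prefix):].split("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected <object type>/<authority>/<identifier>: " + uri)
    return SrwInfoUri(*parts)


parsed = parse_srw_info_uri("info:srw/schema/1/mods-v3.0")
print(parsed.object_type, parsed.authority, parsed.identifier)
print(AUTHORITIES.get(parsed.authority, "unknown authority"))
```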

--Ray Denenberg
Received on Fri Nov 18 2005 - 18:34:09 EST
