Public Identifiers

Paul Prescod papresco at
Sun Sep 20 18:16:01 BST 1998

W. Eliot Kimber wrote:
> Here I agree. Public identifiers are, conceptually, the same as URNs, that
> is, they are names that are intended to be indirected to their actual
> system ID, rather than being direct references to storage locations, as
> URLs normally are. However, as Dan Connolly and Tim B-L have argued, there's
> no *functional* difference between a URN and URL because persistence is
> always a function of the owner of the resource and cannot be guaranteed
> simply by the choice of name. Thus, at most, the URN/URL or public
> ID/system ID distinction can only express *intent*, it cannot guarantee
> results.

But this is not without precedent in XML. Similarly, we cannot guarantee that an
application, even a conforming one, will treat processing instructions and
comments differently. A conforming (but annoying) XML editor could remove
all processing instructions. But we know that an author would have used
processing instructions for a reason: to signal a particular intent. Thus
editors should not remove them.

> Thus, the unavoidable conclusion is that system IDs can be just as
> indirect, and just as persistent, as so-called "public" IDs.  The only real
> difference is what bit of software gets the value of the ID to resolve.

This is also the difference between processing instructions and comments.
In one, the XML processor has the right to interpret and remove the
construct and in the other, the application does.

> My conclusion at this point is that the URN/public ID distinction is not
> helpful because it merely confuses the issue without actually solving any
> problems. The only thing public IDs did was force vendors to provide *a
> way* to do name indirection, which you do need on brain-dead operating
> systems that lack something like symbolic links (which includes both VM/CMS
> and DOS/Windows). If operating-system filename indirection was a universal
> service, you'd just use that to manage redirection of entity storage IDs.
> At the time SGML was developed, it certainly wasn't universal and it may
> not have even been known outside of Bell Labs (I don't remember precisely
> when Unix went public).

Unix has been public since the 70s. Nevertheless, your "only thing public
IDs did...." is an odd statement. If I may paraphrase: "FPIs only provided
a reliable way to interchange SGML data between heterogeneous systems for
the last 15 years, and will continue to for the next 5 that it takes
symbolic linking to become popular on Microsoft platforms." To me, the
word "only" is out of place in such a statement.

In a fantasy world where:

 * URNs are deployed and work
 * XML inter-document entity references allow multiple 
 * all major systems have reliable symbolic links
 * redirection can be accomplished through reliable, well-defined
   *documents*, not through HTTP server-specific magic

Public identifiers are no longer useful in XML. When that world comes
about, I will gladly get rid of them.

> But note that with the second edition of the SOCAT spec, you can remap
> system IDs just as you can public IDs. So even there, FPIs provide no
> unique facility, although the SOCAT mechanism itself does (redirection).

There are major differences:

First, there is the specification of intent: do you *intend* for this
thing to be redirected, because it is a public resource, or do you intend
for it to be a direct resource, that turns out to be redirected because of
some system limitation (e.g. a disconnect from the Internet)?

Second, there is the likelihood of implementation. You, of all people,
understand vendors' reluctance to implement indirection. The only way to
force this implementation is through standardization. SOCATs would never
have come about were it not for Public Identifiers. Indirection, in turn,
would probably never have come about. On the Web, HTTP does standardize a
protocol for redirection, but the means of specifying an indirection is
not defined by any standard.

Third, it is not proper that every reference to a system identifier should
require lookups in a variety of catalogs. That strikes me as a waste of
processing time. If the author has said explicitly "Here is where to find
this thing" then the system should not waste time trying to indirect it at
the source of the reference (in the processor) though it might do so at
the target of the reference (in the filesystem, at the HTTP server, etc.).
In other words, I think that the SYSTEM declaration in SOCAT is probably a
bad idea.
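
To make the third point concrete, here is a minimal sketch of the two
resolution policies. It is an illustration only: the dictionaries stand in
for catalog entries, and the identifiers, paths, and the `remap_system`
flag are hypothetical names, not part of SOCAT or of any real processor.

```python
# Hypothetical, simplified catalog. Real catalogs are text files with
# PUBLIC and SYSTEM entries; plain dictionaries stand in for them here.
PUBLIC_MAP = {
    "-//Example//DTD Sample 1.0//EN": "/usr/local/sgml/sample.dtd",
}
SYSTEM_MAP = {
    "http://example.org/dtds/sample.dtd": "/usr/local/sgml/sample.dtd",
}


def resolve(public_id=None, system_id=None, remap_system=False):
    """Return the storage location for an external identifier.

    A public ID is a symbolic name, so it is always looked up.
    A system ID is an address; it is looked up only when the
    (assumed) remap_system policy is switched on.
    """
    # Symbolic name: the author asked for indirection, so pay for the lookup.
    if public_id is not None and public_id in PUBLIC_MAP:
        return PUBLIC_MAP[public_id]
    # Address: consult the catalog only if system remapping is enabled.
    if remap_system and system_id in SYSTEM_MAP:
        return SYSTEM_MAP[system_id]
    # Otherwise use the address exactly as the author wrote it.
    return system_id
```

With `remap_system=False`, a reference that carries only a system ID costs
no catalog lookup at all; with `remap_system=True`, every such reference
has to be checked against the catalog before it can be used.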

When I use a symbolic name, I recognize that I am invoking some (perhaps
expensive) lookup process. When I use an address, I should not invoke such
a process. Of course, if any portion of the address IS a symbolic name,
then that portion will require a lookup, but the name as a whole should

 Paul Prescod  -

"No religious test shall ever be required as a qualification to any
office, or public trust, in this State; nor shall any one be
excluded from holding office on account of his religious sentiments,
provided he acknowledge the existence of a Supreme Being."
                         - Texas Constitution, Article 1, Section 4
