Public identifiers and topic maps

Steven R. Newcomb srn at techno.com
Sun Sep 27 02:11:25 BST 1998


[Eliot Kimber:]

> I think there's two different things being talked about here:

[ ... and lots of other good stuff with which I agree.]

But, Eliot, your note does not address the problem we're trying to 
solve here.

Consider the whole universe of information.  My example, a description
of an obsolete farming implement in an obsolete farm catalog of which
no single copy may even exist any more, was intended to get you to
think "outside the box."  Strong-minded person that you are, it didn't
work.

[Even so, in response to what you say, I feel compelled to point out,
perhaps irrelevantly, that names cannot be owned in any meaningful
sense.  If it were true that *names* were ownable, then there would be
an awful lot of names that we wouldn't be allowed to mention or use.
Only the *meaning* or *referent* of a name can be owned.  Namespaces
can be owned: the names inside them are the meaning of the name of the
namespace.  The most important namespaces are not owned.  But I
digress, and I think you agree with that anyway.  I am reminded of the
millions spent by Xerox Corporation (with only limited success) to
prevent "xerox" from becoming a synonym for "photocopy".]

So here's another example: Lake Geneva.  What namespace does the name
"Lake Geneva" exist in?  Who owns that namespace?  If, for Joe Author,
Lake Geneva (the lake itself, not just its name) is a topic, how
should Joe Author refer to it?  (In fact, the Lake Geneva example
points up another interesting aspect of the problem.  In France, the
very same lake is called "Lac Leman".  Two names, one lake.)  Joe
Author needs to point at the Lake itself as a topic, and he needs to
do it in a way that will be maximally useful to unknown others for
figuring out what it is that he's regarding as this topic.  Nobody is
ever going to "resolve" this pointer; if somehow they did resolve the
pointer, a flood of living water would come pouring out of the CRT, or
the user would be teleported into the lake and be drowned.  That's not
what we're trying to accomplish here.  We're merely trying to find a
way to give others enough information to allow them to have a prayer
of figuring out whether or not two citations of the topic in question
are really about the same topic, and THAT'S ALL.

Anybody who regards Lake Geneva as a topic must use some authoritative
reference work that uses some sort of cataloguing system that endows
this lake with a unique name.  It can't matter at all whether the
authority has created an FPI, URN, or any other "standard" way of
referencing Lake Geneva.  Typically, the authority will not have done
any such thing, and there is absolutely no way to compel any authority
to do so!  It's only necessary to identify the authority, the
namespace in which this unique identifier exists, and the unique
identifier.  It's not important, at least for the next few centuries,
that there be only one authority, namespace, or name that everyone
must use; this is obviously an impossible (not to mention hopelessly
naive) goal.
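
To make the three-part idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the class name, the field names, and the sample authority and catalog values come from no standard and no real catalog. It only shows that a citation is the triple (authority, namespace, identifier), and that two citations can be compared only when they share an authority and namespace.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TopicCitation:
    """Hypothetical record of how an author points at a topic."""
    authority: str   # who assigned the identifier
    namespace: str   # the authority's catalog or naming scheme
    identifier: str  # the unique name inside that namespace


def same_topic(a: TopicCitation, b: TopicCitation) -> Optional[bool]:
    """Compare two citations.

    Returns True or False when both use the same authority and
    namespace.  Returns None otherwise: this sketch treats names
    from different namespaces as not comparable.
    """
    if (a.authority, a.namespace) != (b.authority, b.namespace):
        return None
    return a.identifier == b.identifier


# Sample values below are invented placeholders, not real catalog entries.
joe = TopicCitation("Example Atlas Publisher", "Gazetteer index", "LAKE-0042")
ann = TopicCitation("Example Atlas Publisher", "Gazetteer index", "LAKE-0042")
bob = TopicCitation("Another Almanac", "Lakes list", "Lac Leman")
```

Here same_topic(joe, ann) says the two citations agree, while same_topic(joe, bob) returns None: deciding that two different authorities mean the same lake takes knowledge that lives outside the citations themselves.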

> 1. Topics that are "published resources", that is, a thing that the
> creator of the thing has made public in some way.  One way to do
> this is to announce to the world "I have defined a topic called
> '+//...//EN' which refers to the idea of blah blah blah".  Note that
> "the idea of blah blah blah" is the primary form of the resource
> that the name "+//...//EN" is mapped to as indicated by the message
> from the publisher ("which refers to").

Yes, but normally the authority will not be so cooperative, and it
will neither know nor care that Joe Author uses or needs to use one of
its catalog numbers to refer to Lake Geneva.  This is neither a
copyright issue nor any other kind of legal problem.  The authority
gave Lake Geneva that catalog number; Joe didn't.  Joe just needs to
use it, and there is no reason why he should not be permitted to do
so.  Moreover, there's no reason why other people should not be able
to understand what Joe had in mind when he used it.

Everybody who creates a topic map needs to decide for themselves whose
names and namespaces they choose to regard as authoritative.  Joe
Author must, in all cases, be the ultimate meta-authority who decides
what authority he will regard as authoritative for the purpose of
helping him to refer to a topic.  Joe Author's choice of authority
will normally be made on the basis of his assessment of what is most
likely to be meaningful to the topic map's intended audience(s).

> For example, say Steve has decided to provide the service of
> cataloging public topics and provides a registration service by
> which publishers of topics can request that Steve catalog their
> topics.

In the general case this is much too hopeless a cause to base a
business on.  For most practical cases, there are already numerous
authorities.  Nobody will use my catalog number for Lake Geneva when
there are so many cartographers, water resource catalogers, almanacs,
government agencies, travel agencies, etc. etc. whose published
materials are far more accessible and far more authoritative than
anything I could ever do.  Michelin springs to mind, as does the US
Defense Mapping Agency and the World Almanac.  All are perfectly good
authorities.

> Steve has registered the owner name "technoteacher.com", so he owns
> that name space and all names within it.  I call Steve and ask to
> register my topic.  I give to Steve my authoritative description of
> the topic ("the idea blah blah blah"). Steve assigns a name and
> creates an entry in his catalog that looks like this:

> +//IDN technoteacher.com//DOCUMENT ABCD.1234-466 QZ2//EN := 
>    "The idea blah blah blah"

> Steve owns the name but I own the resource.  Nothing in the name
> indicates who owns the resource, in this case. (It could, but that
> would be up to Steve and his design for a cataloging scheme).

As a practical matter, I'm not gonna do this, and neither is anybody
else.  Anyway, I'm not interested in cataloging a document, per se.
I'm interested in a *topic*, and the only reason I'm interested in
documents is that I may choose to use a document that authoritatively
provides that topic with a unique identifier in order to refer
unambiguously to that topic.

> Note also that there is no meaningful, namable, thing that is an
> "abstract concept" as soon as the description of that concept gets
> recorded in some reasonably permanent and retrievable form.

I don't understand this statement at all.  Abstract concepts don't
cease to exist whenever someone defines or describes them.

> Thus, it's not meaningful to have a name for an "abstract topic"
> without having some authoritative definition of what that topic is.

I disagree.  A topic can exist regardless of whether it has a name and
regardless of whether it has been described.  Take away all of Lake
Geneva's names and descriptions, and you still have a lake -- the very
same lake, in fact.  (I admit that, under such circumstances, you
can't reference it without standing in front of it and pointing at it
with your finger.  Still, it exists.)

> Because there must always be a description (even if it's "call Eliot
> and ask him what this topic is all about") there will always be at
> least one resource for the name of that topic to map to.

Most topics are not owned by anyone.

> Of course, it is the responsibility of the owner of the
> idea to declare and publicize what that resource is.  

Which Sears, Roebuck & Company in fact already did in their 1922 Farm
Catalog.  If Joe Author chooses to regard that publication as the
authoritative disambiguator of what he's talking about, how should
he do it?  *That* is the question I'm trying to pose here.

If your answer is that Joe Author should declare his own namespace in
which "Sears, Roebuck & Company 1922 Farm Catalog Number R204" is a
meaningful name, I reply to you that that all makes perfect sense,
except for the part about Joe Author having to declare his own
namespace.  There is no point in comparing the name "Sears, Roebuck &
Company 1922 Farm Catalog Number R204" in Joe's namespace with any
name in any other author's namespace -- they're different namespaces,
after all, and any similarity in any two names in the two different
namespaces is, by definition, coincidental.  The whole value of
mentioning Sears at all stems from the fact that "Sears, Roebuck &
Company" is a name in the namespace that all of us who breathe oxygen
and spend money in North America hold in common.  Nobody owns this
namespace.  "Sears, Roebuck & Company" is a name whose *meaning* or
*referent* (a certain retail merchandising company) belongs to a
certain group of stockholders, but the name itself belongs to all of
us and it appears in a namespace that belongs to nobody (or
everybody).  The name "Sears, Roebuck & Company" is meaningful only in
that common culturally-determined namespace.

In other words, ISO standard numbers, W3C Recommendations, Internet
domain names, ISBNs, ISSNs, and Library of Congress Catalog Numbers
are all special cases -- namespaces that happen to be specially
recognized by existing formalisms for referencing documents.  (And
even they themselves are meaningless except by virtue of the fact that
we all share a common culture in which they are meaningful.)  The
overwhelming majority of topics don't have unique identifiers in any
of those specially-recognized namespaces, and topics aren't documents,
anyway.  What is needed is a much more generalized capability -- one
that begins its location ladder in the common unnamed namespace of the
culture from which the topic map springs, and which can identify any
namespace "commonly" used in that culture.  This requirement echoes
and extends a similar thought that appeared earlier in this same
thread:

[Eliot Kimber:]

> ...here's what I'd like to see happen:
> 
> 1. A general recognition of the need for name-space/name bindings in
> data representation standards, regardless of the kind of data.  If
> these bindings are further standardized along the URN lines (its
> semantics, not its syntax, necessarily), so much the better.

I've been thinking that Joe Author should just create his own FPIs for
public topics.  If, as several have said on this list, Joe Author
cannot be trusted to create his own FPIs, or if the philosophy of FPIs
would be undermined by such a practice, what should Joe Author do?  We
need to fulfill this requirement, and if FPIs can't or shouldn't do
the job, we need to create something that will.

BTW, I like all of Rick Jelliffe's suggestions, which are not very
different from what is now in the Topic Navigation Map draft.  If I
understand them, collectively they amount to an enhanced FPI syntax.
Given a new public text type, "TOPIC", and copious use of "::" as a
field separator, can y'all countenance the use of FPIs to refer to
public topics?
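
For anyone who wants to see the pieces, here is a rough Python sketch that pulls apart an FPI of the "+//owner//CLASS description//LANG" shape quoted earlier in this note. The function and field names are my own inventions for illustration. The sketch handles only the "+//" and "-//" forms with exactly four fields; it ignores ISO-owner identifiers and the optional display-version field. The "TOPIC" example string and the splitting of the description on "::" are purely hypothetical, since no such syntax has been agreed.

```python
from typing import NamedTuple


class FPI(NamedTuple):
    registered: bool   # True for a "+//" prefix, False for "-//"
    owner: str         # owner identifier
    text_class: str    # public text class, e.g. DOCUMENT
    description: str   # public text description
    language: str      # public text language, e.g. EN


def parse_fpi(fpi: str) -> FPI:
    """Split a four-field "+//" or "-//" FPI into its parts."""
    parts = fpi.split("//")
    if len(parts) != 4 or parts[0] not in ("+", "-"):
        raise ValueError("unsupported FPI shape: %r" % fpi)
    prefix, owner, text, language = parts
    # The class is the first word; the rest is the description.
    text_class, _, description = text.partition(" ")
    return FPI(prefix == "+", owner, text_class, description, language)


# The identifier from the quoted example in this thread.
doc = parse_fpi("+//IDN technoteacher.com//DOCUMENT ABCD.1234-466 QZ2//EN")

# A made-up "TOPIC" identifier, assuming "::" separates description fields.
topic = parse_fpi(
    "+//IDN technoteacher.com//TOPIC "
    "Sears, Roebuck & Company::1922 Farm Catalog::R204//EN"
)
fields = topic.description.split("::")
```

Under these assumptions, a "TOPIC" class is just another value in the class slot, and the "::" fields could carry the authority, the namespace, and the identifier in that order.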

-Steve

--
Steven R. Newcomb, President, TechnoTeacher, Inc.
srn at techno.com  http://www.techno.com  ftp.techno.com

voice: +1 972 231 4098 (at ISOGEN: +1 214 953 0004 x137)
fax    +1 972 994 0087 (at ISOGEN: +1 214 953 3152)

3615 Tanner Lane
Richardson, Texas 75082-2618 USA

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/



