Ownership of Names (was Re: Public identifiers and topic maps)

Tue Sep 29 17:48:26 BST 1998

At 10:51 AM 9/29/98 -0400, John Cowan wrote:

>Spencertown is not a "municipality": it is neither a Town, a Village,
>a City, or an Indian Reservation, which classes exhaustively specify
>New York State local entities.  It is simply a region, part of the
>(official) Town of Austerlitz, that people have agreed to designate
>by that name.
>
>> But lacking a cataloging agency and either assigned names or a
>> deterministic algorithm for generating names from some other classification
>> scheme, there's not much you can do.
>
>That does not mean that there *should* not be anything you can do.
>In such contexts as this, we need a way to reify the concept of
>a "public domain name".

I don't buy it. How do you know that there's this part of Austerlitz called
"Spencertown"? The fact that people call it that must be written down
somewhere reasonably authoritative (I could even use your posts on the
subject as the authority--they're certainly reliably addressable by
reference to the XML Dev archive) or else there is some person who is that
authority (it could be John himself).  There must be a map somewhere that
describes what Spencertown is, or at least what the consensus of it is.  If
there's not, and you need to refer to it, then you would need to create the
authority: create a Web page titled "Spencertown, an unofficial part of the
Town of Austerlitz", with a map and a description of the place, then refer
to it.  If there's no existing authority, then any authority will do.

This type of thing, a thing for which there is no well-defined authority
(because the boundaries of the thing are defined only by common usage and
opinion, not by some governing authority) is an interesting case because,
in a very real sense, every person may have a different definition of what
they think the thing is. That's why I stress that the topic is the *idea*
of Spencertown. You have your understanding of what it is, other people
have theirs. Because the thing is not defined, there can be no single
definition of it. Therefore, the topic "Spencertown" is *your opinion* (or
someone else's opinion) about what Spencertown is.  At best, your authority
is the list of people who share your definition of Spencertown ("everybody
just knows it--well who's 'everybody'?").

Another example would be the topic "Baby Boomer".  The designation "Baby
Boomer" has no authoritative definition (although it may have many formal
definitions). In my experience, no two people I've talked to share the
same definition of what a Baby Boomer is, although we all agree that there
is a Baby Boom and there are Baby Boomers.  So what is the topic "Baby
Boom"? Is it the name "Baby Boom"? Is it the idea of a group of people born
after World War II but before some other point in time? Is it the set of
all people who are baby boomers? Is it some particular statistician's
definition of what the Baby Boom is?

So if you, in the creation of a topic, want to point to the topic "Baby
Boomers", you're going to have to define what that topic means to you, if
only by writing a paragraph or two outlining *your definition* of what the
Baby Boom is. We do this commonly in writing, e.g. "By the term 'baby boom'
*I mean* people born between 1945 and 1962". This writer has assigned the
topic name "baby boom" to the set of all people born between 1945 and 1962.
Another writer might define baby boom as "The set of all people born to
parents old enough to fight in World War II".  Do we have one topic or two?
I can see arguments for both: if unqualified, the term "baby boom" has to
refer to all definitions of the term. If you want to refer to a single
definition, you have to qualify it: "baby boom as defined by author A".

Maybe this is what Steve and John are looking for: an algorithm for saying
"This name is a query over all topics that include this name".  So if I say:

-//All Possible Topics//TOPIC baby boom//EN

It's a query against all things named as topics whose object identifier
includes the words "baby boom" (remembering that FPIs are normalized into
word tokens for comparison).  If I say:

+//IDN drmacro.com::topics//TOPIC baby boom//EN

I must mean my personal definition of "baby boom".

Note that this approach still provides a resolution for the first FPI,
which is the list of FPIs that match the search criteria, which is really
the list of resources those FPIs address, which had better be some resolvable
(or "researchable") definition of what each topic is.
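
As a rough sketch of the distinction between the two FPIs above (an
illustration only, not anything defined by a standard): the owner name
"All Possible Topics" is treated as a query over a catalog, and any other
owner is treated as a request for one specific entry. The catalog contents,
the example.org entries, and the function names are all assumptions made up
for this example. Comparison is by whitespace-separated word tokens and is
case-sensitive here.

```python
from typing import NamedTuple

QUERY_OWNER = "All Possible Topics"  # owner name taken from the example above


class FPI(NamedTuple):
    owner: str
    object_class: str
    words: tuple  # object identifier split into word tokens


def parse_fpi(text):
    """Split '+//owner//CLASS object identifier//LANG' into comparable parts.

    The prefix and language are ignored for comparison in this sketch.
    """
    _prefix, owner, text_id, _language = text.split("//")
    object_class, *words = text_id.split()
    return FPI(" ".join(owner.split()), object_class, tuple(words))


def includes_words(haystack, needle):
    """True if the needle tokens occur as a contiguous run in haystack."""
    n = len(needle)
    return any(haystack[i:i + n] == needle
               for i in range(len(haystack) - n + 1))


def resolve(text, catalog):
    """Return the catalog FPIs that the given FPI refers to."""
    wanted = parse_fpi(text)
    if wanted.owner == QUERY_OWNER:
        # Query form: every topic whose object identifier includes the words.
        return [entry for entry in catalog
                if parse_fpi(entry).object_class == wanted.object_class
                and includes_words(parse_fpi(entry).words, wanted.words)]
    # Specific form: only the entry with this owner and this name.
    return [entry for entry in catalog if parse_fpi(entry) == wanted]


# Hypothetical catalog; only the drmacro.com entry comes from the post.
CATALOG = [
    "+//IDN drmacro.com::topics//TOPIC baby boom//EN",
    "+//IDN example.org::topics//TOPIC the  baby boom  generation//EN",
    "+//IDN example.org::topics//TOPIC Spencertown//EN",
]

everyone = resolve("-//All Possible Topics//TOPIC baby boom//EN", CATALOG)
mine = resolve("+//IDN drmacro.com::topics//TOPIC baby boom//EN", CATALOG)
```

Here `everyone` holds the two "baby boom" entries and `mine` holds only the
drmacro.com entry; each returned FPI would still have to be resolved to a
document that defines the topic.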

Of course, one problem with this approach is how to know when a reference
to a topic is a query and when it resolves to a single resource. I suppose
that could be a function of the resolution mechanism provided by the FPI
name owner. For example, we could again imagine a topic cataloging service
with the registered owner name "topics.com":

+//IDN topics.com

Using normal URN resolution services, uses of this FPI would be directed to
the topics.com server, which could then perform the search listed above.
When the query returned the drmacro.com FPI, that FPI would be directed to
the drmacro.com server for resolution, which would then return whatever it
maps to, say a document defining what I mean by baby boom.
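
The two-step resolution just described might look something like the
following sketch. Everything here is hypothetical: the servers are plain
functions in a dictionary keyed by owner identifier, the full query FPI
"+//IDN topics.com//TOPIC baby boom//EN" is an assumed form, and the
returned document is a placeholder string. No real URN resolution service
is being modeled.

```python
def owner_of(fpi):
    """Owner identifier: the part between the first two '//' separators."""
    return fpi.split("//")[1]


def topic_name(fpi):
    """Object identifier with the object class (e.g. TOPIC) dropped."""
    return " ".join(fpi.split("//")[2].split()[1:])


# Placeholder data standing in for what each server would really hold.
DRMACRO_DOCUMENTS = {
    "baby boom": "placeholder: what drmacro.com means by baby boom",
}
TOPICS_COM_INDEX = [
    "+//IDN drmacro.com::topics//TOPIC baby boom//EN",
]


def drmacro_server(fpi):
    """Return whatever document this owner maps the FPI to."""
    return [DRMACRO_DOCUMENTS[topic_name(fpi)]]


def topics_com_server(fpi):
    """Search the index, then hand each hit to its own owner's server."""
    wanted = f" {topic_name(fpi)} "
    hits = [known for known in TOPICS_COM_INDEX
            if wanted in f" {topic_name(known)} "]
    documents = []
    for hit in hits:
        documents.extend(resolve(hit))
    return documents


SERVERS = {
    "IDN topics.com": topics_com_server,
    "IDN drmacro.com::topics": drmacro_server,
}


def resolve(fpi):
    """Direct the FPI to the server registered for its owner."""
    return SERVERS[owner_of(fpi)](fpi)


documents = resolve("+//IDN topics.com//TOPIC baby boom//EN")
```

Whether a reference acts as a query or resolves to a single resource is
decided entirely by which server function the owner maps to, which is the
point made above about leaving that choice to the name owner.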

But there's still a problem, I think. Because if I just point to the
"public topic" named "baby boomer" where did I get that name from to know
to use it? There must be something I can point to that is the place or
places I came to understand both that there is a thing called "baby boomer"
and that most people at least recognize the term, if not agree on what it
means.  This might be a magazine article, a news report, or whatever, but
there has to be something. Which means that there is always something I can
point at, even if it's only as a bibliographic reference ("the term 'baby
boom' was first used in an article by blah blah blah").  Which suggests
that no matter how fuzzily defined your topic, there is always some form of
"authority" that you can point to to serve as some form of definition.  It
could even be "call anybody in the Austerlitz phone book and ask them what
'Spencertown' is, they'll tell you."

So let me stress my key point again: there is no such thing as a "public
topic" with no resource. If authors of topic maps need to refer to things
as topics that are outside of their maps, there must be a mapping from the
name of the topic to its definition. If this mapping doesn't already exist,
then the topic map author must provide it, in the ways I've shown in this
post and in others.

If the topic map standard wants to define conventions for forming names of
public topics such that their resolution can be by application of a
deterministic algorithm rather than through an explicit mapping, that's
fine too. There are any number of existing classification schemes that such
a convention could take advantage of.

For example, it would probably make sense to use Library of Congress
numbers as a primary form of classification, so that an FPI like:

-//doesn't matter//TOPIC LOC::TZ345//EN

is a reference to whatever subject 'TZ345' is within the LoC classification
scheme.

The Topic Map standard could even provide a subportion of its FPI name
space for such names by defining what classification schemes are allowed,
thus ensuring that blind creation of names will not result in clashes. For
example, it might say 'within the FPI name space identified by the registered
owner identifier "ISO/IEC 13250", and the object class "TOPIC", topics can
be identified by object identifiers of the form
"classification_scheme::identifier" where "classification_scheme" is the
name of a classification scheme as defined in this standard and
"identifier" is a scheme-defined subject or classification identifier,
e.g., a Library of Congress subject code, a DSM3 disease code, etc.'.
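
A minimal sketch of how such a convention could be checked mechanically,
assuming the rule reads as proposed above. The set of allowed scheme names
("LOC", "DSM3") and the function name are assumptions for illustration; the
standard defines no such list. The sketch also assumes an ISO owner
identifier is written without a "+//" or "-//" prefix.

```python
STANDARD_OWNER = "ISO/IEC 13250"
OBJECT_CLASS = "TOPIC"
ALLOWED_SCHEMES = {"LOC", "DSM3"}  # hypothetical list of scheme names


def parse_classified_topic(fpi):
    """Return (scheme, identifier) for an FPI in the proposed name space.

    Raises ValueError if the FPI does not follow the convention.
    """
    parts = fpi.split("//")
    if parts[0] in ("+", "-"):
        parts = parts[1:]  # drop the registered/unregistered prefix
    if len(parts) != 3:
        raise ValueError(f"not a three-part FPI: {fpi!r}")
    owner, text_id, _language = parts

    object_class, _, object_id = text_id.partition(" ")
    if owner != STANDARD_OWNER or object_class != OBJECT_CLASS:
        raise ValueError(f"outside the {STANDARD_OWNER} topic name space")

    scheme, separator, identifier = object_id.partition("::")
    if not separator or not identifier:
        raise ValueError("expected classification_scheme::identifier")
    if scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"unknown classification scheme: {scheme!r}")
    return scheme, identifier


# 'TZ345' is the placeholder code used in the example above.
scheme, identifier = parse_classified_topic(
    "ISO/IEC 13250//TOPIC LOC::TZ345//EN")
```

Because the scheme name is checked against a fixed set, two authors who
create names independently cannot clash unless they pick the same scheme
and the same identifier, in which case they mean the same subject.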

Of course, one problem here is the need to fix the names of classification
schemes in the standard (I suppose the standard could be amended any time
a useful new classification scheme is established). What's really needed is
a name space of registered classification scheme names.  But it would be a
start.

Cheers,

E.
--
<Address HyTime=bibloc>
W. Eliot Kimber, Senior Consulting SGML Engineer
ISOGEN International Corp.
2200 N. Lamar St., Suite 230, Dallas, TX 75202.  214.953.0004
www.isogen.com
</Address>

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)