XML iMarket Project Planning Meeting
Steven R. Newcomb
srn at techno.com
Thu Sep 25 22:12:40 BST 1997
[Jon Bosak:]
> What I as a consumer want to be able to do is quite simple. I want to
> be able to say, "Hey, I need a new jacket," sit down at my computer,
> call up my find-a-product robot, enter my jacket parameters, and then
> come back a while later to find all the jackets that fit those
> parameters offered by all the vendors whose products I'm interested in
> considering. If the catalog scheme isn't standardized enough to
> support this, then I as a consumer am not interested in using it. If
> one of the vendors differentiates itself by adopting a scheme of data
> representation that doesn't allow this kind of transparent direct
> comparison, then it differentiates itself right out of the class of
> vendors I'm interested in, because if all it's giving me is the
> ability to cruise its catalog in isolation, I can get the same
> functionality from the printed version; it no longer participates in a
> way that allows the net to add value to me as a consumer.
>
> I'm not denying that vendors will want to differentiate their
> offerings, but if they can't do it in a way that supports detailed
> direct comparisons based on the differentia that I am interested in
> *as a consumer* then they are simply not in the game at all.
There is a very serious problem here that bears strikingly on an
ongoing discussion in XML-land: the discussion of so-called
"namespaces". The idea that there will be consortia of vendors, or
any other sort of authority who will determine some list of names of
characteristics of each sort of product, so that characteristics can
be directly and automatically compared, is dangerous to innovation,
competition, and commerce, and it is totally unnecessary, too. It
will open the door for existing businesses to use such architectures
as weapons against upstarts in niche markets and in unusual or new
market combinations. Moreover, the use of information architectures
as weapons will always seem like a perfectly reasonable business
practice, so it will be nobody's fault when new concepts fail to be
accepted in the marketplace because the internet failed to live up to
its promise of helping people find what they are looking for and make
informed purchasing decisions. The macroeconomy will be damaged.
Andrew Layman (whom I do not know, but would like to) has laid out a
list of requirements for the implementation of namespaces which, if
used as guidance in the development of XML's namespace features, will
create a need for authorities who give "standard" names to such things
as product characteristics. The concentration of power in such
authorities will hinder innovation, by making it difficult to compare
products regarded as "out of category" for some authority's set of
defined names. I quote from Andrew's "Universal Names" posting of 23
September 1997 on the w3c-xml-sig at w3.org list:
[Andrew Layman:]
> I've agreed to summarize the set of requirements that I have
> championed in the past under the term "namespaces." Because this
> word has also meant several alternate sets of requirements, I'm
> temporarily using an entirely different term, "universal names," so
> that we can understand this set of requirements without being
> confused by other useful, but different, goals. ...
>
> [Here] I'm going to describe one set of requirements, as best I
> understand it, in my own words. The name is not important. This set
> of requirements is. ...
>
> Let me mention a few things that are not requirements of this
> facility. They may be useful features in some other context, but
> they are not needed in order to have universal names, and should not
> be confused with universal names:
>
> We do not require an ability to rename elements, so that they can be
> called one thing in a schema and something else in a document instance.
>
> We do not require the ability to associate multiple semantic meanings
> with a single name.
>
> In short, what we need, and all that we need, is a facility that
> gives every element's type a universal name, and allows a single
> element type to be known by the same name across disparate
> documents, where the documents have different "document types" or
> where there is no specific document type.
When Andrew Layman says, "We do not require an ability to rename
elements, so that they can be called one thing in a schema and
something else in a document instance," he is backhandedly stating a
requirement that conflicts with the evolutionary process of defining
and marketing new products. How will the catalog of everything that
is for sale handle a case where the same product characteristic, or
even the same entire product, arises from multiple industries
simultaneously, and each of those industries already uses its own
authoritative schema? Will the contents of documents have to be
duplicated and translated so as to conform with multiple schemas, so
that different comparisons can be made? If so, that will cause much
of the value of making the comparisons in the first place to be lost;
features regarded by authorities as "out of category" will simply
disappear. Imagine a single device that is a fax machine, a
telephone, a copier, a computer, and a stereo sound system. Should it
appear in a list of telephones? Maybe. Should the output wattage of
its amplifier be listable in a comparison with the output wattage of
other telephones? Maybe. Should the people who figure out what are
the interesting characteristics of telephones anticipate that output
wattage may be an important characteristic of telephones? It's
completely unrealistic to expect those people to anticipate that.
And, yet, it's an interesting and relevant statistic and it may be
important to some consumers.
The ugly truth is that we can't predict whether information that is
now thought to be irrelevant to other information (perhaps information
whose existence we don't even know about yet) will turn out to be
semantically identical to it or semantically mappable onto it. In my own
mind, anyway, the real justification for the existence of businesses
that provide "yellow pages on steroids" in support of internet
commerce is to provide the added value of mapping semantics to each
other in such a way that they can be directly compared, just as Jon
says. That mapping can be expressed in some proprietary fashion, or
it can be done using SGML documents that inherit from multiple SGML
architectures, or, if XML supports it, it can be done with XML
documents that inherit from multiple XML architectures, with no limit
on the number of XML architectures that can be inherited, and no
limits on the number of architectures that can usefully be fielded by
old and new industries. If Andrew Layman's much more limited
requirements govern the design of XML, though, XML documents that
represent such semantic mappings will be more costly to create and
maintain. (I guess you'd have to do it all with hyperlinks. Anything
can be done with hyperlinks, but that doesn't mean that everything
*should* be done with hyperlinks. In general, hyperlinks are best
regarded by information managers as a last resort because they cost
more to maintain and their structure is arbitrary and external. It's
better if the information, in effect, maps itself. Inheritable SGML
architectures allow information to map itself in complex ways. Why
shouldn't it be possible to accomplish the same end in XML, without
requiring the use of hyperlinks?)
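
To make the notion of information that "maps itself" a little more
concrete, here is a minimal sketch in Python (not SGML). It only
illustrates the idea; it is not how SGML architectures are actually
declared or processed. Every name in it (element, view_as, the
"telephone" and "stereo" architectures, the "watts" characteristic) is
hypothetical and invented for the illustration.

```python
# Hypothetical illustration only: one element carries, alongside its own
# data, a mapping onto each architecture it claims to inherit from.
element = {
    # The manufacturer's own names for the product's characteristics.
    "characteristics": {
        "model": "OmniDesk 5",
        "colour": "grey",
        "outputWattage": 40,
        "redial": "yes",
    },
    # architecture name -> {name used by that architecture: own name}
    "architectures": {
        "telephone": {"color": "colour", "redial": "redial"},
        "stereo": {"color": "colour", "watts": "outputWattage"},
    },
}


def view_as(element, architecture):
    """Return the element as seen through one of its architectures."""
    mapping = element["architectures"][architecture]
    own = element["characteristics"]
    return {arch_name: own[own_name] for arch_name, own_name in mapping.items()}


print(view_as(element, "telephone"))
print(view_as(element, "stereo"))
```

The same stored element answers to two vocabularies at once, and adding
a third architecture means adding one more mapping to the element, not
a separate document or a set of external links.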
So, I continue to harp on the importance of allowing a single element
to inherit multiple semantics (and/or the _same_ semantic differently
named or named within different namespaces). Andrew Layman says, "We
do not require the ability to associate multiple semantic meanings
with a single name." But, in my own mind, anyway, this really *is* a
requirement for cataloging companies to extract maximum value from
their listings at minimum information management cost in a dynamic,
non-authoritarian market environment. It would allow internet catalog
providers to map each new DTD into their existing DTDs simply by
tweaking their existing DTDs. For example, in the DTD for their
catalog of telephone products, when the output wattage issue first
arises (i.e., when a telephone appears on the market that lists an
output wattage), a declaration is added that allows the
characteristics listed in the DTD for the manufacturer's product
description document to be inherited. In the same declaration, the
features of the product, such as its "colour", can be mapped to their
equivalents already in the DTD (such as "color"). The new feature,
"outputWattage", can be made to appear
with a default value of "not applicable", so now all the existing
telephone product listings have this feature, and they can all respond
meaningfully (if uninterestingly) to queries about it. No need to
create and maintain (!) any hyperlinks. No need to write or maintain
any extra documents. One change in one place updates all telephone
products listed in the catalog, regardless of how many there are. The
amount of information stored hardly increases at all, but the value of
the information increases quite a lot. Essentially the same change
can be applied to the DTDs for stereo systems (now they can have a
redial feature, yes or no), the DTD for copiers, etc. Cheap and very
powerful, no? The catalog provider gets to add a terrific amount of
value at very little cost. New products can be found by consumers
even if they didn't know the hybrid category existed. ("I want a very
loud telephone. Hmmm.") New products for untried niches can be
usefully listed in multiple catalogs. Innovation is not penalized for
being unanticipated by the authorities who created DTDs for product
listings in various categories, or for those authorities' failure to
recognize a viable category. Indeed, there is no need for such authorities at
all. There is only a need for catalogers who can read and understand
incoming DTDs and perform these cheap semantic mapping tricks.
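
The telephone example above can be sketched roughly as follows. This is
Python standing in for a DTD, under my own assumptions: CatalogSchema,
inherit, resolve, loud_telephones and the sample listings are all
hypothetical names invented for the illustration, and a real
implementation would put the declaration in the catalog's DTD.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogSchema:
    """Hypothetical stand-in for a cataloger's DTD for one product category."""
    renames: dict = field(default_factory=dict)   # incoming name -> catalog name
    defaults: dict = field(default_factory=dict)  # catalog name -> default value

    def inherit(self, renames, defaults):
        # The single "declaration": map foreign names, add defaulted features.
        self.renames.update(renames)
        self.defaults.update(defaults)

    def resolve(self, listing):
        # Start from the defaults, then overlay the listing's own values,
        # translating names such as "colour" to the catalog's "color".
        resolved = dict(self.defaults)
        for name, value in listing.items():
            resolved[self.renames.get(name, name)] = value
        return resolved


telephones = CatalogSchema()
listings = [
    {"model": "PlainPhone", "color": "black"},                       # existing listing
    {"model": "OmniDesk 5", "colour": "grey", "outputWattage": 40},  # new hybrid
]

# One change in one place; no stored listing is edited.
telephones.inherit(renames={"colour": "color"},
                   defaults={"outputWattage": "not applicable"})


def loud_telephones(min_watts):
    """Models whose resolved output wattage is numeric and at least min_watts."""
    hits = []
    for listing in listings:
        watts = telephones.resolve(listing)["outputWattage"]
        if isinstance(watts, (int, float)) and watts >= min_watts:
            hits.append(listing["model"])
    return hits


print(telephones.resolve(listings[0]))
print(loud_telephones(20))
```

The existing listing now answers a query about output wattage with "not
applicable", while the hybrid product turns up when someone asks for a
very loud telephone.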
You can do all this now with SGML (as of August 1, 1997; see
http://www.ornl.gov/sgml/wg8/document/1920.htm). The only question is
whether XML will be able to do it. Maybe it doesn't matter; providers
of internet shopping directories can always maintain their source
information in SGML and simply deliver it in XML form, if they like.
(Or in HTML form, for that matter.)
-Steve
--
Steven R. Newcomb President
voice +1 716 271 0796 TechnoTeacher, Inc.
fax +1 716 271 0129 (courier: 23-2 Clover Park,
Internet: srn at techno.com Rochester NY 14618)
FTP: ftp.techno.com P.O. Box 23795
WWW: http://www.techno.com Rochester, NY 14692-3795 USA