Process vs. Markup (was Names, Dates, Etc.)

Thu Jan 14 16:03:34 GMT 1999

John Cowan wrote:

> For all of these, we need architectures rather than markup languages
> per se, because applications may need more than one name, date,
> or money amount.

This discussion now has nowhere to go without first sorting out the various
stages of markup and processing. XML is being promoted--aggressively--as freedom
for every Web author to express each nuance of personal epistemology by creating
new tags, by following individual preferences in attributes vs. elements, and by
constructing uniquely rococo subelement structures. That is the promise which has
lured the customers in from the HTML world (the masses are not coming to XML from
SGML). Namespaces somewhat codified the low rumblings there had always been that,
someday, potential chaos would have to be tempered by agreed tagsets of industry-
or discipline-wide scope. But XML is becoming a mass-market phenomenon because
the fixed tagset(s) of HTML hit a wall, and the frustrated refugees are flocking
to the most appealing possible promise:  define your own markup.

John Cowan is undoubtedly sincere when he declares:

> We do not want a fixed element type "Date" analogous to "H2" or "over".

Nevertheless, if the 'problem' is to be solved with markup alone, fixed element
types are precisely what will eventually result.

We speak of markup--the XML Recommendation describes markup and prescribes the
most basic handling of it--but the builders of XML processors are the true
audience for this philosophizing and these rules of markup. Parsers, of course,
provide only the first stage of processing. The discussion of names, dates, etc.
illustrates precisely the sort of processing which will have to be done and which
we, as the members of xml-dev, will presumably have to build.

In the first place, I do not believe that we can simply withdraw, or even
substantially limit, the perceived freedom to create tags and then to apply
instance markup following personal inclination. In my evangelical experience,
this is the one salient characteristic which draws every newcomer to XML. On the
other hand, there is only a limited, and more-or-less easily enumerated, number
of processes which can be performed on dates, or names, or other such classes of
data. These processes are not only suitable for standardization, but ripe for it.
Where this leaves us is with the need for an intermediate layer, not only to map
between idiosyncratic markup and standardized process, but specifically to strip
out the 'presentational'--including the inherently cultural--leaving only the
'ontological', which may be easily processed. This intermediate layer may well be
a species of architectural forms, but it is predicated on a need for processing,
not just for declarative markup.
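
As a rough illustration of such an intermediate layer, here is a minimal Python
sketch. Everything named in it is an assumption made for the example: the two
element types ("date" with attributes, "when" with subelements), the extractor
functions, and the EXTRACTORS table are hypothetical, not part of any proposal
on this list. The point is only that each author's markup gets its own small
mapping, while the process behind the mapping (here, date arithmetic) is
written once.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical extractors: each one knows how a single author's markup
# encodes a date and returns bare (year, month, day) integers.
def from_attributes(el):
    return int(el.get("year")), int(el.get("month")), int(el.get("day"))

def from_subelements(el):
    return int(el.findtext("y")), int(el.findtext("m")), int(el.findtext("d"))

# The intermediate layer: element type name -> extractor (assumed names).
EXTRACTORS = {
    "date": from_attributes,
    "when": from_subelements,
}

def to_canonical_date(xml_text):
    """Map one idiosyncratic date element onto a plain date value."""
    el = ET.fromstring(xml_text)
    return date(*EXTRACTORS[el.tag](el))

def days_between(a, b):
    """The standardized process: it never sees the original markup."""
    return (b - a).days

posted = to_canonical_date('<date day="14" month="1" year="1999"/>')
due = to_canonical_date("<when><y>1999</y><m>02</m><d>10</d></when>")
print(days_between(posted, due))
```

Supporting a third author's markup would mean adding one extractor and one
table entry; days_between would not change.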

Notice, too, that our 'styling' mechanism must perform a mirror-image operation
on the output of these standard processes. Dates submitted in one 'cultural'
form, mixing presentational elements with their ontology, must be stripped in
order that we may perform date arithmetic, but their subsequent presentation
requires the reintroduction of cultural conventions--perhaps those of a different
culture from the one whose conventions were originally stripped away. Some sense of how many
layers of such cultural information must be stripped, and then re-introduced by
degrees, is apparent in John Cowan's example of the sort order of Icelandic
names. The details of that example imply that at least some standard
name-handling processes must manipulate data sufficiently stripped that it
requires the re-introduction of cultural data in order to establish a
(presentational) sort order, even if that sort is only to facilitate the further
handling of the data by another standard process.
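
The strip, process, and restore sequence for dates can be sketched in Python as
follows. The convention labels ("day-first", "month-first", "iso") and the
function names are assumptions chosen for the example, and a real system would
need far richer cultural data than a single format string per convention.

```python
from datetime import datetime, timedelta

# Hypothetical convention table: label -> strptime/strftime pattern.
# Only numeric patterns are used, so the result does not depend on the
# locale of the machine running the code.
CONVENTIONS = {
    "day-first": "%d/%m/%Y",
    "month-first": "%m/%d/%Y",
    "iso": "%Y-%m-%d",
}

def strip_culture(text, convention):
    """Remove the presentational form, leaving a bare date value."""
    return datetime.strptime(text, CONVENTIONS[convention]).date()

def add_days(d, days):
    """A standard process that works only on the stripped value."""
    return d + timedelta(days=days)

def restore_culture(d, convention):
    """Reintroduce a presentational form, possibly a different one."""
    return d.strftime(CONVENTIONS[convention])

submitted = strip_culture("14/01/1999", "day-first")
due = add_days(submitted, 30)
shown = restore_culture(due, "month-first")
print(shown)
```

The arithmetic step never learns which convention the date arrived in, and the
output convention is chosen independently of the input convention.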

Walter Perry

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/