XSchema Question 3: Internal/External subsets

W. E. Perry wperry at fiduciary.com
Tue Jun 2 23:34:24 BST 1998

Peter Murray-Rust wrote:

> At 16:40 30/05/98 UT, Simon St.Laurent wrote:
> >This was actually one of the questions I was planning for the end, but
> >Peter's recent examples have made me move it to the front.
> >
> >Should XSchema provide internal and external subsets, as do XML DTD's?
> Some clarification in terminology would help here :-).  There is:
>         - an external DTD subset (in a file foo.dtd or xschema.dtd)
>         - an internal DTD subset (within [...] in the DOCTYPE)
>         - an external XSchema (in a file *.xsc)
>         - an internal XSchema [subset] within <XSC:*> elements
> There are several concerns.
>         - should we allow DTD subsets and XSchemas to be mixed in a document
> instance (whether delivered externally or internally)
>         - should we allow internal XSchemas (if so, should we use RDF notation)
>         - should we allow external XSchemas (I assume yes). If so, should they use
> RDF?
>         - should we allow internal and external XSchemas to be mixed.
> If we can agree on internalXSchema and externalXSchema then the questions
> are clearer.
>         P.

And from Prof. Murray-Rust's admirable illustration of the problem have followed all
of the dichotomous queries (querimonies?) of the past three days:  RDF or XML-Data?
W3C Namespaces or some new alternative? Single-root-document philosophy or
fragment-oriented parsing?

In fact, a great many of these standards and suggestions will be used as schemata, or
document content constraints, or typing mechanisms, where they are useful in the markup
of particular content. The inevitability of these overlapping schemes simply
emphasizes the fundamental characteristic of XML:  the resolution of document content
will rest finally with each consumer, despite the efforts of document authors to
design schemata which enforce a single ultimate parse. In the popular press XML is
described as permitting the author to design, and by implication to force the
realization of, unique markup elements. In fact, by separating well-formedness from
validity--making the DTD and all of its analogues optional--XML vests each document
instance consumer with, potentially, absolute power to decide the terms of the parse.
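A minimal sketch of that separation, in Python and purely as an illustration: the document, its DTD and the consumer_accepts rule are all hypothetical. It relies only on the fact that the standard library's xml.etree.ElementTree parser is non-validating, so it checks well-formedness and leaves the author's DTD unenforced.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the internal DTD subset says <note> must contain
# exactly one <to>, but the instance carries a <from> instead. It is
# well-formed, yet not valid against its own DTD.
DOC = """<!DOCTYPE note [
  <!ELEMENT note (to)>
  <!ELEMENT to (#PCDATA)>
]>
<note><from>W. E. Perry</from></note>"""

# ElementTree does not validate, so the parse succeeds and the consumer
# receives the elements exactly as they appear in the instance.
root = ET.fromstring(DOC)
children = [child.tag for child in root]


def consumer_accepts(element):
    """The consumer's own rule (an assumption for illustration):
    any <note> with at least one child element is acceptable."""
    return element.tag == "note" and len(element) > 0


print(children)                # ['from']
print(consumer_accepts(root))  # True
```

The author's DTD rejects this instance; the consumer's rule accepts it. Which of the two prevails is decided on the consumer's side.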

As a practical matter, DTD's and all other XML schemata will be processed by parsing
modules, often through generalized API's like SAX. We expect that the definition of
XSchema on this list will naturally yield the tools to process XSchemata conforming to
the standard finally promulgated. There will be similar modules to process documents
conforming to RDF, XSL, XML-Data, etc. Clearly, what is required to harness a
selection of these modules to a single consistent purpose is an application. The
salient characteristic of this application is to resolve the competing claims of
schemata which may apply to the whole, or parts, of a document and to enforce upon the
application of those schemata the preferences of the application user. Such an
application is, of course, a rules-processing database, designed to read documents in
pari materia not only with the schemata which are asserted to apply to them, but also
with the rules of a user which rank the competing claims and priority of the schemata,
or override them entirely, on as finely granular a basis as an individual user cares
to define.
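What such rules processing might look like can be sketched in a few lines of Python. This is only one possible reading of the idea, not a proposed design: the Claim record, the resolve function, the paths and the verdict strings are all invented for illustration. Each schema asserts something about a part of the document; the user supplies a default ranking of schemata, optional per-path rankings, and optional outright overrides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    schema: str   # which schema makes the claim, e.g. "DTD" or "XSchema"
    path: str     # the part of the document the claim covers
    verdict: str  # what that schema asserts about the content at that path


def resolve(claims, default_ranking, per_path_ranking=None, overrides=None):
    """Pick one verdict per path according to the user's rules.

    Overrides win outright; otherwise the highest-ranked schema wins.
    Claims from schemata the user has not ranked are ignored.
    """
    per_path_ranking = per_path_ranking or {}
    overrides = overrides or {}
    resolved = {}
    for path in sorted({c.path for c in claims} | set(overrides)):
        if path in overrides:
            resolved[path] = ("user", overrides[path])
            continue
        ranking = per_path_ranking.get(path, default_ranking)
        candidates = [c for c in claims
                      if c.path == path and c.schema in ranking]
        if candidates:
            best = min(candidates, key=lambda c: ranking.index(c.schema))
            resolved[path] = (best.schema, best.verdict)
    return resolved


# Hypothetical competing claims about one document.
claims = [
    Claim("DTD", "/order/date", "CDATA"),
    Claim("XSchema", "/order/date", "date"),
    Claim("RDF", "/order/buyer", "resource"),
    Claim("DTD", "/order/buyer", "CDATA"),
    Claim("XML-Data", "/order/total", "decimal"),
]

resolved = resolve(
    claims,
    default_ranking=["XSchema", "DTD", "RDF"],
    per_path_ranking={"/order/buyer": ["RDF", "DTD"]},  # finer-grained rule
    overrides={"/order/total": "integer"},              # user overrides entirely
)
```

Here the XSchema claim wins for /order/date under the default ranking, RDF wins for /order/buyer under the per-path rule, and the user's own verdict replaces the XML-Data claim for /order/total.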

No standards process will resolve entirely the overlapping claims of the diverse
schemata which have sprouted around XML, and new heresies will continue to bloom and
lay claim to different portions of the orthodox turf. This is inevitable from the
seditious nature of XML itself. In recognizing this we should also understand that,
with XML, there is no absolute parse before the document and its putative schemata are
in the hands of each ultimate consumer.

On the practical level where XML developers must operate, this epistemology of XML
promotes a very particular form of object model for documents which have been parsed
(or given a first-cut parse by specific schema-processing modules). In the first place
it is an object model, rather than a relational or network-hierarchical one. Each
parsing module must be able to finish its work in isolation, producing resultant nodes
which resolve to an object space where they may overlap or conflict with the outputs of
other parsing modules. That object space must therefore be in the control of the final
user application, which not only arbitrates among a number of candidate instances
output by parsing modules, but may also supply overriding objects entirely its own.
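A rough Python sketch of such an object space follows. Everything in it is an assumption made for illustration: the ObjectSpace class, the two toy parsing modules and the flat dictionary standing in for a document are not drawn from any XSchema proposal. Each module runs without sight of the others' output, conflicting candidates sit side by side, and the application decides which one it takes or substitutes its own.

```python
class ObjectSpace:
    """Holds candidate objects from independent parsing modules."""

    def __init__(self):
        self.candidates = {}  # node id -> list of (module name, object)
        self.overrides = {}   # node id -> object supplied by the application

    def run(self, name, module, document):
        # Each module sees only the document, never another module's output.
        for node_id, obj in module(document).items():
            self.candidates.setdefault(node_id, []).append((name, obj))

    def override(self, node_id, obj):
        self.overrides[node_id] = obj

    def resolve(self, node_id, prefer):
        # The application's own object wins; otherwise take the candidate
        # from the first preferred module that produced one.
        if node_id in self.overrides:
            return self.overrides[node_id]
        by_module = dict(self.candidates.get(node_id, []))
        for name in prefer:
            if name in by_module:
                return by_module[name]
        return None


def text_module(document):
    """Toy module: every node is plain text."""
    return dict(document)


def number_module(document):
    """Toy module: claims only the nodes it can read as numbers."""
    out = {}
    for node_id, raw in document.items():
        try:
            out[node_id] = float(raw)
        except ValueError:
            pass
    return out


document = {"price": "19.95", "sku": "00417"}  # stand-in for parsed content

space = ObjectSpace()
space.run("text", text_module, document)
space.run("number", number_module, document)
space.override("currency", "USD")  # an object no module produced

price = space.resolve("price", prefer=["number", "text"])    # 19.95
sku = space.resolve("sku", prefer=["text", "number"])        # "00417"
currency = space.resolve("currency", prefer=["number", "text"])  # "USD"
```

Both modules make a claim on sku, one as the text "00417" and one as the number 417.0; neither is wrong on its own terms, and only the application's preference settles which object the consumer sees.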

This is the object model which XSchema and its analogues should implicitly be aiming
toward. We will not get there today, and our immediate focus is the details of
XSchema, but this is the framework in which I think that our modules must eventually
compete for each document consumer's favor.


W. E. Perry

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)
