peter at techno.com
Sat Oct 18 15:51:03 BST 1997
W. Eliot Kimber wrote:
> Peter has run head-on into one of the fundamental problems with DTDs as
> currently defined by SGML (and XML): we want them to describe *classes* of
> documents when they actually describe *individual* documents (and are
> incapable of defining classes of documents except in very weak ways).
Eliot's argument is that, as SGML (and XML) are defined today, given
an external declaration subset (the entity
identified by the external identifier of doctype declarations), there
is no (easy) way to guarantee that documents that reference it
actually conform to it, unless those documents' doctype declarations
do not include an internal subset.
This is because entities, notations, elements, and attributes declared
in a document's internal subset can radically alter the document's
type: general entities may be redefined, completely unknown notations,
element types, and attributes may be added, and parameter entities can
be redefined such that notations, element types, and attributes
declared in the external subset have completely different definitions.
All of these modifications can be made completely without constraint.
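To make this concrete, here is a minimal sketch in XML syntax. The
file name memo.dtd and every element, attribute, and entity name in
it are hypothetical, invented only for illustration. Suppose the
external subset reads:

```xml
<!-- memo.dtd: hypothetical external subset -->
<!ENTITY % inline "#PCDATA | emph">
<!ELEMENT memo (para+)>
<!ELEMENT para (%inline;)*>
<!ELEMENT emph (#PCDATA)>
```

A document that references it can then carry an internal subset like
this one:

```xml
<!DOCTYPE memo SYSTEM "memo.dtd" [
  <!-- Read before memo.dtd, so this binding of %inline; is the one
       that takes effect; the later one in memo.dtd is ignored. -->
  <!ENTITY % inline "#PCDATA | emph | blink">
  <!-- An element type and an attribute that memo.dtd never declared. -->
  <!ELEMENT blink (#PCDATA)>
  <!ATTLIST para secret CDATA #IMPLIED>
]>
<memo>
  <para secret="x">Valid, yet <blink>not</blink> what memo.dtd describes.</para>
</memo>
```

In XML the internal subset is processed before the external subset
and the first declaration of an entity is the binding one, so the
document above should validate even though para no longer has the
content model that memo.dtd's author wrote down.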
The only defenses DTD designers have against this all require the DTD
to be even more rigid, as any opportunity for flexibility also opens
up an opportunity for abuse. Moreover, even these defenses may not be
Disallowing the internal subset is not the answer, because it is still
needed in order to describe document-level (as opposed to document
type-level) characteristics, at least things like document-specific
general entities, and configuration control parameter entities (that
configure the DTD in predefined ways, through the use of marked
sections).
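As a sketch of the legitimate kind of configuration meant here (XML
syntax again; report.dtd and the names draft, block, note, and para
are assumptions made up for the example), a DTD can offer a
predefined switch:

```xml
<!-- report.dtd: hypothetical external subset with one predefined switch -->
<!ENTITY % draft "IGNORE">
<![%draft;[
<!-- Only read when a document sets %draft; to INCLUDE. -->
<!ENTITY % block "para | note">
<!ELEMENT note (#PCDATA)>
]]>
<!-- Default binding; ignored if the section above already declared %block;. -->
<!ENTITY % block "para">
<!ELEMENT report (%block;)+>
<!ELEMENT para (#PCDATA)>
```

and a document flips it from its internal subset:

```xml
<!DOCTYPE report SYSTEM "report.dtd" [
  <!-- Document-level choice among options the DTD designer provided. -->
  <!ENTITY % draft "INCLUDE">
]>
<report><para>Body text.</para><note>Check this figure.</note></report>
```

The catch is that the same mechanism that lets the document set
%draft; also lets it redeclare %block; directly, which is exactly the
unconstrained case described above.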
Architectures, IMO, are a step in the right direction, since they are
immune to the kinds of haphazard modifications that make it difficult
to recognize and process a class of documents, while still allowing
the document-level flexibility needed by document authors.
[Sean Mc Grath <digitome at iol.ie> on Sat, 18 Oct 1997 09:48:39 +0100]
> <Statement InvititationForTrouble=TRUE">
> HyTime allows parsing w.r.t. a meta-DTD via HyTime aware parsers. However,
> I think there are many occasions when there is nothing "meta" involved. Just
> a desire to parse w.r.t to an alternative schema. Not a meta-schema - just
> a different schema.
It is true that there is nothing "meta" about meta-DTDs. They should
be called architectural DTDs instead, where "architectural" means
"used via the SGML architecture mechanism defined in Annex A.3 of
ISO/IEC 10744:1997", or "designed to be used architecturally", as in
the case of the HyTime architecture's DTD. Architectural DTDs are
just DTDs being used in a different way.
And yes, architectural processing _is_ tantamount to parsing with
respect to an alternative schema, only the architectural schema is
better protected from the individual needs of documents, and
individual documents are better protected from the generalized needs
of the architectural schema.
Peter Newcomb TechnoTeacher, Inc.
peter at petes-house.rochester.ny.us peter at techno.com
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)