Proposal Critique - XML DTDs to XML docs

Paul Prescod papresco at technologist.com
Fri May 22 00:02:15 BST 1998


Simon St.Laurent wrote:
> 
> There are several possible answers.  You could allow XML Level 1 parsers to
> ignore external entities if they choose - something similar is already in the
> spec right now (in a more limited case, section 4.1) for well-formed
> documents.  

Well, it's a mistake now, and not something I personally think should be
perpetuated. Parsers should not get to choose what they look at and what
they ignore. This badly violates XML's goal of having no optional features:
"external entity parsing" is, in effect, an optional feature in XML. It
doesn't make any sense to me that parsers should be able to decide which
parts of an author's document to process, and I would not encourage you to
carry that over into a new DTD replacement.

> You could hard-wire the href attribute's interpretation in a DTD -
> parsers are already dealing with references in the context of DTDs, and it
> doesn't seem that hard to make sense of href.

I thought that you wanted to use XLink and XPointer?
 
> Another option is to allow the circularity.  This message was brought to you
> (at least partway) by the Internet Protocol, IP, defined in RFC 791.  IP
> includes, and indeed requires, the services of ICMP (defined in RFC 792).
> ICMP uses IP to get from one place to another.  Circular?  Yep.  Workable?
> Certainly.  IP isn't allowed to generate extra ICMP messages about the
> delivery of an ICMP message. There is no circle in practice.  Nor would there
> be a circle in _practice_ by allowing the level 1 spec to refer to the hrefs
> described in XLink, or to simply use href without further consideration.

There is a reason that we usually choose not to have circular
specifications. First, reading and writing them is often a pain. Second,
the two become interdependent. As it is now, we could invent "XLink-Em" in
five years, and deprecate XLink without affecting XML. This makes progress
much easier. In fact, it is the primary reason that we split these things
into different specifications in the first place -- so that they can grow
separately.

> I think you're dramatically misreading my argument, deliberately making this a
> bogeyman when it isn't.  I see no reason why malicious DTDs would be allowed
> to 'change the parse' any more than current DTDs would be.  Extensible DTDs do
> _not_ mean that anything goes.  Behavior can be proscribed, rules can be set.

What would the rules be? What would extensions be allowed to do and not
do?

> A DTD in this proposal would be allowed to add things to the parse, not
> change the fundamental rules set in level 1.  

I guess I don't understand the difference between adding things and
changing the fundamental rules of the "level 1" parse. DTDs DO change the
fundamental rules of the fundamental parse. What could be more fundamental
than this:

<!DOCTYPE TEST[
<!ENTITY foo "This is the content of my document!">
]>
<TEST>
&foo;
</TEST>
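
As a rough illustration of how much the declaration shapes the parse, here is a minimal sketch. It assumes Python's standard-library xml.etree.ElementTree (an expat-based, non-validating parser that is not part of the original discussion); the names DOC, root and content are made up for the example. The content the application sees comes entirely from the entity declaration in the internal subset.

```python
import xml.etree.ElementTree as ET

# The document from the message above, unchanged.
DOC = """<!DOCTYPE TEST[
<!ENTITY foo "This is the content of my document!">
]>
<TEST>
&foo;
</TEST>"""

# The parser reads the internal subset and expands &foo; before the
# application ever sees the element content.
root = ET.fromstring(DOC)
content = root.text.strip()
print(content)
```

Without the ENTITY declaration, the same parser would reject the document because &foo; would be undefined.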

Now if DTDs were extensible, then I would expect to be able to do
something like this:

<!DOCTYPE TEST[
<!MY-ENTITY-DECLARATION foo "This is the content of my document!">
]>
<TEST>
&foo;
</TEST>

And I would expect to be able to provide the behaviour for
MY-ENTITY-DECLARATION (somehow). 
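
For comparison, a sketch of what a present-day parser does with the extended form. It again assumes Python's xml.etree.ElementTree, and try_parse is a hypothetical helper written for this example. An XML 1.0 parser has no hook for supplying the behaviour of a new declaration type, so it can only refuse the document.

```python
import xml.etree.ElementTree as ET

# The hypothetical extended declaration from the message above.
EXTENDED = """<!DOCTYPE TEST[
<!MY-ENTITY-DECLARATION foo "This is the content of my document!">
]>
<TEST>
&foo;
</TEST>"""

def try_parse(text):
    """Return (root, None) on success or (None, error) on a parse failure."""
    try:
        return ET.fromstring(text), None
    except ET.ParseError as err:
        return None, err

root, error = try_parse(EXTENDED)
print("rejected" if error is not None else "accepted")
```

An extensible DTD would have to say where the meaning of such a declaration comes from and how far it may alter what the parser reports.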

We could restrict DTD extension to data typing, but that strikes me as a
step backwards. Verification is going to be (and should be) increasingly
the job of non-DTD schemata. There is no good reason, in my mind, that
verification of data types, or even of element and attribute types, should
be the responsibility of the parser. XML makes them the responsibility of
the parser for historical reasons (though it separates that responsibility
out as far as was possible). I would encourage you not to perpetuate that
confusing conflation of responsibilities in a DTD replacement.

Verification should be handled at a different level and by a different
piece of software than the parser.
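
A minimal sketch of that separation, assuming Python's xml.etree.ElementTree for the parse. ALLOWED_CHILDREN and verify are hypothetical names standing in for some non-DTD schema and its checker. The parser only builds the tree, and a second, independent pass decides whether the tree is acceptable.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema: the child element names permitted under each parent.
ALLOWED_CHILDREN = {"TEST": {"ITEM"}, "ITEM": set()}

def verify(element, allowed=ALLOWED_CHILDREN):
    """Walk an already-parsed tree and collect structural problems."""
    problems = []
    permitted = allowed.get(element.tag, set())
    for child in element:
        if child.tag not in permitted:
            problems.append(f"<{child.tag}> not allowed inside <{element.tag}>")
        problems.extend(verify(child, allowed))
    return problems

# Step 1: parsing. Both documents are well-formed, so both parse.
good = ET.fromstring("<TEST><ITEM/></TEST>")
bad = ET.fromstring("<TEST><OTHER/></TEST>")

# Step 2: verification, done by separate code after the parse.
print(verify(good))
print(verify(bad))
```

The schema here could be swapped or dropped without touching the parsing step, which is the point of keeping the two apart.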

In other words, I think that we should be reducing the responsibilities of
the DTD, rather than expanding them.  A whole new syntax for a core part
of the language would make XML much more complicated than it is now.

 Paul Prescod  - http://itrc.uwaterloo.ca/~papresco

"A writer is also a citizen, a political animal, whether he likes it or
not. But I do not accept that a writer has a greater obligation
to society than a musician or a mason or a teacher. Everyone has
a citizen's commitment."  - Wole Soyinka, Africa's first Nobel Laureate




