XML trade off 1 - DTD vs XML Schema

Rick Jelliffe ricko at allette.com.au
Sat Jul 17 09:59:03 BST 1999

From: Ben Hui <ben at mitra.com>

>I am interested in studying the ups and downs of technologies. I just
>read the article in XML.COM
>about how good XML schema is. They mention that the only strength of DTD
>over XML schema is its huge deployment base and amount of supporting

I gather that xml.com has an editorial policy against DTDs (or at least,
in favour of schemas in instance notations, which should not be the same
thing). They are a news and information service, not a technology
evaluation service.

>For those DTD lovers (if any exist), do you really think XML Schema is
>master over DTD? Anything you see DTD is doing better than XML Schema?
>Anything XML Schema just doesn't do right?

The current XML Schema structure draft is so flawed it looks more bizarre
each time I read it. But it is a great effort to even get as far as a
draft on time, which was their aim.

Rather than saying "DTDs good...Schemas bad" or vice versa, we have to
think of a syntax as a publication for a particular purpose of some
information set.

So the question becomes:
    * what purposes is DTD syntax good for?
    * what purposes is the information set expressible in DTDs good for?
    * what purposes is instance syntax good for?
    * what purposes is the information set expressible in instance
syntaxes (in particular, XML Schemas) good for?

This in turn perhaps boils down to two questions:
    1) When is a terse syntax good?
    2) When are DTD structures good enough?

I can think of two important cases for the first:
    a) for WWW distribution: a schema in instance syntax can be 100s of
K in size, and dwarf the document.
    b) for human communication in text, DTD syntax (and content model
syntax in particular) is very handy. I can write (x, z, (z|x|y)?, a,
(b*, d), e, f) on one line but it might take a page in a schema
language.  You can see in XML Authority that they still use regular
expression syntax as one possibility for data entry.
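
To illustrate how close content model syntax is to ordinary regular
expressions, here is a minimal Python sketch. It is not taken from any
XML tool: the function names content_model_to_regex and is_valid are
made up for this example, and it assumes a content model made only of
element names, commas, |, ?, *, + and parentheses (no #PCDATA, EMPTY or
ANY). It rewrites the one-line model above as a regex over the sequence
of child element names and checks a few sequences against it.

```python
import re

def content_model_to_regex(model):
    """Translate a DTD-style content model into a compiled regex.

    Each child element is represented as its name followed by a space,
    so the name "x" can never be confused with a prefix of "xy".
    """
    tokens = re.findall(r"[A-Za-z_][\w.-]*|[(),|?*+]", model)
    parts = []
    for tok in tokens:
        if tok == ",":
            continue                    # sequence is plain concatenation
        elif tok == "(":
            parts.append("(?:")         # non-capturing group
        elif tok in {")", "|", "?", "*", "+"}:
            parts.append(tok)           # same meaning in regex syntax
        else:
            parts.append("(?:" + re.escape(tok) + " )")
    return re.compile("".join(parts))

def is_valid(model, children):
    """Return True if the list of child element names fits the model."""
    pattern = content_model_to_regex(model)
    sequence = "".join(name + " " for name in children)
    return pattern.fullmatch(sequence) is not None

model = "(x, z, (z|x|y)?, a, (b*, d), e, f)"
print(is_valid(model, ["x", "z", "a", "d", "e", "f"]))
print(is_valid(model, ["x", "z", "y", "a", "b", "b", "d", "e", "f"]))
print(is_valid(model, ["x", "a", "d", "e", "f"]))
```

The first two sequences fit the model and the third does not, because
the required z after x is missing. The whole translation is a
token-by-token rewrite, which is why the terse form is so convenient
for people to read and type.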

For the second, I can think of three cases:
    a) for many kinds of literature and flat databases containing text,
the DTD information set is enough to create structured editors and
perform validation.
    b) DTD structures may provide adequate client-side validation of
    c) it is so much easier to write stylesheets if you know all the
contexts in which an element can appear; the drive to split element type
(content model) from its name (i.e., to allow one content model in one
context, but another in a different context or namespace) will mean that
stylesheets will  become more complicated: the writer of the stylesheet
will have to be more careful to cope with elements that appear in "open"
content models.
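
As a rough illustration of point c), the following Python sketch lists
every parent context in which each element can appear, given a set of
element declarations. The DTD fragment and the function name
parent_contexts are invented for this example, and the regex-based
parsing is an assumption that only holds for simple declarations (no
parameter entities, no conditional sections).

```python
import re
from collections import defaultdict

# Hypothetical DTD fragment, invented for illustration only.
DTD = """
<!ELEMENT book (title, chapter+)>
<!ELEMENT chapter (title, para*)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT para (#PCDATA | emph)*>
<!ELEMENT emph (#PCDATA)>
"""

def parent_contexts(dtd_text):
    """Map each element name to the set of elements that may contain it."""
    parents = defaultdict(set)
    for name, model in re.findall(r"<!ELEMENT\s+(\S+)\s+([^>]*)>", dtd_text):
        # Pick out element names in the content model; the lookbehind
        # skips keywords introduced by '#', such as #PCDATA.
        for child in re.findall(r"(?<![#\w])[A-Za-z_][\w.-]*", model):
            if child not in ("EMPTY", "ANY"):
                parents[child].add(name)
    return parents

contexts = parent_contexts(DTD)
for child in sorted(contexts):
    print(child, "can appear in:", sorted(contexts[child]))
```

With closed content models like these, a stylesheet writer can see that
title needs rules for exactly two contexts, book and chapter. A
declaration of ANY, or an "open" content model in a schema language,
makes that set unbounded, and this sketch simply ignores such cases.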

One objection I, and others, have with the current XML Schema draft
(apart from conflicting with XML 1.0 and XML Namespaces) is that it
barely supports more kinds of document structures than XML does: it
still uses the content model as its basic model, rather than
systematically generalizing it to look at regular expressions on trees,
weak validations, and anonymous content types.

Also, in the absence of published test cases, it is a little difficult
to see what kind of documents they are really trying to support: it
would be nice if they said, for example, "XML Schemas will be able to
express all required for all forms of RDF".

>Besides, when XML Schema rules, will DTD be migrated to XML Schema

No, because DTDs include non-schema information: entity declarations in
So I think claims that a future XML 1.x will not have DTDs should be
regarded extremely sceptically.  But I think everyone is excited about
having an instance-based syntax for element declarations: it should make
many things easier.

I predict that some vendors will force the issue: they will provide
"XML" tools which do not allow DTDs or alternatives. This will trap
people into the schema syntaxes allowed by that vendor.

The Schema Working Group should be looking at how to make a world in
which XML Schemas are available to do the things that DTDs do poorly,
not at replacing DTDs IMHO. The low priority of being XML and Namespace
compliant shown in the schema draft may show the mindset at work; they
see replacement and forced standardization as being more important than
augmentation and a richness of choice.

If you are interested in Schemas in general, I have a series of articles
on alternative schema systems at

Rick Jelliffe

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)
