DTD design

Rick Jelliffe ricko at allette.com.au
Thu Dec 23 06:20:00 GMT 1999


From: heiko.grussbach at crpht.lu <heiko.grussbach at crpht.lu>

>I have the following problem, I want to define an element E that may
>contain elements A,B,C. Order should be insignificant and A,B and C are
>all optional. Furthermore, A,B and C may each be replaced by X.

This may be a kind of "variant GI" issue.  Most schema languages do not
support it well.  Grammar-based schema languages that do not
provide some explicit support may have their content models explode,
as you mention.
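
To get a feel for how quickly this explodes, here is a rough Python
sketch (not part of any schema tool) that counts the distinct child
sequences a content model would have to accept when every name is
optional, may appear at most once, and order is free. It deliberately
ignores the substitution by X, which only makes matters worse; the
function name count_orderings is my own invention.

```python
from itertools import permutations

def count_orderings(names):
    """Count child sequences where each name is optional,
    appears at most once, and order does not matter to validity."""
    total = 0
    for k in range(len(names) + 1):
        # permutations(names, k) yields every ordered choice of k names
        total += sum(1 for _ in permutations(names, k))
    return total

if __name__ == "__main__":
    for n in range(1, 6):
        names = [chr(ord("A") + i) for i in range(n)]
        print(n, count_orderings(names))
```

A real content model can factor some of these sequences together, so
the count is an indication of the growth rather than the exact size
of the model.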

XML Schemas allows "&" at the top level, rather like SGML. But that does
not help your problem. You may find that your problem is actually one
of subclassing, in which case XML Schemas may help when they are
implemented.

In general, when you have complex problems like this, you may find
the "architecture" approach useful.  In this approach, you make up
a DTD for exactly what you want to validate:
   ( (A, (B | C) ?) | ( B, (C | A)?) | ( C, (A | B)? ))
Then you define a transformation (e.g. using XSL) to create a version
of your document which uses these structures.  One document may
have multiple architectures like this.  This is a very powerful method
of validating many structures that are unavailable to normal
validation or modeling.
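
As a quick way to see what the content model above accepts, it can be
written as a regular expression over the child element names. This is
only a sketch in Python, assuming single-letter names so that the
names can simply be concatenated; matches_model is a made-up helper,
and a real validator works from the DTD itself.

```python
import re

# The content model from the message as a regular expression over the
# concatenated child names (valid only because each name is one letter).
MODEL = re.compile(r"A(B|C)?|B(C|A)?|C(A|B)?")

def matches_model(child_names):
    """Return True if the sequence of child names fits the content model."""
    return MODEL.fullmatch("".join(child_names)) is not None

if __name__ == "__main__":
    for seq in (["A", "B"], ["B", "A"], ["A", "A"], ["A", "B", "C"]):
        print(seq, matches_model(seq))
```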

If you are more interested in validation than in modeling, then
try Schematron
http://www.ascc.net/xml/resource/schematron/schematron.html
An error browser is available free.

The appropriate pattern for your model is this:

<pattern name="Heiko's Problem">
    <rule context="E">
        <assert test="count(*) = count(A | B | C | X)"
        >The only subelements of E are A, B, C, or X.</assert>
        <assert test="count(*) = 2"
        >The element E must have 2 subelements.</assert>
        <assert test="count(A) &lt; 2"
        >The element E can only have zero or one of subelement A</assert>
        <assert test="count(B) &lt; 2"
        >The element E can only have zero or one of subelement B</assert>
        <assert test="count(C) &lt; 2"
        >The element E can only have zero or one of subelement C</assert>
        <assert test="count(X) &lt;= 2"
        >The element E can only have zero, one or two of subelement X</assert>
    </rule>
</pattern>
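
If you want to try the same checks without a Schematron processor,
they can be approximated in a few lines of Python with the standard
xml.etree.ElementTree module. This is only a sketch of the assertions
above, not a Schematron implementation; the names ALLOWED,
REQUIRED_TOTAL and check_e are invented for the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical table mirroring the asserts: name -> maximum occurrences
ALLOWED = {"A": 1, "B": 1, "C": 1, "X": 2}
REQUIRED_TOTAL = 2  # mirrors the assert count(*) = 2

def check_e(e):
    """Return the messages of the failed assertions for one E element."""
    errors = []
    children = list(e)  # child elements only
    if any(child.tag not in ALLOWED for child in children):
        errors.append("The only subelements of E are A, B, C, or X.")
    if len(children) != REQUIRED_TOTAL:
        errors.append("The element E must have 2 subelements.")
    for name, limit in ALLOWED.items():
        if len(e.findall(name)) > limit:
            errors.append(f"Too many {name} subelements (maximum {limit}).")
    return errors

if __name__ == "__main__":
    for doc in ("<E><A/><X/></E>", "<E><A/><A/></E>", "<E><Q/><A/></E>"):
        print(doc, check_e(ET.fromstring(doc)))
```

Allowing a new element here means adding one entry to the table, which
is the same economy the pattern gives you.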

To add a new element takes only an extra assert statement (and an update
of the counts).  Compare this to a content model, where each new element
may double the size of the content model (depending on what constraints
you have).  Note that this is not at all a grammatical view of what you
are doing in your document: for some types of documents, the grammatical
abstraction is not helpful or appropriate.

Rick Jelliffe

