Again wit da AND and Repetitions / Singletons in a DTD

Terje Norderhaug terje at in-progress.com
Sat May 15 23:09:34 BST 1999


At 5:51 PM 5/13/99, Joshua E. Smith wrote:
>At 11:37 AM 5/13/99 -0700, David LeBlanc wrote:
>>Would (S1? | S2? | S3? | (C1 | C2 | C3)*) work?
>
>Nope.  That would allow only one of S1, S2, or S3.  I need to allow up to
>one of each, in any order.
>
>As was pointed out in a separate, but coincidentally related, thread, you'd
>have to put a * at the end of that, which suddenly loses singletonness.  Alas.
>
>But I guess the & syntax from SGML is deceptively simple.  I don't
>completely follow that other thread, but I gather it's hard to implement
>which is probably why they left it out of XML.

No, it is not particularly hard to implement support for the AND connector
in a validating parser. However, the AND connector is redundant, as one can
express the same constraint using the other connectors (although not very
elegantly).

For example, take the content model (A & B & C), which requires all three
elements, in any order. Here is an equivalent XML content model:

(A,((B,C)|(C,B))) | (B,((A,C)|(C,A))) | (C,((A,B)|(B,A)))

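Note that simply listing the permutations as alternative sequences is
ambiguous and thus not allowed, since after reading the first element the
parser cannot tell which alternative it is in without looking ahead:
((A,B,C)|(A,C,B)|(B,A,C)|(B,C,A)|(C,A,B)|(C,B,A))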

A content model that uses the AND connector can be converted into a proper
XML content model using a simple algorithm. Thus, supporting the AND
connector in an XML parser is at most as hard as writing the converter for
the content model. This shows that supporting the AND connector isn't much
harder than implementing the current connectors.
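
To make the conversion concrete, here is a minimal sketch in Python. It is
only an illustration, not the code of any particular parser: the function
name expand_and is made up for this example, and it assumes every element
in the AND group is required (no ?, * or + indicators). It picks each
element in turn as the first one and recurses on the rest, which gives the
same factored shape as the content model shown above.

```python
def expand_and(names):
    """Return a content model using only ',' and '|' that accepts the
    given required elements exactly once each, in any order.

    Hypothetical helper for illustration; handles required elements only.
    """
    names = list(names)
    if len(names) == 1:
        return names[0]
    branches = []
    for i, head in enumerate(names):
        rest = names[:i] + names[i + 1:]
        tail = expand_and(rest)
        if len(rest) > 1:
            # A choice nested inside a sequence needs its own parentheses.
            tail = "(" + tail + ")"
        # Each branch starts with a different element, so the first
        # element read always selects exactly one branch.
        branches.append("(" + head + "," + tail + ")")
    return "|".join(branches)


if __name__ == "__main__":
    print(expand_and(["A", "B", "C"]))
```

The result is a bare choice, so wrap it in one more pair of parentheses
before putting it in an element declaration. Note that the output grows
factorially with the number of elements, which is one reason to avoid this
route for larger AND groups.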

However, there are better ways of implementing support for AND connectors
than converting the AND content model into such tree structures. Simply make
the parser keep track of which elements in the AND list have not yet been
parsed. For each element encountered, remove its item from the list. Signal
an error if the parser encounters an element not in the list of remaining
items, unless all of the remaining items are optional. Repeat until the list
is empty, or until a parsed element is not in the list and all remaining
items are optional.
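
The bookkeeping described above can be sketched roughly as follows, again in
Python. This is not the Emile source; the function name match_and_group and
the dict-based representation (element name mapped to True if optional) are
assumptions made for the example. One thing the description leaves implicit
is what happens when the input ends early; the sketch treats leftover
required elements at that point as an error.

```python
def match_and_group(group, elements):
    """Match the start of `elements` against an AND group.

    group:    dict mapping element name -> True if optional, False if required
    elements: sequence of element names as encountered by the parser
    Returns the number of elements consumed by the group.
    Raises ValueError when the group is violated.
    (Hypothetical helper for illustration only.)
    """
    remaining = dict(group)  # items of the AND list not yet parsed
    consumed = 0
    for name in elements:
        if not remaining:
            break  # list is empty: the group is complete
        if name in remaining:
            del remaining[name]  # seen once, may not appear again
            consumed += 1
        elif all(remaining.values()):
            break  # only optional items remain: element belongs to what follows
        else:
            raise ValueError("unexpected element: " + name)
    # Assumption: running out of input with required items left is an error.
    missing = sorted(n for n, optional in remaining.items() if not optional)
    if missing:
        raise ValueError("missing required element(s): " + ", ".join(missing))
    return consumed


if __name__ == "__main__":
    singletons = {"S1": True, "S2": True, "S3": True}
    print(match_and_group(singletons, ["S3", "S1", "C1"]))
```

Here the group stops at C1 after consuming S3 and S1, leaving C1 for
whatever part of the content model comes next. The work per element is a
single lookup, regardless of how many items the AND group has.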

This is how I implemented support for the AND connector in the validator of
the Emile XML editor. We wanted to support the AND connector for backward
compatibility with HTML and other simple SGML document types. It allows
Emile to load an HTML DTD into its XML authoring environment. The parser of
course warns that the AND connector isn't proper XML, but we think it is
important to provide a bridge back to previous document formats for now.


-- Terje Norderhaug <terje at in-progress.com>

President & Chief Technologist
Media Design in*Progress
San Diego, California

Software for Mac Web Professionals at <http://www.in-progress.com>



xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)



