Deterministic Content Models ?
Richard Goerwitz
richard at goon.stg.brown.edu
Sun Sep 13 22:20:02 BST 1998
Philippe Le Hégaret wrote:
> > Is (paragraph*)* a deterministic content model ?
> > If yes, so I think (a+ | b)* is a deterministic content model too.
> > >
> > > it is an error if an element in the document can match more
> > > than one occurrence of an element type in the content model.
>
> I'm not totally agree with you, because if you write the
> sequence like this:
>
> (a, a*)*
>
> is it still deterministic ? For me no, because there are
> two states in this content model. (a+)* is the same case and
> (a+ | b)* too.
Looks like everybody is more or less correct.
The whole point of flagging nondeterministic content models (which
is what SGML did, and XML may optionally do) is that nondeterministic
content models often indicate logic errors by the writer.
Put somewhat differently, if a DTD writer composes a content model
that allows a given sequence of elements to be processed in more
than one way, this often indicates an error.
So, for example, with (a, a*)*, it's hard to imagine what is
intended, because a single <a/><a/> could match two instances of
(a, a*), or one instance of (a, a*), depending on how you go
through the automaton. Processors may, incidentally, flag (a+)*
as "ambiguous", since a+ is usually implemented as (a, a*).
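To make the <a/><a/> case concrete, here is a minimal Python sketch
that lists every way a run of n <a/> elements can be divided among
successive iterations of (a, a*). It is only an illustration: the
function name groupings is made up for this example, and no real
parser works this way.

```python
def groupings(n):
    """Yield each way to split n consecutive <a/> elements into
    successive iterations of (a, a*).

    Every iteration must consume at least one element (the leading
    `a`); the trailing `a*` absorbs the rest of that iteration.
    Each result is a list of iteration sizes.
    """
    if n == 0:
        yield []
        return
    for first in range(1, n + 1):
        for rest in groupings(n - first):
            yield [first] + rest


# The input <a/><a/> from the text: two elements.
for g in groupings(2):
    print(g)
```

For n = 2 this yields [1, 1] (two instances of (a, a*)) and [2] (one
instance), which are the two readings described above.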
Such ambiguities create unintended differences in how the same
input might be processed by different software. Or they simply
lead to the input being processed in a way that surprises the user
(or worse yet, the programmer).
That's why I think it's a good idea for validators, in particular,
to flag "ambiguous" content models aggressively.
Testing these sorts of things is easy enough. Just make up a toy
DTD and run it through a good validator. Take, for example, the
following (where elements x, y, and z should get flagged as
"ambiguous"):
<!DOCTYPE test [
<!ELEMENT test ANY>
<!ELEMENT a EMPTY>
<!ELEMENT b EMPTY>
<!ELEMENT w (a*)*>
<!ELEMENT x (a+ | b)*>
<!ELEMENT y (a, a*)*>
<!ELEMENT z (a+, b?, a+)>
]>
<test></test>
Yes, as always, you can try this out with the validator at:
http://www.stg.brown.edu/service/xmlvalid/
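For anyone who wants to see roughly what such a check involves, here
is a rough sketch in Python. It is an illustration under stated
assumptions, not how the validator above or any SGML parser actually
works: it uses the textbook position (Glushkov) construction and
calls a model nondeterministic when two different occurrences of the
same element name can come next from the same point. The tuple
representation, the function names, and the plus_as_seq switch (which
models the a+ as (a, a*) expansion mentioned above) are all made up
for this example.

```python
from itertools import count

# Content models as nested tuples (a representation invented here):
#   ("sym", "a")  ("seq", x, y)  ("alt", x, y)
#   ("star", x)   ("plus", x)    ("opt", x)

def expand_plus(m):
    """Rewrite x+ as (x, x*) throughout the model."""
    kind = m[0]
    if kind == "sym":
        return m
    if kind == "plus":
        x = expand_plus(m[1])
        return ("seq", x, ("star", x))
    return (kind,) + tuple(expand_plus(c) for c in m[1:])

def analyse(m, ids, names, follow):
    """Return (nullable, first, last) for m.

    Each occurrence of an element name gets its own position number;
    names maps position -> element name, follow maps position -> the
    set of positions that may come immediately after it.
    """
    kind = m[0]
    if kind == "sym":
        p = next(ids)
        names[p] = m[1]
        follow[p] = set()
        return False, {p}, {p}
    if kind in ("seq", "alt"):
        n1, f1, l1 = analyse(m[1], ids, names, follow)
        n2, f2, l2 = analyse(m[2], ids, names, follow)
        if kind == "alt":
            return n1 or n2, f1 | f2, l1 | l2
        for p in l1:                      # seq: end of x leads into y
            follow[p] |= f2
        first = (f1 | f2) if n1 else f1
        last = (l1 | l2) if n2 else l2
        return n1 and n2, first, last
    # star, plus, opt
    n, f, l = analyse(m[1], ids, names, follow)
    if kind in ("star", "plus"):          # repetition loops back
        for p in l:
            follow[p] |= f
    return (n or kind != "plus"), f, l

def is_deterministic(m, plus_as_seq=True):
    """True if no choice point offers the same name at two positions."""
    if plus_as_seq:
        m = expand_plus(m)
    names, follow = {}, {}
    _, first, _ = analyse(m, count(), names, follow)
    for group in [first, *follow.values()]:
        labels = [names[p] for p in group]
        if len(labels) != len(set(labels)):
            return False
    return True

A, B = ("sym", "a"), ("sym", "b")
MODELS = {
    "w": ("star", ("star", A)),                           # (a*)*
    "x": ("star", ("alt", ("plus", A), B)),               # (a+ | b)*
    "y": ("star", ("seq", A, ("star", A))),               # (a, a*)*
    "z": ("seq", ("plus", A),
          ("seq", ("opt", B), ("plus", A))),              # (a+, b?, a+)
}

for name, model in MODELS.items():
    print(name, is_deterministic(model))
```

With plus_as_seq=True the sketch reports w as deterministic and x, y,
and z as not. With plus_as_seq=False, x passes as well, which is one
reason different processors may disagree about (a+ | b)*.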
--
Richard Goerwitz
PGP key fingerprint: C1 3E F4 23 7C 33 51 8D 3B 88 53 57 56 0D 38 A0
For more info (mail, phone, fax no.): finger richard at goon.stg.brown.edu