Again wit da AND and Repetitions

Lars Marius Garshol larsga at ifi.uio.no
Sat May 15 23:57:14 BST 1999


* John Cowan
|
| IIRC the canonical way of doing (A & B & C) is to transform it into
| (A | B | C)* and then do a post-check that each of A, B, C appears
| exactly once.  ...
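
For reference, the two-step check described above can be sketched in a
few lines of Python. This is only an illustration of the idea for a
stand-alone & group; the function name matches_and_group and the
representation of children as a list of element names are assumptions
made for the example, not anything taken from a real parser.

```python
from collections import Counter

def matches_and_group(children, names):
    """Check children against (A & B & C) via (A | B | C)* plus a post-check."""
    # Step 1: the relaxed model (A | B | C)* accepts any sequence of the names.
    if any(child not in names for child in children):
        return False
    # Step 2: post-check that each member of the & group appears exactly once.
    counts = Counter(children)
    return all(counts[name] == 1 for name in names)
```

Called as matches_and_group(["C", "A", "B"], {"A", "B", "C"}) this accepts
any permutation of the three names and rejects duplicates or omissions.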

* Robert C. Lyons
| 
| It seems that this approach would require the parser to perform look ahead 
| for the following content model:
| ( (A & B & C), C, C, C )

Nrgh. And I just posted a full page to say the same. Oh well.
 
| This content model would be transformed into: ( (A | B | C)*, C, C, C )
| 
| The transformed content model is non-deterministic. [...]
| 
| The original content model is deterministic; [...]

Hmmm. As far as I can see this hints that if some other transformation
were used this problem might be avoided. For example,
( ((A|B|C) , (A|B|C) , (A|B|C)), C, C, C ) does not have this problem.
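
The difference between the two transformations can be checked
mechanically. The sketch below is a deliberately simplified determinism
test, assuming a content model that is just a flat sequence of items,
each item being a set of names that is either starred or occurs exactly
once. The names first_from and is_deterministic and the (names, starred)
representation are invented for this example; a real check would have
to handle nesting.

```python
def first_from(model, i):
    """Positions (item index, name) that can match the next element from item i."""
    out = []
    while i < len(model):
        names, starred = model[i]
        out.extend((i, name) for name in sorted(names))
        if not starred:
            break  # a non-starred item must be matched before moving on
        i += 1     # a starred item may be empty, so later items are reachable
    return out

def is_deterministic(model):
    """True if no state offers two different positions with the same name."""
    candidate_sets = [first_from(model, 0)]
    for i, (names, starred) in enumerate(model):
        following = first_from(model, i + 1)
        if starred:
            # after matching inside a starred item we may stay in that item
            following = [(i, name) for name in sorted(names)] + following
        candidate_sets.append(following)
    for candidates in candidate_sets:
        seen = [name for _, name in candidates]
        if len(seen) != len(set(seen)):
            return False
    return True

abc = {"A", "B", "C"}
c = {"C"}
starred_form = [(abc, True), (c, False), (c, False), (c, False)]
bounded_form = [(abc, False)] * 3 + [(c, False)] * 3
```

Here is_deterministic(starred_form) is False, because a C can belong
either to the starred group or to the first trailing C, while
is_deterministic(bounded_form) is True. The bounded form still needs the
post-check that A, B and C each appear exactly once.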

However, it does have the disagreeable property that an & group with n
members expands to n squared element tokens, which will probably get
intolerable somewhere around n = 200-300. Also, it gets you into trouble
anyway if the original content model is ambiguous:

((A | B | C)*, (A & B & C), C, C, C)

If you read one A you have no idea whether you're in the & group or
not, regardless of how you transform it. 
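
A small Python sketch illustrates the point for this particular model.
It assumes the whole child list is available up front, which is exactly
the luxury a streaming parser does not have; the function name
split_point and its default arguments are invented for the example.

```python
def split_point(children, and_names=("A", "B", "C"), tail=("C", "C", "C")):
    """Return how many leading children belong to (A|B|C)*, or None if invalid.

    Assumes the starred group and the & group use the same names, as in
    ((A | B | C)*, (A & B & C), C, C, C).
    """
    # The & group and the tail have fixed lengths, so the length of the
    # starred prefix is only known once the total length is known.
    k = len(children) - len(and_names) - len(tail)
    if k < 0:
        return None
    star = children[:k]
    group = children[k:k + len(and_names)]
    rest = children[k + len(and_names):]
    if any(child not in and_names for child in star):
        return None
    if sorted(group) != sorted(and_names):
        return None  # & group must contain each name exactly once
    if tuple(rest) != tuple(tail):
        return None
    return k
```

For the children A B C C C C the leading A is in the & group
(split_point returns 0), while for A A B C C C C the leading A is in the
starred group (split_point returns 1). The two inputs start identically,
and the sketch only tells them apart by looking at the total length.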

Handling this last content model seems like a real challenge to
me. I think any viable approach will need to know when it reaches

((A | B | C)*, (A & B & C), C, C, C)
               ^         ^
               1         2

the states corresponding to 1 and 2 above, which as far as I can see
effectively means resolving any ambiguities, which again seems to mean
that lookaheads are required. (Or, alternatively, that you need to
outlaw ambiguity in the original content model.) Please, someone,
prove me wrong.

--Lars M.


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)



