Other whitespace problems was Re: Whitespace rules (v2)

Trevor Jenkins tfj at apusapus.demon.co.uk
Tue Aug 19 00:39:17 BST 1997


> The original goal as stated in SGML was to ignore white
> space "caused by markup" by which they meant "used to prettyprint
> markup".  A worthy goal, but in fact most people would agree that
> the rules you have to write to achieve this are horrendously complicated
> and some would argue that SGML never actually did get it right.  

Whilst all the discussion of "whitespace caused by markup" has been 
going on I've had reason to look at whitespace within the various 
declarations. I have always been very wary of the separator rules for 
SGML declarations (as a computing scientist I find it odd that such 
separators have been hard-coded in the grammar rules themselves). I'm 
convinced that, as they stand, the separator rules in XML are 
ambiguous.

I have been looking at the element declaration in particular, where 
the abundance of S symbols leads to ambiguity. As I read the grammar 
the following is ambiguous:

<!ELEMENT trouble ( ( ...
                   ^
Is this space to be recognised by the first S? in the choice 
production, the first S? in the seq production or the first S? in the 
cps production that each of choice and seq uses? It cannot be 
recognised by them all in practice but each of those productions can 
match it. :-( Whether it is matched by cps or by choice/seq 
depends upon whether you parse the declaration with an LL or LR 
parser.
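
The ambiguity can be made concrete by counting derivations. The sketch
below is only an illustration: the productions in it are reconstructed
from the description in this message (choice and seq each starting with
'(' S? and using cps, which itself starts with S?), not copied from the
draft, and all the function names are hypothetical. Each parser returns
one end position per derivation, so a count above 1 means the same text
can be derived in more than one way.

```python
import re

NAME = re.compile(r"[A-Za-z][A-Za-z0-9]*")

# A parser maps (text, pos) to a list of end positions, one entry per
# derivation. Repeated end positions therefore signal ambiguity.

def lit(s):
    return lambda text, pos: [pos + len(s)] if text.startswith(s, pos) else []

def concat(*parsers):
    def p(text, pos):
        ends = [pos]
        for q in parsers:
            ends = [e2 for e in ends for e2 in q(text, e)]
        return ends
    return p

def alt(*parsers):
    return lambda text, pos: [e for q in parsers for e in q(text, pos)]

def opt(q):
    return lambda text, pos: [pos] + q(text, pos)

def many(q):
    # Zero or more repetitions; the e2 > e guard prevents empty loops.
    def p(text, pos):
        ends, frontier = [pos], [pos]
        while frontier:
            frontier = [e2 for e in frontier for e2 in q(text, e) if e2 > e]
            ends += frontier
        return ends
    return p

def S(text, pos):
    # S ::= one or more whitespace characters (every non-empty prefix).
    ends, i = [], pos
    while i < len(text) and text[i] in " \t\r\n":
        i += 1
        ends.append(i)
    return ends

def name(text, pos):
    m = NAME.match(text, pos)
    return [m.end()] if m else []

occurrence = opt(alt(lit("?"), lit("*"), lit("+")))

# Assumed productions (reconstructed, see the note above):
def cp(text, pos):
    return concat(alt(name, choice, seq_prod), occurrence)(text, pos)

def cps(text, pos):
    return concat(opt(S), cp, opt(S))(text, pos)

def choice(text, pos):
    bar_item = concat(lit("|"), cps)
    return concat(lit("("), opt(S), cps, bar_item, many(bar_item),
                  opt(S), lit(")"))(text, pos)

def seq_prod(text, pos):
    comma_item = concat(lit(","), cps)
    return concat(lit("("), opt(S), cps, many(comma_item),
                  opt(S), lit(")"))(text, pos)

def elements(text, pos):
    return concat(alt(choice, seq_prod), occurrence)(text, pos)

def count_derivations(model):
    """Number of distinct derivations that consume the whole content model."""
    return sum(1 for end in elements(model, 0) if end == len(model))
```

Under these assumed productions, count_derivations("(a)") gives 1 but
count_derivations("( a)") gives 2: the single space can be taken either
by the S? of seq or by the S? of cps. A nested group such as "( ( a))"
multiplies the effect at each level.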

There is a further problem with the productions for the element
declaration in that the "elements" clause and its children require
more than one symbol of look-ahead. This also affects the same
fragment because it is not clear until several more tokens have been
parsed whether the elements clause is trying to match a choice
or a seq.
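
To illustrate the look-ahead point, here is a small sketch that counts
how many tokens have to be read before a group is known to be a choice
or a seq. It assumes, as a reading of the grammar rather than a quote
from it, that the decision is made by the first '|' or ',' found
directly inside the outermost parentheses, and that a group with no
connector at all is a seq. The name lookahead_needed is hypothetical.

```python
import re

# Whitespace runs, names, and the single-character punctuation of a
# content model. Anything else is silently skipped by finditer.
TOKEN = re.compile(r"\s+|[A-Za-z][A-Za-z0-9]*|[()|,?*+]")

def lookahead_needed(model):
    """Return (kind, n) where kind is 'choice' or 'seq' and n is the number
    of non-whitespace tokens read, the opening '(' included, before the
    kind of the outermost group is known."""
    depth = 0
    count = 0
    for match in TOKEN.finditer(model):
        tok = match.group()
        if tok.isspace():
            continue
        count += 1
        if tok == "(":
            depth += 1
        elif tok == ")":
            depth -= 1
            if depth == 0:
                # Closed without a connector: assumed to be a one-item seq.
                return "seq", count
        elif tok in ("|", ",") and depth == 1:
            return ("choice" if tok == "|" else "seq"), count
    raise ValueError("unterminated content model")
```

For "(a|b)" the answer is known at the third token, but for
"( ( a, b ) | c )" it is the seventh, and the count grows with the
size of the first nested group, so no fixed number of look-ahead
symbols is enough for the productions as written.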

I'm working with a copy of WD-xml dated 970807, which when I looked 
late last week was the current version of the text available from 
www.w3.org.

Regards, Trevor.

--

"Real Men don't Read Instruction Manuals"
   Tim Allen, Home Improvement

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo at ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa at ic.ac.uk)



