XSchema Spec Section 2.2, Draft 1

Peter Murray-Rust peter at ursus.demon.co.uk
Mon Jun 22 22:51:45 BST 1998


At 13:45 22/06/98 -0400, Paul Prescod wrote:

>This will often work in practice. I consider it extremely distasteful to 
>have an application actually decide not just what data *means* but what 
>really *is* data and what is just markup convenience. Encoding pretty 
>printing as meaningful data in the hopes that the processing code will 
>correctly guess which is which is sloppy and will probably be error 
>prone. AFAIK, there is no perfect solution, and what we have is no more 
>broken than most of the other potential solutions we had available. It is, 
>nevertheless, broken.

Agreed. There is no doubt that prettyprinting is valuable and will be
heavily used. For those of us who are developing non-textual applications,
I suspect a consistent style will emerge. I suspect this will have elements
whose children are:
	- only elements
	- or empty
	- or a single PCDATA string
This is what I use in CML (apart from any imported HTML). Unless there is
sensible running text whose prime purpose is to be read by humans there is
no particular value in having mixed content (i.e. strings + elements mixed). 

	The responsible thing to do is to provide enough DTD information to make
it clear that the prettyprinting whitespace is insignificant. It only needs
a few contentspecs in the internal subset. Thus <!ELEMENT foo (bar | baz)*>
will drive SAX to identify any whitespace-only children of foo and flag them
as ignorable.
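
As an illustration only (it is not part of the original message), the rule
can be sketched in Python. The sketch does not rely on the parser to report
ignorable whitespace; the set ELEMENT_CONTENT is a hypothetical, hand-written
stand-in for what a DTD-aware parser would learn from the declaration of foo,
and classify_text is likewise an invented helper name.

```python
from xml.dom import minidom

# Assumption: this set mirrors, by hand, the DTD below. A DTD-aware parser
# would derive it from <!ELEMENT foo (bar | baz)*> (element-only content).
ELEMENT_CONTENT = {"foo"}

DOC = """<!DOCTYPE foo [
  <!ELEMENT foo (bar | baz)*>
  <!ELEMENT bar (#PCDATA)>
  <!ELEMENT baz EMPTY>
]>
<foo>
  <bar>text</bar>
  <baz/>
</foo>"""


def classify_text(node, out=None):
    """Collect (parent name, 'ignorable' or 'data', text) for each text node."""
    if out is None:
        out = []
    for child in node.childNodes:
        if child.nodeType == child.TEXT_NODE:
            # Whitespace-only text directly inside an element declared with
            # element content is prettyprinting, not data.
            ignorable = (node.nodeName in ELEMENT_CONTENT
                         and child.data.strip() == "")
            out.append((node.nodeName,
                        "ignorable" if ignorable else "data",
                        child.data))
        elif child.nodeType == child.ELEMENT_NODE:
            classify_text(child, out)
    return out


results = classify_text(minidom.parseString(DOC).documentElement)
for parent, kind, text in results:
    print(parent, kind, repr(text))
```

Here the three whitespace runs directly under foo are classified as
ignorable, while the string inside bar is kept as data.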

	I suspect that for newcomers to XML prettyprinting will be the de facto
standard. They won't realise the whitespace issue even exists. However,
when JUMBO2 reads an unknown file without a DTD it faithfully displays
whitespace-only nodes. To help with this I have a button in JUMBO2 called
'delete whitespace'. This will remove whitespace PCDATA recursively from
the descendants of a selected node (or the root node if none is selected).
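
A minimal sketch of that kind of recursive cleanup, written in Python with
the standard xml.dom.minidom module. This is not JUMBO2's code; the function
name delete_whitespace and the sample document are assumptions made for
illustration.

```python
from xml.dom import minidom


def delete_whitespace(node):
    """Recursively remove whitespace-only text nodes below `node`."""
    # Iterate over a copy, because removeChild mutates node.childNodes.
    for child in list(node.childNodes):
        if child.nodeType == child.TEXT_NODE and child.data.strip() == "":
            node.removeChild(child)
        elif child.nodeType == child.ELEMENT_NODE:
            delete_whitespace(child)


# No DTD here, so the parser keeps every whitespace-only node.
doc = minidom.parseString("<foo>\n  <bar> text </bar>\n  <baz/>\n</foo>")
delete_whitespace(doc.documentElement)
compact = doc.documentElement.toxml()
print(compact)
```

Note that without a DTD this is a guess about intent: an element whose only
content is whitespace would lose that content too, which is exactly the risk
discussed in the quoted message.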

	P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg




