XML and whitespace: let's just dump CR and LF!

Peter Murray-Rust Peter at ursus.demon.co.uk
Wed Aug 6 17:20:08 BST 1997

In message <199708051317.XAA23619 at jawa.chilli.net.au> "Rick Jelliffe" writes:
> I suggest that the following approach should be taken. (I think it is the only
> realistic solution, especially if we assume that 1) 
> data is usually generated by applications, 
    Although this will be partly true, I think we still have to expect people
to use text editors for a year or two yet :-). [It's how I create most of
my XML at present :-)].

> 2) humans only check and tweak data;
Yes.  XML must certainly be tweakable, so it mustn't require lines 1000
chars long :-)

> 3) we want operating system 
> and character set independence, 

critical :-)

> 4) line-breaking is generally done by clients
> ...so CR/LF is basically a convenience for fitting data into editors, 
> not for the purposes of output.)


> **A) XML applications should ignore *ALL* CR and LF as a bad joke.  They should
> be entirely there for formatting the raw text into nice, eye-sized records.
> So CR and LF should never be converted to spaces. (This approach was the
> one taken by Interleaf, and I have come to appreciate it.) If you need a 
> space, then start the new line with it!  (Ending the previous line is difficult
> to see.)

Appeals to me :-)
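
As a rough illustration of proposal A, here is a minimal Python sketch. It is
not taken from Rick's post, and the function name `strip_line_ends` is made up
for this example. It drops every CR and LF and never substitutes a space, so
any space that should survive has to start the continuation line:

```python
def strip_line_ends(text: str) -> str:
    """Remove every CR and LF; never replace them with a space."""
    return text.replace("\r", "").replace("\n", "")


# A space the author wants kept must start the continuation line.
raw = "<p>Line breaks here are only\r\n for the editor's bene\r\nfit.</p>"
print(strip_line_ends(raw))
# prints: <p>Line breaks here are only for the editor's benefit.</p>
```

Note that a word can be broken across lines ("bene" / "fit") and still come
out whole, which is the flip side of never inserting a space.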

> **B) XML applications should mandate the use of the unambiguous Unicode characters
> 	-- LINE SEPARATOR  &#x2028;
This makes sense unless someone finds a flaw in it...
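
To make proposal B a bit more concrete, here is a hedged Python sketch that
combines it with proposal A: CR and LF are discarded as editor formatting, and
only U+2028 LINE SEPARATOR counts as a real line break. The name
`logical_lines` is hypothetical, and since the quoted proposal is cut off
above, the sketch handles only the one character that is actually shown:

```python
LINE_SEPARATOR = "\u2028"  # Unicode LINE SEPARATOR, written &#x2028; in XML


def logical_lines(text: str) -> list[str]:
    """Split on U+2028 only, after discarding CR and LF."""
    cleaned = text.replace("\r", "").replace("\n", "")
    return cleaned.split(LINE_SEPARATOR)


verse = "Roses are\n red,\u2028violets are\r\n blue"
print(logical_lines(verse))
# prints: ['Roses are red,', 'violets are blue']
```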


Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo at ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa at ic.ac.uk)
