Is CDATA "structure"?

John Cowan cowan at
Tue Jul 20 16:28:09 BST 1999

Nik O wrote:

> <rant original_post_to="Z3950IW">
>   <excerpt>
> ..much of the internet is still constrained by Unix's feeble 7-bit character
> TTY legacy.  This latter issue is echoed in Java's lack of _unsigned_ bytes
> (!), and XML's auto-conversion of CRLF-delimited text records to
> LF-delimited records (yet another legacy/bias from Unix).
>   </excerpt>
>   <question>
> Is there historical basis to the above statement?  It was a deduction based
> upon the old Xenix "text mode" I/O and the probability that most of the
> developers of the XML standard were based in the Unix world.
>   </question>
> </rant>

Java doesn't have unsigned arithmetic values (and type *byte* is
meant to be arithmetic) because they have all kinds of surprising
results if misused: see the relevant sections of _Writing Solid Code_.
The purposes served by unsigned bytes are better served by characters.
You can't just cast a byte to a character, though, because the cast
sign-extends negative values; use c = (char) (b < 0 ? b + 256 : b) instead.
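
To make the conversion above concrete, here is a minimal sketch. The class and method names (UnsignedBytes, toUnsigned, toUnsignedMasked, toChar) are hypothetical and chosen only for illustration; the bit-mask variant is a commonly used equivalent, not something taken from the message.

```java
// Illustrative sketch only; class and method names are hypothetical.
final class UnsignedBytes {
    // The conversion from the message: map the signed range -128..127
    // onto the unsigned range 0..255.
    static int toUnsigned(byte b) {
        return b < 0 ? b + 256 : b;
    }

    // Equivalent form using a bit mask: b is widened to int (with sign
    // extension) and the mask keeps only the low eight bits.
    static int toUnsignedMasked(byte b) {
        return b & 0xFF;
    }

    // Treat the byte as a character code in the range 0..255.
    static char toChar(byte b) {
        return (char) toUnsigned(b);
    }

    public static void main(String[] args) {
        byte b = (byte) 0xE9;                 // stored as -23
        System.out.println((int) (char) b);   // naive cast sign-extends: 65513
        System.out.println(toUnsigned(b));    // 233
        System.out.println((int) toChar(b));  // 233
    }
}
```

The naive cast goes wrong because a byte is first widened to int with sign extension and only then narrowed to char, so every negative byte lands in the range 0xFF80..0xFFFF. Java 8 and later also provide Byte.toUnsignedInt for the same purpose.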

The use of LF as newline is Unix, but is also ISO; the character
code was explicitly ambiguous between advance-paper and go-to-new-line.
In the rarely used C1 character set, there are two separate control
characters to disambiguate these functions.  As for "TTY legacy",
real Teletypes (at least models 33/35) want CR/LF, not just LF.

	John Cowan	cowan at
Schlingt dreifach einen Kreis um dies! / Schliesst euer Aug vor heiliger Schau,
Denn er genoss vom Honig-Tau / Und trank die Milch vom Paradies.
			-- Coleridge / Politzer
