Unix/Java design issues (Was: Re: Is CDATA "structure"?)

Nik O niko at cmsplatform.com
Tue Jul 20 21:43:58 BST 1999


John Cowan wrote:

>if 31 bits isn't enough, is 32 bits really so much better?

Um, isn't it twice as good? ;-)  I apologise for mixing examples -- the
15-bit limit did persist in MSC's library until the early 90's.  And please
note my embedded systems context, where bits still do matter...

>The basic problem is code like this:
>
>unsigned count;
>while (--count > 0) {
>/* do something */
>}

Of course the above is true.  And although i'm definitely a C (and other
non-portable assemblers) dinosaur, i never bought into the tendency of many
C progs to write (IMHO) excessively compact/cryptic code.  Thus, i've always
written:

unsigned int count = something_gt_0;
while (count > 0)
{
  // do something
  count--;
}

I will concede that this has the disadvantage of dissociating the loop
structure and its counter, but that's C4U.  And i promise not to engage in
another off-topic rant/debate re matching braces and C coding style --
although i'm writing this instead of finishing yet another boring code style
document for my current employer :-).
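
A minimal sketch comparing the two loop forms above, for anyone who wants to see the difference. The function names and the iteration cap are hypothetical, added only so that the wrap-around case terminates. With a starting count of 0, the pre-decrement form wraps to UINT_MAX and keeps going, whereas the test-then-decrement form never enters the loop.

```c
#include <assert.h>

/* Iterations performed by the quoted form: while (--count > 0).
   `cap` is an illustrative safety limit, not part of either original. */
unsigned long predecrement_iterations(unsigned count, unsigned long cap)
{
    unsigned long n = 0;
    while (--count > 0) {          /* 0 wraps to UINT_MAX here */
        if (++n >= cap)
            break;
    }
    return n;
}

/* Iterations performed by the test-then-decrement form. */
unsigned long postcheck_iterations(unsigned count, unsigned long cap)
{
    unsigned long n = 0;
    while (count > 0) {
        if (++n >= cap)
            break;
        count--;
    }
    return n;
}

/* Self-check of the behaviour described in the lead-in. */
void check_loops(void)
{
    assert(predecrement_iterations(5, 1000) == 4);    /* body runs count-1 times */
    assert(predecrement_iterations(0, 1000) == 1000); /* wrapped; stopped only by cap */
    assert(postcheck_iterations(5, 1000) == 5);       /* body runs count times */
    assert(postcheck_iterations(0, 1000) == 0);       /* loop is skipped */
}
```

Note that the two forms also differ by one iteration for non-zero counts, so they are not drop-in replacements for each other.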

<back_to_xml_issue>

I originally brought this up re "XML's [specified] auto-conversion of
CRLF-delimited text records to LF-delimited records".  My concern is that,
given Microsoft's market dominance, much of the XML text that will be
generated in the near future (or that comes from legacy data) will use
the CRLF delimiter.  When an XML-compliant parser replaces these characters
with a single LF, the data will no longer be viewable/editable with simple
MS-Windows text tools (e.g. Notepad).  Also, the original XML data is
replaced by a converted form (let's ignore entity expansion for the moment).
Whilst i'd be the first to concede that LF-delimited data is more compact,
and easier to parse, i also tend to be conservative (in the literal sense)
about data handling.  Was this data conversion specified in XML 1.0 so as to
be ISO-compliant?  Couldn't all three common flavors of text delimiter
(CR, LF, and CRLF) have been allowed/supported/preserved?  Or am i missing
some significant design consideration here?
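
For reference, a minimal sketch of the conversion in question, assuming the end-of-line rule in XML 1.0 section 2.11: a CRLF pair, and any CR not followed by LF, reaches the application as a single LF. The function name normalize_line_ends is hypothetical and is not taken from any particular parser.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Rewrites buf in place so that CRLF and lone CR both become LF.
   Returns the new length; the result is never longer than the input. */
size_t normalize_line_ends(char *buf, size_t len)
{
    size_t r = 0;   /* read index  */
    size_t w = 0;   /* write index */

    while (r < len) {
        if (buf[r] == '\r') {
            buf[w++] = '\n';
            r++;
            if (r < len && buf[r] == '\n')
                r++;                /* swallow the LF of a CRLF pair */
        } else {
            buf[w++] = buf[r++];
        }
    }
    return w;
}

/* Self-check: all three delimiter flavors collapse to LF. */
void check_normalize(void)
{
    char text[] = "a\r\nb\rc\nd";
    size_t n = normalize_line_ends(text, strlen(text));
    assert(n == 7);
    assert(memcmp(text, "a\nb\nc\nd", 7) == 0);
}
```

The sketch only describes what a processor hands to the application. The bytes on disk stay CRLF-delimited unless some tool writes the normalized text back out.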

Of course, much of my concern is based upon petty personal issues ;-).  I'm
trying to build a content production system that supports tools and users in
Win 98/NT, MacOS, Linux and Solaris environments.  I'd like to postpone the
(admittedly trivial) conversion of text delimiters as the XML data flows
from OS to OS until final production, but i guess this just isn't possible.
And my concern here is more about data integrity than bytes.  Despite my
recent posts, i'm willing to accept that storage/bandwidth is cheap, and
getting cheaper.  Thus, a clean and consistent XML standard is arguably more
important than saving bytes and/or being able to use existing
(anachronistic) text tools.

</back_to_xml_issue>

Maybe someday i'll grow up and decide whether i'd rather be a bit-twiddler
or a systems architect -- but in the meantime, i'm just a little schizo,
we're feeling fine :-).

Regards,
-Nik O, Content Mgmt Solutions, Jackson, Wyo.



xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)